# => │ 7 │ 8 │ 18 │ 0.80 │ 7.00 │ c │ c │ b │ eight │
# => │ 8 │ 9 │ 19 │ 0.90 │ 8.00 │ c │ c │ b │ ninth │
# => │ 9 │ 0 │ 10 │ 0.00 │ 9.00 │ c │ c │ b │ ninth │
# => ╰───┴───────┴───────┴─────────┴─────────┴───────┴────────┴───────┴────────╯
```
With the dataframe in memory, we can start doing column operations with the
`DataFrame`.
::: tip
If you want to see all the dataframe commands that are available you
can use `scope commands | where category =~ dataframe`
:::
## Basic Aggregations
Let's start with basic aggregations on the dataframe. Let's sum all the columns
that exist in `$df_1` by using the `polars sum` command:
```nu
$df_1 | polars sum | polars collect
# => ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬──────╮
# => │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
# => ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼──────┤
# => │ 0 │ 40 │ 145 │ 4.50 │ 46.00 │ │ │ │ │
# => ╰───┴───────┴───────┴─────────┴─────────┴───────┴────────┴───────┴──────╯
```
As you can see, `polars sum` computes the sum for those columns where
a sum makes sense. If you want to filter out the text columns, you can select
the columns you want by using the [`polars select`](/commands/docs/polars_select.md) command:
```nu
$df_1 | polars sum | polars select int_1 int_2 float_1 float_2 | polars collect
# => ╭───┬───────┬───────┬─────────┬─────────╮
# => │ # │ int_1 │ int_2 │ float_1 │ float_2 │
# => ├───┼───────┼───────┼─────────┼─────────┤
# => │ 0 │ 40 │ 145 │ 4.50 │ 46.00 │
# => ╰───┴───────┴───────┴─────────┴─────────╯
```
You can even store the result of this aggregation as you would store any
other Nushell variable:
```nu
let res = $df_1 | polars sum | polars select int_1 int_2 float_1 float_2
```
::: tip
Type `let res = !!` and press enter. This will auto complete the previously
executed command. Note the space between `=` and `!!`.
:::
And now we have two dataframes stored in memory:
```nu
polars store-ls | select key type columns rows estimated_size
# => ╭──────────────────────────────────────┬───────────┬─────────┬──────┬────────────────╮
# => │ key │ type │ columns │ rows │ estimated_size │
# => ├──────────────────────────────────────┼───────────┼─────────┼──────┼────────────────┤
# => │ e780af47-c106-49eb-b38d-d42d3946d66e │ DataFrame │ 8 │ 10 │ 403 B │
# => │ 3146f4c1-f2a0-475b-a623-7375c1fdb4a7 │ DataFrame │ 4 │ 1 │ 32 B │
# => ╰──────────────────────────────────────┴───────────┴─────────┴──────┴────────────────╯
```
Pretty neat, isn't it?
You can perform several aggregations on the dataframe to extract basic
information from it and do basic data analysis on your brand new
dataframe.
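As a hedged sketch of a few more aggregations, the block below assumes `$df_1` from above is still in scope and that your version of the polars plugin provides `polars mean`, `polars min`, and `polars max`, which aggregate each column in the same way `polars sum` does. The numeric columns are selected first so that the text columns do not get in the way.

```nu
# Assumes $df_1 was created earlier in this chapter
let numeric = $df_1 | polars select int_1 int_2 float_1 float_2

# Mean of each numeric column
$numeric | polars mean | polars collect

# Smallest and largest value of each numeric column
$numeric | polars min | polars collect
$numeric | polars max | polars collect
```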
## Joining a DataFrame
It is also possible to join two dataframes using a column as reference. We are
going to join our mini dataframe with another mini dataframe. The following
lines save the data to a new file, from which we will create the corresponding
dataframe (for these examples we are going to call the file `test_small_a.csv`):
```nu
"int_1,int_2,float_1,float_2,first
9,14,0.4,3.0,a
8,13,0.3,2.0,a
7,12,0.2,1.0,a
6,11,0.1,0.0,b"
| save --raw --force test_small_a.csv
```
We use the `polars open` command to create the new variable:
```nu
let df_2 = polars open --eager test_small_a.csv
```
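To confirm the file was read as expected, you can inspect the new dataframe before joining. This is a small sketch that assumes the `polars shape` and `polars schema` commands are available in your version of the plugin; for the CSV above, the shape should report 4 rows and 5 columns.

```nu
# Number of rows and columns in the second dataframe
$df_2 | polars shape

# Column names and their inferred data types
$df_2 | polars schema
```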
Now, with the second dataframe loaded in memory, we can join them using the
column called `int_1` from the left dataframe and the column `int_1` from the
right dataframe:
```nu
$df_1 | polars join $df_2 int_1 int_1
# => ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────┬─────────┬───────────┬───────────┬─────────╮
# => │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │ int_2_x │ float_1_x │ float_2_x │ first_x │
# => ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┼─────────┼───────────┼───────────┼─────────┤