# Summary of the tutorial

### Bogumił Kamiński

Before we finish let us summarize the major functions that DataFrames.jl provides:
1. A data frame is a matrix-like data structure. You can index it just like a matrix. The differences are:
 - you can use strings or `Symbol`s to select columns
 - if you use `!` as the row selector, you get the whole column of the data frame without copying it
2. You can quickly summarize the contents of a data frame using the `describe` function
3. You can add rows to a data frame in-place using `push!`; similarly, `append!` adds multiple rows at once. The `repeat`/`repeat!`, `hcat`, and `vcat` functions are also provided.
4. You can work on a grouped data frame that is created using the `groupby` function. It is a view and works as if you had created a lookup index for a data frame.
5. There are `select`/`select!`/`transform`/`transform!`/`combine` functions that allow you to quickly transform/aggregate columns of a data frame or grouped data frame; there are also `mapcols`/`mapcols!` functions for quickly applying a function to every column of a data frame
6. You can filter rows of a data frame using `filter` and `filter!` functions
7. Use `sort` and `sort!` functions to sort data frames
8. We have not discussed this, but you can join multiple data frames using `innerjoin`, `outerjoin`, `leftjoin`, `rightjoin`, `semijoin`, `antijoin`, and `crossjoin` functions (they work as you would expect if you know SQL)
9. If you want to iterate over rows or columns of a data frame use `eachrow` and `eachcol` functions (we have not discussed them, but they work like their Julia Base counterparts for matrices)
10. You can change the names of columns in a data frame using `rename` and `rename!` functions; to get the names of columns of a data frame use `names` (strings) or `propertynames` (`Symbol`s)
11. To get the number of rows and columns of a data frame use `nrow` and `ncol` functions
12. To flatten nested columns of a data frame use `flatten`
13. You can easily allow/disallow missing values in columns of a data frame using `allowmissing`/`allowmissing!`/`disallowmissing`/`disallowmissing!` functions (similar functionality is provided for making columns categorical using `categorical`/`categorical!` functions)
14. You can drop rows with missing data with `dropmissing`/`dropmissing!` functions
15. You can switch between [long and wide](https://en.wikipedia.org/wiki/Wide_and_narrow_data) representation of a data frame using `stack` and `unstack`
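
The first few items of the list above (indexing, `describe`, `push!`, `groupby` with `combine`, `filter`, and `sort`) can be sketched as follows. This is only a minimal illustration: the data frame `df`, its column names, and its values are made up for the example and are not taken from the tutorial.

```julia
using DataFrames
using Statistics

# Hypothetical toy data, used only for illustration
df = DataFrame(id = 1:4, group = ["a", "b", "a", "b"], value = [1.0, 2.0, 3.0, 4.0])

# Matrix-like indexing; columns can be selected by Symbol or by string
cell = df[2, :value]
part = df[1:2, ["id", "value"]]

# `!` returns the stored column without copying, `:` returns a copy
col_nocopy = df[!, :value]
col_copy = df[:, :value]

# Summary statistics for every column
describe(df)

# Add one row in place
push!(df, (5, "a", 8.0))

# Split-apply-combine: mean of `value` within each `group`
gdf = groupby(df, :group)
agg = combine(gdf, :value => mean => :value_mean)

# Keep rows with value > 2.5 and sort them by value in descending order
big = filter(:value => v -> v > 2.5, df)
sorted = sort(big, :value, rev = true)
```

Note that `col_nocopy` is the very vector stored in `df`, so mutating it changes the data frame, while `col_copy` is independent.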

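The remaining items of the list (joins, renaming, sizes, missing-value handling, iteration, reshaping, and flattening) can be sketched in a similar way. Again, this is a hedged illustration only: the tables `people`, `scores`, and `nested` and all their contents are hypothetical.

```julia
using DataFrames

# Hypothetical toy tables, used only for illustration
people = DataFrame(id = [1, 2, 3], name = ["Ann", "Bob", "Cy"])
scores = DataFrame(id = [1, 2, 4], math = [90, missing, 70], art = [80, 85, 75])

# SQL-style joins on the `id` key
inner = innerjoin(people, scores, on = :id)  # only ids present in both tables
left = leftjoin(people, scores, on = :id)    # every row of `people` is kept

# Rename a column, then inspect the shape and the names
renamed = rename(inner, :name => :person)
shape = (nrow(renamed), ncol(renamed))
colnames = names(renamed)          # strings
colsyms = propertynames(renamed)   # Symbols

# Drop rows that contain missing values, then allow missing again in one column
complete = dropmissing(inner)
relaxed = allowmissing(complete, :math)

# Iterate over rows and over columns
row_names = [row.name for row in eachrow(complete)]
col_types = [eltype(col) for col in eachcol(complete)]

# Wide to long and back again
long = stack(complete, [:math, :art])    # adds `variable` and `value` columns
wide = unstack(long, :variable, :value)

# Expand a column that holds vectors into one row per element
nested = DataFrame(id = [1, 2], tags = [["x", "y"], ["z"]])
flat = flatten(nested, :tags)
```
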
Additionally, we have covered `freqtable` from FreqTables.jl, `@pipe` from Pipe.jl, and `lm` from GLM.jl, which are often useful when wrangling data.

Finally, we have shown how to integrate DataFrames.jl with plotting using PyPlot.jl.

Of course, this course was just an introduction.

You can find a fuller overview of the functionality of DataFrames.jl in:
* the official manual at https://juliadata.github.io/DataFrames.jl/stable/
* a tutorial going through all functionalities of DataFrames.jl at https://github.com/bkamins/Julia-DataFrames-Tutorial
* documentation strings of the respective functions