{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary of the tutorial\n",
    "\n",
    "### Bogumił Kamiński"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before we finish let us summarize the major functions that DataFrames.jl provides:\n",
    "1. data frame is a matrix-like data structure. You can index it just like a matrix. The differences are\n",
    "   - you can use strings or `Symbol`s to select columns\n",
    "   - if you select rows with `!` it selects you whole column of a data frame and passes it to you without copying\n",
    "2. You can quickly summarize the contents of a data frame using the `describe` function\n",
    "3. You can add rows to a data frame in-place using `push!` (similarly `append!` allows you to add multiple rows at the same time) (also `repeat`/`repeat!`, `hcat` and `vcat` are provided)\n",
    "4. You can work on a grouped data frame that is created using the `groupby` function. It is a view and works as-if you have created a lookup index to a data frame.\n",
    "5. There are `select`/`select!`/`transform`/`transform!`/`combine` functions that allow you to quickly transform/aggregate columns of a data frame or grouped data frame; there is also `mapcols`/`mapcols!` functions for quick aggregation of columns of a data frame\n",
    "6. You can filter rows of a data frame using `filter` and `filter!` functions\n",
    "7. Use `sort` and `sort!` functions to sort data frames\n",
    "8. We have not discussed this but you can join multiple data frames using `innerjoin`, `outerjoin`, `leftjoin`, `rightjoin`, `semijoin`, `antijoin`, and `crossjoin` functions (they work as you would expect them if you know SQL)\n",
    "9. If you want to iterate rows or columns of a data frame use `eachrow` and `eachcol` functions (we have not discussed them, but they work exactly like in Julia Base)\n",
    "10. You can change names of columns in a data frame using `rename` and `rename!` functions; to get names of columns of a data frame use `names` (strings) or `propertynames` (`Symbol`s)\n",
    "11. To get number of rows and columns of a data frame use `nrow` and `ncol` functions\n",
    "12. To flatten nested columns of a data frame use `flatten`\n",
    "13. You can easily allow/disallow missing values in columns of a data frame using `allowmising`/`allowmissing!`/`disallowmising`/`disallowmissing!` functions (similar functionality is provided for making columns categorical using `categorical`/`categorical!` functions)\n",
    "14. You can drop rows with missing data with `dropmissing`/`dropmissing!` functions\n",
    "15. You can switch between [long and wide](https://en.wikipedia.org/wiki/Wide_and_narrow_data) representation of a data frame using `stack` and `unstack`"
   ]
  },
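  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick reminder, several of the items above can be sketched in a few lines. This is only a minimal illustration: the data frame `df` and its columns `id`, `grp`, and `x` are made up for this example and do not come from the tutorial data.\n",
    "\n",
    "```julia\n",
    "using DataFrames\n",
    "\n",
    "# A small made-up data frame used only for illustration.\n",
    "df = DataFrame(id = 1:4, grp = [:a, :b, :a, :b], x = [4.0, 3.0, 2.0, 1.0])\n",
    "\n",
    "df[2, :x]          # matrix-like indexing: row 2 of column :x\n",
    "col = df[!, :x]    # `!` returns the stored column without copying\n",
    "cpy = df[:, :x]    # `:` returns a copy of the column\n",
    "\n",
    "describe(df)       # quick summary of every column\n",
    "\n",
    "push!(df, (5, :a, 0.5))     # add one row in place\n",
    "\n",
    "gdf = groupby(df, :grp)     # grouped view of df\n",
    "combine(gdf, :x => sum)     # one aggregated row per group\n",
    "\n",
    "filter(:x => >(1.0), df)    # keep rows where x > 1.0\n",
    "sort(df, :x)                # new data frame sorted by x\n",
    "nrow(df), ncol(df)          # number of rows and columns\n",
    "```\n",
    "\n",
    "Note that `filter` and `sort` return new data frames, while `push!` (like the other functions ending in `!`) modifies `df` in place."
   ]
  },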
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Additionally we have covered `freqtable` from FreqTables.jl, `@pipe` from Pipe.jl, and `lm` from GLM.jl packages that are often useful when wrangling data.\n",
    "\n",
    "Finally we have shown how to integrate DataFrames.jl with plotting using PyPlot.jl."
   ]
  },
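  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three helper functions mentioned above could be combined roughly as follows. This is a hedged sketch, not the code used earlier in the tutorial: the data frame and its columns `grp`, `x`, and `y` are invented for illustration, and the FreqTables.jl, Pipe.jl, and GLM.jl packages are assumed to be installed.\n",
    "\n",
    "```julia\n",
    "using DataFrames, FreqTables, Pipe, GLM\n",
    "\n",
    "# Made-up data used only for illustration.\n",
    "df = DataFrame(grp = [1, 2, 1, 2, 1],\n",
    "               x = [1.0, 2.0, 3.0, 4.0, 5.0],\n",
    "               y = [2.1, 3.9, 6.2, 7.8, 10.1])\n",
    "\n",
    "# Frequency table of the values in column grp.\n",
    "freqtable(df, :grp)\n",
    "\n",
    "# Chain operations; `_` marks where the piped value is inserted.\n",
    "@pipe df |> filter(:x => >(1.0), _) |> sort(_, :y, rev = true)\n",
    "\n",
    "# Fit a simple linear regression of y on x.\n",
    "model = lm(@formula(y ~ x), df)\n",
    "coef(model)    # estimated intercept and slope\n",
    "```"
   ]
  },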
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course this course was just an introduction.\n",
    "\n",
    "You can find reviews of functionality of DataFrames.jl in:\n",
    "* an official manual at https://juliadata.github.io/DataFrames.jl/stable/\n",
    "* a tutorial going through all functionalities of DataFrames.jl at https://github.com/bkamins/Julia-DataFrames-Tutorial\n",
    "* documentation strings of the respective funcions"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.4.1",
   "language": "julia",
   "name": "julia-1.4"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.4.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}