{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with text files\n", "\n", "### Bogumił Kamiński\n", "\n", "In this notebook we will show how one can interact with CSV files when working with DataFrames." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using DataFrames" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "using CSV" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "using Statistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we download the data set we will work with and save it as the file `auto.txt` in the current working directory." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"auto.txt\"" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "download(\"https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data-original\",\n", " \"auto.txt\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us inspect the contents of the file using the `readlines` function:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "406-element Array{String,1}:\n", " \"18.0 8. 307.0 130.0 3504. 12.0 70. 1.\\t\\\"chevrolet chevelle malibu\\\"\"\n", " \"15.0 8. 350.0 165.0 3693. 11.5 70. 1.\\t\\\"buick skylark 320\\\"\"\n", " \"18.0 8. 318.0 150.0 3436. 11.0 70. 1.\\t\\\"plymouth satellite\\\"\"\n", " \"16.0 8. 304.0 150.0 3433. 12.0 70. 1.\\t\\\"amc rebel sst\\\"\"\n", " \"17.0 8. 302.0 140.0 3449. 10.5 70. 1.\\t\\\"ford torino\\\"\"\n", " \"15.0 8. 429.0 198.0 4341. 10.0 70. 1.\\t\\\"ford galaxie 500\\\"\"\n", " \"14.0 8. 454.0 220.0 4354. 9.0 70. 1.\\t\\\"chevrolet impala\\\"\"\n", " \"14.0 8. 440.0 215.0 4312. 8.5 70. 1.\\t\\\"plymouth fury iii\\\"\"\n", " \"14.0 8. 455.0 225.0 4425. 10.0 70. 
1.\\t\\\"pontiac catalina\\\"\"\n", " \"15.0 8. 390.0 190.0 3850. 8.5 70. 1.\\t\\\"amc ambassador dpl\\\"\"\n", " \"NA 4. 133.0 115.0 3090. 17.5 70. 2.\\t\\\"citroen ds-21 pallas\\\"\"\n", " \"NA 8. 350.0 165.0 4142. 11.5 70. 1.\\t\\\"chevrolet chevelle concours (sw)\\\"\"\n", " \"NA 8. 351.0 153.0 4034. 11.0 70. 1.\\t\\\"ford torino (sw)\\\"\"\n", " ⋮\n", " \"25.0 6. 181.0 110.0 2945. 16.4 82. 1.\\t\\\"buick century limited\\\"\"\n", " \"38.0 6. 262.0 85.00 3015. 17.0 82. 1.\\t\\\"oldsmobile cutlass ciera (diesel)\\\"\"\n", " \"26.0 4. 156.0 92.00 2585. 14.5 82. 1.\\t\\\"chrysler lebaron medallion\\\"\"\n", " \"22.0 6. 232.0 112.0 2835 14.7 82. 1.\\t\\\"ford granada l\\\"\"\n", " \"32.0 4. 144.0 96.00 2665. 13.9 82. 3.\\t\\\"toyota celica gt\\\"\"\n", " \"36.0 4. 135.0 84.00 2370. 13.0 82. 1.\\t\\\"dodge charger 2.2\\\"\"\n", " \"27.0 4. 151.0 90.00 2950. 17.3 82. 1.\\t\\\"chevrolet camaro\\\"\"\n", " \"27.0 4. 140.0 86.00 2790. 15.6 82. 1.\\t\\\"ford mustang gl\\\"\"\n", " \"44.0 4. 97.00 52.00 2130. 24.6 82. 2.\\t\\\"vw pickup\\\"\"\n", " \"32.0 4. 135.0 84.00 2295. 11.6 82. 1.\\t\\\"dodge rampage\\\"\"\n", " \"28.0 4. 120.0 79.00 2625. 18.6 82. 1.\\t\\\"ford ranger\\\"\"\n", " \"31.0 4. 119.0 82.00 2720. 19.4 82. 
1.\\t\\\"chevy s-10\\\"\"" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "readlines(\"auto.txt\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this exercise we have chosen a typical file, which in practice means that things are not trivial.\n", "* first, we note that it has no header with column names;\n", "* second, we see that the last column is tab-separated, while the earlier columns are separated by a varying number of spaces;\n", "* finally, we see that missing values are encoded as \"NA\" in this file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will show several ways to parse the file into a `DataFrame`.\n", "The first one is to replace tabs with spaces in the source data, and then load the result using `CSV.File`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start by reading the contents of the file into a single string." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"18.0 8. 307.0 130.0 3504. 12.0 70. 1.\\t\\\"chevrolet chevelle malibu\\\"\\n15.0 8. 350.0 165.0 3693. 11.5 70. 1.\\t\\\"buick skylark 320\\\"\\n18.0 8. 318.0 150.0 3436. 11.0 70. 1.\\t\\\"plymouth satellite\\\"\\n16.0 8. 304.0 150.0 3433. 12.0 70. 1.\\t\\\"amc rebel sst\\\"\\n17.0 8. 302.0 140.0 3449. 10.5 70. 1.\\t\\\"ford torino\\\"\\n15.0 8. 429.0 198.0 4341. 10.0 70. 1.\\t\\\"ford galaxie 500\\\"\\n14.0 8. 454.0 220.0 4354. 9.0 70. 1.\\t\\\"chevrolet impala\\\"\\n14.0 8. 440.0 215.0 4312. 8.5 70. 1.\\t\\\"plymouth fury iii\\\"\\n14.0 8. 455.0 225.0 4425. 10.0 70. 1.\\t\\\"pontiac catalina\\\"\\n15.0 8. 390.0 190.0 3850. 8.5 70. 1.\\t\\\"amc ambassador dpl\\\"\\nNA 4. 133.0 115.0 3090. 17.5 70. 2.\\t\\\"citroen ds-21 pallas\\\"\\nNA 8. 350.0 165.0 4142. 11.5 70. 1.\\t\\\"chevrolet chevelle concours (sw)\\\"\\nNA 8. 351.0 153.0 4034. 11.0 70. 1.\\t\\\"ford torino (sw)\\\"\\nNA 8. 383.0 175.0 4166. 10.5 70. 
1.\\t\\\"plymouth satellite (sw)\\\"\\nNA 8. 360.0 175.0 3850. 11.0 70. 1.\\t\\\"amc rebel sst (sw)\\\"\\n15.0 8. 383.0 170.0 3563. 10.0 70. 1.\\t\\\"dodge challenger se\\\"\\n14.0 8. 340.0 160.0 3609. 8.0 70. 1.\\t\\\"plymouth 'cuda 340\\\"\\nNA 8. 302.0 140.0 3353. 8.0 70. 1.\\t\\\"ford mustang boss 302\\\"\\n15.0 8. 400.0 150.0 3761. 9.5 70. 1.\\t\\\"chevrolet monte carlo\\\"\\n14.0 8. 455.0 225.0 3086. 10.0 70. 1.\\t\\\"buick estate wagon (sw)\\\"\\n24.0 4. 113.0 95.00 2372. 15.0 70. 3.\\t\\\"toyota corona mark ii\\\"\\n22.0 6. 198.0 95.00 2833. 15.5 70. 1.\\t\\\"plymouth duster\\\"\\n18.0 6. 199.0 97.00 2774. 15.5 70. 1.\\t\\\"amc hornet\\\"\\n21.0 6. 200.0 85.00 2587. 16.0 70. 1.\\t\\\"ford maverick\\\"\\n27.0 4. 97.00 88.00 2130. 14.5 70. 3.\\t\\\"datsun pl510\\\"\\n26.0 4. 97.00 46.00 1835. 20.5 70. 2.\\t\\\"volkswagen 1131 deluxe sedan\\\"\\n25.0 4. 110.0 87.00 2672. 17.5 70. 2.\\t\\\"peugeot 504\\\"\\n24.0 4. 107.0 90.00 2430. 14.5 70. 2.\\t\\\"audi 100 ls\\\"\\n25.0 4. 104.0 95.00 2375. 17.5 70. 2.\\t\\\"saab 99e\\\"\\n26.0 4. 121.0 113.0 2234. 12.5 70. 2.\\t\\\"bmw 2002\\\"\\n21.0 6. 199.0 90.00 2648. 15.0 70. 1.\\t\\\"amc gremlin\\\"\\n10.0 8. 360.0 215.0 4615. 14.0 70. 1.\\t\\\"ford f250\\\"\\n10.0 8. 307.0 200.0 4376. 15.0 70. 1.\\t\\\"chevy c20\\\"\\n11.0 8. 318.0 210.0 4382. 13.5 70. 1.\\t\\\"dodge d200\\\"\\n9.0 8. 304.0 193.0 4732. 18.5 70. 1.\\t\\\"hi 1200d\\\"\\n27.0 4. 97.00 88.00 2130. 14.5 71. 3.\\t\\\"datsun pl510\\\"\\n28.0 4. 140.0 90.00 2264. 15.5 71. 1.\\t\\\"chevrolet vega 2300\\\"\\n25.0 4. 113.0 95.00 2228. 14.0 71. 3.\\t\\\"toyota corona\\\"\\n25.0 4. 98.00 NA 2046. 19.0 71. 1.\\t\\\"ford pinto\\\"\\nNA 4. 97.00 48.00 1978. 20.0 71. 2.\\t\\\"volkswagen super beetle 117\\\"\\n19.0 6. 232.0 100.0 2634. 13.0 71. 1.\\t\\\"amc gremlin\\\"\\n16.0 6. 225.0 105.0 3439. 15.5 71. 1.\\t\\\"plymouth satellite custom\\\"\\n17.0 6. 250.0 100.0 3329. 15.5 71. 1.\\t\\\"chevrolet chevelle malibu\\\"\\n19.0 6. 250.0 88.00 3302. 15.5 71. 
1.\\t\\\"ford torino 500\\\"\\n18.0 6. 232.0 100.0 3288. 15.5 71. 1.\\t\\\"amc matador\\\"\\n14.0 8. 350.0 165.0 4209. 12.0 71. 1.\\t\\\"chevrolet impala\\\"\\n14.0 8. 400.0 175.0 4464. 11.5 71. 1.\\t\\\"pontiac catalina brougham\\\"\\n14.0 8. 351.0 153.0 4154. 13.5 71. 1.\\t\\\"ford galaxie 500\\\"\\n14.0 8. 318.0 150.0 4096. 13.0 71. 1.\\t\\\"plymouth fury iii\\\"\\n12.0 8. 383.0 180.0 4955. 11.5 71. 1.\\t\\\"dodge monaco (sw)\\\"\\n13.0 8. 400.0 170.0 4746. 12.0 71. 1.\\t\\\"ford country squire (sw)\\\"\\n13.0 8. 400.0 175.0 5140. 12.0 71. 1.\\t\\\"pontiac safari (sw)\\\"\\n18.0 6. 258.0 110.0 2962. 13.5 71. 1.\\t\\\"amc hornet sportabout (sw)\\\"\\n22.0 4. 140.0 72.00 2408. 19.0 71. 1.\\t\\\"chevrolet vega (sw)\\\"\\n19.0 6. 250.0 100.0 3282. 15.0 71. 1.\\t\\\"pontiac firebird\\\"\\n18.0 6. 250.0 88.00 3139. 14.5 71. 1.\\t\\\"ford mustang\\\"\\n23.0 4. 122.0 86.00 2220. 14.0 71. 1.\\t\\\"mercury capri 2000\\\"\\n28.0 4. 116.0 90.00 2123. 14.0 71. 2.\\t\\\"opel 1900\\\"\\n30.0 4. 79.00 70.00 2074. 19.5 71. 2.\\t\\\"peugeot 304\\\"\\n30.0 4. 88.00 76.00 2065. 14.5 71. 2.\\t\\\"fiat 124b\\\"\\n31.0 4. 71.00 65.00 1773. 19.0 71. 3.\\t\\\"toyota corolla 1200\\\"\\n35.0 4. 72.00 69.00 1613. 18.0 71. 3.\\t\\\"datsun 1200\\\"\\n27.0 4. 97.00 60.00 1834. 19.0 71. 2.\\t\\\"volkswagen model 111\\\"\\n26.0 4. 91.00 70.00 1955. 20.5 71. 1.\\t\\\"plymouth cricket\\\"\\n24.0 4. 113.0 95.00 2278. 15.5 72. 3.\\t\\\"toyota corona hardtop\\\"\\n25.0 4. 97.50 80.00 2126. 17.0 72. 1.\\t\\\"dodge colt hardtop\\\"\\n23.0 4. 97.00 54.00 2254. 23.5 72. 2.\\t\\\"volkswagen type 3\\\"\\n20.0 4. 140.0 90.00 2408. 19.5 72. 1.\\t\\\"chevrolet vega\\\"\\n21.0 4. 122.0 86.00 2226. 16.5 72. 1.\\t\\\"ford pinto runabout\\\"\\n13.0 8. 350.0 165.0 4274. 12.0 72. 1.\\t\\\"chevrolet impala\\\"\\n14.0 8. 400.0 175.0 4385. 12.0 72. 1.\\t\\\"pontiac catalina\\\"\\n15.0 8. 318.0 150.0 4135. 13.5 72. 1.\\t\\\"plymouth fury iii\\\"\\n14.0 8. 351.0 153.0 4129. 13.0 72. 
1.\\t\\\"ford galaxie 500\\\"\\n17.0 8. 304.0 150.0 3672. 11.5 72. 1.\\t\\\"amc ambassador sst\\\"\\n11.0 8. 429.0 208.0 4633. 11.0 72. 1.\\t\\\"mercury marquis\\\"\\n13.0 8. 350.0 155.0 4502. 13.5 72. 1.\\t\\\"buick lesabre custom\\\"\\n12.0 8. 350.0 160.0 4456. 13.5 72. 1.\\t\\\"oldsmobile delta 88 royale\\\"\\n13.0 8. 400.0 190.0 4422. 12.5 72. 1.\\t\\\"chrysler newport royal\\\"\\n19.0 3. 70.00 97.00 2330. 13.5 72. 3.\\t\\\"mazda rx2 coupe\\\"\\n15.0 8. 304.0 150.0 3892. 12.5 72. 1.\\t\\\"amc matador (sw)\\\"\\n13.0 8. 307.0 130.0 4098. 14.0 72. 1.\\t\\\"chevrolet chevelle concours (sw)\\\"\\n13.0 8. 302.0 140.0 4294. 16.0 72. 1.\\t\\\"ford gran torino (sw)\\\"\\n14.0 8. 318.0 150.0 4077. 14.0 72. 1.\\t\\\"plymouth satellite custom (sw)\\\"\\n18.0 4. 121.0 112.0 2933. 14.5 72. 2.\\t\\\"volvo 145e (sw)\\\"\\n22.0 4. 121.0 76.00 2511. 18.0 72. 2.\\t\\\"volkswagen 411 (sw)\\\"\\n21.0 4. 120.0 87.00 2979. 19.5 72. 2.\\t\\\"peugeot 504 (sw)\\\"\\n26.0 4. 96.00 69.00 2189. 18.0 72. 2.\\t\\\"renault 12 (sw)\\\"\\n22.0 4. 122.0 86.00 2395. 16.0 72. 1.\\t\\\"ford pinto (sw)\\\"\\n28.0 4. 97.00 92.00 2288. 17.0 72. 3.\\t\\\"datsun 510 (sw)\\\"\\n23.0 4. 120.0 97.00 2506. 14.5 72. 3.\\t\\\"toyouta corona mark ii (sw)\\\"\\n28.0 4. 98.00 80.00 2164. 15.0 72. 1.\\t\\\"dodge colt (sw)\\\"\\n27.0 4. 97.00 88.00 2100. 16.5 72. 3.\\t\\\"toyota corolla 1600 (sw)\\\"\\n13.0 8. 350.0 175.0 4100. 13.0 73. 1.\\t\\\"buick century 350\\\"\\n14.0 8. 304.0 150.0 3672. 11.5 73. 1.\\t\\\"amc matador\\\"\\n13.0 8. 350.0 145.0 3988. 13.0 73. 1.\\t\\\"chevrolet malibu\\\"\\n14.0 8. 302.0 137.0 4042. 14.5 73. 1.\\t\\\"ford gran torino\\\"\\n15.0 8. 318.0 150.0 3777. 12.5 73. 1.\\t\\\"dodge coronet custom\\\"\\n12.0 8. 429.0 198.0 4952. 11.5 73. 1.\\t\\\"mercury marquis brougham\\\"\\n13.0 8. 400.0 150.0 4464. 12.0 73. 1.\\t\\\"chevrolet caprice classic\\\"\\n13.0 8. 351.0 158.0 4363. 13.0 73. 1.\\t\\\"ford ltd\\\"\\n14.0 8. 318.0 150.0 4237. 14.5 73. 
1.\\t\\\"plymouth fury gran sedan\\\"\\n13.0 8. 440.0 215.0 4735. 11.0 73. 1.\\t\\\"chrysler new yorker brougham\\\"\\n12.0 8. 455.0 225.0 4951. 11.0 73. 1.\\t\\\"buick electra 225 custom\\\"\\n13.0 8. 360.0 175.0 3821. 11.0 73. 1.\\t\\\"amc ambassador brougham\\\"\\n18.0 6. 225.0 105.0 3121. 16.5 73. 1.\\t\\\"plymouth valiant\\\"\\n16.0 6. 250.0 100.0 3278. 18.0 73. 1.\\t\\\"chevrolet nova custom\\\"\\n18.0 6. 232.0 100.0 2945. 16.0 73. 1.\\t\\\"amc hornet\\\"\\n18.0 6. 250.0 88.00 3021. 16.5 73. 1.\\t\\\"ford maverick\\\"\\n23.0 6. 198.0 95.00 2904. 16.0 73. 1.\\t\\\"plymouth duster\\\"\\n26.0 4. 97.00 46.00 1950. 21.0 73. 2.\\t\\\"volkswagen super beetle\\\"\\n11.0 8. 400.0 150.0 4997. 14.0 73. 1.\\t\\\"chevrolet impala\\\"\\n12.0 8. 400.0 167.0 4906. 12.5 73. 1.\\t\\\"ford country\\\"\\n13.0 8. 360.0 170.0 4654. 13.0 73. 1.\\t\\\"plymouth custom suburb\\\"\\n12.0 8. 350.0 180.0 4499. 12.5 73. 1.\\t\\\"oldsmobile vista cruiser\\\"\\n18.0 6. 232.0 100.0 2789. 15.0 73. 1.\\t\\\"amc gremlin\\\"\\n20.0 4. 97.00 88.00 2279. 19.0 73. 3.\\t\\\"toyota carina\\\"\\n21.0 4. 140.0 72.00 2401. 19.5 73. 1.\\t\\\"chevrolet vega\\\"\\n22.0 4. 108.0 94.00 2379. 16.5 73. 3.\\t\\\"datsun 610\\\"\\n18.0 3. 70.00 90.00 2124. 13.5 73. 3.\\t\\\"maxda rx3\\\"\\n19.0 4. 122.0 85.00 2310. 18.5 73. 1.\\t\\\"ford pinto\\\"\\n21.0 6. 155.0 107.0 2472. 14.0 73. 1.\\t\\\"mercury capri v6\\\"\\n26.0 4. 98.00 90.00 2265. 15.5 73. 2.\\t\\\"fiat 124 sport coupe\\\"\\n15.0 8. 350.0 145.0 4082. 13.0 73. 1.\\t\\\"chevrolet monte carlo s\\\"\\n16.0 8. 400.0 230.0 4278. 9.50 73. 1.\\t\\\"pontiac grand prix\\\"\\n29.0 4. 68.00 49.00 1867. 19.5 73. 2.\\t\\\"fiat 128\\\"\\n24.0 4. 116.0 75.00 2158. 15.5 73. 2.\\t\\\"opel manta\\\"\\n20.0 4. 114.0 91.00 2582. 14.0 73. 2.\\t\\\"audi 100ls\\\"\\n19.0 4. 121.0 112.0 2868. 15.5 73. 2.\\t\\\"volvo 144ea\\\"\\n15.0 8. 318.0 150.0 3399. 11.0 73. 1.\\t\\\"dodge dart custom\\\"\\n24.0 4. 121.0 110.0 2660. 14.0 73. 2.\\t\\\"saab 99le\\\"\\n20.0 6. 
156.0 122.0 2807. 13.5 73. 3.\\t\\\"toyota mark ii\\\"\\n11.0 8. 350.0 180.0 3664. 11.0 73. 1.\\t\\\"oldsmobile omega\\\"\\n20.0 6. 198.0 95.00 3102. 16.5 74. 1.\\t\\\"plymouth duster\\\"\\n21.0 6. 200.0 NA 2875. 17.0 74. 1.\\t\\\"ford maverick\\\"\\n19.0 6. 232.0 100.0 2901. 16.0 74. 1.\\t\\\"amc hornet\\\"\\n15.0 6. 250.0 100.0 3336. 17.0 74. 1.\\t\\\"chevrolet nova\\\"\\n31.0 4. 79.00 67.00 1950. 19.0 74. 3.\\t\\\"datsun b210\\\"\\n26.0 4. 122.0 80.00 2451. 16.5 74. 1.\\t\\\"ford pinto\\\"\\n32.0 4. 71.00 65.00 1836. 21.0 74. 3.\\t\\\"toyota corolla 1200\\\"\\n25.0 4. 140.0 75.00 2542. 17.0 74. 1.\\t\\\"chevrolet vega\\\"\\n16.0 6. 250.0 100.0 3781. 17.0 74. 1.\\t\\\"chevrolet chevelle malibu classic\\\"\\n16.0 6. 258.0 110.0 3632. 18.0 74. 1.\\t\\\"amc matador\\\"\\n18.0 6. 225.0 105.0 3613. 16.5 74. 1.\\t\\\"plymouth satellite sebring\\\"\\n16.0 8. 302.0 140.0 4141. 14.0 74. 1.\\t\\\"ford gran torino\\\"\\n13.0 8. 350.0 150.0 4699. 14.5 74. 1.\\t\\\"buick century luxus (sw)\\\"\\n14.0 8. 318.0 150.0 4457. 13.5 74. 1.\\t\\\"dodge coronet custom (sw)\\\"\\n14.0 8. 302.0 140.0 4638. 16.0 74. 1.\\t\\\"ford gran torino (sw)\\\"\\n14.0 8. 304.0 150.0 4257. 15.5 74. 1.\\t\\\"amc matador (sw)\\\"\\n29.0 4. 98.00 83.00 2219. 16.5 74. 2.\\t\\\"audi fox\\\"\\n26.0 4. 79.00 67.00 1963. 15.5 74. 2.\\t\\\"volkswagen dasher\\\"\\n26.0 4. 97.00 78.00 2300. 14.5 74. 2.\\t\\\"opel manta\\\"\\n31.0 4. 76.00 52.00 1649. 16.5 74. 3.\\t\\\"toyota corona\\\"\\n32.0 4. 83.00 61.00 2003. 19.0 74. 3.\\t\\\"datsun 710\\\"\\n28.0 4. 90.00 75.00 2125. 14.5 74. 1.\\t\\\"dodge colt\\\"\\n24.0 4. 90.00 75.00 2108. 15.5 74. 2.\\t\\\"fiat 128\\\"\\n26.0 4. 116.0 75.00 2246. 14.0 74. 2.\\t\\\"fiat 124 tc\\\"\\n24.0 4. 120.0 97.00 2489. 15.0 74. 3.\\t\\\"honda civic\\\"\\n26.0 4. 108.0 93.00 2391. 15.5 74. 3.\\t\\\"subaru\\\"\\n31.0 4. 79.00 67.00 2000. 16.0 74. 2.\\t\\\"fiat x1.9\\\"\\n19.0 6. 225.0 95.00 3264. 16.0 75. 1.\\t\\\"plymouth valiant custom\\\"\\n18.0 6. 250.0 105.0 3459. 16.0 75. 
1.\\t\\\"chevrolet nova\\\"\\n15.0 6. 250.0 72.00 3432. 21.0 75. 1.\\t\\\"mercury monarch\\\"\\n15.0 6. 250.0 72.00 3158. 19.5 75. 1.\\t\\\"ford maverick\\\"\\n16.0 8. 400.0 170.0 4668. 11.5 75. 1.\\t\\\"pontiac catalina\\\"\\n15.0 8. 350.0 145.0 4440. 14.0 75. 1.\\t\\\"chevrolet bel air\\\"\\n16.0 8. 318.0 150.0 4498. 14.5 75. 1.\\t\\\"plymouth grand fury\\\"\\n14.0 8. 351.0 148.0 4657. 13.5 75. 1.\\t\\\"ford ltd\\\"\\n17.0 6. 231.0 110.0 3907. 21.0 75. 1.\\t\\\"buick century\\\"\\n16.0 6. 250.0 105.0 3897. 18.5 75. 1.\\t\\\"chevroelt chevelle malibu\\\"\\n15.0 6. 258.0 110.0 3730. 19.0 75. 1.\\t\\\"amc matador\\\"\\n18.0 6. 225.0 95.00 3785. 19.0 75. 1.\\t\\\"plymouth fury\\\"\\n21.0 6. 231.0 110.0 3039. 15.0 75. 1.\\t\\\"buick skyhawk\\\"\\n20.0 8. 262.0 110.0 3221. 13.5 75. 1.\\t\\\"chevrolet monza 2+2\\\"\\n13.0 8. 302.0 129.0 3169. 12.0 75. 1.\\t\\\"ford mustang ii\\\"\\n29.0 4. 97.00 75.00 2171. 16.0 75. 3.\\t\\\"toyota corolla\\\"\\n23.0 4. 140.0 83.00 2639. 17.0 75. 1.\\t\\\"ford pinto\\\"\\n20.0 6. 232.0 100.0 2914. 16.0 75. 1.\\t\\\"amc gremlin\\\"\\n23.0 4. 140.0 78.00 2592. 18.5 75. 1.\\t\\\"pontiac astro\\\"\\n24.0 4. 134.0 96.00 2702. 13.5 75. 3.\\t\\\"toyota corona\\\"\\n25.0 4. 90.00 71.00 2223. 16.5 75. 2.\\t\\\"volkswagen dasher\\\"\\n24.0 4. 119.0 97.00 2545. 17.0 75. 3.\\t\\\"datsun 710\\\"\\n18.0 6. 171.0 97.00 2984. 14.5 75. 1.\\t\\\"ford pinto\\\"\\n29.0 4. 90.00 70.00 1937. 14.0 75. 2.\\t\\\"volkswagen rabbit\\\"\\n19.0 6. 232.0 90.00 3211. 17.0 75. 1.\\t\\\"amc pacer\\\"\\n23.0 4. 115.0 95.00 2694. 15.0 75. 2.\\t\\\"audi 100ls\\\"\\n23.0 4. 120.0 88.00 2957. 17.0 75. 2.\\t\\\"peugeot 504\\\"\\n22.0 4. 121.0 98.00 2945. 14.5 75. 2.\\t\\\"volvo 244dl\\\"\\n25.0 4. 121.0 115.0 2671. 13.5 75. 2.\\t\\\"saab 99le\\\"\\n33.0 4. 91.00 53.00 1795. 17.5 75. 3.\\t\\\"honda civic cvcc\\\"\\n28.0 4. 107.0 86.00 2464. 15.5 76. 2.\\t\\\"fiat 131\\\"\\n25.0 4. 116.0 81.00 2220. 16.9 76. 2.\\t\\\"opel 1900\\\"\\n25.0 4. 140.0 92.00 2572. 14.9 76. 
1.\\t\\\"capri ii\\\"\\n26.0 4. 98.00 79.00 2255. 17.7 76. 1.\\t\\\"dodge colt\\\"\\n27.0 4. 101.0 83.00 2202. 15.3 76. 2.\\t\\\"renault 12tl\\\"\\n17.5 8. 305.0 140.0 4215. 13.0 76. 1.\\t\\\"chevrolet chevelle malibu classic\\\"\\n16.0 8. 318.0 150.0 4190. 13.0 76. 1.\\t\\\"dodge coronet brougham\\\"\\n15.5 8. 304.0 120.0 3962. 13.9 76. 1.\\t\\\"amc matador\\\"\\n14.5 8. 351.0 152.0 4215. 12.8 76. 1.\\t\\\"ford gran torino\\\"\\n22.0 6. 225.0 100.0 3233. 15.4 76. 1.\\t\\\"plymouth valiant\\\"\\n22.0 6. 250.0 105.0 3353. 14.5 76. 1.\\t\\\"chevrolet nova\\\"\\n24.0 6. 200.0 81.00 3012. 17.6 76. 1.\\t\\\"ford maverick\\\"\\n22.5 6. 232.0 90.00 3085. 17.6 76. 1.\\t\\\"amc hornet\\\"\\n29.0 4. 85.00 52.00 2035. 22.2 76. 1.\\t\\\"chevrolet chevette\\\"\\n24.5 4. 98.00 60.00 2164. 22.1 76. 1.\\t\\\"chevrolet woody\\\"\\n29.0 4. 90.00 70.00 1937. 14.2 76. 2.\\t\\\"vw rabbit\\\"\\n33.0 4. 91.00 53.00 1795. 17.4 76. 3.\\t\\\"honda civic\\\"\\n20.0 6. 225.0 100.0 3651. 17.7 76. 1.\\t\\\"dodge aspen se\\\"\\n18.0 6. 250.0 78.00 3574. 21.0 76. 1.\\t\\\"ford granada ghia\\\"\\n18.5 6. 250.0 110.0 3645. 16.2 76. 1.\\t\\\"pontiac ventura sj\\\"\\n17.5 6. 258.0 95.00 3193. 17.8 76. 1.\\t\\\"amc pacer d/l\\\"\\n29.5 4. 97.00 71.00 1825. 12.2 76. 2.\\t\\\"volkswagen rabbit\\\"\\n32.0 4. 85.00 70.00 1990. 17.0 76. 3.\\t\\\"datsun b-210\\\"\\n28.0 4. 97.00 75.00 2155. 16.4 76. 3.\\t\\\"toyota corolla\\\"\\n26.5 4. 140.0 72.00 2565. 13.6 76. 1.\\t\\\"ford pinto\\\"\\n20.0 4. 130.0 102.0 3150. 15.7 76. 2.\\t\\\"volvo 245\\\"\\n13.0 8. 318.0 150.0 3940. 13.2 76. 1.\\t\\\"plymouth volare premier v8\\\"\\n19.0 4. 120.0 88.00 3270. 21.9 76. 2.\\t\\\"peugeot 504\\\"\\n19.0 6. 156.0 108.0 2930. 15.5 76. 3.\\t\\\"toyota mark ii\\\"\\n16.5 6. 168.0 120.0 3820. 16.7 76. 2.\\t\\\"mercedes-benz 280s\\\"\\n16.5 8. 350.0 180.0 4380. 12.1 76. 1.\\t\\\"cadillac seville\\\"\\n13.0 8. 350.0 145.0 4055. 12.0 76. 1.\\t\\\"chevy c10\\\"\\n13.0 8. 302.0 130.0 3870. 15.0 76. 1.\\t\\\"ford f108\\\"\\n13.0 8. 
318.0 150.0 3755. 14.0 76. 1.\\t\\\"dodge d100\\\"\\n31.5 4. 98.00 68.00 2045. 18.5 77. 3.\\t\\\"honda accord cvcc\\\"\\n30.0 4. 111.0 80.00 2155. 14.8 77. 1.\\t\\\"buick opel isuzu deluxe\\\"\\n36.0 4. 79.00 58.00 1825. 18.6 77. 2.\\t\\\"renault 5 gtl\\\"\\n25.5 4. 122.0 96.00 2300. 15.5 77. 1.\\t\\\"plymouth arrow gs\\\"\\n33.5 4. 85.00 70.00 1945. 16.8 77. 3.\\t\\\"datsun f-10 hatchback\\\"\\n17.5 8. 305.0 145.0 3880. 12.5 77. 1.\\t\\\"chevrolet caprice classic\\\"\\n17.0 8. 260.0 110.0 4060. 19.0 77. 1.\\t\\\"oldsmobile cutlass supreme\\\"\\n15.5 8. 318.0 145.0 4140. 13.7 77. 1.\\t\\\"dodge monaco brougham\\\"\\n15.0 8. 302.0 130.0 4295. 14.9 77. 1.\\t\\\"mercury cougar brougham\\\"\\n17.5 6. 250.0 110.0 3520. 16.4 77. 1.\\t\\\"chevrolet concours\\\"\\n20.5 6. 231.0 105.0 3425. 16.9 77. 1.\\t\\\"buick skylark\\\"\\n19.0 6. 225.0 100.0 3630. 17.7 77. 1.\\t\\\"plymouth volare custom\\\"\\n18.5 6. 250.0 98.00 3525. 19.0 77. 1.\\t\\\"ford granada\\\"\\n16.0 8. 400.0 180.0 4220. 11.1 77. 1.\\t\\\"pontiac grand prix lj\\\"\\n15.5 8. 350.0 170.0 4165. 11.4 77. 1.\\t\\\"chevrolet monte carlo landau\\\"\\n15.5 8. 400.0 190.0 4325. 12.2 77. 1.\\t\\\"chrysler cordoba\\\"\\n16.0 8. 351.0 149.0 4335. 14.5 77. 1.\\t\\\"ford thunderbird\\\"\\n29.0 4. 97.00 78.00 1940. 14.5 77. 2.\\t\\\"volkswagen rabbit custom\\\"\\n24.5 4. 151.0 88.00 2740. 16.0 77. 1.\\t\\\"pontiac sunbird coupe\\\"\\n26.0 4. 97.00 75.00 2265. 18.2 77. 3.\\t\\\"toyota corolla liftback\\\"\\n25.5 4. 140.0 89.00 2755. 15.8 77. 1.\\t\\\"ford mustang ii 2+2\\\"\\n30.5 4. 98.00 63.00 2051. 17.0 77. 1.\\t\\\"chevrolet chevette\\\"\\n33.5 4. 98.00 83.00 2075. 15.9 77. 1.\\t\\\"dodge colt m/m\\\"\\n30.0 4. 97.00 67.00 1985. 16.4 77. 3.\\t\\\"subaru dl\\\"\\n30.5 4. 97.00 78.00 2190. 14.1 77. 2.\\t\\\"volkswagen dasher\\\"\\n22.0 6. 146.0 97.00 2815. 14.5 77. 3.\\t\\\"datsun 810\\\"\\n21.5 4. 121.0 110.0 2600. 12.8 77. 2.\\t\\\"bmw 320i\\\"\\n21.5 3. 80.00 110.0 2720. 13.5 77. 3.\\t\\\"mazda rx-4\\\"\\n43.1 4. 
90.00 48.00 1985. 21.5 78 2.\\t\\\"volkswagen rabbit custom diesel\\\"\\n36.1 4. 98.00 66.00 1800. 14.4 78 1.\\t\\\"ford fiesta\\\"\\n32.8 4. 78.00 52.00 1985. 19.4 78. 3.\\t\\\"mazda glc deluxe\\\"\\n39.4 4. 85.00 70.00 2070. 18.6 78. 3.\\t\\\"datsun b210 gx\\\"\\n36.1 4. 91.00 60.00 1800. 16.4 78. 3.\\t\\\"honda civic cvcc\\\"\\n19.9 8. 260.0 110.0 3365. 15.5 78. 1.\\t\\\"oldsmobile cutlass salon brougham\\\"\\n19.4 8. 318.0 140.0 3735. 13.2 78. 1.\\t\\\"dodge diplomat\\\"\\n20.2 8. 302.0 139.0 3570. 12.8 78. 1.\\t\\\"mercury monarch ghia\\\"\\n19.2 6. 231.0 105.0 3535. 19.2 78. 1.\\t\\\"pontiac phoenix lj\\\"\\n20.5 6. 200.0 95.00 3155. 18.2 78. 1.\\t\\\"chevrolet malibu\\\"\\n20.2 6. 200.0 85.00 2965. 15.8 78. 1.\\t\\\"ford fairmont (auto)\\\"\\n25.1 4. 140.0 88.00 2720. 15.4 78. 1.\\t\\\"ford fairmont (man)\\\"\\n20.5 6. 225.0 100.0 3430. 17.2 78. 1.\\t\\\"plymouth volare\\\"\\n19.4 6. 232.0 90.00 3210. 17.2 78. 1.\\t\\\"amc concord\\\"\\n20.6 6. 231.0 105.0 3380. 15.8 78. 1.\\t\\\"buick century special\\\"\\n20.8 6. 200.0 85.00 3070. 16.7 78. 1.\\t\\\"mercury zephyr\\\"\\n18.6 6. 225.0 110.0 3620. 18.7 78. 1.\\t\\\"dodge aspen\\\"\\n18.1 6. 258.0 120.0 3410. 15.1 78. 1.\\t\\\"amc concord d/l\\\"\\n19.2 8. 305.0 145.0 3425. 13.2 78. 1.\\t\\\"chevrolet monte carlo landau\\\"\\n17.7 6. 231.0 165.0 3445. 13.4 78. 1.\\t\\\"buick regal sport coupe (turbo)\\\"\\n18.1 8. 302.0 139.0 3205. 11.2 78. 1.\\t\\\"ford futura\\\"\\n17.5 8. 318.0 140.0 4080. 13.7 78. 1.\\t\\\"dodge magnum xe\\\"\\n30.0 4. 98.00 68.00 2155. 16.5 78. 1.\\t\\\"chevrolet chevette\\\"\\n27.5 4. 134.0 95.00 2560. 14.2 78. 3.\\t\\\"toyota corona\\\"\\n27.2 4. 119.0 97.00 2300. 14.7 78. 3.\\t\\\"datsun 510\\\"\\n30.9 4. 105.0 75.00 2230. 14.5 78. 1.\\t\\\"dodge omni\\\"\\n21.1 4. 134.0 95.00 2515. 14.8 78. 3.\\t\\\"toyota celica gt liftback\\\"\\n23.2 4. 156.0 105.0 2745. 16.7 78. 1.\\t\\\"plymouth sapporo\\\"\\n23.8 4. 151.0 85.00 2855. 17.6 78. 1.\\t\\\"oldsmobile starfire sx\\\"\\n23.9 4. 
119.0 97.00 2405. 14.9 78. 3.\\t\\\"datsun 200-sx\\\"\\n20.3 5. 131.0 103.0 2830. 15.9 78. 2.\\t\\\"audi 5000\\\"\\n17.0 6. 163.0 125.0 3140. 13.6 78. 2.\\t\\\"volvo 264gl\\\"\\n21.6 4. 121.0 115.0 2795. 15.7 78. 2.\\t\\\"saab 99gle\\\"\\n16.2 6. 163.0 133.0 3410. 15.8 78. 2.\\t\\\"peugeot 604sl\\\"\\n31.5 4. 89.00 71.00 1990. 14.9 78. 2.\\t\\\"volkswagen scirocco\\\"\\n29.5 4. 98.00 68.00 2135. 16.6 78. 3.\\t\\\"honda accord lx\\\"\\n21.5 6. 231.0 115.0 3245. 15.4 79. 1.\\t\\\"pontiac lemans v6\\\"\\n19.8 6. 200.0 85.00 2990. 18.2 79. 1.\\t\\\"mercury zephyr 6\\\"\\n22.3 4. 140.0 88.00 2890. 17.3 79. 1.\\t\\\"ford fairmont 4\\\"\\n20.2 6. 232.0 90.00 3265. 18.2 79. 1.\\t\\\"amc concord dl 6\\\"\\n20.6 6. 225.0 110.0 3360. 16.6 79. 1.\\t\\\"dodge aspen 6\\\"\\n17.0 8. 305.0 130.0 3840. 15.4 79. 1.\\t\\\"chevrolet caprice classic\\\"\\n17.6 8. 302.0 129.0 3725. 13.4 79. 1.\\t\\\"ford ltd landau\\\"\\n16.5 8. 351.0 138.0 3955. 13.2 79. 1.\\t\\\"mercury grand marquis\\\"\\n18.2 8. 318.0 135.0 3830. 15.2 79. 1.\\t\\\"dodge st. regis\\\"\\n16.9 8. 350.0 155.0 4360. 14.9 79. 1.\\t\\\"buick estate wagon (sw)\\\"\\n15.5 8. 351.0 142.0 4054. 14.3 79. 1.\\t\\\"ford country squire (sw)\\\"\\n19.2 8. 267.0 125.0 3605. 15.0 79. 1.\\t\\\"chevrolet malibu classic (sw)\\\"\\n18.5 8. 360.0 150.0 3940. 13.0 79. 1.\\t\\\"chrysler lebaron town @ country (sw)\\\"\\n31.9 4. 89.00 71.00 1925. 14.0 79. 2.\\t\\\"vw rabbit custom\\\"\\n34.1 4. 86.00 65.00 1975. 15.2 79. 3.\\t\\\"maxda glc deluxe\\\"\\n35.7 4. 98.00 80.00 1915. 14.4 79. 1.\\t\\\"dodge colt hatchback custom\\\"\\n27.4 4. 121.0 80.00 2670. 15.0 79. 1.\\t\\\"amc spirit dl\\\"\\n25.4 5. 183.0 77.00 3530. 20.1 79. 2.\\t\\\"mercedes benz 300d\\\"\\n23.0 8. 350.0 125.0 3900. 17.4 79. 1.\\t\\\"cadillac eldorado\\\"\\n27.2 4. 141.0 71.00 3190. 24.8 79. 2.\\t\\\"peugeot 504\\\"\\n23.9 8. 260.0 90.00 3420. 22.2 79. 1.\\t\\\"oldsmobile cutlass salon brougham\\\"\\n34.2 4. 105.0 70.00 2200. 13.2 79. 
1.\\t\\\"plymouth horizon\\\"\\n34.5 4. 105.0 70.00 2150. 14.9 79. 1.\\t\\\"plymouth horizon tc3\\\"\\n31.8 4. 85.00 65.00 2020. 19.2 79. 3.\\t\\\"datsun 210\\\"\\n37.3 4. 91.00 69.00 2130. 14.7 79. 2.\\t\\\"fiat strada custom\\\"\\n28.4 4. 151.0 90.00 2670. 16.0 79. 1.\\t\\\"buick skylark limited\\\"\\n28.8 6. 173.0 115.0 2595. 11.3 79. 1.\\t\\\"chevrolet citation\\\"\\n26.8 6. 173.0 115.0 2700. 12.9 79. 1.\\t\\\"oldsmobile omega brougham\\\"\\n33.5 4. 151.0 90.00 2556. 13.2 79. 1.\\t\\\"pontiac phoenix\\\"\\n41.5 4. 98.00 76.00 2144. 14.7 80. 2.\\t\\\"vw rabbit\\\"\\n38.1 4. 89.00 60.00 1968. 18.8 80. 3.\\t\\\"toyota corolla tercel\\\"\\n32.1 4. 98.00 70.00 2120. 15.5 80. 1.\\t\\\"chevrolet chevette\\\"\\n37.2 4. 86.00 65.00 2019. 16.4 80. 3.\\t\\\"datsun 310\\\"\\n28.0 4. 151.0 90.00 2678. 16.5 80. 1.\\t\\\"chevrolet citation\\\"\\n26.4 4. 140.0 88.00 2870. 18.1 80. 1.\\t\\\"ford fairmont\\\"\\n24.3 4. 151.0 90.00 3003. 20.1 80. 1.\\t\\\"amc concord\\\"\\n19.1 6. 225.0 90.00 3381. 18.7 80. 1.\\t\\\"dodge aspen\\\"\\n34.3 4. 97.00 78.00 2188. 15.8 80. 2.\\t\\\"audi 4000\\\"\\n29.8 4. 134.0 90.00 2711. 15.5 80. 3.\\t\\\"toyota corona liftback\\\"\\n31.3 4. 120.0 75.00 2542. 17.5 80. 3.\\t\\\"mazda 626\\\"\\n37.0 4. 119.0 92.00 2434. 15.0 80. 3.\\t\\\"datsun 510 hatchback\\\"\\n32.2 4. 108.0 75.00 2265. 15.2 80. 3.\\t\\\"toyota corolla\\\"\\n46.6 4. 86.00 65.00 2110. 17.9 80. 3.\\t\\\"mazda glc\\\"\\n27.9 4. 156.0 105.0 2800. 14.4 80. 1.\\t\\\"dodge colt\\\"\\n40.8 4. 85.00 65.00 2110. 19.2 80. 3.\\t\\\"datsun 210\\\"\\n44.3 4. 90.00 48.00 2085. 21.7 80. 2.\\t\\\"vw rabbit c (diesel)\\\"\\n43.4 4. 90.00 48.00 2335. 23.7 80. 2.\\t\\\"vw dasher (diesel)\\\"\\n36.4 5. 121.0 67.00 2950. 19.9 80. 2.\\t\\\"audi 5000s (diesel)\\\"\\n30.0 4. 146.0 67.00 3250. 21.8 80. 2.\\t\\\"mercedes-benz 240d\\\"\\n44.6 4. 91.00 67.00 1850. 13.8 80. 3.\\t\\\"honda civic 1500 gl\\\"\\n40.9 4. 85.00 NA 1835. 17.3 80. 2.\\t\\\"renault lecar deluxe\\\"\\n33.8 4. 97.00 67.00 2145. 18.0 80. 
3.\\t\\\"subaru dl\\\"\\n29.8 4. 89.00 62.00 1845. 15.3 80. 2.\\t\\\"vokswagen rabbit\\\"\\n32.7 6. 168.0 132.0 2910. 11.4 80. 3.\\t\\\"datsun 280-zx\\\"\\n23.7 3. 70.00 100.0 2420. 12.5 80. 3.\\t\\\"mazda rx-7 gs\\\"\\n35.0 4. 122.0 88.00 2500. 15.1 80. 2.\\t\\\"triumph tr7 coupe\\\"\\n23.6 4. 140.0 NA 2905. 14.3 80. 1.\\t\\\"ford mustang cobra\\\"\\n32.4 4. 107.0 72.00 2290. 17.0 80. 3.\\t\\\"honda accord\\\"\\n27.2 4. 135.0 84.00 2490. 15.7 81. 1.\\t\\\"plymouth reliant\\\"\\n26.6 4. 151.0 84.00 2635. 16.4 81. 1.\\t\\\"buick skylark\\\"\\n25.8 4. 156.0 92.00 2620. 14.4 81. 1.\\t\\\"dodge aries wagon (sw)\\\"\\n23.5 6. 173.0 110.0 2725. 12.6 81. 1.\\t\\\"chevrolet citation\\\"\\n30.0 4. 135.0 84.00 2385. 12.9 81. 1.\\t\\\"plymouth reliant\\\"\\n39.1 4. 79.00 58.00 1755. 16.9 81. 3.\\t\\\"toyota starlet\\\"\\n39.0 4. 86.00 64.00 1875. 16.4 81. 1.\\t\\\"plymouth champ\\\"\\n35.1 4. 81.00 60.00 1760. 16.1 81. 3.\\t\\\"honda civic 1300\\\"\\n32.3 4. 97.00 67.00 2065. 17.8 81. 3.\\t\\\"subaru\\\"\\n37.0 4. 85.00 65.00 1975. 19.4 81. 3.\\t\\\"datsun 210 mpg\\\"\\n37.7 4. 89.00 62.00 2050. 17.3 81. 3.\\t\\\"toyota tercel\\\"\\n34.1 4. 91.00 68.00 1985. 16.0 81. 3.\\t\\\"mazda glc 4\\\"\\n34.7 4. 105.0 63.00 2215. 14.9 81. 1.\\t\\\"plymouth horizon 4\\\"\\n34.4 4. 98.00 65.00 2045. 16.2 81. 1.\\t\\\"ford escort 4w\\\"\\n29.9 4. 98.00 65.00 2380. 20.7 81. 1.\\t\\\"ford escort 2h\\\"\\n33.0 4. 105.0 74.00 2190. 14.2 81. 2.\\t\\\"volkswagen jetta\\\"\\n34.5 4. 100.0 NA 2320. 15.8 81. 2.\\t\\\"renault 18i\\\"\\n33.7 4. 107.0 75.00 2210. 14.4 81. 3.\\t\\\"honda prelude\\\"\\n32.4 4. 108.0 75.00 2350. 16.8 81. 3.\\t\\\"toyota corolla\\\"\\n32.9 4. 119.0 100.0 2615. 14.8 81. 3.\\t\\\"datsun 200sx\\\"\\n31.6 4. 120.0 74.00 2635. 18.3 81. 3.\\t\\\"mazda 626\\\"\\n28.1 4. 141.0 80.00 3230. 20.4 81. 2.\\t\\\"peugeot 505s turbo diesel\\\"\\nNA 4. 121.0 110.0 2800. 15.4 81. 2.\\t\\\"saab 900s\\\"\\n30.7 6. 145.0 76.00 3160. 19.6 81. 2.\\t\\\"volvo diesel\\\"\\n25.4 6. 
168.0 116.0 2900. 12.6 81. 3.\\t\\\"toyota cressida\\\"\\n24.2 6. 146.0 120.0 2930. 13.8 81. 3.\\t\\\"datsun 810 maxima\\\"\\n22.4 6. 231.0 110.0 3415. 15.8 81. 1.\\t\\\"buick century\\\"\\n26.6 8. 350.0 105.0 3725. 19.0 81. 1.\\t\\\"oldsmobile cutlass ls\\\"\\n20.2 6. 200.0 88.00 3060. 17.1 81. 1.\\t\\\"ford granada gl\\\"\\n17.6 6. 225.0 85.00 3465. 16.6 81. 1.\\t\\\"chrysler lebaron salon\\\"\\n28.0 4. 112.0 88.00 2605. 19.6 82. 1.\\t\\\"chevrolet cavalier\\\"\\n27.0 4. 112.0 88.00 2640. 18.6 82. 1.\\t\\\"chevrolet cavalier wagon\\\"\\n34.0 4. 112.0 88.00 2395. 18.0 82. 1.\\t\\\"chevrolet cavalier 2-door\\\"\\n31.0 4. 112.0 85.00 2575. 16.2 82. 1.\\t\\\"pontiac j2000 se hatchback\\\"\\n29.0 4. 135.0 84.00 2525. 16.0 82. 1.\\t\\\"dodge aries se\\\"\\n27.0 4. 151.0 90.00 2735. 18.0 82. 1.\\t\\\"pontiac phoenix\\\"\\n24.0 4. 140.0 92.00 2865. 16.4 82. 1.\\t\\\"ford fairmont futura\\\"\\n23.0 4. 151.0 NA 3035. 20.5 82. 1.\\t\\\"amc concord dl\\\"\\n36.0 4. 105.0 74.00 1980. 15.3 82. 2.\\t\\\"volkswagen rabbit l\\\"\\n37.0 4. 91.00 68.00 2025. 18.2 82. 3.\\t\\\"mazda glc custom l\\\"\\n31.0 4. 91.00 68.00 1970. 17.6 82. 3.\\t\\\"mazda glc custom\\\"\\n38.0 4. 105.0 63.00 2125. 14.7 82. 1.\\t\\\"plymouth horizon miser\\\"\\n36.0 4. 98.00 70.00 2125. 17.3 82. 1.\\t\\\"mercury lynx l\\\"\\n36.0 4. 120.0 88.00 2160. 14.5 82. 3.\\t\\\"nissan stanza xe\\\"\\n36.0 4. 107.0 75.00 2205. 14.5 82. 3.\\t\\\"honda accord\\\"\\n34.0 4. 108.0 70.00 2245 16.9 82. 3.\\t\\\"toyota corolla\\\"\\n38.0 4. 91.00 67.00 1965. 15.0 82. 3.\\t\\\"honda civic\\\"\\n32.0 4. 91.00 67.00 1965. 15.7 82. 3.\\t\\\"honda civic (auto)\\\"\\n38.0 4. 91.00 67.00 1995. 16.2 82. 3.\\t\\\"datsun 310 gx\\\"\\n25.0 6. 181.0 110.0 2945. 16.4 82. 1.\\t\\\"buick century limited\\\"\\n38.0 6. 262.0 85.00 3015. 17.0 82. 1.\\t\\\"oldsmobile cutlass ciera (diesel)\\\"\\n26.0 4. 156.0 92.00 2585. 14.5 82. 1.\\t\\\"chrysler lebaron medallion\\\"\\n22.0 6. 232.0 112.0 2835 14.7 82. 1.\\t\\\"ford granada l\\\"\\n32.0 4. 
144.0 96.00 2665. 13.9 82. 3.\\t\\\"toyota celica gt\\\"\\n36.0 4. 135.0 84.00 2370. 13.0 82. 1.\\t\\\"dodge charger 2.2\\\"\\n27.0 4. 151.0 90.00 2950. 17.3 82. 1.\\t\\\"chevrolet camaro\\\"\\n27.0 4. 140.0 86.00 2790. 15.6 82. 1.\\t\\\"ford mustang gl\\\"\\n44.0 4. 97.00 52.00 2130. 24.6 82. 2.\\t\\\"vw pickup\\\"\\n32.0 4. 135.0 84.00 2295. 11.6 82. 1.\\t\\\"dodge rampage\\\"\\n28.0 4. 120.0 79.00 2625. 18.6 82. 1.\\t\\\"ford ranger\\\"\\n31.0 4. 119.0 82.00 2720. 19.4 82. 1.\\t\\\"chevy s-10\\\"\\n\"" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_str = read(\"auto.txt\", String)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we replace all tabs in this string with spaces\n", "\n", "(note that in general this is not a safe operation: string columns could contain quoted tabs; fortunately, in this case they do not, so we are safe)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"18.0 8. 307.0 130.0 3504. 12.0 70. 1. \\\"chevrolet chevelle malibu\\\"\\n15.0 8. 350.0 165.0 3693. 11.5 70. 1. \\\"buick skylark 320\\\"\\n18.0 8. 318.0 150.0 3436. 11.0 70. 1. \\\"plymouth satellite\\\"\\n16.0 8. 304.0 150.0 3433. 12.0 70. 1. \\\"amc rebel sst\\\"\\n17.0 8. 302.0 140.0 3449. 10.5 70. 1. \\\"ford torino\\\"\\n15.0 8. 429.0 198.0 4341. 10.0 70. 1. \\\"ford galaxie 500\\\"\\n14.0 8. 454.0 220.0 4354. 9.0 70. 1. \\\"chevrolet impala\\\"\\n14.0 8. 440.0 215.0 4312. 8.5 70. 1. \\\"plymouth fury iii\\\"\\n14.0 8. 455.0 225.0 4425. 10.0 70. 1. \\\"pontiac catalina\\\"\\n15.0 8. 390.0 190.0 3850. 8.5 70. 1. \\\"amc ambassador dpl\\\"\\nNA 4. 133.0 115.0 3090. 17.5 70. 2. \\\"citroen ds-21 pallas\\\"\\nNA 8. 350.0 165.0 4142. 11.5 70. 1. \\\"chevrolet chevelle concours (sw)\\\"\\nNA 8. 351.0 153.0 4034. 11.0 70. 1. \\\"ford torino (sw)\\\"\\nNA 8. 383.0 175.0 4166. 10.5 70. 1. 
\\\"plymouth satellite (sw)\\\"\\nNA 8. 360.0 175.0 3850. 11.0 70. 1. \\\"amc rebel sst (sw)\\\"\\n15.0 8. 383.0 170.0 3563. 10.0 70. 1. \\\"dodge challenger se\\\"\\n14.0 8. 340.0 160.0 3609. 8.0 70. 1. \\\"plymouth 'cuda 340\\\"\\nNA 8. 302.0 140.0 3353. 8.0 70. 1. \\\"ford mustang boss 302\\\"\\n15.0 8. 400.0 150.0 3761. 9.5 70. 1. \\\"chevrolet monte carlo\\\"\\n14.0 8. 455.0 225.0 3086. 10.0 70. 1. \\\"buick estate wagon (sw)\\\"\\n24.0 4. 113.0 95.00 2372. 15.0 70. 3. \\\"toyota corona mark ii\\\"\\n22.0 6. 198.0 95.00 2833. 15.5 70. 1. \\\"plymouth duster\\\"\\n18.0 6. 199.0 97.00 2774. 15.5 70. 1. \\\"amc hornet\\\"\\n21.0 6. 200.0 85.00 2587. 16.0 70. 1. \\\"ford maverick\\\"\\n27.0 4. 97.00 88.00 2130. 14.5 70. 3. \\\"datsun pl510\\\"\\n26.0 4. 97.00 46.00 1835. 20.5 70. 2. \\\"volkswagen 1131 deluxe sedan\\\"\\n25.0 4. 110.0 87.00 2672. 17.5 70. 2. \\\"peugeot 504\\\"\\n24.0 4. 107.0 90.00 2430. 14.5 70. 2. \\\"audi 100 ls\\\"\\n25.0 4. 104.0 95.00 2375. 17.5 70. 2. \\\"saab 99e\\\"\\n26.0 4. 121.0 113.0 2234. 12.5 70. 2. \\\"bmw 2002\\\"\\n21.0 6. 199.0 90.00 2648. 15.0 70. 1. \\\"amc gremlin\\\"\\n10.0 8. 360.0 215.0 4615. 14.0 70. 1. \\\"ford f250\\\"\\n10.0 8. 307.0 200.0 4376. 15.0 70. 1. \\\"chevy c20\\\"\\n11.0 8. 318.0 210.0 4382. 13.5 70. 1. \\\"dodge d200\\\"\\n9.0 8. 304.0 193.0 4732. 18.5 70. 1. \\\"hi 1200d\\\"\\n27.0 4. 97.00 88.00 2130. 14.5 71. 3. \\\"datsun pl510\\\"\\n28.0 4. 140.0 90.00 2264. 15.5 71. 1. \\\"chevrolet vega 2300\\\"\\n25.0 4. 113.0 95.00 2228. 14.0 71. 3. \\\"toyota corona\\\"\\n25.0 4. 98.00 NA 2046. 19.0 71. 1. \\\"ford pinto\\\"\\nNA 4. 97.00 48.00 1978. 20.0 71. 2. \\\"volkswagen super beetle 117\\\"\\n19.0 6. 232.0 100.0 2634. 13.0 71. 1. \\\"amc gremlin\\\"\\n16.0 6. 225.0 105.0 3439. 15.5 71. 1. \\\"plymouth satellite custom\\\"\\n17.0 6. 250.0 100.0 3329. 15.5 71. 1. \\\"chevrolet chevelle malibu\\\"\\n19.0 6. 250.0 88.00 3302. 15.5 71. 1. \\\"ford torino 500\\\"\\n18.0 6. 232.0 100.0 3288. 15.5 71. 1. 
\\\"amc matador\\\"\\n14.0 8. 350.0 165.0 4209. 12.0 71. 1. \\\"chevrolet impala\\\"\\n14.0 8. 400.0 175.0 4464. 11.5 71. 1. \\\"pontiac catalina brougham\\\"\\n14.0 8. 351.0 153.0 4154. 13.5 71. 1. \\\"ford galaxie 500\\\"\\n14.0 8. 318.0 150.0 4096. 13.0 71. 1. \\\"plymouth fury iii\\\"\\n12.0 8. 383.0 180.0 4955. 11.5 71. 1. \\\"dodge monaco (sw)\\\"\\n13.0 8. 400.0 170.0 4746. 12.0 71. 1. \\\"ford country squire (sw)\\\"\\n13.0 8. 400.0 175.0 5140. 12.0 71. 1. \\\"pontiac safari (sw)\\\"\\n18.0 6. 258.0 110.0 2962. 13.5 71. 1. \\\"amc hornet sportabout (sw)\\\"\\n22.0 4. 140.0 72.00 2408. 19.0 71. 1. \\\"chevrolet vega (sw)\\\"\\n19.0 6. 250.0 100.0 3282. 15.0 71. 1. \\\"pontiac firebird\\\"\\n18.0 6. 250.0 88.00 3139. 14.5 71. 1. \\\"ford mustang\\\"\\n23.0 4. 122.0 86.00 2220. 14.0 71. 1. \\\"mercury capri 2000\\\"\\n28.0 4. 116.0 90.00 2123. 14.0 71. 2. \\\"opel 1900\\\"\\n30.0 4. 79.00 70.00 2074. 19.5 71. 2. \\\"peugeot 304\\\"\\n30.0 4. 88.00 76.00 2065. 14.5 71. 2. \\\"fiat 124b\\\"\\n31.0 4. 71.00 65.00 1773. 19.0 71. 3. \\\"toyota corolla 1200\\\"\\n35.0 4. 72.00 69.00 1613. 18.0 71. 3. \\\"datsun 1200\\\"\\n27.0 4. 97.00 60.00 1834. 19.0 71. 2. \\\"volkswagen model 111\\\"\\n26.0 4. 91.00 70.00 1955. 20.5 71. 1. \\\"plymouth cricket\\\"\\n24.0 4. 113.0 95.00 2278. 15.5 72. 3. \\\"toyota corona hardtop\\\"\\n25.0 4. 97.50 80.00 2126. 17.0 72. 1. \\\"dodge colt hardtop\\\"\\n23.0 4. 97.00 54.00 2254. 23.5 72. 2. \\\"volkswagen type 3\\\"\\n20.0 4. 140.0 90.00 2408. 19.5 72. 1. \\\"chevrolet vega\\\"\\n21.0 4. 122.0 86.00 2226. 16.5 72. 1. \\\"ford pinto runabout\\\"\\n13.0 8. 350.0 165.0 4274. 12.0 72. 1. \\\"chevrolet impala\\\"\\n14.0 8. 400.0 175.0 4385. 12.0 72. 1. \\\"pontiac catalina\\\"\\n15.0 8. 318.0 150.0 4135. 13.5 72. 1. \\\"plymouth fury iii\\\"\\n14.0 8. 351.0 153.0 4129. 13.0 72. 1. \\\"ford galaxie 500\\\"\\n17.0 8. 304.0 150.0 3672. 11.5 72. 1. \\\"amc ambassador sst\\\"\\n11.0 8. 429.0 208.0 4633. 11.0 72. 1. 
\\\"mercury marquis\\\"\\n13.0 8. 350.0 155.0 4502. 13.5 72. 1. \\\"buick lesabre custom\\\"\\n12.0 8. 350.0 160.0 4456. 13.5 72. 1. \\\"oldsmobile delta 88 royale\\\"\\n13.0 8. 400.0 190.0 4422. 12.5 72. 1. \\\"chrysler newport royal\\\"\\n19.0 3. 70.00 97.00 2330. 13.5 72. 3. \\\"mazda rx2 coupe\\\"\\n15.0 8. 304.0 150.0 3892. 12.5 72. 1. \\\"amc matador (sw)\\\"\\n13.0 8. 307.0 130.0 4098. 14.0 72. 1. \\\"chevrolet chevelle concours (sw)\\\"\\n13.0 8. 302.0 140.0 4294. 16.0 72. 1. \\\"ford gran torino (sw)\\\"\\n14.0 8. 318.0 150.0 4077. 14.0 72. 1. \\\"plymouth satellite custom (sw)\\\"\\n18.0 4. 121.0 112.0 2933. 14.5 72. 2. \\\"volvo 145e (sw)\\\"\\n22.0 4. 121.0 76.00 2511. 18.0 72. 2. \\\"volkswagen 411 (sw)\\\"\\n21.0 4. 120.0 87.00 2979. 19.5 72. 2. \\\"peugeot 504 (sw)\\\"\\n26.0 4. 96.00 69.00 2189. 18.0 72. 2. \\\"renault 12 (sw)\\\"\\n22.0 4. 122.0 86.00 2395. 16.0 72. 1. \\\"ford pinto (sw)\\\"\\n28.0 4. 97.00 92.00 2288. 17.0 72. 3. \\\"datsun 510 (sw)\\\"\\n23.0 4. 120.0 97.00 2506. 14.5 72. 3. \\\"toyouta corona mark ii (sw)\\\"\\n28.0 4. 98.00 80.00 2164. 15.0 72. 1. \\\"dodge colt (sw)\\\"\\n27.0 4. 97.00 88.00 2100. 16.5 72. 3. \\\"toyota corolla 1600 (sw)\\\"\\n13.0 8. 350.0 175.0 4100. 13.0 73. 1. \\\"buick century 350\\\"\\n14.0 8. 304.0 150.0 3672. 11.5 73. 1. \\\"amc matador\\\"\\n13.0 8. 350.0 145.0 3988. 13.0 73. 1. \\\"chevrolet malibu\\\"\\n14.0 8. 302.0 137.0 4042. 14.5 73. 1. \\\"ford gran torino\\\"\\n15.0 8. 318.0 150.0 3777. 12.5 73. 1. \\\"dodge coronet custom\\\"\\n12.0 8. 429.0 198.0 4952. 11.5 73. 1. \\\"mercury marquis brougham\\\"\\n13.0 8. 400.0 150.0 4464. 12.0 73. 1. \\\"chevrolet caprice classic\\\"\\n13.0 8. 351.0 158.0 4363. 13.0 73. 1. \\\"ford ltd\\\"\\n14.0 8. 318.0 150.0 4237. 14.5 73. 1. \\\"plymouth fury gran sedan\\\"\\n13.0 8. 440.0 215.0 4735. 11.0 73. 1. \\\"chrysler new yorker brougham\\\"\\n12.0 8. 455.0 225.0 4951. 11.0 73. 1. \\\"buick electra 225 custom\\\"\\n13.0 8. 360.0 175.0 3821. 11.0 73. 1. 
\\\"amc ambassador brougham\\\"\\n18.0 6. 225.0 105.0 3121. 16.5 73. 1. \\\"plymouth valiant\\\"\\n16.0 6. 250.0 100.0 3278. 18.0 73. 1. \\\"chevrolet nova custom\\\"\\n18.0 6. 232.0 100.0 2945. 16.0 73. 1. \\\"amc hornet\\\"\\n18.0 6. 250.0 88.00 3021. 16.5 73. 1. \\\"ford maverick\\\"\\n23.0 6. 198.0 95.00 2904. 16.0 73. 1. \\\"plymouth duster\\\"\\n26.0 4. 97.00 46.00 1950. 21.0 73. 2. \\\"volkswagen super beetle\\\"\\n11.0 8. 400.0 150.0 4997. 14.0 73. 1. \\\"chevrolet impala\\\"\\n12.0 8. 400.0 167.0 4906. 12.5 73. 1. \\\"ford country\\\"\\n13.0 8. 360.0 170.0 4654. 13.0 73. 1. \\\"plymouth custom suburb\\\"\\n12.0 8. 350.0 180.0 4499. 12.5 73. 1. \\\"oldsmobile vista cruiser\\\"\\n18.0 6. 232.0 100.0 2789. 15.0 73. 1. \\\"amc gremlin\\\"\\n20.0 4. 97.00 88.00 2279. 19.0 73. 3. \\\"toyota carina\\\"\\n21.0 4. 140.0 72.00 2401. 19.5 73. 1. \\\"chevrolet vega\\\"\\n22.0 4. 108.0 94.00 2379. 16.5 73. 3. \\\"datsun 610\\\"\\n18.0 3. 70.00 90.00 2124. 13.5 73. 3. \\\"maxda rx3\\\"\\n19.0 4. 122.0 85.00 2310. 18.5 73. 1. \\\"ford pinto\\\"\\n21.0 6. 155.0 107.0 2472. 14.0 73. 1. \\\"mercury capri v6\\\"\\n26.0 4. 98.00 90.00 2265. 15.5 73. 2. \\\"fiat 124 sport coupe\\\"\\n15.0 8. 350.0 145.0 4082. 13.0 73. 1. \\\"chevrolet monte carlo s\\\"\\n16.0 8. 400.0 230.0 4278. 9.50 73. 1. \\\"pontiac grand prix\\\"\\n29.0 4. 68.00 49.00 1867. 19.5 73. 2. \\\"fiat 128\\\"\\n24.0 4. 116.0 75.00 2158. 15.5 73. 2. \\\"opel manta\\\"\\n20.0 4. 114.0 91.00 2582. 14.0 73. 2. \\\"audi 100ls\\\"\\n19.0 4. 121.0 112.0 2868. 15.5 73. 2. \\\"volvo 144ea\\\"\\n15.0 8. 318.0 150.0 3399. 11.0 73. 1. \\\"dodge dart custom\\\"\\n24.0 4. 121.0 110.0 2660. 14.0 73. 2. \\\"saab 99le\\\"\\n20.0 6. 156.0 122.0 2807. 13.5 73. 3. \\\"toyota mark ii\\\"\\n11.0 8. 350.0 180.0 3664. 11.0 73. 1. \\\"oldsmobile omega\\\"\\n20.0 6. 198.0 95.00 3102. 16.5 74. 1. \\\"plymouth duster\\\"\\n21.0 6. 200.0 NA 2875. 17.0 74. 1. \\\"ford maverick\\\"\\n19.0 6. 232.0 100.0 2901. 16.0 74. 1. 
\\\"amc hornet\\\"\\n15.0 6. 250.0 100.0 3336. 17.0 74. 1. \\\"chevrolet nova\\\"\\n31.0 4. 79.00 67.00 1950. 19.0 74. 3. \\\"datsun b210\\\"\\n26.0 4. 122.0 80.00 2451. 16.5 74. 1. \\\"ford pinto\\\"\\n32.0 4. 71.00 65.00 1836. 21.0 74. 3. \\\"toyota corolla 1200\\\"\\n25.0 4. 140.0 75.00 2542. 17.0 74. 1. \\\"chevrolet vega\\\"\\n16.0 6. 250.0 100.0 3781. 17.0 74. 1. \\\"chevrolet chevelle malibu classic\\\"\\n16.0 6. 258.0 110.0 3632. 18.0 74. 1. \\\"amc matador\\\"\\n18.0 6. 225.0 105.0 3613. 16.5 74. 1. \\\"plymouth satellite sebring\\\"\\n16.0 8. 302.0 140.0 4141. 14.0 74. 1. \\\"ford gran torino\\\"\\n13.0 8. 350.0 150.0 4699. 14.5 74. 1. \\\"buick century luxus (sw)\\\"\\n14.0 8. 318.0 150.0 4457. 13.5 74. 1. \\\"dodge coronet custom (sw)\\\"\\n14.0 8. 302.0 140.0 4638. 16.0 74. 1. \\\"ford gran torino (sw)\\\"\\n14.0 8. 304.0 150.0 4257. 15.5 74. 1. \\\"amc matador (sw)\\\"\\n29.0 4. 98.00 83.00 2219. 16.5 74. 2. \\\"audi fox\\\"\\n26.0 4. 79.00 67.00 1963. 15.5 74. 2. \\\"volkswagen dasher\\\"\\n26.0 4. 97.00 78.00 2300. 14.5 74. 2. \\\"opel manta\\\"\\n31.0 4. 76.00 52.00 1649. 16.5 74. 3. \\\"toyota corona\\\"\\n32.0 4. 83.00 61.00 2003. 19.0 74. 3. \\\"datsun 710\\\"\\n28.0 4. 90.00 75.00 2125. 14.5 74. 1. \\\"dodge colt\\\"\\n24.0 4. 90.00 75.00 2108. 15.5 74. 2. \\\"fiat 128\\\"\\n26.0 4. 116.0 75.00 2246. 14.0 74. 2. \\\"fiat 124 tc\\\"\\n24.0 4. 120.0 97.00 2489. 15.0 74. 3. \\\"honda civic\\\"\\n26.0 4. 108.0 93.00 2391. 15.5 74. 3. \\\"subaru\\\"\\n31.0 4. 79.00 67.00 2000. 16.0 74. 2. \\\"fiat x1.9\\\"\\n19.0 6. 225.0 95.00 3264. 16.0 75. 1. \\\"plymouth valiant custom\\\"\\n18.0 6. 250.0 105.0 3459. 16.0 75. 1. \\\"chevrolet nova\\\"\\n15.0 6. 250.0 72.00 3432. 21.0 75. 1. \\\"mercury monarch\\\"\\n15.0 6. 250.0 72.00 3158. 19.5 75. 1. \\\"ford maverick\\\"\\n16.0 8. 400.0 170.0 4668. 11.5 75. 1. \\\"pontiac catalina\\\"\\n15.0 8. 350.0 145.0 4440. 14.0 75. 1. \\\"chevrolet bel air\\\"\\n16.0 8. 318.0 150.0 4498. 14.5 75. 1. 
\\\"plymouth grand fury\\\"\\n14.0 8. 351.0 148.0 4657. 13.5 75. 1. \\\"ford ltd\\\"\\n17.0 6. 231.0 110.0 3907. 21.0 75. 1. \\\"buick century\\\"\\n16.0 6. 250.0 105.0 3897. 18.5 75. 1. \\\"chevroelt chevelle malibu\\\"\\n15.0 6. 258.0 110.0 3730. 19.0 75. 1. \\\"amc matador\\\"\\n18.0 6. 225.0 95.00 3785. 19.0 75. 1. \\\"plymouth fury\\\"\\n21.0 6. 231.0 110.0 3039. 15.0 75. 1. \\\"buick skyhawk\\\"\\n20.0 8. 262.0 110.0 3221. 13.5 75. 1. \\\"chevrolet monza 2+2\\\"\\n13.0 8. 302.0 129.0 3169. 12.0 75. 1. \\\"ford mustang ii\\\"\\n29.0 4. 97.00 75.00 2171. 16.0 75. 3. \\\"toyota corolla\\\"\\n23.0 4. 140.0 83.00 2639. 17.0 75. 1. \\\"ford pinto\\\"\\n20.0 6. 232.0 100.0 2914. 16.0 75. 1. \\\"amc gremlin\\\"\\n23.0 4. 140.0 78.00 2592. 18.5 75. 1. \\\"pontiac astro\\\"\\n24.0 4. 134.0 96.00 2702. 13.5 75. 3. \\\"toyota corona\\\"\\n25.0 4. 90.00 71.00 2223. 16.5 75. 2. \\\"volkswagen dasher\\\"\\n24.0 4. 119.0 97.00 2545. 17.0 75. 3. \\\"datsun 710\\\"\\n18.0 6. 171.0 97.00 2984. 14.5 75. 1. \\\"ford pinto\\\"\\n29.0 4. 90.00 70.00 1937. 14.0 75. 2. \\\"volkswagen rabbit\\\"\\n19.0 6. 232.0 90.00 3211. 17.0 75. 1. \\\"amc pacer\\\"\\n23.0 4. 115.0 95.00 2694. 15.0 75. 2. \\\"audi 100ls\\\"\\n23.0 4. 120.0 88.00 2957. 17.0 75. 2. \\\"peugeot 504\\\"\\n22.0 4. 121.0 98.00 2945. 14.5 75. 2. \\\"volvo 244dl\\\"\\n25.0 4. 121.0 115.0 2671. 13.5 75. 2. \\\"saab 99le\\\"\\n33.0 4. 91.00 53.00 1795. 17.5 75. 3. \\\"honda civic cvcc\\\"\\n28.0 4. 107.0 86.00 2464. 15.5 76. 2. \\\"fiat 131\\\"\\n25.0 4. 116.0 81.00 2220. 16.9 76. 2. \\\"opel 1900\\\"\\n25.0 4. 140.0 92.00 2572. 14.9 76. 1. \\\"capri ii\\\"\\n26.0 4. 98.00 79.00 2255. 17.7 76. 1. \\\"dodge colt\\\"\\n27.0 4. 101.0 83.00 2202. 15.3 76. 2. \\\"renault 12tl\\\"\\n17.5 8. 305.0 140.0 4215. 13.0 76. 1. \\\"chevrolet chevelle malibu classic\\\"\\n16.0 8. 318.0 150.0 4190. 13.0 76. 1. \\\"dodge coronet brougham\\\"\\n15.5 8. 304.0 120.0 3962. 13.9 76. 1. \\\"amc matador\\\"\\n14.5 8. 351.0 152.0 4215. 12.8 76. 1. 
\\\"ford gran torino\\\"\\n22.0 6. 225.0 100.0 3233. 15.4 76. 1. \\\"plymouth valiant\\\"\\n22.0 6. 250.0 105.0 3353. 14.5 76. 1. \\\"chevrolet nova\\\"\\n24.0 6. 200.0 81.00 3012. 17.6 76. 1. \\\"ford maverick\\\"\\n22.5 6. 232.0 90.00 3085. 17.6 76. 1. \\\"amc hornet\\\"\\n29.0 4. 85.00 52.00 2035. 22.2 76. 1. \\\"chevrolet chevette\\\"\\n24.5 4. 98.00 60.00 2164. 22.1 76. 1. \\\"chevrolet woody\\\"\\n29.0 4. 90.00 70.00 1937. 14.2 76. 2. \\\"vw rabbit\\\"\\n33.0 4. 91.00 53.00 1795. 17.4 76. 3. \\\"honda civic\\\"\\n20.0 6. 225.0 100.0 3651. 17.7 76. 1. \\\"dodge aspen se\\\"\\n18.0 6. 250.0 78.00 3574. 21.0 76. 1. \\\"ford granada ghia\\\"\\n18.5 6. 250.0 110.0 3645. 16.2 76. 1. \\\"pontiac ventura sj\\\"\\n17.5 6. 258.0 95.00 3193. 17.8 76. 1. \\\"amc pacer d/l\\\"\\n29.5 4. 97.00 71.00 1825. 12.2 76. 2. \\\"volkswagen rabbit\\\"\\n32.0 4. 85.00 70.00 1990. 17.0 76. 3. \\\"datsun b-210\\\"\\n28.0 4. 97.00 75.00 2155. 16.4 76. 3. \\\"toyota corolla\\\"\\n26.5 4. 140.0 72.00 2565. 13.6 76. 1. \\\"ford pinto\\\"\\n20.0 4. 130.0 102.0 3150. 15.7 76. 2. \\\"volvo 245\\\"\\n13.0 8. 318.0 150.0 3940. 13.2 76. 1. \\\"plymouth volare premier v8\\\"\\n19.0 4. 120.0 88.00 3270. 21.9 76. 2. \\\"peugeot 504\\\"\\n19.0 6. 156.0 108.0 2930. 15.5 76. 3. \\\"toyota mark ii\\\"\\n16.5 6. 168.0 120.0 3820. 16.7 76. 2. \\\"mercedes-benz 280s\\\"\\n16.5 8. 350.0 180.0 4380. 12.1 76. 1. \\\"cadillac seville\\\"\\n13.0 8. 350.0 145.0 4055. 12.0 76. 1. \\\"chevy c10\\\"\\n13.0 8. 302.0 130.0 3870. 15.0 76. 1. \\\"ford f108\\\"\\n13.0 8. 318.0 150.0 3755. 14.0 76. 1. \\\"dodge d100\\\"\\n31.5 4. 98.00 68.00 2045. 18.5 77. 3. \\\"honda accord cvcc\\\"\\n30.0 4. 111.0 80.00 2155. 14.8 77. 1. \\\"buick opel isuzu deluxe\\\"\\n36.0 4. 79.00 58.00 1825. 18.6 77. 2. \\\"renault 5 gtl\\\"\\n25.5 4. 122.0 96.00 2300. 15.5 77. 1. \\\"plymouth arrow gs\\\"\\n33.5 4. 85.00 70.00 1945. 16.8 77. 3. \\\"datsun f-10 hatchback\\\"\\n17.5 8. 305.0 145.0 3880. 12.5 77. 1. 
\\\"chevrolet caprice classic\\\"\\n17.0 8. 260.0 110.0 4060. 19.0 77. 1. \\\"oldsmobile cutlass supreme\\\"\\n15.5 8. 318.0 145.0 4140. 13.7 77. 1. \\\"dodge monaco brougham\\\"\\n15.0 8. 302.0 130.0 4295. 14.9 77. 1. \\\"mercury cougar brougham\\\"\\n17.5 6. 250.0 110.0 3520. 16.4 77. 1. \\\"chevrolet concours\\\"\\n20.5 6. 231.0 105.0 3425. 16.9 77. 1. \\\"buick skylark\\\"\\n19.0 6. 225.0 100.0 3630. 17.7 77. 1. \\\"plymouth volare custom\\\"\\n18.5 6. 250.0 98.00 3525. 19.0 77. 1. \\\"ford granada\\\"\\n16.0 8. 400.0 180.0 4220. 11.1 77. 1. \\\"pontiac grand prix lj\\\"\\n15.5 8. 350.0 170.0 4165. 11.4 77. 1. \\\"chevrolet monte carlo landau\\\"\\n15.5 8. 400.0 190.0 4325. 12.2 77. 1. \\\"chrysler cordoba\\\"\\n16.0 8. 351.0 149.0 4335. 14.5 77. 1. \\\"ford thunderbird\\\"\\n29.0 4. 97.00 78.00 1940. 14.5 77. 2. \\\"volkswagen rabbit custom\\\"\\n24.5 4. 151.0 88.00 2740. 16.0 77. 1. \\\"pontiac sunbird coupe\\\"\\n26.0 4. 97.00 75.00 2265. 18.2 77. 3. \\\"toyota corolla liftback\\\"\\n25.5 4. 140.0 89.00 2755. 15.8 77. 1. \\\"ford mustang ii 2+2\\\"\\n30.5 4. 98.00 63.00 2051. 17.0 77. 1. \\\"chevrolet chevette\\\"\\n33.5 4. 98.00 83.00 2075. 15.9 77. 1. \\\"dodge colt m/m\\\"\\n30.0 4. 97.00 67.00 1985. 16.4 77. 3. \\\"subaru dl\\\"\\n30.5 4. 97.00 78.00 2190. 14.1 77. 2. \\\"volkswagen dasher\\\"\\n22.0 6. 146.0 97.00 2815. 14.5 77. 3. \\\"datsun 810\\\"\\n21.5 4. 121.0 110.0 2600. 12.8 77. 2. \\\"bmw 320i\\\"\\n21.5 3. 80.00 110.0 2720. 13.5 77. 3. \\\"mazda rx-4\\\"\\n43.1 4. 90.00 48.00 1985. 21.5 78 2. \\\"volkswagen rabbit custom diesel\\\"\\n36.1 4. 98.00 66.00 1800. 14.4 78 1. \\\"ford fiesta\\\"\\n32.8 4. 78.00 52.00 1985. 19.4 78. 3. \\\"mazda glc deluxe\\\"\\n39.4 4. 85.00 70.00 2070. 18.6 78. 3. \\\"datsun b210 gx\\\"\\n36.1 4. 91.00 60.00 1800. 16.4 78. 3. \\\"honda civic cvcc\\\"\\n19.9 8. 260.0 110.0 3365. 15.5 78. 1. \\\"oldsmobile cutlass salon brougham\\\"\\n19.4 8. 318.0 140.0 3735. 13.2 78. 1. \\\"dodge diplomat\\\"\\n20.2 8. 
302.0 139.0 3570. 12.8 78. 1. \\\"mercury monarch ghia\\\"\\n19.2 6. 231.0 105.0 3535. 19.2 78. 1. \\\"pontiac phoenix lj\\\"\\n20.5 6. 200.0 95.00 3155. 18.2 78. 1. \\\"chevrolet malibu\\\"\\n20.2 6. 200.0 85.00 2965. 15.8 78. 1. \\\"ford fairmont (auto)\\\"\\n25.1 4. 140.0 88.00 2720. 15.4 78. 1. \\\"ford fairmont (man)\\\"\\n20.5 6. 225.0 100.0 3430. 17.2 78. 1. \\\"plymouth volare\\\"\\n19.4 6. 232.0 90.00 3210. 17.2 78. 1. \\\"amc concord\\\"\\n20.6 6. 231.0 105.0 3380. 15.8 78. 1. \\\"buick century special\\\"\\n20.8 6. 200.0 85.00 3070. 16.7 78. 1. \\\"mercury zephyr\\\"\\n18.6 6. 225.0 110.0 3620. 18.7 78. 1. \\\"dodge aspen\\\"\\n18.1 6. 258.0 120.0 3410. 15.1 78. 1. \\\"amc concord d/l\\\"\\n19.2 8. 305.0 145.0 3425. 13.2 78. 1. \\\"chevrolet monte carlo landau\\\"\\n17.7 6. 231.0 165.0 3445. 13.4 78. 1. \\\"buick regal sport coupe (turbo)\\\"\\n18.1 8. 302.0 139.0 3205. 11.2 78. 1. \\\"ford futura\\\"\\n17.5 8. 318.0 140.0 4080. 13.7 78. 1. \\\"dodge magnum xe\\\"\\n30.0 4. 98.00 68.00 2155. 16.5 78. 1. \\\"chevrolet chevette\\\"\\n27.5 4. 134.0 95.00 2560. 14.2 78. 3. \\\"toyota corona\\\"\\n27.2 4. 119.0 97.00 2300. 14.7 78. 3. \\\"datsun 510\\\"\\n30.9 4. 105.0 75.00 2230. 14.5 78. 1. \\\"dodge omni\\\"\\n21.1 4. 134.0 95.00 2515. 14.8 78. 3. \\\"toyota celica gt liftback\\\"\\n23.2 4. 156.0 105.0 2745. 16.7 78. 1. \\\"plymouth sapporo\\\"\\n23.8 4. 151.0 85.00 2855. 17.6 78. 1. \\\"oldsmobile starfire sx\\\"\\n23.9 4. 119.0 97.00 2405. 14.9 78. 3. \\\"datsun 200-sx\\\"\\n20.3 5. 131.0 103.0 2830. 15.9 78. 2. \\\"audi 5000\\\"\\n17.0 6. 163.0 125.0 3140. 13.6 78. 2. \\\"volvo 264gl\\\"\\n21.6 4. 121.0 115.0 2795. 15.7 78. 2. \\\"saab 99gle\\\"\\n16.2 6. 163.0 133.0 3410. 15.8 78. 2. \\\"peugeot 604sl\\\"\\n31.5 4. 89.00 71.00 1990. 14.9 78. 2. \\\"volkswagen scirocco\\\"\\n29.5 4. 98.00 68.00 2135. 16.6 78. 3. \\\"honda accord lx\\\"\\n21.5 6. 231.0 115.0 3245. 15.4 79. 1. \\\"pontiac lemans v6\\\"\\n19.8 6. 200.0 85.00 2990. 18.2 79. 1. 
\\\"mercury zephyr 6\\\"\\n22.3 4. 140.0 88.00 2890. 17.3 79. 1. \\\"ford fairmont 4\\\"\\n20.2 6. 232.0 90.00 3265. 18.2 79. 1. \\\"amc concord dl 6\\\"\\n20.6 6. 225.0 110.0 3360. 16.6 79. 1. \\\"dodge aspen 6\\\"\\n17.0 8. 305.0 130.0 3840. 15.4 79. 1. \\\"chevrolet caprice classic\\\"\\n17.6 8. 302.0 129.0 3725. 13.4 79. 1. \\\"ford ltd landau\\\"\\n16.5 8. 351.0 138.0 3955. 13.2 79. 1. \\\"mercury grand marquis\\\"\\n18.2 8. 318.0 135.0 3830. 15.2 79. 1. \\\"dodge st. regis\\\"\\n16.9 8. 350.0 155.0 4360. 14.9 79. 1. \\\"buick estate wagon (sw)\\\"\\n15.5 8. 351.0 142.0 4054. 14.3 79. 1. \\\"ford country squire (sw)\\\"\\n19.2 8. 267.0 125.0 3605. 15.0 79. 1. \\\"chevrolet malibu classic (sw)\\\"\\n18.5 8. 360.0 150.0 3940. 13.0 79. 1. \\\"chrysler lebaron town @ country (sw)\\\"\\n31.9 4. 89.00 71.00 1925. 14.0 79. 2. \\\"vw rabbit custom\\\"\\n34.1 4. 86.00 65.00 1975. 15.2 79. 3. \\\"maxda glc deluxe\\\"\\n35.7 4. 98.00 80.00 1915. 14.4 79. 1. \\\"dodge colt hatchback custom\\\"\\n27.4 4. 121.0 80.00 2670. 15.0 79. 1. \\\"amc spirit dl\\\"\\n25.4 5. 183.0 77.00 3530. 20.1 79. 2. \\\"mercedes benz 300d\\\"\\n23.0 8. 350.0 125.0 3900. 17.4 79. 1. \\\"cadillac eldorado\\\"\\n27.2 4. 141.0 71.00 3190. 24.8 79. 2. \\\"peugeot 504\\\"\\n23.9 8. 260.0 90.00 3420. 22.2 79. 1. \\\"oldsmobile cutlass salon brougham\\\"\\n34.2 4. 105.0 70.00 2200. 13.2 79. 1. \\\"plymouth horizon\\\"\\n34.5 4. 105.0 70.00 2150. 14.9 79. 1. \\\"plymouth horizon tc3\\\"\\n31.8 4. 85.00 65.00 2020. 19.2 79. 3. \\\"datsun 210\\\"\\n37.3 4. 91.00 69.00 2130. 14.7 79. 2. \\\"fiat strada custom\\\"\\n28.4 4. 151.0 90.00 2670. 16.0 79. 1. \\\"buick skylark limited\\\"\\n28.8 6. 173.0 115.0 2595. 11.3 79. 1. \\\"chevrolet citation\\\"\\n26.8 6. 173.0 115.0 2700. 12.9 79. 1. \\\"oldsmobile omega brougham\\\"\\n33.5 4. 151.0 90.00 2556. 13.2 79. 1. \\\"pontiac phoenix\\\"\\n41.5 4. 98.00 76.00 2144. 14.7 80. 2. \\\"vw rabbit\\\"\\n38.1 4. 89.00 60.00 1968. 18.8 80. 3. 
\\\"toyota corolla tercel\\\"\\n32.1 4. 98.00 70.00 2120. 15.5 80. 1. \\\"chevrolet chevette\\\"\\n37.2 4. 86.00 65.00 2019. 16.4 80. 3. \\\"datsun 310\\\"\\n28.0 4. 151.0 90.00 2678. 16.5 80. 1. \\\"chevrolet citation\\\"\\n26.4 4. 140.0 88.00 2870. 18.1 80. 1. \\\"ford fairmont\\\"\\n24.3 4. 151.0 90.00 3003. 20.1 80. 1. \\\"amc concord\\\"\\n19.1 6. 225.0 90.00 3381. 18.7 80. 1. \\\"dodge aspen\\\"\\n34.3 4. 97.00 78.00 2188. 15.8 80. 2. \\\"audi 4000\\\"\\n29.8 4. 134.0 90.00 2711. 15.5 80. 3. \\\"toyota corona liftback\\\"\\n31.3 4. 120.0 75.00 2542. 17.5 80. 3. \\\"mazda 626\\\"\\n37.0 4. 119.0 92.00 2434. 15.0 80. 3. \\\"datsun 510 hatchback\\\"\\n32.2 4. 108.0 75.00 2265. 15.2 80. 3. \\\"toyota corolla\\\"\\n46.6 4. 86.00 65.00 2110. 17.9 80. 3. \\\"mazda glc\\\"\\n27.9 4. 156.0 105.0 2800. 14.4 80. 1. \\\"dodge colt\\\"\\n40.8 4. 85.00 65.00 2110. 19.2 80. 3. \\\"datsun 210\\\"\\n44.3 4. 90.00 48.00 2085. 21.7 80. 2. \\\"vw rabbit c (diesel)\\\"\\n43.4 4. 90.00 48.00 2335. 23.7 80. 2. \\\"vw dasher (diesel)\\\"\\n36.4 5. 121.0 67.00 2950. 19.9 80. 2. \\\"audi 5000s (diesel)\\\"\\n30.0 4. 146.0 67.00 3250. 21.8 80. 2. \\\"mercedes-benz 240d\\\"\\n44.6 4. 91.00 67.00 1850. 13.8 80. 3. \\\"honda civic 1500 gl\\\"\\n40.9 4. 85.00 NA 1835. 17.3 80. 2. \\\"renault lecar deluxe\\\"\\n33.8 4. 97.00 67.00 2145. 18.0 80. 3. \\\"subaru dl\\\"\\n29.8 4. 89.00 62.00 1845. 15.3 80. 2. \\\"vokswagen rabbit\\\"\\n32.7 6. 168.0 132.0 2910. 11.4 80. 3. \\\"datsun 280-zx\\\"\\n23.7 3. 70.00 100.0 2420. 12.5 80. 3. \\\"mazda rx-7 gs\\\"\\n35.0 4. 122.0 88.00 2500. 15.1 80. 2. \\\"triumph tr7 coupe\\\"\\n23.6 4. 140.0 NA 2905. 14.3 80. 1. \\\"ford mustang cobra\\\"\\n32.4 4. 107.0 72.00 2290. 17.0 80. 3. \\\"honda accord\\\"\\n27.2 4. 135.0 84.00 2490. 15.7 81. 1. \\\"plymouth reliant\\\"\\n26.6 4. 151.0 84.00 2635. 16.4 81. 1. \\\"buick skylark\\\"\\n25.8 4. 156.0 92.00 2620. 14.4 81. 1. \\\"dodge aries wagon (sw)\\\"\\n23.5 6. 173.0 110.0 2725. 12.6 81. 1. 
\\\"chevrolet citation\\\"\\n30.0 4. 135.0 84.00 2385. 12.9 81. 1. \\\"plymouth reliant\\\"\\n39.1 4. 79.00 58.00 1755. 16.9 81. 3. \\\"toyota starlet\\\"\\n39.0 4. 86.00 64.00 1875. 16.4 81. 1. \\\"plymouth champ\\\"\\n35.1 4. 81.00 60.00 1760. 16.1 81. 3. \\\"honda civic 1300\\\"\\n32.3 4. 97.00 67.00 2065. 17.8 81. 3. \\\"subaru\\\"\\n37.0 4. 85.00 65.00 1975. 19.4 81. 3. \\\"datsun 210 mpg\\\"\\n37.7 4. 89.00 62.00 2050. 17.3 81. 3. \\\"toyota tercel\\\"\\n34.1 4. 91.00 68.00 1985. 16.0 81. 3. \\\"mazda glc 4\\\"\\n34.7 4. 105.0 63.00 2215. 14.9 81. 1. \\\"plymouth horizon 4\\\"\\n34.4 4. 98.00 65.00 2045. 16.2 81. 1. \\\"ford escort 4w\\\"\\n29.9 4. 98.00 65.00 2380. 20.7 81. 1. \\\"ford escort 2h\\\"\\n33.0 4. 105.0 74.00 2190. 14.2 81. 2. \\\"volkswagen jetta\\\"\\n34.5 4. 100.0 NA 2320. 15.8 81. 2. \\\"renault 18i\\\"\\n33.7 4. 107.0 75.00 2210. 14.4 81. 3. \\\"honda prelude\\\"\\n32.4 4. 108.0 75.00 2350. 16.8 81. 3. \\\"toyota corolla\\\"\\n32.9 4. 119.0 100.0 2615. 14.8 81. 3. \\\"datsun 200sx\\\"\\n31.6 4. 120.0 74.00 2635. 18.3 81. 3. \\\"mazda 626\\\"\\n28.1 4. 141.0 80.00 3230. 20.4 81. 2. \\\"peugeot 505s turbo diesel\\\"\\nNA 4. 121.0 110.0 2800. 15.4 81. 2. \\\"saab 900s\\\"\\n30.7 6. 145.0 76.00 3160. 19.6 81. 2. \\\"volvo diesel\\\"\\n25.4 6. 168.0 116.0 2900. 12.6 81. 3. \\\"toyota cressida\\\"\\n24.2 6. 146.0 120.0 2930. 13.8 81. 3. \\\"datsun 810 maxima\\\"\\n22.4 6. 231.0 110.0 3415. 15.8 81. 1. \\\"buick century\\\"\\n26.6 8. 350.0 105.0 3725. 19.0 81. 1. \\\"oldsmobile cutlass ls\\\"\\n20.2 6. 200.0 88.00 3060. 17.1 81. 1. \\\"ford granada gl\\\"\\n17.6 6. 225.0 85.00 3465. 16.6 81. 1. \\\"chrysler lebaron salon\\\"\\n28.0 4. 112.0 88.00 2605. 19.6 82. 1. \\\"chevrolet cavalier\\\"\\n27.0 4. 112.0 88.00 2640. 18.6 82. 1. \\\"chevrolet cavalier wagon\\\"\\n34.0 4. 112.0 88.00 2395. 18.0 82. 1. \\\"chevrolet cavalier 2-door\\\"\\n31.0 4. 112.0 85.00 2575. 16.2 82. 1. \\\"pontiac j2000 se hatchback\\\"\\n29.0 4. 135.0 84.00 2525. 16.0 82. 1. 
\\\"dodge aries se\\\"\\n27.0 4. 151.0 90.00 2735. 18.0 82. 1. \\\"pontiac phoenix\\\"\\n24.0 4. 140.0 92.00 2865. 16.4 82. 1. \\\"ford fairmont futura\\\"\\n23.0 4. 151.0 NA 3035. 20.5 82. 1. \\\"amc concord dl\\\"\\n36.0 4. 105.0 74.00 1980. 15.3 82. 2. \\\"volkswagen rabbit l\\\"\\n37.0 4. 91.00 68.00 2025. 18.2 82. 3. \\\"mazda glc custom l\\\"\\n31.0 4. 91.00 68.00 1970. 17.6 82. 3. \\\"mazda glc custom\\\"\\n38.0 4. 105.0 63.00 2125. 14.7 82. 1. \\\"plymouth horizon miser\\\"\\n36.0 4. 98.00 70.00 2125. 17.3 82. 1. \\\"mercury lynx l\\\"\\n36.0 4. 120.0 88.00 2160. 14.5 82. 3. \\\"nissan stanza xe\\\"\\n36.0 4. 107.0 75.00 2205. 14.5 82. 3. \\\"honda accord\\\"\\n34.0 4. 108.0 70.00 2245 16.9 82. 3. \\\"toyota corolla\\\"\\n38.0 4. 91.00 67.00 1965. 15.0 82. 3. \\\"honda civic\\\"\\n32.0 4. 91.00 67.00 1965. 15.7 82. 3. \\\"honda civic (auto)\\\"\\n38.0 4. 91.00 67.00 1995. 16.2 82. 3. \\\"datsun 310 gx\\\"\\n25.0 6. 181.0 110.0 2945. 16.4 82. 1. \\\"buick century limited\\\"\\n38.0 6. 262.0 85.00 3015. 17.0 82. 1. \\\"oldsmobile cutlass ciera (diesel)\\\"\\n26.0 4. 156.0 92.00 2585. 14.5 82. 1. \\\"chrysler lebaron medallion\\\"\\n22.0 6. 232.0 112.0 2835 14.7 82. 1. \\\"ford granada l\\\"\\n32.0 4. 144.0 96.00 2665. 13.9 82. 3. \\\"toyota celica gt\\\"\\n36.0 4. 135.0 84.00 2370. 13.0 82. 1. \\\"dodge charger 2.2\\\"\\n27.0 4. 151.0 90.00 2950. 17.3 82. 1. \\\"chevrolet camaro\\\"\\n27.0 4. 140.0 86.00 2790. 15.6 82. 1. \\\"ford mustang gl\\\"\\n44.0 4. 97.00 52.00 2130. 24.6 82. 2. \\\"vw pickup\\\"\\n32.0 4. 135.0 84.00 2295. 11.6 82. 1. \\\"dodge rampage\\\"\\n28.0 4. 120.0 79.00 2625. 18.6 82. 1. \\\"ford ranger\\\"\\n31.0 4. 119.0 82.00 2720. 19.4 82. 1. \\\"chevy s-10\\\"\\n\"" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str_no_tab = replace(raw_str, '\\t'=>' ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally we create an `IOBuffer` backed by the string we have just created." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=32149, maxsize=Inf, ptr=1, mark=-1)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "io = IOBuffer(str_no_tab)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can think of the variable `io` as an in-memory I/O stream. Therefore we can pass this stream to the `CSV.File` function to read it as if it were a CSV file. Note that in the options we choose that:\n", "* the delimiter is a space\n", "* we ignore repeated (consecutive) occurrences of the delimiter (so we correctly handle our file, which has columns padded by spaces)\n", "* we explicitly pass column names via the `header` keyword argument\n", "* we specify that missing values are represented by the `\"NA\"` string in our file\n", "\n", "Finally, note that we pass the result of the `CSV.File` operation to the `DataFrame` constructor using `|>`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

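<p>406 rows × 9 columns (omitted printing of 1 columns)</p>
<table>
<thead>
<tr><th></th><th>mpg</th><th>cylinders</th><th>displacement</th><th>horsepower</th><th>weight</th><th>acceleration</th><th>year</th><th>origin</th></tr>
<tr><th></th><th>Float64?</th><th>Float64</th><th>Float64</th><th>Float64?</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>18.0</td><td>8.0</td><td>307.0</td><td>130.0</td><td>3504.0</td><td>12.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>2</th><td>15.0</td><td>8.0</td><td>350.0</td><td>165.0</td><td>3693.0</td><td>11.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>3</th><td>18.0</td><td>8.0</td><td>318.0</td><td>150.0</td><td>3436.0</td><td>11.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>4</th><td>16.0</td><td>8.0</td><td>304.0</td><td>150.0</td><td>3433.0</td><td>12.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>5</th><td>17.0</td><td>8.0</td><td>302.0</td><td>140.0</td><td>3449.0</td><td>10.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>6</th><td>15.0</td><td>8.0</td><td>429.0</td><td>198.0</td><td>4341.0</td><td>10.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>7</th><td>14.0</td><td>8.0</td><td>454.0</td><td>220.0</td><td>4354.0</td><td>9.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>8</th><td>14.0</td><td>8.0</td><td>440.0</td><td>215.0</td><td>4312.0</td><td>8.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>9</th><td>14.0</td><td>8.0</td><td>455.0</td><td>225.0</td><td>4425.0</td><td>10.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>10</th><td>15.0</td><td>8.0</td><td>390.0</td><td>190.0</td><td>3850.0</td><td>8.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>11</th><td>missing</td><td>4.0</td><td>133.0</td><td>115.0</td><td>3090.0</td><td>17.5</td><td>70.0</td><td>2.0</td></tr>
<tr><th>12</th><td>missing</td><td>8.0</td><td>350.0</td><td>165.0</td><td>4142.0</td><td>11.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>13</th><td>missing</td><td>8.0</td><td>351.0</td><td>153.0</td><td>4034.0</td><td>11.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>14</th><td>missing</td><td>8.0</td><td>383.0</td><td>175.0</td><td>4166.0</td><td>10.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>15</th><td>missing</td><td>8.0</td><td>360.0</td><td>175.0</td><td>3850.0</td><td>11.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>16</th><td>15.0</td><td>8.0</td><td>383.0</td><td>170.0</td><td>3563.0</td><td>10.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>17</th><td>14.0</td><td>8.0</td><td>340.0</td><td>160.0</td><td>3609.0</td><td>8.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>18</th><td>missing</td><td>8.0</td><td>302.0</td><td>140.0</td><td>3353.0</td><td>8.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>19</th><td>15.0</td><td>8.0</td><td>400.0</td><td>150.0</td><td>3761.0</td><td>9.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>20</th><td>14.0</td><td>8.0</td><td>455.0</td><td>225.0</td><td>3086.0</td><td>10.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>21</th><td>24.0</td><td>4.0</td><td>113.0</td><td>95.0</td><td>2372.0</td><td>15.0</td><td>70.0</td><td>3.0</td></tr>
<tr><th>22</th><td>22.0</td><td>6.0</td><td>198.0</td><td>95.0</td><td>2833.0</td><td>15.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>23</th><td>18.0</td><td>6.0</td><td>199.0</td><td>97.0</td><td>2774.0</td><td>15.5</td><td>70.0</td><td>1.0</td></tr>
<tr><th>24</th><td>21.0</td><td>6.0</td><td>200.0</td><td>85.0</td><td>2587.0</td><td>16.0</td><td>70.0</td><td>1.0</td></tr>
<tr><th>25</th><td>27.0</td><td>4.0</td><td>97.0</td><td>88.0</td><td>2130.0</td><td>14.5</td><td>70.0</td><td>3.0</td></tr>
<tr><th>26</th><td>26.0</td><td>4.0</td><td>97.0</td><td>46.0</td><td>1835.0</td><td>20.5</td><td>70.0</td><td>2.0</td></tr>
<tr><th>27</th><td>25.0</td><td>4.0</td><td>110.0</td><td>87.0</td><td>2672.0</td><td>17.5</td><td>70.0</td><td>2.0</td></tr>
<tr><th>28</th><td>24.0</td><td>4.0</td><td>107.0</td><td>90.0</td><td>2430.0</td><td>14.5</td><td>70.0</td><td>2.0</td></tr>
<tr><th>29</th><td>25.0</td><td>4.0</td><td>104.0</td><td>95.0</td><td>2375.0</td><td>17.5</td><td>70.0</td><td>2.0</td></tr>
<tr><th>30</th><td>26.0</td><td>4.0</td><td>121.0</td><td>113.0</td><td>2234.0</td><td>12.5</td><td>70.0</td><td>2.0</td></tr>
</tbody>
</table>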
" ], "text/latex": [ "\\begin{tabular}{r|ccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & \\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64 & \\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t11 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t12 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t13 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t14 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t15 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t16 & 15.0 & 8.0 & 383.0 & 170.0 & 3563.0 & 10.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t17 & 14.0 & 8.0 & 340.0 & 160.0 & 3609.0 & 8.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t18 & \\emph{missing} & 8.0 & 302.0 & 140.0 & 3353.0 & 8.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t19 & 15.0 & 8.0 & 400.0 & 150.0 & 3761.0 & 9.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t20 & 14.0 & 8.0 & 455.0 & 225.0 & 3086.0 & 10.0 & 70.0 & 1.0 & 
$\\dots$ \\\\\n", "\t21 & 24.0 & 4.0 & 113.0 & 95.0 & 2372.0 & 15.0 & 70.0 & 3.0 & $\\dots$ \\\\\n", "\t22 & 22.0 & 6.0 & 198.0 & 95.0 & 2833.0 & 15.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t23 & 18.0 & 6.0 & 199.0 & 97.0 & 2774.0 & 15.5 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t24 & 21.0 & 6.0 & 200.0 & 85.0 & 2587.0 & 16.0 & 70.0 & 1.0 & $\\dots$ \\\\\n", "\t25 & 27.0 & 4.0 & 97.0 & 88.0 & 2130.0 & 14.5 & 70.0 & 3.0 & $\\dots$ \\\\\n", "\t26 & 26.0 & 4.0 & 97.0 & 46.0 & 1835.0 & 20.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t27 & 25.0 & 4.0 & 110.0 & 87.0 & 2672.0 & 17.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t28 & 24.0 & 4.0 & 107.0 & 90.0 & 2430.0 & 14.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t29 & 25.0 & 4.0 & 104.0 & 95.0 & 2375.0 & 17.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t30 & 26.0 & 4.0 & 121.0 & 113.0 & 2234.0 & 12.5 & 70.0 & 2.0 & $\\dots$ \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×9 DataFrame. 
Omitted printing of 4 columns\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │\n", "├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │\n", "│ 4 │ 16.0 │ 8.0 │ 304.0 │ 150.0 │ 3433.0 │\n", "│ 5 │ 17.0 │ 8.0 │ 302.0 │ 140.0 │ 3449.0 │\n", "│ 6 │ 15.0 │ 8.0 │ 429.0 │ 198.0 │ 4341.0 │\n", "│ 7 │ 14.0 │ 8.0 │ 454.0 │ 220.0 │ 4354.0 │\n", "│ 8 │ 14.0 │ 8.0 │ 440.0 │ 215.0 │ 4312.0 │\n", "│ 9 │ 14.0 │ 8.0 │ 455.0 │ 225.0 │ 4425.0 │\n", "│ 10 │ 15.0 │ 8.0 │ 390.0 │ 190.0 │ 3850.0 │\n", "⋮\n", "│ 396 │ 38.0 │ 6.0 │ 262.0 │ 85.0 │ 3015.0 │\n", "│ 397 │ 26.0 │ 4.0 │ 156.0 │ 92.0 │ 2585.0 │\n", "│ 398 │ 22.0 │ 6.0 │ 232.0 │ 112.0 │ 2835.0 │\n", "│ 399 │ 32.0 │ 4.0 │ 144.0 │ 96.0 │ 2665.0 │\n", "│ 400 │ 36.0 │ 4.0 │ 135.0 │ 84.0 │ 2370.0 │\n", "│ 401 │ 27.0 │ 4.0 │ 151.0 │ 90.0 │ 2950.0 │\n", "│ 402 │ 27.0 │ 4.0 │ 140.0 │ 86.0 │ 2790.0 │\n", "│ 403 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │\n", "│ 404 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │\n", "│ 405 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │\n", "│ 406 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 = CSV.File(io,\n", " delim=' ',\n", " ignorerepeated=true,\n", " header=[:mpg, :cylinders, :displacement, :horsepower,\n", " :weight, :acceleration, :year, :origin, :name],\n", " missingstring=\"NA\") |>\n", " DataFrame" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that not all columns of the data frame have been displayed and we see 30 rows (they do not fit on my screen).\n", "\n", "It is easy to change the default maximum width and height of the output by setting appropriate values in `ENV` 
dictionary." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(200, 15)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ENV[\"COLUMNS\"], ENV[\"LINES\"] = 200, 15" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

406 rows × 9 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearoriginname
Float64?Float64Float64Float64?Float64Float64Float64Float64String
118.08.0307.0130.03504.012.070.01.0chevrolet chevelle malibu
215.08.0350.0165.03693.011.570.01.0buick skylark 320
318.08.0318.0150.03436.011.070.01.0plymouth satellite
416.08.0304.0150.03433.012.070.01.0amc rebel sst
517.08.0302.0140.03449.010.570.01.0ford torino
615.08.0429.0198.04341.010.070.01.0ford galaxie 500
714.08.0454.0220.04354.09.070.01.0chevrolet impala
814.08.0440.0215.04312.08.570.01.0plymouth fury iii
914.08.0455.0225.04425.010.070.01.0pontiac catalina
1015.08.0390.0190.03850.08.570.01.0amc ambassador dpl
11missing4.0133.0115.03090.017.570.02.0citroen ds-21 pallas
12missing8.0350.0165.04142.011.570.01.0chevrolet chevelle concours (sw)
13missing8.0351.0153.04034.011.070.01.0ford torino (sw)
14missing8.0383.0175.04166.010.570.01.0plymouth satellite (sw)
15missing8.0360.0175.03850.011.070.01.0amc rebel sst (sw)
" ], "text/latex": [ "\\begin{tabular}{r|ccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64 & String\\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 & chevrolet chevelle malibu \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 & buick skylark 320 \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 & plymouth satellite \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 & amc rebel sst \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 & ford torino \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 & ford galaxie 500 \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 & chevrolet impala \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 & plymouth fury iii \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 & pontiac catalina \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 & amc ambassador dpl \\\\\n", "\t11 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 & citroen ds-21 pallas \\\\\n", "\t12 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 & chevrolet chevelle concours (sw) \\\\\n", "\t13 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 & ford torino (sw) \\\\\n", "\t14 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 & plymouth satellite (sw) \\\\\n", "\t15 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 & amc rebel sst (sw) \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×9 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ 
acceleration │ year │ origin │ name │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │\n", "├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼───────────────────────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │ 12.0 │ 70.0 │ 1.0 │ chevrolet chevelle malibu │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │ 11.5 │ 70.0 │ 1.0 │ buick skylark 320 │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │ 11.0 │ 70.0 │ 1.0 │ plymouth satellite │\n", "⋮\n", "│ 403 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │ 24.6 │ 82.0 │ 2.0 │ vw pickup │\n", "│ 404 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │ 11.6 │ 82.0 │ 1.0 │ dodge rampage │\n", "│ 405 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │ 18.6 │ 82.0 │ 1.0 │ ford ranger │\n", "│ 406 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │ 19.4 │ 82.0 │ 1.0 │ chevy s-10 │" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the display is much nicer." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us discuss an alternative way to read in the original file.\n", "This time we will first read the data in directly from the file." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

406 rows × 2 columns

metricsname
StringString
118.0 8. 307.0 130.0 3504. 12.0 70. 1.chevrolet chevelle malibu
215.0 8. 350.0 165.0 3693. 11.5 70. 1.buick skylark 320
318.0 8. 318.0 150.0 3436. 11.0 70. 1.plymouth satellite
416.0 8. 304.0 150.0 3433. 12.0 70. 1.amc rebel sst
517.0 8. 302.0 140.0 3449. 10.5 70. 1.ford torino
615.0 8. 429.0 198.0 4341. 10.0 70. 1.ford galaxie 500
714.0 8. 454.0 220.0 4354. 9.0 70. 1.chevrolet impala
814.0 8. 440.0 215.0 4312. 8.5 70. 1.plymouth fury iii
914.0 8. 455.0 225.0 4425. 10.0 70. 1.pontiac catalina
1015.0 8. 390.0 190.0 3850. 8.5 70. 1.amc ambassador dpl
11NA 4. 133.0 115.0 3090. 17.5 70. 2.citroen ds-21 pallas
12NA 8. 350.0 165.0 4142. 11.5 70. 1.chevrolet chevelle concours (sw)
13NA 8. 351.0 153.0 4034. 11.0 70. 1.ford torino (sw)
14NA 8. 383.0 175.0 4166. 10.5 70. 1.plymouth satellite (sw)
15NA 8. 360.0 175.0 3850. 11.0 70. 1.amc rebel sst (sw)
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& metrics & name\\\\\n", "\t\\hline\n", "\t& String & String\\\\\n", "\t\\hline\n", "\t1 & 18.0 8. 307.0 130.0 3504. 12.0 70. 1. & chevrolet chevelle malibu \\\\\n", "\t2 & 15.0 8. 350.0 165.0 3693. 11.5 70. 1. & buick skylark 320 \\\\\n", "\t3 & 18.0 8. 318.0 150.0 3436. 11.0 70. 1. & plymouth satellite \\\\\n", "\t4 & 16.0 8. 304.0 150.0 3433. 12.0 70. 1. & amc rebel sst \\\\\n", "\t5 & 17.0 8. 302.0 140.0 3449. 10.5 70. 1. & ford torino \\\\\n", "\t6 & 15.0 8. 429.0 198.0 4341. 10.0 70. 1. & ford galaxie 500 \\\\\n", "\t7 & 14.0 8. 454.0 220.0 4354. 9.0 70. 1. & chevrolet impala \\\\\n", "\t8 & 14.0 8. 440.0 215.0 4312. 8.5 70. 1. & plymouth fury iii \\\\\n", "\t9 & 14.0 8. 455.0 225.0 4425. 10.0 70. 1. & pontiac catalina \\\\\n", "\t10 & 15.0 8. 390.0 190.0 3850. 8.5 70. 1. & amc ambassador dpl \\\\\n", "\t11 & NA 4. 133.0 115.0 3090. 17.5 70. 2. & citroen ds-21 pallas \\\\\n", "\t12 & NA 8. 350.0 165.0 4142. 11.5 70. 1. & chevrolet chevelle concours (sw) \\\\\n", "\t13 & NA 8. 351.0 153.0 4034. 11.0 70. 1. & ford torino (sw) \\\\\n", "\t14 & NA 8. 383.0 175.0 4166. 10.5 70. 1. & plymouth satellite (sw) \\\\\n", "\t15 & NA 8. 360.0 175.0 3850. 11.0 70. 1. & amc rebel sst (sw) \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×2 DataFrame\n", "│ Row │ metrics │ name │\n", "│ │ \u001b[90mString\u001b[39m │ \u001b[90mString\u001b[39m │\n", "├─────┼─────────────────────────────────────────────────────────────┼───────────────────────────┤\n", "│ 1 │ 18.0 8. 307.0 130.0 3504. 12.0 70. 1. │ chevrolet chevelle malibu │\n", "│ 2 │ 15.0 8. 350.0 165.0 3693. 11.5 70. 1. │ buick skylark 320 │\n", "│ 3 │ 18.0 8. 318.0 150.0 3436. 11.0 70. 1. │ plymouth satellite │\n", "⋮\n", "│ 403 │ 44.0 4. 97.00 52.00 2130. 24.6 82. 2. │ vw pickup │\n", "│ 404 │ 32.0 4. 135.0 84.00 2295. 11.6 82. 1. │ dodge rampage │\n", "│ 405 │ 28.0 4. 120.0 79.00 2625. 18.6 82. 1. 
│ ford ranger │\n", "│ 406 │ 31.0 4. 119.0 82.00 2720. 19.4 82. 1. │ chevy s-10 │" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_raw = CSV.File(\"auto.txt\", header=[:metrics, :name]) |> DataFrame" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that this time CSV.jl auto-detected that tab is the right delimiter to split the columns\n", "\n", "(it was the only delimiter that produced a consistent number of columns)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will now split the `:metrics` column manually." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "406-element Array{Array{SubString{String},1},1}:\n", " [\"18.0\", \"8.\", \"307.0\", \"130.0\", \"3504.\", \"12.0\", \"70.\", \"1.\"]\n", " [\"15.0\", \"8.\", \"350.0\", \"165.0\", \"3693.\", \"11.5\", \"70.\", \"1.\"]\n", " [\"18.0\", \"8.\", \"318.0\", \"150.0\", \"3436.\", \"11.0\", \"70.\", \"1.\"]\n", " [\"16.0\", \"8.\", \"304.0\", \"150.0\", \"3433.\", \"12.0\", \"70.\", \"1.\"]\n", " [\"17.0\", \"8.\", \"302.0\", \"140.0\", \"3449.\", \"10.5\", \"70.\", \"1.\"]\n", " ⋮\n", " [\"27.0\", \"4.\", \"140.0\", \"86.00\", \"2790.\", \"15.6\", \"82.\", \"1.\"]\n", " [\"44.0\", \"4.\", \"97.00\", \"52.00\", \"2130.\", \"24.6\", \"82.\", \"2.\"]\n", " [\"32.0\", \"4.\", \"135.0\", \"84.00\", \"2295.\", \"11.6\", \"82.\", \"1.\"]\n", " [\"28.0\", \"4.\", \"120.0\", \"79.00\", \"2625.\", \"18.6\", \"82.\", \"1.\"]\n", " [\"31.0\", \"4.\", \"119.0\", \"82.00\", \"2720.\", \"19.4\", \"82.\", \"1.\"]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str_metrics = split.(df_raw.metrics)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us create an empty `df1_2` data frame that we will populate with appropriate columns.\n", "The pattern we use here is typical when you e.g. 
perform repeated computations whose results you want to store in a `DataFrame`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

0 rows × 8 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearorigin
Float64Float64Float64Float64Float64Float64Float64Float64
" ], "text/latex": [ "\\begin{tabular}{r|cccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin\\\\\n", "\t\\hline\n", "\t& Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64\\\\\n", "\t\\hline\n", "\\end{tabular}\n" ], "text/plain": [ "0×8 DataFrame\n" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2 = DataFrame(fill(Float64, 8),\n", " [:mpg, :cylinders, :displacement, :horsepower, :weight, :acceleration, :year, :origin])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have a data frame that has 8 columns and 0 rows. It accepts floating point values. However, in columns `:mpg` and `:horsepower` we have to allow the data frame to hold missing values. We do it using the `allowmissing!` function:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

0 rows × 8 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearorigin
Float64?Float64Float64Float64?Float64Float64Float64Float64
" ], "text/latex": [ "\\begin{tabular}{r|cccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64\\\\\n", "\t\\hline\n", "\\end{tabular}\n" ], "text/plain": [ "0×8 DataFrame\n" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "allowmissing!(df1_2, [:mpg, :horsepower])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the element type of columns `:mpg` and `:horsepower` changed to `Float64?`, which signals that these columns allow missing values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are ready to populate our data frame." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "for row in str_metrics\n", " push!(df1_2, [v == \"NA\" ? missing : parse(Float64, v) for v in row])\n", "end" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

406 rows × 8 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearorigin
Float64?Float64Float64Float64?Float64Float64Float64Float64
118.08.0307.0130.03504.012.070.01.0
215.08.0350.0165.03693.011.570.01.0
318.08.0318.0150.03436.011.070.01.0
416.08.0304.0150.03433.012.070.01.0
517.08.0302.0140.03449.010.570.01.0
615.08.0429.0198.04341.010.070.01.0
714.08.0454.0220.04354.09.070.01.0
814.08.0440.0215.04312.08.570.01.0
914.08.0455.0225.04425.010.070.01.0
1015.08.0390.0190.03850.08.570.01.0
11missing4.0133.0115.03090.017.570.02.0
12missing8.0350.0165.04142.011.570.01.0
13missing8.0351.0153.04034.011.070.01.0
14missing8.0383.0175.04166.010.570.01.0
15missing8.0360.0175.03850.011.070.01.0
" ], "text/latex": [ "\\begin{tabular}{r|cccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64\\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 \\\\\n", "\t11 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 \\\\\n", "\t12 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 \\\\\n", "\t13 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 \\\\\n", "\t14 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 \\\\\n", "\t15 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×8 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │\n", 
"├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │ 12.0 │ 70.0 │ 1.0 │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │ 11.5 │ 70.0 │ 1.0 │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │ 11.0 │ 70.0 │ 1.0 │\n", "⋮\n", "│ 403 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │ 24.6 │ 82.0 │ 2.0 │\n", "│ 404 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │ 11.6 │ 82.0 │ 1.0 │\n", "│ 405 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │ 18.6 │ 82.0 │ 1.0 │\n", "│ 406 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │ 19.4 │ 82.0 │ 1.0 │" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "finally, let us add a column `:name` from the `df_raw` data frame" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "406-element WeakRefStrings.StringArray{String,1}:\n", " \"chevrolet chevelle malibu\"\n", " \"buick skylark 320\"\n", " \"plymouth satellite\"\n", " \"amc rebel sst\"\n", " \"ford torino\"\n", " ⋮\n", " \"ford mustang gl\"\n", " \"vw pickup\"\n", " \"dodge rampage\"\n", " \"ford ranger\"\n", " \"chevy s-10\"" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2.name = df_raw.name" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

406 rows × 9 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearoriginname
Float64?Float64Float64Float64?Float64Float64Float64Float64String
118.08.0307.0130.03504.012.070.01.0chevrolet chevelle malibu
215.08.0350.0165.03693.011.570.01.0buick skylark 320
318.08.0318.0150.03436.011.070.01.0plymouth satellite
416.08.0304.0150.03433.012.070.01.0amc rebel sst
517.08.0302.0140.03449.010.570.01.0ford torino
615.08.0429.0198.04341.010.070.01.0ford galaxie 500
714.08.0454.0220.04354.09.070.01.0chevrolet impala
814.08.0440.0215.04312.08.570.01.0plymouth fury iii
914.08.0455.0225.04425.010.070.01.0pontiac catalina
1015.08.0390.0190.03850.08.570.01.0amc ambassador dpl
11missing4.0133.0115.03090.017.570.02.0citroen ds-21 pallas
12missing8.0350.0165.04142.011.570.01.0chevrolet chevelle concours (sw)
13missing8.0351.0153.04034.011.070.01.0ford torino (sw)
14missing8.0383.0175.04166.010.570.01.0plymouth satellite (sw)
15missing8.0360.0175.03850.011.070.01.0amc rebel sst (sw)
" ], "text/latex": [ "\\begin{tabular}{r|ccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64 & String\\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 & chevrolet chevelle malibu \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 & buick skylark 320 \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 & plymouth satellite \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 & amc rebel sst \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 & ford torino \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 & ford galaxie 500 \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 & chevrolet impala \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 & plymouth fury iii \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 & pontiac catalina \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 & amc ambassador dpl \\\\\n", "\t11 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 & citroen ds-21 pallas \\\\\n", "\t12 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 & chevrolet chevelle concours (sw) \\\\\n", "\t13 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 & ford torino (sw) \\\\\n", "\t14 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 & plymouth satellite (sw) \\\\\n", "\t15 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 & amc rebel sst (sw) \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×9 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ 
acceleration │ year │ origin │ name │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │\n", "├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼───────────────────────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │ 12.0 │ 70.0 │ 1.0 │ chevrolet chevelle malibu │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │ 11.5 │ 70.0 │ 1.0 │ buick skylark 320 │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │ 11.0 │ 70.0 │ 1.0 │ plymouth satellite │\n", "⋮\n", "│ 403 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │ 24.6 │ 82.0 │ 2.0 │ vw pickup │\n", "│ 404 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │ 11.6 │ 82.0 │ 1.0 │ dodge rampage │\n", "│ 405 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │ 18.6 │ 82.0 │ 1.0 │ ford ranger │\n", "│ 406 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │ 19.4 │ 82.0 │ 1.0 │ chevy s-10 │" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we move forward we should stress one very important thing that is related to `df1_2.name = df_raw.name` assignment. After this operation columns `:name` in both `df1_2` and `df_raw` are the same objects.\n", "We can easily check it:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2.name === df_raw.name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Such behavior is allowed for performance reasons. 
In this case we accepted it as we want to discard `df_raw` data frame and not use it in later analysis.\n", "However, in general it would be safer to create a copy of `:name` column when assigning it to `df1_2` data frame.\n", "\n", "It can be achieved like this:\n", "```\n", "df1_2[:, :name] = df_raw.name\n", "```\n", "or like this\n", "```\n", "df1_2.name = df_raw[:, :name]\n", "```\n", "We could also write:\n", "```\n", "df1_2[:, :name] = df_raw[:, :name]\n", "```\n", "but this time one unnecessary copy would be made (the data is copied once when reading from `df_raw` and again when writing to `df1_2`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can check that `df1` and `df1_2` data frames are equal using the `isequal` function, which confirms that we ended up with identical data frames." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isequal(df1_2, df1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that as the data frames contain missing values, comparing them with `==` would produce `missing`." 
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "missing" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1_2 == df1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can easily count the number of missing values in the `df1` data frame using the `eachcol` function that returns the iterator over columns of the data frame:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(count(ismissing, col) for col in eachcol(df1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We could alternatively transform our data frame to a `Matrix` and use `count` on it (it would be slower though as unnecessary copies of data would be performed):" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "count(ismissing, Matrix(df1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we could use the `mapcols` function to get the number of missings per column:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

1 rows × 9 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearoriginname
Int64Int64Int64Int64Int64Int64Int64Int64Int64
1800600000
" ], "text/latex": [ "\\begin{tabular}{r|ccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name\\\\\n", "\t\\hline\n", "\t& Int64 & Int64 & Int64 & Int64 & Int64 & Int64 & Int64 & Int64 & Int64\\\\\n", "\t\\hline\n", "\t1 & 8 & 0 & 0 & 6 & 0 & 0 & 0 & 0 & 0 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "1×9 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │\n", "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │ \u001b[90mInt64\u001b[39m │\n", "├─────┼───────┼───────────┼──────────────┼────────────┼────────┼──────────────┼───────┼────────┼───────┤\n", "│ 1 │ 8 │ 0 │ 0 │ 6 │ 0 │ 0 │ 0 │ 0 │ 0 │" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mapcols(x -> count(ismissing, x), df1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also easy to find the rows containing missing values using the `filter` function:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

14 rows × 9 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearoriginname
Float64?Float64Float64Float64?Float64Float64Float64Float64String
1missing4.0133.0115.03090.017.570.02.0citroen ds-21 pallas
2missing8.0350.0165.04142.011.570.01.0chevrolet chevelle concours (sw)
3missing8.0351.0153.04034.011.070.01.0ford torino (sw)
4missing8.0383.0175.04166.010.570.01.0plymouth satellite (sw)
5missing8.0360.0175.03850.011.070.01.0amc rebel sst (sw)
6missing8.0302.0140.03353.08.070.01.0ford mustang boss 302
725.04.098.0missing2046.019.071.01.0ford pinto
8missing4.097.048.01978.020.071.02.0volkswagen super beetle 117
921.06.0200.0missing2875.017.074.01.0ford maverick
1040.94.085.0missing1835.017.380.02.0renault lecar deluxe
1123.64.0140.0missing2905.014.380.01.0ford mustang cobra
1234.54.0100.0missing2320.015.881.02.0renault 18i
13missing4.0121.0110.02800.015.481.02.0saab 900s
1423.04.0151.0missing3035.020.582.01.0amc concord dl
" ], "text/latex": [ "\\begin{tabular}{r|ccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64 & String\\\\\n", "\t\\hline\n", "\t1 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 & citroen ds-21 pallas \\\\\n", "\t2 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 & chevrolet chevelle concours (sw) \\\\\n", "\t3 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 & ford torino (sw) \\\\\n", "\t4 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 & plymouth satellite (sw) \\\\\n", "\t5 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 & amc rebel sst (sw) \\\\\n", "\t6 & \\emph{missing} & 8.0 & 302.0 & 140.0 & 3353.0 & 8.0 & 70.0 & 1.0 & ford mustang boss 302 \\\\\n", "\t7 & 25.0 & 4.0 & 98.0 & \\emph{missing} & 2046.0 & 19.0 & 71.0 & 1.0 & ford pinto \\\\\n", "\t8 & \\emph{missing} & 4.0 & 97.0 & 48.0 & 1978.0 & 20.0 & 71.0 & 2.0 & volkswagen super beetle 117 \\\\\n", "\t9 & 21.0 & 6.0 & 200.0 & \\emph{missing} & 2875.0 & 17.0 & 74.0 & 1.0 & ford maverick \\\\\n", "\t10 & 40.9 & 4.0 & 85.0 & \\emph{missing} & 1835.0 & 17.3 & 80.0 & 2.0 & renault lecar deluxe \\\\\n", "\t11 & 23.6 & 4.0 & 140.0 & \\emph{missing} & 2905.0 & 14.3 & 80.0 & 1.0 & ford mustang cobra \\\\\n", "\t12 & 34.5 & 4.0 & 100.0 & \\emph{missing} & 2320.0 & 15.8 & 81.0 & 2.0 & renault 18i \\\\\n", "\t13 & \\emph{missing} & 4.0 & 121.0 & 110.0 & 2800.0 & 15.4 & 81.0 & 2.0 & saab 900s \\\\\n", "\t14 & 23.0 & 4.0 & 151.0 & \\emph{missing} & 3035.0 & 20.5 & 82.0 & 1.0 & amc concord dl \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "14×9 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m 
│ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │\n", "├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼──────────────────────────────────┤\n", "│ 1 │ \u001b[90mmissing\u001b[39m │ 4.0 │ 133.0 │ 115.0 │ 3090.0 │ 17.5 │ 70.0 │ 2.0 │ citroen ds-21 pallas │\n", "│ 2 │ \u001b[90mmissing\u001b[39m │ 8.0 │ 350.0 │ 165.0 │ 4142.0 │ 11.5 │ 70.0 │ 1.0 │ chevrolet chevelle concours (sw) │\n", "│ 3 │ \u001b[90mmissing\u001b[39m │ 8.0 │ 351.0 │ 153.0 │ 4034.0 │ 11.0 │ 70.0 │ 1.0 │ ford torino (sw) │\n", "⋮\n", "│ 11 │ 23.6 │ 4.0 │ 140.0 │ \u001b[90mmissing\u001b[39m │ 2905.0 │ 14.3 │ 80.0 │ 1.0 │ ford mustang cobra │\n", "│ 12 │ 34.5 │ 4.0 │ 100.0 │ \u001b[90mmissing\u001b[39m │ 2320.0 │ 15.8 │ 81.0 │ 2.0 │ renault 18i │\n", "│ 13 │ \u001b[90mmissing\u001b[39m │ 4.0 │ 121.0 │ 110.0 │ 2800.0 │ 15.4 │ 81.0 │ 2.0 │ saab 900s │\n", "│ 14 │ 23.0 │ 4.0 │ 151.0 │ \u001b[90mmissing\u001b[39m │ 3035.0 │ 20.5 │ 82.0 │ 1.0 │ amc concord dl │" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filter(row -> any(ismissing, row), df1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Assume we are interested in the brand of each car. We can extract it from `:name` column using broadcasting like this:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "406-element Array{SubString{String},1}:\n", " \"chevrolet\"\n", " \"buick\"\n", " \"plymouth\"\n", " \"amc\"\n", " \"ford\"\n", " ⋮\n", " \"ford\"\n", " \"vw\"\n", " \"dodge\"\n", " \"ford\"\n", " \"chevy\"" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1.brand = first.(split.(df1.name))" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

406 rows × 10 columns

mpgcylindersdisplacementhorsepowerweightaccelerationyearoriginnamebrand
Float64?Float64Float64Float64?Float64Float64Float64Float64StringSubStri…
118.08.0307.0130.03504.012.070.01.0chevrolet chevelle malibuchevrolet
215.08.0350.0165.03693.011.570.01.0buick skylark 320buick
318.08.0318.0150.03436.011.070.01.0plymouth satelliteplymouth
416.08.0304.0150.03433.012.070.01.0amc rebel sstamc
517.08.0302.0140.03449.010.570.01.0ford torinoford
615.08.0429.0198.04341.010.070.01.0ford galaxie 500ford
714.08.0454.0220.04354.09.070.01.0chevrolet impalachevrolet
814.08.0440.0215.04312.08.570.01.0plymouth fury iiiplymouth
914.08.0455.0225.04425.010.070.01.0pontiac catalinapontiac
1015.08.0390.0190.03850.08.570.01.0amc ambassador dplamc
11missing4.0133.0115.03090.017.570.02.0citroen ds-21 pallascitroen
12missing8.0350.0165.04142.011.570.01.0chevrolet chevelle concours (sw)chevrolet
13missing8.0351.0153.04034.011.070.01.0ford torino (sw)ford
14missing8.0383.0175.04166.010.570.01.0plymouth satellite (sw)plymouth
15missing8.0360.0175.03850.011.070.01.0amc rebel sst (sw)amc
" ], "text/latex": [ "\\begin{tabular}{r|cccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name & brand\\\\\n", "\t\\hline\n", "\t& Float64? & Float64 & Float64 & Float64? & Float64 & Float64 & Float64 & Float64 & String & SubStri…\\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 & chevrolet chevelle malibu & chevrolet \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 & buick skylark 320 & buick \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 & plymouth satellite & plymouth \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 & amc rebel sst & amc \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 & ford torino & ford \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 & ford galaxie 500 & ford \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 & chevrolet impala & chevrolet \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 & plymouth fury iii & plymouth \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 & pontiac catalina & pontiac \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 & amc ambassador dpl & amc \\\\\n", "\t11 & \\emph{missing} & 4.0 & 133.0 & 115.0 & 3090.0 & 17.5 & 70.0 & 2.0 & citroen ds-21 pallas & citroen \\\\\n", "\t12 & \\emph{missing} & 8.0 & 350.0 & 165.0 & 4142.0 & 11.5 & 70.0 & 1.0 & chevrolet chevelle concours (sw) & chevrolet \\\\\n", "\t13 & \\emph{missing} & 8.0 & 351.0 & 153.0 & 4034.0 & 11.0 & 70.0 & 1.0 & ford torino (sw) & ford \\\\\n", "\t14 & \\emph{missing} & 8.0 & 383.0 & 175.0 & 4166.0 & 10.5 & 70.0 & 1.0 & plymouth satellite (sw) & plymouth \\\\\n", "\t15 & \\emph{missing} & 8.0 & 360.0 & 175.0 & 3850.0 & 11.0 & 70.0 & 1.0 & amc rebel sst (sw) & amc \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & 
$\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "406×10 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │ brand │\n", "│ │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │ \u001b[90mSubStrin…\u001b[39m │\n", "├─────┼──────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼───────────────────────────┼───────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │ 12.0 │ 70.0 │ 1.0 │ chevrolet chevelle malibu │ chevrolet │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │ 11.5 │ 70.0 │ 1.0 │ buick skylark 320 │ buick │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │ 11.0 │ 70.0 │ 1.0 │ plymouth satellite │ plymouth │\n", "⋮\n", "│ 403 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │ 24.6 │ 82.0 │ 2.0 │ vw pickup │ vw │\n", "│ 404 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │ 11.6 │ 82.0 │ 1.0 │ dodge rampage │ dodge │\n", "│ 405 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │ 18.6 │ 82.0 │ 1.0 │ ford ranger │ ford │\n", "│ 406 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │ 19.4 │ 82.0 │ 1.0 │ chevy s-10 │ chevy │" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Earlier we have shown how one can manually find rows that have missing values. A common operation is the reverse - i.e. selecting only rows that do not contain missing values. This can be achieved like this:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

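<p>392 rows × 10 columns</p>
<table>
<thead>
<tr><th></th><th>mpg</th><th>cylinders</th><th>displacement</th><th>horsepower</th><th>weight</th><th>acceleration</th><th>year</th><th>origin</th><th>name</th><th>brand</th></tr>
<tr><th></th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>String</th><th>SubStri…</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>18.0</td><td>8.0</td><td>307.0</td><td>130.0</td><td>3504.0</td><td>12.0</td><td>70.0</td><td>1.0</td><td>chevrolet chevelle malibu</td><td>chevrolet</td></tr>
<tr><th>2</th><td>15.0</td><td>8.0</td><td>350.0</td><td>165.0</td><td>3693.0</td><td>11.5</td><td>70.0</td><td>1.0</td><td>buick skylark 320</td><td>buick</td></tr>
<tr><th>3</th><td>18.0</td><td>8.0</td><td>318.0</td><td>150.0</td><td>3436.0</td><td>11.0</td><td>70.0</td><td>1.0</td><td>plymouth satellite</td><td>plymouth</td></tr>
<tr><th>4</th><td>16.0</td><td>8.0</td><td>304.0</td><td>150.0</td><td>3433.0</td><td>12.0</td><td>70.0</td><td>1.0</td><td>amc rebel sst</td><td>amc</td></tr>
<tr><th>5</th><td>17.0</td><td>8.0</td><td>302.0</td><td>140.0</td><td>3449.0</td><td>10.5</td><td>70.0</td><td>1.0</td><td>ford torino</td><td>ford</td></tr>
<tr><th>6</th><td>15.0</td><td>8.0</td><td>429.0</td><td>198.0</td><td>4341.0</td><td>10.0</td><td>70.0</td><td>1.0</td><td>ford galaxie 500</td><td>ford</td></tr>
<tr><th>7</th><td>14.0</td><td>8.0</td><td>454.0</td><td>220.0</td><td>4354.0</td><td>9.0</td><td>70.0</td><td>1.0</td><td>chevrolet impala</td><td>chevrolet</td></tr>
<tr><th>8</th><td>14.0</td><td>8.0</td><td>440.0</td><td>215.0</td><td>4312.0</td><td>8.5</td><td>70.0</td><td>1.0</td><td>plymouth fury iii</td><td>plymouth</td></tr>
<tr><th>9</th><td>14.0</td><td>8.0</td><td>455.0</td><td>225.0</td><td>4425.0</td><td>10.0</td><td>70.0</td><td>1.0</td><td>pontiac catalina</td><td>pontiac</td></tr>
<tr><th>10</th><td>15.0</td><td>8.0</td><td>390.0</td><td>190.0</td><td>3850.0</td><td>8.5</td><td>70.0</td><td>1.0</td><td>amc ambassador dpl</td><td>amc</td></tr>
<tr><th>11</th><td>15.0</td><td>8.0</td><td>383.0</td><td>170.0</td><td>3563.0</td><td>10.0</td><td>70.0</td><td>1.0</td><td>dodge challenger se</td><td>dodge</td></tr>
<tr><th>12</th><td>14.0</td><td>8.0</td><td>340.0</td><td>160.0</td><td>3609.0</td><td>8.0</td><td>70.0</td><td>1.0</td><td>plymouth 'cuda 340</td><td>plymouth</td></tr>
<tr><th>13</th><td>15.0</td><td>8.0</td><td>400.0</td><td>150.0</td><td>3761.0</td><td>9.5</td><td>70.0</td><td>1.0</td><td>chevrolet monte carlo</td><td>chevrolet</td></tr>
<tr><th>14</th><td>14.0</td><td>8.0</td><td>455.0</td><td>225.0</td><td>3086.0</td><td>10.0</td><td>70.0</td><td>1.0</td><td>buick estate wagon (sw)</td><td>buick</td></tr>
<tr><th>15</th><td>24.0</td><td>4.0</td><td>113.0</td><td>95.0</td><td>2372.0</td><td>15.0</td><td>70.0</td><td>3.0</td><td>toyota corona mark ii</td><td>toyota</td></tr>
</tbody>
</table>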
" ], "text/latex": [ "\\begin{tabular}{r|cccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name & brand\\\\\n", "\t\\hline\n", "\t& Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & String & SubStri…\\\\\n", "\t\\hline\n", "\t1 & 18.0 & 8.0 & 307.0 & 130.0 & 3504.0 & 12.0 & 70.0 & 1.0 & chevrolet chevelle malibu & chevrolet \\\\\n", "\t2 & 15.0 & 8.0 & 350.0 & 165.0 & 3693.0 & 11.5 & 70.0 & 1.0 & buick skylark 320 & buick \\\\\n", "\t3 & 18.0 & 8.0 & 318.0 & 150.0 & 3436.0 & 11.0 & 70.0 & 1.0 & plymouth satellite & plymouth \\\\\n", "\t4 & 16.0 & 8.0 & 304.0 & 150.0 & 3433.0 & 12.0 & 70.0 & 1.0 & amc rebel sst & amc \\\\\n", "\t5 & 17.0 & 8.0 & 302.0 & 140.0 & 3449.0 & 10.5 & 70.0 & 1.0 & ford torino & ford \\\\\n", "\t6 & 15.0 & 8.0 & 429.0 & 198.0 & 4341.0 & 10.0 & 70.0 & 1.0 & ford galaxie 500 & ford \\\\\n", "\t7 & 14.0 & 8.0 & 454.0 & 220.0 & 4354.0 & 9.0 & 70.0 & 1.0 & chevrolet impala & chevrolet \\\\\n", "\t8 & 14.0 & 8.0 & 440.0 & 215.0 & 4312.0 & 8.5 & 70.0 & 1.0 & plymouth fury iii & plymouth \\\\\n", "\t9 & 14.0 & 8.0 & 455.0 & 225.0 & 4425.0 & 10.0 & 70.0 & 1.0 & pontiac catalina & pontiac \\\\\n", "\t10 & 15.0 & 8.0 & 390.0 & 190.0 & 3850.0 & 8.5 & 70.0 & 1.0 & amc ambassador dpl & amc \\\\\n", "\t11 & 15.0 & 8.0 & 383.0 & 170.0 & 3563.0 & 10.0 & 70.0 & 1.0 & dodge challenger se & dodge \\\\\n", "\t12 & 14.0 & 8.0 & 340.0 & 160.0 & 3609.0 & 8.0 & 70.0 & 1.0 & plymouth 'cuda 340 & plymouth \\\\\n", "\t13 & 15.0 & 8.0 & 400.0 & 150.0 & 3761.0 & 9.5 & 70.0 & 1.0 & chevrolet monte carlo & chevrolet \\\\\n", "\t14 & 14.0 & 8.0 & 455.0 & 225.0 & 3086.0 & 10.0 & 70.0 & 1.0 & buick estate wagon (sw) & buick \\\\\n", "\t15 & 24.0 & 4.0 & 113.0 & 95.0 & 2372.0 & 15.0 & 70.0 & 3.0 & toyota corona mark ii & toyota \\\\\n", "\t$\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ & $\\dots$ \\\\\n", "\\end{tabular}\n" ], 
"text/plain": [ "392×10 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │ brand │\n", "│ │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │ \u001b[90mSubStrin…\u001b[39m │\n", "├─────┼─────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼───────────────────────────┼───────────┤\n", "│ 1 │ 18.0 │ 8.0 │ 307.0 │ 130.0 │ 3504.0 │ 12.0 │ 70.0 │ 1.0 │ chevrolet chevelle malibu │ chevrolet │\n", "│ 2 │ 15.0 │ 8.0 │ 350.0 │ 165.0 │ 3693.0 │ 11.5 │ 70.0 │ 1.0 │ buick skylark 320 │ buick │\n", "│ 3 │ 18.0 │ 8.0 │ 318.0 │ 150.0 │ 3436.0 │ 11.0 │ 70.0 │ 1.0 │ plymouth satellite │ plymouth │\n", "⋮\n", "│ 389 │ 44.0 │ 4.0 │ 97.0 │ 52.0 │ 2130.0 │ 24.6 │ 82.0 │ 2.0 │ vw pickup │ vw │\n", "│ 390 │ 32.0 │ 4.0 │ 135.0 │ 84.0 │ 2295.0 │ 11.6 │ 82.0 │ 1.0 │ dodge rampage │ dodge │\n", "│ 391 │ 28.0 │ 4.0 │ 120.0 │ 79.0 │ 2625.0 │ 18.6 │ 82.0 │ 1.0 │ ford ranger │ ford │\n", "│ 392 │ 31.0 │ 4.0 │ 119.0 │ 82.0 │ 2720.0 │ 19.4 │ 82.0 │ 1.0 │ chevy s-10 │ chevy │" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = dropmissing(df1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us find all rows that correspond to `\"saab\"` brand. You can do it in two ways, either indexing or using `filter` function:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

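<p>4 rows × 10 columns</p>
<table>
<thead>
<tr><th></th><th>mpg</th><th>cylinders</th><th>displacement</th><th>horsepower</th><th>weight</th><th>acceleration</th><th>year</th><th>origin</th><th>name</th><th>brand</th></tr>
<tr><th></th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>String</th><th>SubStri…</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>25.0</td><td>4.0</td><td>104.0</td><td>95.0</td><td>2375.0</td><td>17.5</td><td>70.0</td><td>2.0</td><td>saab 99e</td><td>saab</td></tr>
<tr><th>2</th><td>24.0</td><td>4.0</td><td>121.0</td><td>110.0</td><td>2660.0</td><td>14.0</td><td>73.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>3</th><td>25.0</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2671.0</td><td>13.5</td><td>75.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>4</th><td>21.6</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2795.0</td><td>15.7</td><td>78.0</td><td>2.0</td><td>saab 99gle</td><td>saab</td></tr>
</tbody>
</table>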
" ], "text/latex": [ "\\begin{tabular}{r|cccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name & brand\\\\\n", "\t\\hline\n", "\t& Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & String & SubStri…\\\\\n", "\t\\hline\n", "\t1 & 25.0 & 4.0 & 104.0 & 95.0 & 2375.0 & 17.5 & 70.0 & 2.0 & saab 99e & saab \\\\\n", "\t2 & 24.0 & 4.0 & 121.0 & 110.0 & 2660.0 & 14.0 & 73.0 & 2.0 & saab 99le & saab \\\\\n", "\t3 & 25.0 & 4.0 & 121.0 & 115.0 & 2671.0 & 13.5 & 75.0 & 2.0 & saab 99le & saab \\\\\n", "\t4 & 21.6 & 4.0 & 121.0 & 115.0 & 2795.0 & 15.7 & 78.0 & 2.0 & saab 99gle & saab \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×10 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │ brand │\n", "│ │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │ \u001b[90mSubStri…\u001b[39m │\n", "├─────┼─────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼────────────┼──────────┤\n", "│ 1 │ 25.0 │ 4.0 │ 104.0 │ 95.0 │ 2375.0 │ 17.5 │ 70.0 │ 2.0 │ saab 99e │ saab │\n", "│ 2 │ 24.0 │ 4.0 │ 121.0 │ 110.0 │ 2660.0 │ 14.0 │ 73.0 │ 2.0 │ saab 99le │ saab │\n", "│ 3 │ 25.0 │ 4.0 │ 121.0 │ 115.0 │ 2671.0 │ 13.5 │ 75.0 │ 2.0 │ saab 99le │ saab │\n", "│ 4 │ 21.6 │ 4.0 │ 121.0 │ 115.0 │ 2795.0 │ 15.7 │ 78.0 │ 2.0 │ saab 99gle │ saab │" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2[df2.brand .== \"saab\", :]" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

<p>4 rows × 10 columns</p>
<table>
<thead>
<tr><th></th><th>mpg</th><th>cylinders</th><th>displacement</th><th>horsepower</th><th>weight</th><th>acceleration</th><th>year</th><th>origin</th><th>name</th><th>brand</th></tr>
<tr><th></th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>String</th><th>SubStri…</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>25.0</td><td>4.0</td><td>104.0</td><td>95.0</td><td>2375.0</td><td>17.5</td><td>70.0</td><td>2.0</td><td>saab 99e</td><td>saab</td></tr>
<tr><th>2</th><td>24.0</td><td>4.0</td><td>121.0</td><td>110.0</td><td>2660.0</td><td>14.0</td><td>73.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>3</th><td>25.0</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2671.0</td><td>13.5</td><td>75.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>4</th><td>21.6</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2795.0</td><td>15.7</td><td>78.0</td><td>2.0</td><td>saab 99gle</td><td>saab</td></tr>
</tbody>
</table>
" ], "text/latex": [ "\\begin{tabular}{r|cccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name & brand\\\\\n", "\t\\hline\n", "\t& Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & String & SubStri…\\\\\n", "\t\\hline\n", "\t1 & 25.0 & 4.0 & 104.0 & 95.0 & 2375.0 & 17.5 & 70.0 & 2.0 & saab 99e & saab \\\\\n", "\t2 & 24.0 & 4.0 & 121.0 & 110.0 & 2660.0 & 14.0 & 73.0 & 2.0 & saab 99le & saab \\\\\n", "\t3 & 25.0 & 4.0 & 121.0 & 115.0 & 2671.0 & 13.5 & 75.0 & 2.0 & saab 99le & saab \\\\\n", "\t4 & 21.6 & 4.0 & 121.0 & 115.0 & 2795.0 & 15.7 & 78.0 & 2.0 & saab 99gle & saab \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×10 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │ brand │\n", "│ │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │ \u001b[90mSubStri…\u001b[39m │\n", "├─────┼─────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼────────────┼──────────┤\n", "│ 1 │ 25.0 │ 4.0 │ 104.0 │ 95.0 │ 2375.0 │ 17.5 │ 70.0 │ 2.0 │ saab 99e │ saab │\n", "│ 2 │ 24.0 │ 4.0 │ 121.0 │ 110.0 │ 2660.0 │ 14.0 │ 73.0 │ 2.0 │ saab 99le │ saab │\n", "│ 3 │ 25.0 │ 4.0 │ 121.0 │ 115.0 │ 2671.0 │ 13.5 │ 75.0 │ 2.0 │ saab 99le │ saab │\n", "│ 4 │ 21.6 │ 4.0 │ 121.0 │ 115.0 │ 2795.0 │ 15.7 │ 78.0 │ 2.0 │ saab 99gle │ saab │" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filter(:brand => ==(\"saab\"), df2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the `:brand => ==(\"saab\")` syntax means that `filter` takes each element of the `:brand` column and passes it to the `==(\"saab\")` function, keeping the rows for which it returns `true`.\n", "\n", "Now `==(\"saab\")` is just a 
shorthand for `x -> x == \"saab\"`.\n", "\n", "Alternatively, we could pass `filter` a function that receives a whole row (this is a bit slower, but you might find it more readable):" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

<p>4 rows × 10 columns</p>
<table>
<thead>
<tr><th></th><th>mpg</th><th>cylinders</th><th>displacement</th><th>horsepower</th><th>weight</th><th>acceleration</th><th>year</th><th>origin</th><th>name</th><th>brand</th></tr>
<tr><th></th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>Float64</th><th>String</th><th>SubStri…</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>25.0</td><td>4.0</td><td>104.0</td><td>95.0</td><td>2375.0</td><td>17.5</td><td>70.0</td><td>2.0</td><td>saab 99e</td><td>saab</td></tr>
<tr><th>2</th><td>24.0</td><td>4.0</td><td>121.0</td><td>110.0</td><td>2660.0</td><td>14.0</td><td>73.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>3</th><td>25.0</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2671.0</td><td>13.5</td><td>75.0</td><td>2.0</td><td>saab 99le</td><td>saab</td></tr>
<tr><th>4</th><td>21.6</td><td>4.0</td><td>121.0</td><td>115.0</td><td>2795.0</td><td>15.7</td><td>78.0</td><td>2.0</td><td>saab 99gle</td><td>saab</td></tr>
</tbody>
</table>
" ], "text/latex": [ "\\begin{tabular}{r|cccccccccc}\n", "\t& mpg & cylinders & displacement & horsepower & weight & acceleration & year & origin & name & brand\\\\\n", "\t\\hline\n", "\t& Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & Float64 & String & SubStri…\\\\\n", "\t\\hline\n", "\t1 & 25.0 & 4.0 & 104.0 & 95.0 & 2375.0 & 17.5 & 70.0 & 2.0 & saab 99e & saab \\\\\n", "\t2 & 24.0 & 4.0 & 121.0 & 110.0 & 2660.0 & 14.0 & 73.0 & 2.0 & saab 99le & saab \\\\\n", "\t3 & 25.0 & 4.0 & 121.0 & 115.0 & 2671.0 & 13.5 & 75.0 & 2.0 & saab 99le & saab \\\\\n", "\t4 & 21.6 & 4.0 & 121.0 & 115.0 & 2795.0 & 15.7 & 78.0 & 2.0 & saab 99gle & saab \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×10 DataFrame\n", "│ Row │ mpg │ cylinders │ displacement │ horsepower │ weight │ acceleration │ year │ origin │ name │ brand │\n", "│ │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mString\u001b[39m │ \u001b[90mSubStri…\u001b[39m │\n", "├─────┼─────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼────────────┼──────────┤\n", "│ 1 │ 25.0 │ 4.0 │ 104.0 │ 95.0 │ 2375.0 │ 17.5 │ 70.0 │ 2.0 │ saab 99e │ saab │\n", "│ 2 │ 24.0 │ 4.0 │ 121.0 │ 110.0 │ 2660.0 │ 14.0 │ 73.0 │ 2.0 │ saab 99le │ saab │\n", "│ 3 │ 25.0 │ 4.0 │ 121.0 │ 115.0 │ 2671.0 │ 13.5 │ 75.0 │ 2.0 │ saab 99le │ saab │\n", "│ 4 │ 21.6 │ 4.0 │ 121.0 │ 115.0 │ 2795.0 │ 15.7 │ 78.0 │ 2.0 │ saab 99gle │ saab │" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filter(row -> row.brand == \"saab\", df2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To finish this part of the tutorial, let us save the `df2` data frame to the `auto2.csv` file. We will use it in later parts of the course." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"auto2.csv\"" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "CSV.write(\"auto2.csv\", df2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us just quickly inspect what we have written to disk before we finish:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "393-element Array{String,1}:\n", " \"mpg,cylinders,displacement,horsepower,weight,acceleration,year,origin,name,brand\"\n", " \"18.0,8.0,307.0,130.0,3504.0,12.0,70.0,1.0,chevrolet chevelle malibu,chevrolet\"\n", " \"15.0,8.0,350.0,165.0,3693.0,11.5,70.0,1.0,buick skylark 320,buick\"\n", " \"18.0,8.0,318.0,150.0,3436.0,11.0,70.0,1.0,plymouth satellite,plymouth\"\n", " \"16.0,8.0,304.0,150.0,3433.0,12.0,70.0,1.0,amc rebel sst,amc\"\n", " ⋮\n", " \"27.0,4.0,140.0,86.0,2790.0,15.6,82.0,1.0,ford mustang gl,ford\"\n", " \"44.0,4.0,97.0,52.0,2130.0,24.6,82.0,2.0,vw pickup,vw\"\n", " \"32.0,4.0,135.0,84.0,2295.0,11.6,82.0,1.0,dodge rampage,dodge\"\n", " \"28.0,4.0,120.0,79.0,2625.0,18.6,82.0,1.0,ford ranger,ford\"\n", " \"31.0,4.0,119.0,82.0,2720.0,19.4,82.0,1.0,chevy s-10,chevy\"" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "readlines(\"auto2.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that by default CSV.jl used a comma to separate the fields in our file and wrote a header with the column names in the first row of the file." ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.4.1", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.1" } }, "nbformat": 4, "nbformat_minor": 4 }