Week 3 Friday#

Announcements#

  • Videos and video quizzes due. Finish them later if you haven’t already.

  • Worksheet 3 is graded. (If you got a 0 and submitted something, probably you did not enable public sharing. Enable public sharing following the instructions on the bottom of Worksheet 3 and send me an email to let me know.)

  • Worksheets 5 and 6 are due Tuesday before discussion section.

  • The in-class quiz on Tuesday will be based on the material up to and including Worksheet 6

Plan#

The main goal of today is to see how to use groupby with a for loop. (I have had that on the agenda several times but never gotten to it.) We will have plenty of time for questions, so please ask when something doesn’t make sense.

  • Overall question: Within the “mpg” dataset from Seaborn, do newer cars seem to become more fuel-efficient (i.e., have higher mpg)?

import pandas as pd
import seaborn as sns
df = sns.load_dataset("mpg")
df.head(3)
mpg cylinders displacement horsepower weight acceleration model_year origin name
0 18.0 8 307.0 130.0 3504 12.0 70 usa chevrolet chevelle malibu
1 15.0 8 350.0 165.0 3693 11.5 70 usa buick skylark 320
2 18.0 8 318.0 150.0 3436 11.0 70 usa plymouth satellite

Using groupby with a for loop#

Recall that df.groupby produces its own type of object. I often abbreviate the long name to a pandas GroupBy object.

x = df.groupby("model_year")
x
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fad5d6cb7f0>

Here’s a brief reminder about for loops in an easier example. Here we are iterating over a range object. Later we will iterate over a pandas GroupBy object.

for z in range(5):
    print(z)
0
1
2
3
4

If we call print on the type of an object, we see an expression like the following. As far as I know, <class 'int'> is exactly the same as int.

for z in range(5):
    print(type(z))
<class 'int'>
<class 'int'>
<class 'int'>
<class 'int'>
<class 'int'>

It would be very reasonable to think that z is not defined, because it only occurred as a variable inside a for loop, but that’s not how Python works with respect to for loops. The variables are accessible external to the for loop. We only see 4 because the assignment z=4 overwrote the earlier assignments z=0, then z=1, and so on.

z
4

This is what we usually see when we check the type of an object.

type(z)
int

But if we print the type of the object, we see the same expression we saw above. (Comment. This material about print(type(z)) is not itself important for us. The only reason we are covering it, is so you are not confused when we see the word class below.)

print(type(z))
<class 'int'>

Let’s finally get to the topic of today, which is what type of objects do we get when we iterate through a pandas GroupBy object. We get tuples.

for z in df.groupby("model_year"):
    print(type(z))
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>

The most natural question to ask then is, what is the length of these tuples? The length is 2.

for z in df.groupby("model_year"):
    print(len(z))
2
2
2
2
2
2
2
2
2
2
2
2
2

What is the initial element of the length 2 tuple? Here they are. Notice that these values are exactly the “model_year” values that occur in our pandas DataFrame df.

for z in df.groupby("model_year"):
    print(z[0])
70
71
72
73
74
75
76
77
78
79
80
81
82

We will usually “unpack” the tuple into two separate variables. The a value in the following is exactly the same as the z[0] value above.

for z in df.groupby("model_year"):
    (a,b) = z # tuple unpacking
    print(a)
70
71
72
73
74
75
76
77
78
79
80
81
82

Here is a more elegant way to do the tuple unpacking; we unpack immediately when we create the for loop, as opposed to defining a variable z which we never use.

for a,b in df.groupby("model_year"):
    print(a)
70
71
72
73
74
75
76
77
78
79
80
81
82

This tuple unpacking only works because the elements produced when we iterate over df.groupby are length 2 tuples. If we try to do this same unpacking with our range(5) for loop, we get an error.

for a,b in range(5):
    print(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [15], line 1
----> 1 for a,b in range(5):
      2     print(a)

TypeError: cannot unpack non-iterable int object

We get the exact same error if we try to unpack the initial thing in that for loop, 0.

a,b = 0
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [16], line 1
----> 1 a,b = 0

TypeError: cannot unpack non-iterable int object

Why do we see 2 so many times? Does the following help? Initially z is assigned to be (3,5), and its length is 2. Then z is assigned to be (1,2,3,4), and its length is 4, and so on. In our GroupBy for loop, the values of z always have the same length.

for z in [(3,5), (1,2,3,4), (1,6,5)]:
    print(len(z))
2
4
3

We’ve already seen the following example. The initial element (the zeroth element) in the length 2 tuples is the value we are grouping by.

for a,b in df.groupby("model_year"):
    print(a)
70
71
72
73
74
75
76
77
78
79
80
81
82

If we try to do the analogous thing for b, the second object in the tuple, it is much less helpful. We can at least tell that b is much more complicated than a.

for a,b in df.groupby("model_year"):
    print(b)
     mpg  cylinders  displacement  horsepower  weight  acceleration  \
0   18.0          8         307.0       130.0    3504          12.0   
1   15.0          8         350.0       165.0    3693          11.5   
2   18.0          8         318.0       150.0    3436          11.0   
3   16.0          8         304.0       150.0    3433          12.0   
4   17.0          8         302.0       140.0    3449          10.5   
5   15.0          8         429.0       198.0    4341          10.0   
6   14.0          8         454.0       220.0    4354           9.0   
7   14.0          8         440.0       215.0    4312           8.5   
8   14.0          8         455.0       225.0    4425          10.0   
9   15.0          8         390.0       190.0    3850           8.5   
10  15.0          8         383.0       170.0    3563          10.0   
11  14.0          8         340.0       160.0    3609           8.0   
12  15.0          8         400.0       150.0    3761           9.5   
13  14.0          8         455.0       225.0    3086          10.0   
14  24.0          4         113.0        95.0    2372          15.0   
15  22.0          6         198.0        95.0    2833          15.5   
16  18.0          6         199.0        97.0    2774          15.5   
17  21.0          6         200.0        85.0    2587          16.0   
18  27.0          4          97.0        88.0    2130          14.5   
19  26.0          4          97.0        46.0    1835          20.5   
20  25.0          4         110.0        87.0    2672          17.5   
21  24.0          4         107.0        90.0    2430          14.5   
22  25.0          4         104.0        95.0    2375          17.5   
23  26.0          4         121.0       113.0    2234          12.5   
24  21.0          6         199.0        90.0    2648          15.0   
25  10.0          8         360.0       215.0    4615          14.0   
26  10.0          8         307.0       200.0    4376          15.0   
27  11.0          8         318.0       210.0    4382          13.5   
28   9.0          8         304.0       193.0    4732          18.5   

    model_year  origin                          name  
0           70     usa     chevrolet chevelle malibu  
1           70     usa             buick skylark 320  
2           70     usa            plymouth satellite  
3           70     usa                 amc rebel sst  
4           70     usa                   ford torino  
5           70     usa              ford galaxie 500  
6           70     usa              chevrolet impala  
7           70     usa             plymouth fury iii  
8           70     usa              pontiac catalina  
9           70     usa            amc ambassador dpl  
10          70     usa           dodge challenger se  
11          70     usa            plymouth 'cuda 340  
12          70     usa         chevrolet monte carlo  
13          70     usa       buick estate wagon (sw)  
14          70   japan         toyota corona mark ii  
15          70     usa               plymouth duster  
16          70     usa                    amc hornet  
17          70     usa                 ford maverick  
18          70   japan                  datsun pl510  
19          70  europe  volkswagen 1131 deluxe sedan  
20          70  europe                   peugeot 504  
21          70  europe                   audi 100 ls  
22          70  europe                      saab 99e  
23          70  europe                      bmw 2002  
24          70     usa                   amc gremlin  
25          70     usa                     ford f250  
26          70     usa                     chevy c20  
27          70     usa                    dodge d200  
28          70     usa                      hi 1200d  
     mpg  cylinders  displacement  horsepower  weight  acceleration  \
29  27.0          4          97.0        88.0    2130          14.5   
30  28.0          4         140.0        90.0    2264          15.5   
31  25.0          4         113.0        95.0    2228          14.0   
32  25.0          4          98.0         NaN    2046          19.0   
33  19.0          6         232.0       100.0    2634          13.0   
34  16.0          6         225.0       105.0    3439          15.5   
35  17.0          6         250.0       100.0    3329          15.5   
36  19.0          6         250.0        88.0    3302          15.5   
37  18.0          6         232.0       100.0    3288          15.5   
38  14.0          8         350.0       165.0    4209          12.0   
39  14.0          8         400.0       175.0    4464          11.5   
40  14.0          8         351.0       153.0    4154          13.5   
41  14.0          8         318.0       150.0    4096          13.0   
42  12.0          8         383.0       180.0    4955          11.5   
43  13.0          8         400.0       170.0    4746          12.0   
44  13.0          8         400.0       175.0    5140          12.0   
45  18.0          6         258.0       110.0    2962          13.5   
46  22.0          4         140.0        72.0    2408          19.0   
47  19.0          6         250.0       100.0    3282          15.0   
48  18.0          6         250.0        88.0    3139          14.5   
49  23.0          4         122.0        86.0    2220          14.0   
50  28.0          4         116.0        90.0    2123          14.0   
51  30.0          4          79.0        70.0    2074          19.5   
52  30.0          4          88.0        76.0    2065          14.5   
53  31.0          4          71.0        65.0    1773          19.0   
54  35.0          4          72.0        69.0    1613          18.0   
55  27.0          4          97.0        60.0    1834          19.0   
56  26.0          4          91.0        70.0    1955          20.5   

    model_year  origin                        name  
29          71   japan                datsun pl510  
30          71     usa         chevrolet vega 2300  
31          71   japan               toyota corona  
32          71     usa                  ford pinto  
33          71     usa                 amc gremlin  
34          71     usa   plymouth satellite custom  
35          71     usa   chevrolet chevelle malibu  
36          71     usa             ford torino 500  
37          71     usa                 amc matador  
38          71     usa            chevrolet impala  
39          71     usa   pontiac catalina brougham  
40          71     usa            ford galaxie 500  
41          71     usa           plymouth fury iii  
42          71     usa           dodge monaco (sw)  
43          71     usa    ford country squire (sw)  
44          71     usa         pontiac safari (sw)  
45          71     usa  amc hornet sportabout (sw)  
46          71     usa         chevrolet vega (sw)  
47          71     usa            pontiac firebird  
48          71     usa                ford mustang  
49          71     usa          mercury capri 2000  
50          71  europe                   opel 1900  
51          71  europe                 peugeot 304  
52          71  europe                   fiat 124b  
53          71   japan         toyota corolla 1200  
54          71   japan                 datsun 1200  
55          71  europe        volkswagen model 111  
56          71     usa            plymouth cricket  
     mpg  cylinders  displacement  horsepower  weight  acceleration  \
57  24.0          4         113.0        95.0    2278          15.5   
58  25.0          4          97.5        80.0    2126          17.0   
59  23.0          4          97.0        54.0    2254          23.5   
60  20.0          4         140.0        90.0    2408          19.5   
61  21.0          4         122.0        86.0    2226          16.5   
62  13.0          8         350.0       165.0    4274          12.0   
63  14.0          8         400.0       175.0    4385          12.0   
64  15.0          8         318.0       150.0    4135          13.5   
65  14.0          8         351.0       153.0    4129          13.0   
66  17.0          8         304.0       150.0    3672          11.5   
67  11.0          8         429.0       208.0    4633          11.0   
68  13.0          8         350.0       155.0    4502          13.5   
69  12.0          8         350.0       160.0    4456          13.5   
70  13.0          8         400.0       190.0    4422          12.5   
71  19.0          3          70.0        97.0    2330          13.5   
72  15.0          8         304.0       150.0    3892          12.5   
73  13.0          8         307.0       130.0    4098          14.0   
74  13.0          8         302.0       140.0    4294          16.0   
75  14.0          8         318.0       150.0    4077          14.0   
76  18.0          4         121.0       112.0    2933          14.5   
77  22.0          4         121.0        76.0    2511          18.0   
78  21.0          4         120.0        87.0    2979          19.5   
79  26.0          4          96.0        69.0    2189          18.0   
80  22.0          4         122.0        86.0    2395          16.0   
81  28.0          4          97.0        92.0    2288          17.0   
82  23.0          4         120.0        97.0    2506          14.5   
83  28.0          4          98.0        80.0    2164          15.0   
84  27.0          4          97.0        88.0    2100          16.5   

    model_year  origin                              name  
57          72   japan             toyota corona hardtop  
58          72     usa                dodge colt hardtop  
59          72  europe                 volkswagen type 3  
60          72     usa                    chevrolet vega  
61          72     usa               ford pinto runabout  
62          72     usa                  chevrolet impala  
63          72     usa                  pontiac catalina  
64          72     usa                 plymouth fury iii  
65          72     usa                  ford galaxie 500  
66          72     usa                amc ambassador sst  
67          72     usa                   mercury marquis  
68          72     usa              buick lesabre custom  
69          72     usa        oldsmobile delta 88 royale  
70          72     usa            chrysler newport royal  
71          72   japan                   mazda rx2 coupe  
72          72     usa                  amc matador (sw)  
73          72     usa  chevrolet chevelle concours (sw)  
74          72     usa             ford gran torino (sw)  
75          72     usa    plymouth satellite custom (sw)  
76          72  europe                   volvo 145e (sw)  
77          72  europe               volkswagen 411 (sw)  
78          72  europe                  peugeot 504 (sw)  
79          72  europe                   renault 12 (sw)  
80          72     usa                   ford pinto (sw)  
81          72   japan                   datsun 510 (sw)  
82          72   japan       toyouta corona mark ii (sw)  
83          72     usa                   dodge colt (sw)  
84          72   japan          toyota corolla 1600 (sw)  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
85   13.0          8         350.0       175.0    4100          13.0   
86   14.0          8         304.0       150.0    3672          11.5   
87   13.0          8         350.0       145.0    3988          13.0   
88   14.0          8         302.0       137.0    4042          14.5   
89   15.0          8         318.0       150.0    3777          12.5   
90   12.0          8         429.0       198.0    4952          11.5   
91   13.0          8         400.0       150.0    4464          12.0   
92   13.0          8         351.0       158.0    4363          13.0   
93   14.0          8         318.0       150.0    4237          14.5   
94   13.0          8         440.0       215.0    4735          11.0   
95   12.0          8         455.0       225.0    4951          11.0   
96   13.0          8         360.0       175.0    3821          11.0   
97   18.0          6         225.0       105.0    3121          16.5   
98   16.0          6         250.0       100.0    3278          18.0   
99   18.0          6         232.0       100.0    2945          16.0   
100  18.0          6         250.0        88.0    3021          16.5   
101  23.0          6         198.0        95.0    2904          16.0   
102  26.0          4          97.0        46.0    1950          21.0   
103  11.0          8         400.0       150.0    4997          14.0   
104  12.0          8         400.0       167.0    4906          12.5   
105  13.0          8         360.0       170.0    4654          13.0   
106  12.0          8         350.0       180.0    4499          12.5   
107  18.0          6         232.0       100.0    2789          15.0   
108  20.0          4          97.0        88.0    2279          19.0   
109  21.0          4         140.0        72.0    2401          19.5   
110  22.0          4         108.0        94.0    2379          16.5   
111  18.0          3          70.0        90.0    2124          13.5   
112  19.0          4         122.0        85.0    2310          18.5   
113  21.0          6         155.0       107.0    2472          14.0   
114  26.0          4          98.0        90.0    2265          15.5   
115  15.0          8         350.0       145.0    4082          13.0   
116  16.0          8         400.0       230.0    4278           9.5   
117  29.0          4          68.0        49.0    1867          19.5   
118  24.0          4         116.0        75.0    2158          15.5   
119  20.0          4         114.0        91.0    2582          14.0   
120  19.0          4         121.0       112.0    2868          15.5   
121  15.0          8         318.0       150.0    3399          11.0   
122  24.0          4         121.0       110.0    2660          14.0   
123  20.0          6         156.0       122.0    2807          13.5   
124  11.0          8         350.0       180.0    3664          11.0   

     model_year  origin                          name  
85           73     usa             buick century 350  
86           73     usa                   amc matador  
87           73     usa              chevrolet malibu  
88           73     usa              ford gran torino  
89           73     usa          dodge coronet custom  
90           73     usa      mercury marquis brougham  
91           73     usa     chevrolet caprice classic  
92           73     usa                      ford ltd  
93           73     usa      plymouth fury gran sedan  
94           73     usa  chrysler new yorker brougham  
95           73     usa      buick electra 225 custom  
96           73     usa       amc ambassador brougham  
97           73     usa              plymouth valiant  
98           73     usa         chevrolet nova custom  
99           73     usa                    amc hornet  
100          73     usa                 ford maverick  
101          73     usa               plymouth duster  
102          73  europe       volkswagen super beetle  
103          73     usa              chevrolet impala  
104          73     usa                  ford country  
105          73     usa        plymouth custom suburb  
106          73     usa      oldsmobile vista cruiser  
107          73     usa                   amc gremlin  
108          73   japan                 toyota carina  
109          73     usa                chevrolet vega  
110          73   japan                    datsun 610  
111          73   japan                     maxda rx3  
112          73     usa                    ford pinto  
113          73     usa              mercury capri v6  
114          73  europe          fiat 124 sport coupe  
115          73     usa       chevrolet monte carlo s  
116          73     usa            pontiac grand prix  
117          73  europe                      fiat 128  
118          73  europe                    opel manta  
119          73  europe                    audi 100ls  
120          73  europe                   volvo 144ea  
121          73     usa             dodge dart custom  
122          73  europe                     saab 99le  
123          73   japan                toyota mark ii  
124          73     usa              oldsmobile omega  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
125  20.0          6         198.0        95.0    3102          16.5   
126  21.0          6         200.0         NaN    2875          17.0   
127  19.0          6         232.0       100.0    2901          16.0   
128  15.0          6         250.0       100.0    3336          17.0   
129  31.0          4          79.0        67.0    1950          19.0   
130  26.0          4         122.0        80.0    2451          16.5   
131  32.0          4          71.0        65.0    1836          21.0   
132  25.0          4         140.0        75.0    2542          17.0   
133  16.0          6         250.0       100.0    3781          17.0   
134  16.0          6         258.0       110.0    3632          18.0   
135  18.0          6         225.0       105.0    3613          16.5   
136  16.0          8         302.0       140.0    4141          14.0   
137  13.0          8         350.0       150.0    4699          14.5   
138  14.0          8         318.0       150.0    4457          13.5   
139  14.0          8         302.0       140.0    4638          16.0   
140  14.0          8         304.0       150.0    4257          15.5   
141  29.0          4          98.0        83.0    2219          16.5   
142  26.0          4          79.0        67.0    1963          15.5   
143  26.0          4          97.0        78.0    2300          14.5   
144  31.0          4          76.0        52.0    1649          16.5   
145  32.0          4          83.0        61.0    2003          19.0   
146  28.0          4          90.0        75.0    2125          14.5   
147  24.0          4          90.0        75.0    2108          15.5   
148  26.0          4         116.0        75.0    2246          14.0   
149  24.0          4         120.0        97.0    2489          15.0   
150  26.0          4         108.0        93.0    2391          15.5   
151  31.0          4          79.0        67.0    2000          16.0   

     model_year  origin                               name  
125          74     usa                    plymouth duster  
126          74     usa                      ford maverick  
127          74     usa                         amc hornet  
128          74     usa                     chevrolet nova  
129          74   japan                        datsun b210  
130          74     usa                         ford pinto  
131          74   japan                toyota corolla 1200  
132          74     usa                     chevrolet vega  
133          74     usa  chevrolet chevelle malibu classic  
134          74     usa                        amc matador  
135          74     usa         plymouth satellite sebring  
136          74     usa                   ford gran torino  
137          74     usa           buick century luxus (sw)  
138          74     usa          dodge coronet custom (sw)  
139          74     usa              ford gran torino (sw)  
140          74     usa                   amc matador (sw)  
141          74  europe                           audi fox  
142          74  europe                  volkswagen dasher  
143          74  europe                         opel manta  
144          74   japan                      toyota corona  
145          74   japan                         datsun 710  
146          74     usa                         dodge colt  
147          74  europe                           fiat 128  
148          74  europe                        fiat 124 tc  
149          74   japan                        honda civic  
150          74   japan                             subaru  
151          74  europe                          fiat x1.9  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
152  19.0          6         225.0        95.0    3264          16.0   
153  18.0          6         250.0       105.0    3459          16.0   
154  15.0          6         250.0        72.0    3432          21.0   
155  15.0          6         250.0        72.0    3158          19.5   
156  16.0          8         400.0       170.0    4668          11.5   
157  15.0          8         350.0       145.0    4440          14.0   
158  16.0          8         318.0       150.0    4498          14.5   
159  14.0          8         351.0       148.0    4657          13.5   
160  17.0          6         231.0       110.0    3907          21.0   
161  16.0          6         250.0       105.0    3897          18.5   
162  15.0          6         258.0       110.0    3730          19.0   
163  18.0          6         225.0        95.0    3785          19.0   
164  21.0          6         231.0       110.0    3039          15.0   
165  20.0          8         262.0       110.0    3221          13.5   
166  13.0          8         302.0       129.0    3169          12.0   
167  29.0          4          97.0        75.0    2171          16.0   
168  23.0          4         140.0        83.0    2639          17.0   
169  20.0          6         232.0       100.0    2914          16.0   
170  23.0          4         140.0        78.0    2592          18.5   
171  24.0          4         134.0        96.0    2702          13.5   
172  25.0          4          90.0        71.0    2223          16.5   
173  24.0          4         119.0        97.0    2545          17.0   
174  18.0          6         171.0        97.0    2984          14.5   
175  29.0          4          90.0        70.0    1937          14.0   
176  19.0          6         232.0        90.0    3211          17.0   
177  23.0          4         115.0        95.0    2694          15.0   
178  23.0          4         120.0        88.0    2957          17.0   
179  22.0          4         121.0        98.0    2945          14.5   
180  25.0          4         121.0       115.0    2671          13.5   
181  33.0          4          91.0        53.0    1795          17.5   

     model_year  origin                       name  
152          75     usa    plymouth valiant custom  
153          75     usa             chevrolet nova  
154          75     usa            mercury monarch  
155          75     usa              ford maverick  
156          75     usa           pontiac catalina  
157          75     usa          chevrolet bel air  
158          75     usa        plymouth grand fury  
159          75     usa                   ford ltd  
160          75     usa              buick century  
161          75     usa  chevroelt chevelle malibu  
162          75     usa                amc matador  
163          75     usa              plymouth fury  
164          75     usa              buick skyhawk  
165          75     usa        chevrolet monza 2+2  
166          75     usa            ford mustang ii  
167          75   japan             toyota corolla  
168          75     usa                 ford pinto  
169          75     usa                amc gremlin  
170          75     usa              pontiac astro  
171          75   japan              toyota corona  
172          75  europe          volkswagen dasher  
173          75   japan                 datsun 710  
174          75     usa                 ford pinto  
175          75  europe          volkswagen rabbit  
176          75     usa                  amc pacer  
177          75  europe                 audi 100ls  
178          75  europe                peugeot 504  
179          75  europe                volvo 244dl  
180          75  europe                  saab 99le  
181          75   japan           honda civic cvcc  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
182  28.0          4         107.0        86.0    2464          15.5   
183  25.0          4         116.0        81.0    2220          16.9   
184  25.0          4         140.0        92.0    2572          14.9   
185  26.0          4          98.0        79.0    2255          17.7   
186  27.0          4         101.0        83.0    2202          15.3   
187  17.5          8         305.0       140.0    4215          13.0   
188  16.0          8         318.0       150.0    4190          13.0   
189  15.5          8         304.0       120.0    3962          13.9   
190  14.5          8         351.0       152.0    4215          12.8   
191  22.0          6         225.0       100.0    3233          15.4   
192  22.0          6         250.0       105.0    3353          14.5   
193  24.0          6         200.0        81.0    3012          17.6   
194  22.5          6         232.0        90.0    3085          17.6   
195  29.0          4          85.0        52.0    2035          22.2   
196  24.5          4          98.0        60.0    2164          22.1   
197  29.0          4          90.0        70.0    1937          14.2   
198  33.0          4          91.0        53.0    1795          17.4   
199  20.0          6         225.0       100.0    3651          17.7   
200  18.0          6         250.0        78.0    3574          21.0   
201  18.5          6         250.0       110.0    3645          16.2   
202  17.5          6         258.0        95.0    3193          17.8   
203  29.5          4          97.0        71.0    1825          12.2   
204  32.0          4          85.0        70.0    1990          17.0   
205  28.0          4          97.0        75.0    2155          16.4   
206  26.5          4         140.0        72.0    2565          13.6   
207  20.0          4         130.0       102.0    3150          15.7   
208  13.0          8         318.0       150.0    3940          13.2   
209  19.0          4         120.0        88.0    3270          21.9   
210  19.0          6         156.0       108.0    2930          15.5   
211  16.5          6         168.0       120.0    3820          16.7   
212  16.5          8         350.0       180.0    4380          12.1   
213  13.0          8         350.0       145.0    4055          12.0   
214  13.0          8         302.0       130.0    3870          15.0   
215  13.0          8         318.0       150.0    3755          14.0   

     model_year  origin                               name  
182          76  europe                           fiat 131  
183          76  europe                          opel 1900  
184          76     usa                           capri ii  
185          76     usa                         dodge colt  
186          76  europe                       renault 12tl  
187          76     usa  chevrolet chevelle malibu classic  
188          76     usa             dodge coronet brougham  
189          76     usa                        amc matador  
190          76     usa                   ford gran torino  
191          76     usa                   plymouth valiant  
192          76     usa                     chevrolet nova  
193          76     usa                      ford maverick  
194          76     usa                         amc hornet  
195          76     usa                 chevrolet chevette  
196          76     usa                    chevrolet woody  
197          76  europe                          vw rabbit  
198          76   japan                        honda civic  
199          76     usa                     dodge aspen se  
200          76     usa                  ford granada ghia  
201          76     usa                 pontiac ventura sj  
202          76     usa                      amc pacer d/l  
203          76  europe                  volkswagen rabbit  
204          76   japan                       datsun b-210  
205          76   japan                     toyota corolla  
206          76     usa                         ford pinto  
207          76  europe                          volvo 245  
208          76     usa         plymouth volare premier v8  
209          76  europe                        peugeot 504  
210          76   japan                     toyota mark ii  
211          76  europe                 mercedes-benz 280s  
212          76     usa                   cadillac seville  
213          76     usa                          chevy c10  
214          76     usa                          ford f108  
215          76     usa                         dodge d100  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
216  31.5          4          98.0        68.0    2045          18.5   
217  30.0          4         111.0        80.0    2155          14.8   
218  36.0          4          79.0        58.0    1825          18.6   
219  25.5          4         122.0        96.0    2300          15.5   
220  33.5          4          85.0        70.0    1945          16.8   
221  17.5          8         305.0       145.0    3880          12.5   
222  17.0          8         260.0       110.0    4060          19.0   
223  15.5          8         318.0       145.0    4140          13.7   
224  15.0          8         302.0       130.0    4295          14.9   
225  17.5          6         250.0       110.0    3520          16.4   
226  20.5          6         231.0       105.0    3425          16.9   
227  19.0          6         225.0       100.0    3630          17.7   
228  18.5          6         250.0        98.0    3525          19.0   
229  16.0          8         400.0       180.0    4220          11.1   
230  15.5          8         350.0       170.0    4165          11.4   
231  15.5          8         400.0       190.0    4325          12.2   
232  16.0          8         351.0       149.0    4335          14.5   
233  29.0          4          97.0        78.0    1940          14.5   
234  24.5          4         151.0        88.0    2740          16.0   
235  26.0          4          97.0        75.0    2265          18.2   
236  25.5          4         140.0        89.0    2755          15.8   
237  30.5          4          98.0        63.0    2051          17.0   
238  33.5          4          98.0        83.0    2075          15.9   
239  30.0          4          97.0        67.0    1985          16.4   
240  30.5          4          97.0        78.0    2190          14.1   
241  22.0          6         146.0        97.0    2815          14.5   
242  21.5          4         121.0       110.0    2600          12.8   
243  21.5          3          80.0       110.0    2720          13.5   

     model_year  origin                          name  
216          77   japan             honda accord cvcc  
217          77     usa       buick opel isuzu deluxe  
218          77  europe                 renault 5 gtl  
219          77     usa             plymouth arrow gs  
220          77   japan         datsun f-10 hatchback  
221          77     usa     chevrolet caprice classic  
222          77     usa    oldsmobile cutlass supreme  
223          77     usa         dodge monaco brougham  
224          77     usa       mercury cougar brougham  
225          77     usa            chevrolet concours  
226          77     usa                 buick skylark  
227          77     usa        plymouth volare custom  
228          77     usa                  ford granada  
229          77     usa         pontiac grand prix lj  
230          77     usa  chevrolet monte carlo landau  
231          77     usa              chrysler cordoba  
232          77     usa              ford thunderbird  
233          77  europe      volkswagen rabbit custom  
234          77     usa         pontiac sunbird coupe  
235          77   japan       toyota corolla liftback  
236          77     usa           ford mustang ii 2+2  
237          77     usa            chevrolet chevette  
238          77     usa                dodge colt m/m  
239          77   japan                     subaru dl  
240          77  europe             volkswagen dasher  
241          77   japan                    datsun 810  
242          77  europe                      bmw 320i  
243          77   japan                    mazda rx-4  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
244  43.1          4          90.0        48.0    1985          21.5   
245  36.1          4          98.0        66.0    1800          14.4   
246  32.8          4          78.0        52.0    1985          19.4   
247  39.4          4          85.0        70.0    2070          18.6   
248  36.1          4          91.0        60.0    1800          16.4   
249  19.9          8         260.0       110.0    3365          15.5   
250  19.4          8         318.0       140.0    3735          13.2   
251  20.2          8         302.0       139.0    3570          12.8   
252  19.2          6         231.0       105.0    3535          19.2   
253  20.5          6         200.0        95.0    3155          18.2   
254  20.2          6         200.0        85.0    2965          15.8   
255  25.1          4         140.0        88.0    2720          15.4   
256  20.5          6         225.0       100.0    3430          17.2   
257  19.4          6         232.0        90.0    3210          17.2   
258  20.6          6         231.0       105.0    3380          15.8   
259  20.8          6         200.0        85.0    3070          16.7   
260  18.6          6         225.0       110.0    3620          18.7   
261  18.1          6         258.0       120.0    3410          15.1   
262  19.2          8         305.0       145.0    3425          13.2   
263  17.7          6         231.0       165.0    3445          13.4   
264  18.1          8         302.0       139.0    3205          11.2   
265  17.5          8         318.0       140.0    4080          13.7   
266  30.0          4          98.0        68.0    2155          16.5   
267  27.5          4         134.0        95.0    2560          14.2   
268  27.2          4         119.0        97.0    2300          14.7   
269  30.9          4         105.0        75.0    2230          14.5   
270  21.1          4         134.0        95.0    2515          14.8   
271  23.2          4         156.0       105.0    2745          16.7   
272  23.8          4         151.0        85.0    2855          17.6   
273  23.9          4         119.0        97.0    2405          14.9   
274  20.3          5         131.0       103.0    2830          15.9   
275  17.0          6         163.0       125.0    3140          13.6   
276  21.6          4         121.0       115.0    2795          15.7   
277  16.2          6         163.0       133.0    3410          15.8   
278  31.5          4          89.0        71.0    1990          14.9   
279  29.5          4          98.0        68.0    2135          16.6   

     model_year  origin                               name  
244          78  europe    volkswagen rabbit custom diesel  
245          78     usa                        ford fiesta  
246          78   japan                   mazda glc deluxe  
247          78   japan                     datsun b210 gx  
248          78   japan                   honda civic cvcc  
249          78     usa  oldsmobile cutlass salon brougham  
250          78     usa                     dodge diplomat  
251          78     usa               mercury monarch ghia  
252          78     usa                 pontiac phoenix lj  
253          78     usa                   chevrolet malibu  
254          78     usa               ford fairmont (auto)  
255          78     usa                ford fairmont (man)  
256          78     usa                    plymouth volare  
257          78     usa                        amc concord  
258          78     usa              buick century special  
259          78     usa                     mercury zephyr  
260          78     usa                        dodge aspen  
261          78     usa                    amc concord d/l  
262          78     usa       chevrolet monte carlo landau  
263          78     usa    buick regal sport coupe (turbo)  
264          78     usa                        ford futura  
265          78     usa                    dodge magnum xe  
266          78     usa                 chevrolet chevette  
267          78   japan                      toyota corona  
268          78   japan                         datsun 510  
269          78     usa                         dodge omni  
270          78   japan          toyota celica gt liftback  
271          78     usa                   plymouth sapporo  
272          78     usa             oldsmobile starfire sx  
273          78   japan                      datsun 200-sx  
274          78  europe                          audi 5000  
275          78  europe                        volvo 264gl  
276          78  europe                         saab 99gle  
277          78  europe                      peugeot 604sl  
278          78  europe                volkswagen scirocco  
279          78   japan                    honda accord lx  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
280  21.5          6         231.0       115.0    3245          15.4   
281  19.8          6         200.0        85.0    2990          18.2   
282  22.3          4         140.0        88.0    2890          17.3   
283  20.2          6         232.0        90.0    3265          18.2   
284  20.6          6         225.0       110.0    3360          16.6   
285  17.0          8         305.0       130.0    3840          15.4   
286  17.6          8         302.0       129.0    3725          13.4   
287  16.5          8         351.0       138.0    3955          13.2   
288  18.2          8         318.0       135.0    3830          15.2   
289  16.9          8         350.0       155.0    4360          14.9   
290  15.5          8         351.0       142.0    4054          14.3   
291  19.2          8         267.0       125.0    3605          15.0   
292  18.5          8         360.0       150.0    3940          13.0   
293  31.9          4          89.0        71.0    1925          14.0   
294  34.1          4          86.0        65.0    1975          15.2   
295  35.7          4          98.0        80.0    1915          14.4   
296  27.4          4         121.0        80.0    2670          15.0   
297  25.4          5         183.0        77.0    3530          20.1   
298  23.0          8         350.0       125.0    3900          17.4   
299  27.2          4         141.0        71.0    3190          24.8   
300  23.9          8         260.0        90.0    3420          22.2   
301  34.2          4         105.0        70.0    2200          13.2   
302  34.5          4         105.0        70.0    2150          14.9   
303  31.8          4          85.0        65.0    2020          19.2   
304  37.3          4          91.0        69.0    2130          14.7   
305  28.4          4         151.0        90.0    2670          16.0   
306  28.8          6         173.0       115.0    2595          11.3   
307  26.8          6         173.0       115.0    2700          12.9   
308  33.5          4         151.0        90.0    2556          13.2   

     model_year  origin                                  name  
280          79     usa                     pontiac lemans v6  
281          79     usa                      mercury zephyr 6  
282          79     usa                       ford fairmont 4  
283          79     usa                      amc concord dl 6  
284          79     usa                         dodge aspen 6  
285          79     usa             chevrolet caprice classic  
286          79     usa                       ford ltd landau  
287          79     usa                 mercury grand marquis  
288          79     usa                       dodge st. regis  
289          79     usa               buick estate wagon (sw)  
290          79     usa              ford country squire (sw)  
291          79     usa         chevrolet malibu classic (sw)  
292          79     usa  chrysler lebaron town @ country (sw)  
293          79  europe                      vw rabbit custom  
294          79   japan                      maxda glc deluxe  
295          79     usa           dodge colt hatchback custom  
296          79     usa                         amc spirit dl  
297          79  europe                    mercedes benz 300d  
298          79     usa                     cadillac eldorado  
299          79  europe                           peugeot 504  
300          79     usa     oldsmobile cutlass salon brougham  
301          79     usa                      plymouth horizon  
302          79     usa                  plymouth horizon tc3  
303          79   japan                            datsun 210  
304          79  europe                    fiat strada custom  
305          79     usa                 buick skylark limited  
306          79     usa                    chevrolet citation  
307          79     usa             oldsmobile omega brougham  
308          79     usa                       pontiac phoenix  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
309  41.5          4          98.0        76.0    2144          14.7   
310  38.1          4          89.0        60.0    1968          18.8   
311  32.1          4          98.0        70.0    2120          15.5   
312  37.2          4          86.0        65.0    2019          16.4   
313  28.0          4         151.0        90.0    2678          16.5   
314  26.4          4         140.0        88.0    2870          18.1   
315  24.3          4         151.0        90.0    3003          20.1   
316  19.1          6         225.0        90.0    3381          18.7   
317  34.3          4          97.0        78.0    2188          15.8   
318  29.8          4         134.0        90.0    2711          15.5   
319  31.3          4         120.0        75.0    2542          17.5   
320  37.0          4         119.0        92.0    2434          15.0   
321  32.2          4         108.0        75.0    2265          15.2   
322  46.6          4          86.0        65.0    2110          17.9   
323  27.9          4         156.0       105.0    2800          14.4   
324  40.8          4          85.0        65.0    2110          19.2   
325  44.3          4          90.0        48.0    2085          21.7   
326  43.4          4          90.0        48.0    2335          23.7   
327  36.4          5         121.0        67.0    2950          19.9   
328  30.0          4         146.0        67.0    3250          21.8   
329  44.6          4          91.0        67.0    1850          13.8   
330  40.9          4          85.0         NaN    1835          17.3   
331  33.8          4          97.0        67.0    2145          18.0   
332  29.8          4          89.0        62.0    1845          15.3   
333  32.7          6         168.0       132.0    2910          11.4   
334  23.7          3          70.0       100.0    2420          12.5   
335  35.0          4         122.0        88.0    2500          15.1   
336  23.6          4         140.0         NaN    2905          14.3   
337  32.4          4         107.0        72.0    2290          17.0   

     model_year  origin                    name  
309          80  europe               vw rabbit  
310          80   japan   toyota corolla tercel  
311          80     usa      chevrolet chevette  
312          80   japan              datsun 310  
313          80     usa      chevrolet citation  
314          80     usa           ford fairmont  
315          80     usa             amc concord  
316          80     usa             dodge aspen  
317          80  europe               audi 4000  
318          80   japan  toyota corona liftback  
319          80   japan               mazda 626  
320          80   japan    datsun 510 hatchback  
321          80   japan          toyota corolla  
322          80   japan               mazda glc  
323          80     usa              dodge colt  
324          80   japan              datsun 210  
325          80  europe    vw rabbit c (diesel)  
326          80  europe      vw dasher (diesel)  
327          80  europe     audi 5000s (diesel)  
328          80  europe      mercedes-benz 240d  
329          80   japan     honda civic 1500 gl  
330          80  europe    renault lecar deluxe  
331          80   japan               subaru dl  
332          80  europe        vokswagen rabbit  
333          80   japan           datsun 280-zx  
334          80   japan           mazda rx-7 gs  
335          80  europe       triumph tr7 coupe  
336          80     usa      ford mustang cobra  
337          80   japan            honda accord  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
338  27.2          4         135.0        84.0    2490          15.7   
339  26.6          4         151.0        84.0    2635          16.4   
340  25.8          4         156.0        92.0    2620          14.4   
341  23.5          6         173.0       110.0    2725          12.6   
342  30.0          4         135.0        84.0    2385          12.9   
343  39.1          4          79.0        58.0    1755          16.9   
344  39.0          4          86.0        64.0    1875          16.4   
345  35.1          4          81.0        60.0    1760          16.1   
346  32.3          4          97.0        67.0    2065          17.8   
347  37.0          4          85.0        65.0    1975          19.4   
348  37.7          4          89.0        62.0    2050          17.3   
349  34.1          4          91.0        68.0    1985          16.0   
350  34.7          4         105.0        63.0    2215          14.9   
351  34.4          4          98.0        65.0    2045          16.2   
352  29.9          4          98.0        65.0    2380          20.7   
353  33.0          4         105.0        74.0    2190          14.2   
354  34.5          4         100.0         NaN    2320          15.8   
355  33.7          4         107.0        75.0    2210          14.4   
356  32.4          4         108.0        75.0    2350          16.8   
357  32.9          4         119.0       100.0    2615          14.8   
358  31.6          4         120.0        74.0    2635          18.3   
359  28.1          4         141.0        80.0    3230          20.4   
360  30.7          6         145.0        76.0    3160          19.6   
361  25.4          6         168.0       116.0    2900          12.6   
362  24.2          6         146.0       120.0    2930          13.8   
363  22.4          6         231.0       110.0    3415          15.8   
364  26.6          8         350.0       105.0    3725          19.0   
365  20.2          6         200.0        88.0    3060          17.1   
366  17.6          6         225.0        85.0    3465          16.6   

     model_year  origin                       name  
338          81     usa           plymouth reliant  
339          81     usa              buick skylark  
340          81     usa     dodge aries wagon (sw)  
341          81     usa         chevrolet citation  
342          81     usa           plymouth reliant  
343          81   japan             toyota starlet  
344          81     usa             plymouth champ  
345          81   japan           honda civic 1300  
346          81   japan                     subaru  
347          81   japan             datsun 210 mpg  
348          81   japan              toyota tercel  
349          81   japan                mazda glc 4  
350          81     usa         plymouth horizon 4  
351          81     usa             ford escort 4w  
352          81     usa             ford escort 2h  
353          81  europe           volkswagen jetta  
354          81  europe                renault 18i  
355          81   japan              honda prelude  
356          81   japan             toyota corolla  
357          81   japan               datsun 200sx  
358          81   japan                  mazda 626  
359          81  europe  peugeot 505s turbo diesel  
360          81  europe               volvo diesel  
361          81   japan            toyota cressida  
362          81   japan          datsun 810 maxima  
363          81     usa              buick century  
364          81     usa      oldsmobile cutlass ls  
365          81     usa            ford granada gl  
366          81     usa     chrysler lebaron salon  
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
367  28.0          4         112.0        88.0    2605          19.6   
368  27.0          4         112.0        88.0    2640          18.6   
369  34.0          4         112.0        88.0    2395          18.0   
370  31.0          4         112.0        85.0    2575          16.2   
371  29.0          4         135.0        84.0    2525          16.0   
372  27.0          4         151.0        90.0    2735          18.0   
373  24.0          4         140.0        92.0    2865          16.4   
374  23.0          4         151.0         NaN    3035          20.5   
375  36.0          4         105.0        74.0    1980          15.3   
376  37.0          4          91.0        68.0    2025          18.2   
377  31.0          4          91.0        68.0    1970          17.6   
378  38.0          4         105.0        63.0    2125          14.7   
379  36.0          4          98.0        70.0    2125          17.3   
380  36.0          4         120.0        88.0    2160          14.5   
381  36.0          4         107.0        75.0    2205          14.5   
382  34.0          4         108.0        70.0    2245          16.9   
383  38.0          4          91.0        67.0    1965          15.0   
384  32.0          4          91.0        67.0    1965          15.7   
385  38.0          4          91.0        67.0    1995          16.2   
386  25.0          6         181.0       110.0    2945          16.4   
387  38.0          6         262.0        85.0    3015          17.0   
388  26.0          4         156.0        92.0    2585          14.5   
389  22.0          6         232.0       112.0    2835          14.7   
390  32.0          4         144.0        96.0    2665          13.9   
391  36.0          4         135.0        84.0    2370          13.0   
392  27.0          4         151.0        90.0    2950          17.3   
393  27.0          4         140.0        86.0    2790          15.6   
394  44.0          4          97.0        52.0    2130          24.6   
395  32.0          4         135.0        84.0    2295          11.6   
396  28.0          4         120.0        79.0    2625          18.6   
397  31.0          4         119.0        82.0    2720          19.4   

     model_year  origin                               name  
367          82     usa                 chevrolet cavalier  
368          82     usa           chevrolet cavalier wagon  
369          82     usa          chevrolet cavalier 2-door  
370          82     usa         pontiac j2000 se hatchback  
371          82     usa                     dodge aries se  
372          82     usa                    pontiac phoenix  
373          82     usa               ford fairmont futura  
374          82     usa                     amc concord dl  
375          82  europe                volkswagen rabbit l  
376          82   japan                 mazda glc custom l  
377          82   japan                   mazda glc custom  
378          82     usa             plymouth horizon miser  
379          82     usa                     mercury lynx l  
380          82   japan                   nissan stanza xe  
381          82   japan                       honda accord  
382          82   japan                     toyota corolla  
383          82   japan                        honda civic  
384          82   japan                 honda civic (auto)  
385          82   japan                      datsun 310 gx  
386          82     usa              buick century limited  
387          82     usa  oldsmobile cutlass ciera (diesel)  
388          82     usa         chrysler lebaron medallion  
389          82     usa                     ford granada l  
390          82   japan                   toyota celica gt  
391          82     usa                  dodge charger 2.2  
392          82     usa                   chevrolet camaro  
393          82     usa                    ford mustang gl  
394          82  europe                          vw pickup  
395          82     usa                      dodge rampage  
396          82     usa                        ford ranger  
397          82     usa                         chevy s-10  
for a,b in df.groupby("model_year"):
    print(type(b))
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>

Instead of using print, let’s use display for b. The information is the same, but it will be in a more familiar presentation. Let’s also use the expression break to leave the for loop immediately. (So here we are doing the code inside the for loop exactly one time.)

Notice that a is 70 in this case and b is a pandas DataFrame. In fact, it is the sub-DataFrame corresponding to the model_year of 70. We would normally get this sub-DataFrame b using Boolean indexing, but here pandas groupby is doing that work for us.

for a,b in df.groupby("model_year"):
    print(a)
    display(b)
    break # leave the for loop
70
mpg cylinders displacement horsepower weight acceleration model_year origin name
0 18.0 8 307.0 130.0 3504 12.0 70 usa chevrolet chevelle malibu
1 15.0 8 350.0 165.0 3693 11.5 70 usa buick skylark 320
2 18.0 8 318.0 150.0 3436 11.0 70 usa plymouth satellite
3 16.0 8 304.0 150.0 3433 12.0 70 usa amc rebel sst
4 17.0 8 302.0 140.0 3449 10.5 70 usa ford torino
5 15.0 8 429.0 198.0 4341 10.0 70 usa ford galaxie 500
6 14.0 8 454.0 220.0 4354 9.0 70 usa chevrolet impala
7 14.0 8 440.0 215.0 4312 8.5 70 usa plymouth fury iii
8 14.0 8 455.0 225.0 4425 10.0 70 usa pontiac catalina
9 15.0 8 390.0 190.0 3850 8.5 70 usa amc ambassador dpl
10 15.0 8 383.0 170.0 3563 10.0 70 usa dodge challenger se
11 14.0 8 340.0 160.0 3609 8.0 70 usa plymouth 'cuda 340
12 15.0 8 400.0 150.0 3761 9.5 70 usa chevrolet monte carlo
13 14.0 8 455.0 225.0 3086 10.0 70 usa buick estate wagon (sw)
14 24.0 4 113.0 95.0 2372 15.0 70 japan toyota corona mark ii
15 22.0 6 198.0 95.0 2833 15.5 70 usa plymouth duster
16 18.0 6 199.0 97.0 2774 15.5 70 usa amc hornet
17 21.0 6 200.0 85.0 2587 16.0 70 usa ford maverick
18 27.0 4 97.0 88.0 2130 14.5 70 japan datsun pl510
19 26.0 4 97.0 46.0 1835 20.5 70 europe volkswagen 1131 deluxe sedan
20 25.0 4 110.0 87.0 2672 17.5 70 europe peugeot 504
21 24.0 4 107.0 90.0 2430 14.5 70 europe audi 100 ls
22 25.0 4 104.0 95.0 2375 17.5 70 europe saab 99e
23 26.0 4 121.0 113.0 2234 12.5 70 europe bmw 2002
24 21.0 6 199.0 90.0 2648 15.0 70 usa amc gremlin
25 10.0 8 360.0 215.0 4615 14.0 70 usa ford f250
26 10.0 8 307.0 200.0 4376 15.0 70 usa chevy c20
27 11.0 8 318.0 210.0 4382 13.5 70 usa dodge d200
28 9.0 8 304.0 193.0 4732 18.5 70 usa hi 1200d

The above DataFrame b has length 29. The full DataFrame df has length 398.

len(df)
398

Here is another example. We tell the for loop to stop if we reach a == 75. What I really want to see is, what is the value of b when a == 75. If the break part is confusing, you can remove the break and just print “hi” or something, and you can then work with a later value of b.

for a,b in df.groupby("model_year"):
    if a == 75:
        break

Because we broke out of the for loop when a was equal to 75, the value of a is still 75. It never reached the later values of 76 through 82.

a
75

What I really want to emphasize is, what is the value of b when a is 75? It is exactly the sub-DataFrame of df corresponding to model_year 75.

b
mpg cylinders displacement horsepower weight acceleration model_year origin name
152 19.0 6 225.0 95.0 3264 16.0 75 usa plymouth valiant custom
153 18.0 6 250.0 105.0 3459 16.0 75 usa chevrolet nova
154 15.0 6 250.0 72.0 3432 21.0 75 usa mercury monarch
155 15.0 6 250.0 72.0 3158 19.5 75 usa ford maverick
156 16.0 8 400.0 170.0 4668 11.5 75 usa pontiac catalina
157 15.0 8 350.0 145.0 4440 14.0 75 usa chevrolet bel air
158 16.0 8 318.0 150.0 4498 14.5 75 usa plymouth grand fury
159 14.0 8 351.0 148.0 4657 13.5 75 usa ford ltd
160 17.0 6 231.0 110.0 3907 21.0 75 usa buick century
161 16.0 6 250.0 105.0 3897 18.5 75 usa chevroelt chevelle malibu
162 15.0 6 258.0 110.0 3730 19.0 75 usa amc matador
163 18.0 6 225.0 95.0 3785 19.0 75 usa plymouth fury
164 21.0 6 231.0 110.0 3039 15.0 75 usa buick skyhawk
165 20.0 8 262.0 110.0 3221 13.5 75 usa chevrolet monza 2+2
166 13.0 8 302.0 129.0 3169 12.0 75 usa ford mustang ii
167 29.0 4 97.0 75.0 2171 16.0 75 japan toyota corolla
168 23.0 4 140.0 83.0 2639 17.0 75 usa ford pinto
169 20.0 6 232.0 100.0 2914 16.0 75 usa amc gremlin
170 23.0 4 140.0 78.0 2592 18.5 75 usa pontiac astro
171 24.0 4 134.0 96.0 2702 13.5 75 japan toyota corona
172 25.0 4 90.0 71.0 2223 16.5 75 europe volkswagen dasher
173 24.0 4 119.0 97.0 2545 17.0 75 japan datsun 710
174 18.0 6 171.0 97.0 2984 14.5 75 usa ford pinto
175 29.0 4 90.0 70.0 1937 14.0 75 europe volkswagen rabbit
176 19.0 6 232.0 90.0 3211 17.0 75 usa amc pacer
177 23.0 4 115.0 95.0 2694 15.0 75 europe audi 100ls
178 23.0 4 120.0 88.0 2957 17.0 75 europe peugeot 504
179 22.0 4 121.0 98.0 2945 14.5 75 europe volvo 244dl
180 25.0 4 121.0 115.0 2671 13.5 75 europe saab 99le
181 33.0 4 91.0 53.0 1795 17.5 75 japan honda civic cvcc

Let’s contrast that with how we would normally find this sub-DataFrame, using Boolean indexing.

# Boolean indexing
sub_df = df[df["model_year"] == 75]

I claim that the sub-DataFrame sub_df we just defined is the same as b that was produced above. If we evaluate sub_df == b, we get a whole DataFrame of Trues and Falses.

sub_df == b
mpg cylinders displacement horsepower weight acceleration model_year origin name
152 True True True True True True True True True
153 True True True True True True True True True
154 True True True True True True True True True
155 True True True True True True True True True
156 True True True True True True True True True
157 True True True True True True True True True
158 True True True True True True True True True
159 True True True True True True True True True
160 True True True True True True True True True
161 True True True True True True True True True
162 True True True True True True True True True
163 True True True True True True True True True
164 True True True True True True True True True
165 True True True True True True True True True
166 True True True True True True True True True
167 True True True True True True True True True
168 True True True True True True True True True
169 True True True True True True True True True
170 True True True True True True True True True
171 True True True True True True True True True
172 True True True True True True True True True
173 True True True True True True True True True
174 True True True True True True True True True
175 True True True True True True True True True
176 True True True True True True True True True
177 True True True True True True True True True
178 True True True True True True True True True
179 True True True True True True True True True
180 True True True True True True True True True
181 True True True True True True True True True

Are all of the values True? (In other words, are the sub-DataFrames really equal?) The following will tell us if all the values are equal along each individual row.

(sub_df == b).all(axis=1)
152    True
153    True
154    True
155    True
156    True
157    True
158    True
159    True
160    True
161    True
162    True
163    True
164    True
165    True
166    True
167    True
168    True
169    True
170    True
171    True
172    True
173    True
174    True
175    True
176    True
177    True
178    True
179    True
180    True
181    True
dtype: bool

We can add another all to verify that all the rows really are equal. The following confirms that the two DataFrames are equal.

(sub_df == b).all(axis=1).all()
True

Let’s finally get to a question we can solve by iterating over a GroupBy object with a for loop.

  • For each year, how many cars had mpg at least 25?

Here we change to better variable names than a and b. We use the variable names year and sub_df.

for year, sub_df in df.groupby("model_year"):
    print(year)
70
71
72
73
74
75
76
77
78
79
80
81
82

We can’t realistically print out every value of sub_df, but here is the final value. (Inside of the for loop, sub_df takes a different value for each year from 70 to 82. When we leave the for loop, sub_df is still the final value, the sub-DataFrame corresponding to model_year 82.) Notice how the model_year is 82 throughout this DataFrame.

sub_df
mpg cylinders displacement horsepower weight acceleration model_year origin name
367 28.0 4 112.0 88.0 2605 19.6 82 usa chevrolet cavalier
368 27.0 4 112.0 88.0 2640 18.6 82 usa chevrolet cavalier wagon
369 34.0 4 112.0 88.0 2395 18.0 82 usa chevrolet cavalier 2-door
370 31.0 4 112.0 85.0 2575 16.2 82 usa pontiac j2000 se hatchback
371 29.0 4 135.0 84.0 2525 16.0 82 usa dodge aries se
372 27.0 4 151.0 90.0 2735 18.0 82 usa pontiac phoenix
373 24.0 4 140.0 92.0 2865 16.4 82 usa ford fairmont futura
374 23.0 4 151.0 NaN 3035 20.5 82 usa amc concord dl
375 36.0 4 105.0 74.0 1980 15.3 82 europe volkswagen rabbit l
376 37.0 4 91.0 68.0 2025 18.2 82 japan mazda glc custom l
377 31.0 4 91.0 68.0 1970 17.6 82 japan mazda glc custom
378 38.0 4 105.0 63.0 2125 14.7 82 usa plymouth horizon miser
379 36.0 4 98.0 70.0 2125 17.3 82 usa mercury lynx l
380 36.0 4 120.0 88.0 2160 14.5 82 japan nissan stanza xe
381 36.0 4 107.0 75.0 2205 14.5 82 japan honda accord
382 34.0 4 108.0 70.0 2245 16.9 82 japan toyota corolla
383 38.0 4 91.0 67.0 1965 15.0 82 japan honda civic
384 32.0 4 91.0 67.0 1965 15.7 82 japan honda civic (auto)
385 38.0 4 91.0 67.0 1995 16.2 82 japan datsun 310 gx
386 25.0 6 181.0 110.0 2945 16.4 82 usa buick century limited
387 38.0 6 262.0 85.0 3015 17.0 82 usa oldsmobile cutlass ciera (diesel)
388 26.0 4 156.0 92.0 2585 14.5 82 usa chrysler lebaron medallion
389 22.0 6 232.0 112.0 2835 14.7 82 usa ford granada l
390 32.0 4 144.0 96.0 2665 13.9 82 japan toyota celica gt
391 36.0 4 135.0 84.0 2370 13.0 82 usa dodge charger 2.2
392 27.0 4 151.0 90.0 2950 17.3 82 usa chevrolet camaro
393 27.0 4 140.0 86.0 2790 15.6 82 usa ford mustang gl
394 44.0 4 97.0 52.0 2130 24.6 82 europe vw pickup
395 32.0 4 135.0 84.0 2295 11.6 82 usa dodge rampage
396 28.0 4 120.0 79.0 2625 18.6 82 usa ford ranger
397 31.0 4 119.0 82.0 2720 19.4 82 usa chevy s-10

Focusing just on this single sub-DataFrame, how can we determine how many values of “mpg” are greater than or equal to 25? We won’t be using Boolean indexing, but we will be using a Boolean Series.

sub_df["mpg"] >= 25 
367     True
368     True
369     True
370     True
371     True
372     True
373    False
374    False
375     True
376     True
377     True
378     True
379     True
380     True
381     True
382     True
383     True
384     True
385     True
386     True
387     True
388     True
389    False
390     True
391     True
392     True
393     True
394     True
395     True
396     True
397     True
Name: mpg, dtype: bool

Remember that Python treats True like 1 and treats False like 0. So we can count the number of True values just by adding up the values.

(sub_df["mpg"] >= 25).sum()
28

There was nothing special about this sub_df corresponding to model_year 82. Within the for loop, we can use the exact same code. For example, the results below say that in the sub-DataFrame corresponding to 75, there were only 5 cars with mpg of at least 25.

for year, sub_df in df.groupby("model_year"):
    print(year)
    print((sub_df["mpg"] >= 25).sum())
    print()
70
5

71
11

72
5

73
3

74
13

75
5

76
12

77
12

78
12

79
14

80
25

81
24

82
28

Let’s try to make the output more readable using an f-string. There is a very subtle mistake here. Because my f-string uses double quotation marks " and my column name also uses double quotation marks, Python doesn’t know which quotation marks go together.

for year, sub_df in df.groupby("model_year"):
    print(f"In the year 19{year} there were {(sub_df["mpg"] >= 25).sum()} cars in the dataset with mpg at least 25.")
  Cell In [36], line 2
    print(f"In the year 19{year} there were {(sub_df["mpg"] >= 25).sum()} cars in the dataset with mpg at least 25.")
                                                      ^
SyntaxError: f-string: unmatched '['

Luckily this error is very easy to fix once we identify it. We just switch one of the pairs of quotation marks to single apostrophes.

for year, sub_df in df.groupby("model_year"):
    print(f"In the year 19{year} there were {(sub_df['mpg'] >= 25).sum()} cars in the dataset with mpg at least 25.")
In the year 1970 there were 5 cars in the dataset with mpg at least 25.
In the year 1971 there were 11 cars in the dataset with mpg at least 25.
In the year 1972 there were 5 cars in the dataset with mpg at least 25.
In the year 1973 there were 3 cars in the dataset with mpg at least 25.
In the year 1974 there were 13 cars in the dataset with mpg at least 25.
In the year 1975 there were 5 cars in the dataset with mpg at least 25.
In the year 1976 there were 12 cars in the dataset with mpg at least 25.
In the year 1977 there were 12 cars in the dataset with mpg at least 25.
In the year 1978 there were 12 cars in the dataset with mpg at least 25.
In the year 1979 there were 14 cars in the dataset with mpg at least 25.
In the year 1980 there were 25 cars in the dataset with mpg at least 25.
In the year 1981 there were 24 cars in the dataset with mpg at least 25.
In the year 1982 there were 28 cars in the dataset with mpg at least 25.

I think the code is more readable if we move the (sub_df['mpg'] >= 25).sum() portion onto its own line.

# most important: understand what are year and sub_df as we move through this for loop
for year, sub_df in df.groupby("model_year"):
    num_cars = (sub_df['mpg'] >= 25).sum()
    print(f"In the year 19{year} there were {num_cars} cars in the dataset with mpg at least 25.")
In the year 1970 there were 5 cars in the dataset with mpg at least 25.
In the year 1971 there were 11 cars in the dataset with mpg at least 25.
In the year 1972 there were 5 cars in the dataset with mpg at least 25.
In the year 1973 there were 3 cars in the dataset with mpg at least 25.
In the year 1974 there were 13 cars in the dataset with mpg at least 25.
In the year 1975 there were 5 cars in the dataset with mpg at least 25.
In the year 1976 there were 12 cars in the dataset with mpg at least 25.
In the year 1977 there were 12 cars in the dataset with mpg at least 25.
In the year 1978 there were 12 cars in the dataset with mpg at least 25.
In the year 1979 there were 14 cars in the dataset with mpg at least 25.
In the year 1980 there were 25 cars in the dataset with mpg at least 25.
In the year 1981 there were 24 cars in the dataset with mpg at least 25.
In the year 1982 there were 28 cars in the dataset with mpg at least 25.

Notice how num_cars is 28, the same value we saw above.

num_cars
28

Aside: there was a question about how break works. Here is a simpler example (but it’s very easy to be off slightly in your expectation). Imagine i is 18. Then the condition in the if statement fails, so i increases to 21, then the condition in the if statement is True, os we break out of the for loop. So when we leave the for loop, i is 21.

for i in range(0,100,3):
    if i > 20:
        break
i
21

You’ll be impressed at how easily we can switch from the above question, about absolute numbers, to the following question, about proportions. The key reason behind this is again that True counts as 1 and False counts as 0, so if we compute the mean of a collection of Trues and Falses, then that mean will represent exactly the proprtion of True values.

  • For each year, what proportion of cars had mpg at least 25?

# Because True is like 1 and False is like 0, can use mean to get the proportion
for year, sub_df in df.groupby("model_year"):
    prop_cars = (sub_df['mpg'] >= 25).mean()
    print(f"In the year 19{year}, {prop_cars} proportion of cars in the dataset had mpg at least 25.")
In the year 1970, 0.1724137931034483 proportion of cars in the dataset had mpg at least 25.
In the year 1971, 0.39285714285714285 proportion of cars in the dataset had mpg at least 25.
In the year 1972, 0.17857142857142858 proportion of cars in the dataset had mpg at least 25.
In the year 1973, 0.075 proportion of cars in the dataset had mpg at least 25.
In the year 1974, 0.48148148148148145 proportion of cars in the dataset had mpg at least 25.
In the year 1975, 0.16666666666666666 proportion of cars in the dataset had mpg at least 25.
In the year 1976, 0.35294117647058826 proportion of cars in the dataset had mpg at least 25.
In the year 1977, 0.42857142857142855 proportion of cars in the dataset had mpg at least 25.
In the year 1978, 0.3333333333333333 proportion of cars in the dataset had mpg at least 25.
In the year 1979, 0.4827586206896552 proportion of cars in the dataset had mpg at least 25.
In the year 1980, 0.8620689655172413 proportion of cars in the dataset had mpg at least 25.
In the year 1981, 0.8275862068965517 proportion of cars in the dataset had mpg at least 25.
In the year 1982, 0.9032258064516129 proportion of cars in the dataset had mpg at least 25.

Stare at the prop_cars code above until it makes sense. For example, try it with a smaller Boolean Series, maybe a Boolean Series with three True values and one False value. The mean of such a Series would be 0.75.

The next two cells are just for fun, and are meant to show you some of the possibilities with string formatting. Here the string formatting automatically converts from a proportion (between 0 and 1) to a percentage (between 0 and 100).

# Because True is like 1 and False is like 0, can use mean to get the proportion
for year, sub_df in df.groupby("model_year"):
    prop_cars = (sub_df['mpg'] >= 25).mean()
    print(f"In the year 19{year}, {prop_cars:%} percent of cars in the dataset had mpg at least 25.")
In the year 1970, 17.241379% percent of cars in the dataset had mpg at least 25.
In the year 1971, 39.285714% percent of cars in the dataset had mpg at least 25.
In the year 1972, 17.857143% percent of cars in the dataset had mpg at least 25.
In the year 1973, 7.500000% percent of cars in the dataset had mpg at least 25.
In the year 1974, 48.148148% percent of cars in the dataset had mpg at least 25.
In the year 1975, 16.666667% percent of cars in the dataset had mpg at least 25.
In the year 1976, 35.294118% percent of cars in the dataset had mpg at least 25.
In the year 1977, 42.857143% percent of cars in the dataset had mpg at least 25.
In the year 1978, 33.333333% percent of cars in the dataset had mpg at least 25.
In the year 1979, 48.275862% percent of cars in the dataset had mpg at least 25.
In the year 1980, 86.206897% percent of cars in the dataset had mpg at least 25.
In the year 1981, 82.758621% percent of cars in the dataset had mpg at least 25.
In the year 1982, 90.322581% percent of cars in the dataset had mpg at least 25.

We can also make things look a little nicer by saying we only want two decimal places in the percentage.

# Because True is like 1 and False is like 0, can use mean to get the proportion
for year, sub_df in df.groupby("model_year"):
    prop_cars = (sub_df['mpg'] >= 25).mean()
    print(f"In the year 19{year}, {prop_cars:.2%} percent of cars in the dataset had mpg at least 25.")
In the year 1970, 17.24% percent of cars in the dataset had mpg at least 25.
In the year 1971, 39.29% percent of cars in the dataset had mpg at least 25.
In the year 1972, 17.86% percent of cars in the dataset had mpg at least 25.
In the year 1973, 7.50% percent of cars in the dataset had mpg at least 25.
In the year 1974, 48.15% percent of cars in the dataset had mpg at least 25.
In the year 1975, 16.67% percent of cars in the dataset had mpg at least 25.
In the year 1976, 35.29% percent of cars in the dataset had mpg at least 25.
In the year 1977, 42.86% percent of cars in the dataset had mpg at least 25.
In the year 1978, 33.33% percent of cars in the dataset had mpg at least 25.
In the year 1979, 48.28% percent of cars in the dataset had mpg at least 25.
In the year 1980, 86.21% percent of cars in the dataset had mpg at least 25.
In the year 1981, 82.76% percent of cars in the dataset had mpg at least 25.
In the year 1982, 90.32% percent of cars in the dataset had mpg at least 25.

We didn’t get to any of the later portions of this notebook.

Using the mean method of a pandas GroupBy object#

The above for-loop method is very flexible, but for easier questions, often there are easier approaches.

  • For each year, what was the average mpg?

Visualizing the data#

  • Using Altair, make a scatter plot where the x-axis is the model_year, the y-axis is the mpg, and the color is encoded from the “origin” column.

  • Which of these fields (model_year, mpg, origin) could be most improved by specifying an encoding type?

  • How could we use mean and a bar chart to encode the average mpg data by year? (Remove the color for this one.)

  • Put these last two Altair charts side by side using alt.hconcat.

(There probably won’t be time for this example.)

  • What if we put the data into a facet chart, where we divide by weight?