Intro to Pandas¶

Many of us work with large and complicated datasets. The Pandas library is a useful tool for data analysis in Python. In this lesson,we will introduce some tools that can be used to analyze a population data set from Gapminder http://www.gapminder.org/.

Objectives¶

Read tabular data into an IPython notebook
Access columns of the data
Isolate subsets of the data
Generate plots based on subsetted data

Resources¶

http://pandas.pydata.org/pandas-docs/stable/index.html
http://pandas.pydata.org/pandas-docs/stable/basics.html
http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html
https://github.com/datacarpentry/datacarpentry/blob/master/cheatsheets/R_pandas_compare.md
http://pandas.pydata.org/pandas-docs/stable/visualization.html

Preparing for the lesson¶

From you desktop navigiate to WiSE-swc/pandas

Working with tabular data¶

Question¶

What is tabular data? Who works with tabular data?

Some data is stored in a tabular format, meaning the data is organized into rows and columns.

We will use a libarary called pandas to read this dataset into our notebook.
Pandas should be installed with the Anaconda distribution.
Documentation: http://pandas.pydata.org/pandas-docs/stable/index.html

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

To read the Gapminder text file we will use a pandas funtion called read_table().

gDat = pd.read_table('gapminderDataFiveYear.txt')

Familiarizing yourself with the data¶

It is often useful get familiar with data we are working with. What is recorded in our dataset? How is it organized? We can gain some insight by taking a quick peak at the dataset.

gDat

In each row, there is a recording for a country, year, population, continent, life expectancy, and GDP per capita. Further, each row has an interger label.

Since the data set is large, it would be nice to read the column headings without printing the entire dataset. First, we need to know more about how Python treats this data set. What type of object are we working with?

type(gDat)

pandas.core.frame.DataFrame

Pandas loads this tabular data as a DataFrame object. This is a special type of object that is particularly useful for organizing data. Some more information can be found at the documentation webpage.

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html

DataFrame objects have a number of attributes.

gDat.shape

(1704, 6)

gDat.columns

Index([u'country', u'year', u'pop', u'continent', u'lifeExp', u'gdpPercap'], dtype='object')

DataFrame objects also a number of methods.

Look at the first few lines.

gDat.head()

Pass an integer into head() to specify the number of lines to print

gDat.head(7)

Look at the end

gDat.tail()

Summary statistics with describe() * Count, mean,std, min, quartiles, max * Note: maybe not useful for every column

gDat.describe()

More resources: http://pandas.pydata.org/pandas-docs/stable/basics.html

A closer look at the data¶

It would be useful to look at sections of the data, rather than the entire dataset all at once. We might also want to use simple plots to take a closer look at the data.

Select a row of data with ix. * select row with label 0.

gDat.ix[[0]]

select a set of rows

gDat.ix[[0,5,10]]

select a range of rows like in numpy indexing

gDat[0:5]

extract a column

gDat.lifeExp

0     28.801
1     30.332
2     31.997
3     34.020
4     36.088
5     38.438
6     39.854
7     40.822
8     41.674
9     41.763
10    42.129
11    43.828
12    55.230
13    59.280
14    64.820
...
1689    40.238
1690    39.193
1691    42.384
1692    48.451
1693    50.469
1694    52.358
1695    53.995
1696    55.635
1697    57.674
1698    60.363
1699    62.351
1700    60.377
1701    46.809
1702    39.989
1703    43.487
Name: lifeExp, Length: 1704, dtype: float64

or use an indexing-like notation

gDat['lifeExp']

0     28.801
1     30.332
2     31.997
3     34.020
4     36.088
5     38.438
6     39.854
7     40.822
8     41.674
9     41.763
10    42.129
11    43.828
12    55.230
13    59.280
14    64.820
...
1689    40.238
1690    39.193
1691    42.384
1692    48.451
1693    50.469
1694    52.358
1695    53.995
1696    55.635
1697    57.674
1698    60.363
1699    62.351
1700    60.377
1701    46.809
1702    39.989
1703    43.487
Name: lifeExp, Length: 1704, dtype: float64

Many of operations we saw earlier can apply to single column.

gDat['lifeExp'].head()

0    28.801
1    30.332
2    31.997
3    34.020
4    36.088
Name: lifeExp, dtype: float64

gDat['lifeExp'].describe()

count    1704.000000
mean       59.474439
std        12.917107
min        23.599000
25%        48.198000
50%        60.712500
75%        70.845500
max        82.603000
dtype: float64

In this dataset, the year column functions a bit like categorical data since there are only a few unique values.

gDat['year'].describe()

count    1704.00000
mean     1979.50000
std        17.26533
min      1952.00000
25%      1965.75000
50%      1979.50000
75%      1993.25000
max      2007.00000
dtype: float64

But the continent and country columns are truly categorical

gDat['continent'].describe()

count       1704
unique         5
top       Africa
freq         624
dtype: object

gDat['country'].describe()

count               1704
unique               142
top       Czech Republic
freq                  12
dtype: object

These categorical columns display a different set of summary statistics. Note that country, continent were not included in the gDat.describe() command.

Some visualualization tools¶

Several plots can be easily generated for DataFrame objects using the plot method.

First, display the plots inline with a magic command:

%matplotlib inline

We can specify the type of plot with the kind argument. Also, choose the independent and dependent variables with x and y arguments. * Plot year vs life expectancy in a scatter plot.

gDat.plot(x='year',y='lifeExp',kind='scatter' )

<matplotlib.axes._subplots.AxesSubplot at 0xb94fda0>

Plot gdp per capity vs life expectancy in a scatter plot

gDat.plot(x='gdpPercap',y='lifeExp',kind='scatter')

<matplotlib.axes._subplots.AxesSubplot at 0xbacdbe0>

Question¶

What other kinds of plots can be generate with the plot() command?

We can also generate plots for specific columns.

gDat['lifeExp'].plot(kind='kde')

<matplotlib.axes._subplots.AxesSubplot at 0xbc1f080>

Histograms are also availble.

gDat.hist(column='lifeExp')

array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000000012DF6160>]], dtype=object)

Subsetting and grouping¶

Sometimes we need to look at only parts of the data. For example, we might want to look at the data for a particular country or in a particular year. Or we may want to group the data by continent.

Groupby¶

We can use the groupby method to split up the data according to repeated values in each column. For example, group the data by continent.

continents=gDat.groupby('continent')

list the number of unique continents.

len(continents)

5

Question¶

How many unique countries are there? Years?

list the names of the continents and the number of data points in each.

continents.size()

continent
Africa       624
Americas     300
Asia         396
Europe       360
Oceania       24
dtype: int64

Plot in a barchart.

continents.size().plot(kind='bar')

<matplotlib.axes._subplots.AxesSubplot at 0x12d12b38>

Horizontal bar plot

continents.size().plot(kind='barh')

<matplotlib.axes._subplots.AxesSubplot at 0x132046a0>

Retrieve a particular group with the get_group() command.

continents.get_group('Africa').describe()

Question¶

What is the minimum life expectancy for Asia?

continents.get_group('Asia')['lifeExp'].min()

28.800999999999998

continents.get_group('Asia').describe()

Subset¶

We can also isolate sets of data with the isin() fuction. isin() returns a list of indices where the data matches a condition. These indices can be passed into the DataFrame object as an argument. * Isolate all of the data with year 2007

gDat[gDat['year'].isin([2007])]

You can also use "==" instead of isin().

Isolate a country (Uruguay).

gDat[gDat['country'] =='Uruguay']

We could have also accomplished this by accessing the indices of these rows:

gDat[1620:1632]

But this isn't very helpful for a few reasons 1. We need to know the rows that are associated with Urugray ahead of time. How are we supposed to find that? 2. If the DataFrame changes then we need to change the indices.

Exercise¶

Write function that will take two countries as an argument and plot the life expectancy vs year for each country on the same axis.

def compare_lifeExp(country1, country2):
    """Plot life expectancy vs year for country1 and country2"""

    for c in [country1,country2]:
        gDat[gDat['country']==c].plot(x='year',y='lifeExp')
    plt.legend((country1,country2))

compare_lifeExp('Canada','Mexico')

Pivot Table¶

Another option for subsetting is to use a pivot table to group the data. * http://pandas.pydata.org/pandas-docs/dev/generated/pandas.tools.pivot.pivot_table.html

The pivot_table allows us to organize the information in a different way. For instance, we might prefer to think of the year column as a single set of observations and index the data in that way. For simplicity, we will view the population of each country indexed by year. We will send these arguments in the pivot_table function: * index = 'year' * columns = 'country' * values = 'pop'

pd.pivot_table(gDat,index=['year'],columns='country', values='pop')

We could may also be interested in more than one value for each country * index by year and print each county's pop and gdp.

pd.pivot_table(gDat,index=['year'],columns=['country'], values=['pop','gdpPercap'])

We could also index by year but apply an aggregation function to each column. The aggregation function defines how the information in the column should be summarized. Here are some examples: * len * mean * min * max

pd.pivot_table(gDat,index=['year'],aggfunc=[len])

Index by country and only look at gdpPerCap, lifeExp, and pop. Calculate the mean and minimum.

pd.pivot_table(gDat,index=['country'],values=['gdpPercap','lifeExp','pop'],aggfunc=[np.mean,np.min])

Fancier subsetting¶

We can also subset a subset.

Isolate Mexico's year and life expectancy.

spec=['year','lifeExp']
gDat[gDat['country']=='Mexico'][spec]

Or pass multiple conditions for subsetting. * Display Mexico's life expectancy in 2002.

gDat[(gDat['country']=='Mexico') & (gDat['year']==2002)]['lifeExp']

994    74.902
Name: lifeExp, dtype: float64

Exercise¶

Write a function that will print a given country's life expectancy, population and gdp per capita in a given year. (note data is available only for every 5 years between 1952 and 2007).

def print_stats(gDat,country,year):
    """ Prints the life expectancy, gdp per capita and population
    of country in year. """
    
    spec = ['lifeExp','pop','gdpPercap']
    
    print 'Statisitcs for', country, 'in', year
    print gDat[(gDat['country']==country) & (gDat['year']==year)][spec]

print_stats(gDat,'Canada',2007)

Statisitcs for Canada in 2007
     lifeExp       pop    gdpPercap
251   80.653  33390141  36319.23501

Calculations on subsets¶

Somtimes we want to calculate useful pieces of information that apply to groups of the data. We can accomplish this using groupby and the agg() or apply() functions.

agg() allows use to aggregate the data in a certain way.

For example, we may want to look up the minimum life expectancy for each continent.

gDat.groupby('continent')['lifeExp'].agg(np.min)

continent
Africa       23.599
Americas     37.579
Asia         28.801
Europe       43.585
Oceania      69.120
Name: lifeExp, dtype: float64

But it would also be helpful to know where and when this occured. For this we can use apply() and indexing.

We will break this down by looking up when the minimum life expectancy occured for Africa.

Isolate the Africa group and call it subset.

subset= gDat.groupby('continent').get_group('Africa')

Find the index of the minimum life expectancy. Use the idxmin() method.

ind=subset['lifeExp'].idxmin()

Now we pass that index into the year column of subset

subset.year[ind]

1992

or even faster

subset.year[subset['lifeExp'].idxmin()]

1992

Great! So Africa had a minimum life expecency in 1992. But we don't want to write this again for each contintent. We can apply this aggregation to each continent in a single line using the apply() method.

We will pass our own aggregation function called lambda as an argument in the apply() method.

Note: A lambda function is a function that is not bound to a name at run time.

gDat.groupby('continent').apply(lambda subset: subset.year[subset['lifeExp'].idxmin()])

continent
Africa       1992
Americas     1952
Asia         1952
Europe       1952
Oceania      1952
dtype: int64

Repeat again for country.

gDat.groupby('continent').apply(lambda subset: subset.country[subset['lifeExp'].idxmin()])

continent
Africa            Rwanda
Americas           Haiti
Asia         Afghanistan
Europe            Turkey
Oceania        Australia
dtype: object

Or apply to both country and year at once wih a slightly different lambda function

spec=['year','country','lifeExp'] #columns we want to return
gDat.groupby('continent').apply(lambda subset: subset.ix[subset['lifeExp'].idxmin()])[spec]

Exercise¶

Suzy wrote some code to determine which country had the lowest life expectancy in 1982.

What is wrong with her solution?

spec=['country','lifeExp']
gDat[gDat['year']==1982][spec].min()

country    Afghanistan
lifeExp         38.445
dtype: object

We can do a quick check to look up Afghanistan's life expectancy in 1982.

gDat[(gDat['year']==1982) & (gDat['country']=='Afghanistan')]

This doesnt match with the answer above because the min() function was applied to each column (country and lifeExp).

She should have tried this.

gDat.ix[gDat[gDat['year']==1982]['lifeExp'].idxmin()]['country']

'Sierra Leone'

Putting it together - Plots¶

We can use all of these ideas to generate a plot that looks at a subset of the data. * Plot GDP per capita vs life expectancy in 2007 for each continent.

c=gDat.groupby(['continent'])
for cont in c.groups:
    group=c.get_group(cont)
    group[group['year']==2007].plot(kind='scatter',x='gdpPercap',y='lifeExp',title=cont)
    #plt.axis([-10000,60000,30,90]) #add this in

Or plot each on a different axis.

fig,ax=plt.subplots(1,1)
cols=['m','b','r','g','y']
for cont, col in zip(c.groups,cols):
    group=c.get_group(cont)
    group[group['year']==2007].plot(kind='scatter',x='gdpPercap',y='lifeExp',label=cont,ax=ax,color=col,alpha=0.5)
ax.set_title(2007)

<matplotlib.text.Text at 0x13633518>

Exercise¶

Write a function the takes a country as an argument and plots the life expectancy against GDP per capita for all years in a scatter plot. Also print the year of the minimum/maximum lifeExp and the year of the miniimim/maximum GDP per capita.

def compare_gdp_lifeExp(gDat,country):
    """ plot GDP per capita against life expectancy for a given country.
    print year of min/max gdp per capita and life expectancy
    """
    
    sub=gDat[gDat['country']==country]
    sub.plot(x='gdpPercap',y='lifeExp',kind='scatter',title=country)

    print 'Year of Min/Max GDP per capita'
    print gDat.ix[[sub['gdpPercap'].idxmin(),sub['gdpPercap'].idxmax()]]['year']
    print 'Year of Min/Max life expectancy'
    print gDat.ix[[sub['lifeExp'].idxmin(),sub['lifeExp'].idxmax()]]['year']

compare_gdp_lifeExp(gDat,'Afghanistan')

Year of Min/Max GDP per capita
9    1997
6    1982
Name: year, dtype: int64
Year of Min/Max life expectancy
0     1952
11    2007
Name: year, dtype: int64

compare_gdp_lifeExp(gDat,'Canada')

Year of Min/Max GDP per capita
240    1952
251    2007
Name: year, dtype: int64
Year of Min/Max life expectancy
240    1952
251    2007
Name: year, dtype: int64

Other resources for plotting: * http://pandas.pydata.org/pandas-docs/stable/visualization.html#trellis-plotting-interface

Summary¶

pandas is a powerful tool for reading and manipulating tabular data
subsets of data can be created using == and isin()
data can be grouped by similar values using groupby
pivot tables can be used to calculate aggregate statistics over certain variables
pandas supports some simple plotting features

	country	year	pop	continent	lifeExp	gdpPercap
0	Afghanistan	1952	8425333	Asia	28.801	779.445314
1	Afghanistan	1957	9240934	Asia	30.332	820.853030
2	Afghanistan	1962	10267083	Asia	31.997	853.100710
3	Afghanistan	1967	11537966	Asia	34.020	836.197138
4	Afghanistan	1972	13079460	Asia	36.088	739.981106
5	Afghanistan	1977	14880372	Asia	38.438	786.113360
6	Afghanistan	1982	12881816	Asia	39.854	978.011439
7	Afghanistan	1987	13867957	Asia	40.822	852.395945
8	Afghanistan	1992	16317921	Asia	41.674	649.341395
9	Afghanistan	1997	22227415	Asia	41.763	635.341351
10	Afghanistan	2002	25268405	Asia	42.129	726.734055
11	Afghanistan	2007	31889923	Asia	43.828	974.580338
12	Albania	1952	1282697	Europe	55.230	1601.056136
13	Albania	1957	1476505	Europe	59.280	1942.284244
14	Albania	1962	1728137	Europe	64.820	2312.888958
15	Albania	1967	1984060	Europe	66.220	2760.196931
16	Albania	1972	2263554	Europe	67.690	3313.422188
17	Albania	1977	2509048	Europe	68.930	3533.003910
18	Albania	1982	2780097	Europe	70.420	3630.880722
19	Albania	1987	3075321	Europe	72.000	3738.932735
20	Albania	1992	3326498	Europe	71.581	2497.437901
21	Albania	1997	3428038	Europe	72.950	3193.054604
22	Albania	2002	3508512	Europe	75.651	4604.211737
23	Albania	2007	3600523	Europe	76.423	5937.029526
24	Algeria	1952	9279525	Africa	43.077	2449.008185
25	Algeria	1957	10270856	Africa	45.685	3013.976023
26	Algeria	1962	11000948	Africa	48.303	2550.816880
27	Algeria	1967	12760499	Africa	51.407	3246.991771
28	Algeria	1972	14760787	Africa	54.518	4182.663766
29	Algeria	1977	17152804	Africa	58.014	4910.416756
...	...	...	...	...	...	...
1674	Yemen, Rep.	1982	9657618	Asia	49.113	1977.557010
1675	Yemen, Rep.	1987	11219340	Asia	52.922	1971.741538
1676	Yemen, Rep.	1992	13367997	Asia	55.599	1879.496673
1677	Yemen, Rep.	1997	15826497	Asia	58.020	2117.484526
1678	Yemen, Rep.	2002	18701257	Asia	60.308	2234.820827
1679	Yemen, Rep.	2007	22211743	Asia	62.698	2280.769906
1680	Zambia	1952	2672000	Africa	42.038	1147.388831
1681	Zambia	1957	3016000	Africa	44.077	1311.956766
1682	Zambia	1962	3421000	Africa	46.023	1452.725766
1683	Zambia	1967	3900000	Africa	47.768	1777.077318
1684	Zambia	1972	4506497	Africa	50.107	1773.498265
1685	Zambia	1977	5216550	Africa	51.386	1588.688299
1686	Zambia	1982	6100407	Africa	51.821	1408.678565
1687	Zambia	1987	7272406	Africa	50.821	1213.315116
1688	Zambia	1992	8381163	Africa	46.100	1210.884633
1689	Zambia	1997	9417789	Africa	40.238	1071.353818
1690	Zambia	2002	10595811	Africa	39.193	1071.613938
1691	Zambia	2007	11746035	Africa	42.384	1271.211593
1692	Zimbabwe	1952	3080907	Africa	48.451	406.884115
1693	Zimbabwe	1957	3646340	Africa	50.469	518.764268
1694	Zimbabwe	1962	4277736	Africa	52.358	527.272182
1695	Zimbabwe	1967	4995432	Africa	53.995	569.795071
1696	Zimbabwe	1972	5861135	Africa	55.635	799.362176
1697	Zimbabwe	1977	6642107	Africa	57.674	685.587682
1698	Zimbabwe	1982	7636524	Africa	60.363	788.855041
1699	Zimbabwe	1987	9216418	Africa	62.351	706.157306
1700	Zimbabwe	1992	10704340	Africa	60.377	693.420786
1701	Zimbabwe	1997	11404948	Africa	46.809	792.449960
1702	Zimbabwe	2002	11926563	Africa	39.989	672.038623
1703	Zimbabwe	2007	12311143	Africa	43.487	469.709298

	year	pop	lifeExp	gdpPercap
count	624.00000	6.240000e+02	624.00000	624.000000
mean	1979.50000	9.916003e+06	48.86533	2193.754578
std	17.27411	1.549092e+07	9.15021	2827.929863
min	1952.00000	6.001100e+04	23.59900	241.165877
25%	1965.75000	1.342075e+06	42.37250	761.247010
50%	1979.50000	4.579311e+06	47.79200	1192.138217
75%	1993.25000	1.080149e+07	54.41150	2377.417422
max	2007.00000	1.350312e+08	76.44200	21951.211760

	year	pop	lifeExp	gdpPercap
count	396.000000	3.960000e+02	396.000000	396.000000
mean	1979.500000	7.703872e+07	60.064903	7902.150428
std	17.282097	2.068852e+08	11.864532	14045.373112
min	1952.000000	1.204470e+05	28.801000	331.000000
25%	1965.750000	3.844393e+06	51.426250	1056.993223
50%	1979.500000	1.453083e+07	61.791500	2646.786844
75%	1993.250000	4.630035e+07	69.505250	8549.255654
max	2007.000000	1.318683e+09	82.603000	113523.132900

	country	year	pop	continent	lifeExp	gdpPercap
11	Afghanistan	2007	31889923	Asia	43.828	974.580338
23	Albania	2007	3600523	Europe	76.423	5937.029526
35	Algeria	2007	33333216	Africa	72.301	6223.367465
47	Angola	2007	12420476	Africa	42.731	4797.231267
59	Argentina	2007	40301927	Americas	75.320	12779.379640
71	Australia	2007	20434176	Oceania	81.235	34435.367440
83	Austria	2007	8199783	Europe	79.829	36126.492700
95	Bahrain	2007	708573	Asia	75.635	29796.048340
107	Bangladesh	2007	150448339	Asia	64.062	1391.253792
119	Belgium	2007	10392226	Europe	79.441	33692.605080
131	Benin	2007	8078314	Africa	56.728	1441.284873
143	Bolivia	2007	9119152	Americas	65.554	3822.137084
155	Bosnia and Herzegovina	2007	4552198	Europe	74.852	7446.298803
167	Botswana	2007	1639131	Africa	50.728	12569.851770
179	Brazil	2007	190010647	Americas	72.390	9065.800825
191	Bulgaria	2007	7322858	Europe	73.005	10680.792820
203	Burkina Faso	2007	14326203	Africa	52.295	1217.032994
215	Burundi	2007	8390505	Africa	49.580	430.070692
227	Cambodia	2007	14131858	Asia	59.723	1713.778686
239	Cameroon	2007	17696293	Africa	50.430	2042.095240
251	Canada	2007	33390141	Americas	80.653	36319.235010
263	Central African Republic	2007	4369038	Africa	44.741	706.016537
275	Chad	2007	10238807	Africa	50.651	1704.063724
287	Chile	2007	16284741	Americas	78.553	13171.638850
299	China	2007	1318683096	Asia	72.961	4959.114854
311	Colombia	2007	44227550	Americas	72.889	7006.580419
323	Comoros	2007	710960	Africa	65.152	986.147879
335	Congo, Dem. Rep.	2007	64606759	Africa	46.462	277.551859
347	Congo, Rep.	2007	3800610	Africa	55.322	3632.557798
359	Costa Rica	2007	4133884	Americas	78.782	9645.061420
...	...	...	...	...	...	...
1355	Sierra Leone	2007	6144562	Africa	42.568	862.540756
1367	Singapore	2007	4553009	Asia	79.972	47143.179640
1379	Slovak Republic	2007	5447502	Europe	74.663	18678.314350
1391	Slovenia	2007	2009245	Europe	77.926	25768.257590
1403	Somalia	2007	9118773	Africa	48.159	926.141068
1415	South Africa	2007	43997828	Africa	49.339	9269.657808
1427	Spain	2007	40448191	Europe	80.941	28821.063700
1439	Sri Lanka	2007	20378239	Asia	72.396	3970.095407
1451	Sudan	2007	42292929	Africa	58.556	2602.394995
1463	Swaziland	2007	1133066	Africa	39.613	4513.480643
1475	Sweden	2007	9031088	Europe	80.884	33859.748350
1487	Switzerland	2007	7554661	Europe	81.701	37506.419070
1499	Syria	2007	19314747	Asia	74.143	4184.548089
1511	Taiwan	2007	23174294	Asia	78.400	28718.276840
1523	Tanzania	2007	38139640	Africa	52.517	1107.482182
1535	Thailand	2007	65068149	Asia	70.616	7458.396327
1547	Togo	2007	5701579	Africa	58.420	882.969944
1559	Trinidad and Tobago	2007	1056608	Americas	69.819	18008.509240
1571	Tunisia	2007	10276158	Africa	73.923	7092.923025
1583	Turkey	2007	71158647	Europe	71.777	8458.276384
1595	Uganda	2007	29170398	Africa	51.542	1056.380121
1607	United Kingdom	2007	60776238	Europe	79.425	33203.261280
1619	United States	2007	301139947	Americas	78.242	42951.653090
1631	Uruguay	2007	3447496	Americas	76.384	10611.462990
1643	Venezuela	2007	26084662	Americas	73.747	11415.805690
1655	Vietnam	2007	85262356	Asia	74.249	2441.576404
1667	West Bank and Gaza	2007	4018332	Asia	73.422	3025.349798
1679	Yemen, Rep.	2007	22211743	Asia	62.698	2280.769906
1691	Zambia	2007	11746035	Africa	42.384	1271.211593
1703	Zimbabwe	2007	12311143	Africa	43.487	469.709298

	country	year	pop	continent	lifeExp	gdpPercap
1620	Uruguay	1952	2252965	Americas	66.071	5716.766744
1621	Uruguay	1957	2424959	Americas	67.044	6150.772969
1622	Uruguay	1962	2598466	Americas	68.253	5603.357717
1623	Uruguay	1967	2748579	Americas	68.468	5444.619620
1624	Uruguay	1972	2829526	Americas	68.673	5703.408898
1625	Uruguay	1977	2873520	Americas	69.481	6504.339663
1626	Uruguay	1982	2953997	Americas	70.805	6920.223051
1627	Uruguay	1987	3045153	Americas	71.918	7452.398969
1628	Uruguay	1992	3149262	Americas	72.752	8137.004775
1629	Uruguay	1997	3262838	Americas	74.223	9230.240708
1630	Uruguay	2002	3363085	Americas	75.307	7727.002004
1631	Uruguay	2007	3447496	Americas	76.384	10611.462990

	year	pop	lifeExp	gdpPercap
count	1704.00000	1.704000e+03	1704.000000	1704.000000
mean	1979.50000	2.960121e+07	59.474439	7215.327081
std	17.26533	1.061579e+08	12.917107	9857.454543
min	1952.00000	6.001100e+04	23.599000	241.165877
25%	1965.75000	2.793664e+06	48.198000	1202.060309
50%	1979.50000	7.023596e+06	60.712500	3531.846989
75%	1993.25000	1.958522e+07	70.845500	9325.462346
max	2007.00000	1.318683e+09	82.603000	113523.132900

	year	lifeExp
984	1952	50.789
985	1957	55.190
986	1962	58.299
987	1967	60.110
988	1972	62.361
989	1977	65.032
990	1982	67.405
991	1987	69.498
992	1992	71.455
993	1997	73.670
994	2002	74.902
995	2007	76.195

country	Afghanistan	Albania	Algeria	Angola	Argentina	Australia	Austria	Bahrain	Bangladesh	Belgium	...	Uganda	United Kingdom	United States	Uruguay	Venezuela	Vietnam	West Bank and Gaza	Yemen, Rep.	Zambia	Zimbabwe
year
1952	8425333	1282697	9279525	4232095	17876956	8691212	6927772	120447	46886859	8730405	...	5824797	50430000	157553000	2252965	5439568	26246839	1030585	4963829	2672000	3080907
1957	9240934	1476505	10270856	4561361	19610538	9712569	6965860	138655	51365468	8989111	...	6675501	51430000	171984000	2424959	6702668	28998543	1070439	5498090	3016000	3646340
1962	10267083	1728137	11000948	4826015	21283783	10794968	7129864	171863	56839289	9218400	...	7688797	53292000	186538000	2598466	8143375	33796140	1133134	6120081	3421000	4277736
1967	11537966	1984060	12760499	5247469	22934225	11872264	7376998	202182	62821884	9556500	...	8900294	54959000	198712000	2748579	9709552	39463910	1142636	6740785	3900000	4995432
1972	13079460	2263554	14760787	5894858	24779799	13177000	7544201	230800	70759295	9709100	...	10190285	56079000	209896000	2829526	11515649	44655014	1089572	7407075	4506497	5861135
1977	14880372	2509048	17152804	6162675	26983828	14074100	7568430	297410	80428306	9821800	...	11457758	56179000	220239000	2873520	13503563	50533506	1261091	8403990	5216550	6642107
1982	12881816	2780097	20033753	7016384	29341374	15184200	7574613	377967	93074406	9856303	...	12939400	56339704	232187835	2953997	15620766	56142181	1425876	9657618	6100407	7636524
1987	13867957	3075321	23254956	7874230	31620918	16257249	7578903	454612	103764241	9870200	...	15283050	56981620	242803533	3045153	17910182	62826491	1691210	11219340	7272406	9216418
1992	16317921	3326498	26298373	8735988	33958947	17481977	7914969	529491	113704579	10045622	...	18252190	57866349	256894189	3149262	20265563	69940728	2104779	13367997	8381163	10704340
1997	22227415	3428038	29072015	9875024	36203463	18565243	8069876	598561	123315288	10199787	...	21210254	58808266	272911760	3262838	22374398	76048996	2826046	15826497	9417789	11404948
2002	25268405	3508512	31287142	10866106	38331121	19546792	8148312	656397	135656790	10311970	...	24739869	59912431	287675526	3363085	24287670	80908147	3389578	18701257	10595811	11926563
2007	31889923	3600523	33333216	12420476	40301927	20434176	8199783	708573	150448339	10392226	...	29170398	60776238	301139947	3447496	26084662	85262356	4018332	22211743	11746035	12311143

	pop										...	gdpPercap
country	Afghanistan	Albania	Algeria	Angola	Argentina	Australia	Austria	Bahrain	Bangladesh	Belgium	...	Uganda	United Kingdom	United States	Uruguay	Venezuela	Vietnam	West Bank and Gaza	Yemen, Rep.	Zambia	Zimbabwe
year
1952	8425333	1282697	9279525	4232095	17876956	8691212	6927772	120447	46886859	8730405	...	734.753484	9979.508487	13990.48208	5716.766744	7689.799761	605.066492	1515.592329	781.717576	1147.388831	406.884115
1957	9240934	1476505	10270856	4561361	19610538	9712569	6965860	138655	51365468	8989111	...	774.371069	11283.177950	14847.12712	6150.772969	9802.466526	676.285448	1827.067742	804.830455	1311.956766	518.764268
1962	10267083	1728137	11000948	4826015	21283783	10794968	7129864	171863	56839289	9218400	...	767.271740	12477.177070	16173.14586	5603.357717	8422.974165	772.049160	2198.956312	825.623201	1452.725766	527.272182
1967	11537966	1984060	12760499	5247469	22934225	11872264	7376998	202182	62821884	9556500	...	908.918522	14142.850890	19530.36557	5444.619620	9541.474188	637.123289	2649.715007	862.442146	1777.077318	569.795071
1972	13079460	2263554	14760787	5894858	24779799	13177000	7544201	230800	70759295	9709100	...	950.735869	15895.116410	21806.03594	5703.408898	10505.259660	699.501644	3133.409277	1265.047031	1773.498265	799.362176
1977	14880372	2509048	17152804	6162675	26983828	14074100	7568430	297410	80428306	9821800	...	843.733137	17428.748460	24072.63213	6504.339663	13143.950950	713.537120	3682.831494	1829.765177	1588.688299	685.587682
1982	12881816	2780097	20033753	7016384	29341374	15184200	7574613	377967	93074406	9856303	...	682.266227	18232.424520	25009.55914	6920.223051	11152.410110	707.235786	4336.032082	1977.557010	1408.678565	788.855041
1987	13867957	3075321	23254956	7874230	31620918	16257249	7578903	454612	103764241	9870200	...	617.724406	21664.787670	29884.35041	7452.398969	9883.584648	820.799445	5107.197384	1971.741538	1213.315116	706.157306
1992	16317921	3326498	26298373	8735988	33958947	17481977	7914969	529491	113704579	10045622	...	644.170797	22705.092540	32003.93224	8137.004775	10733.926310	989.023149	6017.654756	1879.496673	1210.884633	693.420786
1997	22227415	3428038	29072015	9875024	36203463	18565243	8069876	598561	123315288	10199787	...	816.559081	26074.531360	35767.43303	9230.240708	10165.495180	1385.896769	7110.667619	2117.484526	1071.353818	792.449960
2002	25268405	3508512	31287142	10866106	38331121	19546792	8148312	656397	135656790	10311970	...	927.721002	29478.999190	39097.09955	7727.002004	8605.047831	1764.456677	4515.487575	2234.820827	1071.613938	672.038623
2007	31889923	3600523	33333216	12420476	40301927	20434176	8199783	708573	150448339	10392226	...	1056.380121	33203.261280	42951.65309	10611.462990	11415.805690	2441.576404	3025.349798	2280.769906	1271.211593	469.709298

	mean			amin
	gdpPercap	lifeExp	pop	gdpPercap	lifeExp	pop
country
Afghanistan	802.674598	37.478833	1.582372e+07	635.341351	28.801	8.425333e+06
Albania	3255.366633	68.432917	2.580249e+06	1601.056136	55.230	1.282697e+06
Algeria	4426.025973	59.030167	1.987541e+07	2449.008185	43.077	9.279525e+06
Angola	3607.100529	37.883500	7.309390e+06	2277.140884	30.015	4.232095e+06
Argentina	8955.553783	69.060417	2.860224e+07	5911.315053	62.485	1.787696e+07
Australia	19980.595634	74.662917	1.464931e+07	10039.595640	69.120	8.691212e+06
Austria	20411.916279	73.103250	7.583298e+06	6137.076492	66.800	6.927772e+06
Bahrain	18077.663945	65.605667	3.739132e+05	9867.084765	50.939	1.204470e+05
Bangladesh	817.558818	49.834083	9.075540e+07	630.233627	37.484	4.688686e+07
Belgium	19900.758072	73.641750	9.725119e+06	8343.105127	68.000	8.730405e+06
Benin	1155.395107	48.779917	4.017497e+06	949.499064	38.223	1.738315e+06
Bolivia	2961.228754	52.504583	5.610395e+06	2127.686326	40.414	2.883315e+06
Bosnia and Herzegovina	3484.779069	67.707833	3.816525e+06	973.533195	53.820	2.791000e+06
Botswana	5031.503557	54.597500	9.711862e+05	851.241141	46.634	4.423080e+05
Brazil	5829.316653	62.239500	1.223121e+08	2108.944355	50.917	5.660256e+07
Bulgaria	6384.055172	69.743750	8.182985e+06	2444.286648	59.600	7.274900e+06
Burkina Faso	843.990665	44.694000	7.548677e+06	543.255241	31.975	4.469979e+06
Burundi	471.662990	44.817333	4.651608e+06	339.296459	39.031	2.445618e+06
Cambodia	675.367824	47.902750	8.510431e+06	368.469286	31.220	4.693836e+06
Cameroon	1774.634222	48.128500	9.816648e+06	1172.667655	38.523	5.009067e+06
Canada	22410.746340	74.902750	2.446297e+07	11367.161120	68.750	1.478558e+07
Central African Republic	958.784697	43.866917	2.560963e+06	706.016537	35.463	1.291695e+06
Chad	1165.453674	46.773583	5.329256e+06	797.908101	38.092	2.682462e+06
Chile	6703.289147	67.430917	1.120573e+07	3939.978789	54.745	6.377619e+06
China	1488.307694	61.785140	9.581601e+08	400.448611	44.000	5.562635e+08
Colombia	4195.342920	63.897750	2.725610e+07	2144.115096	50.643	1.235077e+07
Comoros	1314.380339	52.381750	3.616839e+05	986.147879	40.715	1.539360e+05
Congo, Dem. Rep.	648.342646	44.543750	3.268166e+07	241.165877	39.143	1.410000e+07
Congo, Rep.	3312.788215	52.501917	1.923209e+06	2125.621418	42.111	8.548850e+05
Costa Rica	5448.610779	70.181417	2.400008e+06	2627.009471	57.206	9.263170e+05
...	...	...	...	...	...	...
Sierra Leone	1072.819493	36.769167	3.605425e+06	574.648158	30.331	2.143249e+06
Singapore	17425.382267	71.220250	2.667817e+06	2315.138227	60.396	1.127000e+06
Slovak Republic	10415.530689	70.696083	4.774507e+06	5074.659104	64.360	3.558137e+06
Slovenia	14074.582109	71.600750	1.794381e+06	4215.041741	65.570	1.489518e+06
Somalia	1140.793252	40.988667	5.197198e+06	882.081822	32.978	2.526994e+06
South Africa	7247.431074	53.993167	2.992835e+07	4725.295531	45.009	1.426494e+07
Spain	14029.826479	74.203417	3.585180e+07	3834.034742	64.940	2.854987e+07