Programming Exercises for Data Analysis, Level A2

These exercises are for people who are learning a new programming language and would like to apply it to data analysis. They work with Python or R, and they are also useful when you are getting familiar with libraries like pandas.

During the exercises, you will analyze data on global demography from the Gapminder Foundation (www.gapminder.org).

In this challenge, you will practice:

  • integrating data from multiple sources
  • aggregation
  • plotting, plotting, plotting

Preparations

Download the following Gapminder datasets from www.gapminder.org:

  • fertility
  • life expectancy
  • population

Exercise 1

Load all three tables and determine their dimensions.
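One possible starting point in Python with pandas is sketched below. It assumes the downloaded tables are CSV files in a wide layout (one row per country, one column per year), which may not match the files you actually get. The inline text, the country names and the numbers are made up so that the snippet runs on its own.

```python
import io

import pandas as pd

# Made-up stand-in for a downloaded file; real Gapminder tables are much larger.
csv_text = """country,1960,2010
Country A,6.1,2.4
Country B,3.2,1.8
Country C,7.0,5.3
"""

# With a real download, pass the file path instead of the StringIO object.
fertility = pd.read_csv(io.StringIO(csv_text), index_col="country")

# shape is a (rows, columns) tuple
print(fertility.shape)
```

Repeating the same two steps for the other two tables gives you all three dimensions.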

Exercise 2

Select fertility and life expectancy for the year 2010. Combine both into a single table.
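A minimal pandas sketch of this step follows, assuming both tables are indexed by country and have one column per year. The toy tables and the column names "fertility" and "life_expectancy" are hypothetical.

```python
import pandas as pd

# Made-up toy tables: rows are countries, columns are years.
fertility = pd.DataFrame(
    {"1960": [6.1, 3.2], "2010": [2.4, 1.8]},
    index=["Country A", "Country B"],
)
life = pd.DataFrame(
    {"1960": [45.0, 68.0, 50.0], "2010": [70.0, 80.0, None]},
    index=["Country A", "Country B", "Country C"],
)

year = "2010"

# Take one column from each table and align them on the country index.
# Countries that appear in only one table get a missing value in the other column.
combined = pd.concat(
    [fertility[year].rename("fertility"), life[year].rename("life_expectancy")],
    axis=1,
)
print(combined)
```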

Exercise 3

Remove all rows with missing values.
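In pandas, `DataFrame.dropna()` does this in one call. The table below is a made-up example, not real data.

```python
import numpy as np
import pandas as pd

# Made-up combined table with one incomplete row.
table = pd.DataFrame(
    {"fertility": [2.4, 1.8, np.nan], "life_expectancy": [70.0, 80.0, 62.0]},
    index=["Country A", "Country B", "Country C"],
)

# dropna() returns a new table without rows that contain any missing value.
clean = table.dropna()
print(len(table), "rows before,", len(clean), "rows after")
```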

Exercise 4

Draw a scatterplot of fertility over life expectancy in 2010.
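A possible matplotlib sketch is shown below, with life expectancy on the x axis and fertility on the y axis. The values are invented for illustration, and the Agg backend is selected only so that the script also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Made-up values for three countries in one year.
life_expectancy = [50.0, 70.0, 80.0]
fertility = [5.3, 2.4, 1.8]

fig, ax = plt.subplots()
points = ax.scatter(life_expectancy, fertility)
ax.set_xlabel("life expectancy (years)")
ax.set_ylabel("fertility (children per woman)")
ax.set_title("2010")

# fig.savefig("scatter_2010.png")  # uncomment to write the figure to disk
```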

Exercise 5

Draw a histogram of life expectancy in 2010. Try different bin sizes.
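One way to compare bin sizes with matplotlib is to draw the same data side by side with different `bins` values, as in this sketch. The life expectancy values are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Made-up life expectancy values.
values = [48.0, 52.0, 61.0, 66.0, 70.0, 71.0, 74.0, 79.0, 80.0, 82.0]

bin_choices = (3, 6)
fig, axes = plt.subplots(1, len(bin_choices), figsize=(8, 3))

results = {}
for ax, bins in zip(axes, bin_choices):
    # hist() returns the counts per bin and the bin edges.
    counts, edges, _ = ax.hist(values, bins=bins)
    ax.set_title(f"{bins} bins")
    ax.set_xlabel("life expectancy (years)")
    results[bins] = (counts, edges)
```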

Exercise 6

Make the histogram a publication-quality figure.

Exercise 7

Draw a bar plot displaying the fertility of a few selected countries in 2010.

Exercise 8

Repeat exercises 2-7 for the year 1960. Observe differences.

Exercise 9

Make it convenient to repeat the process for any given year.
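One convenient approach is to wrap the selection and cleaning steps in a function that takes the year as a parameter. The sketch below uses a hypothetical helper named `table_for_year` and made-up toy tables in a wide layout.

```python
import pandas as pd


def table_for_year(fertility, life, year):
    """Return fertility and life expectancy for one year, without incomplete rows."""
    table = pd.concat(
        [fertility[year].rename("fertility"), life[year].rename("life_expectancy")],
        axis=1,
    )
    return table.dropna()


# Made-up toy tables: rows are countries, columns are years.
countries = ["Country A", "Country B"]
fertility = pd.DataFrame({"1960": [6.1, 3.2], "2010": [2.4, 1.8]}, index=countries)
life = pd.DataFrame({"1960": [45.0, None], "2010": [70.0, 80.0]}, index=countries)

for year in ("1960", "2010"):
    print(year, table_for_year(fertility, life, year).shape)
```

The plotting steps can be moved into functions in the same way, so that a single loop over years reproduces every figure.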

Exercise 10

Calculate a correlation coefficient between fertility and life expectancy (for the year 2010).
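In pandas, `Series.corr()` computes the Pearson correlation coefficient by default. The numbers below are invented and perfectly linear, so the result is close to -1; real data will give a weaker value.

```python
import pandas as pd

# Made-up, perfectly linear toy data.
table = pd.DataFrame(
    {"fertility": [6.0, 4.0, 2.0], "life_expectancy": [50.0, 60.0, 70.0]}
)

# Pearson correlation between the two columns.
r = table["fertility"].corr(table["life_expectancy"])
print(r)
```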

Exercise 11

Fit a linear model that predicts fertility from life expectancy.
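A simple least-squares line can be fitted with `numpy.polyfit`, as sketched below. This is only one of several options (statsmodels or scikit-learn would also work), and the data points are made up.

```python
import numpy as np

# Made-up toy data: life expectancy (x) and fertility (y).
life_expectancy = np.array([50.0, 60.0, 70.0])
fertility = np.array([6.0, 4.0, 2.0])

# Degree 1 fits fertility = slope * life_expectancy + intercept.
slope, intercept = np.polyfit(life_expectancy, fertility, deg=1)

# Use the fitted line to predict fertility at a life expectancy of 65 years.
predicted = slope * 65.0 + intercept
print(slope, intercept, predicted)
```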

Exercise 12

Read a list of country-continent pairs. Associate the continent with each country (e.g. as an extra column).
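With pandas, a left merge on the country name adds the continent as an extra column. The tables, country names and continent names below are placeholders.

```python
import pandas as pd

# Made-up data table and country-continent pairs.
table = pd.DataFrame(
    {"country": ["Country A", "Country B", "Country C"], "fertility": [2.4, 1.8, 5.3]}
)
continents = pd.DataFrame(
    {"country": ["Country A", "Country B"], "continent": ["Continent X", "Continent Y"]}
)

# how="left" keeps every row of the data table;
# countries without a known continent get a missing value.
merged = table.merge(continents, on="country", how="left")
print(merged)
```

Checking for missing continents after the merge is a quick way to spot country names that are spelled differently in the two sources.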

Exercise 13

Summarize the world population by continent over time as a scatterplot.
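The aggregation part of this exercise can be sketched with `groupby`, assuming the population table already has a continent column and one column per year. All names and numbers are made up.

```python
import pandas as pd

# Made-up population table (in millions) with a continent column.
population = pd.DataFrame(
    {
        "continent": ["Continent X", "Continent X", "Continent Y"],
        "1960": [10, 20, 5],
        "2010": [30, 40, 15],
    },
    index=["Country A", "Country B", "Country C"],
)

# Sum the population of all countries within each continent, year by year.
by_continent = population.groupby("continent").sum()

# Transpose so that years become rows, which is convenient for plotting over time.
over_time = by_continent.T
print(over_time)
```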

Exercise 14

Identify a few countries that are redundant in the dataset. Remove the respective entries.

Exercise 15

Integrate all three tables into a single data structure.
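One possible data structure is a single long table with one row per country and year. The sketch below reshapes each wide table with `melt` and then merges the results. The helper name `to_long` and all values are hypothetical.

```python
import pandas as pd


def to_long(wide, value_name):
    """Turn a wide table (countries x years) into country/year/value rows."""
    return wide.reset_index().melt(
        id_vars="country", var_name="year", value_name=value_name
    )


# Made-up toy tables: rows are countries, columns are years.
countries = pd.Index(["Country A", "Country B"], name="country")
fertility = pd.DataFrame({"1960": [6.1, 3.2], "2010": [2.4, 1.8]}, index=countries)
life = pd.DataFrame({"1960": [45.0, 68.0], "2010": [70.0, 80.0]}, index=countries)
population = pd.DataFrame({"1960": [10, 20], "2010": [30, 40]}, index=countries)

# Merge the three long tables on the shared country and year columns.
combined = to_long(fertility, "fertility")
combined = combined.merge(to_long(life, "life_expectancy"), on=["country", "year"])
combined = combined.merge(to_long(population, "population"), on=["country", "year"])
print(combined)
```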

Exercise 16

Plot population, life expectancy and fertility from 1960 to 2010 in a single diagram.

Exercise 17

Create a series of scatterplots like in Exercise 4 (one for each year) where the color indicates the continent.

Exercise 18

Combine the scatterplots into an animation.
