Showing the Right Numbers and Visuals Library Paper
Description
According to the author, ‘ggplot is an implementation of the grammar of graphics’, a set of rules for producing visualizations of data. In this first plot, we will track the trajectory of life expectancy over time for each country in the data.
- map year to x and lifeExp to y.
- use geom_line to show how lifeExp changes over time. (Did you notice a mistaken assignment to the y aesthetic in the book?)
- use grouping to make each line refer to a specific country in the dataset
- facet the data on continent
- try moving the facets around on the page, for example five across (ncol = 5) or five down (nrow = 5)
- add a smoother, change the y scale to a log scale and add the dollar sign
- add the labels as described in the book
- try using facet_grid
- be sure that you know what ordered and unordered categorical variables are, and how they compare with continuous variables.
- use the glimpse function on the gss_sm data set. Try this: gss_sm %>% glimpse(). What do you think the pipe operator (%>%) does?
- make a smoothed scatterplot of the relationship between age of the respondent and the number of children. What did you learn?
- facet the result by the sex and race of the respondent (the person who responded to the survey)
- experiment with the alpha attribute
- experiment with combining attributes in the facet wrapper
- geom_bar uses count by default
- describe how geom_bar used the count statistic (stat_count) to determine how much water had been used.
- Use the computed prop variable with group = 1 to show proportions by region
- show just the religion column in a table
- create a bar chart showing the frequency of religious distribution in the data
- use fill to highlight the different religions
- create a stacked bar chart to show the frequency of religions by region
- use the position = "dodge" argument to create individual religious frequency bars
- facet the religious frequency chart by region
- create a histogram of the midwest data set showing the distribution of area
- experiment with the number of bins in the previous histogram
- Where does the count variable come from?
- use subset to show the data in a histogram just from Ohio and Wisconsin
- create a kernel density plot of the area of the midwest states
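As a starting point for the first few steps above, here is a minimal sketch, not a model answer. It assumes the ggplot2, gapminder, and socviz packages are installed, and that gss_sm has a bigregion column as in the book's examples. The object names p_life and p_prop and the label text are illustrative only. The sketch uses after_stat(prop), the current ggplot2 spelling of the older ..prop.. notation you may see in printed examples.

```r
# Assumption: ggplot2, gapminder, and socviz are already installed.
library(ggplot2)
library(gapminder)  # gapminder data: country, continent, year, lifeExp, ...
library(socviz)     # gss_sm survey data

# Life expectancy over time: one line per country, one panel per continent.
p_life <- ggplot(data = gapminder,
                 mapping = aes(x = year, y = lifeExp)) +
  geom_line(mapping = aes(group = country), color = "gray70") +
  geom_smooth(method = "loess", se = FALSE) +   # one smoother per panel
  facet_wrap(~ continent, ncol = 5) +           # five panels across
  labs(x = "Year",
       y = "Life expectancy (years)",
       title = "Life expectancy over time, by continent")

# Proportions instead of counts: group = 1 tells stat_count to compute
# prop over the whole data set rather than within each bar.
p_prop <- ggplot(data = gss_sm, mapping = aes(x = bigregion)) +
  geom_bar(mapping = aes(y = after_stat(prop), group = 1)) +
  labs(x = "Region", y = "Proportion of respondents")

print(p_life)
print(p_prop)
```

Replacing ncol = 5 with nrow = 5 stacks the panels vertically instead. The remaining steps (the log scale with dollar labels, facet_grid, fill, position, histograms, and density plots) are left for you to build on this pattern.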
Submit a Word document by Sunday at midnight with screenshots of your work and accompanying text. Explain what each image shows.
Submit screenshots of your successful installation.