library(tidyverse)
ggplot activity 1
Load packages and data
In this exercise you will be using one of the datasets that is built in to R. The most commonly used of these are:
- airquality
- AirPassengers
- mtcars
- iris
You can get a preliminary view of each of these by using any of the following commands in the Console window. Try them all out to see how they differ. I’ve shown them below for the mtcars dataset, but you should try them for all four datasets.
- print(mtcars)
- glimpse(mtcars)
- View(mtcars)
- ?mtcars
Because you will be using a built-in data set, you don’t need to worry about how to read in the data and set up a data frame (phew!).
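As a rough sketch of what trying these commands on another dataset might look like, here they are applied to airquality. Wrapping the dataset in head() is an addition of mine to keep the printed output short; it is not part of the commands listed above.

```r
library(tidyverse)  # glimpse() becomes available once the tidyverse is loaded

# Print only the first six rows (head() is added here to keep the output short).
print(head(airquality))

# One line per column, showing its type and the first few values.
glimpse(airquality)

# The two commands below are interactive, so they are left commented out.
# View(airquality)  # opens a spreadsheet-style viewer (in RStudio)
# ?airquality       # opens the help page that describes each column
```

Notice that print() shows the data row by row, while glimpse() turns it on its side so that every column fits on the screen.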
Plot relationships
Create a scatter plot
Next, pick three columns of interest from your chosen data set. You will basically follow the example just presented in class to create a scatterplot (see the handout for the sequence of code used in the example).
- First, set up a simple ggplot command for the data set.
- Second, set up the mapping, specifying the column for the x axis.
- Third, add to the mapping, specifying the column for the y axis.
- Next, add geom_point so that you get a scatterplot.
- If it makes sense, given your data set, use a third column as a way of adding color for various graph elements.
- Alternatively, specify a color for all the dots.
- Finally, add a title that explains the graph.
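The steps above can be sketched roughly as follows. This is only an illustration and not necessarily the same code as the class handout: the choice of mtcars, the columns wt, mpg, and cyl, the axis labels, and the title text are all assumptions made for the example.

```r
library(tidyverse)

# Steps 1-3: start a plot for the dataset and map one column to each axis.
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Step 4: draw one point per row, colored by a third column.
  # factor() makes ggplot treat the cylinder count as categories.
  geom_point(aes(color = factor(cyl))) +
  # Alternative: one fixed color for every dot, set outside aes():
  # geom_point(color = "steelblue") +
  # Final step: add a title and readable labels.
  labs(
    title = "Fuel economy versus weight in the mtcars data",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"
  )

p_scatter  # printing the object draws the plot
```

Putting color inside aes() ties it to a column of the data, while putting it outside aes() applies a single color to every point.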
No need to turn this in, but I’d like to see your final rendered document before the end of class! Add all of your R code to this document.
Plot distributions
Histogram
Use geom_histogram to create a histogram for a continuous variable in your chosen dataset. Interpret the result.
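A minimal sketch of a histogram, assuming the mpg column of mtcars as the continuous variable. The value bins = 10 is an arbitrary choice for the example; try a few values to see how the shape changes.

```r
library(tidyverse)

# A histogram needs only an x mapping; ggplot counts the rows in each bin.
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +  # bins = 10 is an example value, not a rule
  labs(title = "Distribution of miles per gallon in mtcars")

p_hist
```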
Density plot
Use geom_density to create a smooth density plot for another continuous variable in your chosen dataset. Interpret the result.
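A density plot can be sketched in much the same way. The hp column of mtcars is assumed here purely as an example of a continuous variable.

```r
library(tidyverse)

# geom_density() draws a smoothed estimate of the distribution of x.
p_density <- ggplot(mtcars, aes(x = hp)) +
  geom_density() +
  labs(title = "Density of horsepower in mtcars")

p_density
```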
Box plot
Use geom_boxplot to create a box plot for another continuous variable in your chosen dataset. Interpret the result.
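One possible sketch of a box plot, assuming the qsec column of mtcars as the continuous variable. Splitting the boxes by number of cylinders is an extra choice made for this example. The exercise itself only asks for one continuous variable.

```r
library(tidyverse)

# The continuous variable goes on the y axis; the x mapping is optional
# and, when given, draws one box per category.
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = qsec)) +
  geom_boxplot() +
  labs(
    title = "Quarter-mile time by number of cylinders in mtcars",
    x = "Cylinders",
    y = "Quarter-mile time (seconds)"
  )

p_box
```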
Plot categorical variables
Use geom_bar to create a bar plot of a categorical variable in your chosen dataset.
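A short sketch of a bar plot follows. Because mtcars stores everything as numbers, this example assumes cyl can stand in as a categorical variable once it is wrapped in factor(); a dataset with a true categorical column, such as Species in iris, would not need that step.

```r
library(tidyverse)

# geom_bar() counts the rows in each category, so only x is mapped.
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(
    title = "Number of cars by cylinder count in mtcars",
    x = "Cylinders",
    y = "Number of cars"
  )

p_bar
```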