Introduction to R
Working with data in R
In this lab you will be introduced to the core R functions we will use throughout the course. You will review and practice base R functions (functions available in R without loading extra packages) and functions in the tidy data framework (from the tidyverse packages).
We’ll use data from the ecotourism package to get a feel for where we are heading in the course.
First, install and load the required packages
Copy the following code into your R Console (bottom left window in RStudio).
Note: you will need to have already installed the tidyverse and sf packages.
install.packages("ecotourism")
Load libraries
library(tidyverse)
library(sf)
library(ecotourism)
Run some code to make charts and make a map!
Download one of the following Quarto documents
Eco Tourism Examples
Run ??ecotourism in the RStudio Console to learn more about these datasets.
Directions to render one of the example Quarto files above:
Place the .qmd file in your RStudio project activities folder.
Open it in RStudio by selecting the file in the Files window (bottom right corner of RStudio).
Select Render (blue arrow) on the menu bar above your Quarto doc in RStudio.
Explore a dataset
Learn more about manta_rays (note: you can try this with your chosen dataset)
Practice:
Inspecting data: str(), head(), summary() vs glimpse()
Filtering / sorting
Creating variables
Grouping / summarizing
Basic plotting (base + ggplot)
What are rows + columns?
In other words, what are the observations and variables?
Run the code below in a code chunk or the RStudio Console. Run one function at a time, e.g., str().
str(manta_rays)
head(manta_rays)
summary(manta_rays)
Make a note: what do you learn about the data from each of these functions?
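The rows and columns question can also be answered directly with dim(), nrow(), ncol(), and names(). Here is a minimal sketch on a small made-up data frame. The `toy` object and its values are hypothetical stand-ins, not the real manta_rays data.

```r
# Hypothetical stand-in for a dataset such as manta_rays (values are made up)
toy <- data.frame(
  year  = c(2017, 2018, 2018, 2019),
  month = c(1, 6, 12, 3)
)

dim(toy)    # number of rows, then number of columns
nrow(toy)   # rows = observations
ncol(toy)   # columns = variables
names(toy)  # variable names
```

Swapping `toy` for `manta_rays` (once the ecotourism package is loaded) should report the same kind of information for the real dataset.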
Base R filtering + sorting
recent <- manta_rays[manta_rays$year >= 2018, ] # Try running this code with a different year.
recent <- recent[order(recent$year, recent$month), ]
head(recent)
Simple summaries
Sightings of manta rays by year
table(manta_rays$year)
Plot (charts) in base R
plot(table(manta_rays$year), xlab = "Year", ylab = "Number of records")
Tidyverse
Learn about the variables in your dataset
manta_rays |> glimpse()
Filter and arrange rows
manta_rays |>
  filter(year >= 2018) |>                  # Try out different years
  arrange(year) |>                         # Arrange the table by year
  select(year, obs_lat, obs_lon, ws_id)    # Show only the selected variables (columns)
Create a new variable with data in the table
manta2 <- manta_rays |>
  mutate(
    # Southern Hemisphere seasons (December to February is summer)
    season = case_when(
      month %in% c(12, 1, 2)  ~ "summer",
      month %in% c(3, 4, 5)   ~ "autumn",
      month %in% c(6, 7, 8)   ~ "winter",
      month %in% c(9, 10, 11) ~ "spring",
      TRUE ~ NA_character_  # If an observation matches none of the conditions above, make the entry NA
    )
  )
manta2 # Display the resulting table
# Show the new variable
manta2 |>
  select(year, obs_lat, obs_lon, ws_id, season)
Group and summarize
Summarize by year
by_year <- manta2 %>%
  group_by(year) %>%
  summarize(n_records = n(), .groups = "drop")
by_year # Display the result
Summarize by season
by_season <- manta2 %>%
  filter(!is.na(season)) %>%
  group_by(season) %>%
  summarize(n_records = n(), .groups = "drop") %>%
  arrange(desc(n_records))
by_season
Learn more about any of the functions we used above by running ? followed by the function name. For example: ?summarize
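For comparison with the base R section earlier in the lab, the same kind of count by group can be sketched without the tidyverse. This is only an illustration on made-up values: the `toy` data frame and the `by_season_base` name are hypothetical, not part of the lab data.

```r
# Hypothetical stand-in for manta2 (values are made up)
toy <- data.frame(
  season = c("summer", "winter", "summer", NA, "spring", "summer")
)

# table() drops NA values by default, similar to filter(!is.na(season))
counts <- table(toy$season)

# Convert the table to a data frame and sort by descending count
by_season_base <- as.data.frame(counts, stringsAsFactors = FALSE)
names(by_season_base) <- c("season", "n_records")
by_season_base <- by_season_base[order(-by_season_base$n_records), ]
by_season_base
```

The group_by() and summarize() version tends to read more clearly once several summaries are needed at once, which is one reason the course leans on the tidyverse.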
Chart the result using the ggplot2 package
- Note: the ggplot2 package comes with the tidyverse. When you load the tidyverse library, several packages that are compatible with the tidy data approach are loaded (e.g., readr, dplyr, ggplot2).
ggplot(by_year, aes(x = year, y = n_records)) +
  geom_line() +
  geom_point() +
  labs(x = NULL, y = "Number of records", title = "Manta ray occurrence records by year")
ggplot(by_season, aes(x = season, y = n_records)) +
  geom_col() +
  labs(x = NULL, y = "Number of records", title = "Manta ray records by season")
What patterns do you observe in these charts?
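One thing you may notice in the season chart is that text categories are placed in alphabetical order. Here is a minimal sketch of one way to control that order with factor(), shown in base R on made-up values. The `season` vector and the `season_f` name are hypothetical.

```r
# Made-up season labels for illustration
season <- c("winter", "summer", "spring", "summer", "autumn")

# Character vectors are tabulated in alphabetical order
table(season)

# A factor with explicit levels keeps the order you choose
season_f <- factor(season, levels = c("summer", "autumn", "winter", "spring"))
levels(season_f)
table(season_f)
```

Applying the same factor() call to the season column inside mutate() before plotting should give bars in that order, since ggplot2 follows factor levels on a discrete axis.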