Thinking with Data
An Introduction to Data Analysis and R Programming
Welcome to DATA 121 Thinking with Data: An Introduction to Data Analysis and R Programming at Bard College.
Data analytics, the process of analyzing, revealing, interpreting, and visualizing information concealed inside data, is revolutionizing daily life. Data literacy is increasingly vital across disciplines, from the natural and social sciences to business and the arts and humanities. Data is used by organizations of all types, such as Amazon and Facebook; in the planning and management of cities; in the diagnosis of medical conditions and decisions about health insurance claims; in shaping financial decisions; and in academic research, from the analysis of historical texts to the study of Supreme Court deliberations and genomics data. In each of these contexts, data actively shapes decisions, priorities, and outcomes, raising questions about power, accountability, and the ethics of working with data. This course introduces techniques, implemented through programming in R, for transforming data into usable forms, performing descriptive and exploratory analysis, and producing basic visualizations of the results. These skills are developed through applied, project-based assignments using real-world datasets, with an emphasis on interpretation, critique, and storytelling. Prerequisite: a passing score on Part I of the Mathematics Placement.
Learning Objectives
Through analysis, coding, reading, writing, and presentation, by the end of the course you should be able to:
understand and engage with a full data science workflow, from inputting data to generating meaningful inferences from those data.
write your own code and employ open-source tools (primarily the software environment RStudio) to generate insights from diverse data sources.
synthesize the results of data analyses for public consumption through visual representation, narrative/written description, and verbal presentation.
apply critical thinking skills to evaluate existing data analyses and visualizations.
consider various ethical, political, and social issues in the world of data science practice.
References
The following websites were used in the construction of this workbook. My gratitude to the people who invested their time and effort in developing and offering these valuable resources to the public.
Smith College’s SDS 100
Rafael A. Irizarry’s Introduction to Data Science: Data Wrangling and Visualization with R
Mine Çetinkaya-Rundel’s Data Science in a Box
Ben Baumer’s Introduction to Data Science Course
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).