Exam 1 study guide
Additional study materials
- Practice exams and solutions
- 👷♀️ Fixed! Provided reference sheets (you will have a copy of this on the exam)
The first exam will test the learning objectives below, divided into topic areas. For each topic area, you should be able to do everything in the list that follows. You can think of this as a studying checklist!
- R Basics: general
- Assign an object to a valid variable name, list all variables in the environment and remove them
- Use packages and differentiate between installing and loading
- Get help with a function or package from R
- Read an error message or warning message and interpret it
- Return information about an object, including its structure, data type, and length.
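To practice the general basics, here is a minimal sketch in base R. The object name `exam_scores` is made up for illustration, and the install and help lines are commented out so nothing gets downloaded or opened when you run it.

```r
# Assign an object to a valid variable name
# (names such as 2scores or my-score would not be valid)
exam_scores <- c(88, 92, 79)

# Return information about the object
str(exam_scores)                  # structure
typeof(exam_scores)               # data type: "double"
n_scores <- length(exam_scores)   # length: 3

# List the variables in the environment, then remove one
ls()
rm(exam_scores)

# Installing downloads a package (once per machine);
# loading attaches it (once per session).
# install.packages("ggplot2")
# library(ggplot2)

# Get help with a function or a package
# ?mean
# help(package = "stats")
```

After running this, `exam_scores` is gone from the environment but `n_scores` is still there, which is a quick way to check that `rm()` did what you expected.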
- R Basics: vectors, operations, and subsetting
- Distinguish between an atomic vector and a list
- Create atomic vectors and determine their data types
- Differentiate between implicit and explicit coercion and coerce an object to another type
- Use arithmetic, comparison, and logical operators on vectors
- Explain how more complex data structures (data frame and matrix) are built from atomic vectors and create them
- Distinguish between `NA` and `NULL`
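The vector objectives can be rehearsed with a short base R sketch like the one below. All object names are invented for this example.

```r
# Atomic vectors hold one data type; lists can mix types
x_num   <- c(1.5, 2, 3)
x_mixed <- c(1, "a", TRUE)      # implicit coercion: everything becomes character
x_list  <- list(1, "a", TRUE)   # each element keeps its own type

typeof(x_num)     # "double"
typeof(x_mixed)   # "character"
typeof(x_list)    # "list"

# Explicit coercion: you ask for the conversion yourself
x_int <- as.integer(c("1", "2", "3"))

# Arithmetic, comparison, and logical operators work element by element
x_num * 2
x_num > 1.8
x_num > 1.8 & x_num < 3

# Subsetting with a logical vector
x_big <- x_num[x_num > 1.8]

# NA is a missing value that still takes up a slot; NULL is the absence of a value
len_na   <- length(c(1, NA, 3))     # 3
len_null <- length(c(1, NULL, 3))   # 2

# Data frames and matrices are built from atomic vectors
df <- data.frame(id = 1:3, score = x_num)
m  <- matrix(1:6, nrow = 2)
```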
- Data visualization: basics
- Describe how to create a plot with `ggplot2`, including the 3 basic requirements
- Distinguish between mapping and setting aesthetics
- Describe how `ggplot2` maps categorical variables to aesthetics and interpret the 3 common warnings people encounter in this process
- Interpret `ggplot()` calls with explicit or implicit arguments for data and mapping
- Recognize the geoms we discussed in class and select which to use for a given situation
- Differentiate between globally and locally defined mappings and recognize them in a given plot (or code)
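Here is a small sketch of the plotting basics, using the `mpg` dataset that ships with `ggplot2`. It assumes the 3 basic requirements are data, aesthetic mappings, and a geom. Check your class notes if we framed them differently.

```r
library(ggplot2)

# Explicit argument names for data and mapping
p_explicit <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# The same plot with implicit (positional) arguments
p_implicit <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Mapping an aesthetic: color depends on a variable, so it goes inside aes()
p_mapped <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class))

# Setting an aesthetic: color is a fixed value, so it goes outside aes()
p_set <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(color = "blue")

# Global mappings (in ggplot()) apply to every layer;
# local mappings (in a geom) apply to that layer only
p_local <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = drv)) +
  geom_smooth()
```

In `p_local`, only the points are colored by `drv`, so `geom_smooth()` draws a single line. Moving `color = drv` into the `ggplot()` call would give one smooth line per group.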
- Data visualization: layers
- Use the `position` argument to modify the position of the geoms in `geom_bar()` or `geom_point()`
- Describe `stat="identity"` and describe the default transformations for `geom_bar()`, `geom_histogram()`, and `geom_smooth()`
- Set the smoothing method for `geom_smooth()` and the bins or binwidth for `geom_histogram()`
- Facet a plot with `facet_wrap()` and `facet_grid()`
- Modify axis, legend, and plot labels with `labs()`
- Apply a given theme to a plot and adjust the base font size or family.
- Describe scales and recognize the outcome of adding a scale layer
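The layer objectives can be tried out with a sketch like this one. It uses `ggplot2`'s built-in `mpg` data plus a tiny made-up data frame (`totals`). The particular theme, scale, and labels are arbitrary choices for illustration.

```r
library(ggplot2)

# position: dodge bars side by side; jitter overlapping points
p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")
p_jitter <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point(position = "jitter")

# geom_bar() counts rows by default; stat = "identity" plots the y values as given
totals <- data.frame(group = c("a", "b", "c"), total = c(3, 5, 2))  # made-up data
p_identity <- ggplot(totals, aes(x = group, y = total)) +
  geom_bar(stat = "identity")

# geom_histogram() bins the data by default; set bins or binwidth yourself
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Smoothing method, facets, a scale layer, labels, and a theme
p_full <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(~ drv) +
  scale_x_continuous(breaks = 2:7) +
  labs(
    title = "Highway mileage by engine size",
    x = "Displacement (L)",
    y = "Highway mpg"
  ) +
  theme_minimal(base_size = 14)
```

Swapping `facet_wrap(~ drv)` for `facet_grid(drv ~ cyl)` is a good way to see how the two faceting functions lay out panels differently.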
- Data importing
- Load the `tidyverse`, recognize the included packages, and critique code for redundant loading
- Construct a tidy dataset and critique whether a given dataset is tidy
- Use the map function from the `purrr` package
- Create a tibble and distinguish between a tibble and a data frame
- Use `readr` to read delimited files and determine whether `readr` can read files of a given type
- Use `col_types` to add column specifications and explain how `readr` guesses column types without it
- Solve the 3 most common importing problems we discussed in class
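A sketch for the importing objectives follows. To keep it self-contained, the CSV is an inline string (made-up data) wrapped in `I()`, which assumes readr 2.0 or later. With a real file you would pass the file path instead.

```r
# library(tidyverse) would attach all three of these at once, so calling
# library(tidyverse) and then library(readr) again would be redundant loading.
library(tibble)
library(readr)
library(purrr)

# Create a tibble
scores_tbl <- tibble(id = 1:3, score = c(88.5, 92, 79))

# A small delimited "file" as literal text (made-up data)
csv_text <- "id,score,passed\n1,88.5,TRUE\n2,92,TRUE\n3,79,FALSE"

# Without col_types, readr guesses each column's type from the values
guessed <- read_csv(I(csv_text), show_col_types = FALSE)

# With col_types, you state the column specification yourself
specified <- read_csv(
  I(csv_text),
  col_types = cols(
    id = col_integer(),
    score = col_double(),
    passed = col_logical()
  )
)

# map() applies a function to each element and always returns a list
means <- map(list(1:3, 4:6), mean)
```

Comparing `guessed$id` (guessed as double) with `specified$id` (integer) shows what the column specification changes.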
- Data wrangling
- Describe the common structure of `dplyr` functions (aka verbs)
- Combine `dplyr` functions with the pipe operator to solve complex problems
- Manipulate rows with `filter()`, `arrange()`, and `distinct()`
- Manipulate columns with `mutate()`, `select()`, and `rename()`
- Group and summarise data with `group_by()` and `summarise()`
- Evaluate `dplyr` functions that include the common arguments we covered in class
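The wrangling verbs can be practiced on a tiny made-up tibble like the one below. The sketch assumes the `%>%` pipe. The native `|>` pipe would work the same way here.

```r
library(dplyr)

# Made-up data for illustration
scores <- tibble(
  student = c("a", "b", "c", "d"),
  section = c(1, 1, 2, 2),
  score   = c(90, 70, 85, 95)
)

# Every verb takes a data frame first and returns a data frame,
# which is what lets them chain together with the pipe
by_section <- scores %>%
  filter(score >= 80) %>%                            # keep rows
  mutate(pct = score / 100) %>%                      # add a column
  group_by(section) %>%                              # define groups
  summarise(mean_score = mean(score), n = n())       # one row per group

# Row verbs
sorted   <- scores %>% arrange(desc(score))
sections <- scores %>% distinct(section)

# Column verbs
renamed <- scores %>%
  select(student, score) %>%
  rename(points = score)
```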
- Sampling distribution
- Explore a dataset with an appropriate figure (histogram, boxplot, scatterplot) and summary statistics appropriate for the distribution.
- Recognize uniform and Gaussian probability distributions in a plot or equation and use R’s functions `d*()`, `p*()`, and `r*()` to work with these distributions
- Explain the difference between the parameter and the parameter estimate
- Construct the sampling distribution of a parameter estimate with `infer` and quantify the spread of the distribution with a confidence interval.
- Understand the difference between constructing a confidence interval with the standard error method vs. the percentile method.
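Here is a rough sketch for the sampling distribution objectives. The data are simulated (so the "true" parameter is known to be 170), and the variable names, the 1000 bootstrap reps, and the 95% level are choices made for this example, not requirements from class.

```r
library(infer)
set.seed(42)

# d*() gives density, p*() gives cumulative probability, r*() draws random values
dnorm(0)        # height of the standard Gaussian curve at 0
pnorm(0)        # 0.5: half of the Gaussian lies below its mean
punif(0.25)     # 0.25 for a uniform distribution on [0, 1]
runif(3)        # three random draws from that uniform

# Simulated sample: the parameter (true mean) is 170
sample_df <- data.frame(height = rnorm(50, mean = 170, sd = 10))

# Parameter estimate: the sample mean
x_bar <- sample_df %>%
  specify(response = height) %>%
  calculate(stat = "mean")

# Bootstrap sampling distribution of the sample mean
boot_dist <- sample_df %>%
  specify(response = height) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# Percentile method: take the middle 95% of the bootstrap estimates
ci_percentile <- get_confidence_interval(boot_dist, level = 0.95, type = "percentile")

# Standard error method: point estimate plus or minus a multiple of the bootstrap SE
ci_se <- get_confidence_interval(boot_dist, level = 0.95, type = "se",
                                 point_estimate = x_bar)
```

The two intervals are usually close when the bootstrap distribution is roughly symmetric, which is worth checking with `visualize(boot_dist)`.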
- Hypothesis testing
- Given a set of data, implement the 3-step hypothesis testing framework nonparametrically: (1) Pose a null hypothesis, (2) quantify how likely a given pattern of results is under the null, and (3) determine whether to reject the null (conceptually and with the `infer` framework).
- Given a theoretical distribution (e.g. t), implement the 3-step hypothesis testing framework parametrically.
- Given an observed correlation, determine whether it is positive, negative, or zero (no correlation).
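The 3-step framework can be sketched as follows on simulated two-group data. The data, the group labels, the choice of a difference in means as the test statistic, and the 0.05 cutoff are all assumptions made for this example.

```r
library(infer)
set.seed(1)

# Simulated data: two groups of 20 (made up for illustration)
dat <- data.frame(
  group = factor(rep(c("a", "b"), each = 20)),
  y     = c(rnorm(20, mean = 0), rnorm(20, mean = 1))
)

# Observed pattern: difference in group means
obs_diff <- dat %>%
  specify(y ~ group) %>%
  calculate(stat = "diff in means", order = c("a", "b"))

# Step 1: pose the null hypothesis (y is independent of group)
# Step 2: build the null distribution by shuffling group labels
null_dist <- dat %>%
  specify(y ~ group) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("a", "b"))

p_nonparametric <- get_p_value(null_dist, obs_stat = obs_diff,
                               direction = "two-sided")

# Step 3: decide whether to reject the null (0.05 is an assumed cutoff)
reject_null <- p_nonparametric$p_value < 0.05

# Parametric version: rely on the theoretical t distribution instead of shuffling
p_parametric <- t.test(y ~ group, data = dat)$p.value

# Sign of a correlation: 1 is positive, -1 is negative, 0 is no correlation
r <- cor(c(1, 2, 3, 4), c(2, 4, 5, 9))
sign(r)
```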