ggplot(data, mapping=aes()) +
geom_object()
Lesson 4: Introduction to Plots
1 Introduction
Welcome to lesson four, where you will learn the basics of graphing using a package called ggplot2. ggplot2 is an excellent resource for creating many types of graphs, including histograms, box plots, scatterplots, and more. We’ll start by introducing how ggplot works and then try creating a histogram.
See the home page for more details on how to use this tutorial and for troubleshooting tips!
1.1 Dataset
We will work with the physiology dataset, called data, that you have already seen.
Type the word data to recall what it contains.
When you examine the data table, you can see that some rows are missing data for heart rate. In R, missing data are indicated by the letters ‘NA’. ggplot2 leaves these missing values out when making a graph and prints a warning saying how many rows it removed. Once you collect your own data, this is the way to enter any missing data into your data table.
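To see how many values are missing before you plot, a short check like the one below can help. This is only a sketch: the data frame toy and every value in it are made up for illustration (the real lesson dataset is data), and only the column names heart_rate and RQ come from this lesson.

```r
# Hypothetical stand-in for the lesson's `data`; every value here is made up.
toy <- data.frame(
  heart_rate = c(62, 70, NA, 85, 91, 77, 68, NA, 102, 74),
  RQ         = c(0.82, 0.91, 0.78, 0.88, 0.95, 0.80, 0.86, 0.90, 0.84, 0.79)
)

# is.na() returns TRUE for each missing value and FALSE otherwise.
missing_hr <- is.na(toy$heart_rate)

# Summing the TRUE/FALSE values counts how many heart rates are missing.
n_missing <- sum(missing_hr)
n_missing  # 2, because the toy column contains two NA entries
```

Running the same two lines on data$heart_rate should tell you how many rows a heart rate graph will leave out.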
2 GGPlot graphing
There are 3 main components of ggplot:
Data (data): This is your dataset.
Aesthetics (mapping=aes()): This allows you to define your variables and define how you want your graph to look (color, size, shape, etc.).
Geometric Objects (geom_______): Define the type of plot (bar plot, scatter plot, histogram, etc.).
The basic structure of ggplot() is as follows:
ggplot(data, mapping=aes()) +
geom_object()
You’ll notice that each of the three components above is included in this structure. Also notice that different lines (or properties) are connected using a +.
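The way the three components fit together can be sketched by building a plot in two steps. The example assumes ggplot2 is installed and uses a made-up data frame called toy in place of the lesson's data; the object names base_plot and full_plot are also just illustrative.

```r
library(ggplot2)

# Hypothetical stand-in for the lesson's `data`; the values are made up.
toy <- data.frame(heart_rate = c(62, 70, NA, 85, 91, 77, 68, NA, 102, 74))

# Data and aesthetics only: printing this draws empty axes,
# because no geom has been added yet.
base_plot <- ggplot(toy, mapping = aes(x = heart_rate))

# Adding a geom with + supplies the third component.
# Keep the + at the end of the line; if it starts the next line,
# R treats the first line as a finished command.
full_plot <- base_plot +
  geom_histogram(bins = 5)

# Printing the object draws the plot.
full_plot
```

Storing the plot in a variable is optional. Writing everything as one chained expression, as the lesson does, gives the same graph.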
2.1 Example: Histogram
Let’s look at an example to see how this works. We’ll start by creating a histogram. A histogram shows the distribution of data values.
Try running the code below.
This will create a histogram of the heart rate values. On the x-axis are intervals (bins) that divide up the range of the data values. On the y-axis is the count, which is the number of data values that fall into each bin.
Notice how the code is set up:
ggplot(data, mapping=aes(x=heart_rate)) +
geom_histogram(bins=5)
ggplot(data ...): Specify the data right after the first parenthesis.
aes(x = heart_rate): Specify the x variable (independent variable) and y variable (dependent variable) in the aes() argument. A histogram only requires an independent variable, so we did not specify the y variable.
geom_histogram(bins=5): Specify histogram as your chosen geom. You can also decide how many bins you’d like with the bins=5 argument. Try changing the 5 to a different number and running the code again to see what happens!
Chain together the function ggplot() with the geom geom_histogram() using a +.
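If you would like to see what changing bins does in numbers rather than in a picture, one option is ggplot2's layer_data() function, which returns the values computed for a layer. The sketch below assumes ggplot2 is installed and uses a made-up data frame called toy instead of the lesson's data.

```r
library(ggplot2)

# Hypothetical stand-in for the lesson's `data`; the values are made up.
toy <- data.frame(heart_rate = c(62, 70, NA, 85, 91, 77, 68, NA, 102, 74))

# The same histogram built with two different bin settings.
hist_5  <- ggplot(toy, mapping = aes(x = heart_rate)) +
  geom_histogram(bins = 5)
hist_10 <- ggplot(toy, mapping = aes(x = heart_rate)) +
  geom_histogram(bins = 10)

# layer_data() returns one row per bin for the histogram layer.
# Expect a warning here: the rows with NA heart rates are removed.
bins_5  <- layer_data(hist_5)
bins_10 <- layer_data(hist_10)

# Lower edge, upper edge, and count for each of the bins.
bins_5[, c("xmin", "xmax", "count")]

# The number of rows matches the bins argument.
nrow(bins_5)
nrow(bins_10)

# The counts add up to the number of non-missing heart rates.
sum(bins_5$count)
sum(!is.na(toy$heart_rate))
```

More bins means narrower intervals, so each bar tends to hold fewer values.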
2.1.1 Try it out
Your turn to give it a try! Make a histogram of the RQ values for our data.
Here’s the basic structure, followed by a more detailed version.
ggplot(data, mapping=aes(___________))+
geom_histogram(______________)
ggplot(data, mapping=aes(x=______))+
geom_histogram(bins=_____)
Make sure you use the proper capitalization and spelling for the x variable, exactly as it is shown in our data. The completed code looks like this:
ggplot(data, mapping=aes(x=RQ))+
geom_histogram(bins=5)
2.2 Customizing a plot
You can change how a plot looks by adding the “color” and “fill” attributes to your geom_object(). color indicates the outline color and fill specifies the color that fills the inside of the shape. Notice that we put the color name inside quotation marks. This is because the colors are character strings (i.e. text).
You can adjust the labels of the plot by chaining on the labs() function:
labs(x="x label", y="y label", title="title")
Again, we enclose the labels in quotation marks.
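As an illustration of these options on the heart rate histogram, here is a small sketch. It assumes ggplot2 is installed, and the data frame toy, the color choices, and the label text are all made up for the example.

```r
library(ggplot2)

# Hypothetical stand-in for the lesson's `data`; the values are made up.
toy <- data.frame(heart_rate = c(62, 70, NA, 85, 91, 77, 68, NA, 102, 74))

styled_plot <- ggplot(toy, mapping = aes(x = heart_rate)) +
  # color sets the bar outlines and fill sets the inside of the bars.
  # Both are quoted; without quotes R would look for an object named black.
  geom_histogram(bins = 5, color = "black", fill = "skyblue") +
  # labs() replaces the default axis labels and adds a title.
  labs(x = "Heart rate", y = "Count", title = "Histogram of heart rate (toy data)")

# Printing the object draws the plot.
styled_plot
```

To see the full list of color names that R recognizes, you can run colors() in the console.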
2.3 Try it out
Your turn to give it a try! Make a histogram of the RQ values for our data. As an extra challenge, add color and labels to your graph.
Here’s the basic structure, followed by a more detailed version and then the completed code.
ggplot(data, mapping=aes(___________))+
geom_histogram(______________)+
labs(_____________)
ggplot(data, mapping=aes(x=______))+
geom_histogram(bins=_____, color="_____", fill="_______")+
labs(x="______", y="_______", title="_______")
ggplot(data, mapping=aes(x=RQ))+
geom_histogram(bins=5, color="pink", fill="darkorange")+
labs(x="Respiratory Quotient", y="Count", title="Histogram of Respiratory Quotient for Physiology Data")
Great work!
2.4 Other ggplot resources
There are lots of resources online where you can learn about making plots with ggplot.
3 Congratulations!
Hooray! You have completed lesson four. You have learned the basics of creating a plot using ggplot2. Lesson five will teach you how to make a scatterplot and a boxplot.