Lesson 4: Introduction to Plots

1 Introduction

Welcome to lesson four, where you will learn the basics of graphing using a package called ggplot2. GGPlot is an excellent resource for creating many types of graphs, including histograms, box plots, scatterplots and more. We’ll start by introducing how ggplot works and then try creating a histogram.

See the home page for more details on how to use this tutorial and for troubleshooting tips!

1.1 Dataset

We will work with the physiology dataset, called data, that you have already seen.

Type the word data to recall what it contains.

When you examine the data table, you can see that there are some rows that are missing data for heart rate. In R, missing data are indicated by the letters ‘NA’. R will ignore these missing values when making a graph. Once you collect your own data, this is the way to enter any missing data into your data table.

2 GGPlot graphing

There are 3 main components of ggplot:

  • Data (data): This is your dataset

  • Aesthetics (mapping=aes()): This allows you to define your variables and define how you want your graph to look (color, size, shape, etc)

  • Geometric Objects (geom_ ______): Define the type of plot (bar plot, scatter plot, histogram, etc)

The basic structure of ggplot() is as follows. You’ll notice that each of the three components above are included in this structure. Also notice that different lines (or properties) are connected using a +.

ggplot(data, mapping=aes()) +
  geom_object()

2.1 Example: Histogram

Let’s look at an example to see how this works. We’ll start by creating a histogram. A histogram shows the distribution of data values.

Try running the code below.

This will create a histogram of the heart rate values. On the x-axis are intervals (bins) representing the different possible data values. On the y-axis are the frequencies with which different data values are present in the data set (once, twice, etc.).

Notice how the code is set up:

ggplot(data, mapping=aes(x=heart_rate)) +
  geom_histogram(bins=5)
  • ggplot(data ...): Specify the data right after the first parenthesis

  • aes(x = heart_rate): Specify the x variable (independent variable) and y variable (dependent variable) in the aes() argument. A histogram only requires an independent variable, so we did not specify the y variable.

  • geom_histogram(bins=5): Specify histogram as your chosen geom. You can also decide how many bins you’d like with the bins=5 argument. Try changing the 5 to a different number and running the code again, see what happens!

  • Chain together the function ggplot() with the geom geom_histogram() using a +.

2.1.1 Try it out

Your turn to give it a try! Make a histogram of the RQ values for our data.

Here’s the basic structure. Click the next hint for more detailed structure.

ggplot(data, mapping=aes(___________))+
      geom_histogram(______________)
ggplot(data, mapping=aes(x=______))+
      geom_histogram(bins=_____)

Make sure you use the proper capitalization and spelling for the x variable, exactly as it is shown in our data.

ggplot(data, mapping=aes(x=RQ))+
      geom_histogram(bins=5)

2.2 Customizing a plot

  • You can change how a plot looks by adding the “color” and “fill” attributes to your geom_object(). color indicates the outline color and fill specifies the background color. Notice that we put the color name inside quotation marks. This is because the colors are character strings (i.e. lists of letters).

  • You can adjust the labels of the plot by chaining on the labs() function: labs(x="x label", y="y label", title="title") Again, we enclose the labels in quotation marks.

2.3 Try it out

Your turn to give it a try! Make a histogram of the RQ values for our data. As an extra challenge, add color and labels to your graph.

Here’s the basic structure. Click the next hint for more detailed structure.

ggplot(data, mapping=aes(___________))+
      geom_histogram(______________)+
      labs(_____________)
ggplot(data, mapping=aes(x=______))+
      geom_histogram(bins=_____, color="_____", fill="_______")+
      labs(x="______", y="_______", title="_______")
ggplot(data, mapping=aes(x=RQ))+
      geom_histogram(bins=5, color="pink", fill="darkorange")+
      labs(x="Respiratory Quotient", y="Count", title="Histogram of Respiratory Quotient for Physiology Data")

Great work!

2.4 Other ggplot resources

There are LOTS of resources online where you can learn about making plots with ggplot. Here are a few to start with:

3 Congratulations!

Hooray! You have completed lesson four. You have learned the basics of creating a plot using GGPlot. Lesson five will teach you how to make a scatterplot and boxplot.