AQI Part 2 - Custom Graphs
Introduction
Welcome to Part 2 of the Air Quality Data Science Tutorials! If you have not done Part 1 yet, click here to complete that tutorial first - it will provide helpful background info about air quality and the basics of data science.
Part 2 will build off of the graphs created in Part 1 and allow you to customize them with your own locations as well as create different types of graphs.
Our Data Set
This tutorial uses air quality data from the EPA ranging from 2000-2023, and has data on all counties in the United States. The Air Quality Index (AQI) is calculated by looking at various pollutants including Particle Pollution (PM2.5 and PM10), Carbon Monoxide, Lead, Nitrogen Dioxide, Ozone, and Sulfur Dioxide. Here is a link to the data and here is a link to more information about the data.
Our data gives the highest recorded AQI value on a given day in a given location.
The data set is very large, so we will not view it now, but there will be an option to view individual counties later in the tutorial.
Customizing your graphs!
In this tutorial, you can create graphs with various customizations. Here are your options:
Choose one of the options above, or click “continue” to do all of them!
Create a scatter plot of AQI over time
First, let’s create a scatterplot! We’re going to graph the number of days per year when the AQI was above a certain threshold.
In other words, we’ll graph the number of days when the AQI was red (or orange or green or whatever level you choose). Then, we’ll see how this data changes over time, from 2000 to 2023.
To do this, we’ll use a scatterplot. A scatterplot is a collection of points, where each point represents a data value.
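To make the counting step concrete, here is a minimal Python sketch of how the points on such a scatterplot could be computed. The sample rows, the function name days_per_year, and the date format are assumptions made for illustration; they are not the tutorial's own code or real EPA values. The range 151 to 200 is the standard EPA range for the Unhealthy (red) level.

```python
from collections import Counter

# Hypothetical sample rows: (date, county, highest AQI recorded that day).
# These values are made up for illustration; they are not EPA data.
sample_rows = [
    ("2000-08-01", "Multnomah", 160),
    ("2000-08-02", "Multnomah", 155),
    ("2000-08-03", "Multnomah", 90),
    ("2001-09-10", "Multnomah", 170),
    ("2001-09-11", "Los Angeles", 180),
]


def days_per_year(rows, county, low, high):
    """Count, for each year, the days whose AQI falls between low and high."""
    counts = Counter()
    for day, name, aqi in rows:
        if name == county and low <= aqi <= high:
            year = int(day[:4])  # dates are assumed to look like YYYY-MM-DD
            counts[year] += 1
    return dict(sorted(counts.items()))


# Each (year, count) pair would become one point on the scatterplot.
unhealthy_days = days_per_year(sample_rows, "Multnomah", 151, 200)
print(unhealthy_days)
```

With these made-up rows, the result has one entry per year, so the scatterplot would show one point per year.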
Stop and Think
Before we start, what do you think the graph might look like? Discuss these questions with your neighbor:
In Portland, do you think the air quality is getting better or worse since 2000?
If we only look at Unhealthy AQI levels (red) in Portland, do you think there will be more or fewer total days of unhealthy air quality since 2000?
Step 1: Choose your county
First, you’ll need to choose a county that you want to graph.
Here are your options, organized by state. Click through them and find the county you’d like to graph. You’ll need to remember the exact spelling AND capitalization.
Step 2: Add your county name to the code
Now, you’ll tell the computer which county you chose.
Important: Make sure you spell the county name exactly as you found it in the table above, and be sure to use proper capitalization as well.
Type your county name in the code chunk below. Replace the red line with the county name, but DON’T delete the quotes "".
Press Run Code
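Why do spelling and capitalization matter so much? The computer compares your text to the county names character by character. Here is a small Python sketch of that idea. The short county_table list, the variable name myCounty, and the helper find_county are assumptions for illustration, not the tutorial's actual code.

```python
# Hypothetical subset of the county table; the real table is much longer.
county_table = ["Multnomah", "Los Angeles", "King"]


def find_county(name, table):
    """Return the exact table spelling for name, or None if nothing matches.

    A name that differs only in capitalization is mapped back to the
    spelling used in the table, so the typo can be corrected.
    """
    if name in table:
        return name
    for candidate in table:
        if candidate.lower() == name.lower():
            return candidate
    return None


myCounty = "multnomah"  # wrong capitalization on purpose

print(myCounty in county_table)             # False: the exact match fails
print(find_county(myCounty, county_table))  # Multnomah: the spelling to use
```

Keeping the quotes matters for the same reason: they tell the computer that the county name is text, not the name of a variable.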
Step 3: Choose which AQI level you want to graph
Now, you can choose which air quality level you’d like to graph. You can focus on any of the following:
- “Good”
- “Moderate”
- “Unhealthy for Sensitive Groups”
- “Unhealthy”
- “Very Unhealthy”
- “Hazardous”
Type the level you choose in the code chunk. Replace the red line and be sure to keep the quotes.
Press Run code
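Each level name stands for a range of AQI values. The sketch below, written in Python, looks up the level for a single AQI value using the standard EPA ranges. The names AQI_LEVELS and aqi_level are made up for this example and are not part of the tutorial's code.

```python
# Standard EPA AQI ranges (both ends included).
# "Hazardous" has no upper limit in this sketch.
AQI_LEVELS = [
    ("Good", 0, 50),
    ("Moderate", 51, 100),
    ("Unhealthy for Sensitive Groups", 101, 150),
    ("Unhealthy", 151, 200),
    ("Very Unhealthy", 201, 300),
    ("Hazardous", 301, float("inf")),
]


def aqi_level(aqi):
    """Return the level name for a whole-number AQI value."""
    for name, low, high in AQI_LEVELS:
        if low <= aqi <= high:
            return name
    raise ValueError(f"AQI value {aqi} is not in any level")


print(aqi_level(42))   # Good
print(aqi_level(175))  # Unhealthy
```

The level names in this sketch are spelled and capitalized exactly like the options listed above.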
Step 4: Choose a color
Now you can choose a color for your points. Here are a few options. Otherwise go to this link for more options.
- “red”
- “green”
- “blue”
- “orange”
- “pink”
- “purple”
- “yellow”
Type your color name in the code chunk. Replace the red line. Use all lowercase and be sure to keep the quotes "".
Press run code
Step 5: Create the graph!
Now you can create your graph! Click through the tabs at the top of the code editor to see more options for customizing your graph.
Press run code
Want to add a line of best fit?
Press run code.
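A line of best fit summarizes the overall trend in the points. One common way to find it is the least-squares method, sketched below in Python. This assumes a straight-line fit; the tutorial's own graphing code may compute its line differently. The function name best_fit_line and the sample numbers are invented for illustration.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the x values, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are the same")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example: days per year at the chosen AQI level.
years = [2000, 2001, 2002, 2003]
day_counts = [10, 8, 6, 4]

slope, intercept = best_fit_line(years, day_counts)
print(f"The count changes by about {slope:.1f} days per year")
```

A negative slope means the number of days at that level is going down over time; a positive slope means it is going up.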
Want to overlay 2 scatterplots and compare 2 different counties?
Replace the red lines with your county names! Make sure you enclose them in quotes "" and spell their names properly (refer back to this table).
Press run code
Example:
Now, let’s make the graph again:
Press Run code
Run Code button not working: Make sure you wait until the green circle at the very top of this page says “Ready!” We have some code running in the background and the green circle indicates it has finished running.
Error message:
- Did you accidentally delete or change some code? Click the refresh button at the top of the code cell and try running it again.
- Did you run all the previous code chunks in this section? Return to Step 1 and make sure you pressed “Run code” in every step.
- Did you spell and capitalize all the variables correctly, as specified in each step? If not, correct it and press “run code” in that step.
Make a tile plot of any county!
What if you wanted to look at the air quality data for another place (other than Portland and LA, which were analyzed in Part 1)? You’ll be able to do that below.
If you’d like a challenge, go to the final section to create a tile plot with multiple different variables you can adjust!
Step 1: Choose a county
First, you’ll need to choose a county that you want to graph.
Here are your options, organized by state. Click through them and find the county you’d like to graph. You’ll need to remember the exact spelling AND capitalization.
Step 2: Add your county name to the code
Now, you’ll tell the computer which county you chose.
Important: Make sure you spell the county name exactly as you found it in the table above, and be sure to use proper capitalization as well.
Type your county name in the code chunk below. Replace the red line with the county name, but DON’T delete the quotes "".
Press Run Code
Step 3: Make a tile plot
Now we can graph! Click through the tabs at the top of the code editor to see more options for customizing your graph.
Press run code
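A tile plot is a grid where each tile is one day: the year picks the row, the day of the year picks the column, and the AQI value decides the tile's color. The Python sketch below shows one way the grid positions could be worked out. The sample rows and the function name tile_cells are assumptions for illustration, not the tutorial's code.

```python
from datetime import date

# Hypothetical rows for one county: (date, highest AQI that day).
sample_rows = [
    ("2000-01-01", 42),
    ("2000-12-31", 55),
    ("2001-03-01", 120),
]


def tile_cells(rows):
    """Map each (year, day of year) grid position to that day's AQI value."""
    cells = {}
    for iso_date, aqi in rows:
        day = date.fromisoformat(iso_date)
        day_of_year = day.timetuple().tm_yday  # 1 for January 1
        cells[(day.year, day_of_year)] = aqi
    return cells


cells = tile_cells(sample_rows)
for (year, day_of_year), aqi in sorted(cells.items()):
    print(year, day_of_year, aqi)
```

Note that December 31, 2000 lands in column 366 because 2000 was a leap year.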
Want to combine county data?
Replace the red lines with your county names! Make sure you enclose them in quotes "" and spell their names properly (refer back to this table).
Press run code
Example:
Now, let’s make the graph again:
Run Code button not working: Make sure you wait until the green circle at the very top of this page says “Ready!” We have some code running in the background and the green circle indicates it has finished running.
Error message:
- Did you accidentally delete or change some code? Click the refresh button at the top of the code cell and try running it again.
- Did you run all the previous code chunks in this section? Return to the code editor where you chose your county name. Make sure you clicked the “Run code” button.
- Did you spell and capitalize the County name correctly, exactly as it was found in the table of “States and County Names”?
Great work! Want to create a more complicated tile plot? Go to the Challenge section at the very end of this tutorial to learn how to adjust more variables!
Otherwise, click “continue” to view your raw data.
View raw data
After graphing, you may wonder why the graph looks the way it does. Here, you can look at your raw data.
Choose a county
First, you’ll need to choose the county whose data you want to view. See the section Choose a county for a list of county options.
After choosing a county, tell the computer which county you chose.
Important: Make sure you spell the county name exactly as you found it in the table above, and be sure to use proper capitalization as well.
Type your county name in the code chunk below. Replace the red line with the county name, but DON’T delete the quotes "". Keep all the code below the red line as well.
Press Run Code
Raw data
Click the blue box to see raw data for the county you chose.
Press run code to see your raw data!
It might be very large, so you can click the blue rectangle again to close the data.
Note: If this code doesn’t work, it’s because you haven’t chosen a county in a previous step. Return to Choose a county and run the code to choose your county.
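Viewing raw data for one county amounts to keeping only the rows that belong to that county. Here is a rough Python sketch of that filtering step. The column names ("county", "date", "aqi"), the sample rows, and the function name rows_for_county are hypothetical and are not taken from the tutorial's data set.

```python
# Hypothetical rows; the real data set has many more rows and columns.
all_rows = [
    {"county": "Multnomah", "date": "2000-01-01", "aqi": 42},
    {"county": "Los Angeles", "date": "2000-01-01", "aqi": 110},
    {"county": "Multnomah", "date": "2000-01-02", "aqi": 38},
]


def rows_for_county(rows, county, limit=None):
    """Return the rows for one county, optionally only the first few."""
    matches = [row for row in rows if row["county"] == county]
    return matches if limit is None else matches[:limit]


my_rows = rows_for_county(all_rows, "Multnomah")
print(len(my_rows), "rows found")
for row in rows_for_county(all_rows, "Multnomah", limit=1):
    print(row)
```

If a filter like this returns no rows at all, the most likely cause is a county name that does not match the table exactly.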
Click “continue” for a coding challenge.
Challenge: Tile plot with more variable options
Want to create a more complex tile plot? In this section, you’ll have the opportunity to customize the x axis, legend, and plot labels.
Step 1: Choose your location
First, you’ll tell the computer which county you’d like to graph.
Important: Make sure you spell the county name exactly as you found it in the table above, and be sure to use proper capitalization as well.
Type the county name in the code chunk below. Replace the red line with the county name, but DON’T delete the quotes "".
Press run code
Step 2: Choose your variables
Next, you can decide whether you want to view the data by month or by day of the year. In the graph we made previously, we looked at the AQI for each day of the year. But, if you’d like, you can combine the data and look at it by month.
Choose which one you prefer and press run code below.
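Viewing the data by month means combining all the days in a month into a single value. The Python sketch below keeps the highest AQI of each month. That choice is an assumption made for this example: the tutorial's code may combine days another way, such as by averaging. The sample rows and the function name monthly_max are also invented.

```python
from datetime import date

# Hypothetical daily rows for one county: (date, highest AQI that day).
daily_rows = [
    ("2000-01-01", 42),
    ("2000-01-15", 80),
    ("2000-02-01", 30),
]


def monthly_max(rows):
    """Combine daily values into one value per (year, month): the highest."""
    result = {}
    for iso_date, aqi in rows:
        day = date.fromisoformat(iso_date)
        key = (day.year, day.month)
        # Keep the larger of the stored value and today's value.
        result[key] = max(result.get(key, aqi), aqi)
    return result


by_month = monthly_max(daily_rows)
print(by_month)
```

A monthly view has only 12 columns per year, so it is easier to read, but it hides the day-to-day changes.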
Step 3: Choose your category levels
The EPA has defined category levels for air quality (good, moderate, unhealthy for sensitive groups, unhealthy, very unhealthy, hazardous). But maybe you want to adjust those categories. For example, you might be extra sensitive to air pollution, so you want the “good” category to be smaller. Here, you have the option to define those categories.
If you want to keep the categories the same, skip this step!
Below, you can change the numbers for each category. The number you enter is where that category starts: the category includes every AQI value from that number up to the start of the next category. For example, right now hazardous is anything equal to or greater than 301, and very unhealthy is 201 to 300.
Replace the current numbers with your chosen AQI values.
Press run code
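To see how changing a starting number changes the categories, here is a small Python sketch. Each category is described only by the number where it starts. The names LABELS, EPA_STARTS, and make_classifier are invented for this example, and the custom numbers are just one possible choice.

```python
from bisect import bisect_right

LABELS = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]

# Starting value of each category under the standard EPA definitions.
EPA_STARTS = [0, 51, 101, 151, 201, 301]


def make_classifier(starts, labels=LABELS):
    """Build a function that maps an AQI value to a category label."""
    if len(starts) != len(labels) or list(starts) != sorted(starts):
        raise ValueError("need one increasing start value per label")

    def classify(aqi):
        # Find the last category whose start is less than or equal to aqi.
        index = bisect_right(starts, aqi) - 1
        if index < 0:
            raise ValueError("AQI is below the first category")
        return labels[index]

    return classify


epa = make_classifier(EPA_STARTS)
# A stricter custom scheme: "Good" ends at 30 instead of 50.
custom = make_classifier([0, 31, 101, 151, 201, 301])

print(epa(45))     # Good
print(custom(45))  # Moderate
```

The same day can land in a different category once the starting numbers change, which is why the colors on the tile plot shift when the categories are adjusted.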
Stop and think
Why did you choose the AQI values you did for each category?
How do you think the EPA makes those decisions? (Check out table 6 in this document if you want more information on this!)
Do you think one AQI level affects everyone the same? Or does it affect some people more than others?
Step 4: Add labels
Now let’s add labels to your tile plot.
Type in names for your labels below. Replace the red lines and keep the quotes "".
- xlabel is the label for the bottom axis.
- ylabel is the label for the left axis. This will be year for all plots.
- title tells the reader what the graph is about and will be on top of the graph.
Type your labels below.
Press run code.
Step 5: Graph
Now let’s make your graph!
Press run code below to see your graph. If you want to know more about how the code works, click “Learn more”!
Press run code
Run Code button not working: Make sure you wait until the green circle at the very top of this page says “Ready!” We have some code running in the background and the green circle indicates it has finished running.
Error message:
- Did you accidentally delete or change some code? Click the refresh button at the top of the code cell and try running it again.
- Did you run all the previous code chunks in this section? Return to Step 1 and make sure you pressed “Run code” in every step.
- Did you spell and capitalize all the variables correctly, as specified in each step? If not, correct it and press “run code” in that step.
Congratulations!
You have finished this tutorial and made lots of different types of air quality graphs! Excellent work!