# Tell R what data to use and what the x and y values are
ggplot(data = co2DataFiltered, mapping = aes(x = year_chr, y = value, fill = name)) +
  # Select what kind of graph we're making
  geom_col(position = "dodge") +
  # Label the graph to make it easy to read
  labs(title = "CO2 Production from Energy Vs. Other Sources", x = "Year", y = "CO2 In Tons") +
  # Add a legend to show what each color means
  scale_fill_discrete(name = "CO2 Source", labels = c("CO2 from Energy", "All Other Sources of CO2")) +
  # Make it pretty :)
  theme_light()
Climate Change Data Tutorial
Climate Change Data Visualization
Welcome to this data science tutorial! We’ll be using R to look at climate change data.
R is a coding language that is often used to look at LARGE amounts of data. It can do all sorts of things including basic math!
Want to try using R?
In the code box right below this, type in 6+3 and press the green play button:
Do you see the answer 9 that popped out right under your code? Pretty cool!
The gray and white box with the green arrow and “run code” button is called a code chunk. You can type R commands into that box, and the answer will appear underneath the code chunk.
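If the code chunk is not in front of you, here is a small sketch of what typing a calculation into it looks like. The name `result` is just an example we made up; it is not part of the tutorial's own code.

```r
# R works like a calculator: type a calculation and it prints the answer
6 + 3

# You can also save an answer under a name and use it later
# ("result" is a made-up example name)
result <- 6 + 3
print(result)
```

Saving an answer under a name is how we keep track of things like datasets later on.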
Our Data
Today we’ll be using R to look at climate data. We will be able to create graphs of different variables in our data, like carbon dioxide emissions and temperature.
We’ve preloaded a dataset into this tutorial. This was found
Where??
Let’s start by taking a look at the data set. Press the green play button below and see what happens!
You should see 10 rows of data and a lot of random words. We won’t go into what all of it means, but here’s a quick overview:
- The words on the top (country, year, iso_code, population, etc.) are the different variables in our data set.
- Underneath those words are the type of variable, which we’ll ignore right now.
- Below that, you’ll see some numbers and NA values. Those are different values for each variable.
Our data is called co2Data. If you type co2Data into any code chunk, you’ll be able to see our data.
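Here is a rough sketch of a few ways to peek at a dataset in R. Because co2Data is only loaded inside the tutorial, this example builds a tiny stand-in called `co2_example` with made-up numbers; the stand-in and its values are assumptions for illustration, not the real data.

```r
# A tiny made-up stand-in for co2Data (NOT real measurements)
co2_example <- data.frame(
  country = c("World", "World", "World"),
  year = c(1750, 1751, 1752),
  population = c(NA_real_, NA_real_, NA_real_),
  co2 = c(1, 2, 3)
)

# Typing the name of a dataset prints it
co2_example

# head() shows only the first few rows, which helps with big datasets
first_rows <- head(co2_example, 2)
first_rows

# names() lists the variables (the words across the top)
variable_names <- names(co2_example)
variable_names
```

In the tutorial itself you would type co2Data in place of `co2_example`.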
Our First Graph
We can use R to do all sorts of things, including making graphs easily and quickly! In this tutorial we won’t be able to cover all the details of how to make a graph, but there are lots of resources online that you can use to learn more. We use something called ggplot to make graphs in this tutorial.
First, let’s make a graph that just looks at the world’s CO2 emissions over time. This data goes all the way back to 1750!
Press Run Code on the code chunk below!
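A line graph like this one could be built roughly as follows. This is a sketch and not the tutorial's exact code: the stand-in data `co2_example`, its made-up numbers, and the labels are all assumptions for illustration.

```r
library(ggplot2)

# Made-up stand-in data (NOT real measurements)
co2_example <- data.frame(
  year = c(1750, 1850, 1950, 2020),
  co2 = c(1, 5, 20, 100)
)

# Tell R what data to use and what goes on the x and y axes
co2_plot <- ggplot(data = co2_example, mapping = aes(x = year, y = co2)) +
  # Draw a line connecting the values in year order
  geom_line() +
  # Label the graph (example labels)
  labs(title = "CO2 Emissions Over Time", x = "Year", y = "CO2") +
  theme_light()

# Typing the name of the plot displays it
co2_plot
```

Each piece added with + is one layer of the graph, which is why ggplot code is usually written one layer per line.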
Temperature Data
What if we want to compare this to how the world’s temperature has changed over the same period? We have another dataset that contains temperature data. It tells us how different each year’s temperature is from the average in 1900.
In the code chunk below, type tempData and click “run code” to see the new data set.
You should see 2 columns: one called Year and one called Anomaly. The Anomaly column tells us how different the temperature that year was from the average in 1900. This only shows us 10 years, but the data goes all the way to 2023! We can graph the data to visualize it better.
Let’s graph this! Press “run code” and see what happens!
Now, what if we combine our CO2 data and our temperature data?
Let’s put these two data sets together by matching up the years. Then, we can make a graph of temperature anomaly and CO2 levels. Click “run code”.
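Matching two datasets up by year is often done with `merge()`. The sketch below shows one possible way, assuming a CO2 table with a `year` column and a temperature table with a `Year` column; the stand-in tables and their numbers are made up, and the tutorial's real code may combine and graph the data differently.

```r
library(ggplot2)

# Made-up stand-ins for co2Data and tempData (NOT real measurements)
co2_example <- data.frame(year = c(2000, 2010, 2020), co2 = c(10, 20, 30))
temp_example <- data.frame(
  Year = c(2000, 2010, 2020, 2023),
  Anomaly = c(0.1, 0.2, 0.3, 0.4)
)

# Keep only the years that appear in BOTH tables.
# by.x and by.y name the matching column in each table.
combined <- merge(co2_example, temp_example, by.x = "year", by.y = "Year")
combined

# Plot CO2 against temperature anomaly, one point per year
combined_plot <- ggplot(data = combined, mapping = aes(x = co2, y = Anomaly)) +
  geom_point() +
  labs(x = "CO2", y = "Temperature Anomaly") +
  theme_light()
combined_plot
```

Notice that 2023 is dropped from `combined` because the made-up CO2 table has no row for that year.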
CO2 Sources
Now what if we ask where most of that CO2 we’re emitting comes from? We can do this with a bar graph that shows the data from the past few years. A bar graph is a little more complicated to make, so we’ve edited some of our data in the background to make it easier to graph. Our new edited data is called co2DataFiltered.
Let’s see what a graph looks like! Click “run code”.
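If you are curious what "editing the data in the background" might involve, here is a hedged sketch. The bar graph code uses columns called `year_chr`, `name`, and `value`, so the data was probably reshaped so that each row holds one year, one CO2 source, and one amount. The column names `energy_co2` and `other_co2` and all the numbers below are made up; we don't know the exact steps the tutorial used.

```r
library(ggplot2)

# Made-up "wide" data: one row per year, one column per CO2 source
wide <- data.frame(
  year = c(2018, 2019),
  energy_co2 = c(8, 9),
  other_co2 = c(2, 3)
)

# Reshape to "long" data: one row per year AND source.
# year_chr stores the year as text so each year gets its own group of bars.
long <- data.frame(
  year_chr = as.character(rep(wide$year, times = 2)),
  name = rep(c("energy_co2", "other_co2"), each = nrow(wide)),
  value = c(wide$energy_co2, wide$other_co2)
)
long

# fill = name gives each source its own color; "dodge" puts the bars side by side
bar_plot <- ggplot(data = long, mapping = aes(x = year_chr, y = value, fill = name)) +
  geom_col(position = "dodge") +
  theme_light()
bar_plot
```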
Scatter Plot Generator!
Now you get to try creating your own plot! You will be able to choose any two variables and see how they relate to each other. If your questions haven’t been answered by the infographics and graphs you’ve already seen, take a look at the data and see what you can ask to try and learn more.
First, decide which variable you want to learn about. You can choose any variable from co2Data. Here are some options:
- population
- year
- cement_co2
- co2
- consumption_co2
- cumulative_co2
- energy_per_gdp
- land_use_change_co2
- gas_co2
- coal_co2
- methane
- temperature_change_from_co2
Choose two of these variables that you want to plot against each other. Then, in the code chunk below, type them inside the quotes. Make sure you leave the quotes and just replace the red line. Also make sure you type the variable name exactly - in fact, you can just copy and paste from the list.
Then click “run code”!
Now, let’s add titles and labels to your graph! Again, type your title between the quotation marks. Then click “run code”!
Finally, run this to see your scatter plot! Click “run code” and if something doesn’t look right, click through the “troubleshooting” tabs to see how to fix it.
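Put together, the three steps might look something like the sketch below. The names `x_variable`, `y_variable`, `graph_title`, `x_label`, and `y_label` are hypothetical, and `co2_example` is a made-up stand-in for co2Data, so the real code chunks may use different names.

```r
library(ggplot2)

# Made-up stand-in for co2Data (NOT real measurements)
co2_example <- data.frame(
  year = c(2000, 2005, 2010, 2015, 2020),
  population = c(10, 11, 12, 13, 14),
  co2 = c(3, 4, 6, 7, 9)
)

# Step 1: choose two variables, typed exactly, inside quotes
x_variable <- "population"
y_variable <- "co2"

# Step 2: choose a title and axis labels, also inside quotes
graph_title <- "CO2 Compared to Population"
x_label <- "Population"
y_label <- "CO2"

# Step 3: build the scatter plot.
# .data[[...]] looks up a column using the name stored in a variable.
scatter_plot <- ggplot(
  data = co2_example,
  mapping = aes(x = .data[[x_variable]], y = .data[[y_variable]])
) +
  geom_point() +
  labs(title = graph_title, x = x_label, y = y_label) +
  theme_light()
scatter_plot
```

Storing the column names as text is what lets you swap variables by changing only the words inside the quotes.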
Did you get an error message in red?
- If the graph also displayed and looks right, don’t worry about the error message!
- If the graph is not displaying, go back and make sure you ran all the code chunks previously where you defined the variables and labels.
- Did you put everything in quotes when you defined the variables and labels?
- You don’t need to edit this code at all, just the code where you define variables and labels.
- If it still isn’t working or if you accidentally deleted something, you can refresh the page to restart.
If your x variable is year and your points are clustered at the bottom of your graph, this may be because it goes all the way back to 1750 and there wasn’t much change until much more recently.
So try running this code instead. It will display your graph from 2000 to 2020. Note: This code will only work if your x variable is year.
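One way to show only 2000 to 2020 is to keep just those rows before graphing. This is a sketch under that assumption, using a made-up stand-in called `co2_example`; the tutorial's own code may zoom in a different way.

```r
library(ggplot2)

# Made-up stand-in for co2Data (NOT real measurements)
co2_example <- data.frame(
  year = c(1750, 1900, 2000, 2010, 2020, 2022),
  co2 = c(1, 2, 10, 20, 30, 31)
)

# Keep only the rows where year is between 2000 and 2020
recent <- subset(co2_example, year >= 2000 & year <= 2020)
recent

# Graph the smaller dataset so the recent changes are easier to see
recent_plot <- ggplot(data = recent, mapping = aes(x = year, y = co2)) +
  geom_point() +
  labs(x = "Year", y = "CO2") +
  theme_light()
recent_plot
```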
Nice work! Hopefully you enjoyed making your own graph with this data!