A Simple Guide to Scatterplots in R
Scatterplots visualize the relationship between two variables, such as hours studied and exam scores.
Create a Table in R
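library(data.table)  # install.packages("data.table") if it is not installed yet

# One row per student: hours studied and the exam score achieved
data <- data.table(
  hours_studied = c(2, 5, 3, 6, 4, 7, 5, 8),
  exam_scores = c(65, 75, 70, 85, 80, 90, 75, 95)
)
print(data)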
In R, the output should look something like this.
   hours_studied exam_scores
1:             2          65
2:             5          75
3:             3          70
4:             6          85
5:             4          80
6:             7          90
7:             5          75
8:             8          95
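If data.table is not installed, the same table can be built with base R. The snippet below is a minimal sketch of that alternative; the name data_df is only an illustration and is not used elsewhere in this guide.

```r
# Base R alternative: a plain data.frame holding the same values
# (data_df is a hypothetical name chosen for this sketch)
data_df <- data.frame(
  hours_studied = c(2, 5, 3, 6, 4, 7, 5, 8),
  exam_scores = c(65, 75, 70, 85, 80, 90, 75, 95)
)
print(data_df)
```

The plot() calls in this guide only read columns with the $ operator, so they should work the same way on a data.frame.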
Basic Plot
Use R's plot() function to draw a basic scatterplot of hours studied against the corresponding exam scores.
plot(x = data$hours_studied, y = data$exam_scores, main = "Basic Plot")
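As an optional variation, plot() also accepts a formula, which some readers find easier to read than two $ lookups. The sketch below rebuilds the table as a base data.frame only so that it runs on its own; that substitution is an assumption of this example, not part of the original walkthrough.

```r
# Same values as the table above, rebuilt as a base data.frame
# so this snippet does not depend on data.table being installed
data <- data.frame(
  hours_studied = c(2, 5, 3, 6, 4, 7, 5, 8),
  exam_scores = c(65, 75, 70, 85, 80, 90, 75, 95)
)

# The formula reads "exam_scores as a function of hours_studied":
# the left side goes on the y-axis, the right side on the x-axis
plot(exam_scores ~ hours_studied, data = data, main = "Basic Plot")
```

With the formula form, the axis labels default to the column names rather than to the full data$... expressions.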
Tweaks
Now add some elements that all good graphs should have:
- A title
- X and Y axis labels
The call below also sets the color and shape of the points.
plot(data$hours_studied, data$exam_scores,
     main = "Hours Studied and Exam Scores", # Title of the plot
     xlab = "Hours Studied",                 # Label for the x-axis
     ylab = "Exam Scores",                   # Label for the y-axis
     col = "blue",                           # Color of the points
     pch = 16                                # Shape of the points (16 = filled circle)
)
That’s It
Easy, right?
With just a few lines of code, we can see that more hours spent studying go along with higher exam scores.
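To put a number on what the plot shows, one option is the Pearson correlation from base R's cor(). This is a small sketch that goes slightly beyond the plot itself; the vectors repeat the guide's values so the snippet runs on its own, and the name r is just an illustration.

```r
# Same values as in the table, as plain vectors
hours_studied <- c(2, 5, 3, 6, 4, 7, 5, 8)
exam_scores <- c(65, 75, 70, 85, 80, 90, 75, 95)

# Pearson correlation: +1 is a perfect upward line, 0 is no linear pattern
r <- cor(hours_studied, exam_scores)
print(round(r, 2))  # prints 0.95 for this data
```

A value this close to 1 matches the upward trend in the scatterplot. It describes an association in this small sample, not proof that studying causes the higher scores.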
This guide is just the beginning. You’ll find plenty more guides that take things further, both on Medium and across the broader internet.
Refer back to this guide whenever you need a refresher on the basics.