data.frame object.
site_id column be numeric or Boolean?
subset() with the select argument to create a data.frame that contains only the seeds_present and site_id columns. Next check out the unique() function.
seeds_present column in your report.
So far, we’ve only used plotting functions from base R. Several other plotting paradigms have been developed for R, including grid, lattice, and ggplot.
The “gg” in ggplot stands for the grammar of graphics. The grammar is a systematic way of thinking about visualizing data, and it is implemented in R in the ggplot2 package.
We won’t have time to go into the art and science of graphing with ggplot2. Instead, I’ll provide a couple of examples that you can tinker with.
The syntax for graphics with ggplot2 is fundamentally different from that of base R plots.
One thing you’ll notice in the following examples is the idiosyncratic use of the plus symbol. In ggplot2, the addition operator has been overloaded so that it can be used to combine graphical layers.
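To see the overloaded plus symbol in isolation, here is a minimal sketch. It assumes ggplot2 is already installed and uses R's built-in mtcars data rather than the ginkgo file; the object names (p_base, p_points, p_labeled) are made up for illustration.

```r
# Sketch only: uses the built-in mtcars data, not the ginkgo data set.
library(ggplot2)

# Start with a base plot object: data plus aesthetic mappings, no layers yet.
p_base = ggplot(mtcars, aes(x = wt, y = mpg))

# Each + returns a new plot object with one more component attached.
p_points = p_base + geom_point()
p_labeled = p_points + xlab("Weight (1000 lbs)") + ylab("Miles per gallon")

# Printing the object draws the plot.
print(p_labeled)

# Compare the number of layers before and after adding geom_point().
length(p_base$layers)
length(p_points$layers)
```

Because each step is an ordinary R object, you can store a partly built plot and add different layers or labels to it later.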
You’ll probably need to install the ggplot2 package. Remember that you can use the install.packages() function to install a new package.
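One possible way to handle the installation, sketched below, is to install the package only when it is missing. This is just a convenience pattern, not a required step; it assumes a CRAN mirror is reachable from your R session.

```r
# Install ggplot2 only if it is not already available, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```

You only need to install a package once per R installation, but you need to load it with library() in every new session.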
library(ggplot2)
library(here)  # provides here() for building the file path
dat = read.csv(here("data", "ginkgo_data_2022.csv"))
names(dat)
## [1] "site_id" "seeds_present" "max_width"
## [4] "max_depth" "notch_depth" "petiole_length"
# Scatterplot of notch depth against maximum leaf width
ggplot(dat, aes(x = max_width, y = notch_depth)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Notch Depth (mm)")
Notice the call to aes(). This function sets the aesthetics of the plot. It is the part of the code that maps variables in the data.frame to the x- and y-axes (for scatterplots).
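It can help to look at an aesthetic mapping on its own. The sketch below assumes ggplot2 is installed and uses the built-in mtcars data in place of the ginkgo file; the variable name mapping is just an illustrative choice.

```r
library(ggplot2)

# aes() does not touch the data itself; it records which column
# should drive each visual property.
mapping = aes(x = wt, y = mpg)
names(mapping)   # the aesthetics that were mapped
print(mapping)

# The stored mapping can be passed to ggplot() through its mapping argument.
ggplot(mtcars, mapping = mapping) + geom_point()
```

The column names inside aes() are not quoted; ggplot2 looks them up in the data.frame when the plot is drawn.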
Try editing the code above to make a scatterplot with max_width on the x-axis and max_depth on the y-axis.
ggplot(dat, aes(x = max_width, y = max_depth)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Max Leaf Depth (mm)")
One of the greatest strengths of plotting with ggplot is that it is easy to group observations by a factor variable in order to display them using different colors or symbols.
To add a variable by which to color the points, we can use the colour argument to aes(). Let’s try the seeds_present column:
ggplot(dat, aes(x = max_width, y = notch_depth, colour = seeds_present)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Notch Depth (mm)")
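How the colour mapping behaves depends on the type of the column. The sketch below illustrates this with the built-in mtcars data (an assumption for illustration, since it does not use the ginkgo file): the cyl column is numeric, so it is mapped once as a number and once as a factor.

```r
library(ggplot2)

# cyl is stored as a number, so colour = cyl gives a continuous gradient.
p_numeric = ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()

# Wrapping the column in factor() treats each value as its own group,
# so each group gets a distinct colour and a discrete legend.
p_factor = ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

print(p_numeric)
print(p_factor)
```

A gradient suits measurements on a continuous scale, while distinct colours suit identifiers or categories that happen to be stored as numbers.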
Try to recreate the following plot:
I can also make a scatterplot that color-codes the points by individual tree ID.
I used colour = factor(site_id) so that site_id was treated as a factor rather than a numeric column.
These examples just scratch the surface of the power of the ggplot2 package. As you continue to build your R skills, I encourage you to check out some of the abundant resources for learning ggplot2!
For example, you might check out the Introduction to Data Visualization with ggplot2 course on DataCamp.