data.frame object.
site_id column be numeric or Boolean?
subset() with the select argument to create a data.frame that contains only the seeds_present and site_id columns. Next check out the unique() function.
seeds_present column in your report.

So far, we’ve only used plotting functions from base R. There are several other plotting paradigms developed for R, including grid, lattice, and ggplot.
The “gg” in “ggplot” stands for the grammar of
graphics. The grammar is a systematic way of thinking about
visualizing data. It is implemented in R in the
ggplot2 package.
We won’t have time to go into the art and science of graphing with
ggplot2. Instead, I’ll provide a couple of examples that
you can tinker with.
The syntax for graphics with ggplot2 is fundamentally
different from the syntax for base R plots.
One thing you’ll notice in the following examples is the
idiosyncratic use of the plus symbol. In ggplot2,
the addition operator has been overloaded so that it can be
used to combine graphical layers together.
You’ll probably need to install the ggplot2 package.
Remember that you can use the install.packages() function
to install a new package.
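As a minimal sketch of that step, the snippet below installs ggplot2 only when it is not already available, so re-running your script does not reinstall the package every time. It assumes you have an internet connection and a CRAN mirror configured; requireNamespace() is part of base R.

```r
# Install ggplot2 only if it is not already installed.
# Assumes a CRAN mirror is configured for this R session.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
```

Installing is a one-time step, but you still need to load the package in every new R session before you can use it.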
library(ggplot2)
library(here)
dat = read.csv(here("data", "ginkgo_data_2022.csv"))
names(dat)
## [1] "site_id" "seeds_present" "max_width"
## [4] "max_depth" "notch_depth" "petiole_length"
ggplot(dat, aes(x = max_width, y = notch_depth)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Notch Depth (mm)")
Notice the call to aes(). This function sets the
aesthetics of the plot. It is the part of the code that
maps variables in the data.frame to the x- and y-axes
(for scatterplots).
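To see how the mapping and the plus symbol work together, here is a small sketch that builds a plot one piece at a time. The data frame demo is made up for illustration: its column names mirror the ginkgo data, but the values are not real measurements.

```r
library(ggplot2)

# Hypothetical stand-in data; the values are invented for illustration.
demo = data.frame(
  max_width   = c(40, 55, 62, 71, 85),
  notch_depth = c(5, 9, 12, 10, 18)
)

# Step 1: supply the data and the aesthetic mapping (nothing is drawn yet).
p = ggplot(demo, aes(x = max_width, y = notch_depth))

# Step 2: add a layer of points with the overloaded + operator.
p = p + geom_point()

# Step 3: add axis labels in the same way.
p = p + xlab("Max Leaf Width (mm)") + ylab("Notch Depth (mm)")

# Printing the stored object draws the plot.
print(p)
```

Storing the plot in a variable like this is optional, but it can make it easier to experiment: you can add or swap a single layer without retyping the whole call.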
Try editing the code above to make a scatterplot of
max_width on the x-axis and max_depth on the
y-axis.
ggplot(dat, aes(x = max_width, y = max_depth)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Max Leaf Depth (mm)")
One of the greatest strengths of plotting with ggplot is that it is easy to group observations by a factor variable in order to display them using different colors or symbols.
To add a variable by which to color the points, we can use the
colour argument to aes(). Let’s try the
seeds_present column:
ggplot(dat, aes(x = max_width, y = notch_depth, colour = seeds_present)) +
  geom_point() +
  xlab("Max Leaf Width (mm)") +
  ylab("Notch Depth (mm)")
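The type of the grouping column matters when you colour by it. The sketch below uses a made-up data frame with a hypothetical numeric column, group_id, to compare what happens when that column is mapped to colour as a number versus as a factor. None of these values come from the ginkgo data.

```r
library(ggplot2)

# Hypothetical stand-in data; group_id is an invented numeric ID column.
demo = data.frame(
  group_id    = c(1, 1, 2, 2, 3, 3),
  max_width   = c(40, 55, 62, 71, 85, 90),
  notch_depth = c(5, 9, 12, 10, 18, 15)
)

# Numeric column: ggplot2 colours the points along a continuous gradient.
p_numeric = ggplot(demo, aes(x = max_width, y = notch_depth, colour = group_id)) +
  geom_point()

# Factor column: each group gets its own distinct colour and legend entry.
p_factor = ggplot(demo, aes(x = max_width, y = notch_depth, colour = factor(group_id))) +
  geom_point()

print(p_numeric)
print(p_factor)
```

Comparing the two legends is a quick way to check whether ggplot2 is treating a column as a set of categories or as a continuous quantity.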
Try to recreate the following plot:
I can also make a scatterplot that color-codes for individual tree ID.
I used colour = factor(site_id) so that
site_id was treated as a factor rather than a numeric
column.

These examples just scratch the surface of the power of the
ggplot2 package. As you continue to build your R skills, I
encourage you to check out some of the abundant resources for learning
ggplot2!
For example, you might check out the Introduction to Data Visualization with ggplot2 course on Datacamp.