In-class File Import And Logical Subset Exercise
Introduction to Quantitative Ecology
read.csv() and here()
You’ve already installed them, but just in case, here’s a reminder:
You’ll need to install the here and palmerpenguins packages before you can
complete this assignment.
Remember how we installed the penguins package?
install.packages("palmerpenguins")
You can use similar syntax to install the here
package.
You’ll need to download the grazing_data.csv data file
and save it in your data subdirectory.
here package
Do you remember how you used the require() function to
load the palmerpenguins package?
This is the syntax you used to load the penguins package:
require(palmerpenguins)
Use similar syntax to load the here package.
To read data in a csv file, you’ll use the here() and
read.csv() functions.
Review the instructions for the week 3 pre-class assignment if you need a refresher.
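As a minimal sketch of what read.csv() returns, the following writes a small, made-up table to a temporary csv file and reads it back. The file name toy_data.csv, the toy_path variable, and the values are hypothetical; in the exercise, the path would instead come from here() pointing into your data subdirectory.

```r
# Hypothetical toy data, written to a temporary file so the example is self-contained
toy_path = file.path(tempdir(), "toy_data.csv")
write.csv(
  data.frame(count = c(3, 5, 8), height = c("short", "med", "tall")),
  file = toy_path,
  row.names = FALSE
)

# read.csv() takes the path to a csv file and returns a data.frame
toy_dat = read.csv(toy_path)

class(toy_dat)   # the class of the returned object
names(toy_dat)   # column names come from the header row of the file
```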
Read the data file into a data.frame object called grazing_dat.
data.frame
To test that you’ve read the file correctly, run the following code to preview the first six lines:
head(grazing_dat)
##   X abundance replicate grass pasture
## 1 1         9         1 short   upper
## 2 2        11         2 short   upper
## 3 3         6         3 short   upper
## 4 4        14         1   med   upper
## 5 5        17         2   med   upper
## 6 6        19         3   med   upper
The head() function will print out the first six rows of
a data.frame object.
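head() also accepts an optional n argument that sets how many rows are returned. A small sketch using a made-up data frame (toy_dat and its values are hypothetical):

```r
# Hypothetical data frame with ten rows
toy_dat = data.frame(id = 1:10, value = (1:10)^2)

# With no n argument, head() returns the first six rows
first_six = head(toy_dat)

# n sets the number of rows returned
first_three = head(toy_dat, n = 3)

nrow(first_six)
nrow(first_three)
```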
data.frame
You may recall from the DataCamp assignment that there are two primary ways to subset columns from a data frame:
You can retrieve a named column using the dollar sign. This method searches for a column in the data frame with a matching name. If found, it will print out the contents of the column. For example:
grazing_dat$abundance
## [1] 9 11 6 14 17 19 28 31 32 7 6 5 14 17 15 44 38 37
returns the contents of the abundance column.
How would you retrieve the pasture column?
You can use square brackets to retrieve one or more columns by their position.
The following retrieves the first column of grazing_dat:
grazing_dat[, 1]
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
while this syntax retrieves the second and fourth columns:
grazing_dat[, c(2, 4)]
##    abundance grass
## 1          9 short
## 2         11 short
## 3          6 short
## 4         14   med
## 5         17   med
## 6         19   med
## 7         28  tall
## 8         31  tall
## 9         32  tall
## 10         7 short
## 11         6 short
## 12         5 short
## 13        14   med
## 14        17   med
## 15        15   med
## 16        44  tall
## 17        38  tall
## 18        37  tall
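The outputs above look different because the two requests return different kinds of objects: a single column comes back as a plain vector, while several columns come back as a smaller data frame. A sketch with a made-up data frame (toy_dat and its column names are hypothetical):

```r
# Hypothetical data frame with three columns
toy_dat = data.frame(
  count = c(3, 5, 8),
  height = c("short", "med", "tall"),
  plot_id = c(1, 1, 2)
)

# The dollar sign and a single position both return a vector
by_name = toy_dat$count
by_position = toy_dat[, 1]

# Several positions return a data frame with only those columns
two_cols = toy_dat[, c(1, 3)]

class(by_position)
class(two_cols)
```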
data.frame rows with logical tests
We’ll use the penguins data for an example. Run the following code to
turn the data into a data.frame before you start:
require(palmerpenguins)
penguins = data.frame(penguins)
Recall that we can use $ or [] to extract entire columns from a data frame. For example, I can pull out the flipper length column:
penguins$flipper_length_mm
I can also use the subset() function along with a
logical test to pull out rows that meet criteria that
we specify.
For example, I can pull out all the penguins that were measured on Torgersen island:
subset(penguins, island == "Torgersen")
Then I could plot a histogram of their body masses:
torger_penguins = subset(penguins, island == "Torgersen")
hist(
x = torger_penguins$body_mass_g,
main = "Body mass of penguins on Torgersen Island",
xlab = "body mass (g)"
)
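Logical tests are not limited to ==. As a sketch, here is subset() with a numeric comparison on a made-up data frame (toy_dat and its values are hypothetical), alongside what should be the equivalent square-bracket syntax:

```r
# Hypothetical data frame
toy_dat = data.frame(
  site = c("a", "a", "b", "b"),
  mass = c(10, 25, 30, 5)
)

# Keep only the rows where mass is greater than 20
heavy = subset(toy_dat, mass > 20)

# The same rows using a logical vector inside square brackets;
# the comma with nothing after it means "keep all columns"
heavy_brackets = toy_dat[toy_dat$mass > 20, ]

heavy
```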
Things to note from the Torgersen example:
- The first argument to subset() is the data frame I want to subset.
- The logical test uses the double equals sign, ==, to check for equality.
Your group will submit code that accomplishes the following tasks:
- Read the data into a data.frame object called grazing_dat using only the functions here() and read.csv().
- Do not use read_csv (with an underscore) or file.choose().
- ... grazing_dat.
- ... grass column from grazing_dat.
Submit your answers as a knitted HTML file on Moodle.
Challenge: Using two successive calls to subset(), can
you create a histogram of the body masses of only the male penguins on
Dream island?