here
The here package works in conjunction with RStudio’s RProjects to make reading data into R much easier.
It avoids all of the problems with working directories.
The working directory is just the place that R looks when you tell it you want to open a file.
This sounds simple, but using working directories is one of the biggest headaches with R, especially for new users.
For example, suppose I had a file called whale_data.csv in a subfolder of my main project called data.
I could use read.csv() to read the data into a data.frame object called whales.
whales = read.csv("whale_data.csv")
## Warning in file(file, "rt"): cannot open file 'whale_data.csv': No such file or
## directory
## Error in file(file, "rt"): cannot open the connection
Mmmmm, that’s not what I wanted. I know the file is there, but R didn’t find it.
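A quick way to see what went wrong is to ask R where it is looking. The sketch below is not from my actual project: it builds a throwaway folder (the name demo_project is made up for illustration) with the same layout, and shows that a bare filename is only found when the file sits directly in the working directory.

```r
# Hypothetical project layout, created in a temporary folder for illustration only
project_dir = file.path(tempdir(), "demo_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(
  data.frame(x = 1:3),
  file.path(project_dir, "data", "whale_data.csv"),
  row.names = FALSE)

# Pretend the project's main folder is the working directory
old_wd = setwd(project_dir)

# getwd() reports the folder R uses to resolve relative paths
getwd()

# The bare filename is looked up in the working directory only
found_bare = file.exists("whale_data.csv")

# Including the subfolder in the relative path points R to the right place
found_in_data = file.exists(file.path("data", "whale_data.csv"))

# Put the working directory back where it was
setwd(old_wd)
```

Here found_bare is FALSE and found_in_data is TRUE: the file exists, but only the second path leads to it from the working directory.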
One option is to set my working directory to the data subdirectory of my project using setwd():
setwd("data")
and try again:
whales = read.csv("whale_data.csv")
I didn’t get an error that time. Let’s take a look at the first six rows of my data frame to make sure everything is OK:
head(whales)
## Group Individual Count Round Date Recorder Time.Observed Location
## 1 1 Emily Begonis 80 Pre 9/17/2018 John 10 secs MPA
## 2 1 Bri 70 Pre 9/17/2018 John 10 secs MPA
## 3 1 Anthony N. 95 Pre 9/17/2018 John 10 secs MPA
## 4 2 John 100 Pre 9/17/2018 John 10 secs MPA
## 5 2 Jon 70 Pre 9/17/2018 John 10 secs MPA
## 6 2 Ruth 80 Pre 9/17/2018 John 10 secs MPA
## Section
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
Now, suppose I want to read another data file called week_1_dat.csv that is in the week_01 subfolder of my main project directory:
week_1 = read.csv("week_1_dat.csv")
## Warning in file(file, "rt"): cannot open file 'week_1_dat.csv': No such file or
## directory
## Error in file(file, "rt"): cannot open the connection
head(week_1)
## Spp X1996 X1997 X1998 X1999 X2000
## 1 M.bro 88 47 13 33 86
## 2 Or.tip 90 14 36 24 47
## 3 Paint.l 50 0 0 0 4
## 4 Pea 48 110 85 54 65
## 5 Red.ad 6 3 8 10 15
## 6 Ring 190 80 96 179 145
Not ideal.
Once again, R did not know where to look for my file, so I must tell it explicitly to set my working directory to “week_01”:
setwd("week_01")
That did not work. Why?
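The likely reason, assuming my working directory was still the data folder from the earlier setwd("data") call and that week_01 sits next to data rather than inside it: a relative path such as "week_01" is resolved starting from the current working directory, so R went looking for a week_01 folder inside data. The sketch below reproduces that situation in a throwaway folder (the name demo_project_wd is made up for illustration).

```r
# Hypothetical layout: data and week_01 are siblings inside the main folder
root = file.path(tempdir(), "demo_project_wd")
dir.create(file.path(root, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "week_01"), showWarnings = FALSE)

old_wd = setwd(root)
setwd("data")    # now inside the data folder

# "week_01" is resolved relative to data, where no such folder exists
result = tryCatch(
  {
    setwd("week_01")
    "changed"
  },
  error = function(e) "failed")

# Going up one level first reaches the sibling folder
setwd(file.path("..", "week_01"))
now_in = basename(getwd())

# Put the working directory back where it was
setwd(old_wd)
```

The first attempt fails, and the second succeeds only because the path says how to get from data to week_01. Keeping track of that by hand is exactly the bookkeeping that gets tiresome.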
I could type in the absolute path to my week_01 directory:
setwd("C:/Users/michaelnelso/git/intro_quant_ecol/week_01")
and try again:
week_1 = read.csv("week_1_dat.csv")
That works, but it’s not very reproducible. What if I decided to move my intro_quant_ecol folder to a different location, or if I got a new computer that didn’t have the same directory structure?
Using working directories is one of the biggest sources of confusion and frustration when you’re working with files in R.
Now, I have another file I want to read that is in the week_02 subdirectory of my main folder…
here instead
A much better solution is to use the here() function from the here package.
You must first install the package:
install.packages("here")
You only need to do this one time. I like to put a comment character in front of lines I have used to install packages to remind myself that I already installed them:
# install.packages("here")
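If you would rather not comment and uncomment the install line, another common pattern is to install a package only when it is missing. This is a small sketch: install_if_missing() is a name I made up for illustration, not a function from base R or from here.

```r
# Hypothetical helper: install a package only if it is not already available
install_if_missing = function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}
```

Calling install_if_missing("here") at the top of a script would then do nothing on a computer that already has the package.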
Remember that you have to load a package before you can use any of the functions in it:
require("here")
The function here() always points to the root directory of the RProject that you have open in RStudio. It doesn’t matter where R’s current working directory is located.
Do you remember where I last set my working directory to?
Neither do I, but it doesn’t matter where my current working directory is if I use the here() function:
whales = read.csv(here("data", "whale_data.csv"))
head(whales)
## Group Individual Count Round Date Recorder Time.Observed Location
## 1 1 Aldo 120 pre 9/17/2018 Mathew 20 secs MPA
## 2 1 Aldo 115 post 9/17/2018 Mathew 20 secs MPA
## 3 1 Anthony 95 Pre 9/17/2018 John 10 secs MPA
## 4 1 Anthony 85 Post 9/17/2018 John 10 secs MPA
## 5 1 Bri 70 Pre 9/17/2018 John 10 secs MPA
## 6 1 Bri 75 Post 9/17/2018 John 10 secs MPA
## Section
## 1 2
## 2 2
## 3 1
## 4 1
## 5 1
## 6 1
week_1 = read.csv(here("data", "week_01", "week_1_dat.csv"))
head(week_1)
## Spp X1996 X1997 X1998 X1999 X2000
## 1 M.bro 88 47 13 33 86
## 2 Or.tip 90 14 36 24 47
## 3 Paint.l 50 0 0 0 4
## 4 Pea 48 110 85 54 65
## 5 Red.ad 6 3 8 10 15
## 6 Ring 190 80 96 179 145
I didn’t have to worry about my working directory at all!
Everything just worked.
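To convince yourself that here() ignores later changes to the working directory, you can compare its result before and after a setwd() call. This is a minimal sketch, assuming the here package is installed; tempdir() is just a convenient folder to move to and has nothing to do with my project.

```r
library(here)

# With no arguments, here() returns the project root it found when the package was loaded
root_before = here()

# Move the working directory somewhere unrelated
old_wd = setwd(tempdir())
root_after = here()

# Put the working directory back where it was
setwd(old_wd)

# TRUE if the setwd() call did not affect here()
same_root = identical(root_before, root_after)
same_root
```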
Some things to notice:
The first argument to here() is the name of a subdirectory of my main directory.
The last argument to here() is the filename.
here with nested directories
Let’s say I have another directory inside my week_01 directory that is called hello, and another directory within that called world. Inside of that directory is a data file I want to read data from:
read.csv(here("data", "week_01", "hello", "world", "hello_world.csv"))
## a b c
## 1 one two three
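It can help to look at the path that here() builds before handing it to read.csv(). The sketch below assumes the here package is installed and only constructs the path, so it runs even if the folders don't exist on your computer.

```r
library(here)

# Each argument is one step down from the project root; the last one is the file
nested_path = here("data", "week_01", "hello", "world", "hello_world.csv")

# The last component is the filename
basename(nested_path)

# The folder that contains the file
dirname(nested_path)

# The path begins at the project root, whatever the working directory is
starts_at_root = startsWith(nested_path, here())
starts_at_root
```

Printing nested_path shows the full path on your own machine, which is a handy check when a file refuses to load.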