What is here?

The here package works in conjunction with RStudio Projects to make reading data into R much easier.

It sidesteps most of the problems that come with working directories.

What is wrong with working directories?

The working directory is just the place that R looks when you tell it you want to open a file.
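
To make this concrete, here is a minimal sketch of how R resolves a bare filename against the working directory. It uses only base R, and the folder and file names (wd_demo, example.csv) are made up for illustration:

```r
# Create a throwaway folder containing one small file (hypothetical names).
demo_dir = file.path(tempdir(), "wd_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("a,b\n1,2", file.path(demo_dir, "example.csv"))

# Remember the current working directory so it can be restored afterwards.
old_wd = getwd()

# From the original working directory, the bare filename is most likely not found.
found_before = file.exists("example.csv")

# After moving the working directory, the same bare filename resolves.
setwd(demo_dir)
found_after = file.exists("example.csv")

# Put the working directory back where it was.
setwd(old_wd)
```

The filename never changes in this sketch. Only the place R starts looking from does.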

This sounds simple, but using working directories is one of the biggest headaches with R, especially for new users.

For example, suppose I had a file called whale_data.csv in a subfolder called data within my main project folder.

I could use read.csv() to read the data into a data.frame object called whales.

whales = read.csv("whale_data.csv")
## Warning in file(file, "rt"): cannot open file 'whale_data.csv': No such file or
## directory
## Error in file(file, "rt"): cannot open the connection

Mmmmm, that’s not what I wanted. I know the file is there, but R didn’t find it.

One option is to set my working directory to the data subdirectory of my project using setwd():

setwd("data")

and try again:

whales = read.csv("whale_data.csv")

I didn’t get an error that time. Let’s take a look at the first six rows of my data frame to make sure everything is OK:

head(whales)
##   Group    Individual Count Round      Date Recorder Time.Observed Location
## 1     1 Emily Begonis    80   Pre 9/17/2018     John       10 secs      MPA
## 2     1           Bri    70   Pre 9/17/2018     John       10 secs      MPA
## 3     1    Anthony N.    95   Pre 9/17/2018     John       10 secs      MPA
## 4     2          John   100   Pre 9/17/2018     John       10 secs      MPA
## 5     2           Jon    70   Pre 9/17/2018     John       10 secs      MPA
## 6     2          Ruth    80   Pre 9/17/2018     John       10 secs      MPA
##   Section
## 1       1
## 2       1
## 3       1
## 4       1
## 5       1
## 6       1

Now, suppose I want to read another data file called week_1_dat.csv that is in the week_01 subfolder of my main project directory:

week_1 = read.csv("week_1_dat.csv")
## Warning in file(file, "rt"): cannot open file 'week_1_dat.csv': No such file or
## directory
## Error in file(file, "rt"): cannot open the connection

Not ideal.

Once again, R did not know where to look for my file, so I must explicitly set my working directory to “week_01”:

setwd("week_01")

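That did not work. Why? My working directory is still the data subdirectory, so R looked for a week_01 folder inside data and not in the main project directory.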
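
The failure can be reproduced with a small sketch, assuming week_01 sits next to data in the main project folder as described above. The project folder here is a throwaway one created in a temporary directory, and all of the names are hypothetical:

```r
# Build a throwaway project with two sibling subfolders (hypothetical layout).
project = file.path(tempdir(), "demo_project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "week_01"), showWarnings = FALSE)

old_wd = getwd()

# Step into data, as in the walkthrough above.
setwd(file.path(project, "data"))

# Relative paths are resolved from data, so week_01 is not visible from here...
sibling_visible = dir.exists("week_01")

# ...but it is reachable by going up one level first.
via_parent_visible = dir.exists(file.path("..", "week_01"))

# Restore the original working directory.
setwd(old_wd)
```

Every relative path depends on whatever the working directory happens to be at that moment, which is why a sequence of setwd() calls is so easy to get wrong.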

I could type in the absolute path to my week_01 directory:

setwd("C:/Users/michaelnelso/git/intro_quant_ecol/week_01")

and try again:

week_1 = read.csv("week_1_dat.csv")

That works, but it’s not very reproducible. What if I decided to move my intro_quant_ecol folder to a different location, or if I got a new computer that didn’t have the same directory structure?

Using working directories is one of the biggest sources of confusion and frustration when you’re working with files in R.

Now, I have another file I want to read that is in my week_02 subdirectory of my main folder…

Use here instead

A much better solution is to use the here() function from the here package.

You must first install the package:

install.packages("here")

You only need to do this one time. I like to put a comment character in front of lines I have used to install packages to remind myself that I already installed them:

# install.packages("here")
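
An alternative to commenting the line out is to install the package only when it is missing. This is a minimal sketch using base R's requireNamespace(); the helper name install_if_missing() is made up for illustration and is not part of any package:

```r
# Install a package only when it is not already available.
# install_if_missing() is a hypothetical helper name.
install_if_missing = function(pkg) {
  already_installed = requireNamespace(pkg, quietly = TRUE)
  if (!already_installed) {
    install.packages(pkg)
  }
  # Return (invisibly) whether the package was already there.
  invisible(already_installed)
}
```

Calling install_if_missing("here") at the top of a script should then be safe to run repeatedly, since it does nothing when the package is already installed.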

Remember that you have to load a package before you can use any of the functions in it:

library(here)

How does it work?

The here() function always points to the root directory of your RStudio Project, the folder that contains the .Rproj file. The root is worked out once, when the package is loaded, so it doesn’t matter where R’s working directory is moved to afterwards.
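
The idea behind finding the project root can be sketched in base R: start in some folder and walk upward until a folder containing an .Rproj file turns up. This is a simplified illustration and not the package's actual implementation. The helper find_project_root() and the demo folder names are hypothetical:

```r
# Walk upward from `start` until a folder containing an .Rproj file is found.
# find_project_root() is a hypothetical helper, not a function from here.
find_project_root = function(start = getwd()) {
  current = normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    has_rproj = length(list.files(current, pattern = "\\.Rproj$")) > 0
    if (has_rproj) {
      return(current)
    }
    parent = dirname(current)
    # Stop at the top of the file system, where a folder is its own parent.
    if (identical(parent, current)) {
      stop("No .Rproj file found above ", start)
    }
    current = parent
  }
}

# Try it on a throwaway project (hypothetical names) with a nested subfolder.
demo_root = file.path(tempdir(), "root_demo")
nested = file.path(demo_root, "data", "week_01")
dir.create(nested, recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(demo_root, "root_demo.Rproj")))

# Starting two levels down still leads back to the project folder.
found_root = find_project_root(nested)
```

Because the search starts from a folder inside the project and moves upward, any subfolder of the project leads to the same root.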

Do you remember where I last set my working directory to?

Neither do I, but it doesn’t matter where my current working directory is if I use the here() function:

whales = read.csv(here("data", "whale_data.csv"))
head(whales)
##   Group Individual Count Round      Date Recorder Time.Observed Location
## 1     1      Aldo    120   pre 9/17/2018  Mathew        20 secs      MPA
## 2     1      Aldo    115  post 9/17/2018  Mathew        20 secs      MPA
## 3     1   Anthony     95   Pre 9/17/2018    John        10 secs      MPA
## 4     1   Anthony     85  Post 9/17/2018    John        10 secs      MPA
## 5     1        Bri    70   Pre 9/17/2018    John        10 secs      MPA
## 6     1        Bri    75  Post 9/17/2018    John        10 secs      MPA
##   Section
## 1       2
## 2       2
## 3       1
## 4       1
## 5       1
## 6       1
week_1 = read.csv(here("data", "week_01", "week_1_dat.csv"))
head(week_1)
##       Spp X1996 X1997 X1998 X1999 X2000
## 1   M.bro    88    47    13    33    86
## 2  Or.tip    90    14    36    24    47
## 3 Paint.l    50     0     0     0     4
## 4     Pea    48   110    85    54    65
## 5  Red.ad     6     3     8    10    15
## 6    Ring   190    80    96   179   145

I didn’t have to worry about my working directory at all!

Everything just worked.

Some things to notice:

  • The first argument to here() is the name of a subdirectory of my main directory.
  • The last argument to here() is the filename.
  • If I have nested directories, I just include them as additional arguments.
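
The way the arguments are combined resembles base R's file.path(), which joins its arguments into a single path string. Here is a minimal sketch of that joining step, using a made-up root folder in place of the real project root that here() would supply:

```r
# A stand-in for the project root; here() would supply the real one.
project_root = "C:/Users/me/my_project"   # hypothetical path

# Each folder name is a separate argument, and the filename comes last.
nested_path = file.path(project_root, "data", "week_01", "week_1_dat.csv")
print(nested_path)
```

The result is an ordinary character string, which is why it can be passed straight to read.csv().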

Using here with nested directories

Let’s say I have another directory inside my week_01 directory called hello, and another directory within that called world. Inside that directory is a data file I want to read:

read.csv(here("data", "week_01", "hello", "world", "hello_world.csv"))
##     a   b     c
## 1 one two three
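
Note that here() only builds a path; it does not check that a file exists at that location. The sketch below assumes the here package is installed and the same folder layout as above. The object names nested_file and hello_world are made up for illustration:

```r
library(here)

# Build the path first so it can be inspected.
nested_file = here("data", "week_01", "hello", "world", "hello_world.csv")

# Read only if the file is really there; otherwise report where R looked.
if (file.exists(nested_file)) {
  hello_world = read.csv(nested_file)
} else {
  message("File not found: ", nested_file)
}
```

Printing the path when a read fails usually shows quickly whether a folder name is misspelled or missing.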