In addition to my original materials, this lab contains material adapted from Kevin McGarigal’s ECO 634 course.
Kevin taught this course for many years and his lab and lecture materials are an invaluable resource!
Recall the salamander catastrophe data we worked with in the Using Models 1 lecture assignment.
catrate = read.csv(here("data", "catrate.csv"))
head(catrate)
## pond success years cat.rate
## 1 2 5 7 0.2857143
## 2 3 5 7 0.2857143
## 3 4 6 7 0.1428571
## 4 5 4 7 0.4285714
## 5 6 0 7 1.0000000
## 6 7 1 4 0.7500000
To review, the data columns represent:

pond
: the ID of the pond

success
: the number of years in which successful reproduction occurred at the pond

years
: the total number of years that the pond was observed

cat.rate
: the ratio of catastrophes (years without successful reproduction) to total observation years

In the Using Models 1 lecture assignment, we used the t and Wilcoxon tests to test whether the catastrophe rate was equal to the late-filling rate.
In those test examples, we treated the data as continuous, but in fact the data are discrete, taking on integer values ranging from 0 to 7.
An alternative test for ratios is the binomial test, which is suitable for data consisting of a set of single trials in which each observation can take on one of two values; e.g., success or fail, present or absent, live or die.
We might ask the question: are reproductive success and failure equally likely?

We can investigate the answer with a two-sided binomial test for proportions. Pooling all of the data in the 14 ponds, we observed a total of 33 reproductive successes out of 61 total trials. How likely is a response of 33/61 if reproductive success and failure are equally likely, i.e., \(Pr(success) = 0.5\)?
n_success = sum(catrate$success)
n_years = sum(catrate$years)
binom.test(
x = n_success,
n = n_years,
p = 0.5)
##
## Exact binomial test
##
## data: n_success and n_years
## number of successes = 33, number of trials = 61, p-value =
## 0.6089
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.4084989 0.6693559
## sample estimates:
## probability of success
## 0.5409836
Note, we had to first compute the number of successes and the total number of years pooled across all 14 ponds. Also note the similarity in the syntax to dbinom(), but with slightly different argument names.
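As an aside, we can reproduce the exact two-sided p-value from dbinom() directly. This is a minimal sketch, assuming the definition that binom.test() uses: sum the probabilities of all outcomes that are no more likely than the observed one.

# probability of every possible outcome under H0: Pr(success) = 0.5
probs = dbinom(0:61, size = 61, prob = 0.5)

# two-sided exact p-value: total probability of outcomes no more likely
# than the observed 33 successes (the small tolerance guards against
# floating-point ties, as binom.test() itself does)
sum(probs[probs <= dbinom(33, size = 61, prob = 0.5) * (1 + 1e-7)])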
Recall that we know that ponds experience late filling in approximately 2 out of every 7 years:
Marbled salamanders lay eggs in dry vernal pool locations in the fall.
Late pond-filling can cause the desiccation and/or freezing of the eggs prior to inundation.
Let’s define variables to hold the late- and normal-filling rates:
late_fill_rate = 2/7
normal_fill_rate = 1 - late_fill_rate
Instead of testing the null hypothesis of equal probability of success and failure (i.e., p = 0.5), we might instead ask the question: is the observed success rate different from the rate expected under normal pond filling? In this scenario, we expect successful reproduction in approximately 5 of every 7 years.
We can modify the code we used above to test the observed reproduction success rate against the normal-filling rate:
binom.test(
x = n_success,
n = n_years,
p = normal_fill_rate)
##
## Exact binomial test
##
## data: n_success and n_years
## number of successes = 33, number of trials = 61, p-value =
## 0.004227
## alternative hypothesis: true probability of success is not equal to 0.7142857
## 95 percent confidence interval:
## 0.4084989 0.6693559
## sample estimates:
## probability of success
## 0.5409836
In addition, note again that the default test uses a two-sided alternative hypothesis.
We might instead prefer the one-sided alternative hypothesis that the observed success rate is less than the pond normal-filling rate. We can perform the test as follows:
binom.test(
x = n_success,
n = n_years,
p = normal_fill_rate,
alternative ='less')
##
## Exact binomial test
##
## data: n_success and n_years
## number of successes = 33, number of trials = 61, p-value =
## 0.002984
## alternative hypothesis: true probability of success is less than 0.7142857
## 95 percent confidence interval:
## 0.0000000 0.6507238
## sample estimates:
## probability of success
## 0.5409836
What do these binomial tests say about the success (or catastrophe) rate?
Here’s the t-test syntax. The Wilcoxon test syntax is similar.
t.test(catrate$cat.rate, mu = 2/7)
##
## One Sample t-test
##
## data: catrate$cat.rate
## t = 2.9595, df = 12, p-value = 0.01193
## alternative hypothesis: true mean is not equal to 0.2857143
## 95 percent confidence interval:
## 0.3526250 0.7261295
## sample estimates:
## mean of x
## 0.5393773
Now we’ve see a few one- and two-sample tests in practice. To summarize, we can use classical two-sample tests to:
The following sections present examples of each of these tests.
Before we can carry out a test to compare two means using classical methods (see below), we need to test whether the sample variances are significantly different.
The simplest test is called Fisher’s F test, based on the F-statistic.
The F-statistic represents the ratio between two variances.
In order to be significantly different, the ratio will need to be significantly smaller or larger than 1, depending on whether the smaller variance is in the numerator or denominator.
These data come from a field experiment designed to assess tree seedling response to understory vegetation treatments.
Specifically, four treatments (including a control) were randomly assigned to 32 plots in a randomized block design.
veg = read.csv(here("data", "vegdata.csv"))
head(veg)
## block plot date treatment birch pine fern
## 1 A A3 1995 control 0 4 260
## 2 A A7 1995 control 0 0 186
## 3 A A4 1995 removed 8 8 46
## 4 A A6 1995 removed 6 28 1
## 5 A A5 1995 mixed 0 1 309
## 6 A A8 1995 mixed 0 0 258
For the moment, we’ll only consider the pine seedling count (column
pine
) and we’ll ignore the block experimental design.
Let’s visualize the data using box plots, as follows:
boxplot(pine ~ treatment, data = veg)
It is apparent from the adjacent box plots that the means (and medians) differ among the treatments, but also that the variances differ as well.
Let’s test whether the variance in pine seedling count differs between two treatments:
First, let’s subset the data:
veg2 = droplevels(
subset(
veg,
treatment %in% c('control','clipped')
))
# verify that treatment is factorized
veg2$treatment = factor(veg2$treatment)
Now, perform the test:
var.test(
pine ~ treatment,
data = veg2)
##
## F test to compare two variances
##
## data: pine by treatment
## F = 119.3, num df = 7, denom df = 7, p-value = 1.92e-06
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 23.88333 595.86810
## sample estimates:
## ratio of variances
## 119.2951
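To demystify the statistic, we can compute the variance ratio by hand. A minimal sketch, assuming var.test() puts the first factor level ('clipped' sorts alphabetically before 'control') in the numerator:

# the F statistic is just the ratio of the two sample variances
v_clipped = var(veg2$pine[veg2$treatment == "clipped"])
v_control = var(veg2$pine[veg2$treatment == "control"])

# this should match the F value reported by var.test()
v_clipped / v_control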
What does Fisher’s F test say about equal variances for these two samples?
Visual inspection of the box plots indicates that this may be the case, but we might want to test for normality using one of the tests described previously, for example:
shapiro.test(veg2$pine[veg2$treatment=="control"])
##
## Shapiro-Wilk normality test
##
## data: veg2$pine[veg2$treatment == "control"]
## W = 0.89397, p-value = 0.2547
shapiro.test(veg2$pine[veg2$treatment=="clipped"])
##
## Shapiro-Wilk normality test
##
## data: veg2$pine[veg2$treatment == "clipped"]
## W = 0.81806, p-value = 0.04452
Note, because the Shapiro-Wilk test is a one-sample test, we had to select the records for each treatment and conduct separate tests.
Note also the use of logical subsetting!
If the results indicate that the data are non-normal, then we should use a non-parametric test of homogeneity of variances, such as the Fligner-Killeen test, as follows:
fligner.test(
pine ~ treatment,
data = veg2)
##
## Fligner-Killeen test of homogeneity of variances
##
## data: pine by treatment
## Fligner-Killeen:med chi-squared = 8.3686, df = 1, p-value =
## 0.003818
Thus far, we have been concerned with comparing the variances of two samples.
However, there are roughly equivalent tests for n-sample problems; i.e., when there are more than two groups.
The n-sample parametric test is called Bartlett’s test, which we can use to test for homogeneity of variances among all four treatment levels as follows:
bartlett.test(pine ~ treatment, data = veg)
##
## Bartlett test of homogeneity of variances
##
## data: pine by treatment
## Bartlett's K-squared = 63.741, df = 3, p-value = 9.323e-14
What does Bartlett’s test say about homogeneity of variances among all 4 treatments?
The non-parametric alternative, preferred by many statisticians, is the Fligner-Killeen test. We used it to test two variances above, but it can test n variances as well:
fligner.test(pine ~ treatment, data = veg)
##
## Fligner-Killeen test of homogeneity of variances
##
## data: pine by treatment
## Fligner-Killeen:med chi-squared = 19.707, df = 3, p-value =
## 0.0001952
Given what we know about the variation from replicate to replicate within each sample (the within-sample variance), how likely is it that our sample means were drawn from populations with the same mean?
If it is highly likely, then we shall say that our two sample means are not significantly different.
If it is rather unlikely, then we shall say that our sample means are significantly different.
But perhaps a better way to proceed is to work out the probability that the two samples were indeed drawn from populations with the same mean.
We’ve already met two simple tests for comparing two sample means (or medians): the Student’s t test and Wilcoxon’s rank-sum test.
The Student’s t test is appropriate when the samples are independent, the variances constant, and the errors normally distributed. We implemented it as follows for the pine seedling data in the control and clipped treatment plots:
t.test(
pine ~ treatment,
data = veg2)
##
## Welch Two Sample t-test
##
## data: pine by treatment
## t = 2.2825, df = 7.1173, p-value = 0.05582
## alternative hypothesis: true difference in means between group clipped and group control is not equal to 0
## 95 percent confidence interval:
## -0.5204871 32.5204871
## sample estimates:
## mean in group clipped mean in group control
## 17.875 1.875
Note that the output includes a 95% (by default) confidence interval on the difference between the sample means; t.test() reports this interval automatically, and you can change the level with the conf.level argument. A confidence interval that includes 0 indicates that the sample means are not significantly different.
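If you need the interval programmatically, it is stored on the fitted test object. A quick sketch (t_result is a name introduced here for illustration):

# store the test result, then extract the confidence interval component
t_result = t.test(pine ~ treatment, data = veg2)
t_result$conf.int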
What does the two-sample t test say about the differences in pine seedling counts between the control and clipped treatments?
Despite the rather large differences in the sample means (17.9 vs 1.9), can we say with confidence that the sample means are different?
Of course, the validity of Student’s t test depends on whether the assumptions have been met.
In this case, we know that variances are significantly different between treatments (see above).
In addition, we know that the “clipped” sample is somewhat non-normally distributed.
Hence, there is good reason to be suspect of the test result.
We’ve used the Wilcox test in lecture and lab before.
The Wilcoxon’s rank-sum test is appropriate when the samples are independent but the errors are not normally distributed, and is implemented as follows:
wilcox.test(
pine ~ treatment,
data = veg2)
## Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...):
## cannot compute exact p-value with ties
##
## Wilcoxon rank sum test with continuity correction
##
## data: pine by treatment
## W = 48, p-value = 0.1005
## alternative hypothesis: true location shift is not equal to 0
Do the results agree with the Student's t test? Which test result do you have more confidence in, and why?
Sometimes, two-sample data come from paired observations.
In this case, we might expect a correlation between the two measurements, either because they were made on the same individual, or were taken from the same location.
A positive covariance between the two samples reduces the variance of the difference between means, which makes it easier to detect significant differences between the means. Pairing is not always effective, because the correlation between samples may be weak. In general, however, if you can do a paired t test, then you should always do the paired test.
A good example comes from a data set of mice whose body mass was measured before and after a treatment.
You can obtain the dataset in the datarium package (you'll need to install it).
require(datarium)
data("mice2")
head(mice2)
## id before after
## 1 1 187.2 429.5
## 2 2 194.2 404.4
## 3 3 231.7 405.6
## 4 4 200.5 397.2
## 5 5 201.7 377.9
## 6 6 235.0 445.8
The syntax is slightly different because the data are in “wide” format: the before and after measurements are in separate columns, rather than the one-observation-per-row paradigm that we're used to.
t.test(mice2$before, mice2$after, paired = TRUE)
##
## Paired t-test
##
## data: mice2$before and mice2$after
## t = -25.546, df = 9, p-value = 1.039e-09
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -217.1442 -181.8158
## sample estimates:
## mean difference
## -199.48
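As a sanity check, a paired t-test is equivalent to a one-sample t-test on the within-pair differences; this sketch should reproduce the same t statistic and p-value:

# paired t-test == one-sample t-test on the differences
t.test(mice2$before - mice2$after, mu = 0)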
You should check the normality assumption to verify that a t-test is appropriate:
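Since the Shapiro-Wilk test is a one-sample test, we apply it to each column separately; these are the calls that produce the output below:

shapiro.test(mice2$before)
shapiro.test(mice2$after)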
##
## Shapiro-Wilk normality test
##
## data: mice2$before
## W = 0.92399, p-value = 0.3914
##
## Shapiro-Wilk normality test
##
## data: mice2$after
## W = 0.95283, p-value = 0.7021
According to the Shapiro test, things are good, but the fact that there are only 10 observations in each group might make us a little uncomfortable about assuming normality.
Let’s do a paired Wilcoxon test and compare results. You can figure out the syntax.
##
## Wilcoxon signed rank exact test
##
## data: mice2$before and mice2$after
## V = 0, p-value = 0.001953
## alternative hypothesis: true location shift is not equal to 0
Compare the p-values of the parametric and nonparametric tests.
Now, let’s compare the results of the paired tests to the results of an unapaired test.
t.test(mice2$before, mice2$after, paired = FALSE)
##
## Welch Two Sample t-test
##
## data: mice2$before and mice2$after
## t = -17.453, df = 15.667, p-value = 1.099e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -223.7515 -175.2085
## sample estimates:
## mean of x mean of y
## 200.56 400.04
When in doubt, choose the non-paired option.
We like to use paired tests when they are appropriate, but it’s not always clear when this is so.
Because a paired test makes additional assumptions about the data (i.e. that pairs of observations are meaningfully related), we can run into problems when there is no true paired structure in the data.
If you’re not sure, you should use the unpaired option; It is always a valid choice. Using a paired test on unpaired data doesn’t make sense, and can make your analysis invalid.
In the paired two-sample problem above, we suggested that a strong correlation between the two measurement variables will affect the test for significant differences between means or medians.
More generally, with any two continuous variables, x and y, the question naturally arises as to whether their values are correlated with each other.
Let's take the marbled salamander dispersal data introduced in a previous lab.
Recall that the data represent the dispersal of first-time breeders (ftb) from their natal ponds, representing juvenile dispersal, and the dispersal of experienced breeders (eb) from their established breeding ponds, representing adult dispersal.
The data set includes three variables:

dist.class
: distance class, based on 100 m intervals

disp.rate.ftb
: standardized dispersal rate for first-time breeders, which can be interpreted as a relative dispersal probability

disp.rate.eb
: standardized dispersal rate for experienced breeders, which can be interpreted as a relative dispersal probability

The question arises as to whether the dispersal rates for first-time breeders and experienced breeders are correlated.
First, we need to read in the data and check it, as follows:
disp = read.csv(here("data", "dispersal.csv"))
disp
## dist.class disp.rate.ftb disp.rate.eb
## 1 100 0.00 0.0
## 2 200 0.25 0.8
## 3 300 0.65 0.2
## 4 400 0.92 0.1
## 5 500 0.40 0.3
## 6 600 0.00 0.0
## 7 700 NA NA
## 8 800 0.20 0.0
## 9 900 0.05 0.0
## 10 1000 0.00 0.0
## 11 1100 0.15 0.0
## 12 1200 0.00 0.0
## 13 1300 0.12 0.0
## 14 1400 0.00 0.0
## 15 1500 0.00 0.0
plot(
disp.rate.ftb ~ disp.rate.eb,
data = disp,
main = "Marbled Salamander Dispersal Rates",
xlab = "Dispersal Rate\nFirst Time Breeders",
ylab = "Dispersal Rate\nExperienced Breeders",
pch = 21, col = 1, bg = "steelblue"
)
Next, we can test the significance of the correlation using the cor.test() function, as follows:
cor.test(
disp$disp.rate.ftb,
disp$disp.rate.eb,
use='complete.obs')
##
## Pearson's product-moment correlation
##
## data: disp$disp.rate.ftb and disp$disp.rate.eb
## t = 1.2084, df = 12, p-value = 0.2502
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2438293 0.7320180
## sample estimates:
## cor
## 0.3293598
Note: cor.test() automatically omits observations with missing values, such as the 700 m distance class (for which there are no ponds in this particular distance interval), so the test uses the 14 complete pairs. Strictly speaking, the use argument belongs to cor(); cor.test() silently ignores it.
What does the correlation test say about the significance of the correlation between juvenile and adult dispersal-distance functions?
If the data are non-normal, then a non-parametric rank-based measure of association is more appropriate.
If method is “kendall” or “spearman”, Kendall’s tau or Spearman’s rho statistic is used to estimate a rank-based measure of association
These tests may be used if the data do not necessarily come from a bivariate normal distribution.
Let’s try a test of the Spearman’s rank correlation:
cor.test(
disp$disp.rate.ftb,
disp$disp.rate.eb,
use='complete.obs',
method='spearman')
## Warning in cor.test.default(disp$disp.rate.ftb, disp$disp.rate.eb,
## use = "complete.obs", : Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: disp$disp.rate.ftb and disp$disp.rate.eb
## S = 102.99, p-value = 0.001169
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.7736421
What does this test say about the correlation between dispersal-distance functions? Does it agree with the parametric test? Which do you trust more with this data set?
Another way to compare two samples, whether they be paired or not, is to compare the empirical cumulative distributions of the samples.
This test is known for its wonderful name, the Kolmogorov-Smirnov test, and it is an extremely simple test for asking one of two different questions:

- Does a single sample come from a specified hypothesized distribution?
- Are two samples drawn from the same (unspecified) distribution?

The two-sample problem is the one most often used, and it is the one we will concern ourselves with here.
The apparently simple question is actually very broad.
It is obvious that two distributions could be different because their means were different.
But two distributions with exactly the same mean could be significantly different if they differed in variance, or in skew or kurtosis, or both.
The Kolmogorov-Smirnov test works on empirical cumulative distribution functions (ecdf).
Let’s see what the ecdf for the sample of juvenile dispersal rate looks like:
plot(
ecdf(disp$disp.rate.ftb),
verticals=TRUE,
main = "Mike's Plot of Marbled Salamanders\nFirst-Time Breeders: ECDF")
Now let’s add the ecdf for the adult dispersal rate, but change the line type (lty) so that we can distinguish it from the ecdf for the juvenile dispersal rate
plot(
ecdf(disp$disp.rate.ftb),
verticals=TRUE,
main = "Mike's Plot of Marbled Salamanders\nFirst-Time and Experienced Breeders: ECDF")
plot(
ecdf(disp$disp.rate.eb),
verticals=TRUE,
lty=3,
add=TRUE)
legend(
x = 0.4, y = 0.4,
lty = c(1, 3),
legend = c("first-time", "experienced"),
title = "Breeder Class")
Are these two distributions different? We can use the Kolmogorov-Smirnov test (ks.test) to determine if they differ significantly in any aspect.
The test statistic is the maximum difference in value of the cumulative distribution functions; i.e., the maximum vertical difference in the curves for a given value of X.
ks.test(disp$disp.rate.ftb,disp$disp.rate.eb)
##
## Exact two-sample Kolmogorov-Smirnov test
##
## data: disp$disp.rate.ftb and disp$disp.rate.eb
## D = 0.28571, p-value = 0.4265
## alternative hypothesis: two-sided
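As a check on our understanding, here is a minimal sketch of computing D by hand: evaluate both ecdfs at the pooled observed values (the largest gap must occur at one of them):

f_ftb = ecdf(disp$disp.rate.ftb)
f_eb = ecdf(disp$disp.rate.eb)

# D is the largest vertical gap between the two ecdf curves;
# this should match the D reported by ks.test()
vals = sort(unique(na.omit(c(disp$disp.rate.ftb, disp$disp.rate.eb))))
max(abs(f_ftb(vals) - f_eb(vals)))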
Is there enough evidence to suggest that the dispersal-distance relationship differs between first-time breeders and experienced breeders?
Suppose that only 4 female salamanders were killed crossing the road, compared to 16 males.
Before we can judge, of course, we need to know about the number of male and female candidates.
It turns out that 16 males were killed out of 250 male individuals crossing the road, compared to 4 deaths out of only 40 crossings for females.
Now, if anything, it looks like the females did worse than males in successfully crossing the road (10% mortality for females versus 6% for males).
The question arises as to whether the apparent positively biased mortality rate for females is statistically significant, or whether this sort of difference could arise through chance alone.
prop.test(
x = c(4,16),
n = c(40,250))
## Warning in prop.test(x = c(4, 16), n = c(40, 250)): Chi-squared
## approximation may be incorrect
##
## 2-sample test for equality of proportions with continuity
## correction
##
## data: c(4, 16) out of c(40, 250)
## X-squared = 0.24825, df = 1, p-value = 0.6183
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.07629446 0.14829446
## sample estimates:
## prop 1 prop 2
## 0.100 0.064
A significant p-value indicates that the proportions are different between samples; i.e., that the proportions observed were unlikely to have been drawn from the same underlying population.
Conversely, a non-significant p-value means that there is insufficient evidence to reject the null hypothesis, and we would conclude that the proportions are not statistically different.
What does the test of proportions indicate?
A great deal of data in ecology and conservation comes in the form of counts (whole numbers or integers): for example, counts of individuals, species, or events.
With count data, the number 0 is often a value of the response variable; in other words, there are often observations that receive a count of 0.
With contingency tables, counts are cross-classified according to one or more categorical contingent variables, where the contingencies are all the events of interest that could possibly happen.
A contingency table shows the counts of how many times each of the contingencies actually happened in a particular sample.
In a contingency table, each observation is cross-classified according to each of the categorical contingent variables; i.e., each observation is placed into one bin representing a unique combination of levels of the contingent variables.
Consider a sample of 40 forest stands where a survey determined that barred owls were either ‘present’ or ‘absent’ and where each stand was classified as either ‘young’ (<80 years) or ‘old’ (>80 years).
Thus, each of the 40 observations could be cross-classified into one of the four contingencies: present-old, present-young, absent-old, and absent-young.
We have the following 2x2 contingency table (with expected values shown in parentheses):
|  | old | young |
|---|---|---|
| present | 16 (12.5) | 4 (7.5) |
| absent | 9 (12.5) | 11 (7.5) |
We would like to know whether the observed counts differ from what we would expect if presence/absence was independent of stand age.

Do you remember how to calculate the expected values in a contingency table? If not, look it up: a cell's expected count is its row total times its column total, divided by the grand total. For example, the present-old cell: \((20 \times 25) / 40 = 12.5\).
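A minimal sketch of that calculation in R, using the table margins (the matrix mirrors the one we build for the test below):

# expected count for each cell = row total * column total / grand total
obs = matrix(c(16, 9, 4, 11), nrow = 2,
             dimnames = list(c("present", "absent"), c("old", "young")))
outer(rowSums(obs), colSums(obs)) / sum(obs)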
It is clear that the observed frequencies and the expected frequencies are different. But in sampling, everything always varies, so this is no surprise. The important question is whether the expected frequencies are significantly different from the observed frequencies.

We can assess the significance of the differences between observed and expected frequencies in a variety of ways. The usual way is with Pearson's chi-squared test (generalized linear models are an alternative). Do you remember how to calculate the chi-squared test statistic? If not, look it up: it is \(\chi^2 = \sum (O - E)^2 / E\), where \(O\) and \(E\) are the observed and expected counts, summed over all cells.
In R we can compute the chi-squared test as follows:
owls = matrix(c(16, 9, 4, 11), nrow=2)
rownames(owls) = c("present", "absent")
colnames(owls) = c("old", "young")
chisq_owls = chisq.test(owls)
chisq_owls
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: owls
## X-squared = 3.84, df = 1, p-value = 0.05004
The output of the chisq.test() function prints the value of the test statistic (X-squared: 3.84) as well as a p-value (0.05).
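Where does 3.84 come from? For a 2x2 table, chisq.test() applies Yates' continuity correction by default, subtracting 0.5 from each |O - E| before squaring. A quick sketch using the observed and expected tables stored on the result (both are shown below):

# reproduce the Yates-corrected statistic by hand
sum((abs(chisq_owls$observed - chisq_owls$expected) - 0.5)^2 /
      chisq_owls$expected)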
What does the result indicate?
The significant p-value doesn’t tell us whether barred owls are present more or less frequently than expected.
We can compare the observed and expected values of the contingency table for more insight:
The expected values
round(chisq_owls$expected, 1)
## old young
## present 12.5 7.5
## absent 12.5 7.5
The observed values
chisq_owls$observed
## old young
## present 16 4
## absent 9 11
Were there more, or fewer, than expected owls observed in the young forests?
It can be a pain to visually compare the two contingency tables. We can instead calculate the residuals, in other words the differences between the observed and expected values:
round(
chisq_owls$observed - chisq_owls$expected,
digits = 1)
## old young
## present 3.5 -3.5
## absent -3.5 3.5
Now it’s easy to see that there are fewer owls present in the young forests. We can tell because theresidual is negative.
You might recall that Pearson’s chi-squared test expects the expected cell values to be large, generally greater than 4 or 5. As you can seen in the table above, we have met this rule of thumb with the barred owl data, so the Pearson’s chi-squared test is appropriate.
However, when one or more of the expected frequencies is less than 4 (or 5), then it is wrong to use Pearson’s chi-squared test for your contingency table. This is because small expected values inflate the value of the test statistic, and it can no longer be assumed to follow the chi-square distribution
In this case, an alternative test called Fisher’s exact test is more appropriate
We use the function in the same way as before:
fisher.test(owls)
##
## Fisher's Exact Test for Count Data
##
## data: owls
## p-value = 0.04837
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 1.00906 26.52895
## sample estimates:
## odds ratio
## 4.683421
Does the conclusion change? Note, the procedures described in this section can all be used with contingency tables much bigger than 2x2.
The final data set represents standardized breeding bird counts (bird) and a variety of habitat variables (hab) across 1046 sample plots in the Oregon Coast Range. See birdhab.meta.pdf for a complete description of the data sets.
Here, we are interested in knowing whether the presence/absence of brown creepers varies between the interior and edge of forest stands.
Read in the raw data and create a 2x2 contingency table consisting of counts of brown creeper presence (1) versus absence (0) and forest edge (E) versus interior (I):
birds = read.csv(here("data", "bird.sta.csv"))
hab = read.csv(here("data", "hab.sta.csv"))
birdhab = merge(
birds,
hab, by=c("basin", "sub", "sta"))
# Create a contingency table for edge/interior and brown creeper presence/absence
table(
birdhab$s.edge,
birdhab$BRCR > 0)
##
## FALSE TRUE
## E 144 29
## I 559 314
# set the presence to be in the first column
br_creeper_table = table(
birdhab$s.edge,
birdhab$BRCR > 0)[, 2:1]
br_creeper_table
##
## TRUE FALSE
## E 29 144
## I 314 559
Make sure you understand the script above:
First, we read in the bird and habitat data and merged them into a single file based on the common fields.
Then we used the table() function to compute the cross-classified counts.
What code converted the Brown Creeper counts to presence/absence?
The next step simply switched the order of the columns so that the present counts were in the first column and the absent counts were in the second column, as this is the expected order in some functions (e.g., prop.test()).
Are brown creepers present more or less frequently than expected in forest interiors versus forest edges and is the difference significant?
Before you start the questions, I recommend you work your way through the DataCamp tutorial on R formulas.
Don’t worry if it doesn’t all make sense yet. You can stop at the “How To Inspect Formulas” section.
You’ve already used formulas with the boxplot()
and
aggregate()
functions. You’ll be using them a lot to build
models as well, so it’s worth the effort to get familiar with them!
A quick pre/re-view of building models for ANOVA
In the lecture assignments, we’ll be conducting Analyses of Variances on the penguin data. In these lab questions we’ll practice checking some of the ANOVA assumptions.
We’ll use the lm()
function, which stands for “linear
model”, to build models that we can use for ANOVA. To build a simple
model of flipper length as a function of species, I would use the
following syntax:
require(palmerpenguins)
fit_fl_sp =
lm(
formula = flipper_length_mm ~ species,
data = penguins)
Build the following models:

- A model (fit_species) of penguin body mass as predicted by penguin species.
- A model (fit_sex) of penguin body mass as predicted by sex.
- A model (fit_both) of penguin body mass as predicted by species and sex. This should be an interactive model, i.e., it should include a sex and species interaction.
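Here is one possible way to write these fits; treat it as a sketch, not the only correct answer. The interaction model can be written with the * operator:

fit_species = lm(body_mass_g ~ species, data = penguins)
fit_sex = lm(body_mass_g ~ sex, data = penguins)

# sex * species is shorthand for sex + species + sex:species
fit_both = lm(body_mass_g ~ sex * species, data = penguins)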
We know that the Group 1 methods require homogeneity (constant variance). Conditional boxplots can be a great way to visually check for homogeneity! If the homogeneity assumption holds, we would expect all of the groups to have similar variability: the boxes should all be about the same width.
As you know, the boxplot() function will accept formula and data arguments, just like you used to build your fits.
Create conditional boxplots corresponding to each of your model fits.
Hint: the first draft of my doubly conditional boxplot of species and sex looked like this:
Note the missing labels for some of the factor combinations on the x-axis.
- Check out the names argument for a way to create custom names to be plotted under each box.
- The las argument controls the orientation of the axis labels.
- In the example below, don't worry about the rnorm(100) ~ rbinom(100, 1, 0.5) part; focus on the main and names argument values:

boxplot(
rnorm(100) ~ rbinom(100, 1, 0.5),
main = "Plot of \n 100 Random Numbers",
names = c("group 1", "group 2"))
I was able to make some adjustments for the second draft:
Create a conditional boxplot for each fit:

- the fit_species model
- the fit_sex model
- the fit_both model
Review the walkthrough to see how to test for homogeneity of variances.
Use bartlett.test()
to check for homogeneity of the
variances in each of your model fits.
Unfortunately, bartlett.test() doesn't work with the formula syntax you used to make your two-way model or your doubly-conditional boxplot. You'll need to use your friend aggregate() to separate the observations by the two factors sex and species.
You need to figure out how to extract separate vectors of body masses for each of the sex/species groups.
Hint: Here’s some code that I used to extract the body mass observations, grouped by island (you’ll use sex and species):
dat_groups = aggregate(
body_mass_g ~ island,
data = penguins,
FUN = c)
str(dat_groups)
## 'data.frame': 3 obs. of 2 variables:
## $ island : Factor w/ 3 levels "Biscoe","Dream",..: 1 2 3
## $ body_mass_g:List of 3
## ..$ : int 3400 3600 3800 3950 3800 3800 3550 3200 3150 3950 ...
## ..$ : int 3250 3900 3300 3900 3325 4150 3950 3550 3300 4650 ...
## ..$ : int 3750 3800 3250 3450 3650 3625 4675 3475 4250 3300 ...
Hint 2: I can use the $ operator to extract a list containing the body mass observations in each group. The name of the list is body_mass_g:

dat_groups$body_mass_g
Check the help entry for bartlett.test(). Can you use a list as the value for the x argument?

In 2015, street trees in three neighborhoods in Tampa, FL were surveyed for defects and assessed for probability of failure (Nelson et al., 2022).
The failure probability assessment considers individual tree characteristics such as DBH, crown width, and defects (such as decay, root problems, or leaning trunks). The tree assessment method classifies trees into four ascending classes of failure probability with 1 indicating low probability and 4 extreme probability of failure.
In 2017, Hurricane Irma made landfall in Florida causing extensive destruction. Following the hurricane, the trees were revisited to assess storm damage.
You’ll use a modified version of this survey dataset to practice some of the techniques form the lab walkthrough.
Retrieve the trees_FL.csv file and read it into a data.frame object called dat_fl.

Create barplots of the counts in each of the following columns:

- ProbabilityofFailure (the probability-of-failure ratings)
- Failure_Standardized (the standardized failure types)

The table() and barplot() functions will be your friends.
Three types of failure were recorded in the survey. A question of interest is whether there are differences in the characteristics of trees with each type of failure.
For example, we can compare the distributions of DBH within each failure category.
The distribution of DBH for the trees with branch failure is clearly different than the distributions for whole-tree failures and intact trees. The ‘none’ and ‘whole tree’ classes may be more similar.
Conduct a Kolmogorov-Smirnov test on the distributions of DBH for intact trees and the trees with whole-tree failures.
HINT: It may be convenient to create subsets of the FL tree data, one each for the none and whole failure types.
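A minimal sketch of the subsetting, assuming the failure types are coded "none" and "whole" (as in the contingency table below) and that the diameter column is named DBH; check names(dat_fl) to confirm:

# split the FL tree data by failure type (hypothetical object names)
dat_fl_none = subset(dat_fl, Failure_Standardized == "none")
dat_fl_whole = subset(dat_fl, Failure_Standardized == "whole")

# the two DBH samples are then dat_fl_none$DBH and dat_fl_whole$DBH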
Challenge: Try to re-create my density plot.
Take a look at your DBH/tree height scatterplot. Do you see a relationship between the two variables?
Let’s do a formal test to test to support our graphical intuition.
One of the goals of a study on the trees was to determine whether the probability of failure ratings were effective. In other words, were trees with a higher probability of failure rating more likely to experience a failure?
We can use a Chi-square test on a contingency table of failure types and failure probability ratings:
| Failure Type | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| branch | 97 | 154 | 156 | 31 |
| none | 1239 | 670 | 421 | 53 |
| whole | 71 | 65 | 50 | 32 |
Let’s simplify our table (and the interpretation of our results) by combining the branch and whole tree failures into a single category.
We can use some slightly ugly R code to accomplish this:
# TRUE when a tree experienced any type of failure
dat_fl$fail = factor(dat_fl$Failure_Standardized != "none")

# FALSE sorts before TRUE, so the first level corresponds to non-failures
levels(dat_fl$fail) = c("No Fail", "Fail")
Now our contingency table looks like:
fl_table_2 = table(
dat_fl$ProbabilityofFailure,
dat_fl$fail)
fl_table_2
##
## No Fail Fail
## 1 1239 168
## 2 670 219
## 3 421 206
## 4 53 63
Conduct a chi-square test on this contingency table and calculate the chi-square residuals.
Hint: Review the chi-square test code in the walkthrough.