Introduction to Quantitative Ecology
In-class Random Numbers and Plotting Practice:
One of the best ways to improve your R skills and comfort level is to experiment with lots of variations on code templates.
One of the best ways to learn about probability distributions and their relationship to real data is to experiment with random number generation. Today, we’ll practice generating random numbers using both the normal and uniform distributions.
Upload a document with answers to questions 1 - 4 (2 points each)
Only one member of your group needs to submit the report, but everybody in your group should keep a copy of the code you used to do the exercises.
You can do your work in an RMarkdown document and knit it to HTML.
Alternatively, you may do your work in a Word document (or Google Doc), pasting your figures into the document and saving it as a PDF.
Because R is a programming language specialized for statistical analysis, it has some sophisticated random number generators built in.
Note: The proper term for random numbers generated via computer is pseudorandom.
We expect a CPU to always produce the same results when we give it the same instructions. Because of this, current computers cannot produce truly random numbers (but quantum computers may be able to one day).
We do have very good algorithms for producing sequences of numbers that have the statistical properties of sequences of truly random numbers.
One desirable property of pseudorandom numbers is that we can choose the number that R uses as the generator's starting key, called the random seed.
When we specify a seed, R will always create the same sequence. This is useful when we want to test different code on the same data.
set.seed(12345)
rnorm(n = 4)
## [1] 0.5855288 0.7094660 -0.1093033 -0.4534972
set.seed(12345)
rnorm(n = 4)
## [1] 0.5855288 0.7094660 -0.1093033 -0.4534972
Notice what happens if I don’t set the random seed:
rnorm(n = 4)
## [1] 0.6058875 -1.8179560 0.6300986 -0.2761841
rnorm(n = 4)
## [1] -0.2841597 -0.9193220 -0.1162478 1.8173120
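The seed behavior shown above can also be checked in code rather than by eye. The following is a minimal sketch; the variable names first_draw, second_draw, and third_draw are made up for this illustration.

```r
# Draw the same four values twice by resetting the seed before each call.
set.seed(12345)
first_draw = rnorm(n = 4)

set.seed(12345)
second_draw = rnorm(n = 4)

# Both draws started from the same seed, so they match exactly.
identical(first_draw, second_draw)
## [1] TRUE

# Without resetting the seed, the generator continues along its sequence,
# so the next four values are different.
third_draw = rnorm(n = 4)
identical(first_draw, third_draw)
## [1] FALSE
```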
R’s random number functions are all based on probability distributions.
Some of the most famous distributions are the Normal, Uniform, Binomial, and Poisson distributions. The corresponding R random number generating functions are:
rnorm()
runif()
rbinom()
rpois()
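As a quick sketch, here is one call to each of the four functions. The seed, the parameter values, and the variable names are arbitrary choices for illustration; they are not part of the exercise.

```r
set.seed(1)  # arbitrary seed so the example is repeatable

# Normal: continuous values centered on `mean`, with spread `sd`
normal_draws = rnorm(n = 5, mean = 10, sd = 1.5)

# Uniform: continuous values between `min` and `max`
uniform_draws = runif(n = 5, min = 2, max = 20)

# Binomial: number of successes out of `size` trials, each with probability `prob`
binomial_draws = rbinom(n = 5, size = 10, prob = 0.5)

# Poisson: non-negative integer counts with average rate `lambda`
poisson_draws = rpois(n = 5, lambda = 3)

normal_draws
uniform_draws
binomial_draws
poisson_draws
```

Notice that the first two functions return decimal numbers, while the last two return whole-number counts.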
Here is a demo using the uniform distribution for x-values and the Normal distribution for y-values:
# generate a sequence of 20 normally distributed numbers:
rnorm(n = 20, mean = 10, sd = 1.5)
## [1] 10.555942 10.780325 8.874202 11.225350 8.670464 9.502634 11.681069
## [8] 10.448086 11.169433 12.183678 9.033507 7.670294 7.603436 12.707646
## [15] 9.277529 10.930570 10.918185 9.756534 11.217810 13.295250
# generate two sequences that you can use as coordinates to make a plot
n_pts = 3000
x = runif(n = n_pts, min = 2, max = 20)
y = rnorm(n_pts, mean = 4, sd = 0.75)
plot(
  x, y,
  main = "Scatterplot of random numbers",
  col = adjustcolor("steelblue", 0.3))
We can also plot histograms of values of the x- and y-coordinates:
hist(x, main = "Histogram of 3000 uniformly distributed random numbers")
hist(y, main = "Histogram of 3000 normally distributed random numbers")
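Besides plotting, one way to see how generated numbers relate to the distribution they came from is to compare summary statistics with the parameters you supplied. This is a rough sketch that regenerates x and y with the same settings as the demo; your exact values will differ from run to run unless you set a seed.

```r
n_pts = 3000
x = runif(n = n_pts, min = 2, max = 20)
y = rnorm(n_pts, mean = 4, sd = 0.75)

range(x)  # smallest and largest x: both fall between min = 2 and max = 20
mean(y)   # should be close to, but not exactly, mean = 4
sd(y)     # should be close to, but not exactly, sd = 0.75
```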
Histograms of uniformly distributed numbers
Experiment with the runif function. Check out the help entry to see how to use the arguments:
n
min
max
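If you are not sure how to find the help entry, here is one possible way to do it from the console. help() and formals() are standard R functions; the variable name runif_args is just for this illustration.

```r
# Open the help page for runif (typing ?runif does the same thing)
help("runif")

# List the argument names without opening the help page
runif_args = names(formals(runif))
runif_args
```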
Try to create sequences of different lengths and print them to your console: 5, 50, 500.
What are the default upper and lower bounds of the random numbers? How can you change these?
Plot histograms with the following uniform random number sequences. You can use this code as a template:
hist(x = runif(n = , min =, max =))
NOTE: you’ll have to fill in numbers for n, min, and max for the code to work.
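For example, a filled-in version of the template might look like the sketch below. The seed and the values of n, min, and max are arbitrary illustrations, not the ones to use in your answer, and the variable u is just a name chosen here so that the numbers can be inspected as well as plotted.

```r
# Illustrative values only; choose your own n, min, and max for the exercise.
set.seed(42)
u = runif(n = 25, min = -1, max = 1)
hist(x = u, main = "Histogram of 25 uniformly distributed random numbers")
```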
Your answer needs to contain your histograms and your predictions.
Describe the differences in appearance in the histograms as you increased the number of randomly-generated numbers. Did they meet your predictions?
Experiment with the rnorm function.
For now, we’ll use the default values for the mean and sd arguments.
Histograms of normally distributed numbers
Before you create any plots, discuss how you expect histograms made from small numbers of points to differ from histograms made from large numbers of points. Write down your predictions before you make any plots. Make sure you describe your predictions of how the following will change as you increase the number of points:
Plot histograms with the following normally-distributed random number sequences. You can use this code as a template:
hist(x = rnorm(n = ))
NOTE: you’ll have to fill in the appropriate number for n.
Your answer needs to contain your histograms and your predictions.
Describe the differences in appearance in the histograms as you increased the number of randomly-generated numbers. Did they meet your predictions?