Overview

Concepts and objectives

Spatial concepts:

Complete Spatial Randomness
Poisson Process
Quantifying aggregation via K, pcf, and G functions
Simulating point patterns

Programming and statistical concepts:

Calculating confidence regions via simulation.
Creative use of plot characters and colors.

Point Patterns

Work through sections 4.3.1 through 4.3.4, then do section 4.3.8.

There are a few inconsistencies between the book code and the supplemental code.

You should compare the book code to the supplemental code provided by the authors (in Moodle).

One pitfall:

The text code specifies a file called cactus_boundary.csv (singular), however the file is actually cactus_boundaries.csv (plural).
You’ll need to adjust your code accordingly.

I’ll leave it up to you to find the others. Feel free to add any errors and fixes you find to the course wiki.

Note: Before you work through section 4.3.8, but after you have done 4.3.1 to 4.3.4, run the following code:

pp.xy <- ppm(ppp.cactus, ~ x + y)
pp.xy2 <- ppm(ppp.cactus, ~ polynom( x, y, 2))
pp.xy.pred <- predict.ppm(pp.xy2, type = "trend")

After you feel comfortable with the example code from these sections, work on the exercises in the next section.

Note: Recall that I created a Wiki for us to document errata and discrepancies in the F & F text and code examples.

As you work through the book examples, you may want to double check that the F & F supplemental data files are consistent with the examples in the book.

If you find any errors, please document them in the Wiki. A good entry should include the location in the text (section number, page number), a description of the error, and an example of how to correct it.

Simulation Exercises

Poisson Process

Plot CSR patterns

You simulated Poisson processes in the book examples above using envelope() and rpoispp. Recall that you used envelope() to estimate 95% confidence regions for the Opuntia humifusa data.

Create a new owin object with x and y ranges of 0 to 10.

Now simulate several Poisson processes using rpoispp(). Plot several replicated simulated patterns created with the same value of lambda to get an intuitive feel for how variable CSR can look.

Try some extreme values of lambda.

Hint: to plot point patterns with very dense points, you can use translucent colors so that points on top of each other are still visible.

You can make a translucent gray using rgb(0, 0, 0, 0.2).

A dense CSR figure

Choose a value of lambda that produces a point pattern with at least 20000 points. Experiment with different translucent colors and plotting characters (I like pch = 16). Try to create a figure that resembles this:

Dense CSR pattern

Confidence Envelope

You used envelope() to calculate a confidence region for the cactus L-function values above.

The plot of the envelope results shows upper and lower bounds for the 95% region, but it doesn’t show the L-values of the individual simulations it carried out.

To gain insight on how the confidence regions are built from the simulations, let’s dissect the output from envelope() to find the L-values for each simulation.

The first thing to do is to re-run envelope() on the cactus L-values with the argument savefuns = TRUE. This tells the function to save all of the simulated L-values.

Let’s assume you save your output of envelope() to a variable called csr_env_cactus.

Envelope object dissection

The following code digs into the structure of so-called S3 objects in R. I don’t expect you to learn all about S3 and S4 objects or how to access their components so I’ll provide the needed code to get the pieces we need:

# simulation envelopes
csr_env_cactus = envelope(
  ppp.cactus, 
  Lest,
  nsim = 99,
  rank = 1,
  corection = "translate",
  global = FALSE,
  savefuns = TRUE)

## Generating 99 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78,
## 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,  99.
## 
## Done.

You must use the argument savefuns = TRUE, otherwise R will not save the intermediate simulation data.

# The attr() function can retrieve components of S3 objects that are saved as 'atributes'
# The simulated L-function results are stored in an attribute called 'simfuns'

# Retrieve the simulation results as a data frame:
envelope_sims = data.frame(attr(csr_env_cactus, "simfuns"))
head(envelope_sims)[, 1:5]

##            r sim1 sim2 sim3 sim4
## 1 0.00000000    0    0    0    0
## 2 0.02694434    0    0    0    0
## 3 0.05388867    0    0    0    0
## 4 0.08083301    0    0    0    0
## 5 0.10777734    0    0    0    0
## 6 0.13472168    0    0    0    0

Now we have a dataframe with the radius in the first column, and the corresponding L-values for each of the simulations in the remaining columns. The L-function plots are easier to read if we subtract the value of the radius from the L-values. (This centers the L-values around 0).

We can subtract the radius from all of the remaining columns using some data.frame index magic:

# Make a copy of the data to work on:
envelope_sims2 = envelope_sims
envelope_sims2[, -1] = envelope_sims[, -1] - envelope_sims2$r

Plot L with matplot

R has a handy function matplot() that we can use to plot all of the columns in a dataframe.

The syntax for matplot can be a little confusing so I’ll provide a short demo:

matplot(x = envelope_sims2$r, y = envelope_sims2[, -1][, 1:5], 
        type = "l", ylab = "L(r)", xlab = "r")

A few things to note: I used the square brackets twice. The first set removed the first column (since it contains the radius, we don’t want to plot it). The second set pulled out the first 5 columns of the result.

Envelope figure

Try running the code, but plotting more than the first 5 columns. The plots can get messy because of the many overlapping lines.

Next, try to use the col argument to plot the lines in a translucent color so that we can see lines stacked on top of each other.

Hint: check out the gray() function, specifically the alpha = argument.

Try to reproduce a figure that looks like this:

L Envelope Simulations 1

Envelope figure 2

Now, overplot the calculated L-values from the Opuntia data using a red line. L Envelope Simulations 1

Marked Point Patterns

Work through the R examples in F&F section 4.3.5.

Select one of the following datasets provided in spatstat:

waka
longleaf
anemones
spruces

You can read more about them here:

https://cran.r-project.org/web/packages/spatstat/vignettes/datasets.pdf

Repeat the mark correlation function analysis from section 4.3.5 on your chosen dataset.

Aggregated Point Patterns

Matern plots

Here is an adaptation of part of the code I used to make the Matern cluster figure in the slides:

# kappa is the intensity (lambda) of the parent points
kappa_1 = 1

# scale is the radius of the offspring clusters
scale_1 = 0.2

# mean number of points per cluster.
# (I think the final count is Poisson-distributed -- have to dig into the code to see.)
mu_1 = 4

# Different sized windows for scale comparison
win_small = owin(xrange = c(0, 10), yrange = c(0, 10))
win_large = owin(xrange = c(0, 30), yrange = c(0, 30))

# plotting parameters
pch_mat = 16
cex_mat = 0.8

mat_pp_small = rMatClust(kappa_1, scale_1, mu_1, win = win_small)
mat_pp_large = rMatClust(kappa_1, scale_1, mu_1, win = win_large)

e_lest_mat_small =  envelope(
  mat_pp_small, 
  fun = Lest,
  nsim = 99,
  rank = 1,
  correction = "isotropic",
  global = FALSE)


e_lest_mat_large =  envelope(
  mat_pp_large,
  fun = Lest,
  nsim = 99, 
  rank = 1, 
  correction = "isotropic", 
  global = FALSE, funargs = list("rmax" = 20))

e_lest_mat_small =  envelope(
  mat_pp_small,
  fun = Lest,
  nsim = 99,
  rank = 1,
  correction = "isotropic",
  global = FALSE,
  funargs = list("rmax" = 5))

plot(e_lest_mat_small, . - r ~ r)
plot(e_lest_mat_large, . - r ~ r)

par(mfrow = c(2, 1))
plot(
  mat_pp_small,
  pch = pch_mat, cex = cex_mat,
  main = "Matern Process - 10 by 10 window ")
plot(
  mat_pp_large,
  pch = pch_mat, cex = 0.3 * cex_mat,
  "Matern Process - 30 by 30 window ")

Tinker with my code to get a feel for how the lambda, kappa, and mu parameters affect the pattern you see.

If there are differences in the large- and small-window plots with different parameter values, try to qualitatively describe what you see.

How do your qualitative descriptions change with different aggregation parameters?

Be sure to try out many different values for each of these parameters so that you understand how changing them qualitatively affects the pattern you see.

Matern L-function

When you have an intuitive sense of the Matern process, perform an L-function analysis using Lest.

Plot your results for the pattern at both scales. See if your qualitative descriptions line up with what the L-function analysis plots show.

Note: you may need to tell Lest() to extend the radius beyond its default value to see the full pattern. You can use the rmax argument to do this.

You can try to use envelope() to simulate confidence regions, but note that envelope() can be slow for point pattern objects that contain many points. (Why?)

Lab Report

Compile your answers the lab questions below into a report and submit via Moodle.

Q1 (2 pts.): Paste your recreation of the dense CSR figure into your document.

Q2 (2 pts.): Include your recreation of the envelope figures 1 and 2 into the document.

Q3 (5 pts.): Using non-technical language (imagine that you are explaining the figure to a group of kids) describe what the envelope figure 2 is showing. Be sure to explain the red line, the confidence region, and a take-home conclusion from the figure.

Q4 (3 pts.): Include a plot of your mark-correlation analysis of your chosen marked-point dataset. The figure should be similar to F&F figure 4.9b.

Q5 (5 pts.): Briefly summarize your conclusions from the mark-correlation analysis of your chosen dataset.

Q6 (2 pts.): Paste a representative plot of your Matern point patterns at each of the two different scales (see my examples using win_small and win_large above). Include appropriate titles and axis labels. Make sure you use different values for kappa, scale, and mu than those in my example. There should be one plot at the large scale (win_large) and one at small scale (win_small).

Q7 (2 pts.): Paste plots of your L-function analysis for 2 scales of the Matern process. Include appropriate titles and axis labels.

Q8 (5 pts.): Qualitatively describe the patterns you see in the two Matern patterns. Focus on the arrangement of individual points and clusters of points.

Some possible considerations:
Are the clusters well-separated? How clumpy are points within the clusters?
What are some differences you see in the 10 by 10 and 100 by 100 patterns?

Q9 (5 pts.): Describe the shapes of the L-function curves of your Matern point patterns. Be sure to relate the configurations you see in the point pattern plots with the plots of L(r).

Things to consider:
Were the shapes of the L-function plots what you expected after seeing the point patterns?
Can you explain an intuitive relationship between the point- and L-lots?
Do the L-function plots have peaks?
Are the L-functions less than or greater than the expectation for CSR for any of the radii?

Spatial Data in R

Lab: Point Pattern Analysis In R

Michael France Nelson

Spring 2022