Comma-Separated Values (CSV) File Format

CSV (Comma-Separated Values) is a commonly used, standardized format for storing data that fit into a table of rows and columns. A CSV file is a plain text file that follows these conventions:

  • Each row in the CSV file corresponds to a row in the table.
  • Within each row, the elements of each column are separated by commas.
  • The first row of the CSV file contains the names of the columns.
  • Each additional row contains the data entries for one individual observation.

Typically, the columns of these tables represent variables or attributes and the rows represent individual observations (technically, rows are sampling units, a term we will encounter later).
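The conventions above can be sketched in a few lines of Python using the standard-library csv module. This is only an illustration: the column names, the values, and the helper table_to_csv are hypothetical and are not part of the course data.

```python
import csv
import io

# Hypothetical table: column names first, then one list per observation.
header = ["site", "species_count"]
rows = [
    [1, 44],
    [2, 50],
]


def table_to_csv(header, rows):
    """Return the table as CSV text: a header row, then one line per row."""
    buffer = io.StringIO(newline="")
    # lineterminator="\n" keeps the output easy to read in this example;
    # the csv module's default line ending is "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)   # first row: the names of the columns
    writer.writerows(rows)    # remaining rows: one observation each
    return buffer.getvalue()


csv_text = table_to_csv(header, rows)
print(csv_text)
```

Each printed line is one row of the table, with a comma between the values of neighboring columns.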

Row-Data Paradigm

The CSV format exemplifies the row-data paradigm in which:

  • Each row is an observation.
  • Columns represent attributes, or variables, of the observation.

Gardener refers to this as the data-recording format in Chapter 2.

CSV Example

Suppose you had collected the following species accumulation data in order to perform a rarefaction analysis:

Site Number   Sample Number   Distinct Species Count   Total Individuals
1             4               44                       123
2             2               44                       100
3             2               50                       142
4             13              20                       77

The data columns are:

  1. The site at which the sample was taken
  2. The number of the sample (multiple samples were taken at each site)
  3. The total number of distinct species observed in the sample
  4. The total number of individuals (regardless of species) in the sample

The text stored in a CSV-formatted text file would look like this:

Site Number, Sample Number, Distinct Species Count, Total Individuals
1, 4, 44, 123
2, 2, 44, 100
3, 2, 50, 142
4, 13, 20, 77
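As a minimal sketch of how a program might read this text back into rows of observations, the Python example below uses the standard-library csv module. It assumes the text is held in a string (here named csv_text, a name chosen for illustration) rather than in a file on disk. Because the example text has a space after each comma, the reader is given skipinitialspace=True so that those spaces are not treated as part of the values.

```python
import csv
import io

# The CSV text from the example above, held in a string for illustration.
csv_text = """\
Site Number, Sample Number, Distinct Species Count, Total Individuals
1, 4, 44, 123
2, 2, 44, 100
3, 2, 50, 142
4, 13, 20, 77
"""

# DictReader uses the first row as the column names.
# skipinitialspace=True drops the space that follows each comma.
reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)

# Each row becomes one observation: a dict mapping column name to value.
# All four columns hold whole numbers, so each value is converted to int.
observations = [
    {name: int(value) for name, value in row.items()}
    for row in reader
]

for obs in observations:
    print(obs["Site Number"], obs["Distinct Species Count"])

# Summing one column across all observations.
total_individuals = sum(obs["Total Individuals"] for obs in observations)
print(total_individuals)
```

To read from a file instead, the io.StringIO(csv_text) argument could be replaced by a file object opened with open(path, newline=""), where path is whatever file name you saved the data under.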