Comma-Separated Values (CSV) File Format
CSV (comma-separated values) is a widely used plain-text format for storing data that fit into a table of rows and columns. A CSV file is simply a text file that follows these conventions:
- Each row in the CSV file corresponds to a row in a table.
- Within each row, the elements in each column are separated by a comma.
- Typically, the columns of these tables represent variables or attributes, and the rows represent individual observations (technically, rows are sampling units, a term we will encounter later).
- The first row of a CSV file contains the names of the columns.
- Each additional row contains the data entries for one individual observation.
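To make these conventions concrete, here is a minimal sketch that splits CSV text into a header and data rows by hand. The function name `parse_simple_csv` and the two-column sample text are made up for illustration. The sketch also assumes that no value contains a comma, a quote, or a line break, which real files may violate; a dedicated CSV reader is the safer choice for those.

```python
def parse_simple_csv(text):
    """Split simple CSV text into (header, rows).

    Assumes no field contains commas, quotes, or line breaks.
    """
    # One table row per non-empty line of text.
    lines = [line for line in text.splitlines() if line.strip()]
    # The first line holds the column names, separated by commas.
    header = [name.strip() for name in lines[0].split(",")]
    # Every remaining line holds the values for one observation.
    rows = [[value.strip() for value in line.split(",")] for line in lines[1:]]
    return header, rows


# Hypothetical two-column table used only to exercise the function.
sample = "plot,height_cm\nA,12\nB,15\n"
header, rows = parse_simple_csv(sample)
print(header)
print(rows)
```

Note that every value comes back as text; converting entries to numbers is a separate step.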
Row-Data Paradigm
The CSV format exemplifies the row-data paradigm in which:
- Each row is an observation.
- Columns represent attributes, or variables, of the observation.
Gardener refers to this as the data-recording format in Chapter 2.
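One way to see the row-data paradigm in code is to read each row as a mapping from column name to value. The sketch below uses `csv.DictReader` from Python's standard library; the column names `site` and `species_count` and their values are hypothetical and serve only as an illustration.

```python
import csv
import io

# Hypothetical CSV text: one header row, then one row per observation.
text = "site,species_count\n1,44\n2,50\n"

# DictReader uses the first row as keys, so each later row
# becomes one dictionary, i.e. one observation.
observations = list(csv.DictReader(io.StringIO(text)))

for obs in observations:
    # Each attribute (column) of the observation is looked up by name.
    print(obs["site"], obs["species_count"])
```

Each dictionary is a single observation, and each key is one of its attributes. The values are still strings at this point.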
CSV Example
Suppose you had collected the following species accumulation data in order to perform a rarefaction analysis:
Site Number | Sample Number | Distinct Species Count | Total Individuals |
---|---|---|---|
1 | 4 | 44 | 123 |
2 | 2 | 44 | 100 |
3 | 2 | 50 | 142 |
4 | 13 | 20 | 77 |
The data columns are:
- The site at which the sample was taken
- The number of the sample (multiple samples were taken at each site)
- The total number of distinct species observed in the sample
- The total number of individuals (regardless of species) in the sample
The text stored in a CSV-formatted text file would look like this:
Site Number, Sample Number, Distinct Species Count, Total Individuals
1, 4, 44, 123
2, 2, 44, 100
3, 2, 50, 142
4, 13, 20, 77
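As a rough sketch of how this text could be loaded for analysis, the example below reads it with Python's standard `csv` module and converts each entry to an integer. The name `load_counts` is an assumption made for illustration. Because the example text puts a space after each comma, the reader is given `skipinitialspace=True`; without it, the space would be kept as part of each value.

```python
import csv
import io

# The CSV text from the example above, held in a string for convenience.
CSV_TEXT = """\
Site Number, Sample Number, Distinct Species Count, Total Individuals
1, 4, 44, 123
2, 2, 44, 100
3, 2, 50, 142
4, 13, 20, 77
"""


def load_counts(text):
    """Return (header, rows), with every data value converted to int."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = next(reader)  # the first row holds the column names
    rows = [[int(value) for value in row] for row in reader]
    return header, rows


header, rows = load_counts(CSV_TEXT)
print(header)
for row in rows:
    print(row)
```

To read from a file on disk instead, the same reader can be given a file opened with `open(path, newline="")`, where `path` is wherever the file happens to be saved.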