index_heterogeneity returns a heterogeneity or dominance index.

index_evenness returns an evenness measure.

bootstrap_* and jackknife_* perform bootstrap/jackknife resampling.

index_heterogeneity(object, ...)

simulate_heterogeneity(object, ...)

bootstrap_heterogeneity(object, ...)

jackknife_heterogeneity(object, ...)

index_evenness(object, ...)

simulate_evenness(object, ...)

bootstrap_evenness(object, ...)

jackknife_evenness(object, ...)

# S4 method for CountMatrix
index_heterogeneity(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  ...
)

# S4 method for CountMatrix
simulate_heterogeneity(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  quantiles = TRUE,
  level = 0.8,
  step = 1,
  n = 1000,
  progress = getOption("tabula.progress"),
  ...
)

# S4 method for CountMatrix
bootstrap_heterogeneity(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  probs = c(0.05, 0.95),
  n = 1000,
  ...
)

# S4 method for CountMatrix
jackknife_heterogeneity(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  ...
)

# S4 method for CountMatrix
index_evenness(
  object,
  method = c("shannon", "brillouin", "mcintosh", "simpson"),
  ...
)

# S4 method for CountMatrix
simulate_evenness(
  object,
  method = c("shannon", "brillouin", "mcintosh", "simpson"),
  quantiles = TRUE,
  level = 0.8,
  step = 1,
  n = 1000,
  progress = getOption("tabula.progress"),
  ...
)

# S4 method for CountMatrix
bootstrap_evenness(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  probs = c(0.05, 0.95),
  n = 1000,
  ...
)

# S4 method for CountMatrix
jackknife_evenness(
  object,
  method = c("berger", "brillouin", "mcintosh", "shannon", "simpson"),
  ...
)

## Arguments

object

An $$m \times p$$ matrix of count data (typically a CountMatrix object).

...

Further arguments to be passed to internal methods.

method

A character string specifying the index to be computed (see details). Any unambiguous substring can be given.

quantiles

A logical scalar: should sample quantiles be used as confidence interval? If TRUE (the default), sample quantiles are used as described in Kintigh (1989), else quantiles of the normal distribution are used.

level

A length-one numeric vector giving the confidence level.

step

A non-negative integer giving the increment of the sample size. Only used if simulate is TRUE.

n

A non-negative integer giving the number of bootstrap replications.

progress

A logical scalar: should a progress bar be displayed?

probs

A numeric vector of probabilities with values in $$[0,1]$$ (see quantile).

## Value

index_heterogeneity, index_evenness and simulate_evenness return a DiversityIndex object.

bootstrap_* and jackknife_* return a data.frame.

## Details

Diversity measurement assumes that all individuals in a specific taxon are equivalent and that all types are equally different from each other (Peet 1974). A measure of diversity can be achieved by using indices built on the relative abundance of taxa. These indices (sometimes referred to as non-parametric indices) benefit from not making assumptions about the underlying distribution of taxa abundance: they only take into account species richness and the relative abundances of the species that are present. Peet (1974) refers to them as indices of heterogeneity.

Diversity indices focus on one aspect of the taxa abundance and emphasize either richness (weighting towards uncommon taxa) or dominance (weighting towards abundant taxa; Magurran 1988).

Evenness is a measure of how evenly individuals are distributed across the sample.
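Taking the Shannon measure as an example, evenness is conventionally expressed as the ratio of the observed index to the maximum it could reach for the same number of taxa (Magurran 1988). The notation below is assumed for illustration and is not taken from the package source: $$S$$ is the number of taxa present, $$n_i$$ the count of taxon $$i$$ and $$N$$ the total count.

$$
E = \frac{H'}{H'_{\max}} = \frac{H'}{\ln S}, \qquad H' = -\sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N}
$$

Under this definition $$E = 1$$ when all taxa are equally abundant, and the ratio is undefined ($$0/0$$) when only one taxon is present.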

The following heterogeneity indices and corresponding evenness measures are available (see Magurran 1988 for details):

berger

Berger-Parker dominance index. The Berger-Parker index expresses the proportional importance of the most abundant type. This metric is highly biased by sample size and richness. Moreover, it does not make use of all the information available from the sample.

brillouin

Brillouin diversity index. The Brillouin index describes a known collection: it does not assume random sampling in an infinite population. Pielou (1975) and Laxton (1978) argue for the use of the Brillouin index in all circumstances, especially in preference to the Shannon index.

mcintosh

McIntosh dominance index. The McIntosh index expresses the heterogeneity of a sample in geometric terms. It describes the sample as a point in an S-dimensional hypervolume and uses the Euclidean distance of this point from the origin.

shannon

Shannon-Wiener diversity index. The Shannon index assumes that individuals are randomly sampled from an infinite population and that all taxa are represented in the sample (it does not reflect the sample size). The main source of error arises from the failure to include all taxa in the sample: this error increases as the proportion of species discovered in the sample declines (Peet 1974, Magurran 1988). The maximum likelihood estimator (MLE) is used for the relative abundance; this estimator is known to be negatively biased by sample size.

simpson

Simpson dominance index for finite samples. The Simpson index expresses the probability that two individuals randomly picked from a finite sample belong to the same type. It can be interpreted as the weighted mean of the proportional abundances. This metric is a true probability value: it ranges from 0 to 1, with higher values indicating stronger dominance by a few types.

The berger, mcintosh and simpson methods return a dominance index, not the reciprocal or inverse form usually adopted, so that an increase in the value of the index accompanies a decrease in diversity.
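The definitions above can be written out in a few lines of base R. This is a sketch of the standard textbook forms given by Magurran (1988), not the package's internal code: the helper names are hypothetical, `x` is assumed to be a vector of counts for a single assemblage, and the package may differ in details such as the exact McIntosh variant it returns.

```r
## Textbook index definitions in base R (illustrative helpers only;
## these are NOT functions exported by the package).

## Berger-Parker: share of the most abundant taxon.
berger <- function(x) max(x) / sum(x)

## Brillouin: (ln N! - sum(ln n_i!)) / N, using log-factorials.
brillouin <- function(x) (lfactorial(sum(x)) - sum(lfactorial(x))) / sum(x)

## McIntosh U: Euclidean distance of the assemblage from the origin.
## Magurran (1988) also gives a derived measure (N - U) / (N - sqrt(N)).
mcintosh_u <- function(x) sqrt(sum(x^2))

## Shannon: -sum(p_i * ln p_i), with p_i = n_i / N (the MLE).
## Empty taxa are dropped so that 0 * log(0) does not produce NaN.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

## Simpson dominance for a finite sample: sum(n_i (n_i - 1)) / (N (N - 1)).
simpson <- function(x) {
  N <- sum(x)
  sum(x * (x - 1)) / (N * (N - 1))
}

counts <- c(2, 2)  # made-up assemblage: two taxa, two individuals each
shannon(counts)
berger(counts)
simpson(counts)
```

For this made-up assemblage, `shannon()` gives log(2), `berger()` gives 0.5 and `simpson()` gives 1/3.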

## Note

The Ramanujan approximation is used to compute $$x!$$ when $$x > 170$$.
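The threshold of 170 corresponds to the largest factorial that fits in a double-precision number. The sketch below shows Ramanujan's approximation on the log scale and compares it with base R's `lfactorial()`. The function name is hypothetical, and how the package applies the approximation internally is not shown here.

```r
## Ramanujan's approximation of ln(n!):
## n ln n - n + ln(n (1 + 4n (1 + 2n))) / 6 + ln(pi) / 2
ramanujan_lfactorial <- function(n) {
  n * log(n) - n + log(n * (1 + 4 * n * (1 + 2 * n))) / 6 + log(pi) / 2
}

n <- 171
approx_value <- ramanujan_lfactorial(n)  # approximation of ln(171!)
exact_value <- lfactorial(n)             # base R reference value
abs(approx_value - exact_value)          # absolute error on the log scale

## 170! is still finite in double precision, 171! is not.
is.finite(factorial(170))
is.finite(suppressWarnings(factorial(171)))
```

Working on the log scale avoids the overflow altogether, which is why the comparison uses `lfactorial()` rather than `factorial()`.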

## References

Berger, W. H. & Parker, F. L. (1970). Diversity of Planktonic Foraminifera in Deep-Sea Sediments. Science, 168(3937), 1345-1347. doi: 10.1126/science.168.3937.1345.

Brillouin, L. (1956). Science and information theory. New York: Academic Press.

Kintigh, K. W. (1984). Measuring Archaeological Diversity by Comparison with Simulated Assemblages. American Antiquity, 49(1), 44-54. doi: 10.2307/280511.

Kintigh, K. W. (1989). Sample Size, Significance, and Measures of Diversity. In Leonard, R. D. and Jones, G. T., Quantifying Diversity in Archaeology. New Directions in Archaeology. Cambridge: Cambridge University Press, p. 25-36.

Laxton, R. R. (1978). The measure of diversity. Journal of Theoretical Biology, 70(1), 51-67. doi: 10.1016/0022-5193(78)90302-8.

Magurran, A. E. (1988). Ecological Diversity and its Measurement. Princeton, NJ: Princeton University Press. doi: 10.1007/978-94-015-7358-0.

McIntosh, R. P. (1967). An Index of Diversity and the Relation of Certain Concepts to Diversity. Ecology, 48(3), 392-404. doi: 10.2307/1932674.

Peet, R. K. (1974). The Measurement of Species Diversity. Annual Review of Ecology and Systematics, 5(1), 285-307. doi: 10.1146/annurev.es.05.110174.001441.

Pielou, E. C. (1975). Ecological Diversity. New York: Wiley. doi: 10.4319/lo.1977.22.1.0174b.

Shannon, C. E. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal, 27, 379-423. doi: 10.1002/j.1538-7305.1948.tb01338.x.

Simpson, E. H. (1949). Measurement of Diversity. Nature, 163(4148), 688-688. doi: 10.1038/163688a0.

## See also

Other diversity: richness-index, similarity(), turnover-index

## Author

N. Frerebeau

## Examples

## Coerce dataset to a count matrix
data("chevelon", package = "folio")
chevelon <- as_count(chevelon)

## Shannon diversity index
(index_h <- index_heterogeneity(chevelon, method = "shannon"))
#> <HeterogeneityIndex: shannon>
#>       size     index
#> P610s   14 1.6681740
#> P610e    8 1.2130076
#> P625     4 0.6931472
#> P630     1 0.0000000
#> P307     1 0.0000000
#> P631    12 1.7481555
#> P623     3 1.0986123
#> P624    14 1.8711604
#> P626s   26 1.7207095
#> P626e  106 1.9115521
#> P627     4 1.0397208
#> P628    12 1.8200760

## Shannon evenness
(index_e <- index_evenness(chevelon, method = "shannon"))
#> <EvennessIndex: shannon>
#>       size     index
#> P610s   14 0.8572718
#> P610e    8 0.8750000
#> P625     4 1.0000000
#> P630     1       NaN
#> P307     1       NaN
#> P631    12 0.8983742
#> P623     3 1.0000000
#> P624    14 0.9615862
#> P626s   26 0.8842698
#> P626e  106 0.8699849
#> P627     4 0.9463946
#> P628    12 0.9353340
## Bootstrap resampling
(boot_h <- bootstrap_heterogeneity(chevelon, method = "shannon"))
#>             min      mean       max        Q5       Q95
#> P610s 0.6559757 1.4217738 1.9085353 1.0547209 1.7671950
#> P610e 0.0000000 0.9829383 1.3862944 0.5623351 1.3208883
#> P625  0.0000000 0.5266198 0.6931472 0.0000000 0.6931472
#> P630  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> P307  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> P631  0.4505612 1.4751503 1.9072840 1.0975595 1.8200760
#> P623  0.0000000 0.6552635 1.0986123 0.0000000 1.0986123
#> P624  1.0789922 1.6280739 1.9085353 1.3337360 1.8711604
#> P626s 0.9746702 1.5997919 1.9056849 1.3698324 1.8005937
#> P626e 1.6536757 1.8740800 2.0336751 1.7634515 1.9755950
#> P627  0.0000000 0.7066917 1.0397208 0.0000000 1.0397208
#> P628  0.8239592 1.5349087 1.9072840 1.1988493 1.8200760
## Jackknife resampling
(jack_h <- jackknife_heterogeneity(chevelon, method = "shannon"))
#>            mean       bias     error
#> P610s 1.5680395 -0.9012101 0.2056978
#> P610e 1.1096729 -0.9300118 0.3968560
#> P625  0.5545177 -1.2476649 0.8317766
#> P630  0.0000000  0.0000000 0.0000000
#> P307  0.0000000  0.0000000 0.0000000
#> P631  1.6463856 -0.9159291 0.2108823
#> P623  0.9769728 -1.0947558 0.5574224
#> P624  1.7643489 -0.9613033 0.2126469
#> P626s 1.6152716 -0.9489415 0.2242804
#> P626e 1.8080769 -0.9312771 0.1429090
#> P627  0.9244221 -1.0376881 0.5301829
#> P628  1.7148123 -0.9473731 0.2085352
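
The summaries printed above (min, mean, max and quantiles for the bootstrap; mean, bias and error for the jackknife) have the shape of the usual resampling summaries. The sketch below shows how such summaries are commonly computed for a single vector of counts. It assumes multinomial resampling for the bootstrap and a leave-one-out scheme over the elements of the vector for the jackknife. The helper names are hypothetical and the package's own implementation may differ.

```r
## Illustrative resampling helpers (NOT the package's internal code).

## Shannon index used as the statistic of interest.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

## Bootstrap: draw `n` multinomial replicates of the same total size,
## recompute the statistic on each, then summarise the replicates.
bootstrap <- function(x, do, probs = c(0.05, 0.95), n = 1000, ...) {
  replicates <- stats::rmultinom(n, size = sum(x), prob = x / sum(x))
  values <- apply(replicates, 2, do, ...)  # one value per replicate
  q <- stats::quantile(values, probs = probs, names = FALSE)
  q <- stats::setNames(q, paste0("Q", round(probs * 100)))
  c(min = min(values), mean = mean(values), max = max(values), q)
}

## Jackknife: recompute the statistic with each element left out in turn,
## then derive the usual bias and standard error estimates.
jackknife <- function(x, do, ...) {
  n <- length(x)
  theta <- do(x, ...)  # estimate on the full data
  leave_one_out <- vapply(
    seq_len(n),
    function(i) do(x[-i], ...),
    numeric(1)
  )
  jack_mean <- mean(leave_one_out)
  c(
    mean = jack_mean,
    bias = (n - 1) * (jack_mean - theta),
    error = sqrt((n - 1) / n * sum((leave_one_out - jack_mean)^2))
  )
}

counts <- c(5, 3, 2, 1)  # made-up counts for four taxa
set.seed(12345)          # bootstrap results depend on the random seed
bootstrap(counts, do = shannon, n = 500)
jackknife(counts, do = shannon)
```

As a sanity check on the jackknife helper, applying it to the sample mean gives zero bias and the familiar standard error of the mean.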