ENS rarefaction and related approaches

Our approach relies on a family of diversity measures that was first introduced as “Hurlbert ENS” by Dauby and Hardy (2012). Here, we use the term “ENS rarefaction” to emphasize that these measures are simply an effective number of species (ENS) transformation of the individual-based rarefaction curve (Hurlbert, 1971). Since ENS rarefaction is one of the lesser-known, but quite powerful, families of diversity measures, we briefly explain it below and compare it with the related Hill number framework, and the individual-based rarefaction framework that it is based on (see Table 1).
Relating the complementary information given by a set of diversity measures to the diversity components discussed above is challenging because many metrics are sensitive to more than one component (Chase and Knight, 2013). Furthermore, diversity metrics often differ in their numerical ranges and units (i.e. their numerical constraints), and in the degree to which they are affected by passive sampling effects, known in statistics as estimation bias (Gotelli & Chao, 2013). For example, species richness, which counts all species regardless of their abundance, can take any positive integer value and is strongly affected by the number of individuals in the sample. In contrast, Simpson’s index, which gives disproportionately high weight to the dominant species of the SAD, ranges between 0 and 1 and is almost unaffected by sample size (i.e. the number of individuals). Although the two metrics hold complementary information on the SAD and passive sampling effects, their different numerical constraints and estimation biases make it difficult to disentangle the two components and compare their effect sizes (Jost, 2006).
The Hill number framework solves the problem of incompatible numerical constraints by converting diversity index values to effective numbers of species (Eqn 1). This encompasses all diversity indices that are a function of the term \(\sum_{i=1}^{S}{p_{i}}^{q}\) (e.g. species richness for q=0, Shannon index for q=1 and Simpson’s index for q=2), where the diversity order, q, tunes the weight of species abundances \(p_{i}\) (Rényi, 1961; Hill, 1973; Jost, 2006). The term ENS refers to the hypothetical number of species that a perfectly even sample would have if it produced the same index value as the real sample. Hence, Hill numbers relieve diversity indices of their numerical constraints by re-expressing them in units equivalent to those of species richness (Jost, 2006). However, like most diversity metrics, Hill numbers retain a downward estimation bias, whose strength diminishes with increasing values of the diversity order q (Chao et al., 2014). Therefore, differences in Hill number profiles cannot unambiguously be attributed to changes in the regional SAD or changes in total abundance. For example, if \({}^{2}D\) (corresponding to Simpson’s index) is constant along a gradient of interest while \({}^{0}D\) (i.e. species richness) increases, this pattern can result from a change in the regional SAD (i.e. an increase in the number of rare species), a passive sampling effect (i.e. an increase in total abundance) or both.
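The conversion to effective numbers can be illustrated with a short sketch. Eqn 1 is not reproduced in this section, so the code assumes the standard Hill number form \({}^{q}D = (\sum_{i=1}^{S}{p_{i}}^{q})^{1/(1-q)}\), with the exponential of Shannon entropy as the limit for q=1. The function name `hill_number` and the example abundance vectors are hypothetical and serve illustration only.

```python
import math

def hill_number(abundances, q):
    """Hill number (effective number of species) of order q.

    Assumes the standard form (sum_i p_i^q)^(1/(1-q)); the limit
    q -> 1 is the exponential of Shannon entropy.
    """
    total = sum(abundances)
    # Relative abundances p_i; species with zero abundance are dropped.
    p = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

# Hypothetical samples with the same richness but different evenness.
even = [25, 25, 25, 25]
uneven = [70, 20, 5, 5]
for q in (0, 1, 2):
    print(q, round(hill_number(even, q), 3), round(hill_number(uneven, q), 3))
```

For the perfectly even sample all three orders return the same value, the number of species, whereas for the uneven sample the value drops as q increases and rare species lose weight.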
Individual-based rarefaction (IBR) is a framework that explicitly addresses passive sampling effects by expressing diversity in terms of the expected number of species for a standardized number of individuals (Eqn 2; Hurlbert, 1971; Gotelli & Colwell, 2001). The resulting non-linear scaling relationship between the number of individuals (n) and expected species richness (i.e. rarefied richness, \(S_{n}\)) is the IBR curve (Fig 1). Rarefied richness estimates are unbiased for random samples, which means that they respond only to changes in the SAD and not to the original number of individuals present in the sample, N. By varying the reference sample size n, IBR can give more or less influence to species abundances (Gotelli & Colwell, 2001). However, the value of n also constrains the numerical range of rarefied richness values. Thus, effect sizes at the base of the IBR curve (representing mostly common species) are not directly comparable to those at higher values of n (representing both common and rare species; Dauby and Hardy, 2012). In other words, if we find a species richness gradient to be steeper than a corresponding gradient in rarefied richness, part of the numerical difference has nothing to do with more-individual effects but is merely the null expectation from the different numerical constraints of the two metrics.
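As a minimal sketch of how a point on the IBR curve is obtained, the code below assumes that Eqn 2 (not reproduced in this section) takes Hurlbert's (1971) hypergeometric form, in which each species contributes one minus the probability of being absent from a subsample of n individuals drawn without replacement. The function name `rarefied_richness` and the example sample are hypothetical.

```python
from math import comb

def rarefied_richness(abundances, n):
    """Expected number of species in a random subsample of n individuals
    drawn without replacement from the sample (assumed Hurlbert form)."""
    N = sum(abundances)
    if not 1 <= n <= N:
        raise ValueError("n must lie between 1 and the sample total N")
    denom = comb(N, n)
    # Each species contributes 1 - P(species absent from the subsample).
    return sum(1 - comb(N - a, n) / denom for a in abundances if a > 0)

# Hypothetical sample: trace the IBR curve at a few reference sample sizes.
sample = [70, 20, 5, 5]
for n in (1, 2, 10, 50, 100):
    print(n, round(rarefied_richness(sample, n), 3))
```

The curve starts at one species for n=1 and ends at the observed richness for n=N, which makes the dependence of the numerical range on n explicit.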
ENS rarefaction is a method that converts the IBR curve into effective numbers of species with consistent numerical constraints along the curve (Fig 1). There is no simple closed-form equation for ENS rarefaction, but Dauby and Hardy (2012) showed that numerical approximation of Eqn 3 can be used to convert any \(S_{n}\) value to its corresponding effective number (\(E_{n}\)). Again, ENS refers to the number of species in a hypothetical, perfectly even community that has the same rarefied richness as the real community (Dauby and Hardy, 2012). The base of the resulting “ENS curve” (i.e. \(E_{2}\)) is also the ENS transformation of Hurlbert’s (1971) unbiased probability of interspecific encounter (\(S_{\mathrm{PIE}}\); Olszewski, 2004), and is equal to an asymptotic estimate of the Hill number \({}^{2}D\) (Dauby and Hardy, 2012; Chao et al., 2014). It can be interpreted as the number of dominant species in the species pool because, being at the base of the curve, it gives disproportionately high weight to species with high relative abundances. As n increases along the ENS curve, increasingly rare species influence the diversity estimate until it practically converges onto the observed total species richness, where all species are counted regardless of their abundance (i.e. \(E_{N}\)). Increases along the ENS curve are entirely due to the incremental influence of rare species and do not result from variable numerical constraints along the curve. Therefore, the ENS transformation makes it easy to assess relative evenness; random samples from perfectly even communities (i.e. communities without rare species) produce ENS curves that are flat horizontal lines (Dauby & Hardy, 2012). In some sense, ENS rarefaction combines the advantages of Hill numbers and individual-based rarefaction in a single family of diversity measures.
ENS rarefaction yields numerically unconstrained values at every n and, being a simple transformation of rarefied richness, its values for a given reference sample size n are affected only by the SAD and not by the actual number of individuals captured in the sample. Therefore, differences in \(E_{n}\) values for a constant n can be unambiguously attributed to changes in the SAD, while comparisons between different levels of n quantify the more-individual effect. These properties make ENS rarefaction a useful tool for the decomposition approach we present here.
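The numerical conversion from rarefied richness to an effective number can be sketched as follows. Eqn 3 is not reproduced in this section, so the code assumes that it equates \(S_{n}\) with the expected richness of n individuals drawn from a perfectly even community of \(E_{n}\) species, \(S_{n} = E_{n}[1 - (1 - 1/E_{n})^{n}]\), and solves for \(E_{n}\) by bisection. The function names, the tolerance and the choice of bisection are illustrative assumptions, not the procedure of Dauby and Hardy (2012).

```python
import math

def even_community_richness(E, n):
    """Expected richness of n individuals drawn from a perfectly even
    community with E species (assumed form of the reference curve)."""
    if E <= 1.0:
        return 1.0
    # Numerically stable version of E * (1 - (1 - 1/E)**n).
    return -E * math.expm1(n * math.log1p(-1.0 / E))

def ens_from_rarefied(S_n, n, tol=1e-10):
    """Solve even_community_richness(E, n) = S_n for E by bisection."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 1.0 <= S_n < n:
        raise ValueError("S_n must satisfy 1 <= S_n < n")
    lo, hi = S_n, max(2.0 * S_n, 2.0)
    # Widen the bracket until the even community is at least as rich as S_n.
    while even_community_richness(hi, n) < S_n:
        hi *= 2.0
    # The reference curve increases with E, so bisection converges.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if even_community_richness(mid, n) < S_n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A perfectly even community of 10 species maps back to E_n = 10 at any n.
for n in (2, 5, 50):
    S_n = even_community_richness(10.0, n)
    print(n, round(S_n, 4), round(ens_from_rarefied(S_n, n), 4))
```

Under the assumed form, the case n=2 reduces to \(E_{2} = 1/(2 - S_{2})\), which provides a simple check on the solver, and the loop at the end reproduces the flat ENS curve expected for a perfectly even community.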