Ecology and Evolution of Intraspecific Plant Chemodiversity
Project: Partitioning of chemical diversity within different hierarchical levels
Secondary metabolite profiles are one of the most diverse phenotypes of organisms and can consist of a large number of
compounds originating from a limited number of biosynthetic pathways. The statistical treatment of such profiles is often
complicated by their diversity and by the intra- and interspecific variability in the quantitative and qualitative
composition of secondary metabolites. Most importantly, the assumption of independence of the presence/absence and the
quantity of compounds is violated due to the shared biosynthetic origin of many compounds. Therefore, I propose a biosynthetically
informed pairwise distance measure that fully considers the biosynthesis of the compounds and thus quantifies
the similarity in the enzymatic equipment of two samples. The biosynthetic similarity of compounds is calculated based on
the proportion of shared enzymes that are required for their biosynthesis. Using this information (provided as a dendrogram)
and the quantitative composition of the samples, generalized UniFrac distances are calculated, which measure the
fraction of the dendrogram (i.e., of the total branch length) that is unique to one of the two samples rather than shared by both.
To allow a straightforward cross-platform application of the approach, I provide functions for the statistical software R and
sample data sets. A hypothetical and a real-world example demonstrate the feasibility of the biosynthetically informed distances
$d_{A,B}$ and highlight how they differ from conventional distance measures. The advantages of this approach and potential fields
of application are discussed.
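
The distance calculation described above can be illustrated with a minimal Python sketch. It is not the R implementation mentioned in the text. It assumes the commonly used form of the generalized UniFrac distance, in which each branch is weighted by its length and by the summed relative abundance of the compounds below it, raised to a tuning exponent `alpha`. The function name `generalized_unifrac`, the toy dendrogram `TOY_BRANCHES`, the compound names, and all branch lengths are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# One dendrogram branch: (branch length, compounds at the tips below that branch).
Branch = Tuple[float, FrozenSet[str]]


def generalized_unifrac(
    branches: List[Branch],
    sample_a: Dict[str, float],
    sample_b: Dict[str, float],
    alpha: float = 0.5,
) -> float:
    """Generalized UniFrac distance between two quantitative compound profiles."""
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each sample needs a positive total quantity")

    numerator = 0.0
    denominator = 0.0
    for length, compounds in branches:
        # Share of each sample's total quantity that descends from this branch.
        p_a = sum(sample_a.get(c, 0.0) for c in compounds) / total_a
        p_b = sum(sample_b.get(c, 0.0) for c in compounds) / total_b
        p_sum = p_a + p_b
        if p_sum == 0:
            continue  # branch is used by neither sample
        weight = length * p_sum ** alpha
        numerator += weight * abs(p_a - p_b) / p_sum
        denominator += weight
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical dendrogram with two biosynthetic clades (assumed branch lengths).
TOY_BRANCHES: List[Branch] = [
    (0.2, frozenset({"mono1"})),
    (0.2, frozenset({"mono2"})),
    (0.8, frozenset({"mono1", "mono2"})),  # branch shared by both "mono" compounds
    (0.3, frozenset({"sesq1"})),
    (0.3, frozenset({"sesq2"})),
    (0.7, frozenset({"sesq1", "sesq2"})),  # branch shared by both "sesq" compounds
]

SAMPLE_A = {"mono1": 1.0}
SAMPLE_B = {"mono2": 1.0}  # different compound, same clade as SAMPLE_A
SAMPLE_C = {"sesq1": 1.0}  # different compound, different clade

d_ab = generalized_unifrac(TOY_BRANCHES, SAMPLE_A, SAMPLE_B)
d_ac = generalized_unifrac(TOY_BRANCHES, SAMPLE_A, SAMPLE_C)
print("d(A,B) < d(A,C):", d_ab < d_ac)
```

In this toy case no two samples share a compound, so a distance that treats compounds as independent entities cannot rank the pairs. The tree-based distance places samples A and B closer together than A and C, because A and B share the internal branch of their clade.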