Uneven missing data skews phylogenomic relationships within the lories and lorikeets

Resolution of the Tree of Life has accelerated with massively parallel sequencing of genomic loci. To achieve dense taxon sampling within clades, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. A particular challenge with this type of sampling scheme is an expected systematic bias in DNA sequences, where older material has more missing data. In this study, we evaluated how missing data influenced phylogenomic relationships in the brush-tongued parrots, or the lories and lorikeets (Tribe: Loriini), which are distributed across the Australasian region. We collected ultraconserved elements from modern and historical material representing the majority of described taxa in the clade. Preliminary phylogenomic analyses recovered clustering of samples within genera, where strongly supported groups formed based on sample type. To assess whether the aberrant relationships were being driven by missing data, we performed an outlier loci analysis and calculated gene-wise likelihoods for trees built with and without missing data. We produced a series of alignments where loci were excluded based on Δ gene-wise log-likelihood scores and inferred topologies with the different datasets to assess whether sample-type clustering could be altered by excluding particular loci. We found that the majority of questionable relationships were driven by particular subsets of loci. Unexpectedly, the biased loci did not have more missing data, but rather more parsimony-informative sites. This counterintuitive result suggests that the most informative loci may be subject to the highest bias, as the most variable loci can have the greatest disparity in phylogenetic signal among sample types. After accounting for biased loci, we inferred a more robust phylogenomic hypothesis for the Loriini. Taxonomic relationships within the clade can now be revised to reflect natural groupings, but for some groups additional work is still necessary.
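
The locus-filtering step described in the abstract can be illustrated with a minimal sketch. It assumes per-locus log-likelihoods are already available for two competing topologies, for example trees built with and without missing data. The function names, the locus labels, the toy values, and the absolute-value cutoff are all hypothetical; the abstract states only that loci were excluded based on Δ gene-wise log-likelihood scores, not how the cutoff was chosen.

```python
from typing import Dict, List, Tuple


def delta_gene_lnl(
    lnl_tree_a: Dict[str, float], lnl_tree_b: Dict[str, float]
) -> Dict[str, float]:
    """Per-locus log-likelihood difference (tree A minus tree B).

    Only loci scored under both topologies are compared.
    """
    shared = lnl_tree_a.keys() & lnl_tree_b.keys()
    return {locus: lnl_tree_a[locus] - lnl_tree_b[locus] for locus in shared}


def split_loci(
    delta: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Split loci into (kept, excluded) by absolute delta.

    The absolute-value rule and the threshold are assumptions made for
    illustration, not the criterion used in the study.
    """
    kept = sorted(l for l, d in delta.items() if abs(d) <= threshold)
    excluded = sorted(l for l, d in delta.items() if abs(d) > threshold)
    return kept, excluded


# Toy, made-up log-likelihoods for three hypothetical UCE loci.
lnl_a = {"uce-1": -1200.0, "uce-2": -950.0, "uce-3": -2100.0}
lnl_b = {"uce-1": -1201.0, "uce-2": -975.0, "uce-3": -2099.5}

delta = delta_gene_lnl(lnl_a, lnl_b)
kept, excluded = split_loci(delta, threshold=10.0)
print("kept:", kept)
print("excluded:", excluded)
```

Repeating the split over several thresholds would yield a series of reduced alignments, each of which can be used to re-infer the topology and check whether sample-type clustering persists.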

Identifier
Source https://data.blue-cloud.org/search-details?step=~0123828CB597256FB04C9164884214BB180578F403C
Metadata Access https://data.blue-cloud.org/api/collections/3828CB597256FB04C9164884214BB180578F403C
Provenance
Publisher Blue-Cloud Data Discovery & Access service; ELIXIR-ENA
Publication Year 2024
OpenAccess true
Contact blue-cloud-support(at)maris.nl
Representation
Discipline Marine Science
Spatial Coverage (-140.060W, -41.670S, 179.760E, 8.730N)
Temporal Coverage Begin 2019-02-20T00:00:00Z
Temporal Coverage End 2019-03-02T00:00:00Z