Article

The accuracy of absolute differential abundance analysis from relative count data

K. Roche and S. Mukherjee.
PLoS Comput. Biol., 18 (7): e1010284 (July 2022)

Abstract

Concerns have been raised about the use of relative abundance data derived from next generation sequencing as a proxy for absolute abundances. For example, in the differential abundance setting, compositional effects in relative abundance data may give rise to spurious differences (false positives) when considered from the absolute perspective. In practice however, relative abundances are often transformed by renormalization strategies intended to compensate for these effects and the scope of the practical problem remains unclear. We used simulated data to explore the consistency of differential abundance calling on renormalized relative abundances versus absolute abundances and find that, while overall consistency is high, with a median sensitivity (true positive rates) of 0.91 and specificity (1-false positive rates) of 0.89, consistency can be much lower where there is widespread change in the abundance of features across conditions. We confirm these findings on a large number of real data sets drawn from 16S metabarcoding, expression array, bulk RNA-seq, and single-cell RNA-seq experiments, where data sets with the greatest change between experimental conditions are also those with the highest false positive rates. Finally, we evaluate the predictive utility of summary features of relative abundance data themselves. Estimates of sparsity and the prevalence of feature-level change in relative abundance data give reasonable predictions of discrepancy in differential abundance calling in simulated data and can provide useful bounds for worst-case outcomes in real data.
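The compositional effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' simulation or pipeline: the abundance values are invented, a simple fold-change threshold stands in for a real differential abundance test, no renormalization is applied, and all function names are hypothetical. It shows how a change in a single feature's absolute abundance can make every feature look changed in relative terms, and how sensitivity and specificity are then computed against the absolute calls.

```python
def to_relative(counts):
    """Convert absolute abundances to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def changed(before, after, min_fold=2.0):
    """Flag features whose abundance changes by at least min_fold in either
    direction. Assumes strictly positive values (illustrative threshold only)."""
    return [max(a / b, b / a) >= min_fold for b, a in zip(before, after)]


def sensitivity_specificity(truth, calls):
    """Score calls against truth: sensitivity = TP / (TP + FN),
    specificity = TN / (TN + FP)."""
    tp = sum(t and c for t, c in zip(truth, calls))
    fn = sum(t and not c for t, c in zip(truth, calls))
    tn = sum(not t and not c for t, c in zip(truth, calls))
    fp = sum(not t and c for t, c in zip(truth, calls))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Invented absolute abundances: only feature 0 changes between conditions.
absolute_a = [100, 100, 100, 100]
absolute_b = [1000, 100, 100, 100]

# "Truth" is defined from the absolute perspective.
truth = changed(absolute_a, absolute_b)

# Calls made on proportions: the growth of feature 0 shrinks every other share.
calls = changed(to_relative(absolute_a), to_relative(absolute_b))

sens, spec = sensitivity_specificity(truth, calls)
print("absolute calls:", truth)
print("relative calls:", calls)
print("sensitivity:", sens, "specificity:", spec)
```

In this toy case the one true change is recovered, but the three unchanged features are all flagged, which is the kind of false positive the renormalization strategies mentioned in the abstract are intended to reduce.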

BibTeX key: Roche2022-vv
Entry type: article
Year: 2022
Month: jul
Journal: PLoS Comput. Biol.
Number: 7
Pages: e1010284
Publisher: Public Library of Science (PLoS)
Volume: 18
Copyright: http://creativecommons.org/licenses/by/4.0/
Language: en

Cite this publication

@article{Roche2022-vv,
  abstract = {Concerns have been raised about the use of relative abundance data derived from next generation sequencing as a proxy for absolute abundances. For example, in the differential abundance setting, compositional effects in relative abundance data may give rise to spurious differences (false positives) when considered from the absolute perspective. In practice however, relative abundances are often transformed by renormalization strategies intended to compensate for these effects and the scope of the practical problem remains unclear. We used simulated data to explore the consistency of differential abundance calling on renormalized relative abundances versus absolute abundances and find that, while overall consistency is high, with a median sensitivity (true positive rates) of 0.91 and specificity (1-false positive rates) of 0.89, consistency can be much lower where there is widespread change in the abundance of features across conditions. We confirm these findings on a large number of real data sets drawn from 16S metabarcoding, expression array, bulk RNA-seq, and single-cell RNA-seq experiments, where data sets with the greatest change between experimental conditions are also those with the highest false positive rates. Finally, we evaluate the predictive utility of summary features of relative abundance data themselves. Estimates of sparsity and the prevalence of feature-level change in relative abundance data give reasonable predictions of discrepancy in differential abundance calling in simulated data and can provide useful bounds for worst-case outcomes in real data.},
  added-at = {2024-09-10T11:56:37.000+0200},
  author = {Roche, Kimberly E and Mukherjee, Sayan},
  biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/247d5ad1ef9cb9067d86b5bfd3f277acc/scadsfct},
  copyright = {http://creativecommons.org/licenses/by/4.0/},
  interhash = {eb7f5550e69cc15b69446fc2b3ada7b0},
  intrahash = {47d5ad1ef9cb9067d86b5bfd3f277acc},
  journal = {PLoS Comput. Biol.},
  keywords = {topic_mathfoundation yaff},
  language = {en},
  month = jul,
  number = 7,
  pages = {e1010284},
  publisher = {Public Library of Science (PLoS)},
  timestamp = {2025-07-29T11:28:34.000+0200},
  title = {The accuracy of absolute differential abundance analysis from relative count data},
  volume = 18,
  year = 2022
}
