It's kind of a "negative" result, but I'll add this one to the mix, as I think it highlights the importance of being careful when doing large-scale computational analyses with complex genomics data:
Our recent re-analysis of data from three cancer types revealed that technical errors have caused erroneous reports of numerous microbial species reportedly found in sequencing data from The Cancer Genome Atlas (TCGA) project. Here we have expanded our analysis to cover all 5,734 whole-genome sequencing (WGS) data sets currently available from The Cancer Genome Atlas (TCGA) project, covering 25 distinct types of cancer. We analyzed the microbial content using updated computational methods and databases, and compared our results to those from two major recent studies that focused on bacteria, viruses, and fungi in cancer. Our results expand upon and reinforce our recent findings, which showed that the presence of microbes is far smaller than had been previously reported, and that most species identified in TCGA data are either not present at all, or are known contaminants rather than microbes residing within tumors.
The long-read assembly papers have been pretty influential (or rather, will be very influential in the years to come):
"The complete sequence of a human Y chromosome"
"Telomere-to-telomere assembly of diploid chromosomes with Verkko"
"The complete sequence of a human genome"
"Resolution of structural variation in diverse mouse genomes reveals chromatin remodeling due to transposable elements"
Many others...
For non-consortium papers that I'd say are influential (judging in part by the social media response):
There's "The specious art of single-cell genomics" (PLOS Computational Biology) by my colleague, which has ignited some discussion and debate about t-SNE/UMAP plots.
There's "Major data analysis errors invalidate cancer microbiome findings" (mBio), a re-analysis of a major finding that shows how important it is to normalize correctly and to make absolutely sure your reads are aligning to what you think they are aligning to. Processing genomics data (at both the read level and the quantification level) is very difficult to get "right", so be extra wary of your own papers and of papers by others when drawing large biological conclusions from a single genomics data analysis.
Edit: Someone else beat me to this as I was typing :)
Following up on the above: many of the authors of this paper have just published a new preprint,
Comprehensive analysis of microbial content in whole-genome sequencing samples from The Cancer Genome Atlas project
which looks broadly across WGS data in the TCGA. From their abstract, they find:
Just saw on X that the original paper that this paper replied to has been retracted.
Indeed! It looks like the retraction statement is not live yet, so it’s unclear whether the authors agree with the retraction or whether this is Nature’s unilateral decision. Regardless, one wonders about the implications for the follow-up study.