When I look at RNA-seq quantification data, I see miRNA genes:
protein_coding 19962
lncRNA 16901
processed_pseudogene 10167
unprocessed_pseudogene 2614
misc_RNA 2212
snRNA 1901
miRNA 1881 <---------
TEC 1057
So why would a TCGA cohort have a separate pipeline for miRNA quantification?
This video says small RNAs (mi/si/piRNA) are too short to be captured by regular RNA-seq kits
True. That is why miRNAs are a small fraction of detected genes in the above example, while in small RNA-seq they are the majority. No assay is bias-free, so you will always see some spurious miRNA hits. It's simply not black and white.
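To put a number on "small fraction", here is a minimal sketch that uses only the biotype counts pasted in the question. It assumes those eight biotypes are the whole table; if the original output had more rows, the share would be slightly lower. The variable names are just for illustration.

```python
# Gene counts per biotype, copied from the question above.
# Assumption: these eight rows are the complete table.
biotype_counts = {
    "protein_coding": 19962,
    "lncRNA": 16901,
    "processed_pseudogene": 10167,
    "unprocessed_pseudogene": 2614,
    "misc_RNA": 2212,
    "snRNA": 1901,
    "miRNA": 1881,
    "TEC": 1057,
}

# Total number of genes across the listed biotypes.
total = sum(biotype_counts.values())

# Share of the listed genes that are annotated as miRNA.
mirna_fraction = biotype_counts["miRNA"] / total

print(f"{total} genes listed, miRNA share: {mirna_fraction:.1%}")
```

Note this is the share of genes by biotype, not the share of reads. It says nothing about how many reads actually support each miRNA gene.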
that's the nice thing about controls though. if you are simply looking for variance in cases vs controls, then how the data was obtained doesn't matter so much. especially if you scale the data
I could not disagree more. If you want to make statements from data then the experiment must be performed accordingly. Quantifying noise and then pretending it was signal while transforming data to hide that is naive at best and fraud at worst.
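A minimal sketch of what "transforming data to hide that" can look like, assuming the scaling in question is a per-gene z-score. The counts below are made up for illustration and are not from any real dataset, and `zscore` is a helper written for this example.

```python
from statistics import mean, pstdev


def zscore(values):
    """Center on the mean and divide by the population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical per-sample read counts (illustrative only, not real data).
# A gene that is barely detected, as a miRNA might be in a standard library prep:
low_counts = [0, 1, 0, 2, 3, 0, 1, 2]
# A gene that is robustly detected:
high_counts = [1800, 2100, 1950, 2400, 3100, 1700, 2050, 2300]

for name, counts in [("low", low_counts), ("high", high_counts)]:
    scaled = zscore(counts)
    # After scaling, both genes have the same spread, so the difference
    # in evidence (a handful of reads vs thousands) is no longer visible.
    print(f"{name}: mean raw count = {mean(counts):.1f}, "
          f"spread after scaling = {pstdev(scaled):.2f}")
```

Both genes end up with the same spread after scaling, even though one rests on a handful of reads per sample. The scaled values alone cannot tell you which one is supported by real signal.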
i see. so it's noise, not comparatively small levels of detection
would you make the same case for lncRNA detected by a generic RNA-seq protocol?