when I look at RNA-seq quantification data, I see miRNA genes:
protein_coding                        19962
lncRNA                                16901
processed_pseudogene                  10167
unprocessed_pseudogene                2614 
misc_RNA                              2212 
snRNA                                 1901 
miRNA                                 1881  <---------
TEC                                   1057
So why would a TCGA cohort have a separate pipeline for miRNA quantification?
https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/miRNA_Pipeline/
This video says small RNAs (mi/si/piRNA) are too short to be captured by regular RNA-seq kits
True. That is why miRNAs are a small fraction of detected genes in the above example, while in small RNA-seq they are the majority. No assay is bias-free, so you will always see some spurious miRNA hits. It's simply not black and white.
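To put a number on "small fraction", here is a quick sketch that computes the miRNA share from the counts in the table above. It assumes those eight biotypes are the whole list, which may not be the case if the table was truncated. It also counts gene entries per biotype, not reads, so it says nothing about how well any of those genes were actually measured.

```python
# Gene counts per biotype, copied from the table above.
# Assumption: these eight rows are the complete list.
biotype_counts = {
    "protein_coding": 19962,
    "lncRNA": 16901,
    "processed_pseudogene": 10167,
    "unprocessed_pseudogene": 2614,
    "misc_RNA": 2212,
    "snRNA": 1901,
    "miRNA": 1881,
    "TEC": 1057,
}

total = sum(biotype_counts.values())
mirna_share = biotype_counts["miRNA"] / total

print(f"miRNA: {biotype_counts['miRNA']} of {total} genes ({mirna_share:.1%})")
# prints: miRNA: 1881 of 56695 genes (3.3%)
```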
that's the nice thing about controls though. if you are simply looking for variance in cases vs controls, then how the data was obtained doesn't matter so much, especially if you scale the data
I could not disagree more. If you want to make statements from data, then the experiment must be performed accordingly. Quantifying noise, pretending it is signal, and transforming the data to hide that is naive at best and fraud at worst.
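A toy illustration of why scaling can hide the problem, using made-up numbers rather than real data: after z-scoring, a feature measured at 0 to 3 counts looks exactly like one measured in the thousands, so the scaled values alone no longer show how little signal there was. The sample values and the `zscore` helper are hypothetical, chosen only for this sketch.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical counts for one feature across six samples.
low_counts = [0, 1, 0, 2, 1, 3]                       # near the noise floor
high_counts = [5000 + 1000 * c for c in low_counts]   # same pattern, well measured

z_low = zscore(low_counts)
z_high = zscore(high_counts)

# The two scaled profiles agree up to floating-point rounding.
same = all(abs(a - b) < 1e-9 for a, b in zip(z_low, z_high))
print("identical after scaling:", same)
```

This only shows that standardization throws away magnitude. Whether a given low-count feature is noise or real low-level detection still has to be settled by the assay and the experimental design.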
i see. so it's noise, not comparatively small levels of detection
would you make the same case for lncRNA detected by a generic RNA-seq protocol?