Why Does Cufflinks With Mask (-M) Option Have Lower FPKM For mRNA Genes?
10.4 years ago
Xianjun ▴ 310

Dear all,

I was curious how much the -M (mask file) option can improve the FPKM estimates from Cufflinks. The manual says:

-M/--mask-file <mask.(gtf/gff)>
Tells Cufflinks to ignore all reads that could have come from transcripts in this GTF file. We recommend including any annotated rRNA, mitochondrial transcripts other abundant transcripts you wish to ignore in your analysis in this file. Due to variable efficiency of mRNA enrichment methods and rRNA depletion kits, masking these transcripts often improves the overall robustness of transcript abundance estimates.

So I would expect that providing a mask file containing rRNA, tRNA, mitochondrial genes, etc. would decrease the "total mapped reads" (i.e. the denominator), which would lead to increased FPKM values. But what I actually see is that, for most mRNA genes, the FPKM values with -M are smaller than those without -M. See the attached figures (I expected most of the dots to be under the red dotted line, which is x=y).
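To make that expectation concrete, here is a minimal sketch of the textbook FPKM formula with made-up numbers. It only illustrates the denominator argument above; Cufflinks estimates abundances with a statistical model, so this is not its actual computation, and every count below is hypothetical.

```python
def fpkm(fragments, length_bp, total_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)

# Hypothetical numbers, for illustration only
gene_fragments = 500        # fragments assigned to one mRNA gene
gene_length = 2000          # transcript length in bp
total_mapped = 20_000_000   # all mapped fragments (denominator without -M)
masked = 5_000_000          # fragments falling in masked rRNA/tRNA/chrM transcripts

without_mask = fpkm(gene_fragments, gene_length, total_mapped)
with_mask = fpkm(gene_fragments, gene_length, total_mapped - masked)

print(f"without mask: {without_mask:.2f}")
print(f"with mask:    {with_mask:.2f}")
```

If the masked fragments really were removed from the denominator, the FPKM of an unmasked gene could only go up, which is why the observed decrease is surprising.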

I have to admit that -M does greatly reduce the FPKM for rRNA genes. Still, it is a mystery why most mRNA genes have lower FPKM after applying the -M option. Has anyone observed something similar?

btw, here are my cufflinks arguments with -M:

cufflinks --library-type fr-unstranded -o cufflink_w_M -p 8 -G /data/iGenome/Homo_sapiens/UCSC/hg19/Annotation/Genes/gencode.v13.annotation.karotyped.gtf -M /data/iGenome/Homo_sapiens/UCSC/hg19/Annotation/Genes/chrM.rRNA.tRNA.gtf --multi-read-correct accepted_hits.bam

and without -M:

cufflinks --library-type fr-unstranded -o cufflink_wo_M -p 8 -G /data/iGenome/Homo_sapiens/UCSC/hg19/Annotation/Genes/gencode.v13.annotation.karotyped.gtf --multi-read-correct accepted_hits.bam

Thanks

-Xianjun

Cufflinks output with -M vs. without -M


It looks like this might be related to the options --compatible-hits-norm and --total-hits-norm. The documentation isn't super clear, but I'd suggest playing around with these options and seeing what happens.


Thanks for the clue. I am re-running cufflinks with the --compatible-hits-norm option (by default it uses --total-hits-norm). I will update you with the result. Thanks.


Does anyone have a clue about this mystery?

10.4 years ago
Xianjun ▴ 310

Here is the plot after applying the --compatible-hits-norm option. It looks much more like what I expected. Thanks to Chris for the tip.

8.3 years ago

Hey Xianjun,

I was trying to make my own mask GTF file (hg19) but failed. Is there a chance that you could share your GTF file?

Also, could you post the complete cufflinks command that gave you the right results?

Thank you so much!
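For anyone with the same problem, here is a rough sketch of one way such a mask file could be derived from an annotation GTF. It is not necessarily how the chrM.rRNA.tRNA.gtf file above was made. It assumes GENCODE-style attributes with a gene_type field, and the gene type names, the function names and the chrM label are all assumptions to check against your own annotation. tRNA records may live in a separate file and would then need to be added separately.

```python
import re

# Assumed GENCODE-style gene_type values; adjust to your annotation
MASK_GENE_TYPES = {"rRNA", "Mt_rRNA", "Mt_tRNA"}
GENE_TYPE_RE = re.compile(r'gene_type "([^"]+)"')

def is_masked(gtf_line, mask_chrom="chrM", mask_types=MASK_GENE_TYPES):
    """Return True if a GTF record is on mask_chrom or has a masked gene_type."""
    if gtf_line.startswith("#") or not gtf_line.strip():
        return False
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9:          # not a well-formed GTF record
        return False
    if fields[0] == mask_chrom:  # everything on the mitochondrial chromosome
        return True
    match = GENE_TYPE_RE.search(fields[8])
    return bool(match and match.group(1) in mask_types)

def write_mask_gtf(annotation_path, mask_path):
    """Copy masked records from annotation_path to mask_path; return the count."""
    kept = 0
    with open(annotation_path) as src, open(mask_path, "w") as dst:
        for line in src:
            if is_masked(line):
                dst.write(line)
                kept += 1
    return kept
```

Calling write_mask_gtf("annotation.gtf", "mask.gtf") with your own paths would write the subset, and the resulting file is what gets passed to -M.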
