different FeatureCounts output for the same data
0
0
Entering edit mode
8 weeks ago

Hi Everyone,

I analyzed 200 datasets from NCBI using the same code. However, when I applied the analysis to my original dataset, I noticed that the outputs were incorrect. After discussing with my supervisor, he conducted the same analysis using his own code, and it produced the correct results. I thought maybe my code was wrong, so I tried using his code, but the results were still incorrect. I'm unsure of the differences between the two sets of code. The problem seems to be that my FPKM values are too low. While my HISAT2 alignment rate is approximately 90%, the featureCounts alignment rate is only around 15-20%. This discrepancy is causing issues with the accuracy of my results. Any ideas?

This is my supervisor's code:

counts <- featureCounts(files = bams,
                        annot.ext = "~/Desktop/mouse_meta/gencode.vM32.primary_assembly.annotation.gtf.gz", 
                        isGTFAnnotationFile=TRUE,
                        GTF.featureType= c("exon"),
                        GTF.attrType="gene_id",
                        useMetaFeatures=TRUE,
                        allowMultiOverlap=F,
                        largestOverlap=F,
                        countMultiMappingReads=TRUE,
                        isPairedEnd=FALSE,
                        nthreads=10
)

This is my code:

counts <- featureCounts(files = bams, annot.ext = "/archive/alotaibih/sehribanb/new_mouse/gencode.vM32.primary_assembly.annotation.gtf.gz", 
                          isGTFAnnotationFile= TRUE,
                          GTF.featureType= c("exon"),
                          GTF.attrType= "gene_id",
                          useMetaFeatures= T,
                          countMultiMappingReads= T,
                          isPairedEnd= TRUE,
                          nthreads= 24
)
fpkm Counts Rsubread rna-seq • 530 views
ADD COMMENT
0
Entering edit mode

featureCounts alignment rate is only around 15-20%.

You mean assignment rate? It is that low even after allowing for counting of multi-mapping reads?

ADD REPLY
0
Entering edit mode

Yeah, let me show you my featureCounts result, it looks like this. It's too low.

ADD REPLY
0
Entering edit mode

Make sure you are using the same version of featureCounts, it has changed how it deals with paired end reads. The exact same command will produce half as many counts when moving from one version of FeatureCounts to another.

the two codes you have have also different flags for paired end.

ADD REPLY
0
Entering edit mode

Sure, I'll inquire about the version of Rsubread. I'll keep you posted if it works :) Thanks

ADD REPLY
0
Entering edit mode

just to clarify, it is not that the program algorithm works differently but the meaning of the flags changed;

before -p was sufficient to have it work in paired mode, but in later versions, an additional parameter is needed to count in paired-end mode.

so the symptom is that the same command may count exactly half as many reads between two versions, and the latter needs the countPairs flag to recover the same behavior

ADD REPLY

Login before adding your answer.

Traffic: 2085 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6