Question: featureCounts: extremely low rate of 'Successfully assigned alignments'
3 months ago, Yi Chu wrote:

When I used featureCounts to count RNA-seq reads, I got an extremely low rate of successfully assigned alignments: 134418 (0.4%). This is weird, because the hisat2 overall mapping rate is quite high (94.8%), and even the unique mapping rate is 45.0%. I looked at the summary file, and a large percentage of the unassigned reads fall under multimapping and no features, as shown in the figure below:

[Figure: featureCounts summary for this sample]

I checked five other samples from the same species, and the results were very similar.

[Figure: featureCounts summaries for the other five samples]

My code used during mapping and counting is attached:

nohup hisat2 --new-summary -p 3 -x ~/Fman/index/index -1 1.clean_data/31-L-2-A_1.fq.gz -2 1.clean_data/31-L-2-A_2.fq.gz -S 31-L-rep2.sam --rna-strandness RF --dta &
samtools sort -o 31-L-rep2.bam 31-L-rep2.sam
featureCounts -T 10 -p -t exon -g gene_id -s 2 -a ~/Fman/ -o 31-L-rep2_featureCounts.txt 31-L-rep2.bam

I'm quite sure my library is strand-specific, prepared with the dUTP method, and my sample is tetraploid. My questions are:

1. Why are the results from hisat2 and featureCounts so different?
2. Did I set any of the parameters wrongly?
3. Or is this just normal for polyploid species?


Have you visually inspected the alignments? Are they properly nested within exons, or are they scattered all over the genome? DNA contamination is a rare but possible problem. It would lead to good alignments but poor assignments/counts.

written 3 months ago by genomax
3 months ago, h.mon wrote:

You are mixing up two concepts: hisat2 maps the reads to a reference genome, while featureCounts assigns mapped reads to genomic features (typically genes). So even a good mapping rate doesn't necessarily mean you will get high counts for the annotated features.
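To see where the mapped reads end up, it can help to turn the featureCounts summary file into percentages per status. The following is only a minimal sketch: it assumes the usual tab-separated `.summary` layout (a header line, then one `Status<TAB>count` row per category, single BAM column), and `summary_percentages` is a hypothetical helper name, not part of featureCounts.

```shell
# Hypothetical helper: print each non-zero status in a featureCounts
# .summary file with its count and its share of all alignments.
# Assumes one BAM column, i.e. lines of the form "Status<TAB>count".
summary_percentages() {
  awk -F'\t' '
    NR == 1 { next }                       # skip the "Status <bam>" header
    {
      count[$1] = $2 + 0                   # force numeric value
      total += $2
      order[++n] = $1                      # remember the original row order
    }
    END {
      if (total == 0) { print "no alignments found" > "/dev/stderr"; exit 1 }
      for (i = 1; i <= n; i++)
        if (count[order[i]] > 0)
          printf "%s\t%d\t%.1f%%\n", order[i], count[order[i]],
                 100 * count[order[i]] / total
    }
  ' "$1"
}
```

With the output name used in the question, the call would presumably be `summary_percentages 31-L-rep2_featureCounts.txt.summary`. Large `Unassigned_MultiMapping` and `Unassigned_NoFeatures` shares next to a tiny `Assigned` share would match what the question describes.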

There are multiple factors that can cause this, and you will have to investigate further to find the cause. You said the species you are interested in is a tetraploid; do you know if the reference genome is a haploid, diploid, or tetraploid representation? Did you check whether the genome annotation has duplicated feature IDs on different chromosomes, or many overlapping features? Did you check whether featureCounts -s 0 or -s 1 improves the assignment rate?
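Two of these checks can be scripted. The sketch below is an assumption-laden illustration, not a definitive procedure: it assumes a GTF annotation with `gene_id "..."` attributes in column 9, and the function names and the `strand_test_s*.txt` output names are hypothetical. The featureCounts options are the ones from the question, with only `-s` varied.

```shell
# Hypothetical helper: list gene_id values whose exons occur on more
# than one chromosome/scaffold in a GTF, with the number of sequences.
genes_on_multiple_chroms() {
  awk -F'\t' '
    $0 !~ /^#/ && $3 == "exon" {
      if (match($9, /gene_id "[^"]+"/)) {
        # strip the leading gene_id " (9 chars) and the closing quote
        id = substr($9, RSTART + 9, RLENGTH - 10)
        key = id SUBSEP $1
        if (!(key in seen)) { seen[key] = 1; nchrom[id]++ }
      }
    }
    END { for (id in nchrom) if (nchrom[id] > 1) print id "\t" nchrom[id] }
  ' "$1" | sort
}

# Hypothetical helper: rerun featureCounts with -s 0, 1 and 2 and print
# the "Assigned" count for each setting.
# $1: annotation GTF, $2: sorted BAM. Requires featureCounts on PATH.
compare_strandedness() {
  for s in 0 1 2; do
    featureCounts -T 10 -p -t exon -g gene_id -s "$s" \
      -a "$1" -o "strand_test_s${s}.txt" "$2"
    printf 's=%s\t' "$s"
    awk -F'\t' '$1 == "Assigned" { print $2 }' "strand_test_s${s}.txt.summary"
  done
}
```

For example, `genes_on_multiple_chroms annotation.gtf` and `compare_strandedness annotation.gtf 31-L-rep2.bam`, where `annotation.gtf` stands in for the real annotation file. If the assigned count jumps at `-s 0` or `-s 1`, the strandedness setting is the likely culprit; if it stays low for all three, the annotation or the reference representation deserves a closer look.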
