Question: htseq read count zero?
2.4 years ago, cvu (India) wrote:

Hi all,

I'm analysing RNA-Seq data, with the final goal of calculating RPKM values. First I did a de novo assembly of the reads using the Trinity assembler. Then I used the Trinity output as the reference for bowtie2 and mapped the same reads (the ones used for the de novo assembly) against it. But when I ran htseq, it gave zero read counts for some Trinity contigs. Since I used the same reads for assembly and mapping, I shouldn't get zero read counts.

Here is the workflow:

Trinity -> bowtie2 -> htseq -> RPKM

Ideally I should not get zero read count for any of the trinity contigs, since all contigs were created from same reads. I don't understand where it went wrong.

Any help would be appreciated. Thanks.
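For reference, the last step of the workflow can be sketched as follows. This is a minimal sketch that assumes the usual RPKM definition (reads per kilobase of contig per million mapped reads); the function name and the example numbers are made up for illustration and do not come from this thread.

```python
def rpkm(count, length_bp, total_mapped_reads):
    """Reads Per Kilobase of contig per Million mapped reads.

    count              -- reads assigned to the contig (e.g. from htseq-count)
    length_bp          -- contig length in base pairs
    total_mapped_reads -- total number of mapped reads in the sample
    """
    return count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical contig: 500 reads, 2000 bp, 10 million mapped reads in total.
example_rpkm = rpkm(500, 2000, 10_000_000)
print(example_rpkm)

# A contig with zero assigned reads always gets an RPKM of 0.
zero_rpkm = rpkm(0, 2000, 10_000_000)
print(zero_rpkm)
```

A zero count therefore propagates straight into a zero RPKM, which is why it is worth finding out where the reads were lost before normalising.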

rna-seq trinity htseq • 1.3k views

Try running transrate on your data first. My guess is that many of the affected contigs are low quality and should probably be filtered.

written 2.4 years ago by Devon Ryan

Thanks Devon. I have not filtered my Trinity output! Maybe that's one of the reasons.

written 2.4 years ago by cvu

It's always informative to specify exactly which commands you used.

written 2.4 years ago by WouterDeCoster
2.4 years ago, Carlo Yague (Belgium) wrote:

Have you checked the bottom of the output table? It gives the number of reads counted as __ambiguous, __not_aligned, and so on.

Htseq is possibly more stringent than Trinity and may leave mapped reads uncounted for any of the reasons listed below. This could explain why some contigs have zero counts.

__no_feature: reads (or read pairs) which could not be assigned to any feature;
__ambiguous: reads (or read pairs) which could have been assigned to more than one feature and hence were not counted for any of these.
__too_low_aQual: reads (or read pairs) which were skipped due to the -a option, see below
__not_aligned: reads (or read pairs) in the SAM file without alignment
__alignment_not_unique: reads (or read pairs) with more than one reported alignment.
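To see at a glance where the reads went, the count table can be split into real features and the special `__` counters. The following is only a sketch: it assumes the usual two-column, tab-separated htseq-count output, and the contig names and numbers in the example are invented.

```python
def summarize_htseq_counts(text):
    """Split htseq-count output into feature counts and special counters."""
    features = {}
    special = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, count = line.rsplit("\t", 1)
        # Special counters at the bottom of the table start with "__".
        if name.startswith("__"):
            special[name] = int(count)
        else:
            features[name] = int(count)
    zero_features = sorted(n for n, c in features.items() if c == 0)
    return features, special, zero_features


# Invented example table (contig names and counts are hypothetical).
example_table = "\n".join([
    "contig_1\t120",
    "contig_2\t0",
    "contig_3\t45",
    "__no_feature\t10",
    "__ambiguous\t3",
    "__too_low_aQual\t250",
    "__not_aligned\t30",
    "__alignment_not_unique\t80",
])

features, special, zero_features = summarize_htseq_counts(example_table)
counted = sum(features.values())
total = counted + sum(special.values())
print(f"assigned to features: {counted} of {total}")
# Largest special counter first: this is usually the main reason for lost reads.
for name, n in sorted(special.items(), key=lambda kv: -kv[1]):
    print(name, n)
print("zero-count features:", zero_features)
```

If one special counter dominates (here `__too_low_aQual`), that points to the option worth revisiting.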

Also, you could check your BAM files with samtools idxstats to see whether bowtie mapped reads to all contigs.
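The idxstats check could look roughly like this. `samtools idxstats` prints one tab-separated line per reference sequence (name, length, mapped reads, unmapped reads), with a final `*` line for reads that have no reference. The sketch below parses such text; the example values and the file name `sample.bam` are assumptions for illustration.

```python
def contigs_without_reads(idxstats_text):
    """Return the names of contigs with zero mapped reads.

    Expects the text printed by `samtools idxstats sample.bam`
    (typically run on a sorted, indexed BAM file).
    """
    empty = []
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # Summary line for reads without a reference; not a contig.
            continue
        if int(mapped) == 0:
            empty.append(name)
    return empty


# Invented idxstats output (contig names and numbers are hypothetical).
example_idxstats = "\n".join([
    "contig_1\t1500\t320\t0",
    "contig_2\t800\t0\t0",
    "contig_3\t2100\t95\t2",
    "*\t0\t0\t4100",
])

print(contigs_without_reads(example_idxstats))
```

Contigs that show mapped reads here but zero in the htseq table lost their reads at the counting step, not at the mapping step.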


Thanks a lot, Carlo Yague. I changed the option to -a 5 and the read counts increased.
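To estimate the effect of -a before rerunning htseq, one can tally the mapping qualities (column 5 of the SAM file). This sketch assumes that htseq-count skips alignments whose quality is below the -a value (10 by default, as far as I know); the read names and SAM lines in the example are invented.

```python
def count_low_mapq(sam_text, thresholds=(5, 10)):
    """Count aligned reads whose MAPQ falls below each threshold."""
    below = {t: 0 for t in thresholds}
    aligned = 0
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip empty lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        aligned += 1
        mapq = int(fields[4])
        for t in thresholds:
            if mapq < t:
                below[t] += 1
    return aligned, below


# Invented SAM records (read and contig names are hypothetical).
example_sam = "\n".join([
    "@SQ\tSN:contig_1\tLN:1500",
    "r1\t0\tcontig_1\t1\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tcontig_1\t5\t7\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tcontig_1\t9\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t16\tcontig_1\t12\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
])

aligned, below = count_low_mapq(example_sam)
for threshold, n in below.items():
    print(f"-a {threshold}: {n} of {aligned} aligned reads would be skipped")
```

Lowering -a keeps more reads, but reads with a very low MAPQ are often ones that fit several contigs equally well, which is common with closely related Trinity transcripts, so their assignment is less certain.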

written 2.4 years ago by cvu