1
0
4.2 years ago
cvu ▴ 180

Hi all,

I'm analysing RNA-Seq data with the goal of calculating RPKM values. First I did a de novo assembly of the reads with the Trinity assembler. Then I used the Trinity output as the reference for bowtie2 and mapped the same reads (the ones used for the de novo assembly) against it. But when I ran htseq, it gave zero read counts for some Trinity contigs. Since I used the same reads for assembly and mapping, I shouldn't get zero read counts.

Here is the workflow:

Trinity -> bowtie2 -> htseq -> RPKM
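For reference, the last step of this workflow can be sketched as follows. This is only a minimal illustration of the usual RPKM definition (reads per kilobase of contig per million mapped reads); the contig names, counts, lengths, and the total of mapped reads are made-up values, and the function name `rpkm` is hypothetical rather than part of any tool mentioned here.

```python
def rpkm(count, length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # count / (length_bp / 1e3) / (total_mapped_reads / 1e6)
    return count * 1e9 / (length_bp * total_mapped_reads)

# Illustrative numbers only (not taken from the thread).
counts = {"contig_1": 120, "contig_2": 0, "contig_3": 15}
lengths = {"contig_1": 1500, "contig_2": 800, "contig_3": 2300}
total_mapped = 1_000_000  # assumed library-wide number of mapped reads

values = {name: rpkm(counts[name], lengths[name], total_mapped) for name in counts}
for name, value in values.items():
    print(f"{name}\t{value:.2f}")
```

A contig with a zero count gets an RPKM of zero whatever its length, which is why unexpected zeros at the counting step matter for the final values.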

Ideally I should not get a zero read count for any of the Trinity contigs, since all contigs were created from the same reads. I don't understand where it went wrong.

Any help would be appreciated. Thanks.

RNA-Seq htseq trinity • 1.9k views
2

Try running transrate on your data first. My guess is that many of the affected contigs are low quality and should probably be filtered.

0

Thanks, Devon. I have not filtered my Trinity output! Maybe that's one of the reasons.

0

It's always informative to specify exactly which commands you used.

2
4.2 years ago

Have you checked the bottom of the output table? It gives the number of reads counted as __ambiguous/__not_aligned/...

htseq-count is possibly more stringent than Trinity and may leave some mapped reads uncounted for several reasons, which are described in its documentation. This could explain why some contigs have zero counts.

__no_feature: reads (or read pairs) which could not be assigned to any feature;
__ambiguous: reads (or read pairs) which could have been assigned to more than one feature and hence were not counted for any of these.
__too_low_aQual: reads (or read pairs) which were skipped due to the -a option, see below
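To see how many reads end up in these categories, one option is to split the htseq-count table into ordinary features and the special counters, which are the rows whose name starts with `__`. The sketch below assumes the usual two-column, tab-separated output; the function name `split_htseq_counts` and the example rows are hypothetical and only for illustration.

```python
import csv
import io

def split_htseq_counts(lines):
    """Split htseq-count output into per-feature counts and special counters."""
    features = {}
    special = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, count = row[0], int(row[1])
        if name.startswith("__"):
            special[name] = count  # e.g. __no_feature, __ambiguous
        else:
            features[name] = count
    return features, special

# Made-up table; in practice, pass an open file handle instead.
example = io.StringIO(
    "contig_1\t120\n"
    "contig_2\t0\n"
    "contig_3\t15\n"
    "__no_feature\t40\n"
    "__ambiguous\t5\n"
    "__too_low_aQual\t300\n"
)
features, special = split_htseq_counts(example)
zero_contigs = [name for name, n in features.items() if n == 0]
print("zero-count contigs:", zero_contigs)
print("special counters:", special)
```

If a special counter such as `__too_low_aQual` is large compared with the feature counts, the matching filter is a likely place where reads are being dropped.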


Also, you could run samtools idxstats on your BAM files to check whether bowtie2 mapped reads to all contigs.
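The idxstats check can be scripted. `samtools idxstats` prints one tab-separated line per reference with the name, the sequence length, the number of mapped reads and the number of unmapped reads, followed by a final `*` line for reads without a reference. A minimal sketch that lists contigs with no mapped reads, where the function name and the example lines are made up for illustration:

```python
def contigs_without_reads(idxstats_lines):
    """Return reference names that have zero mapped reads in idxstats output."""
    empty = []
    for line in idxstats_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, _length, mapped, _unmapped = fields[:4]
        if name == "*":
            continue  # final line: reads not assigned to any reference
        if int(mapped) == 0:
            empty.append(name)
    return empty

# Illustrative lines; in practice, read the saved idxstats output from a file.
example = [
    "contig_1\t1500\t120\t0\n",
    "contig_2\t800\t0\t0\n",
    "contig_3\t2300\t15\t1\n",
    "*\t0\t0\t42\n",
]
print(contigs_without_reads(example))
```

If a contig has mapped reads here but a zero count from htseq, the reads are probably being lost at the counting step rather than at the mapping step.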

0

Thanks a lot, Carlo Yague. I changed the option to -a 5 and the read counts increased.