Can anyone tell me if this HTSeq calculation came out correctly?
7.3 years ago

I am new to this field. After running Bowtie and HTSeq, here is part of the result from the output.sam file (only some lines are listed); I would like to ask for your advice:

BD94_4129    0
BD94_4130    0
BD94_4131    0
BD94_4132    0
BD94_4133    0
BD94_4134    0
BD94_4135    0
BD94_4136    0
BD94_4137    0
BD94_4138    0
BD94_4139    0
BD94_4140    0
BD94_4141    0
BD94_4142    0
BD94_4143    0
__no_feature    6117090
__ambiguous    0
__too_low_aQual    0
__not_aligned    1813286
__alignment_not_unique    0

There are many 0s above the __no_feature line. Is this normal, or is it not supposed to happen? Thank you again for your advice!

rna-seq • 1.1k views

Would anyone like to comment on this and help me out? Thank you!

7.3 years ago

Without more information about the experiment it's a bit difficult to tell... A quick check to see whether HTSeq has at least counted some reads in genes is:

awk 'substr($1, 1, 2) != "__" {ngenes++; if ($2 == 0) nzero++; nassigned += $2} END {print ngenes+0, nzero+0, nassigned+0}' data.htseq

Here data.htseq is your output from HTSeq. This command prints three numbers: the number of genes, the number of genes with a count of zero, and the total number of reads assigned to genes. It's probably OK to have a few genes without any reads. The total number of reads assigned should be quite a bit larger than the number of reads in "__no_feature", unless your reference transcriptome is very incomplete (assuming what you have here is some sort of RNA-Seq experiment).
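If you'd rather do the same check in Python, something like the sketch below should work. It assumes the usual two-column layout shown in the question (feature name, then count), and the function name summarize_htseq and the dictionary keys are just made up for illustration. Besides the three numbers, it also reports the fraction of all reads that ended up assigned to a gene, which makes the comparison with "__no_feature" easier to see.

```python
def summarize_htseq(lines):
    """Summarize htseq-count style lines of the form 'feature count'.

    Hypothetical helper: names and return keys are illustrative only.
    """
    n_genes = 0      # features that are real genes (no "__" prefix)
    n_zero = 0       # genes with a count of zero
    n_assigned = 0   # total reads assigned to genes
    special = {}     # the "__" summary counters, e.g. __no_feature

    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, count = fields[0], int(fields[1])
        if name.startswith("__"):
            special[name] = count
        else:
            n_genes += 1
            if count == 0:
                n_zero += 1
            n_assigned += count

    total = n_assigned + sum(special.values())
    fraction = n_assigned / total if total else 0.0
    return {
        "genes": n_genes,
        "zero_genes": n_zero,
        "assigned": n_assigned,
        "special": special,
        "fraction_assigned": fraction,
    }


# A few lines copied from the excerpt in the question (not the full file).
example = [
    "BD94_4141    0",
    "BD94_4142    0",
    "BD94_4143    0",
    "__no_feature    6117090",
    "__ambiguous    0",
    "__too_low_aQual    0",
    "__not_aligned    1813286",
    "__alignment_not_unique    0",
]

summary = summarize_htseq(example)
print(summary["genes"], summary["zero_genes"], summary["assigned"])
print(summary["fraction_assigned"])
```

For a real file you would pass the open file handle instead of the list, e.g. `summarize_htseq(open("data.htseq"))`. A fraction close to zero on the whole file would suggest that the annotation and the alignment don't match up (different sequence names, wrong feature type or ID attribute, and so on), rather than that the genes are truly unexpressed.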

