how htseq-count counts unstranded RNA-seq data
2.7 years ago
wiscoyogi ▴ 40

preliminary

Say I have some unstranded RNA-seq data aligned to the human reference genome, and I'm counting reads with htseq-count (--stranded=no).

My understanding (biologically) was that for a given protein_coding gene, reading DNA in the sense strand gives the protein_coding transcript, reading the gene in the opposite direction gives the non-coding version of the gene.

questions

  1. For reads mapping to a gene annotated as protein_coding, is a given read counted towards the protein_coding gene irrespective of the genome strand it aligns to, or is it considered noncoding? In other words, how does htseq-count consider read alignment directionality for unstranded RNA-seq data when assigning counts to a given gene?
  2. Say I am counting unstranded RNA-seq data aligned only to the exonic portion of the human genome. Do only reads mapping in the sense direction of the genome count? Does the exon FASTA of the human genome preserve directionality, or does it just have genomic coordinates? That is, would reads that align in the non-protein-coding direction for an exonic portion of the genome not be counted as protein_coding?
2.7 years ago

reading DNA in the sense strand gives the protein_coding transcript, reading the gene in the opposite direction gives the non-coding version of the gene.

Depends on the library prep. In some RNA-seq preps, your reads will run towards the beginning of the transcript; in others, the reads run towards the end. You have to find out which prep was used in order to analyze your data in the proper context.

I don't think HTSeq-count gives a fig whether or not a feature is designated protein coding.

When run unstranded, reads count no matter which strand they align to. That's the point of telling the software your prep is unstranded.
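
To make that concrete, here is a rough sketch of the idea in plain Python. It is not HTSeq's actual code, just a simplified model assuming single-end reads and a single exon; the names `Feature`, `Read`, and `overlapping_genes`, and all coordinates, are made up for illustration. Note that the feature's biotype never enters the logic, only its coordinates and (optionally) its strand.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    gene_id: str
    start: int   # 0-based, inclusive (hypothetical coordinates)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


class Read(NamedTuple):
    start: int
    end: int
    strand: str


def overlapping_genes(read, features, stranded):
    """Return the set of gene IDs whose features overlap the read.

    With stranded=False the read's strand is ignored entirely;
    with stranded=True the read must lie on the feature's strand.
    """
    genes = set()
    for f in features:
        overlaps = read.start < f.end and f.start < read.end
        strand_ok = (not stranded) or read.strand == f.strand
        if overlaps and strand_ok:
            genes.add(f.gene_id)
    return genes


features = [Feature("geneA", 100, 200, "+")]
sense_read = Read(120, 170, "+")
antisense_read = Read(120, 170, "-")

# Unstranded: both reads are assigned to geneA.
unstranded_sense = overlapping_genes(sense_read, features, stranded=False)
unstranded_antisense = overlapping_genes(antisense_read, features, stranded=False)

# Stranded: the antisense read no longer matches geneA.
stranded_antisense = overlapping_genes(antisense_read, features, stranded=True)

print(unstranded_sense, unstranded_antisense, stranded_antisense)
```

In this toy model, an unstranded run credits a read on either strand to geneA, whatever geneA's annotated biotype is.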


This doesn't answer my question. I'm wondering what a given read will count towards: the protein_coding or the noncoding annotation of the gene?


By default, reads that overlap features from two or more genes are not counted for any of them (htseq-count tallies them as ambiguous). But if the prep is stranded, and your features run in opposite directions, the read will count for the feature on the matching strand.
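
A small sketch of that behaviour, again as a simplified model and not HTSeq's real implementation: it assumes single-end reads, one exon per gene, and two hypothetical overlapping genes on opposite strands. The function name `assign` and the coordinates are invented; the `__no_feature` and `__ambiguous` labels mirror the special counters htseq-count prints at the end of its output.

```python
from collections import Counter

# (gene_id, start, end, strand); half-open coordinates, made-up values.
FEATURES = [
    ("coding_gene", 100, 200, "+"),
    ("antisense_gene", 150, 250, "-"),
]


def assign(read, features, stranded):
    """Assign one read (start, end, strand) to a gene or a special counter."""
    r_start, r_end, r_strand = read
    genes = {
        gene
        for gene, start, end, strand in features
        if r_start < end and start < r_end          # intervals overlap
        and (not stranded or strand == r_strand)    # strand check only if stranded
    }
    if not genes:
        return "__no_feature"
    if len(genes) > 1:
        return "__ambiguous"   # overlaps more than one gene: counted for none
    return next(iter(genes))


reads = [
    (160, 190, "+"),  # inside the region where both genes overlap
    (160, 190, "-"),  # same region, opposite strand
    (110, 140, "-"),  # only over coding_gene, but antisense to it
    (300, 350, "+"),  # outside every feature
]

unstranded_counts = Counter(assign(r, FEATURES, stranded=False) for r in reads)
stranded_counts = Counter(assign(r, FEATURES, stranded=True) for r in reads)

print("unstranded:", dict(unstranded_counts))
print("stranded:  ", dict(stranded_counts))
```

With these made-up coordinates, the unstranded run marks both reads in the shared region as ambiguous but still credits the antisense read at 110-140 to coding_gene, while the stranded run resolves the shared region by strand and drops that antisense read to `__no_feature`.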
