Say I have unstranded RNA-seq data aligned to the human reference genome, and I'm counting reads per gene with htseq-count (--stranded=no).
My understanding (biologically) was that, for a given protein_coding gene, reading the DNA on the sense strand gives the protein_coding transcript, while reading the gene in the opposite direction gives a non-coding (antisense) version of the gene.
- For a read that overlaps a gene annotated as protein_coding, is the read counted towards that gene regardless of which genome strand it aligns to, or is it treated as non-coding? In other words, how does htseq-count handle alignment direction when assigning counts to a gene for unstranded RNA-seq data?
- Say I am counting unstranded RNA-seq data aligned only to the exonic portion of the human genome. Do only reads mapping in the sense direction count? Does an exon-only human genome FASTA preserve strand information, or does it just carry genomic coordinates? Would reads that align in the non-coding direction over an exonic region therefore not be counted as protein_coding?
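
To make the two possibilities concrete, here is a toy Python sketch of what I mean by strand-agnostic versus strand-aware counting. It is not HTSeq's code: the `Gene` and `Read` classes, the coordinates, the gene name, and the `count_reads` function are all made up for illustration, and it ignores real details such as ambiguous overlaps, multi-mapping, and paired-end reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # inclusive genomic coordinate
    end: int     # inclusive genomic coordinate
    strand: str  # "+" or "-", as annotated


@dataclass(frozen=True)
class Read:
    start: int
    end: int
    strand: str  # strand of the genome the read aligned to


def overlaps(read: Read, gene: Gene) -> bool:
    """True if the read and the gene share at least one position."""
    return read.start <= gene.end and gene.start <= read.end


def count_reads(reads, gene: Gene, stranded: bool) -> int:
    """Count reads overlapping the gene.

    stranded=False: alignment strand is ignored (any overlapping read counts).
    stranded=True:  only reads on the gene's annotated strand count.
    """
    n = 0
    for read in reads:
        if not overlaps(read, gene):
            continue
        if stranded and read.strand != gene.strand:
            continue
        n += 1
    return n


# Hypothetical protein_coding gene on the + strand and three toy reads.
gene = Gene("GENE_A", 100, 200, "+")
reads = [
    Read(110, 160, "+"),  # overlaps, same strand as the gene
    Read(120, 170, "-"),  # overlaps, opposite strand
    Read(300, 350, "+"),  # no overlap
]

unstranded_count = count_reads(reads, gene, stranded=False)
stranded_count = count_reads(reads, gene, stranded=True)
print(unstranded_count, stranded_count)
```

In this toy example the read on the `-` strand is the one in question: it is included when strand is ignored and excluded when strand is enforced. My question is which of these two behaviours corresponds to --stranded=no.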