HTSeq - Extracting exon level read coverage of a specific gene
Entering edit mode
8 months ago
chet • 0

Dear all, I am trying to quantify RNASeq reads at the "exon level" using HTSeq. To achieve a quantitative exon comparison. I am using ENCODE mouse data which is Illumina reads alligned to GENCODE M27 (GRCm39) using STAR (bam files) and a relevant index (bai) file generated by Samtools.

My HTSeq command is as follows:

$htseq-count -f bam -r pos -s no -t exon -i gene_id --additional-attr=gene_name--additional_attr=exon_number Aligned.sortedByCoord.out.bam ~/Gencode_Mouse_39/gencode.vM27.primary_assembly.annotation.gff3 > exon_counts.txt

Apparently this approach provides exon coverage for all "gene_id" attributed genes. Is there any method to limiting this approach to a "specific" ensemble annotated gene only? (present in the GFF file?) I have gone through the HTSeq documentation as well as several topics but couldn't find any example or implementation. Any comments or suggestions are welcome

alignment exon RNASeq HTSeq • 499 views
Entering edit mode
8 months ago
yhoogstrate ▴ 130

I would recommend to count all genes and to filter/select those you want afterwards. It probably feels like a waste of resources, but when you use a complete gene annotation as input and HTSeq uses it entirely, it will cope more naturally with reads that have ambiguous maps. Consider the examples used to explain the difference between union, intersection_strict, and intersection_nonempty.


Login before adding your answer.

Traffic: 1932 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6