How can I get # of reads mapping to CDS/rRNA features or intergenic? I did my alignment with tophat and I have .gtf .gff and .bed annotations of my features.
You may try https://www.broadinstitute.org/cancer/cga/rna-seqc or http://rseqc.sourceforge.net/
Use HTSeq tool.
python -m HTSeq.scripts.count [options] <alignment_file> <gff_file>
Example command line to count reads mapped to the CDS:
python -m HTSeq.scripts.count --type=CDS --idattr=gene_id --mode=union --stranded=yes <sam_file> <gff_file>
Thanks for the useful tool, but it doesn't seem to work with tophat alignemnts:
7584 GFF lines processed.
Warning: No features of type 'exon' found.
Error occured when reading beginning of SAM/BAM file.
('SAM line does not contain at least 11 tab-delimited fields.', 'line 1 of file con1/accepted_hits.bam')
[Exception type: ValueError, raised in _HTSeq.pyx:1276]
Hmm, looks like I need to specify -f bam and install a bam reader.
I can confirm that yes, all you need to do is specify "-f bam" and this error should go away. The error is misleading because it says "SAM/BAM". Many other tools perform this detection behind-the-scenes with no problems, so it surprising that HTSeq does not.
I am not sure (this is just a guess) but the problem may be because you are providing bam file to HTseq as an input. It requires a SAM file. You can make HTseq read from the standard input ("-") using the following command.
samtools view accepted_hits.bam | python -m HTSeq.scripts.count --type=CDS --idattr=gene_id --mode=union --stranded=yes - <gff_file>
Or you can convert your bam file to sam file
samtools view -h input.bam > output.sam and give it to HTSeq.
If this is not the case, would you mind pasting the first few lines from your bam/sam.
Though you can see this in the example command that is provided here, I'd like to point out that to get the full distribution of CDS vs UTR vs Introns vs Intergenic, you will have to run at least that many commands with htseq-count, changing the --type= argument for each feature type.
Login before adding your answer.
Use of this site constitutes acceptance of our User Agreement and Privacy