I am working on RNA-seq data and have some questions about the htseq-count tool for counting reads.
Question 1. htseq-count outputs each gene name together with its read count. htseq-count says that:
"In the case of RNA-Seq, the features are typically genes, where each gene is considered here as the union of all its exons."
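As far as I understand it, "the union of all its exons" means something like the sketch below. The exon coordinates are made up for illustration, the function name is my own, and this is not HTSeq's actual code, just how I picture the merging of overlapping exons from different transcripts of one gene:

```python
def union_of_exons(exons):
    """Merge overlapping or adjacent (start, end) intervals.

    Coordinates are assumed to be 1-based and inclusive, as in a GTF file.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exons from two transcripts of the same gene.
exons = [(100, 200), (150, 300), (500, 600), (550, 580)]
gene_regions = union_of_exons(exons)
gene_length = sum(end - start + 1 for start, end in gene_regions)

print(gene_regions)  # [(100, 300), (500, 600)]
print(gene_length)   # 302
```

So a read would count for the gene if it falls inside any of the merged regions.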
I already had a BAM alignment file and ran htseq-count on it (after converting it to SAM first, of course, and providing the hg19 GTF file). It gave me Ensembl names and read counts.
My question is: suppose the read count for a particular Ensembl name is 13450. If I want to see these 13450 reads in the SAM/BAM file, how can I do that?
Question 2. I created my own cDNA "genome" from a couple of genes and mapped a FASTQ file to it using the GSNAP aligner. The BAM file that already existed for the same FASTQ file, aligned against the whole genome, gives very low counts for these genes compared with the counts I get from my own cDNA genome.
What could be the reason for this?
Hope to hear from you guys soon.