Entering edit mode
6.1 years ago
aboyd003
•
0
I am attempting to count the number of reads in my bam at a list of specific coordinates. I am using samtools. Queryid specifies chromosome and mylookup specifies the coordinate. The coordinate is a single position and not a feature. A GFF file is not available to run standard programs like featureCount. The resulting count is 663504 which is more than 661089. Does anyone have ideas about this?
samtools view -c $FILE "$queryid":"$mylookup"-"$mylookup"
featureCountsaccepts custom position lists in its SAF format, check its manual. From this you can quantify any given intervals in the genome.I'm unsure of precisely what you want, but chr:pos-pos returns all alignments that overlap pos and not all alignments starting at pos.
Your title implies you wanted the latter. If this is the case, I don't think such a tool exists but you could use mpileup to get a single coordinate and then count the '^' symbols. Eg:
The regexp replaces anything not "^" with "" ([^x] is not x, so [^^] is not ^) and then counts the number of remaining symbols. "-Q0" is needed to avoid removal of low quality bases.
That may still remove duplicates unless other mpileup options are used? I forget now how it works.
Alternatively just use normal samtools view and count the number of lines exactly starting at the desired pos.
5