Hi,
I'm trying to extract reads based on their start coordinate in a BAM file. I've tried using samtools view, but that seems to return all reads overlapping the region, not only those that start within it.
Apologies if this has been asked elsewhere but I couldn't find an answer via Google.
Many thanks.
The question is not clear to me. I will assume that you have a list of coordinates that you want to extract, in which case it would look like this:
samtools view -b -L ROI.bed file.bam > ROI_file.bam
where ROI.bed (the regions of interest) contains the start positions you are interested in.
Otherwise, please give an example of what you want to extract.
Sorry, I'll try to explain myself better. From my understanding, if I run the command:
output.bam will contain all reads that fall between 1000 and 1500. I want to extract reads that start at any coordinate within this region. So, for example, a read that starts at 1500 would also be extracted, even if most of the read does not lie within this region.
Apologies if I'm not correctly understanding how samtools view works!
clear now thanks :)
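One possible approach, as a rough sketch only: query the region with samtools view (which returns every read overlapping it), then keep just the alignments whose POS field (column 4 of a SAM record, the 1-based leftmost mapped position) lies inside the window. The helper name filter_by_start, the contig name chr1 and the file names are placeholders made up for illustration, and the region query assumes file.bam is sorted and indexed. Note that POS is the leftmost coordinate, so for reverse-strand reads it is not the 5' end.

```shell
# Hypothetical helper (not part of samtools): reads SAM text on stdin and
# keeps header lines plus alignments whose POS (column 4) is in [start, end].
filter_by_start() {
    awk -v s="$1" -v e="$2" 'BEGIN { FS = OFS = "\t" } /^@/ || ($4 >= s && $4 <= e)'
}

# Usage sketch (assumes an indexed file.bam and a contig called chr1):
#   samtools view -h file.bam chr1:1000-1500 \
#     | filter_by_start 1000 1500 \
#     | samtools view -b -o output.bam -
```

Any read starting inside the window also overlaps it, so the region query does not miss anything; the awk step only discards reads that start upstream of 1000 and merely extend into the region.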
If your ultimate goal is to get the number of 5' position of reads at each genomic position, then you can try bedtools genomecov -5.
EDIT: it is indeed not a very clear question.
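For the bedtools genomecov -5 suggestion, a minimal sketch of what the call could look like is below. The wrapper name five_prime_depth and the file names are placeholders for illustration; it assumes a coordinate-sorted BAM and uses -d to report a count for every position, which can be large for a whole genome.

```shell
# Hypothetical wrapper: per-base counts of read 5' ends from a sorted BAM.
# -ibam : take a BAM file as input
# -5    : count only the 5' position of each read instead of the full interval
# -d    : report the count at every genomic position (chrom, pos, count)
five_prime_depth() {
    bedtools genomecov -ibam "$1" -5 -d
}

# Usage sketch:
#   five_prime_depth file.bam > five_prime_counts.txt
```

Unlike a filter on the leftmost coordinate, this counts the strand-aware 5' end of each read, and it gives counts rather than the reads themselves.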