extract aligned reads to a sequence part and reset indices
2.1 years ago


I have NGS reads mapped to a genome. Now I need to extract only the reads that map to a certain region, which I can do easily with:

samtools view my_bam.bam GENOME:1000-2000 > my_region.sam

However, the extracted reads still carry their original genome coordinates (1000 to 2000 in this example). I need them numbered from 1, as if the requested region were a new reference sequence of its own.

1) Is there any tool (or samtools/sambamba setting) that can do this?

2) Sure, I can process the file manually and subtract the offset from each position. Is this the way to go? Are there any gotchas regarding the SAM format, e.g. offsets for reads mapping to the reverse strand? (I know I will also need to replace the sequence ID in the file.)

P.S. I wouldn't do this, but the tool I want to use requires such input :(.

samtools sam bam • 423 views

Use awk to subtract the offset from POS and from the mate position (PNEXT)?
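For what it's worth, here is the same subtraction as a minimal Python sketch rather than awk. The function name `shift_sam_line` and the new reference name `REGION` are made up for illustration. It relies on two things from the SAM format: POS and PNEXT are 1-based leftmost coordinates for both strands, so reverse-strand reads need no special treatment, and `samtools view` with a region returns every read that overlaps it, so some reads can start before position 1000. Dropping those reads, and blanking the mate fields when the mate falls left of the region, are simplifying assumptions of this sketch and may not be what your downstream tool expects.

```python
def shift_sam_line(line, region_start, new_name):
    """Re-base one SAM alignment line so that region_start becomes position 1.

    Returns the rewritten line (no trailing newline), or None when the read
    starts left of the region and therefore has no valid new POS.
    Hypothetical helper for illustration, not part of samtools.
    """
    if line.startswith("@"):
        # Header lines pass through untouched; a matching @SQ line for
        # new_name still has to be written separately.
        return line.rstrip("\n")

    fields = line.rstrip("\n").split("\t")
    offset = region_start - 1          # 1000 -> subtract 999, so 1000 maps to 1

    # Field 4 (index 3) is POS: 1-based leftmost base, same rule on both strands.
    new_pos = int(fields[3]) - offset
    if new_pos < 1:
        return None                    # assumption: drop reads starting before the region

    old_name = fields[2]
    fields[2] = new_name               # RNAME
    fields[3] = str(new_pos)

    # Fields 7 and 8 (index 6, 7) are RNEXT and PNEXT for the mate.
    if fields[6] in ("=", old_name) and int(fields[7]) > 0:
        new_pnext = int(fields[7]) - offset
        if new_pnext >= 1:
            fields[6] = "="
            fields[7] = str(new_pnext)
        else:
            # Assumption: the mate lies left of the region, so blank its
            # position and TLEN. The FLAG field is NOT adjusted here.
            fields[6], fields[7], fields[8] = "*", "0", "0"

    return "\t".join(fields)
```

You would call it on every line of my_region.sam with `region_start=1000` and write out the results that are not None. Reads that run past the end of the region are not checked here; catching them would require parsing the CIGAR string to get the alignment end.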

