substring BAM alignments from read direction
0
0
Entering edit mode
3 months ago
June ▴ 10

Hello, Is there a known utility or sub routine to substring a BAM by sequence offset and length? I think it will be easy for SEQ and QUAL fields, but subset CIGAR seems a pain on the table. Thanks in advance!

CIGAR BAM • 428 views
ADD COMMENT
0
Entering edit mode

to substring a BAM by sequence offset and length?

Can you provide some additional clarification of what you are looking to do?

ADD REPLY
0
Entering edit mode

Thanks, GenoMax. I am looking for a way to substring my BAM file alignment to the read beginning information, which is needed to identify ATAC-seq Tn5 insertion sites. For example, I have a paired alignment like below:

lh00134:214:22FC2KLT3:8:1111:36385:6464 147 chr1 9998 0 57S73M4I3M1D20M = 10010 -81 TGACACGAATACTAGGACGAATCGTAGCCGTAGTCGTAAGCGTAAGACTGGTATTAGCGATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC 555555555555-5-5555555--5----555555---555-5-5-5----FFF-F---55F5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

I want to extract the first 29 base pair in the read direction with matching position, SEQ, CIGAR and Phred Qual fields. Seems for this alignment it will be 1M4I3M1D20M. Is there a way to do this in a large scale? I also find another post similar to my needs: Extracting CIGAR string from a specific region of a read in a bam file using pysam?. But seems it is not easy.

ADD REPLY
0
Entering edit mode

Think I figured out a routine if I could expand the CIGAR string first: the $string is an expanded CIGAR which could be generated by offset and length. To expand the CIGAR string, I copied code from Ben Langmead (BtlBio::Alignment:Util.pm).

sub getCIGAR{
my $string = shift;
my $i = 0;
my $nCIGAR;
my $count = 0;
while($i < length($string)){
my $exp;
my $current = substr($string, $i, 1);
my $next = substr($string, $i+1, 1);
if($current eq $next){
$count ++;      
}
else{
$exp = $count + 1 . $current;
$count = 0; 
}
$nCIGAR = $nCIGAR . $exp; 
$i++;
}
return $nCIGAR;
    }
ADD REPLY

Login before adding your answer.

Traffic: 3540 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6