Fetching All Alignments From A Sam/Bam By Read Header In Perl
12.2 years ago
Abhi ★ 1.6k

Hey Guys

I am wondering if there is a slick way to access all the alignments for a read in a SAM or BAM file, given the read header. Since the existing codebase is in Perl, I would prefer something that can be done in or via Perl.

By default, BAMs are indexed by location, so I guess the built-in samtools indexing won't work.

I should also say the input BAM file will have on the order of 500 million total alignments, and many reads are expected to align to more than one place in the genome. Given the size of the data, loading it all into one big hash is not turning out to be memory friendly.

Thanks!
-Abhi

bam perl samtools sam • 3.1k views
12.2 years ago
SES 8.6k

Have a look at Bio::DB::Sam. It does not ship with core BioPerl, and to compile it you will need to get the Kent source tree and a recent version of SAMtools. Building it can be tricky depending on your OS, but I'm sure there are plenty of people who can offer advice.
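
If installing the module is a hurdle, a plain streaming scan is one memory-friendly fallback. The sketch below is not Bio::DB::Sam; it is a swapped-in technique that assumes the alignments arrive as SAM text (for example piped from `samtools view`) and matches on the first column (QNAME). The subroutine name `alignments_for_read` and the sample records are made up for illustration. It holds only the matching lines in memory, at the cost of a full pass over the file per lookup.

```perl
use strict;
use warnings;

# Return every SAM record whose QNAME (column 1) equals $read_name.
# $fh is any filehandle yielding SAM text; this is a linear scan, not an index lookup.
sub alignments_for_read {
    my ($fh, $read_name) = @_;
    my @hits;
    while (my $line = <$fh>) {
        next if $line =~ /^\@/;                  # skip SAM header lines
        chomp $line;
        my ($qname) = split /\t/, $line, 2;      # only the first field is needed
        push @hits, $line if defined $qname && $qname eq $read_name;
    }
    return @hits;
}

# Demo on an in-memory SAM snippet (hypothetical records).
my $sam = join "", map { "$_\n" } (
    "\@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t150\t255\t4M\t*\t0\t0\tTTGA\tIIII",
    "read1\t256\tchr2\t900\t255\t4M\t*\t0\t0\tACGT\tIIII",
);
open my $fh, '<', \$sam or die "cannot open in-memory SAM: $!";
my @hits = alignments_for_read($fh, 'read1');
close $fh;
print scalar(@hits), " alignments for read1\n";   # prints "2 alignments for read1"
```

For a real file, the filehandle could come from something like `open my $fh, '-|', 'samtools', 'view', 'in.bam'`, where `in.bam` is a placeholder. If many reads need to be looked up, name-sorting the file first with `samtools sort -n` puts all alignments of a read on adjacent lines, so they can be grouped in a single pass without a big hash.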


Actually, you don't need the Kent source tree (that would be needed for Bio::DB::BigFile), but you do need a recent version of samtools.


Thanks, Chris. That was my error in memory. It looks like Bio::DB::BigFile is for BigWig and BigBed anyway, so at least Bio::DB::Sam is appropriate for the original question.


@SES @Chris : Thanks guys .. on my weekend test list
