Extracting specific regions from a bam file and exporting them to do a phylogenetic analysis
5.5 years ago
Badh2 • 0

Hi all, I’m new to bioinformatics and coding, so any help with the following tasks is appreciated. I have a bam file that I got from running samtools on 8 samples, and I have viewed the alignments in IGV and Geneious. I want to use these alignments for a phylogenetic analysis, but for that I need to extract only the scaffolds (in IGV) or contigs (in Geneious) where all 8 samples aligned to a specific locus. I identified a few of them by eye in IGV and Geneious, but there are over 2 million scaffolds to go through. Please tell me how I can scan all of them and extract only the scaffolds that have data for all 8 samples, or at least for >4 samples. I would also like to know how to export that data in fasta format so that I can use it to build my tree.

Thanks!!!

bam samtools Geneious Phylogenetic fasta • 1.5k views

Hello,

you could first convert the read positions within the bam files to BED format using bamToBed (bedtools bamtobed). Afterwards, run multiIntersectBed (bedtools multiinter) on these BED files and extract the lines with the number of overlaps you want.
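
As a rough illustration of the last step, here is a minimal Python sketch that filters multiinter output by the number of overlapping samples. It assumes one BED file per sample was made with something like "bedtools bamtobed -i sample1.bam > sample1.bed" and that the intersection was produced with "bedtools multiinter -i sample1.bed sample2.bed ..." (the file names are hypothetical). It also assumes the usual multiinter column layout: chrom, start, end, number of files overlapping the interval, the list of those files, then one 0/1 column per file. The function name and the example rows are made up for illustration and are not real data.

```python
def filter_multiinter(lines, min_samples):
    """Return intervals covered by at least `min_samples` input files.

    `lines` is an iterable of tab-separated rows in the assumed
    `bedtools multiinter` layout: chrom, start, end, count, list, flags...
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 4 or not fields[1].isdigit():
            continue  # skip an optional header row or malformed rows
        chrom, start, end, count = fields[0], int(fields[1]), int(fields[2]), int(fields[3])
        if count >= min_samples:
            kept.append((chrom, start, end, count))
    return kept


# Made-up example rows for 8 samples (illustration only).
EXAMPLE = [
    "scaffold_1\t100\t250\t8\t1,2,3,4,5,6,7,8\t1\t1\t1\t1\t1\t1\t1\t1",
    "scaffold_2\t40\t90\t3\t1,4,6\t1\t0\t0\t1\t0\t1\t0\t0",
    "scaffold_3\t10\t500\t5\t1,2,3,5,8\t1\t1\t1\t0\t1\t0\t0\t1",
]

# ">4 samples" from the question corresponds to min_samples=5.
kept = filter_multiinter(EXAMPLE, min_samples=5)
```

With a real file you would pass an open file handle instead of EXAMPLE, and use min_samples=8 to keep only intervals covered by all samples. The kept intervals can then be written out as a BED file for extracting sequences in a later step.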

fin swimmer


Hi finswimmer, thanks for your suggestion. I just read about BEDTools, and it seems to have an option for my task. I'll try it and see. Thanks!
