Identify deletion in multiple bam files compared to the same bacterial reference
1
1
Entering edit mode
5.2 years ago
David ▴ 230

Hi, I have done an illumina WGS experiment in many gut samples and mapped those samples back to a reference bacterial genome.

I have identified a small deletion (2bp) in one bam file compared to the reference bacterial genome. I have the coordinates of the region. Is there a quick way to chech if the same deletion is present in all other samples (bam files) compared to the reference??

Thanks

bedtools bam • 853 views
ADD COMMENT
0
Entering edit mode

You can have the hits on this region :

samtools view -b your_file.bam "chr_name:start-end" > hits_file.bam

Then check out by eyes or process the CIGAR

ADD REPLY
1
Entering edit mode
5.2 years ago
David ▴ 230

Thanks for the answer, i have tried as follows

samtools view -b your_file.bam "sequence1:345000-345510" > hits_file.bam

Then to count how many reads have at least the deletion i have done:

samtools view  17.mapped_sorted.bam  "sequence1:345000-345510" | grep  -cE '[0-9]{1,2}D' 
202

To count all i did:

samtools view  17.mapped_sorted.bam  "sequence1:345000-345510" | wc -l 
410

The idea is to know how many deletions( i got in this region , so here is 202/410*100= 51%

Would that make sense or is there any easy way to do that ?

ADD COMMENT
0
Entering edit mode

You posted this as an answer. Did you mean to do that or did you mean to add it as a comment to Bastien's comment? Your're pulling information from the CIGAR string with the grep statement, right?

ADD REPLY

Login before adding your answer.

Traffic: 2527 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6