Re-merge split BAM files
7.6 years ago
KT • 0

Hi all, let's say I have SAM/BAM files containing single-end reads. My reads look like: Intron 1 - Exon 2 - Intron 2 - Exon 3 - Intron 3

My aim is to get rid of all intron sequences so that my reads will look like: Exon 2 - Exon 3

I intend to split each BAM file into two BAM files by region, one containing the exon 2 sequences and the other containing the exon 3 sequences, and then re-merge the split BAM files. However, I don't know how to tell samtools (or any other tool) to join reads that have the same QNAME into one single read.

Can anyone give me some advice? Thank you very much,

next-gen alignment sequence • 1.6k views
7.6 years ago

There's no way to get samtools to do that. You'll either have to use something else or, much more likely, write your own program.

As an aside, you might want to mention what you're actually trying to accomplish. It's likely that you're making life more complicated than needed.
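For the "write your own program" route, here is a minimal sketch of one possible approach, not a tested pipeline. Instead of splitting and re-merging BAM files, it walks each read's CIGAR string and keeps only the bases aligned inside the exon intervals, so each QNAME still yields one read. The function name `exon_only_sequence` and the exon coordinates are hypothetical. It assumes the reads are already aligned to a full-length reference and that you supply the exon coordinates yourself. Soft-clipped bases are dropped, and an insertion is kept only if the reference base just before it lies in an exon, which is an arbitrary choice.

```python
import re

# CIGAR operations that consume the reference, the read, or both,
# following the SAM specification.
CONSUMES_REF = set("MDN=X")
CONSUMES_QUERY = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def exon_only_sequence(seq, pos, cigar, exons):
    """Return the read bases that align inside any exon interval.

    seq   -- read sequence (SAM SEQ field)
    pos   -- 1-based leftmost reference position (SAM POS field)
    cigar -- CIGAR string (SAM CIGAR field)
    exons -- list of (start, end) reference intervals, 1-based inclusive
             (hypothetical coordinates supplied by the caller)
    """
    def in_exon(ref_pos):
        return any(start <= ref_pos <= end for start, end in exons)

    ref = pos      # current reference position
    query = 0      # current 0-based offset into seq
    kept = []
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in CONSUMES_REF and op in CONSUMES_QUERY:
            # Aligned bases (M, =, X): keep each base that falls in an exon.
            for i in range(length):
                if in_exon(ref + i):
                    kept.append(seq[query + i])
        elif op == "I" and in_exon(ref - 1):
            # Inserted bases: kept when the preceding reference base is exonic.
            kept.append(seq[query:query + length])
        # Soft clips (S) consume the read but are never kept.
        if op in CONSUMES_REF:
            ref += length
        if op in CONSUMES_QUERY:
            query += length
    return "".join(kept)


# Toy read spanning intron 1 - exon 2 - intron 2 - exon 3 - intron 3.
exons = [(101, 104), (108, 111)]
read = "AAA" + "CCCC" + "GGG" + "TTTT" + "AA"
print(exon_only_sequence(read, 98, "16M", exons))  # CCCCTTTT
```

With a library such as pysam, the same idea could probably be built on `AlignedSegment.get_aligned_pairs()` rather than parsing the CIGAR by hand. The trimmed sequences can then be written out as FASTA/FASTQ and re-aligned.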


Thanks for your advice. The reason I want to do this is that my reference file only has exon sequences. I am working on HLA-B and want to phase the alleles. I have two different reference files. One contains full HLA-B sequences for 384 HLA-B alleles, while the other covers all HLA-B alleles (around 4000) but contains only exon sequences.

When I used BLAST to align my reads against the first reference file, I could call the correct alleles. But when I used the second file, I could not get the same answer. My guess is that BLAST behaved differently because my reads have intron sequences in the middle. That's why I am thinking of removing all intron sequences and re-aligning my reads to the second reference file.
