Aligned or unaligned BAM file
8.6 years ago
bioguy24 ▴ 230
Is it possible to unalign an aligned TMAP BAM file? I am only sent the aligned BAM but would like to use other aligners such as bowtie2 or bwa-mem to do additional QC. Can I use a tool on the aligned BAM to create FASTQ, or am I better off getting the unaligned BAM as well? Thank you :).
ngs • 6.8k views
8.6 years ago

You can use samtools bam2fq (named samtools fastq in current samtools releases) to produce FASTQ from a BAM file.
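
A minimal Python sketch of that conversion, assuming samtools is installed and on PATH. The helper names and the file names in the usage comment are made up for illustration. The FASTQ output keeps only read names, sequences, and qualities, so the alignment positions from the original aligner are not carried over.

```python
import shutil
import subprocess


def build_fastq_command(bam_path):
    """Return the argv list that converts a BAM file to FASTQ on stdout."""
    # "fastq" is the current subcommand name; older releases call it "bam2fq".
    return ["samtools", "fastq", bam_path]


def bam_to_fastq(bam_path, fastq_path):
    """Run samtools and write its standard output to fastq_path."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools was not found on PATH")
    with open(fastq_path, "wb") as out:
        subprocess.run(build_fastq_command(bam_path), stdout=out, check=True)


# Usage (placeholder file names):
# bam_to_fastq("sample.aligned.bam", "sample.fastq")
```

The resulting FASTQ can then be passed to bowtie2 or bwa-mem like any other read file.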

Will using that tool produce the same result as an unaligned BAM file? Thank you :).

There is no such thing as an "unaligned" BAM file: a BAM file is an alignment file. The command above creates FASTQ files that you can align with another tool to create a different alignment file.

On the Ion Torrent, both an unaligned and an aligned BAM file are created. I assume that the unaligned BAM contains just the raw reads and the aligned BAM contains the TMAP hg19-aligned reads. By converting the aligned BAM to FASTQ, is the TMAP hg19 index removed? Sorry for all the questions, just trying to understand better. Thank you :).

OK, I see what you mean. The confusion arises from having reads with no alignments in an alignment file: once those reads are all put into a separate file, one can describe that alignment file as being "unaligned" ...

The file is still an alignment file, it just contains reads that did not align. Use the same command as above to extract the read sequences. You now have a new dataset that you can align with a different tool. Treat it as new data.
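
To check how many records in a given file are actually unaligned, here is a rough sketch that counts mapped and unmapped records from SAM text, for example the output of `samtools view -h file.bam`. It relies on the SAM FLAG bit 0x4 ("segment unmapped"); the function name is hypothetical.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def count_mapped_unmapped(sam_lines):
    """Count mapped and unmapped records in an iterable of SAM text lines."""
    mapped = 0
    unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & UNMAPPED_FLAG:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

If the aligned BAM reports zero unmapped records, the reads that failed to align were probably written elsewhere or dropped, and the aligned BAM alone would not give back the full dataset.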


I think I understand, but I want to make sure.

Will running the command above on the aligned BAM file give me both aligned and unaligned reads in a new file? Or should I request the unaligned BAM as well to use with other tools? Thank you :).


I believe that you need to run the command on each file. One will give you the part of the original data that aligned successfully; the other will give you the part that did not align with that method. You could combine the two or keep them separate.


@Istvan Albert, the Ion machine provides data in different formats, one of which is unaligned BAM. I was working with Ion data: I was given the unaligned BAM and converted these BAM files back to FASTQ with different tools (samtools, bedtools, Picard, etc.). The problem is that these FASTQ files don't match the FASTQ produced by the machine, so I still wonder how to convert these unaligned BAM files to FASTQ.


Yes, in the meantime I have also understood the purpose of an unaligned BAM file: with it, it is possible to attach read group, and hence sample-related, information to the data.

FWIW, that only demonstrates just how ill-defined our file formats are: we are using an alignment format to represent unaligned data because the regular format does not allow us to enter sample-related information.
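
To illustrate what that sample information looks like, here is a small sketch that pulls the `@RG` (read group) lines out of SAM header text, such as the output of `samtools view -H file.bam`. The function name is hypothetical; `ID`, `SM`, and `PL` are standard read group tags, but which ones a given file carries is something to check rather than assume.

```python
def parse_read_groups(header_lines):
    """Map each read group ID to its tag dictionary from SAM header lines."""
    read_groups = {}
    for line in header_lines:
        if not line.startswith("@RG\t"):
            continue  # only read group header lines are of interest
        fields = line.rstrip("\n").split("\t")[1:]
        # Each field looks like "TAG:value"; split on the first colon only.
        tags = dict(field.split(":", 1) for field in fields if ":" in field)
        if "ID" in tags:
            read_groups[tags["ID"]] = tags
    return read_groups
```

Plain FASTQ has no header section, which is why this information is lost on conversion unless it is recorded somewhere else.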


But what if the aligner doesn't accept this kind of format as input? How do I convert it to standard FASTQ format? Is asking the sequencing service provider for FASTQ files the only way?


Of course, most aligners won't accept this format. This is a somewhat unexpected (even absurd) storage format where we store raw data as an "alignment" because the alignment format allows attaching sample information to the data.

As mentioned in the answer above, you would need to convert to FASTQ with samtools bam2fq before aligning.


As mentioned previously, the FASTQ files produced with these tools (samtools, bedtools, etc.) do not match the FASTQ produced by the Ion machine. This is the major problem. Initially I received an unaligned BAM for a sample and converted it to FASTQ; afterwards I received a FASTQ from the machine for the same sample. The FASTQ I generated from the unaligned BAM is 720M, while the FASTQ the machine gave me is 23GB.

As far as I know (and according to what you mentioned above), they have only given me, in BAM format, the reads left unaligned after alignment with TMAP. Supporting this, read IDs from the unaligned BAM are not found in the TMAP alignment file. It would be a major problem for the entire experiment if one proceeded as though the unaligned BAM were the whole raw data.
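
One way to test that suspicion is to compare read IDs between the two FASTQ files. The sketch below assumes plain four-line FASTQ records and takes the read ID to be the header text up to the first whitespace; the function names are made up for illustration. It keeps all IDs in memory, which may be a concern for a file of tens of gigabytes.

```python
def fastq_read_ids(lines):
    """Collect read IDs from FASTQ text, assuming four lines per record."""
    ids = set()
    for index, line in enumerate(lines):
        if index % 4 != 0:
            continue  # only the first line of each record holds the ID
        header = line.strip()
        if header.startswith("@") and len(header) > 1:
            ids.add(header[1:].split()[0])
    return ids


def compare_fastq_ids(lines_a, lines_b):
    """Count the read IDs unique to each FASTQ input and shared by both."""
    ids_a = fastq_read_ids(lines_a)
    ids_b = fastq_read_ids(lines_b)
    return {
        "only_in_a": len(ids_a - ids_b),
        "only_in_b": len(ids_b - ids_a),
        "shared": len(ids_a & ids_b),
    }


# Usage (placeholder file names):
# with open("from_ubam.fastq") as a, open("from_machine.fastq") as b:
#     print(compare_fastq_ids(a, b))
```

If every ID from the converted file also appears in the machine FASTQ, but not the other way round, that would be consistent with the unaligned BAM holding only a subset of the raw reads.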
