Is it okay to realign and subsequently process fastq files that were converted from processed BAM files?
6.8 years ago

I have been given BAM files from a collaborator that have already gone through processing. I want to incorporate this data set into a larger analysis, for which I have a standard protocol that takes files from fastq through somatic variant calling (roughly similar to GATK's best practices). I am hoping to convert the BAM files back to fastq using Picard's SamToFastq program and then run them through the usual protocol. What potential issues might this raise, given that the quality scores may now differ from those in the original fastqs?

FYI, the specifics for the BAM files were that the original fastqs were “aligned to the hg19 human genome build using BWA (v0.7.5) [and then] subjected to mark duplication, realignment, and recalibration using the Picard tool and GATK software tools”

Unfortunately, I don't know more about the origin of these files than that, but any general insights as to how the previous processing of the BAMs might affect how the fastqs I will generate are treated would be appreciated!

alignment fastq bam bwa recalibration • 2.0k views

Since you were given the BAM files, could you ask the person who provided them for more information, or even for the original fastq files? Or is that out of the question?


It's not necessarily out of the question, but difficult. I wasn't the one directly given the data, and my supervisor, who received the files, is having difficulty reaching those who generated the data. I was hoping to move forward with the analysis, but I wanted to get a general feel for how feasible it is to work with fastqs generated from processed BAMs.

6.8 years ago

Unless the reads were hard clipped, the sequence information is unaltered and can be recovered in its original format.

The samtools fastq command can also perform the back conversion.
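The hard-clipping condition can be checked before converting. Below is a minimal sketch, assuming the records are available as SAM text (for instance the output of `samtools view -h`). The function name `scan_sam_records` is made up for illustration. It counts records whose CIGAR contains an `H` operation, and also records carrying an `OQ` tag, which is where some recalibration pipelines keep the pre-recalibration base qualities.

```python
def scan_sam_records(lines):
    """Count alignment records, hard-clipped records, and records with an OQ tag.

    `lines` is any iterable of SAM text lines (hypothetical helper, not part
    of samtools or Picard).
    """
    stats = {"records": 0, "hard_clipped": 0, "with_oq": 0}
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        stats["records"] += 1
        # Field 6 (index 5) is the CIGAR string; 'H' marks hard clipping.
        if "H" in fields[5]:
            stats["hard_clipped"] += 1
        # Optional tags start at field 12 (index 11).
        if any(tag.startswith("OQ:Z:") for tag in fields[11:]):
            stats["with_oq"] += 1
    return stats
```

One way to use it would be to pipe a sample of records into a small script that calls `scan_sam_records(sys.stdin)`. A non-zero `hard_clipped` count means some bases are no longer present in the BAM for those records.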


To add to this: the BAM file may contain only a subset of the original data, for example if unmapped or filtered reads were dropped during processing.


Thanks Istvan! You don't think there will be any issues with the quality scores, for example when it comes time to call variants?


It will reconstitute the quality scores as well.

That FASTQ file will be just as fresh, fluffy and untouched as if it just rolled off of an instrument.
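To make the per-record behaviour concrete, here is a rough sketch of what a back conversion does. It is an illustration under stated assumptions, not the actual code of `samtools fastq` or Picard; the function name `sam_record_to_fastq` and the `prefer_oq` parameter are hypothetical. Reads aligned to the reverse strand are stored reverse-complemented, so the conversion flips them back. One caveat: base quality recalibration rewrites the QUAL field, so the instrument's original qualities are recoverable only if the pipeline kept them in the `OQ` tag, and whether these particular BAMs carry that tag is unknown.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_record_to_fastq(line, prefer_oq=True):
    """Turn one SAM text record into a FASTQ entry (illustrative sketch).

    Returns None for secondary or supplementary alignments, which would
    otherwise duplicate a read in the output.
    """
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]

    if flag & 0x900:  # 0x100 secondary, 0x800 supplementary
        return None

    if prefer_oq:
        # Use pre-recalibration qualities if an OQ tag is present.
        for tag in fields[11:]:
            if tag.startswith("OQ:Z:"):
                qual = tag[5:]
                break

    if flag & 0x10:  # read is stored reverse-complemented
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]

    # Mark mate number for paired reads.
    if flag & 0x40:
        name += "/1"
    elif flag & 0x80:
        name += "/2"

    return f"@{name}\n{seq}\n+\n{qual}\n"
```

If no `OQ` tags are present, the resulting fastq carries the recalibrated qualities, which is worth keeping in mind if the downstream protocol runs recalibration again.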


Okay, thanks so much for your help!
