Realignment stage from SNP calling for whole genome sequencing data
6.2 years ago

Hi all, I am trying to run the realignment stage of SNP calling for whole genome sequencing data and am getting the following error message. Any idea what this might be? Thanks! Mostafa

Code I ran: java -jar /home/m.rafiepour222/GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar -R /home/m.rafiepour222/GCF_000298355.1_BosGru_v2.0_genomic.fa -T RealignerTargetCreator -o /home/m.rafiepour222/SRR3112430/SRR3112430indels_Realigner.intervals -I /home/m.rafiepour222/SRR3112430/SRR3112430.sort.rmdup.bam

My error message: (screenshot, not available in this text)


Thanks a lot for your reply. But which BAM file do you mean?

I have three BAM files: A) SRR3112430.bam B) SRR3112430-sort.bam C) SRR3112430.sort.rmdup.bam


The last one, since otherwise you'd need to recreate the others after redoing the header.


OK, I'm using these commands to fix the SRR3112430.sort.rmdup.bam file:

git clone https://github.com/dpryan79/Answers.git
cd Answers
git submodule init
git submodule update
cd biostars_133825
make
mv ConvertPhredQuals /somewhere/in/your/path

But the file changed drastically: before the fix SRR3112430.sort.rmdup.bam was 24,276,244 KB, and after the fix it is 1,349 KB. Is this change correct?


Something went wrong there, no clue what.

6.2 years ago

When the people who did this experiment uploaded the data, they made a mistake and added 33 to all of the base quality scores. Since you rightly assumed that SRA data should be phred+33, you didn't tell the aligner to subtract something from the base qualities before producing BAM files. GATK is noticing this and throwing an error. There are a few programs out there that will take a BAM file and change all of the quality scores (such as this one that I wrote), so just choose one of those and fix the BAM file.
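One way to check whether a BAM file has this problem is to look at the range of ASCII codes in the quality column. In phred+33 data, a quality of Q is stored as the character with code Q + 33, so the lowest codes should sit near 33. The sketch below is only an illustration: the function name `qual_range` is made up for this example, and it assumes SAM-format text on standard input (for example from `samtools view`).

```shell
# qual_range: read SAM-format lines on stdin and print the minimum and
# maximum ASCII code found in the QUAL column (field 11).
# Hypothetical helper for illustration; not part of samtools or GATK.
qual_range() {
  awk -F'\t' '
    BEGIN {
      # Build a character -> ASCII code lookup for the printable range.
      for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
      min = 127; max = 0
    }
    /^@/ { next }                      # skip SAM header lines
    NF >= 11 && $11 != "*" {           # "*" means no qualities stored
      n = length($11)
      for (j = 1; j <= n; j++) {
        c = ord[substr($11, j, 1)]
        if (c < min) min = c
        if (c > max) max = c
      }
    }
    END { if (max > 0) print min, max }
  '
}

# Example (assumes samtools is installed; checks the first 100000 reads):
#   samtools view SRR3112430.sort.rmdup.bam | head -n 100000 | qual_range
```

If the minimum is well above 33 (say 64 or higher) and the maximum is far beyond the usual range for the sequencing platform, the scores are probably shifted and need to be corrected before running GATK.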

You might also alert SRA and the people who uploaded the file.

As an aside, you should be using the haplotype caller instead; then you can avoid the unneeded indel realignment step.


If I want to skip the indel realignment step and use the haplotype caller, which BAM file should I use as input?

I have three BAM files: A) SRR3112430.bam B) SRR3112430-sort.bam C) SRR3112430.sort.rmdup.bam


C, since you want to have duplicates marked or removed.


Ok, thanks a lot for your help.

Please confirm that the indel realignment step is not needed when using the haplotype caller.


I can assure you that indel realignment is completely pointless if you use the haplotype caller. That's why realignment isn't included in the GATK best practices when the haplotype caller is used.
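For reference, a GATK 3 HaplotypeCaller run on the duplicate-removed BAM could look roughly like the sketch below. The jar and reference paths are the ones from the question; the wrapper function `run_haplotypecaller`, the `DRY_RUN` switch, and the output name `SRR3112430.raw.vcf` are assumptions made for this example. GATK also expects the BAM to be indexed and the reference to have a `.fai` index and a sequence dictionary.

```shell
# Path to the GATK 3.8 jar from the question; override via the environment.
GATK_JAR="${GATK_JAR:-/home/m.rafiepour222/GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar}"

# run_haplotypecaller REF BAM OUT_VCF
# Hypothetical wrapper: with DRY_RUN set, it only prints the command.
run_haplotypecaller() {
  ${DRY_RUN:+echo} java -jar "$GATK_JAR" \
    -T HaplotypeCaller \
    -R "$1" \
    -I "$2" \
    -o "$3"
}

# Print the command first to check the paths (output name is illustrative).
DRY_RUN=1 run_haplotypecaller \
  /home/m.rafiepour222/GCF_000298355.1_BosGru_v2.0_genomic.fa \
  /home/m.rafiepour222/SRR3112430/SRR3112430.sort.rmdup.bam \
  /home/m.rafiepour222/SRR3112430/SRR3112430.raw.vcf
```

Calling the function without `DRY_RUN` runs the command instead of printing it. The base quality problem described above still has to be fixed in the BAM first, since it is not specific to the realignment step.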


OK, good. Thanks for everything. I can now start the haplotype caller step with more confidence.
