Question: Problem With Picard Dedup In Illumina Alignment Pipeline
tommivat (Finland) wrote, 5.7 years ago:

I'm trying to complement my alignment pipeline with Picard duplicate marking, as recommended by the Broad Institute (see their best practices). I have paired-end Illumina reads and use only reads from chr17 (mapped previously) as test data. After bwa, I run Picard MarkDuplicates and get the following error:

Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 630636, Read name HWI-H212:69:C0NR3ACXX:1:1212:10100:13262, bin field of BAM record does not equal value computed based on alignment start and end, and length of sequence to which read is aligned
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:541)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:522)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:481)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:672)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:650)
    at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:397)
    at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:161)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:145)

I have tried running FixMateInformation, as suggested on SEQanswers, but then the error is:

Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 630636, Read name HWI-H212:69:C0NR3ACXX:1:1212:10100:13262, bin field of BAM record does not equal value computed based on alignment start and end, and length of sequence to which read is aligned
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:541)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:522)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:481)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:672)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:650)
    at net.sf.picard.sam.FixMateInformation.doWork(FixMateInformation.java:148)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
    at net.sf.picard.sam.FixMateInformation.main(FixMateInformation.java:76)

Finally, ValidateSamFile returns the following error for a number of reads:

ERROR: Record 43432, Read name HWI-H212:69:C0NR3ACXX:1:1310:18404:48164, Mate negative strand flag does not match read negative strand flag of mate
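
For readers unfamiliar with this message: in the SAM specification, FLAG bit 0x10 marks a read as reverse-strand and bit 0x20 records that its mate is reverse-strand, so each read's 0x20 bit should agree with its mate's 0x10 bit. A minimal sketch of that consistency check follows. The function name `mate_strand_flags_agree` is made up for illustration, and this is not Picard's actual implementation.

```python
# SAM FLAG bits as defined in the SAM specification.
READ_REVERSE = 0x10  # this read is on the reverse strand
MATE_REVERSE = 0x20  # this read's mate is on the reverse strand


def mate_strand_flags_agree(read_flag, mate_flag):
    """Return True if each read's 'mate reverse' bit matches the
    other read's own 'read reverse' bit.

    Hypothetical helper for illustration only; it mirrors the kind of
    check the ValidateSamFile message describes, not Picard's code.
    """
    read_says_mate_rev = bool(read_flag & MATE_REVERSE)
    mate_is_rev = bool(mate_flag & READ_REVERSE)
    mate_says_read_rev = bool(mate_flag & MATE_REVERSE)
    read_is_rev = bool(read_flag & READ_REVERSE)
    return (read_says_mate_rev == mate_is_rev
            and mate_says_read_rev == read_is_rev)


if __name__ == "__main__":
    # 99/147 is a typical properly paired forward/reverse combination.
    print(mate_strand_flags_agree(99, 147))
    # 99 claims its mate is reversed, but 163 is a forward-strand read.
    print(mate_strand_flags_agree(99, 163))
```

A pair that fails this check usually means the two mates were written or merged inconsistently, which is the situation FixMateInformation is meant to repair.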

Any suggestions?

Mitch Bekritsky (London, England) wrote, 5.7 years ago:

I ran into a bin field problem when I used Picard's MarkDuplicates before, and tried to find documentation for the error either in Picard's source code or online. Turns out there's not much on it, and I think it has to do with the reads not being in the correct bin in a BAM file for random access. Unfortunately, I couldn't find a way to fix it either.

My best solution was to set VALIDATION_STRINGENCY=LENIENT on all my Picard jobs, which, while it didn't eliminate my problem, did prevent Picard from dying on this particular error. Since it only affects one BAM record and I don't do much random access of BAM files anyway, I'm hoping that it won't affect my pipeline too much.

If anyone has a better solution, there's at least two of us who would love to hear it!


I just noticed that the SAM format specification v1.4-r985 describes two pieces of code that calculate a read's bin index number from its position in the alignment. If you really wanted to fix the bug rather than use VALIDATION_STRINGENCY to work around it, you could compare the bin field in your record with what it should be according to the SAM specification.
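
As a rough illustration of that comparison, here is a Python port of the `reg2bin` routine from the SAM specification, plus two helpers that derive the expected bin from a record's POS and CIGAR. The helper names `reference_span` and `expected_bin` are my own, and treating a zero-length alignment as spanning one base is an assumption based on my reading of the spec. Take it as a sketch, not a drop-in validator.

```python
import re


def reg2bin(beg, end):
    """Bin number for a 0-based, half-open interval [beg, end).

    Ported from the C reg2bin in the SAM specification. The constants
    are the offsets of the first bin at each level (16 kb, 128 kb,
    1 Mb, 8 Mb and 64 Mb bins).
    """
    end -= 1
    if beg >> 14 == end >> 14:
        return 4681 + (beg >> 14)
    if beg >> 17 == end >> 17:
        return 585 + (beg >> 17)
    if beg >> 20 == end >> 20:
        return 73 + (beg >> 20)
    if beg >> 23 == end >> 23:
        return 9 + (beg >> 23)
    if beg >> 26 == end >> 26:
        return 1 + (beg >> 26)
    return 0


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string.

    Only M, D, N, = and X consume the reference. Hypothetical helper.
    """
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in "MDN=X")


def expected_bin(pos, cigar):
    """Expected bin for a record with 1-based POS and a CIGAR string.

    Assumption: an alignment that covers no reference bases is treated
    as spanning one base.
    """
    beg = pos - 1
    span = max(reference_span(cigar), 1)
    return reg2bin(beg, beg + span)


if __name__ == "__main__":
    print(expected_bin(1, "100M"))
    print(expected_bin(16380, "10M"))
```

The bin field exists only in the binary BAM record, not in text SAM, so the stored value would have to be read with a BAM-aware library and compared against `expected_bin` for the offending read.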

(reply written 5.6 years ago by Mitch Bekritsky)
henryvuong (USA) wrote, 5.6 years ago:

Hi, I encountered the same error with Picard tools version 1.96 while running CollectTargetedPcrMetrics. The same command line worked fine with Picard version 1.79.


Same problem here with ValidateSamFile in 1.96. Picard 1.87 works, but 1.90 fails with a NullPointerException.

(reply written 5.5 years ago by Morgan)

Same problem here with ReorderSam in 1.97. Picard 1.88 seems to work fine. Strange.

(reply written 5.4 years ago by steve)
benjamin.schuster-boeckler (United Kingdom) wrote, 4.5 years ago:

There is now a post on the GATK user forum about this issue: http://gatkforums.broadinstitute.org/discussion/4290/sam-bin-field-error-for-the-gatk-run

The bottom line seems to be that ignoring it is OK; otherwise, you could re-create the input BAM file and/or the .bai index, which should fix the issue.

xrao (United States) wrote, 4.7 years ago:

Hello, thank you for your posts. I encountered the same issue and avoided the errors by using Picard 1.88 as suggested, but then GATK picks up the errors again. I still have to use --validation_stringency=LENIENT to get GATK to run normally. Are there any updates on solving this problem?

BTW, I am thinking of using the option IGNORE=INVALID_INDEXING_BIN for the Picard ValidateSamFile step, but I am not sure whether that is the right approach.

Thank you in advance for any suggestions!

 

Rad (Canada) wrote, 4.5 years ago:

Did you try aligning with bowtie2? With some Picard tools (especially the metrics tools), I sometimes run into problems with bwa alignments and have to ignore the warnings, but with bowtie2 I have no problems. It may be worth trying.

yuanyuan.zhang.cau (China) wrote, 3.4 years ago:

I encountered the same errors. I suspected that the problem arose when combining the forward and reverse SAM files (generated by bwa aln), so I used

bwa mem ref.fa my_F.fq my_R.fq > my.sam

to get a complete SAM file, and then

samtools view -bS my.sam -o my.bam

to convert the SAM file to BAM format. Picard worked successfully with the resulting my.bam file.
