Picard: "Value was put into PairInfoMap more than once", even with bwa -M enabled
7.1 years ago
pentiumy ▴ 10

I am trying to follow the GDC protocol for processing TCGA data (https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/#dna-seq-alignment-command-line-parameters) to process some cancer exome sequencing files. I used a TCGA BAM file as a test. I first converted the BAM to FASTQ and then ran the following commands:

bwa mem -t 8 -T 0 -R '@RG\tID:'${sampleID}'\tSM:'${sampleID}  -M $reference $readsFiles_split |samtools view -Shb -o "$sampleID"/step1.output.bam -

java -Xmx4g -jar ${binDir%/}/picard.jar SortSam CREATE_INDEX=true INPUT="$sampleID"/step1.output.bam OUTPUT="$sampleID"/step2.output.bam SORT_ORDER=coordinate VALIDATION_STRINGENCY=STRICT

java -Xmx4g -jar ${binDir%/}/picard.jar MergeSamFiles ASSUME_SORTED=false CREATE_INDEX=true INPUT="$sampleID"/step2.output.bam MERGE_SEQUENCE_DICTIONARIES=false OUTPUT="$sampleID"/step3.output.bam SORT_ORDER=coordinate USE_THREADING=true VALIDATION_STRINGENCY=STRICT

java -Xmx4g -jar ${binDir%/}/picard.jar MarkDuplicates CREATE_INDEX=true INPUT="$sampleID"/step3.output.bam VALIDATION_STRINGENCY=STRICT O="$sampleID"/step4.output.bam M="$sampleID"/step4.output.txt

However, I received an error message in the MarkDuplicates step:

Exception in thread "main" htsjdk.samtools.SAMException: Value was put into PairInfoMap more than once. 6: test_tcga:HWI-EAS289_106503224:7:3:14221:11655

I already used -M, and the FASTQ files were converted from a TCGA BAM file. Does anyone have any idea what is going on? I am not very familiar with these steps. Thanks!

next-gen bwa picard MarkDuplicates • 4.6k views

What is the version of Picard? Try with VALIDATION_STRINGENCY=LENIENT.


Picard 2.6.0-SNAPSHOT. I will try that. Any other possibilities? Thanks!


Sorry, I just found that the read actually appears twice in the fq files.

In 1.fq:

@HWI-EAS289_106503224:7:3:14221:11655/1

@HWI-EAS289_106503224:7:3:14221:11655/1

In 2.fq:

@HWI-EAS289_106503224:7:3:14221:11655/2

@HWI-EAS289_106503224:7:3:14221:11655/2

Here are the commands for my bam2fastq step. Is anything wrong?

samtools sort -n $bamfile ${bamfile}.qsort
bedtools bamtofastq -i ${bamfile}.qsort.bam -fq ${bamfile}.1.fq -fq2 ${bamfile}.2.fq
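
To see how widespread the duplication is before rerunning the pipeline, the read names in each FASTQ file can be checked directly. The following is a minimal sketch, not part of the GDC protocol: `list_duplicate_reads` is a hypothetical helper name, and it assumes strict four-line FASTQ records (no wrapped sequence lines).

```shell
# Hypothetical helper: print each read name that occurs more than once
# in a FASTQ file. Assumes strict 4-line records.
list_duplicate_reads() {
    # Line 1 of every 4-line record is the header. Take its first
    # whitespace-delimited field, drop the leading "@", and print the
    # name the second time it is seen (so each duplicate is listed once).
    awk 'NR % 4 == 1 { name = substr($1, 2); if (seen[name]++ == 1) print name }' "$1"
}
```

For example, `list_duplicate_reads "${bamfile}.1.fq" | head` would list the first few duplicated names; empty output would mean no name is repeated in that file.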

"Sorry, I just found that the read actually appears twice in the fq files."

So the FASTQ is wrong. There is nothing to do but remove those duplicates.
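
One way to remove the duplicates is to keep only the first record for each read name. This is a rough sketch under stated assumptions, not a tested fix for this dataset: `dedup_fastq` is a hypothetical helper name, records are assumed to be strictly four lines each, and the table of seen names is assumed to fit in memory.

```shell
# Hypothetical helper: copy a FASTQ file, keeping only the first record
# seen for each read name. Assumes strict 4-line records.
dedup_fastq() {
    awk '
        # On each header line, decide whether this record is new.
        NR % 4 == 1 { keep = !seen[$1]++ }
        # Print all four lines of a record that was marked as new.
        keep
    ' "$1" > "$2"
}
```

It would be applied to each mate file separately, for example `dedup_fastq "${bamfile}.1.fq" "${bamfile}.1.dedup.fq"` and likewise for `2.fq`. The pairs only stay in sync if the same reads are duplicated in both files, so it is worth confirming afterwards that both outputs have the same number of lines. This treats the symptom only; the thread does not establish why the records were duplicated in the first place.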
