Question: Picard: Value was put into PairInfoMap more than once. Even bwa enabled -M
1
gravatar for pentiumy
2.0 years ago by
pentiumy10
pentiumy10 wrote:

I try to follow this link(GDC protocol of processing TCGA data) https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/#dna-seq-alignment-command-line-parameters to process some cancer exome sequencing file. I used TCGA bam file as a test. I first convert bam to fq, and run following command:

bwa mem -t 8 -T 0 -R '@RG\tID:'${sampleID}'\tSM:'${sampleID}  -M $reference $readsFiles_split |samtools view -Shb -o "$sampleID"/step1.output.bam -

java -Xmx4g -jar ${binDir%/}/picard.jar SortSam CREATE_INDEX=true INPUT="$sampleID"/step1.output.bam OUTPUT="$sampleID"/step2.output.bam SORT_ORDER=coordinate VALIDATION_STRINGENCY=STRICT

java -Xmx4g -jar ${binDir%/}/picard.jar MergeSamFiles ASSUME_SORTED=false CREATE_INDEX=true INPUT="$sampleID"/step2.output.bam MERGE_SEQUENCE_DICTIONARIES=false OUTPUT="$sampleID"/step3.output.bam SORT_ORDER=coordinate USE_THREADING=true VALIDATION_STRINGENCY=STRICT

java -Xmx4g -jar ${binDir%/}/picard.jar MarkDuplicates CREATE_INDEX=true INPUT="$sampleID"/step3.output.bam VALIDATION_STRINGENCY=STRICT O="$sampleID"/step4.output.bam M="$sampleID"/step4.output.txt

However, I recieved error message in MarkDuplicates step saying :

Exception in thread "main" htsjdk.samtools.SAMException: Value was put into PairInfoMap more than once. 6: test_tcga:HWI-EAS289_106503224:7:3:14221:11655

I used -M already. And I used TCGA bam file to convert to fq. Anyone has any idea what's going on? I am not quite familar with those steps. Thanks!

ADD COMMENTlink modified 2.0 years ago by Pierre Lindenbaum118k • written 2.0 years ago by pentiumy10

what is the version of picard ? try with VALIDATION_STRINGENCY=LENIENT

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum118k

Picard 2.6.0 -SNAPSHOT I will try that Any other possibilities? Thanks!

ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by pentiumy10

Sorry I just found that read actually shows twice in fq file....

so 1.fq:

@HWI-EAS289_106503224:7:3:14221:11655/1

@HWI-EAS289_106503224:7:3:14221:11655/1

in 2.fq

@HWI-EAS289_106503224:7:3:14221:11655/2

@HWI-EAS289_106503224:7:3:14221:11655/2

here are my bam2fastq step commands...anything wrong :

samtools sort -n $bamfile ${bamfile}.qsort
bedtools bamtofastq -i ${bamfile}.qsort.bam -fq ${bamfile}.1.fq -fq2 ${bamfile}.2.fq
ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by pentiumy10

Sorry I just found that read actually shows twice in fq file....

so the fastq is wrong... there's nothing to do but removing those duplicates.

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum118k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 880 users visited in the last hour