Question: Converting TCGA BAM files to FASTQ: Picard does not work!
jonessara770 wrote, 2.2 years ago:

Hello,

I am trying to convert BAM files from TCGA to FASTQ. Picard gives the following error:

picard.sam.SamToFastq done. Elapsed time: 0.78 minutes.
Runtime.totalMemory()=2058354688
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: Illegal mate state: H090WADXX130325:1:1106:10520:95300
    at picard.sam.SamToFastq.assertPairedMates(SamToFastq.java:342)
    at picard.sam.SamToFastq.doWork(SamToFastq.java:164)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:185)
    at picard.sam.SamToFastq.main(SamToFastq.java:137)

The error is due to more than one pair of reads having the same query name.
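
To check that diagnosis on a given file, one option is to count how many primary records share each query name. The following is only a minimal sketch, assuming the BAM has first been dumped to SAM text (for example with `samtools view -h`); the function name `overrepresented_names` and the `max_primary` threshold are made up for illustration and are not part of Picard.

```python
from collections import Counter

SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def overrepresented_names(sam_lines, max_primary=2):
    """Return {query_name: count} for names with more than max_primary primary records.

    Hypothetical helper: a paired-end read name is expected to appear on at most
    two primary records (one per mate), so anything above that is suspicious.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blank and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary records are counted
        counts[qname] += 1
    return {name: n for name, n in counts.items() if n > max_primary}
```

Passing `sys.stdin` as `sam_lines` lets the SAM text be piped straight in. If the read named in the exception shows up in the result, the duplicated-name explanation fits that file.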

It has been suggested to use bedtools bamtofastq. This produces the FASTQ files, but they contain duplicated read names, which makes my pipeline crash in downstream steps…

I also tested the “resolvepair” script, but it does not produce any output…

I would like to either remove these duplicate read names or rename them. Do you have a solution to solve this issue?
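
One possible way to remove the duplicates after conversion is to walk the two FASTQ files in lockstep and keep only the first pair seen for each read name. This is a rough sketch, not a tested pipeline step: it assumes uncompressed 4-line FASTQ records, that both files list mates in the same order, and that mate suffixes look like `/1` and `/2`. The names `read_fastq`, `base_name` and `dedupe_pairs` are hypothetical.

```python
def read_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate stray blank lines between records
        rest = [next(it, "").rstrip("\n") for _ in range(3)]
        yield (header.rstrip("\n"), *rest)


def base_name(header):
    """Drop the leading '@', any comment after whitespace, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def dedupe_pairs(r1_records, r2_records):
    """Yield (r1, r2) record pairs, keeping the first pair seen per read name."""
    seen = set()
    for r1, r2 in zip(r1_records, r2_records):
        name = base_name(r1[0])
        if name != base_name(r2[0]):
            raise ValueError(f"mates out of sync: {r1[0]} vs {r2[0]}")
        if name in seen:
            continue  # a pair with this name was already emitted
        seen.add(name)
        yield r1, r2
```

Two caveats: `seen` holds every read name in memory, which can be large for exome data, and `zip` stops silently at the end of the shorter file, so it is worth checking beforehand that both files have the same number of records.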

Thanks


Did you use the latest version of Picard? Did you use VALIDATION_STRINGENCY=LENIENT?

written 2.2 years ago by Pierre Lindenbaum

Thanks for your reply! I ran it again with this version (picard-2.9.0/picard.jar SamToFastq VALIDATION_STRINGENCY=LENIENT) but I get the same error.

written 2.2 years ago by jonessara770

Do you know which aligner created the BAM? I have found BAM to FASTQ conversion almost impossible for some BAMs originating from RNA-seq. I believe it's related to conflicting interpretations of mates and pairs.

written 2.2 years ago by jomo018

Yes, these were aligned with BWA-MEM.

written 2.2 years ago by jonessara770

On the BWA site I see this Q/A:

With BWA-MEM/BWA-SW, my tools are complaining about multiple primary alignments. Is it a bug? It is not. Multi-part alignments are possible in the presence of structural variations, gene fusion or reference misassembly. However, representing multi-part alignments in SAM has not been finalized. To make BWA work with your tools, please use option `-M' to flag extra hits as secondary.

I believe "SAM has not been finalized" for multi-part alignments is basically the "conflicting interpretations of mates and pairs".
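
At the record level, "secondary" is just a bit in the SAM FLAG field. Here is a small sketch for decoding the bits that matter in this discussion. The bit values come from the SAM format; the helper names `describe_flag` and `is_primary` are made up for illustration. Whether dropping non-primary records is enough to get past the Picard error is not something this thread establishes.

```python
# Subset of SAM FLAG bits relevant to mate pairing and multi-part alignments.
FLAG_BITS = {
    0x1: "paired",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x800: "supplementary",
}


def describe_flag(flag):
    """Return the names of the bits from FLAG_BITS that are set in flag."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def is_primary(flag):
    """True when the record is neither secondary nor supplementary."""
    return not flag & (0x100 | 0x800)
```

For example, flag 99 is a primary first-in-pair record, while 355 (99 + 256) is the same mate marked secondary and 2147 (99 + 2048) is the same mate marked supplementary.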

written 2.2 years ago by jomo018
rmh1995 wrote, 11 months ago:

I understand this is very late, but I believe UNC has provided some code to solve this issue: UBU. GenoMax discusses the problem in a little detail here. I hope this helps anyone currently looking for this solution!
