Question

Picard tools MergeBamAlignment error

6.6 years ago

ea11g10 • 0

Hi all,

I am attempting to go through the Dropseq pipeline, but I have changed a few things from the defaults: I aligned my FASTQ files to the hg38 genome rather than hg19, and I used TopHat rather than STAR as the aligner. However, when I get to the MergeBamAlignment step, which merges the unaligned BAM with the aligned BAM to re-introduce the tags into the aligned BAM file, I keep getting an error and I am unsure how to resolve it.

Both the bam files are sorted by queryname, as the pipeline says to do, but I keep getting the following error (I've removed some of the chromosome names because the full list, which contains all the contigs, would have been too long):

    Exception in thread "main" java.lang.IllegalArgumentException: Do not use this function to merge dictionaries with different sequences in them. Sequences must be in the same order as well. Found [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 3, 4, 5, 6, 7, 8, 9, ...].
        at htsjdk.samtools.SAMSequenceDictionary.mergeDictionaries(SAMSequenceDictionary.java:305)
        at picard.sam.SamAlignmentMerger.getDictionaryForMergedBam(SamAlignmentMerger.java:197)
        at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:346)
        at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:181)
        at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:282)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)

The picard code I'm using is:

    picard MergeBamAlignment \
        UNMAPPED_BAM=4571121.blue101/temp/unaligned_mc_tagged_polyA_filtered.bam \
        ALIGNED_BAM=4571121.blue101/temp/aligned.sorted.bam \
        OUTPUT=4565921.blue101/temp/merged.bam \
        REFERENCE_SEQUENCE=/scratch/ea11g10/Dropseq/hg38.fasta \
        PAIRED_RUN=false \
        INCLUDE_SECONDARY_ALIGNMENTS=false \
        CLIP_ADAPTERS=true \
        IS_BISULFITE_SEQUENCE=false \
        ALIGNED_READS_ONLY=false \
        MAX_INSERTIONS_OR_DELETIONS=1 \
        READ1_TRIM=0 \
        READ2_TRIM=0 \
        ALIGNER_PROPER_PAIR_FLAGS=false \
        SORT_ORDER=coordinate \
        PRIMARY_ALIGNMENT_STRATEGY=BestMapq \
        CLIP_OVERLAPPING_READS=true \
        ADD_MATE_CIGAR=true \
        VERBOSITY=INFO \
        QUIET=false \
        VALIDATION_STRINGENCY=STRICT \
        COMPRESSION_LEVEL=5 \
        MAX_RECORDS_IN_RAM=500000 \
        CREATE_INDEX=false \
        CREATE_MD5_FILE=false \
        GA4GH_CLIENT_SECRETS=client_secrets.json

I created the .dict file for hg38.fasta using Picard CreateSequenceDictionary.
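Since the exception complains about dictionaries with different sequences or a different order, one way to investigate is to compare the sequence names in the aligned BAM header with those in the reference .dict file. Below is a minimal sketch, assuming both headers are available as plain text: the BAM header can be dumped with `samtools view -H aligned.sorted.bam > aligned.header.txt`, and the .dict file is already a SAM-format text header. The function names and the file names are illustrative only; they are not part of Picard or the Dropseq tools.

```python
import sys


def sq_names(header_text):
    """Return the SN values of all @SQ header lines, in file order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ lines are tab-separated TAG:VALUE fields; SN holds the name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def compare_names(first, second):
    """Summarise how two ordered lists of sequence names differ."""
    first_set, second_set = set(first), set(second)
    return {
        "only_in_first": [n for n in first if n not in second_set],
        "only_in_second": [n for n in second if n not in first_set],
        # True only if both lists hold the same names in the same order.
        "same_order": first == second,
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_dicts.py aligned.header.txt hg38.dict
    with open(sys.argv[1]) as fh:
        aligned = sq_names(fh.read())
    with open(sys.argv[2]) as fh:
        reference = sq_names(fh.read())
    for key, value in compare_names(aligned, reference).items():
        print(key, value)
```

If the two lists differ (for example, names like `1` in one file and `chr1` in the other, or the same names in a different order), that would be consistent with the error message above.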

Any help would be appreciated. Thanks

picard mergebam dropseq • 3.4k views

ADD COMMENT • link 6.6 years ago by ea11g10 • 0

Both the bam files are sorted by queryname

I think only the UNMAPPED_BAM should be sorted by queryname.

ADD REPLY • link 6.6 years ago by Pierre Lindenbaum 161k

In the Dropseq pipeline, after alignment of the FASTQ file, it says to sort the output aligned BAM by queryname before moving on to merging the two BAM files.

ADD REPLY • link 6.5 years ago by ea11g10 • 0