Hello,
I've been running a Drop-seq experiment, and the second-to-last command fails:
java -Xmx4000m -jar 3rdParty/picard/picard.jar MergeBamAlignment REFERENCE_SEQUENCE=mm10/mm10.fasta UNMAPPED_BAM=unaligned_mc_tagged_polyA_filtered.bam ALIGNED_BAM=aligned.sorted.bam INCLUDE_SECONDARY_ALIGNMENTS=false PAIRED_RUN=false OUTPUT=merged.bam
This command requires two input files, which I've examined with Picard's ValidateSamFile.
For unaligned_mc_tagged_polyA_filtered.bam the error is:
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_PLATFORM_VALUE 1
and for aligned.sorted.bam, the error message is:
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_READ_GROUP 1
WARNING:MISSING_TAG_NM 11720805
WARNING:RECORD_MISSING_READ_GROUP 11720805
I think this can be fixed with Picard's AddOrReplaceReadGroups (http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups), but that command requires several options whose values I don't know (the default options didn't work), namely:
RGID=4 \
RGLB=lib1 \
RGPL=illumina \
RGPU=unit1 \
RGSM=20
I don't know what to put for these options for a Drop-seq experiment. How can I find the right values for Picard's AddOrReplaceReadGroups?
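For reference, here is a rough sketch of what a complete AddOrReplaceReadGroups call with these options might look like for aligned.sorted.bam. The read-group values are only the placeholders from the Picard documentation example, not values known to be right for Drop-seq, and the output file name is an assumption. The script just prints the command so it can be checked before running.

```shell
#!/bin/sh
# Sketch only: read-group values are the placeholder values from the Picard
# documentation example, not values known to be correct for Drop-seq.
PICARD_JAR="3rdParty/picard/picard.jar"   # same jar path as in the MergeBamAlignment command
IN_BAM="aligned.sorted.bam"               # the file that reported MISSING_READ_GROUP
OUT_BAM="aligned.sorted.rg.bam"           # hypothetical output name (assumption)

# Assemble the full command line as a single string.
CMD="java -Xmx4000m -jar $PICARD_JAR AddOrReplaceReadGroups I=$IN_BAM O=$OUT_BAM RGID=4 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=20"

# Print the command instead of executing it (dry run).
echo "$CMD"

# To actually run it once the values are settled:
#   sh -c "$CMD"
```

If this works, the OUTPUT file would presumably replace aligned.sorted.bam as ALIGNED_BAM in the MergeBamAlignment step.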
I think it would be helpful if you explained what kind of data you have and what you are trying to accomplish. I looked up Drop-seq, but it's not clear from the description how it relates to what you are doing. Also, the exact command lines you used at every step of the way are absolutely essential to solving the problem. I only see one command, but clearly you used more.
Hi Brian,
Drop-seq is a single-cell pipeline.
It is run as a series of commands, and the second-to-last is the one that fails: