Merging alignment graphs vs merging BAMs


20 months ago

Daniel • 0

Hi, I'm aligning my reads to the reference graph using the same process as found here:

time ${DATA_DIR}/vg giraffe --progress \
--read-group "ID:1 LB:lib1 SM:HG003 PL:illumina PU:unit1" \
--sample "HG003" \
-o BAM --ref-paths ${DATA_DIR}/GRCh38.path_list.txt \
-P -L 3000 \
-f ${DATA_DIR}/HG003.novaseq.pcr-free.35x.R1.fastq.gz \
-f ${DATA_DIR}/HG003.novaseq.pcr-free.35x.R2.fastq.gz \
-Z ${DATA_DIR}/hprc-v1.1-mc-grch38.gbz \
--kff-name ${DATA_DIR}/HG003.fq.kff \
--haplotype-name ${DATA_DIR}/hprc-v1.1-mc-grch38.hapl \
-t $(nproc) > reads.unsorted.bam

Here are my questions:

1. Is there any difference or benefit in merging the alignment graphs produced from multiple read pairs, compared with running "samtools merge" on the resulting BAMs (after sorting the alignments)? If so, what is the command for merging graphs?
2. Should we be removing duplicates? If so, at which point in the pipeline should this happen, and which tool is recommended: Picard MarkDuplicates?
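
To make the BAM-level option concrete, this is roughly what I have in mind: sort each per-run BAM, then merge the sorted files. It is only a sketch. The function names (run, merge_lanes), the file names and the thread count are hypothetical, it assumes file names without spaces, and it does not include any duplicate-marking step because I don't know where that belongs.

```shell
# Print the command instead of running it when DRY_RUN=1 (handy for checking the plan).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# merge_lanes OUT.bam IN1.bam IN2.bam ...
# Coordinate-sort every input BAM, merge the sorted files into OUT.bam, then index it.
merge_lanes() {
  out="$1"
  shift
  sorted=""
  for bam in "$@"; do
    s="${bam%.bam}.sorted.bam"          # hypothetical naming scheme
    run samtools sort -@ 4 -o "$s" "$bam"
    sorted="$sorted $s"
  done
  # $sorted is left unquoted on purpose so it splits into one argument per file.
  run samtools merge -@ 4 -f "$out" $sorted
  run samtools index "$out"
}
```

Running it as "DRY_RUN=1 merge_lanes merged.bam lane1.bam lane2.bam" only prints the samtools commands, so the plan can be checked before anything is executed.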

Thank you.

bam vg • 436 views
