Merging alignment graphs vs merging BAMs
20 months ago
Daniel • 0

Hi, I'm aligning my reads to the reference graph using the same process as found here:

time ${DATA_DIR}/vg giraffe --progress \
--read-group "ID:1 LB:lib1 SM:HG003 PL:illumina PU:unit1" \
--sample "HG003" \
-o BAM --ref-paths ${DATA_DIR}/GRCh38.path_list.txt \
-P -L 3000 \
-f ${DATA_DIR}/HG003.novaseq.pcr-free.35x.R1.fastq.gz \
-f ${DATA_DIR}/HG003.novaseq.pcr-free.35x.R2.fastq.gz \
-Z ${DATA_DIR}/hprc-v1.1-mc-grch38.gbz \
--kff-name ${DATA_DIR}/HG003.fq.kff \
--haplotype-name ${DATA_DIR}/hprc-v1.1-mc-grch38.hapl \
-t $(nproc) > reads.unsorted.bam

Here are my questions:

  1. Is there any difference or benefit in merging the alignment graphs from multiple read pairs versus running "samtools merge" on the resulting BAMs (after sorting the alignments)? If so, what is the command for merging graphs?
  2. Should we be removing duplicates? If so, at which point in the workflow should this happen, and which tool is recommended? Picard MarkDuplicates?

Thank you.

bam vg • 436 views