Hello VG team,
I want to compare alignment performance between Bowtie2, BWA-MEM2, and VG Giraffe. To do so, I generated simulated reads (50 paired-end FASTQ files) based on chromosome 1 (HPRC v1.1) using vg sim and vg surject as follows:
vg sim \
-x chr1.xg \
-g chr1.gbwt \
-n 20000000 \
-p 500 -v 50 \
-e 0.0024 -i 0.00029 \
--threads 40 \
--progress \
-m HG02055 \
-F REAL_SAMPLE_1.fastq.gz \
-F REAL_SAMPLE_2.fastq.gz \
--random-seed 42 \
-a > HG02055.gam
vg surject \
-x chr1.xg \
--bam-output \
--into-path CHM13#0#chr1 \
--threads 40 \
--progress \
--sample HG02055 \
HG02055.gam \
| samtools reheader -c "sed s/CHM13#0#//g" - \
| samtools sort -n -@ 40 - \
| samtools fastq -@ 40 - > HG02055.interleaved.fastq
Testing the Bowtie2 and BWA-MEM2 is straightforward. However, when it comes to VG Giraffe, things get a bit complex. For instance, if I want to test sample HG02055, I should first generate a custom chr1 pangenome without paths associated with HG02055, as it would be similar to a personalized pangenome.
What is the best way to proceed from a computational perspective? I am aware I can remove a sample from a GBWT using the following command:
vg gbwt -o chr1.HG02055.gbwt --remove-sample HG02055 chr1.gbwt
However, how can I remove it from a GFA or GBZ graph?
Thank you in advance for any help you can provide.
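For the GFA side of the question, one possible approach is a plain text filter. This is a minimal sketch, not a vg feature: it assumes the GFA stores haplotypes as GFA 1.1 W-lines with the sample name in the second tab-separated column, and the function name drop_sample_walks is hypothetical. It only drops the sample's walks; nodes and edges that only that sample covers stay in the graph.

```shell
# Hypothetical helper: drop every W-line (haplotype walk) of one sample from a GFA.
# Assumption: GFA 1.1 W-lines, where column 2 holds the sample name.
# Segments (S) and links (L) are left untouched, so the graph topology is unchanged.
drop_sample_walks() {
    sample="$1"   # sample to remove, e.g. HG02055
    gfa="$2"      # path to the input GFA
    # Print every line except W-lines whose sample column matches exactly.
    awk -F'\t' -v s="$sample" '!($1 == "W" && $2 == s)' "$gfa"
}
```

It could be called as drop_sample_walks HG02055 chr1.gfa > chr1.noHG02055.gfa, where both file names are placeholders.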
Thank you so much. I did not fully grasp the vg gbwt command. And thank you for the great work!
saruman : Please accept this answer (green check mark) to provide closure for this thread.
Done. Thank you.