HiFi read alignment with vg giraffe
8 months ago
shehzad_99 • 0

Hi, I used vg giraffe to align long-read HiFi sequences to a pangenome, and these are the numbers I get from vg stats:

Total alignments: 4144653
Total primary: 4144653
Total secondary: 0
Total aligned: 4144644
Total perfect: 420948
Total gapless (softclips allowed): 3318916
Total paired: 0
Total properly paired: 0
Alignment score: mean 4237.06, median 3312, stdev 3333.28, max 25623 (1 reads)
Mapping quality: mean 58.526, median 60, stdev 8.54529, max 60 (4010647 reads)
Insertions: 1151521 bp in 768106 read events
Deletions: 1222803 bp in 772070 read events
Substitutions: 21231696 bp in 21231696 read events
Softclips: 43549949152 bp in 5046954 read events
Total time: 498646 seconds
Speed: 8.31181 reads/second

My team believes that these numbers are too good to be true and that something is wrong. Could someone let me know whether these results look reasonable?

For reference, the command I used is:

vg giraffe -p -t 80 -Z /path/to/file.gbz -d /path/to/file.dist -m /path/to/file.min -x /path/to/file.xg -f sample.fq > vg-sample.gam 2> vg-sample.out 
vg • 599 views

What numbers do you find "too good to be true"? What are you mapping and against what? What were you expecting?

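I would also be quite surprised if you got especially good performance on HiFi reads from a mapping tool that is designed for short-read data. One thing to note is that you got an average of about 10508 bp of soft clips per read (43549949152 bp over 4144653 reads), which is roughly half the length of a HiFi read. Soft-clipped bases are not aligned, so a large part of each read is not actually contributing to the alignment.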
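
If it helps, here is a small sketch of the arithmetic behind that per-read figure, using only the numbers copied from the vg stats output in the question. The variable and function names are my own, purely for illustration; nothing here calls vg itself.

```python
# Figures copied by hand from the vg stats output in the question.
total_alignments = 4_144_653
total_aligned = 4_144_644
total_perfect = 420_948
total_gapless = 3_318_916
softclip_bp = 43_549_949_152
softclip_events = 5_046_954
total_seconds = 498_646


def per_read_summary():
    """Normalise the raw vg stats totals by the number of reads."""
    return {
        # Average number of soft-clipped bases per read.
        "softclip_bp_per_read": softclip_bp / total_alignments,
        # Average number of soft-clip events per read (at most 2: one per end).
        "softclip_events_per_read": softclip_events / total_alignments,
        # Share of reads that received any alignment at all.
        "fraction_aligned": total_aligned / total_alignments,
        # Share of reads aligned with no mismatches, gaps, or clipping.
        "fraction_perfect": total_perfect / total_alignments,
        # Share of reads aligned without indels (soft clips allowed).
        "fraction_gapless": total_gapless / total_alignments,
        # Throughput, which should match the "Speed" line in the output.
        "reads_per_second": total_alignments / total_seconds,
    }


summary = per_read_summary()
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The soft-clip average comes out at roughly 10508 bp per read, so a high "Total aligned" count and a median mapping quality of 60 can coexist with a large fraction of each read being left unaligned. Dividing the clipped bases by your actual mean read length (not shown in the question) would tell you what fraction of the data is really aligned.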

There are experimental features in vg giraffe to support HiFi alignment, but they're not fully baked yet.
