vg giraffe successfully mapped?
3 days ago
kingcohn ▴ 30

Hello again.

So, after taking your suggestions, I was able to generate a mapping file with my paired-end reads. Yay! However, some of the warnings are giving me pause.

warning[vg::giraffe]: Encountered 100000 ambiguously-paired reads before finding enough
                      unambiguously-paired reads to learn fragment length distribution. Are you sure
                      your reads are paired and your graph is not a hairball?
warning[vg::giraffe]: Finalizing fragment length distribution before reaching maximum sample size
                      mapped 15 reads single ended with 100000 pairs of reads left unmapped
                      mean: 0, stdev: 1
warning[vg::giraffe]: Cannot cluster reads with a fragment distance smaller than read distance
                      Fragment length distribution: mean=0, stdev=1
                      Fragment distance limit: 2, read distance limit: 200
warning[vg::giraffe]: Falling back on single-end mapping
Using fragment length estimate: 0 +/- 1
Mapped 23194886 reads across 48 threads in 2715.04 seconds with 43.8742 additional single-threaded seconds.
Mapping speed: 177.921 reads per second per thread
Used 4940.71 CPU-seconds (including output).
Achieved 4694.64 reads per CPU-second (including output)
Used 7535877895759 CPU instructions (not including output).
Mapping slowness: 0.324894 M instructions per read at 1525.26 M mapping instructions per inclusive CPU-second
Memory footprint: 2.69394 GB

I perused this issue, which describes a similar warning. Below are the stats on the GAM. Should I be concerned about the low "Total properly paired"? Also, this is a fairly hairy reference graph generated with PGGB, with Jaccard scores in the 0.14-0.16 range, but for my testing I really want to keep the outgroup in the graph. Thanks!

vg stats:

Total alignments: 23194886
Total primary: 23194886
Total secondary: 0
Total aligned: 3587032
Total perfect: 841
Total gapless (softclips allowed): 2473575
Total paired: 23194886
Total properly paired: 30
Alignment score: mean 68.5133, median 62, stdev 28.6525, max 160 (637 reads)
Mapping quality: mean 23.6366, median 21, stdev 17.3144, max 60 (148111 reads)
Insertions: 2109185 bp in 797954 read events
Deletions: 2798036 bp in 961606 read events
Substitutions: 21341967 bp in 21341967 read events
Matches: 328426000 bp (14.1594 bp/alignment)
Softclips: 182091099 bp in 3953117 read events
Total time: 2670.53 seconds
Speed: 8685.49 reads/second

Are you referring to this previous thread: "vg dist on (large) graph runtime?"

I can see that you also posted about vg ~5 years ago. You are the true expert on this program, no :)


ha! true expert? hardly...but back in my day the warnings were easier to ignore

3 days ago
Jouni Sirén ▴ 800

There are three common reasons for that warning:

  1. The reads are not paired.
  2. The reads are sorted, with previously unmapped reads first, and vg cannot determine the fragment length distribution on its own. If you want to do paired-end mapping, you either have to shuffle the reads or provide the fragment length parameters manually.
  3. The graph is not a good reference, either because it does not resemble your sample or because its structure is too far from what Giraffe expects.

In your case, the third reason is probably the right one. Graphs built with PGGB are less useful for read mapping than those built with Minigraph–Cactus.
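
The first two reasons can be ruled out before blaming the graph. Below is a minimal Python sketch, not part of vg, that checks whether two FASTQ files list mates in the same order and shuffles the pairs together. It assumes uncompressed FASTQ with four lines per record and mates that share a read name apart from an optional /1 or /2 suffix. The function names (read_fastq, base_name, count_mismatched_pairs, shuffle_pairs) are made up for illustration.

```python
import random
from itertools import islice, zip_longest


def read_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        if not header.startswith("@"):
            raise ValueError(f"bad FASTQ header: {header!r}")
        yield header[1:], seq, qual


def base_name(name):
    """Drop the comment and a trailing /1 or /2 so that mates compare equal."""
    name = name.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def count_mismatched_pairs(records1, records2):
    """Count positions where the two inputs disagree on the read name.

    A record without a partner (inputs of different length) also counts.
    """
    mismatches = 0
    for r1, r2 in zip_longest(records1, records2):
        if r1 is None or r2 is None or base_name(r1[0]) != base_name(r2[0]):
            mismatches += 1
    return mismatches


def shuffle_pairs(records1, records2, seed=0):
    """Shuffle read pairs so that mates stay at the same position."""
    first, second = list(records1), list(records2)
    if len(first) != len(second):
        raise ValueError("inputs contain different numbers of reads")
    pairs = list(zip(first, second))
    random.Random(seed).shuffle(pairs)  # fixed seed keeps the order reproducible
    return [p[0] for p in pairs], [p[1] for p in pairs]
```

With real data you would pass open file handles to read_fastq. Note that shuffle_pairs holds every pair in memory, so for tens of millions of reads a streaming tool is likely more practical. If count_mismatched_pairs returns 0 and shuffling does not change the warning, the graph remains the most likely explanation.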

1 day ago

Just FYI, these are the kind of stats I typically see for properly paired reads (much higher than yours). Check that you have passed the reads to vg correctly.

Total paired: 649308246
Total properly paired: 639578400
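
To put the two reports side by side: the numbers just above work out to about 98.5% properly paired, while the run in the question has 30 out of 23194886, roughly 1.3 per million. Here is a small Python sketch that pulls the "Total ...: N" lines out of a vg stats report and computes such fractions. It assumes the plain-text layout pasted in this thread, and the helper names (parse_counts, fraction) are hypothetical.

```python
def parse_counts(stats_text):
    """Collect the integer 'Total ...: N' lines of a vg stats report."""
    counts = {}
    for line in stats_text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Lines such as 'Total time: 2670.53 seconds' are skipped here
        # because their value is not a plain integer.
        if sep and key.startswith("Total ") and value.isdigit():
            counts[key.strip()] = int(value)
    return counts


def fraction(counts, key, denominator="Total paired"):
    """Return counts[key] as a fraction of counts[denominator]."""
    return counts[key] / counts[denominator]


question_run = parse_counts("""\
Total alignments: 23194886
Total aligned: 3587032
Total paired: 23194886
Total properly paired: 30
""")
typical_run = parse_counts("""\
Total paired: 649308246
Total properly paired: 639578400
""")

print(f"{fraction(question_run, 'Total properly paired'):.2e}")
print(f"{fraction(typical_run, 'Total properly paired'):.2%}")
print(f"{fraction(question_run, 'Total aligned', 'Total alignments'):.2%}")
```

The last line shows that only about 15% of the reads in the question aligned at all, which fits the explanation that the graph is a poor match for the sample rather than a simple pairing mistake.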