User: Jouni Sirén
- Reputation: 130
- Status: Trusted
- Location: UCSC Genomics Institute
- Website: https://iki.fi/jouni.s...
- Twitter: jltsiren
- Last seen: 21 hours ago
- Joined: 1 year, 3 months ago
- Email: j*******@gmail.com
Posts by Jouni Sirén
11 results • page 1 of 2
1 vote • 1 answer • 144 views
A: vg index memory allocation
... Assuming that the graph is not too complex locally (in a 256 bp window), ~2 billion initial kmers in a single graph file should require 100-200 GB of memory and 200-300 GB of disk space in `$TMPDIR`.
GCSA construction uses a semi-external algorithm that works best when the graph is partitioned (e.g. by c ...
written 18 days ago by
Jouni Sirén • 130
0 votes • 1 answer • 141 views
... The original sequences can be stored as threads (lightweight paths) in a GBWT index. See the [index construction wiki page][1] for details on building the GBWT. Storing a large number of paths in the graph itself is usually not a good idea, because the paths are not very space-efficient.
If you wan ...
written 5 weeks ago by
Jouni Sirén • 130
1 vote • 1 answer • 177 views
... It looks like `vg convert` outputs the graph in uncompressed form, while `vg view` compresses it with gzip. In any case, both formats are now obsolete. The default graph format is now HashGraph, which is faster and requires less memory than the old vg format. You can convert the GFA into the HashGra ...
written 5 weeks ago by
Jouni Sirén • 130
1 vote • 1 answer • 150 views
... You can get some basic statistics about the GAM file with `vg stats -a alignments.gam [graph-name]`. The graph file is optional, but if you include it, `vg stats` will compute some extra statistics.
...
written 9 weeks ago by
Jouni Sirén • 130
1 vote • 2 answers • 166 views
... VG assumes that path names are opaque strings. While some path names starting with `_` (e.g. `_alt_*` and `_thread_*`) are used for technical purposes, VG generally does not understand the information encoded in path names.
In VG terminology, there is a conceptual difference between paths and threa ...
written 9 weeks ago by
Jouni Sirén • 130
3 votes • 1 answer • 248 views
... The easiest way is probably doing it outside vg:
1. Extract all kmers (kmer occurrences) with `vg kmers`.
2. Extract the selected paths in FASTA format with `vg paths -v graph.vg -F -p path-names.txt > output.fa`.
3. Determine the kmers in the paths using an external tool.
4. Compare the kmer se ...
written 6 months ago by
Jouni Sirén • 130
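Steps 3 and 4 of the answer above (finding the kmers on the selected paths and comparing them with the kmers of the whole graph) could be sketched in Python roughly as follows. This is only an illustration, not the author's method: the helper names `kmers`, `read_fasta`, and `split_kmers` are hypothetical, the graph kmers are assumed to be already available as a set of plain strings (the output format of `vg kmers` is not parsed here), and reverse complements are ignored.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, parts = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(parts)
            header = line[1:].split()
            name, parts = (header[0] if header else ""), []
        elif line:
            parts.append(line)
    if name is not None:
        yield name, "".join(parts)


def split_kmers(graph_kmers, path_fasta_lines, k):
    """Split graph kmers into those on the given paths and those not on them.

    graph_kmers is assumed to be an iterable of kmer strings (hypothetical
    input; parsing real `vg kmers` output is left out of this sketch).
    """
    path_kmers = set()
    for _, sequence in read_fasta(path_fasta_lines):
        path_kmers |= kmers(sequence, k)
    graph_kmers = set(graph_kmers)
    return graph_kmers & path_kmers, graph_kmers - path_kmers
```

With a file such as the `output.fa` from step 2, `split_kmers(graph_kmers, open("output.fa"), k)` would return the kmers found on the selected paths and the remaining graph kmers as two sets.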
1 vote • 1 answer • 213 views
A: Vg gamsort temporary folder
... `vg gamsort`, as well as most other vg commands, takes the temporary directory from environment variables. The primary variable is `TMPDIR`, but vg also checks a number of other variables (`TMP`, `TEMP`, `TEMPDIR`, `USERPROFILE`) if that has not been set. If everything else fails, vg falls back to ` ...
written 6 months ago by
Jouni Sirén • 130
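The lookup described in the answer above can be illustrated with a short Python sketch. It is not vg's actual implementation: the function name `resolve_temp_dir` is hypothetical, the order in which the four secondary variables are checked is an assumption, treating an empty value as unset is an assumption, and the final fallback directory is left to the caller because the excerpt is cut off before naming it.

```python
import os

# Variables named in the answer. TMPDIR is the primary one; the relative
# order of the other four is an assumption made for this sketch.
TEMP_DIR_VARIABLES = ("TMPDIR", "TMP", "TEMP", "TEMPDIR", "USERPROFILE")


def resolve_temp_dir(fallback, env=None):
    """Return the first non-empty temporary directory variable, else fallback.

    fallback is supplied by the caller (hypothetical parameter); env defaults
    to the process environment and can be replaced by a dict for testing.
    """
    env = os.environ if env is None else env
    for name in TEMP_DIR_VARIABLES:
        value = env.get(name)
        if value:  # assumption: an empty string counts as "not set"
            return value
    return fallback
```

In practice this means that setting `TMPDIR` before running a command such as `vg gamsort` is enough to redirect its temporary files.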
2 votes • 1 answer • 351 views
... You can use `vg paths -F` to extract entire paths in FASTA format. By default, this extracts all paths in the graph. You can use option `-p FILE` to specify a list of path names in a file and option `-Q PREFIX` to extract all paths with the given name prefix.
In order to extract subpaths, you can c ...
written 10 months ago by
Jouni Sirén • 130
0 votes • 1 answer • 290 views
... Sometimes haplotypes contain alternate alleles of overlapping variants that make no sense together (under the vg interpretation of the VCF). By default, this causes a phase break in GBWT construction. With option `-o`, the construction will use the reference allele for the variant that occurs later ...
written 12 months ago by
Jouni Sirén • 130
1 vote • 1 answer • 331 views
... Your VCF and reference files are probably using different contig names. If so, you can match them by adding options `-n vcf_contig=fasta_contig` (e.g. `-n chr1=1 -n chr2=2`) to the `vg construct` command. ...
written 13 months ago by
Jouni Sirén • 130
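When many contigs need renaming, the `-n vcf_contig=fasta_contig` options from the answer above can be generated rather than typed by hand. The sketch below only builds the argument list and does not run `vg construct`; the helper names `strip_chr_prefix` and `contig_rename_options` are hypothetical, and the `chr` prefix rule simply mirrors the `chr1=1` example.

```python
def strip_chr_prefix(vcf_contigs):
    """Map VCF contig names such as 'chr1' to FASTA names such as '1'."""
    mapping = {}
    for name in vcf_contigs:
        mapping[name] = name[3:] if name.startswith("chr") else name
    return mapping


def contig_rename_options(mapping):
    """Build ['-n', 'vcf=fasta', ...] from a VCF-to-FASTA name mapping."""
    args = []
    for vcf_contig, fasta_contig in mapping.items():
        args += ["-n", f"{vcf_contig}={fasta_contig}"]
    return args
```

The resulting list can be appended to the rest of a `vg construct` argument list when the command is launched from a script.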
Latest awards to Jouni Sirén
Scholar • 6 months ago, created an answer that has been accepted. For A: Extract linear representation of paths in VG graph
Teacher • 6 months ago, created an answer with at least 3 up-votes. For A: Best way to obtain reference and non-reference k-mer
Scholar • 10 months ago, created an answer that has been accepted. For A: Extract linear representation of paths in VG graph