User: Jouni Sirén

Reputation: 130
Status: Trusted
Location: UCSC Genomics Institute
Website: https://iki.fi/jouni.s...
Twitter: jltsiren
Last seen: 21 hours ago
Joined: 1 year, 3 months ago
Email: j*******@gmail.com

Posts by Jouni Sirén

1 vote • 1 answer • 144 views
Answer: A: vg index memory allocation
... Assuming that the graph is not too complex locally (in a 256 bp window), ~2 billion initial kmers in a single graph file should require 100-200 GB memory and 200-300 GB disk space in `$TMPDIR`. GCSA construction uses a semi-external algorithm that works best when the graph is partitioned (e.g. by c ...
written 18 days ago by Jouni Sirén (130)
0 votes • 1 answer • 141 views
Answer: A: In 'vg construct', is it possible to have the variant paths be labeled with sam
... The original sequences can be stored as threads (lightweight paths) in a GBWT index. See the [index construction wiki page][1] for details on building the GBWT. Storing a large number of paths in the graph itself is usually not a good idea, because the paths are not very space-efficient. If you wan ...
written 5 weeks ago by Jouni Sirén (130)
1 vote • 1 answer • 177 views
Answer: A: Issues calling variants with vg when using other tools upstream
... It looks like `vg convert` outputs the graph in uncompressed form, while `vg view` compresses it with gzip. In any case, both formats are now obsolete. The default graph format is now HashGraph, which is faster and requires less memory than the old vg format. You can convert the GFA into the HashGra ...
written 5 weeks ago by Jouni Sirén (130)
1 vote • 1 answer • 150 views
Answer: A: vg gam file statistics: how many reads are mapped
... You can get some basic statistics about the GAM file with `vg stats -a alignments.gam [graph-name]`. The graph file is optional, but if you include it, `vg stats` will compute some extra statistics. ...
written 9 weeks ago by Jouni Sirén (130)
1 vote • 2 answers • 166 views
Answer: A: vg path associated with sample name
... VG assumes that path names are opaque strings. While some path names starting with `_` (e.g. `_alt_*` and `_thread_*`) are used for technical purposes, VG generally does not understand the information encoded in path names. In VG terminology, there is a conceptual difference between paths and threa ...
written 9 weeks ago by Jouni Sirén (130)
3 votes • 1 answer • 248 views
Answer: A: Best way to obtain reference and non-reference k-mer
... The easiest way is probably doing it outside vg: 1. Extract all kmers (kmer occurrences) with `vg kmers`. 2. Extract the selected paths in FASTA format with `vg paths -v graph.vg -F -p path-names.txt > output.fa`. 3. Determine the kmers in the paths using an external tool. 4. Compare the kmer se ...
written 6 months ago by Jouni Sirén (130)
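
Steps 3 and 4 of the answer above (finding the kmers in the extracted paths and comparing them with the graph kmers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `read_fasta`, `kmers_of`, and `split_kmers` are hypothetical, the graph kmers are assumed to be already loaded as plain strings (the output format of `vg kmers` is not parsed here), and reverse complements are not considered.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            # Sequence lines are joined and upper-cased for comparison.
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def kmers_of(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def split_kmers(graph_kmers, path_sequences, k):
    """Split graph kmers into those found on the paths and the rest."""
    path_kmers = set()
    for sequence in path_sequences:
        path_kmers |= kmers_of(sequence, k)
    graph_kmers = set(graph_kmers)
    return graph_kmers & path_kmers, graph_kmers - path_kmers
```

With the path FASTA from step 2 loaded through `read_fasta`, `split_kmers` returns the kmers shared with the selected paths and the kmers that occur only elsewhere in the graph. Whether kmers should be compared on both strands depends on how they were extracted, so that choice is left to the reader.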
1 vote • 1 answer • 213 views
Answer: A: Vg gamsort temporary folder
... `vg gamsort`, as well as most other vg commands, takes the temporary directory from environment variables. The primary variable is `TMPDIR`, but vg also checks a number of other variables (`TMP`, `TEMP`, `TEMPDIR`, `USERPROFILE`) if that has not been set. If everything else fails, vg falls back to ` ...
written 6 months ago by Jouni Sirén (130)
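
The lookup described in the answer above can be mimicked with a small Python sketch. It is not vg's actual code: the function name `resolve_temp_dir` is hypothetical, the order among the secondary variables and the treatment of empty values as unset are assumptions, and the final fallback is passed in as a parameter because the snippet is cut off before naming it.

```python
# Variables named in the answer; TMPDIR is the primary one.
# The order of the remaining variables is an assumption.
CANDIDATE_VARS = ("TMPDIR", "TMP", "TEMP", "TEMPDIR", "USERPROFILE")


def resolve_temp_dir(env, fallback):
    """Return the first non-empty candidate variable, else the fallback."""
    for name in CANDIDATE_VARS:
        value = env.get(name)
        if value:
            return value
    return fallback
```

Calling `resolve_temp_dir(os.environ, fallback)` with a fallback of your choice (after `import os`) shows which directory such a lookup would pick in the current shell, which can help when temporary files end up somewhere unexpected.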
2 votes • 1 answer • 351 views
Answer: A: Extract linear representation of paths in VG graph
... You can use `vg paths -F` to extract entire paths in FASTA format. By default, this extracts all paths in the graph. You can use option `-p FILE` to specify a list of path names in a file and option `-Q PREFIX` to extract all paths with the given name prefix. In order to extract subpaths, you can c ...
written 10 months ago by Jouni Sirén (130)
0 votes • 1 answer • 290 views
Comment: C: More haplotype threads than expected
... Sometimes haplotypes contain alternate alleles of overlapping variants that make no sense together (under the vg interpretation of the VCF). By default, this causes a phase break in GBWT construction. With option `-o`, the construction will use the reference allele for the variant that occurs later ...
written 12 months ago by Jouni Sirén (130)
1 vote • 1 answer • 331 views
Answer: A: Error in vg : vg Assertion `reference_for.count(fasta_contig)' failed.
... Your VCF and reference files are probably using different contig names. If so, you can match them by adding options `-n vcf_contig=fasta_contig` (e.g. `-n chr1=1 -n chr2=2`) to the `vg construct` command. ...
written 13 months ago by Jouni Sirén (130)
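
When many contigs need renaming, the repeated `-n vcf_contig=fasta_contig` options from the answer above can be generated from a mapping. The helper below is a hypothetical convenience, not part of vg; it only builds the argument list and assumes you already know which VCF name corresponds to which FASTA name.

```python
def rename_options(vcf_to_fasta):
    """Build repeated `-n vcf_contig=fasta_contig` arguments from a mapping."""
    args = []
    for vcf_name, fasta_name in vcf_to_fasta.items():
        args += ["-n", f"{vcf_name}={fasta_name}"]
    return args
```

For the example in the answer, `rename_options({"chr1": "1", "chr2": "2"})` yields the arguments `-n chr1=1 -n chr2=2`, which can then be appended to the rest of the `vg construct` command line.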

Latest awards to Jouni Sirén

Scholar 6 months ago, created an answer that has been accepted. For A: Extract linear representation of paths in VG graph
Scholar 6 months ago, created an answer that has been accepted. For A: Vg gamsort temporary folder
Teacher 6 months ago, created an answer with at least 3 up-votes. For A: Best way to obtain reference and non-reference k-mer
Scholar 10 months ago, created an answer that has been accepted. For A: Extract linear representation of paths in VG graph
