vg de novo calling strategy
0
0
Entering edit mode
3.2 years ago

Hi all,

I am not sure I understand correctly how we are supposed to use the vg pipeline when we don't have a variant set yet (in vcf format).

My objective is to use vg to investigate SVs and SNPs of mitochondrial sequences in a species complex. In the end I am interested in the coverage of each variant (ossibility of multiple mitochondria in those species). Maybe using graph genomes for mitochondria is not the best option either, but I wanted to avoid assembly issues (for example chimeras or misassembly of repetitive regions) when several mitochondria are found in one individual.

What I did so far from what I could grasp:

  • Get mitochondrial sequences from genbank and produce a starting graph using pggb (this will capture some existing variants, but not all) + index this graph
  • map PE reads with vg giraffe for hundreds of individuals
  • augment the graph for each individual
  • vg call for each individual producing an individual vcf

From there, I thought I could merge the vcfs, build a new graph containing all variation, map a second time and finally call a second time with all variation (then merge again all vcfs). However there does not seem to be a way to augment the reference graph (.vg format) using the merged vcf (vg construct only takes fasta sequences).

Am I misled in thinking I need to remap everything a second time? Is the first call enough to capture all variation in all individuals?

For example the answer to this issue suggests this is the way to proceed. Will I be missing variants by stopping at merging the first called vcfs. How do I get the same set of variants from all individuals from that point? I also need coverage of de novo variants identified in some individuals in the other individuals, which I don't get by stopping at the first call.

I will gladly take any advice or reference on this, thanks !

vg • 615 views
ADD COMMENT

Login before adding your answer.

Traffic: 2685 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6