What are "must have" data types one should extract from WGS .fq data?
11 weeks ago

I have multiple whole-genome .fq files prepared for variant calling. Repeating the whole pipeline is quite expensive, so I wonder which data types are must-haves to extract.

Right now I am planning to extract the following:

  1. Variants (indels, SNPs), with HaplotypeCaller
  2. Structural variants (<1000 bp long), with Manta

What other "must haves" could I extract?

wgs • 432 views

Why just <1000 bp long?


It's the maximum length generated by Manta for my dataset (~100 bp reads).


Do you have a reference for that?


Nope, I manually measured the lengths of the structural variants after conversion to PLINK format. Now that I think about it, there might be a crack in my logic because of PLINK's limits on allele name length.

bcftools query -f '%INFO/SVLEN\n' in.vcf | tr -d '-' | sort -n | tail
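
If bcftools is not at hand, the same check can be roughly sketched with awk, reading the lengths straight from the VCF rather than from the PLINK conversion. This is only a sketch under some assumptions: the VCF is uncompressed, the caller writes an SVLEN tag into the INFO column, and records without SVLEN (breakends, for example) can simply be skipped. The function names svlen_abs and svlen_max are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of bcftools or Manta.

# Print the absolute SVLEN of every record in an uncompressed VCF, one per line.
# Records without an SVLEN tag (e.g. breakends) are skipped.
svlen_abs() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n = split($8, info, ";")         # column 8 is INFO
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^SVLEN=/) {
                    v = substr(info[i], 7)   # text after "SVLEN="
                    sub(/,.*/, "", v)        # keep the first value if several are listed
                    if (v ~ /^-?[0-9]+$/) {
                        v = v + 0            # force numeric comparison
                        if (v < 0) v = -v    # deletions are usually negative
                        print v
                    }
                }
            }
        }
    ' "$1"
}

# Largest absolute SVLEN in the file.
svlen_max() {
    svlen_abs "$1" | sort -n | tail -n 1
}
```

Running `svlen_max in.vcf` on the caller's own output should show whether the ~1000 bp ceiling is really there or only appeared after the PLINK conversion.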