Question: Getting matrices of DP and GQ values from a VCF file
Asked 3.3 years ago by Jautis (United States):

Hi, I have a VCF file and I would like to get a site-by-individual matrix of read depths (the DP field in each sample column) and a second matrix of just the GQ scores.

What is the easiest way to do this? Thanks in advance!

Example input:

#CHROM  POS  ID  REF  ALT  QUAL     FILTER      INFO  FORMAT          samp1                  samp2
chr1    100  .   C    T    3106.72  SnpCluster  .     GT:AD:DP:GQ:PL  0/0:1,0:1:3:0,3,42     0/0:3,0:3:9:0,9,132
chr1    120  .   C    G    3106.72  SnpCluster  .     GT:AD:DP:GQ:PL  0/1:3,1:4:30:30,0,123  1/1:0,1:1:3:45,3,0


Example output for DP:

1    3
4    1

Tags: vcf-processing, snp

If you need the values for just one sample (one column), grep -v '^#' test.vcf | cut -f10 | awk -F ':' '{print $3"\t"$4}' should do: it prints DP and GQ for the first sample, assuming the FORMAT column is GT:AD:DP:GQ:PL as in your example. For several samples, I would write a script that parses each record, which should be pretty straightforward.
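
For the multi-sample case, a minimal Python sketch of such a script might look like the following. The function name format_matrix and the inline example records are made up for illustration. It assumes an uncompressed, tab-delimited VCF in which FORMAT is column 9 and the samples start at column 10, and it writes "." wherever a record does not carry the requested key.

```python
def format_matrix(vcf_lines, key):
    """Build a site-by-sample matrix (list of rows) for one FORMAT key.

    vcf_lines: any iterable of VCF text lines (an open file works).
    key: a FORMAT key such as "DP" or "GQ".
    Values are kept as strings; "." marks a missing value.
    """
    matrix = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and all header lines (## meta lines and #CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        format_keys = fields[8].split(":")  # e.g. GT:AD:DP:GQ:PL
        # Look the key up per record, since FORMAT may differ between sites.
        idx = format_keys.index(key) if key in format_keys else None
        row = []
        for sample in fields[9:]:
            values = sample.split(":")
            # Trailing sample fields may be omitted, so guard the index.
            if idx is None or idx >= len(values):
                row.append(".")
            else:
                row.append(values[idx])
        matrix.append(row)
    return matrix


# The two records from the question, rebuilt here with real tab separators.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsamp1\tsamp2",
    "chr1\t100\t.\tC\tT\t3106.72\tSnpCluster\t.\tGT:AD:DP:GQ:PL\t0/0:1,0:1:3:0,3,42\t0/0:3,0:3:9:0,9,132",
    "chr1\t120\t.\tC\tG\t3106.72\tSnpCluster\t.\tGT:AD:DP:GQ:PL\t0/1:3,1:4:30:30,0,123\t1/1:0,1:1:3:45,3,0",
]

dp = format_matrix(example, "DP")
gq = format_matrix(example, "GQ")

# Print each matrix as tab-separated rows, one site per line.
for name, matrix in (("DP", dp), ("GQ", gq)):
    print(name)
    for row in matrix:
        print("\t".join(row))
```

To run it on a real file, pass an open file handle instead of the example list, for instance format_matrix(open("test.vcf"), "DP"). Looking up the key by name in each record's FORMAT column, rather than hard-coding the third and fourth positions, keeps the result correct when the FORMAT layout varies between sites.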

Answer written 3.3 years ago by Eric Lim