Question: Getting matrices of DP and GQ values from VCF file
Jautis (United States) wrote, 2.5 years ago:

Hi, I have a VCF file and I would like to get a site-by-individual matrix of read depths (the DP field) and a second matrix of just the GQ scores.

What is the easiest way to do this? Thanks in advance!

Example input:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  samp1   samp2
chr1    100     .       C       T       3106.72 SnpCluster      .       GT:AD:DP:GQ:PL  0/0:1,0:1:3:0,3,42      0/0:3,0:3:9:0,9,132
chr1    120     .       C       G       3106.72 SnpCluster      .       GT:AD:DP:GQ:PL  0/1:3,1:4:30:30,0,123   1/1:0,1:1:3:45,3,0

 

Example output for DP:

1    3
4    1
vcf-processing snp

If you need the values for just one sample (column), grep -v '^#' test.vcf | cut -f10 | awk -F ':' '{print $3"\t"$4}' should do. It prints DP and GQ for the first sample (column 10), assuming the FORMAT order is GT:AD:DP:GQ:PL as in your example. For multiple samples, I would write a script to parse out the fields, which should be pretty straightforward.

written 2.5 years ago by Eric Lim
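
For the multi-sample case, the kind of script the reply suggests could look something like the following. This is only a minimal sketch, not the answerer's code: the function name field_matrix is made up for illustration, it assumes a tab-delimited VCF with a #CHROM header line, and it looks up the position of the requested field in the FORMAT column of each record rather than assuming a fixed order. A field that is absent for a site or sample is reported as ".".

```python
def field_matrix(vcf_lines, field):
    """Return (sample_names, rows); each row holds `field` for every sample at one site.

    `field_matrix` is a hypothetical helper name. Assumes tab-delimited VCF lines.
    """
    samples, rows = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names start at column 10
            continue
        if line.startswith("#") or len(cols) < 10:
            continue  # skip meta-information lines and records without samples
        keys = cols[8].split(":")  # FORMAT column, e.g. GT:AD:DP:GQ:PL
        idx = keys.index(field) if field in keys else None
        row = []
        for sample in cols[9:]:
            values = sample.split(":")
            # Trailing fields may be omitted for a sample; report "." when absent
            row.append(values[idx] if idx is not None and idx < len(values) else ".")
        rows.append(row)
    return samples, rows


# The two records from the question, rebuilt as tab-delimited lines
example = [
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "samp1", "samp2"]),
    "\t".join(["chr1", "100", ".", "C", "T", "3106.72", "SnpCluster", ".",
               "GT:AD:DP:GQ:PL", "0/0:1,0:1:3:0,3,42", "0/0:3,0:3:9:0,9,132"]),
    "\t".join(["chr1", "120", ".", "C", "G", "3106.72", "SnpCluster", ".",
               "GT:AD:DP:GQ:PL", "0/1:3,1:4:30:30,0,123", "1/1:0,1:1:3:45,3,0"]),
]

samples, dp = field_matrix(example, "DP")
for row in dp:
    print("\t".join(row))
```

Calling field_matrix(example, "GQ") gives the second matrix in the same way. For a real file, an open file handle can be passed in place of the list, for example open("test.vcf") with the file name used in the reply above.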