Question: Extract allele read counts for heterozygous positions in VCF
10 months ago by
Tübingen
German.M.Demidov1.8k wrote:

Dear all,

A very basic question: I have genotyped SNVs, and for each 0/1 variant I want to extract the depth of coverage and the read counts of both alleles. How can I do this "in a universal manner", in other words for any genotyped VCF, no matter how it is formatted? In our VCF files there is an AO field in the FORMAT column which contains the minor allele count. That's fine, but I cannot find this field in the official VCF 4.2 specification. The DP field contains the overall read depth, but I'd like to know the number of reads supporting the variant.

I know that there are a lot of parsers =) but I need it in a "raw" format for a basic script

vcf • 288 views
ADD COMMENTlink modified 10 months ago by Macspider3.1k • written 10 months ago by German.M.Demidov1.8k

What is the "alternative allele depth"? What is that AO field? What is a "universal manner"?

ADD REPLYlink written 10 months ago by Pierre Lindenbaum129k

I changed the question. I also have no idea what the AO field is: our FORMAT column has it, but it is not described in the VCF specification, so I guess it is not a good idea to rely on it. By "in a universal manner" I mean that I want my script for extracting allele read counts to work on any VCF, no matter how it is formatted (but I assume that the genotyping step was performed).

ADD REPLYlink modified 10 months ago • written 10 months ago by German.M.Demidov1.8k

AO apparently is the alternative allele count provided by FreeBayes =)

ADD REPLYlink written 10 months ago by German.M.Demidov1.8k
10 months ago by
Macspider3.1k
Vienna - BOKU
Macspider3.1k wrote:

When you do the pileup, for example with samtools, you can request additional tags in the INFO and FORMAT (per-sample genotype) fields of the output. These are specified via the -t option. I usually ask for:

-t DP,AD,ADF,ADR,DP4

Note that DP4 is deprecated but still appears in the output if you request it. It contains the information you want, as do ADF and ADR.
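For a "raw" script without a parser library, a minimal sketch of the extraction could look like the one below. It assumes a tab-separated VCF data line, a biallelic site, and that the per-sample counts are stored in AD (reference and alternate depth, comma-separated); it falls back to the RO/AO pair that FreeBayes writes. The function name `het_allele_counts` and the `sample_index` parameter are made up for illustration, and the fallback order is an assumption, not a standard.

```python
def het_allele_counts(vcf_line, sample_index=0):
    """Return (depth, ref_count, alt_count) for a biallelic 0/1 call, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT and sample columns, so no genotype
    if "," in fields[4]:
        return None  # multiallelic site: skipped in this sketch

    # Pair the FORMAT keys (column 9) with the values of the chosen sample.
    keys = fields[8].split(":")
    values = fields[9 + sample_index].split(":")
    sample = dict(zip(keys, values))

    # Treat phased (0|1) and unphased (0/1) heterozygous calls alike.
    gt = sample.get("GT", "").replace("|", "/")
    if gt not in ("0/1", "1/0"):
        return None

    try:
        if "AD" in sample:
            # AD lists the reference count first, then the alternate count.
            ref_count, alt_count = (int(x) for x in sample["AD"].split(","))
        elif "RO" in sample and "AO" in sample:
            # Assumed fallback: FreeBayes-style observation counts.
            ref_count, alt_count = int(sample["RO"]), int(sample["AO"])
        else:
            return None  # the counts were never written to this VCF
    except ValueError:
        return None  # missing value (".") or unexpected number of entries

    # Use DP when it is present and numeric, otherwise sum the two counts.
    dp = sample.get("DP", ".")
    depth = int(dp) if dp.isdigit() else ref_count + alt_count
    return depth, ref_count, alt_count
```

Calling it on each non-header line (lines not starting with `#`) and skipping `None` results gives the per-variant counts. If neither AD nor RO/AO is present, the function returns `None`, since the counts cannot be recovered from the VCF alone.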

ADD COMMENTlink written 10 months ago by Macspider3.1k

Thanks a lot! This is the answer I was looking for. So I kinda have to ask people to do this pileup with these options before using my script.

ADD REPLYlink written 10 months ago by German.M.Demidov1.8k

Yes, I am afraid so. If this information is not stored during the pileup, it won't be possible to extract it from a VCF file.

ADD REPLYlink written 10 months ago by Macspider3.1k