Question: Extract DP From The VCF File
mad.cichlids (Texas) wrote, 5.2 years ago:

I want to extract some information from the INFO column of my VCF file. When I load the file, NS (number of samples with data) is read correctly, but depth (DP) and number of alleles (AN) are shown as NA. My VCF file clearly shows that neither DP nor AN is empty. Here is one example line from the file. Thanks!

gi|393925858|gb|AGTA02071966.1|    0000000739    .    G    A    121.20    PASS    NS=74:AN=2:DP=8448    GT:DP:GQ:EC:SG    0/1:262:144:116:R

> vcf <- readVcf("z.vcf", "Genome")
> hdr <- exptData(vcf)[["header"]]
> info(hdr)

DataFrame with 3 rows and 3 columns
        Number        Type                     Description
   <character> <character>                     <character>
NS           1     Integer     Number of Samples With Data
DP           1     Integer                     Total Depth
AN           1     Integer Number of Alleles in Population

> info(vcf)
DataFrame with 2648 rows and 3 columns
                                                  NS        DP        AN
                                           <integer> <integer> <integer>
gi|393925858|gb|AGTA02071966.1|:0000000739        74        NA        NA
gi|393925858|gb|AGTA02071966.1|:0000000781        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000000957        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000000960        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000001007        73        NA        NA
...                                              ...       ...       ...

Can you be more specific? How are you reading the files? Bioconductor? Package name? Version?

written 5.2 years ago by JC

Thanks. I used vcf <- readVcf("z.vcf", "Genome"), where z.vcf is my VCF file and "Genome" is the folder of my indexed reference genome. The package is VariantAnnotation from Bioconductor. Here is how I installed it:

source("http://bioconductor.org/biocLite.R")
biocLite("VariantAnnotation")

According to the Bioconductor archive, this should be version 1.8.13. Please let me know if you need any additional information; I really appreciate your comment. I am running R on Ubuntu.
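The installed version can also be checked directly from the R session instead of being inferred from the archive. This is only a minimal sketch using base R calls; the `pkg` variable is just an illustrative name, and the check is guarded so it still runs where the package is missing:

```r
# Report the installed package version and the R version for a bug report.
pkg <- "VariantAnnotation"
if (requireNamespace(pkg, quietly = TRUE)) {
  # packageVersion() reads the version from the installed package's DESCRIPTION
  print(packageVersion(pkg))
} else {
  message(pkg, " is not installed in this R library")
}
# R version string, useful alongside the package version
print(R.version.string)
```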

written 5.2 years ago by mad.cichlids
mad.cichlids (Texas) wrote, 5.2 years ago:

Thanks to Valerie Obenchain's help, it turned out that the colons in the INFO column caused this. After replacing the colons with semicolons, everything works fine.
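For anyone hitting the same problem: the VCF format separates INFO entries with semicolons, which would explain why only the first key (NS) was picked up. Below is a rough sketch of one way to do the replacement in base R. It is not necessarily how the file was actually fixed; the function name `fix_info_separators`, the `##fileformat` header line, and the output file name are made up for illustration. It assumes tab-delimited records and that no INFO value legitimately contains a colon.

```r
# Replace ':' with ';' in the INFO column (column 8) of VCF data lines.
# Header lines (starting with '#') are left untouched.
# Assumption: records are tab-delimited and no INFO value contains a real ':'.
fix_info_separators <- function(lines) {
  is_header <- grepl("^#", lines)
  fields <- strsplit(lines[!is_header], "\t", fixed = TRUE)
  fixed <- vapply(fields, function(f) {
    # Only touch INFO; the FORMAT and sample columns use ':' legitimately
    if (length(f) >= 8) f[8] <- gsub(":", ";", f[8], fixed = TRUE)
    paste(f, collapse = "\t")
  }, character(1))
  lines[!is_header] <- fixed
  lines
}

# The example record from the question, rebuilt with tab separators
record <- paste(c("gi|393925858|gb|AGTA02071966.1|", "0000000739", ".", "G", "A",
                  "121.20", "PASS", "NS=74:AN=2:DP=8448",
                  "GT:DP:GQ:EC:SG", "0/1:262:144:116:R"), collapse = "\t")
# Hypothetical header line, included only to show that headers pass through
fixed <- fix_info_separators(c("##fileformat=VCFv4.1", record))
print(strsplit(fixed[2], "\t", fixed = TRUE)[[1]][8])

# On a real file (the output name is hypothetical):
# writeLines(fix_info_separators(readLines("z.vcf")), "z.fixed.vcf")
```

Only column 8 is rewritten because the FORMAT and genotype columns are supposed to be colon-separated, so a global search-and-replace over the whole file would break them.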
