Extract DP from the VCF file
10.1 years ago
mad.cichlids ▴ 140

I want to extract some information from the INFO column of my VCF file. When I load the file, NS (number of samples with data) is read correctly, but depth (DP) and number of alleles (AN) are shown as "NA". My VCF file clearly shows that neither DP nor AN is empty. Here is one example line from the file. Thanks!

gi|393925858|gb|AGTA02071966.1|    0000000739    .    G    A    121.20    PASS    NS=74:AN=2:DP=8448    GT:DP:GQ:EC:SG    0/1:262:144:116:R

> vcf <- readVcf("z.vcf", "Genome")
> hdr <- exptData(vcf)[["header"]]
> info(hdr)

DataFrame with 3 rows and 3 columns
        Number        Type                     Description
   <character> <character>                     <character>
NS           1     Integer     Number of Samples With Data
DP           1     Integer                     Total Depth
AN           1     Integer Number of Alleles in Population

> info(vcf)
DataFrame with 2648 rows and 3 columns
                                                  NS        DP        AN
                                           <integer> <integer> <integer>
gi|393925858|gb|AGTA02071966.1|:0000000739        74        NA        NA
gi|393925858|gb|AGTA02071966.1|:0000000781        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000000957        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000000960        74        NA        NA
gi|393925983|gb|AGTA02071903.1|:0000001007        73        NA        NA
...                                              ...       ...       ...

Can you be more specific? How are you reading the files? Bioconductor? Package name? Version?


Thanks. I used vcf <- readVcf("z.vcf", "Genome"), where z.vcf is my VCF file and "Genome" is the folder of my indexed reference genome. The package is VariantAnnotation from Bioconductor. Here is how I installed it:

source("http://bioconductor.org/biocLite.R")
biocLite("VariantAnnotation")

According to the Bioconductor archive, this should be version 1.8.13. Please let me know if you need any additional information; I really appreciate your comment. I am running R on Ubuntu.

10.1 years ago
mad.cichlids ▴ 140

Thanks to Valerie Obenchain's help, it turned out that the colons in the INFO column caused this: the entries were separated by colons instead of semicolons. After replacing the colons with semicolons, everything works fine.
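
For anyone hitting the same problem, here is a minimal base-R sketch of that replacement. It is not the exact command used in this thread. The function name fix_info_separators and the output file name z.fixed.vcf are made up for illustration, and the sketch assumes a tab-delimited VCF with INFO in column 8. It also assumes no INFO value legitimately contains a colon, since every colon in that column is replaced. Header lines and the FORMAT and sample columns, where colons are the correct separator, are left untouched.

```r
# Hypothetical helper: swap ":" for ";" in the INFO column (column 8) only.
# Assumes tab-delimited VCF lines; lines starting with "#" are headers.
fix_info_separators <- function(lines) {
  is_header <- startsWith(lines, "#")
  fields <- strsplit(lines[!is_header], "\t", fixed = TRUE)
  fixed <- vapply(fields, function(f) {
    # Only touch records that actually have an INFO column
    if (length(f) >= 8) {
      f[8] <- gsub(":", ";", f[8], fixed = TRUE)
    }
    paste(f, collapse = "\t")
  }, character(1))
  lines[!is_header] <- fixed
  lines
}

# Small demonstration on one record shaped like the line in the question
demo <- paste(c("contig1", "739", ".", "G", "A", "121.20", "PASS",
                "NS=74:AN=2:DP=8448", "GT:DP:GQ:EC:SG", "0/1:262:144:116:R"),
              collapse = "\t")
strsplit(fix_info_separators(demo), "\t", fixed = TRUE)[[1]][8]
# "NS=74;AN=2;DP=8448"

# On a real file (file names are assumptions):
# writeLines(fix_info_separators(readLines("z.vcf")), "z.fixed.vcf")
```

The corrected file can then be read with readVcf as before, and info(vcf) should show DP and AN instead of NA.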
