Question: Tabix Perl API Examples
Vivek wrote (5.8 years ago):

I'm trying to develop a script to annotate a VCF file with functional predictions and scores from dbNSFP. dbNSFP is distributed as large tab-delimited files with information for each nonsynonymous variant locus: https://sites.google.com/site/jpopgen/dbNSFP

I thought it would be most efficient to index these files and use the Tabix Perl API to query them for each variant in my VCF file. I searched for working examples or documentation for the Tabix Perl API without any luck, and I'd like to know whether anything is available that would be easier than reading the source code.

hardingnj wrote (5.8 years ago):

I recently had to go through the source myself; I found the tests in the t/ folder instructive.

Brief synopsis:

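use Tabix;

# $file is the bgzip-compressed data file; its tabix index must already exist.
# $file, $chrom, $start and $end are supplied by the caller.
my $tabix = Tabix->new('-data' => $file);

# query() returns an iterator over the region; read() returns one line per call.
my $iter = $tabix->query($chrom, $start, $end);

while (defined(my $var = $tabix->read($iter))) {
    # handle one matching line in $var here
}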

$var is a scalar string holding one raw line from the file; it still needs to be parsed (for example, split on tabs).
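To make the parsing step concrete, here is a minimal sketch. It assumes the line is tab-delimited and that you supply the column names yourself; the names below are hypothetical and are not the real dbNSFP header, and the sample line is made up. It uses only core Perl, so it runs without the Tabix module:

```perl
use strict;
use warnings;

# Hypothetical column names for illustration only; take the real ones
# from the header line of your own annotation file.
my @columns = qw(chr pos ref alt SIFT_score);

# Turn one raw tab-delimited line (such as the string returned by
# $tabix->read) into a hash reference keyed by column name.
sub parse_line {
    my ($line, $cols) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    my %record;
    @record{@$cols} = @fields;            # hash slice: column name => value
    return \%record;
}

# A made-up line standing in for $var.
my $var = join("\t", '1', '69091', 'A', 'C', '0.12');
my $rec = parse_line($var, \@columns);
print "$rec->{chr}:$rec->{pos} $rec->{ref}>$rec->{alt} SIFT=$rec->{SIFT_score}\n";
```

Passing the column list as an argument keeps the function usable for any tab-delimited annotation file, not just one layout.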


I'll accept this as the answer since the test scripts appear to be the best available resource. Thanks!

Reply written 5.8 years ago by Vivek

Dan Gaston wrote (5.8 years ago):

While I understand wanting to do this yourself, I should point out that there is already a widely used tool in the genomics community that annotates a VCF file with dbNSFP annotations. It was recently updated to use tabix, which makes it much quicker than before.

SnpSift, which is part of the SnpEff package, can do this. Since SnpSift is published and SnpEff is widely used for annotating VCF files, you might want to use it rather than reinvent the wheel.


Thanks for the link, but the idea here is not to reinvent the wheel; it is to avoid disturbing the existing workflow of an analysis pipeline too much. With another tool, I would eventually have to write another script to reformat the output and adapt it to the collaborators' preferences, which they are usually very particular about, so introducing a simple annotation script is probably easier for me.

Also, the idea of using tabix is not limited to this dbNSFP scenario: I should eventually be able to write a stock module that can take any tab-delimited annotation file and tag a VCF as required.
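The tagging half of such a module could look roughly like the sketch below. It assumes the annotation value has already been looked up, and the tag name DBNSFP_SIFT and the sample VCF line are hypothetical. It relies only on the VCF convention that INFO is the 8th column, with entries separated by semicolons and "." meaning no annotations:

```perl
use strict;
use warnings;

# Append TAG=value to the INFO column (8th column) of one VCF data line.
sub add_info_tag {
    my ($vcf_line, $tag, $value) = @_;
    chomp $vcf_line;
    my @f = split /\t/, $vcf_line, -1;    # -1 keeps trailing empty fields
    my $entry = "$tag=$value";
    # An INFO of "." means "no annotations", so replace it instead of appending.
    $f[7] = ($f[7] eq '.' || $f[7] eq '') ? $entry : "$f[7];$entry";
    return join("\t", @f);
}

# A made-up VCF data line and a hypothetical tag name.
my $line = join("\t", qw(1 69091 . A C 50 PASS DP=10));
print add_info_tag($line, 'DBNSFP_SIFT', '0.12'), "\n";
```

For a valid VCF, each new tag also needs a matching ##INFO line in the header, and the value must not contain whitespace, semicolons, or equals signs.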

Reply written 5.8 years ago by Vivek

Hi,

I'm trying to use the SnpSift dbNSFP option to annotate a VCF file, and I'm getting:

Fatal error: Tabix index not found for database '~/snpEff_latest_core/snpEff/0B7Ms5xMSFMYlSTY5dDJjcHVRZ3M.gz'

I downloaded the database and the index from the SnpSift manual.

Do you know how to solve this problem?

Reply written 4.1 years ago by jan120

I haven't run into this particular problem, but I haven't been annotating my VCF files with dbNSFP much recently.

Reply written 4.1 years ago by Dan Gaston