Question: The content of the variation (VEP) file from Ensembl
seta (Sweden) wrote, 6 months ago:

Hi everybody,

I'm asking about the variation (VEP) file (homo_sapiens_vep_96_GRCh37.tar.gz) available for download from the Ensembl FTP site. I would like to look at a short sample of the file before downloading it, but I couldn't find one. Could you please share an example or a short sample file if you have one?

Thanks

Tags: human, vep, ensembl

I would like to see the short sample of the file before downloading

These are binary Perl files (serialized with Perl's Storable module), so there is no plain-text sample to view:

$ gunzip -c ensembl/vep/cache/homo_sapiens/75/Y/51000001-52000000_reg.gz | file -
/dev/stdin: perl Storable (v0.7) data (network-ordered) (major 2) (minor 8)
Reply written 6 months ago by Pierre Lindenbaum
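
The same check can be scripted. The following is a minimal sketch, not part of VEP: it assumes that files written by Perl's Storable `store`/`nstore` start with the 4-byte magic `pst0` (the signature that `file` reports as "perl Storable (v0.7)"). The function name `looks_like_storable` is made up for illustration.

```python
import gzip

# Assumption: Storable files written by store()/nstore() begin with "pst0".
STORABLE_MAGIC = b"pst0"


def looks_like_storable(path):
    """Return True if the gzip file at `path` decompresses to Storable data."""
    with gzip.open(path, "rb") as handle:
        # Only the first few bytes are decompressed, so this is cheap even
        # for large cache files.
        return handle.read(len(STORABLE_MAGIC)) == STORABLE_MAGIC
```

A file that passes this check has to be read with Perl (or through VEP itself) rather than opened as text.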
Emily_Ensembl (EMBL-EBI) wrote, 6 months ago:

That's the cache file that VEP uses. It contains all the genes, regulatory features and variants on GRCh37, organised into one folder per chromosome. Each folder holds gzipped files that each cover 1 Mb of either genes or regulatory features, plus gzipped, indexed files of all the variants on that chromosome. If you need it, we recommend installing it as part of your VEP installation rather than downloading it from the FTP site.

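
To get a feel for that layout, the member list of a local copy of the tarball can be read without extracting anything. This is a rough sketch under stated assumptions: region files are taken to be named `START-END.gz` (genes) or `START-END_reg.gz` (regulatory features), following the path shown earlier in the thread, and everything else is counted as "other". The function name `summarise_cache` and the category labels are illustrative, not VEP terminology.

```python
import re
import tarfile
from collections import defaultdict

# Assumed naming: START-END.gz for genes, START-END_reg.gz for regulatory features.
REGION_RE = re.compile(r"^(\d+)-(\d+)(_reg)?\.gz$")


def summarise_cache(tar_path):
    """Count gene, regulatory and other files per parent folder in the archive."""
    counts = defaultdict(lambda: {"genes": 0, "regulatory": 0, "other": 0})
    # "r:*" lets tarfile detect the compression; members are read as a stream.
    with tarfile.open(tar_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            parts = member.name.split("/")
            if len(parts) < 2:
                continue
            # The parent folder is normally the chromosome; files that sit
            # directly under the version folder are grouped under its name.
            folder, filename = parts[-2], parts[-1]
            match = REGION_RE.match(filename)
            if match is None:
                kind = "other"
            elif match.group(3):
                kind = "regulatory"
            else:
                kind = "genes"
            counts[folder][kind] += 1
    return {folder: dict(kinds) for folder, kinds in counts.items()}
```

The result maps each folder name to its three counts, which is enough to confirm the per-chromosome, 1 Mb-per-file structure described above.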

Thank you for the response. So it isn't a simple text file and has to be used together with the VEP tool. Is it possible to annotate about 40 million variants with the web-based VEP, or should we do it locally?

Reply written 6 months ago by seta (modified 6 months ago)

I would recommend doing that locally.

Reply written 6 months ago by Emily_Ensembl

Thanks, Emily. I found another cache file, homo_sapiens_merged_vep_96_GRCh37.tar.gz, which contains both the Ensembl and RefSeq caches, but I couldn't find any description of homo_sapiens_vep_96_GRCh37.tar.gz. Could you please tell me how this file differs from the merged file?

Reply written 6 months ago by seta

The Ensembl file (homo_sapiens_vep_96_GRCh37.tar.gz) contains only the Ensembl/GENCODE genes.

Reply written 6 months ago by Emily_Ensembl
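
For the local run recommended earlier in the thread, the choice of cache mainly changes one flag on the command line. The sketch below only assembles an argument list and does not run anything. The helper `build_vep_command`, the file names and the cache path are hypothetical. The flags are standard VEP options, but they should be checked against the documentation for the installed VEP release.

```python
import shlex


def build_vep_command(input_vcf, output_file, cache_dir, merged=False, forks=4):
    """Build an argument list for an offline, cache-based VEP run on GRCh37."""
    command = [
        "vep",
        "--input_file", input_vcf,
        "--output_file", output_file,
        "--cache",
        "--offline",
        "--dir_cache", cache_dir,
        "--species", "homo_sapiens",
        "--assembly", "GRCh37",
        "--cache_version", "96",
        "--fork", str(forks),
    ]
    if merged:
        # Needed when the merged (Ensembl + RefSeq) cache is installed.
        command.append("--merged")
    return command


if __name__ == "__main__":
    # Hypothetical file names and cache location, for illustration only.
    print(shlex.join(build_vep_command("variants.vcf", "annotated.txt", "/data/vep_cache")))
```

Passing `merged=True` appends `--merged`, which is what VEP needs when it should read the merged cache instead of the Ensembl-only one.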