Question: How can I simply view the .vcf.gz file to get some basic information?
5 weeks ago by
nreid0 wrote:

I'm trying to extract SNPs (a list of rsIDs and positions from gnomAD) from a series of .vcf.gz files for analysis, but I'm not entirely sure where to begin. The readme for the files states that the .vcf.gz files do not contain rsIDs, which I suspect makes adding them the first step I need to complete. Included with the .vcf.gz files are .snpinfo files, which I am only 50% certain contain relevant information. I am aware of vcftools' annotation feature, but I first need to explore the datasets a bit. Is Hail good for this? I am quite new to this space, so pardon the simplicity of my questions. Also, if this has been explained elsewhere, please point me to the right spot or the proper search terms; I've done a fair bit of searching already but couldn't find much at this level.

Thank you.

snp genome • 207 views
modified 5 weeks ago by Smandape • written 5 weeks ago by nreid
5 weeks ago by
Istvan Albert ♦♦ 84k
University Park, USA
Istvan Albert ♦♦ 84k wrote:

Not sure what you mean by "simply view" the VCF file.

The VCF file is in text format (once you unzip it) and it may be read by eye - though this exercise needs a little training as the format is quite complicated.

If you would like to use a graphical interface you may use IGV (and use the same genome that the VCF file was created against) to graphically visualize the file.

If you wish to transform the VCF file into simpler data, perhaps tab-delimited columns that contain only the information that interests you, then bcftools is the way to go.
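The original post does not show the command, so the following is only a minimal sketch of that kind of extraction, assuming bcftools is installed. The file names demo.vcf.gz and demo.tsv and the two records are hypothetical, made up so the example can run on its own. It uses `bcftools query` with a format string to print chromosome, position, ID, and alleles as tab-delimited columns.

```shell
# Hypothetical two-record VCF so the sketch is self-contained; use your own file instead.
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\tPASS\t.\n1\t200\trs123\tC\tT\t.\tPASS\t.\n' | gzip > demo.vcf.gz

# Print one tab-delimited line per record: chromosome, position, ID, REF, ALT.
if command -v bcftools >/dev/null 2>&1; then
    bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT\n' demo.vcf.gz > demo.tsv
    cat demo.tsv
else
    echo "bcftools not found on PATH; install it to run this step" >&2
fi
```

A record without an rsID shows `.` in the ID column, which is a quick way to check whether a file carries rsIDs at all.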

5 weeks ago by
JC11k wrote:

To view the file you can simply use the zmore command: a VCF is just a text table compressed with gzip/bgzip. tabix can also help you extract specific positions.

To annotate your VCFs, check the Variant Effect Predictor (VEP).
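As a rough sketch of extracting positions with tabix: the tool needs a bgzip-compressed, coordinate-sorted file plus an index, so a plain gzip file has to be recompressed with bgzip first. The file name demo_region.vcf, its records, and the region 1:150-250 are hypothetical, and the sketch assumes bgzip and tabix (both shipped with htslib) are installed.

```shell
# Hypothetical two-record VCF, written uncompressed so bgzip can compress it.
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\tPASS\t.\n1\t200\trs123\tC\tT\t.\tPASS\t.\n' > demo_region.vcf

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -f demo_region.vcf                 # writes demo_region.vcf.gz
    tabix -f -p vcf demo_region.vcf.gz       # builds the .tbi index
    # Print only the records on chromosome 1 between positions 150 and 250.
    tabix demo_region.vcf.gz 1:150-250 > demo_region_hits.txt
    cat demo_region_hits.txt
else
    echo "bgzip/tabix not found on PATH; install htslib to run this step" >&2
fi
```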

5 weeks ago by
United States
Smandape60 wrote:

Adding to the above answers, another way is to use basic Unix commands such as zcat, cut, and grep to browse through a few lines of the file.

zcat samplefile.vcf.gz | more

You can also pipe it to cut if you want to look at only a few columns (for example, just the first 8 columns):

zcat samplefile.vcf.gz | cut -f-8
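Building on the same commands, here is a small sketch of how grep can separate the header from the records and give a quick variant count. The file browse_demo.vcf.gz and its two records are hypothetical, created only so the commands can be run as-is. It relies on the VCF convention that meta-information lines start with `##` and the column header line starts with a single `#`.

```shell
# Hypothetical two-record VCF so the commands below have something to read.
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\tPASS\t.\n1\t200\trs123\tC\tT\t.\tPASS\t.\n' | gzip > browse_demo.vcf.gz

# Show only the meta-information lines (they start with "##").
zcat browse_demo.vcf.gz | grep '^##'

# Show the column header and the first records, limited to the first 5 columns.
zcat browse_demo.vcf.gz | grep -v '^##' | cut -f1-5 | head -n 5

# Count the variant records, i.e. every line that does not start with "#".
n_records=$(zcat browse_demo.vcf.gz | grep -vc '^#')
echo "records: $n_records"
```

On large files, the `head` step keeps the output short, since `head` stops reading after the requested number of lines.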