Question: Help with retrieving SNPs and indels from FVB mouse (VCF file)
Johnathan wrote, 6.5 years ago:

Hi everyone!

I am looking at co-occurring events in tumours from transgenic FVB mice. First, I need to filter out SNPs and indels. I was reading the following paper: "Sequencing and characterization of the FVB/NJ mouse genome" by Wong et al. (2012). The research group identified SNPs and indels in FVB using next-generation sequencing. I am trying to retrieve this data, which is stored in a huge VCF file at the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/resources/mouse/genomes/).

The file contains the genetic information of 18 different mouse strains (including FVB). I downloaded tabix and vcftools and added their binaries to my PATH. My problem is that I don't know how to extract the SNP/indel coordinates (and what they are) from this file. I have already spent quite some time reading the documentation and googling, but I am stuck.

Has anyone done something similar? I would greatly appreciate it if someone could give me some hints.

Thank you!

pd3 wrote, 6.5 years ago:

The VCF file is a plain-text file compressed with bgzip and can be uncompressed with standard Unix gzip. The coordinates are relative to the reference genome, as described in the README: ftp://ftp-mouse.sanger.ac.uk/REL-1303-SNPs_Indels-GRCm38/README

There are many ways to extract the information you need, including generic tools like awk and the specialized tools listed by Ashutosh.

Yet another tool is bcftools (http://samtools.github.io/bcftools/):

bcftools view -s FVBNJ file.vcf.gz
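
As a rough illustration of the generic awk route mentioned above, the sketch below pulls one sample's genotype column out of a multi-sample VCF and prints the sites where that sample carries a non-reference allele. The function name `extract_sample_variants` and the sample name used in the usage note are assumptions for illustration; check the actual sample names on the `#CHROM` header line of the file. It relies on GT being the first FORMAT subfield, which the VCF specification requires whenever GT is present.

```shell
# Hypothetical helper: print CHROM, POS, REF, ALT and GT for sites where the
# named sample has at least one non-reference allele.
# Usage: extract_sample_variants SAMPLE < input.vcf
extract_sample_variants() {
  awk -v sample="$1" '
    BEGIN { FS = OFS = "\t" }
    /^##/ { next }                       # skip meta-information lines
    /^#CHROM/ {                          # header: locate the sample column
      for (i = 10; i <= NF; i++) if ($i == sample) col = i
      if (!col) { print "sample not found: " sample > "/dev/stderr"; exit 1 }
      next
    }
    col {
      split($col, fields, ":")           # GT is the first FORMAT subfield
      gt = fields[1]
      if (gt ~ /[1-9]/)                  # any non-reference allele index
        print $1, $2, $4, $5, gt
    }
  '
}
```

Since the file is gzip-compatible, it could be streamed without unpacking it to disk, for example with `gzip -dc file.vcf.gz | extract_sample_variants FVBNJ > FVBNJ.variants.tsv` (file names here are placeholders).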
Ashutosh Pandey (Philadelphia) wrote, 6.5 years ago:

You can use the following tools:

GATK SelectVariants (http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_SelectVariants.html):

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R [reference.fasta] -V [multi_sample.vcf] --sample_name FVBNJ -o [output.vcf]

Or you can try vcftools (http://vcftools.sourceforge.net/index.html):

vcf-subset -c FVBNJ [multi_sample.vcf.gz] | bgzip -c > FVBNJ.vcf.gz

Alex Reynolds (Seattle, WA USA) wrote, 6.5 years ago:

BEDOPS includes vcf2bed, with options to filter on SNVs, insertions and deletions.
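
To make the three categories concrete, here is a rough awk sketch of the kind of split such a filter performs, based only on the lengths of the REF and ALT alleles. This is not vcf2bed's implementation, and the function name `classify_variants` and the output layout are assumptions for illustration; for real work the BEDOPS tool itself is likely the better choice. The sketch looks at the first ALT allele only, skips records without an ALT allele, and reports a 0-based start, as BED expects.

```shell
# Hypothetical helper: label each VCF record as snv, insertion, or deletion
# from the lengths of REF and the first ALT allele, and print BED-style
# columns: chrom, 0-based start, end, label, REF, ALT.
# Usage: classify_variants < input.vcf
classify_variants() {
  awk '
    BEGIN { FS = OFS = "\t" }
    /^#/ { next }                        # skip all header lines
    $5 == "." { next }                   # no ALT allele: nothing to classify
    {
      split($5, alts, ",")               # first ALT allele only (simplification)
      ref = $4; alt = alts[1]
      if (length(ref) == 1 && length(alt) == 1) label = "snv"
      else if (length(alt) > length(ref))       label = "insertion"
      else if (length(ref) > length(alt))       label = "deletion"
      else                                      label = "other"   # e.g. MNPs
      start = $2 - 1                     # VCF is 1-based, BED is 0-based
      print $1, start, start + length(ref), label, ref, alt
    }
  '
}
```

Piping a single-sample VCF through `classify_variants` and then filtering on the fourth column (for example with `awk '$4 == "snv"'`) would give separate SNV, insertion and deletion lists.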
