Help with retrieving SNPs and indels from FVB mouse (VCF file)
10.2 years ago
Johnathan • 0

Hi everyone!

I am looking at co-occurring events in tumours from transgenic FVB mice. First, I need to filter out SNPs and indels. I was reading the following paper: "Sequencing and characterization of the FVB/NJ mouse genome" by Wong et al. (2012). The authors identified SNPs and indels in FVB using next-generation sequencing. I am trying to retrieve this data, which is stored in a huge VCF file at the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/resources/mouse/genomes/).

The file contains the genetic information of 18 different mouse strains (including FVB). I downloaded tabix and vcftools and added their binaries to my PATH. My problem is that I don't know how to extract the SNP/indel coordinates (and what they are) from this file. I have already spent quite some time reading the documentation and googling, but I am stuck.

Has anyone done something similar? I would greatly appreciate it if someone could give me some hints.

Thank you!

10.2 years ago
pd3 ▴ 350

The VCF file is a plain text file compressed with bgzip, and it can be uncompressed with standard Unix gzip. The coordinates are given with respect to the reference genome, as described in the README: ftp://ftp-mouse.sanger.ac.uk/REL-1303-SNPs_Indels-GRCm38/README

There are many ways to extract the information you need, including generic tools like awk or the specialized tools listed by Ashutosh.
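As a rough illustration of the generic awk route, here is a minimal sketch. The function name `extract_sample_variants` is made up for this example, and it assumes that the sample column is named exactly as you pass it in (check the `#CHROM` header line) and that GT is the first FORMAT subfield. It prints only the sites where the chosen sample carries a non-reference allele, with a simple length-based SNP/INDEL label.

```shell
# Hypothetical helper, not part of any toolkit: list the non-reference
# alleles carried by one sample in a gzip/bgzip-compressed VCF.
# Output columns (tab-separated): CHROM, POS, REF, ALT, type, genotype.
extract_sample_variants() {
    sample="$1"   # sample column name, e.g. FVBNJ (assumed; verify in the header)
    vcf="$2"      # path to the compressed VCF
    gzip -dc "$vcf" | awk -F '\t' -v sample="$sample" '
        BEGIN { OFS = "\t" }
        /^##/ { next }                       # skip meta-information lines
        /^#CHROM/ {                          # find the column of the requested sample
            for (i = 10; i <= NF; i++) if ($i == sample) col = i
            if (!col) { print "sample not found: " sample > "/dev/stderr"; exit 1 }
            next
        }
        {
            split($col, fmt, ":")            # fmt[1] is the genotype, e.g. 0/1 or 1|2
            n = split(fmt[1], gt, "[/|]")
            split($5, alt, ",")              # ALT may list several alleles
            split("", seen)                  # reset the per-line allele set
            for (k = 1; k <= n; k++) {
                a = gt[k]
                # ignore missing (.), reference (0) and already reported alleles
                if (a !~ /^[0-9]+$/ || a == 0 || (a in seen)) continue
                seen[a] = 1
                if (length($4) == 1 && length(alt[a]) == 1) type = "SNP"
                else if (length($4) != length(alt[a]))      type = "INDEL"
                else                                        type = "OTHER"
                print $1, $2, $4, alt[a], type, fmt[1]
            }
        }'
}
```

It could be called as `extract_sample_variants FVBNJ file.vcf.gz > FVBNJ_variants.tsv`. The length-based classification is deliberately crude (symbolic or complex alleles end up as INDEL or OTHER), so treat the labels as a first pass rather than a definitive annotation.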

Yet another tool is bcftools (http://samtools.github.io/bcftools/):

bcftools view -s FVBNJ file.vcf.gz
10.2 years ago

You can use GATK SelectVariants:

http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_SelectVariants.html

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R [reference.fasta] -V [multi_sample.vcf] --sample_name FVBNJ -o [FVBNJ.vcf]

Or you can try vcftools (http://vcftools.sourceforge.net/index.html):

vcf-subset -c FVBNJ [multi_sample.vcf.gz] | bgzip -c > FVBNJ.vcf.gz

10.2 years ago

BEDOPS includes vcf2bed, with options to filter on SNVs, insertions and deletions.
