Question: SNP analysis using a software tool
6.6 years ago by
United States
anusha.sunkum0 wrote:

I have variant data obtained from a tool that reports variants with reference to a genome. I tried to analyse them (for example, the effect of their position on the amino acid sequence) using SnpEff, but this tool needs its input in Variant Call Format (VCF), which is obtained from sequencing experiments only. Can you suggest any tool to analyse the SNP data, or explain how SnpEff can be used in this case?

Thank you.

snp • 3.0k views
modified 6.6 years ago by Bert Overduin 3.7k • written 6.6 years ago by anusha.sunkum 0

What's the format that you have now? It's difficult to determine what you actually have given what you've written.

written 6.6 years ago by Devon Ryan 98k

I have it in GFF3 format.

Thanks for your reply.

modified 6.6 years ago • written 6.6 years ago by anusha.sunkum 0

Okay, let me be clear. I already have SNP data with insertions, deletions, and indels in GFF3 format. To feed it into SnpEff and analyse it, I need it in Variant Call Format (VCF), which requires quality, filter, and info fields in the input. In the GFF3 file I am not able to work out what the quality and filter values are.

written 6.6 years ago by anusha.sunkum 0

There's no standard way to represent variants in GFF3, since that's not its purpose. You'll just have to write a short script to convert it to VCF.
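
A minimal sketch of the parsing half of such a script, in Python. Since GFF3 has no standard variant representation, the attribute names used here (`Reference_seq` and `Variant_seq`, borrowed from the GVF convention) are assumptions, as are the function names; replace the keys with whatever your tool actually writes in column 9. The sketch also assumes the GFF3 start coordinate can be used directly as the VCF position, which should be checked for insertions and deletions.

```python
def parse_attributes(column9):
    """Turn a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_line_to_record(line, ref_key="Reference_seq", alt_key="Variant_seq"):
    """Return (chrom, pos, id, ref, alt) for one GFF3 feature line, or None.

    ref_key and alt_key are hypothetical attribute names; adjust to your file.
    """
    if not line.strip() or line.startswith("#"):
        return None  # skip blank lines, comments and directives
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None  # not a well-formed GFF3 feature line
    attrs = parse_attributes(cols[8])
    if ref_key not in attrs or alt_key not in attrs:
        return None  # alleles not found under the assumed keys
    # Column 1 is the sequence name, column 4 the 1-based start coordinate
    return (cols[0], int(cols[3]), attrs.get("ID", "."),
            attrs[ref_key], attrs[alt_key])
```

Lines that do not carry both allele attributes are skipped rather than guessed at, so it is worth counting how many records come back as `None` before trusting the output.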

written 6.6 years ago by Devon Ryan 98k
6.6 years ago by
Bert Overduin3.7k
Edinburgh Genomics, The University of Edinburgh
Bert Overduin3.7k wrote:

The Ensembl Variant Effect Predictor (VEP) takes many different formats as input and is available through a web interface, as a downloadable script and also through the Ensembl REST API.

written 6.6 years ago by Bert Overduin 3.7k

The default input format for the VEP is a simple whitespace-separated format:

  • chromosome - just the name or number, with no 'chr' prefix
  • start
  • end
  • allele - pair of alleles separated by a '/', with the reference allele first
  • strand - defined as + (forward) or - (reverse).
  • identifier (optional) - this identifier will be used in the VEP's output. If not provided, the VEP will construct an identifier from the given coordinates and alleles.
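
The fields listed above could be assembled with a small Python helper like the one below. This is only a sketch: the function name `vep_input_line` is made up for illustration, the 'chr' stripping is deliberately naive, and the caller is expected to supply start and end coordinates that already follow the VEP's conventions.

```python
def vep_input_line(chrom, start, end, ref, alt, strand="+", identifier=None):
    """Format one variant as a line of the VEP's default input format."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # The format expects the bare chromosome name, with no 'chr' prefix
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    # Alleles are written as a pair, reference allele first
    fields = [chrom, str(start), str(end), f"{ref}/{alt}", strand]
    if identifier is not None:
        fields.append(identifier)  # optional sixth column
    return " ".join(fields)


print(vep_input_line("chr1", 881907, 881907, "A", "G"))
# prints: 1 881907 881907 A/G +
```
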
written 6.6 years ago by Bert Overduin 3.7k

Does it work for plants?

written 6.6 years ago by anusha.sunkum 0

For the ones that are in Ensembl Plants, yes. For the VEP, just have a look at the Ensembl Plants Tools page.

written 6.6 years ago by Bert Overduin 3.7k
6.6 years ago by
Pablo1.9k
Canada
Pablo1.9k wrote:

Just transform your format to VCF and feed it into SnpEff.

 

written 6.6 years ago by Pablo 1.9k

Is there any tool to do the transformation, or should I do it manually?

Thank you.

modified 6.6 years ago • written 6.6 years ago by anusha.sunkum 0

Honestly, if I were you, I would just re-run the variant calling phase using a caller that outputs VCF format (which is almost every variant caller I know, since that is the standard).

Keep in mind that GFF3 is certainly not the right format for variants, so there must be something very "special" (possibly wrong?) in your pipeline. If you transform GFF3 to VCF, you can leave some fields empty (such as FILTER, QUAL and even INFO); just use '.' to represent 'no value'.

So the fields in VCF are tab-separated:

CHROM
POS
ID (can be empty, '.')
REF
ALT
QUAL (can be empty, '.')
FILTER (can be empty, '.')
INFO (can be empty, '.')

As you can see, it should be quite easy to get your data into VCF format.
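
As a rough illustration of the layout above, here is a Python sketch that writes a bare-bones VCF with '.' in every optional column. The function name `write_minimal_vcf` and the input tuple layout are assumptions made for this example. One caveat it does check: VCF does not allow an empty REF or ALT, so insertions and deletions must include the reference base before the event, and alleles given as '' or '-' are rejected rather than silently written.

```python
import io

VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def write_minimal_vcf(records, handle):
    """Write (chrom, pos, ref, alt) tuples to handle as a minimal VCF."""
    handle.write("##fileformat=VCFv4.2\n")
    handle.write("\t".join(VCF_COLUMNS) + "\n")
    for chrom, pos, ref, alt in records:
        if ref in ("", "-") or alt in ("", "-"):
            # Indels need the preceding reference base in both REF and ALT
            raise ValueError(f"empty allele at {chrom}:{pos}")
        # ID, QUAL, FILTER and INFO are unknown, so each is written as '.'
        row = [chrom, str(pos), ".", ref, alt, ".", ".", "."]
        handle.write("\t".join(row) + "\n")


# Example: one SNV and one single-base deletion (anchored on the base before it)
buffer = io.StringIO()
write_minimal_vcf([("1", 12345, "A", "G"), ("1", 20000, "CT", "C")], buffer)
print(buffer.getvalue(), end="")
```

Passing an open file instead of the `io.StringIO` buffer writes the same text to disk.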

 

 

 

modified 6.6 years ago • written 6.6 years ago by Pablo 1.9k