Question: How to programmatically run BLAST against Ensembl and obtain GFF annotations for my own FASTA
juanma_lace (Argentina) wrote 4.9 years ago:

Hi,

I'm using Python and I have a portion of the wheat genome in FASTA format (nucleotides).

I want to run BLAST against Ensembl and obtain gene annotations in GFF3 format programmatically, so that I can use them in a custom pipeline.

 

Any ideas?

Thank you in advance.

 

EDITED:

My plan is:

1- Run BLAST with my own sequences against Ensembl

2- Download the GFF files from Ensembl for the genome I'm searching against

3- Use the BLAST results to translate the GFF coordinates onto my own FASTA data
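
A minimal sketch of step 3 only, in Python. It assumes the BLAST results are already available in the default 12-column tabular format (as produced by BLAST+ with `-outfmt 6`), with the user's own sequence as the query and the reference genome as the subject. The names `Hit`, `parse_blast_tabular` and `lift_gff_line` are hypothetical helpers introduced for illustration. The mapping is a simple linear offset, so it ignores gaps inside the alignment and is only approximate for gapped hits.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """The coordinate columns of one BLAST tabular (-outfmt 6) line."""
    qseqid: str
    sseqid: str
    qstart: int
    qend: int
    sstart: int
    send: int


def parse_blast_tabular(lines):
    """Parse default 12-column BLAST tabular lines into Hit objects."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.rstrip("\n").split("\t")
        hits.append(Hit(f[0], f[1], int(f[6]), int(f[7]), int(f[8]), int(f[9])))
    return hits


def lift_gff_line(gff_line, hit):
    """Map one GFF3 feature from subject to query coordinates.

    Returns the rewritten GFF3 line, or None if the feature is on another
    sequence or does not overlap the aligned region. Features that only
    partly overlap are clipped to the aligned region. Assumes an ungapped
    alignment (pure offset arithmetic).
    """
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[0] != hit.sseqid:
        return None
    start, end = int(cols[3]), int(cols[4])
    s_lo, s_hi = sorted((hit.sstart, hit.send))
    start, end = max(start, s_lo), min(end, s_hi)
    if start > end:
        return None
    if hit.sstart <= hit.send:
        # Subject aligned on its plus strand: a plain shift.
        new_start = hit.qstart + (start - hit.sstart)
        new_end = hit.qstart + (end - hit.sstart)
        strand = cols[6]
    else:
        # Subject aligned on its minus strand: coordinates are mirrored
        # and the feature strand flips.
        new_start = hit.qstart + (hit.sstart - end)
        new_end = hit.qstart + (hit.sstart - start)
        strand = {"+": "-", "-": "+"}.get(cols[6], cols[6])
    cols[0], cols[3], cols[4], cols[6] = hit.qseqid, str(new_start), str(new_end), strand
    return "\t".join(cols)
```

To use it, parse the BLAST output once, then pass each non-comment line of the downloaded GFF3 file through `lift_gff_line` for each hit of interest and keep the results that are not `None`.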


Welcome to BioStar! We'd be happy to help, but it looks like you need to take a bit more time to clarify what you're doing. GFF3 is a format that can encode a wide variety of genomic features. From your question, it is not clear which features you are looking for or how BLAST/Ensembl will help you identify them.

(comment by Daniel Standage, 4.9 years ago)

Thanks, I've edited the question.

(comment by juanma_lace, 4.9 years ago)
Daniel Standage (Davis, California, USA) wrote 4.9 years ago:

Identifying genes in eukaryotes is a bit more complicated than simply aligning proteins with BLAST. If you have a particular set of reference proteins or ESTs, you can splice align these (with programs like GeneSeqer and GenomeThreader) to get a good first approximation of gene structure.

You can also use gene predictors like SNAP or AUGUSTUS to predict genes. These do not require any protein or transcript sequences, but they typically produce a substantial number of false positives.
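
As an illustration of the ab initio route from Python, here is a rough sketch that calls AUGUSTUS through `subprocess` and saves its GFF3 output. It assumes the `augustus` executable is installed and on the PATH. `--species` and `--gff3=on` are AUGUSTUS options, but the species identifier to use (for example `wheat`) is an assumption that should be checked against the list printed by `augustus --species=help`. The function names and file names are hypothetical.

```python
import subprocess


def build_augustus_command(fasta_path, species):
    """Return the argument list for an ab initio AUGUSTUS run with GFF3 output."""
    return ["augustus", f"--species={species}", "--gff3=on", fasta_path]


def run_augustus(fasta_path, species, out_path):
    """Run AUGUSTUS and write its standard output (the predictions) to out_path.

    Raises subprocess.CalledProcessError if AUGUSTUS exits with an error.
    """
    cmd = build_augustus_command(fasta_path, species)
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Example call (hypothetical file names; species identifier is an assumption):
# run_augustus("my_region.fasta", "wheat", "my_region.augustus.gff3")
```

Keep in mind the caveat above: predictions made without supporting evidence tend to include false positives, so the resulting GFF3 is a starting point rather than a final annotation.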

Tools like Maker and EVM combine these two approaches (spliced alignment and ab initio gene prediction), and produce much more reliable annotations. However, they are a pain to set up and more complicated to run.

Sorry there is no easy answer to this question!


I see, thank you for your answer. Upvoted.

(comment by juanma_lace, 4.9 years ago)