Question

Annotating the transcripts of a de novo plant transcriptome

8.2 years ago

Biogeek ▴ 470

Hey,

I've been looking around for a methodology for annotating a plant transcriptome. The methods described in papers tend to be simplified, and I haven't found a good, simple guide online, so I've come here with a few simple questions.

What are the best databases for annotating a de novo plant transcriptome, assuming you are running blastx on a standalone server with a customized BLAST database?
When setting up a BLAST database for blastx, what steps are involved? How do I, say, merge or parse several databases into one large custom database, so that blastx only has to be run once?
A broader question: once you have annotations for the transcripts, how do you go about running Pfam and InterPro?

Thanks

plant-transcriptome blastx standalone annotation • 2.0k views


Ram · Answer 1 · 2016-02-25

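Arabidopsis and rice are the best-studied plant models, so you can start from those, or use NCBI nr or RefSeq.
Get all your databases in FASTA format, combine them into one large file (don't mix nucleotides and peptides) and format it for BLAST+. Note that blastx searches a protein database, so for blastx the inputs must be protein FASTA files and the database must be built with -dbtype prot. The example here shows the nucleotide case: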
```
$ cat genes_1.fasta genes_2.fasta genes_3.fasta > all_genes.fasta
$ makeblastdb -in all_genes.fasta -out all_genes -dbtype nucl
```
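For blastx specifically, the same merge step can be sketched with protein files. This is a minimal illustration, not a definitive pipeline: the file names (`proteins_1.fasta`, `proteins_2.fasta`, `transcripts.fasta`), the toy sequences, the e-value cutoff and the thread count are all assumptions to be replaced with your own data and settings.

```shell
# Toy protein FASTA files stand in for real downloads
# (hypothetical names and sequences; replace with e.g. Arabidopsis/rice proteomes).
printf '>prot_A\nMKTAYIAKQR\n' > proteins_1.fasta
printf '>prot_B\nMSDNELLKQW\n>prot_C\nMGSSHHHHHH\n' > proteins_2.fasta

# Merge all protein sets into one file (proteins only, no nucleotides).
cat proteins_1.fasta proteins_2.fasta > all_proteins.fasta

# List any sequence IDs that occur more than once after merging;
# an empty duplicate_ids.txt means the merged file is safe to index.
grep '^>' all_proteins.fasta | cut -d' ' -f1 | sort | uniq -d > duplicate_ids.txt

# Build the protein database and run blastx only if BLAST+ is installed
# and a transcript file (assumed name: transcripts.fasta) is present.
if command -v makeblastdb >/dev/null 2>&1 && [ -s transcripts.fasta ]; then
    makeblastdb -in all_proteins.fasta -out all_proteins -dbtype prot
    blastx -query transcripts.fasta -db all_proteins -evalue 1e-5 \
           -outfmt 6 -max_target_seqs 1 -num_threads 4 -out blastx_hits.tsv
fi
```

Checking for duplicate IDs before indexing is worthwhile because merged sources (for example RefSeq plus a species-specific set) can share identifiers, which makes hits ambiguous. The tabular output (`-outfmt 6`) is convenient for joining hits back to transcript IDs later.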
For Pfam domains, check HMMER + Pfam: http://hmmer.janelia.org/download.html.
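A Pfam search with HMMER might look roughly like the sketch below. It assumes you already have predicted peptides for your transcripts (hmmscan takes protein input, so an ORF-prediction step such as TransDecoder is needed first) and a local copy of `Pfam-A.hmm`; the file names, the variable `pfam_status` and the CPU count are hypothetical placeholders.

```shell
# Assumed inputs (hypothetical names): the Pfam HMM library and predicted peptides.
PFAM_HMM="Pfam-A.hmm"
PEPTIDES="longest_orfs.pep"
OUT="pfam_hits.domtblout"

if command -v hmmscan >/dev/null 2>&1 && [ -f "$PFAM_HMM" ] && [ -f "$PEPTIDES" ]; then
    # hmmpress builds the binary index files hmmscan needs; run it once per library.
    [ -f "$PFAM_HMM.h3i" ] || hmmpress "$PFAM_HMM"
    # --domtblout writes one line per domain hit in a parseable table.
    hmmscan --cpu 4 --domtblout "$OUT" "$PFAM_HMM" "$PEPTIDES" > pfam_scan.log
    pfam_status="done"
else
    echo "hmmscan, $PFAM_HMM or $PEPTIDES not found; skipping Pfam search" >&2
    pfam_status="skipped"
fi
```

The domain table can then be merged with the blastx hits per transcript.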

Also check https://trinotate.github.io/