ORF number count/visualisation
18 months ago
m.halucha • 0

Hey, I have already done automatic genome annotation using RAST. I am looking for a simple visualisation tool, or anything else that can tell me the total number of ORFs in a given dataset or let me extract the descriptions of all ORFs together.

It should be pretty simple to do, but the programs I am using do not count ORFs, and they won't let me copy all ORF descriptions so that I could number them in Word or any other tool. I am using GenBank format.

I am not looking for tools like ORF Finder; the ORFs are already predicted and translated in my dataset.

orf visualisation numbering • 443 views

What format is your result data in?


My data is in GenBank format. Sorry for not mentioning it in the original post.


Circleator looks like it could be of interest for visualization.

DNAPlotter looks like another good option.

18 months ago
Mensur Dlakic ★ 14k

If you have a FASTA file for your proteins, this is a simple way to count proteins:

grep ">" your_file_name.fas | wc -l

To extract annotations and save them in annotations.txt:

grep ">" your_file_name.fas > annotations.txt
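Since the question is about a GenBank file rather than FASTA, the same grep approach can be sketched for that format. This is only a minimal sketch: the file name example.gbk and the tiny sample record are made up for illustration, and it assumes each ORF is represented by a CDS feature with a single-line /product qualifier. Long product descriptions that wrap onto a second line would be cut off at the first line.

```shell
#!/bin/sh
# Sketch: count CDS features and collect product descriptions from a
# GenBank flat file. The file name and the sample record are hypothetical;
# replace example.gbk with your own file and drop the heredoc.
gbk="example.gbk"

cat > "$gbk" <<'EOF'
LOCUS       contig_1                 300 bp    DNA     linear   UNK
FEATURES             Location/Qualifiers
     source          1..300
                     /organism="Example organism"
     CDS             10..99
                     /product="hypothetical protein"
                     /translation="MKV"
     CDS             complement(120..290)
                     /product="DNA polymerase III subunit beta"
                     /translation="MAL"
ORIGIN
//
EOF

# Feature keys start in column 6, so a CDS line begins with five spaces.
cds_count=$(grep -c '^     CDS ' "$gbk")
echo "CDS features: $cds_count"

# Keep only the text inside /product="..." and number the lines with nl.
grep '/product=' "$gbk" | sed 's/.*\/product="//; s/"$//' | nl > annotations.txt
cat annotations.txt
```

The numbered list in annotations.txt can then be pasted into any editor. Note that features other than CDS (for example tRNA or rRNA) may also carry a /product qualifier, so the two numbers need not agree on a real annotation.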

If you want to annotate your genome by metabolic or ontology criteria, KEGG will do the trick. It can give you output that looks like this:

[image: example KEGG output, not preserved]

18 months ago
m.halucha • 0

I solved this problem using SnapGene. I loaded the GenBank sequences into the program, then: choose chromosome -> features -> switch off "full descriptions" -> Ctrl+A.

You will then see the number of selected features. The number of your annotated ORFs is "(X - 1 - Y)/2", where X is the number of features and Y is the number of features that are not genes or CDS (e.g. regulatory regions). To check whether you have features of that type, press the "sort by gene" button.
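As a worked example of that formula, here is the arithmetic with made-up numbers; X and Y below are hypothetical values, not taken from any real dataset. One possible reading of the formula (an assumption, not something stated in this post) is that the subtracted 1 is a single whole-sequence feature and the division by two reflects each ORF being listed as both a gene and a CDS feature.

```shell
#!/bin/sh
# Hypothetical numbers, only to illustrate (X - 1 - Y) / 2.
X=2005   # total number of selected features (made-up value)
Y=4      # features that are neither gene nor CDS (made-up value)

# Shell integer arithmetic: (2005 - 1 - 4) / 2
orfs=$(( (X - 1 - Y) / 2 ))
echo "Annotated ORFs: $orfs"
```

If the result does not come out as a whole number, that is a hint that Y was miscounted or that some ORFs are not listed twice.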

For smaller datasets you can just select all genes manually and the program will count them for you, but that would be impractical with huge datasets.

I chose this route because my files were in GenBank format. For FASTA files I would recommend Mr Dlakic's solution, which is much quicker. Thanks, everyone, for the help.
