Question

ORF number count/visualisation

4.0 years ago

m.halucha • 0

Hey, I have already done automatic genome annotation using RAST. I am looking for a simple visualisation tool, or anything else, that can give me the total number of ORFs in a given dataset or let me extract the descriptions of all ORFs together.

It should be pretty simple to do, but the programs I am using do not count ORFs, and they won't let me copy all the ORF descriptions so that I could number them in Word or any other tool. I am using GenBank format.

I am not looking for tools like ORF FINDER; the ORFs are already predicted and translated in my dataset.

orf visualisation numbering • 1.0k views

What format is your result data in?

4.0 years ago by GenoMax 141k

My format is GenBank. Sorry for not including that in the original post.

4.0 years ago by m.halucha • 0

Circleator looks like it could be of interest for visualization.

DNAPlotter looks like another good option.

4.0 years ago by GenoMax 141k

score 3 · Accepted Answer · 2020-04-16

If you have a FASTA file for your proteins, this is a simple way to count proteins:

grep ">" your_file_name.fas | wc -l

To extract annotations and save them in annotations.txt:

grep ">" your_file_name.fas > annotations.txt
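Since the question's data is in GenBank format rather than FASTA, the same grep approach can be sketched against the feature table. This is only a rough sketch: the file name example.gbk and its two records are made up for illustration, and it assumes the standard GenBank flat-file layout in which a feature key such as CDS starts after exactly five spaces. Product descriptions that wrap onto a second line would be cut off at the line break.

```shell
# Build a tiny, made-up GenBank-style file for illustration only.
cat > example.gbk <<'EOF'
LOCUS       EXAMPLE                  900 bp    DNA     linear   BCT 01-JAN-2020
FEATURES             Location/Qualifiers
     source          1..900
                     /organism="Example organism"
     gene            1..300
                     /locus_tag="EX_0001"
     CDS             1..300
                     /locus_tag="EX_0001"
                     /product="hypothetical protein"
     gene            complement(400..700)
                     /locus_tag="EX_0002"
     CDS             complement(400..700)
                     /locus_tag="EX_0002"
                     /product="DNA polymerase III subunit beta"
ORIGIN
//
EOF

# Count CDS features: the feature key follows exactly five leading spaces.
cds_count=$(grep -c '^     CDS ' example.gbk)
echo "CDS features: $cds_count"

# Collect the product descriptions, one per line, stripping leading spaces.
grep '/product=' example.gbk | sed 's/^ *//' > annotations.txt
cat annotations.txt
```

For a real annotation, replace example.gbk with your own GenBank file and skip the step that creates the sample file.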

If you want to annotate your genome by metabolic or ontology criteria, KEGG will do the trick. It can give you output that looks like this:

[image: example KEGG output]

score 1 · Accepted Answer · 2020-04-17

I solved this problem using SnapGene. I loaded the GenBank sequences into the program, then: choose chromosome -> Features -> switch off "full descriptions" -> Ctrl+A

You will then see the number of selected features. The number of annotated ORFs is "(X - 1 - Y)/2", where X is the number of features and Y is the number of features that are not genes or CDS (e.g. regulatory regions). To check whether you have features of that type, press the "sort by gene" button.
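As a worked example of the formula above, here is the arithmetic with made-up numbers (X and Y below are hypothetical, not taken from any real dataset). The answer does not explain the terms; one plausible reading, which is only an assumption, is that the 1 accounts for the single source feature and the division by 2 reflects each ORF being listed twice, once as a gene and once as a CDS.

```shell
# Hypothetical numbers, for illustration only.
X=4001   # total features reported after selecting all (Ctrl+A)
Y=200    # features that are neither gene nor CDS (e.g. regulatory regions)

# Subtract one feature and the Y non-gene features, then halve the rest.
orfs=$(( (X - 1 - Y) / 2 ))
echo "Annotated ORFs: $orfs"
```

With these values the result is 1900. If the subtraction leaves an odd number, some gene has no matching CDS (or the reverse), and the count should be checked by hand.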

For smaller datasets you can just select all genes manually and the program will count them for you, but that would be impractical with huge datasets.

I chose this route because my files were in GenBank format. For FASTA files I would recommend Mr Dlakic's solution, which is much quicker. Thanks, everyone, for the help.