22 months ago

Hello,

I used the Mikado pipeline to annotate the genome of a nematode species, using assembled transcripts. However, most of the ORFs generated by Mikado are incomplete, lacking either a start or a stop codon. Is there anything I can do to increase the number of complete ORFs? I used the worm.yaml configuration file.

Thanks!

rna-seq genome assembly • 394 views

> How do you check if they have a start or a stop codon?

When I extract the protein sequences of the ORFs with gffread, I see that they don't start with methionine and don't end with the "dot" that represents the stop codon in the gffread FASTA output. I am sure they lack a start and/or stop codon, because I also inspected them in Artemis.

> And are you taking into account if the coordinates are 0-based or 1-based?

Yes, the coordinates are 1-based.
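For anyone who wants to double-check the coordinate question by hand: annotation formats such as GFF3 and GTF use 1-based, inclusive coordinates, while Python slices are 0-based and half-open. Below is a minimal sketch of the conversion, assuming a single-exon CDS on the plus strand and a toy sequence. The function name `cds_terminal_codons` and the example sequence are hypothetical and are not taken from Mikado or gffread.

```python
from typing import Tuple


def cds_terminal_codons(genome: str, start: int, end: int) -> Tuple[str, str]:
    """Return (first_codon, last_codon) of a plus-strand, single-exon CDS.

    `start` and `end` are 1-based and inclusive, as in GFF3/GTF.
    """
    # 1-based inclusive [start, end] corresponds to the
    # 0-based half-open Python slice [start - 1, end).
    cds = genome[start - 1:end].upper()
    return cds[:3], cds[-3:]


# Toy example (hypothetical): the CDS occupies positions 3..11 (1-based).
toy_genome = "ccATGAAATGAcc"
first, last = cds_terminal_codons(toy_genome, 3, 11)
print(first, last)
```

If the same coordinates were mistakenly treated as 0-based, the slice would shift by one base and both terminal codons would look wrong, which is the kind of error this check is meant to rule out.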


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment needs to go under @kristoffer's answer.

SUBMIT ANSWER is for new answers to the original question.

22 months ago

How do you check if they have a start or a stop codon? And are you taking into account if the coordinates are 0-based or 1-based?


> How do you check if they have a start or a stop codon?

When I extract the protein sequences of the ORFs with gffread, I see that they don't start with methionine and don't end with the "dot" that represents the stop codon in the gffread FASTA output. I am sure they lack a start and/or stop codon, because I also inspected them in Artemis.

> And are you taking into account if the coordinates are 0-based or 1-based?

Yes, the coordinates are 1-based.
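The check described above (leading methionine, trailing "dot") can be tallied over a whole protein FASTA rather than by eye. Here is a rough sketch, assuming the proteins were already written to FASTA and that a stop codon appears as a trailing `.` (as described above) or `*` (an assumption, since some tools use that symbol instead). The function names and example records are hypothetical. Note that a leading `M` only shows the ORF begins with a methionine codon, not that it is the true translation start.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def classify_orf(protein: str) -> str:
    """Label a translated ORF by the presence of a start (M) and a stop (. or *)."""
    has_start = protein.startswith("M")
    has_stop = protein.endswith((".", "*"))
    if has_start and has_stop:
        return "complete"
    if has_stop:
        return "missing_start"
    if has_start:
        return "missing_stop"
    return "missing_both"


def summarize(lines: Iterable[str]) -> Counter:
    """Count how many records fall into each completeness class."""
    return Counter(classify_orf(seq) for _, seq in read_fasta(lines))


# Hypothetical records, one per class.
example = """>tx1
MKT.
>tx2
KTL.
>tx3
MKTL
>tx4
KTLA
""".splitlines()
print(summarize(example))
```

To run this on a real file, the lines of an open file handle can be passed to `summarize` in place of `example`.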


Strange - how frequent are stop codons in the middle of the sequences?
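One way to answer that is to count stop symbols that occur anywhere except the last position of each translated sequence. This is a minimal sketch, assuming stops are encoded as `.` or `*` in the protein FASTA. The function names and the example sequences are hypothetical. Frequent internal stops would hint that the CDS is being translated in the wrong frame or phase, though that interpretation is only a guess without looking at the data.

```python
from typing import Iterable


def internal_stop_count(protein: str, stop_chars: str = ".*") -> int:
    """Count stop symbols in a protein, ignoring one terminal stop if present."""
    # Drop a single trailing stop symbol so only premature stops are counted.
    if protein and protein[-1] in stop_chars:
        body = protein[:-1]
    else:
        body = protein
    return sum(body.count(symbol) for symbol in stop_chars)


def fraction_with_internal_stops(proteins: Iterable[str]) -> float:
    """Return the fraction of sequences containing at least one internal stop."""
    proteins = list(proteins)
    if not proteins:
        return 0.0
    flagged = sum(1 for seq in proteins if internal_stop_count(seq) > 0)
    return flagged / len(proteins)


# Hypothetical translated ORFs.
example = ["MKT.", "MK.T.", "KTLA", "M.K.L"]
print([internal_stop_count(seq) for seq in example])
print(fraction_with_internal_stops(example))
```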