Question: Repeat regions (masked as NNNN) found in CDS sequences extracted from evm.out.gff3 with a Perl script
3.5 years ago, Ginsea Chen (Chinese Academy of Tropical Agricultural Sciences, Danzhou, China) wrote:

Dear all

I predicted genes on a genome fragment with EVM, combining the results of ab initio prediction, homologous sequence alignments, and RNA-seq assemblies (Trinity, with and without a genome reference). I then used an in-house Perl script from EVM to extract the CDS sequences of this fragment based on evm.out.gff3, but I found that some CDS sequences contain repeat regions that had been masked as NNNN. My question is: how should I treat these sequences? Should I delete the whole sequence, or only the masked region?

This is my first time doing gene prediction, so I am asking for help here. Any suggestions would be appreciated.

Thanks, all!

Tags: evm, repeats, cds, genome
3.3 years ago, abascalfederico (Spain) wrote:

Some CDSs genuinely overlap repeats (e.g. Alu sequences), so do not delete anything. I guess you are getting NNNN because you are retrieving the sequences from a hard-masked genome file. I would suggest extracting them from an unmasked version of the genome instead.
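To make the suggestion concrete, here is a minimal Python sketch of re-extracting the CDS sequences from the unmasked genome using the same GFF3 coordinates. This is not the EVM Perl script. The function names (`read_fasta`, `extract_cds`, `n_fraction`) are hypothetical, and the sketch assumes that each CDS row in evm.out.gff3 carries a `Parent=` attribute naming its gene model and that all CDS segments of a model lie on one strand of one contig.

```python
from collections import defaultdict

# Translation table for complementing DNA (case is preserved, N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_id: sequence}."""
    chunks = defaultdict(list)
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # ID is the first word of the header
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def extract_cds(gff_lines, genome):
    """Return {parent_id: spliced CDS} for every CDS feature in a GFF3.

    Assumption: CDS rows carry a Parent= attribute (hypothetical layout
    modelled on typical GFF3; check it against your own evm.out.gff3).
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None:
            continue
        # GFF3 coordinates are 1-based and inclusive.
        segments[parent].append((int(cols[3]), int(cols[4]), cols[0], cols[6]))

    cds = {}
    for parent, segs in segments.items():
        segs.sort()  # order segments by genomic start
        seq = "".join(genome[contig][start - 1:end]
                      for start, end, contig, _strand in segs)
        if segs[0][3] == "-":
            # Minus strand: reverse-complement the joined sequence.
            seq = seq.translate(_COMPLEMENT)[::-1]
        cds[parent] = seq
    return cds


def n_fraction(seq):
    """Fraction of a sequence that is N (i.e. hard-masked or unknown)."""
    return seq.upper().count("N") / len(seq) if seq else 0.0
```

With real files you would pass open file handles, e.g. `extract_cds(open("evm.out.gff3"), read_fasta(open("genome.unmasked.fa")))` (the FASTA file name is a placeholder). Running `n_fraction` on the sequences you already extracted is a quick way to see which gene models were affected by the masking.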

 
