Question: start OMA run - file.log
dtejadamartinez wrote (8 months ago):

Hi,

I have a question about the log file that is written when OMA starts to convert the input files.

I downloaded the coding sequences of 30 genomes from different species from Ensembl or NCBI. To eliminate redundant transcripts and isoforms, I first used cd-hit-est and then passed the files through TransDecoder.

When I start running OMA, the .log file shows many of these errors:

WARNING: IUPAC ambiguity characters for DNA/RNA not supported. Will replace them with 'X'

Pre-processing input (DNA)
19099 sequences within 19099 entries considered
Creating file Cache/DB/Balaena_mysticetus.db.map for mapping
Building new Pat index in file Cache/DB/Balaena_mysticetus.db.tree with 27254391 entries
Pat index with 27254391 entries
 sorted, from "A</SEQ></E>\n" to "XXXXXXXXXXXXXXXXXXX"
Reading 44567976 characters from file Cache/DB/Balaenoptera_acutorostrata.db
Pre-processing input (DNA)
20993 sequences within 20993 entries considered
Creating file Cache/DB/Balaenoptera_acutorostrata.db.map for mapping
Building new Pat index in file Cache/DB/Balaenoptera_acutorostrata.db.tree with 37893972 entries
Pat index with 37893972 entries
 sorted, from "A</SEQ></E>\n" to "XXXXXXXXXXXXXXXXXXX"

I would like to know whether these errors can cause problems for a normal OMA run.

Thanks,


Tagging: adrian.altenhoff

written 8 months ago by genomax

adrian.altenhoff (Switzerland) wrote (8 months ago):

Hi,

It seems to me that there is an inconsistency between your input data and the parameters. From this output, it looks like you specified InputDataType := 'DNA'; in the parameters.drw file but provided protein sequences (which would be the sensible input to use). In that case, OMA would convert every amino acid that is not A, T, C, or G to an unknown nucleotide and treat the remaining amino acids as nucleotides. As far as I can understand, the proper setting should be InputDataType := 'AA';
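One way to check which setting matches a given input file is to look at the letters it actually contains. The following is only a minimal sketch, not part of OMA: the function names `read_fasta` and `guess_data_type`, the choice to count U and N as nucleotide letters, and the 0.9 threshold are all assumptions made for illustration.

```python
from typing import Iterator, Tuple

# Assumption: letters counted as nucleotide-like (U and N included on purpose).
NUCLEOTIDES = set("ACGTUN")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def guess_data_type(text: str, threshold: float = 0.9) -> str:
    """Return 'DNA' if at least `threshold` of all residues are
    nucleotide-like letters, otherwise 'AA'. The threshold is a heuristic."""
    total = 0
    nucleotide_like = 0
    for _, seq in read_fasta(text):
        seq = seq.upper().rstrip("*")  # drop a trailing stop-codon marker
        total += len(seq)
        nucleotide_like += sum(1 for c in seq if c in NUCLEOTIDES)
    if total == 0:
        raise ValueError("no sequence data found")
    return "DNA" if nucleotide_like / total >= threshold else "AA"
```

Calling `guess_data_type(open(path).read())` on each genome file and comparing the result with the InputDataType value in parameters.drw would show whether any file disagrees with the setting.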

Cheers Adrian

dtejadamartinez wrote (8 months ago):

Hi, thanks for the answer.

All my sequences are nucleotide sequences, and I have set InputDataType := 'DNA'; so I find this strange.

It only happens with the final files produced by TransDecoder.
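To see exactly which characters in a file would trigger the warning, one could tally every sequence letter other than A, C, G, and T. This is a rough sketch under the assumption that anything outside those four letters is what gets replaced with 'X'; the helper name `non_acgt_counts` is made up for this example.

```python
from collections import Counter


def non_acgt_counts(fasta_text: str) -> Counter:
    """Count every sequence character other than A, C, G, T (case-insensitive).

    Header lines (starting with '>') and blank lines are skipped.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        counts.update(c for c in line.upper() if c not in "ACGT")
    return counts
```

If the tally is dominated by N, R, or Y, the file contains ordinary IUPAC ambiguity codes. Letters such as E, F, I, L, P, or Q are not IUPAC nucleotide codes, so seeing many of them would suggest that the file holds protein sequences.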

Cheers, Daniela
