Question: MaSuRCA, Flye assembly error
26 days ago, milady81 wrote:

Dear Biostars,

I need your help. I am running MaSuRCA on paired-end reads plus Nanopore and PacBio long reads (combined in one file) for a bacterial genome.

DATA

PE= il 75 11 /../../R1.fastq /../../R2.fastq (of course I am using a full path)

NANOPORE= /../../../both_longreads.fastq (of course I am using a full path)

END

Parameters in the config file:

PARAMETERS

EXTEND_JUMP_READS=0

GRAPH_KMER_SIZE = auto

USE_LINKING_MATES = 0

USE_GRID=0

GRID_ENGINE=SGE

GRID_QUEUE=all.q

GRID_BATCH_SIZE=500000000

LHE_COVERAGE=25

MEGA_READS_ONE_PASS=0

LIMIT_JUMP_COVERAGE = 60

CA_PARAMETERS = ovlMerSize=30 cgwErrorRate=0.25 ovlMemory=4GB

CLOSE_GAPS=1

NUM_THREADS = 10

JF_SIZE = 160000000

SOAP_ASSEMBLY=0

FLYE_ASSEMBLY=1

END

The run stops with "Assembly with flye failed". This is what the Flye log shows:

[2019-06-25 20:42:03] root: INFO: Starting Flye 2.4.1-release

[2019-06-25 20:42:03] root: DEBUG: Cmd: /bioappl/src/MaSuRCA/MaSuRCA-3.3.3/bin/../Flye/bin/flye -t 6 --nano-corr mr.41.15.15.0.02.1.fa -g 7566250 --kmer-size 21 -m 2500 -o flye -i 0

[2019-06-25 20:42:03] root: INFO: >>>STAGE: configure

[2019-06-25 20:42:03] root: INFO: Configuring run

[2019-06-25 20:42:04] root: ERROR: Invalid char while reading mr.41.15.15.0.02.1.fa

I have no idea what to do now. I would be very grateful for any help. Dorota


ERROR: Invalid char while reading mr.41.15.15.0.02.1.fa

Looks like you need to check that file. Does it contain anything other than A, C, T and G in the sequence lines?
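If eyeballing a large FASTA file is not practical, a small script can do the check. The following is only a sketch: the function name `find_invalid_chars` and the allowed alphabet (A, C, G, T, N in either case) are assumptions made for illustration, not something taken from the Flye documentation.

```python
# Assumption: only these characters are acceptable in sequence lines.
VALID = set("ACGTNacgtn")

def find_invalid_chars(lines, valid=VALID):
    """Return (line_number, sorted_bad_characters) for each offending sequence line."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # keep "\r" so Windows line endings are reported
        if not text or text.startswith(">"):
            continue  # skip empty lines and FASTA headers
        bad = set(text) - valid
        if bad:
            problems.append((number, sorted(bad)))
    return problems
```

To run it on a real file, open it with `open("mr.41.15.15.0.02.1.fa")` and pass the file object to `find_invalid_chars`; each returned tuple gives a line number to inspect.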

written 26 days ago by genomax

Thank you, Genomax :) You are totally right. However, I have no idea how this ended up in the file:

m54293_190222_151630/43319410/0_37246.33848_2905 ACGGAAGGCGGCCCAGCATCTCGCGGCTTTGCAGCAGTTCCAGCACGGTCTCGCGCCAGTGGTCGGCTCAGTTTGTCGATTCCGTTGAGCGTCATTCCGTCCAGGTTGGCGCGGATCTCGAACCGCATGCCGTCGCCGACCGGATAGGACTTCGGGAAGATGTAGCGGATGATGAATTCCCCGTCGTATTperl: warning: Setting locale failed.perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_CTYPE = "UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system.perl: warning: Falling back to a fallback locale ("en_US.UTF-8").

Every run on the server prints this:

[Tue Jun 25 20:36:28 CEST 2019] Running locally in 1 batch

perl: warning: Setting locale failed.

perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_CTYPE = "UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system.

perl: warning: Falling back to a fallback locale ("en_US.UTF-8").

Should this be fixed by the server administrator, or is there something I can do myself? It is unbelievable that these warnings were appended to almost every sequence in the file mentioned above.

written 26 days ago by milady81

Have you opened or edited any of these files on Windows and then moved them to Linux? If so, it may just be a matter of running dos2unix your_file.fa to fix the line endings.
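If dos2unix is not installed on the server, the same check and conversion can be sketched in a few lines of Python. The function names here are made up for illustration, and the sketch only handles CRLF pairs, not lone carriage returns.

```python
def count_crlf(data: bytes) -> int:
    """Count Windows-style line endings (carriage return + line feed)."""
    return data.count(b"\r\n")

def to_unix_newlines(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")
```

Reading the file in binary mode (`open(path, "rb").read()`) matters here, because text mode would translate the line endings before they can be counted.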

written 26 days ago by genomax

No, I am only using Linux. Thanks, I will fix the lines. However, I do not know whether MaSuRCA will resume from the point where it stopped after I edit the file. Now I know that I really need to fix the warnings, because they are affecting my assembly :) Thank you again, Genomax, a lot :) I would never have thought that those warnings could end up inside the file and disrupt my data and analysis.

written 26 days ago by milady81

The problem is not fixed. The file:

mr.41.15.15.0.02.1.fa

no longer contains any invalid characters, and I am still getting the same error:

ERROR: Invalid char while reading mr.41.15.15.0.02.1.fa

Does anyone have an idea what to do?

written 26 days ago by milady81
Please check that your locale settings: LANG = "en_US.UTF-8" are supported and installed on your system.

It seems that your OS does not support "en_US.UTF-8". As a quick test, run Perl under the C locale and see whether the warning disappears:

LANG=C perl -e exit

You can also try to reconfigure your locales with "dpkg-reconfigure locales" (on Debian/Ubuntu systems; this needs root).

(source: https://stackoverflow.com/questions/2499794/how-to-fix-a-locale-setting-warning-from-perl?page=1&tab=votes#tab-top)
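To see which of the locale variables is the unsupported one, something like the following Python sketch may help. It asks the C library whether each value can be used for LC_CTYPE; the helper names are hypothetical. In the log above LC_CTYPE is set to "UTF-8", which may well be the value the system rejects, but that is only a guess until it is tested.

```python
import locale
import os

def locale_is_supported(name: str) -> bool:
    """Return True if the system accepts `name` as an LC_CTYPE locale."""
    saved = locale.setlocale(locale.LC_CTYPE)  # no second argument: query only
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the previous setting

def check_locale_variables(environ=os.environ):
    """Map each locale variable that is set to (value, supported?)."""
    report = {}
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(variable)
        if value:
            report[variable] = (value, locale_is_supported(value))
    return report
```

Calling `check_locale_variables()` in the same shell session that launches MaSuRCA shows the values that the Perl scripts will see.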

written 26 days ago by Corentin

Hi Corentin, the server admin has actually already fixed this warning:

"Please check that your locale settings: LANG = "en_US.UTF-8" are supported and installed on your system"

Fixing the Perl issue did not change the error from the Flye assembly. It did change the mr.41.15.15.0.02.1.fa file, which used to contain the Perl warnings: now the file contains only sequences.

I do not know what is wrong, because I am still getting:

ERROR: Invalid char while reading mr.41.15.15.0.02.1.fa

written 26 days ago by milady81

Just to get this out of the way: are you using "~" instead of "/home/username/" in your full path? Sometimes "~" is not expanded correctly.
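If in doubt, the paths can be expanded explicitly before they go into the config file. A minimal Python sketch follows; `resolve_path` is just an illustrative name.

```python
import os

def resolve_path(path: str) -> str:
    """Expand a leading "~" and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))
```

Printing `resolve_path("~/reads/R1.fastq")` gives the literal path to paste after `PE=` or `NANOPORE=`, so nothing depends on how the tool treats "~".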

Also, try to read the file with "cat -v mr.41.15.15.0.02.1.fa" and check whether any "^M" characters appear (these are the carriage returns of Windows line endings).
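For reference, the caret notation that "cat -v" prints can be roughly mimicked in Python, which is handy when only part of a file needs to be inspected. This is a sketch of the notation as I understand it, not a reimplementation of GNU cat, and the function name is made up.

```python
def show_nonprinting(data: bytes) -> str:
    """Render bytes roughly the way `cat -v` does (^X and M- notation)."""
    pieces = []
    for byte in data:
        if byte in (0x09, 0x0A):  # tab and newline are passed through unchanged
            pieces.append(chr(byte))
            continue
        prefix = ""
        if byte >= 128:  # high bit set: shown with an "M-" prefix
            prefix = "M-"
            byte -= 128
        if byte < 32:
            pieces.append(prefix + "^" + chr(byte + 64))  # e.g. 13 becomes ^M
        elif byte == 127:
            pieces.append(prefix + "^?")
        else:
            pieces.append(prefix + chr(byte))
    return "".join(pieces)
```

A carriage return at the end of a line then shows up as a visible "^M" just before the line break.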

written 26 days ago by Corentin

As others have mentioned, you should check for anything other than A, T, C and G in the sequence. For example, I noticed that FASTA output from the canu-correct software has a "$" at the end of some sequences, which will generate the error you mentioned.
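If stray characters only ever trail the sequence, as with the "$" case or the appended Perl warnings earlier in this thread, one option is to cut each sequence line at the first unexpected character. The sketch below does that; the allowed alphabet and the name `clean_fasta_lines` are assumptions, and because truncation silently discards text it is safer to write the result to a new file and keep the original.

```python
import re

# Assumption: a sequence line should consist only of these characters.
VALID_RUN = re.compile(r"[ACGTNacgtn]*")

def clean_fasta_lines(lines):
    """Keep headers as they are; cut sequence lines at the first invalid character."""
    cleaned = []
    for line in lines:
        text = line.rstrip("\r\n")
        if text.startswith(">"):
            cleaned.append(text)
            continue
        kept = VALID_RUN.match(text).group(0)  # leading run of valid characters
        if kept:
            cleaned.append(kept)  # lines with no valid prefix are dropped
    return cleaned
```

Fixing whatever writes the stray text in the first place and regenerating the file is still the cleaner route; this is only a stopgap for checking whether Flye then accepts the input.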

written 5 days ago by chushin.koh