Question: Error while running the first step of Trinity
seta (Sweden) wrote, 4.4 years ago:

Hi friends,

I used CLC Genomics Workbench for trimming, then passed its output, the trimmed FASTA file, to Trinity for de novo assembly. However, I encountered the error below. Could you please let me know what is wrong and how to solve it? Is compatibility the issue? Part of the input FASTA file and the Trinity output are shown here:

>D69F08P1:337:C4GGBACXX:1:1101:1193:2153_1:N:0:
TTTAAGTTCTTTACAGTAAGAAACAACATTGCATTTTTCACATCCTCAAGGTCATGTGAG
TGGCTGAATCATTCGTGGCTACTT
>D69F08P1:337:C4GGBACXX:1:1101:1193:2153_2:N:0:
CCACGAATGATTCAGCCACTCACATGACCTTGAGGATGTGAAAAATGCAATGTTGTTTCT
TACTGTAAAGAACTTAAAAGCCGT
>D69F08P1:337:C4GGBACXX:1:1101:1123:2158_1:N:0:
TACCCTGGAAACCGCTGTCATCATGCCAAAACGAGTTAGCGTCCAACTCAGGCAGCACAA
GTGTAGCATTCATGATTCGTGCAG
>D69F08P1:337:C4GGBACXX:1:1101:1123:2158_2:N:0:

I ran Trinity with the command: ./Trinity --seqType fa --JM 180G --single 8_Trimmed.fa --run_as_paired --normalize_reads --SS_lib_type FR --min_contig_length 400 --CPU 6 --full_cleanup

Output of Trinity, including the error:

Paired mode requires bowtie. Found bowtie at: /usr/bin/bowtie

 and bowtie-build at /usr/bin/bowtie-build

-since butterfly will eventually be run, lets test for proper execution of java

Found samtools at: /usr/bin/samtools

 

#######################################

Running Java Tests

Wednesday, June 24, 2015: 13:28:35                CMD: java -Xmx64m -jar /home/jafarinezhad/software/trinityrnaseq_r20140717/util/support_scripts/ExitTester.jar 0

CMD finished (1 seconds)

Wednesday, June 24, 2015: 13:28:36                CMD: java -Xmx64m -jar /home/jafarinezhad/software/trinityrnaseq_r20140717/util/support_scripts/ExitTester.jar 1

-we properly captured the java failure status, as needed.  Looking good.

Java tests succeeded.

###################################

---------------------------------------------------------------

------------ In silico Read Normalization ---------------------

-- (Removing Excess Reads Beyond 50 Coverage --

-- /home/jafarinezhad/software/trinityrnaseq_r20140717/insilico_read_normalization --

---------------------------------------------------------------

 

Wednesday, June 24, 2015: 13:28:36                CMD: /home/jafarinezhad/software/trinityrnaseq_r20140717/util/insilico_read_normalization.pl --seqType fa --JM 180G  --max_cov 50 --CPU 6 --output /home/jafarinezhad/software/trinityrnaseq_r20140717/insilico_read_normalization --SS_lib_type FR  --single /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa

CMD: ln -s /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa single.fa

CMD finished (0 seconds)

CMD: touch single.fa.ok

-------------------------------------------

----------- Jellyfish  --------------------

-- (building a k-mer catalog from reads) --

-------------------------------------------

 

CMD finished (0 seconds)

CMD: /home/jafarinezhad/software/trinityrnaseq_r20140717/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads single.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25  --num_threads 6  > single.fa.K25.stats

-reading Kmer occurences...

 

 done parsing 0 Kmers, 0 added, taking 0 seconds.

STATS_GENERATION_TIME: 3461 seconds.

CMD finished (3461 seconds)

CMD: touch single.fa.K25.stats.ok

-sorting each stats file by read name.

CMD finished (0 seconds)

CMD: sort -k5,5 -T . -S 180G single.fa.K25.stats > single.fa.K25.stats.sort

CMD finished (610 seconds)

CMD: touch single.fa.K25.stats.sort.ok

CMD finished (2 seconds)

CMD: /home/jafarinezhad/software/trinityrnaseq_r20140717/util/..//util/support_scripts//nbkc_normalize.pl single.fa.K25.stats.sort 50 200 > single.fa.K25.stats.sort.C50.pctSD200.accs

326553678 / 326553678 = 100.00% reads selected during normalization.

0 / 326553678 = 0.00% reads discarded as likely aberrant based on coverage profiles.

0 / 326553678 = 0.00% reads missing kmer coverage (N chars included?).

CMD finished (1064 seconds)

CMD: touch single.fa.K25.stats.sort.C50.pctSD200.accs.ok

CMD finished (0 seconds)

Thread 2 terminated abnormally: Error, not all specified records have been retrieved (missing 326553678) from /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa at /home/jafarinezhad/software/trinityrnaseq_r20140717/util/insilico_read_normalization.pl line 521.

Error encountered with thread.

Error, at least one thread died at /home/jafarinezhad/software/trinityrnaseq_r20140717/util/insilico_read_normalization.pl line 419.

Error, cmd: /home/jafarinezhad/software/trinityrnaseq_r20140717/util/insilico_read_normalization.pl --seqType fa --JM 180G  --max_cov 50 --CPU 6 --output /home/jafarinezhad/software/trinityrnaseq_r20140717/insilico_read_normalization --SS_lib_type FR  --single /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa died with ret 6400 at ./Trinity line 1990.

 

Thanks for taking a look at my problem and helping me resolve it.

 

 

modified 4.4 years ago by h.mon • written 4.4 years ago by seta

I am guessing CLC trimmed whole reads and left only the header.

written 4.4 years ago by h.mon
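
One way to test this guess is to count the FASTA records that have a header but no sequence. The following is only a minimal sketch, assuming a plain, uncompressed FASTA file; the function names `read_fasta` and `summarize` are hypothetical helpers written for illustration and are not part of Trinity or CLC.

```python
import sys
from typing import Dict, Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path: str) -> Dict[str, int]:
    """Count all records and the records whose sequence is empty."""
    total = 0
    empty = 0
    for _, seq in read_fasta(path):
        total += 1
        if not seq:
            empty += 1
    return {"records": total, "empty": empty}


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python check_fasta.py 8_Trimmed.fa
    print(summarize(sys.argv[1]))
```

A non-zero `empty` count would support the idea that some reads were trimmed down to nothing. A zero count would point away from it, though it would not rule out other problems with the file.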

I don't think so. I successfully ran a de novo assembly on the same reads using CLC, and now I plan to compare the two assemblies and maybe combine their results. Any ideas?

written 4.4 years ago by seta

CLC may be immune to its own tricks, and Trinity may be more finicky about its input files.

written 4.4 years ago by h.mon

You mean the input format is the issue? If so, how can I compare the two assemblies? Is there any way to convert the file into a format that Trinity accepts?

written 4.4 years ago by seta

I was not sure the input format is the problem; it was just a guess. I just added an answer.

written 4.4 years ago by h.mon
h.mon (Brazil) wrote, 4.4 years ago:

How can you have a paired-read orientation if you are inputting single-end reads?

--SS_lib_type FR  --single /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa
written 4.4 years ago by h.mon

Actually, my raw FASTQ files were PE reads (FR), which CLC read as paired and combined into a single output file. So I used --single --run_as_paired --SS_lib_type FR.

written 4.4 years ago by seta
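
If the combined file turns out to be the obstacle, one option is to split it back into two mate files and pass them to Trinity with --left and --right. The sketch below assumes the headers follow the pattern shown in the question (`..._1:N:0:` and `..._2:N:0:`) and rewrites them to end in /1 and /2. That Trinity of this era wants names ending in /1 and /2 is an assumption here and should be checked against the usage text of your release. `to_slash_name` and `split_mates` are hypothetical names used only for illustration.

```python
import re
from typing import Dict, Iterator, Optional, Tuple

# Assumed header layout: <read id>_<mate>:<rest>, e.g. "...:2153_1:N:0:"
MATE_PATTERN = re.compile(r"^(\S+)_([12]):")


def to_slash_name(header: str) -> Optional[Tuple[str, str]]:
    """Return ("<read id>/<mate>", mate), or None if the pattern is absent."""
    match = MATE_PATTERN.match(header)
    if match is None:
        return None
    return f"{match.group(1)}/{match.group(2)}", match.group(2)


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_mates(in_path: str, left_path: str, right_path: str) -> Dict[str, int]:
    """Write mate 1 records to left_path and mate 2 records to right_path."""
    counts = {"1": 0, "2": 0, "skipped": 0}
    with open(left_path, "w") as left, open(right_path, "w") as right:
        out = {"1": left, "2": right}
        for header, seq in read_fasta(in_path):
            parsed = to_slash_name(header)
            if parsed is None or not seq:
                counts["skipped"] += 1  # unrecognized header or empty read
                continue
            name, mate = parsed
            out[mate].write(f">{name}\n{seq}\n")
            counts[mate] += 1
    return counts
```

Note that this sketch does not re-pair reads: if trimming removed one mate of a pair, the two output files will have different record counts, and the orphaned reads would need to be dropped or handled separately before assembly.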

Is your data strand-specific?

written 4.4 years ago by h.mon

Yes, it's strand-specific.

written 4.4 years ago by seta

I did not see the "--run_as_paired", as I was focusing on the error message, which does not include it; probably normalization does not consider pairing.

Error, cmd: /home/jafarinezhad/software/trinityrnaseq_r20140717/util/insilico_read_normalization.pl --seqType fa --JM 180G  --max_cov 50 --CPU 6 --output /home/jafarinezhad/software/trinityrnaseq_r20140717/insilico_read_normalization --SS_lib_type FR  --single /home/jafarinezhad/software/trinityrnaseq_r20140717/8_Trimmed.fa died with ret 6400 at ./Trinity line 1990.
written 4.4 years ago by h.mon

Thanks for your attempt to solve the problem. I have just started a run without the normalization flag to see what happens; I hope that solves it.

written 4.4 years ago by seta

Did it work without read normalization? I guess I am back to my first suggestion: maybe there is something fishy with your input data. Sorry for the false leads.

written 4.4 years ago by h.mon

You're welcome. Yes, I ran it without read normalization and it is now in the Chrysalis stage. It seems your guess about normalization was right. However, I would like to use normalization to save time and memory. Do you have experience running with and without read normalization on the same reads? Time and memory aside, can it affect the final results?

written 4.4 years ago by seta

You can try normalization outside Trinity, with khmer, for example. I have little experience with normalization; it did indeed save time and memory, at the cost of slightly more fragmented assemblies. Since I can generally use lots of memory, I tend to pre-process only with BBDuk for quality and adapter filtering, and use the remaining reads for assembly, without normalization.

written 4.4 years ago by h.mon
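
For readers unfamiliar with what read normalization does, here is a toy sketch of the median k-mer coverage idea behind digital normalization. It is not the implementation used by khmer or by Trinity's insilico_read_normalization.pl; it keeps exact counts in memory and ignores pairing and reverse complements. The defaults `k=25` and `cutoff=50` merely mirror the K25 and --max_cov 50 values visible in the log above, and the function names are hypothetical.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping k-mers of seq (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads: Iterable[str], k: int = 25, cutoff: int = 50) -> List[str]:
    """Keep a read only while the median count of its k-mers is below cutoff."""
    counts: Counter = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read too short to contain a k-mer
        # Median abundance of this read's k-mers among the reads kept so far.
        if median(counts[w] for w in words) < cutoff:
            kept.append(read)
            counts.update(words)
    return kept
```

Reads from highly expressed transcripts quickly reach the cutoff and are discarded, while reads from rare transcripts are kept, which is where the savings in time and memory come from.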

Thanks, friend. I may go ahead without normalization.

written 4.4 years ago by seta