4 weeks ago
willbrown • 0

Hello,

I am a university student using Trimmomatic for the first time. I put together a command line from the official Trimmomatic manual and some other online resources. I am very new to bioinformatics and especially new to programming, so I would appreciate it if you could refrain from using jargon so that I can follow your points more easily. Thank you.

I am using Trimmomatic-0.36 and running this command on a remote server rather than on my local machine. The command line I have most recently used is the following:

java -jar $trimmomatic PE -phred33 /home/will/240_CTTGTA_L004_R1_001.fastq.gz /home/will/C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:TruSeq-PE.fa:2:30:10

I have also tried to use the command line without including a path to my original read files, seen below:

java -jar $trimmomatic PE -phred33 240_CTTGTA_L004_R1_001.fastq.gz C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:TruSeq-PE.fa:2:30:10


Lastly, I have included the full path to the adapters file inside the Trimmomatic-0.36 directory:

java -jar $trimmomatic PE -phred33 240_CTTGTA_L004_R1_001.fastq.gz C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:/usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa:2:30:10

For all 3 of the inputs I have used, the terminal has returned this error message:

java.io.FileNotFoundException: /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.usadellab.trimmomatic.fasta.FastaParser.parse(FastaParser.java:54)
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:110)
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:71)
at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:32)
at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:536)
at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
Exception in thread "main" java.io.FileNotFoundException: 240_CTTGTA_L004_R1_001.fastq.gz (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539

I have looked online for solutions and attempted to fix my command line, but so far nothing has worked. Any help would be greatly appreciated and would help me massively.
If any more information is needed, please do not hesitate to ask.

Many thanks,

Will.

Order of fastq file names is important for trimmomatic. Make sure that is correct. Does ls -lh /usr/local/bin/Trimmomatic-0.36/ return a result? If the software is there, then perhaps the adapters directory is missing or may not be readable by all accounts, if an admin installed the software. If that is the case, then you should ask your admins to change the read permissions. In the meantime, you can also re-download trimmomatic elsewhere and use the path to the adapters file in your command.

Hi, I entered your command and it did return a result, with the adapters folder present. Thank you for the advice. If I download it myself, will it be possible to run Trimmomatic on the command line from my local drive, while accessing the fastq files on the remote server? I always thought that once I was connected to the server I wouldn't be able to access my local drive contents.

If the adapters folder is present, then is this file /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa not there? Based on the error above, the file is either not there or not readable. You can download the adapter file to your $HOME (or any other directory that you have access to) on the server. At run time you will replace /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa with /path_to/adapter.fa (location on the server) in your command. You can't use data that is present on your local computer from the server.
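For anyone following along, the "is the file there and can I read it" check can be done for every file in one go before starting Trimmomatic. This is only a rough sketch: the function name check_readable is made up for illustration, and the example paths in the comment are simply the ones from the commands earlier in the thread.

```shell
# Report whether each path given as an argument exists and is readable
# by the current account. Returns non-zero if any path fails.
# (check_readable is a hypothetical helper, not part of Trimmomatic.)
check_readable() {
    local status=0
    local path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "MISSING: $path"
            status=1
        elif [ ! -r "$path" ]; then
            echo "NOT READABLE: $path"
            status=1
        else
            echo "OK: $path"
        fi
    done
    return "$status"
}

# Example call, using paths from the commands earlier in the thread:
# check_readable /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa \
#     /home/will/240_CTTGTA_L004_R1_001.fastq.gz
```

A MISSING line usually points to a typo in the name or path, while NOT READABLE points to a permissions problem that an admin would need to fix.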

Hi, I think your initial comment was correct.

I spotted a few minor typos in my original command line, including a missing letter (C) at the start of the first input (read 1) file name: I typed '240_CTTGTA_L004_R1_001.fastq.gz' instead of 'C240_CTTGTA_L004_R1_001.fastq.gz'.

I also found that my TruSeq adapter file name was incomplete, as the TruSeq adapter files are named either TruSeq2 or TruSeq3. I had typed 'TruSeq-PE.fa' instead of 'TruSeq2-PE.fa' or 'TruSeq3-PE.fa'.

After correcting all of this, my latest command line is as follows:

java -jar $trimmomatic PE /home/research_6/C240_CTTGTA_L004_R1_001.fastq.gz /home/research_6/C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.PAIRED.fastq.gz C240_CTTGTA_L004_R1.UNPAIRED.fastq.gz C240_CTTGTA_L004_R2.PAIRED.fastq.gz C240_CTTGTA_L004_R2.UNPAIRED.fastq.gz ILLUMINACLIP:TruSeq2-PE.fa:2:30:10


This seemed to work and generated what I assume to be the output when Trimmomatic actually functions (see below) - however, I am not entirely sure, as I have actually never seen what a proper Trimmomatic output response looks like.

> TrimmomaticPE: Started with arguments:
/home/research_6/C240_CTTGTA_L004_R1_001.fastq.gz /home/research_6/C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.PAIRED.fastq.gz C240_CTTGTA_L004_R1.UNPAIRED.fastq.gz C240_CTTGTA_L004_R2.PAIRED.fastq.gz C240_CTTGTA_L004_R2.UNPAIRED.fastq.gz ILLUMINACLIP:TruSeq2-PE.fa:2:30:10
Using PrefixPair: 'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT'
Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT'
Using Long Clipping Sequence: 'AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG'
Using Long Clipping Sequence: 'TTTTTTTTTTAATGATACGGCGACCACCGAGATCTACAC'
Using Long Clipping Sequence: 'TTTTTTTTTTCAAGCAGAAGACGGCATACGA'
Using Long Clipping Sequence: 'CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT'
Using Long Clipping Sequence: 'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 6 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Exception in thread "main" java.io.FileNotFoundException: C240_CTTGTA_L004_R1.PAIRED.fastq.gz (Permission denied)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)


This showed all parts working, apart from my read 1 file not being accessed (would this also be true for the read 2 file, as they are in the same directory, even though the log didn't state this?). I assume the only thing I can do now is wait for the admin to change my read permissions, as you'd previously mentioned? If you spot any other issues or mistakes, please let me know.

As always, your help is greatly appreciated.

Many thanks,

Will.


We made some progress, but this is not working yet. It looks like you are reading your data from a different directory, but you are not able to write the output to the directory you are running this from (you may not have write permissions there). Put the path to a directory where you are able to write files in front of each output file name.
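As a sketch of what that could look like: the function below takes an output directory, refuses to start if that directory is missing or not writable, and otherwise runs the same paired-end command with every output file placed in that directory. The function name trim_to is hypothetical, and it assumes, as in the thread, that $trimmomatic holds the path to the Trimmomatic jar and that TruSeq2-PE.fa is in the directory you run it from.

```shell
# Run the paired-end command from the thread, writing all outputs into "$1".
# Assumes $trimmomatic holds the path to the Trimmomatic jar (as in the thread).
# (trim_to is a hypothetical wrapper, not part of Trimmomatic.)
trim_to() {
    local outdir="$1"
    local indir="/home/research_6"
    local sample="C240_CTTGTA_L004"

    # Refuse to start if the output directory is missing or not writable.
    if [ ! -d "$outdir" ] || [ ! -w "$outdir" ]; then
        echo "Cannot write to: $outdir" >&2
        return 1
    fi

    java -jar "$trimmomatic" PE \
        "$indir/${sample}_R1_001.fastq.gz" "$indir/${sample}_R2_001.fastq.gz" \
        "$outdir/${sample}_R1.PAIRED.fastq.gz" "$outdir/${sample}_R1.UNPAIRED.fastq.gz" \
        "$outdir/${sample}_R2.PAIRED.fastq.gz" "$outdir/${sample}_R2.UNPAIRED.fastq.gz" \
        ILLUMINACLIP:TruSeq2-PE.fa:2:30:10
}
```

It would be called with a directory you own, for example trim_to "$HOME/trimmed" after creating that directory with mkdir -p "$HOME/trimmed" (the directory name is only an example).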


Hi, thank you for the reply. I tried your path idea and it worked, with no error messages following. However, after I entered the command, my terminal was unresponsive for a long time. Was the software still processing the files? If so, how would you typically know when Trimmomatic has finished properly? I checked the directory after running the command, and all the new output files, alongside a trimlog txt file, were now present.

FYI - I used Control C to escape from the unresponsive terminal, potentially cutting Trimmomatic's processing short. I then went to use FastQC on my new files; however, I got an error message stating that the file it tried to process may be truncated. Again, is this the result of me cutting Trimmomatic short before it ended? No pun intended!

The error log is seen below:

fastqc C240_CTTGTA_L004_R1_paired.fastq.gz
Started analysis of C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 5% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 10% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 15% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 20% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 25% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 30% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 35% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 40% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 45% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 50% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 55% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 60% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 65% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 70% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 75% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 80% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 85% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 90% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Approx 95% complete for C240_CTTGTA_L004_R1_paired.fastq.gz
Failed to process file C240_CTTGTA_L004_R1_paired.fastq.gz
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Ran out of data in the middle of a fastq entry. Your file is probably truncated
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:179)
at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
at java.lang.Thread.run(Thread.java:745)


How big is your data? How many reads do you have? Trimmomatic is fairly efficient and fast, but in the end it all depends on the amount of data.

What is a "long time"? 10 minutes, 1 hour, 10 hours?

If you interrupt a program, the results it produces will be unreliable.
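If the number of reads is not known, it can be counted directly from the compressed file. A minimal sketch, assuming the usual FASTQ layout of four lines per read; the function name count_reads is made up for illustration. On files of many gigabytes this will itself take a while, since the whole file has to be decompressed.

```shell
# Print the number of reads in a gzipped FASTQ file.
# Assumes the standard four-lines-per-read FASTQ layout.
# (count_reads is a hypothetical helper.)
count_reads() {
    local lines
    # Decompress to standard output and count the lines.
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Example call, using an input file name from the thread:
# count_reads /home/research_6/C240_CTTGTA_L004_R1_001.fastq.gz
```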


The two read files are roughly 23GB each. The only other large output file generated in my last attempt was the trimlog file, which was only 1.5GB. My first attempt ran for ~30 minutes, and my second has been running for 2 hours and is still going. I am also using 8 cores via the -threads flag.


It is going to take as long as it takes. You need to be patient and wait until the prompt returns. Depending on the speed of your CPU and disk, things could be slower. It is difficult to provide an estimate.
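One way to make "the prompt has returned" less ambiguous is to wrap the command so that it prints an explicit message when it ends. This is just a sketch with a made-up helper name, run_and_report; it works for any command, not only Trimmomatic.

```shell
# Run the given command, then print whether it finished successfully.
# (run_and_report is a hypothetical helper.)
run_and_report() {
    local status
    if "$@"; then
        status=0
    else
        status=$?
    fi
    if [ "$status" -eq 0 ]; then
        echo "FINISHED OK: $1"
    else
        echo "FAILED (exit $status): $1"
    fi
    return "$status"
}
```

It would be used by putting run_and_report in front of the usual command, i.e. run_and_report java -jar "$trimmomatic" PE followed by the same arguments as before. The output files should only be trusted once the FINISHED OK line has appeared.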


Ran out of data in the middle of a fastq entry. Your file is probably truncated at

That is not good news. It appears that your trimmed data file is incomplete because you interrupted the running trimmomatic process. You should wait until your system prompt returns, which ensures that the process has completed. Please re-run the trimmomatic command and wait for it to complete (your system prompt becomes visible again).
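A quick way to test whether a .gz output file was cut off is gzip's built-in integrity test. A minimal sketch with a hypothetical helper name, gz_is_complete; note that passing this test only shows the compressed file is intact, not that Trimmomatic processed every read.

```shell
# Return 0 if the gzip file is intact, non-zero if it is truncated or corrupt.
# (gz_is_complete is a hypothetical helper around "gzip -t".)
gz_is_complete() {
    gzip -t "$1" 2>/dev/null
}

# Example call, using an output file name from the thread:
# if gz_is_complete C240_CTTGTA_L004_R1_paired.fastq.gz; then
#     echo "file looks complete"
# else
#     echo "file is truncated or corrupt"
# fi
```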


I have been re-running Trimmomatic for a few hours now, but the terminal prompt still has not returned. I included '-threads 8' and so expected it to be somewhat faster. I will let it keep running. How long does it usually take (especially with 8 cores)?

4 weeks ago

The way you run it right now makes tacit assumptions.


Thank you, would this mean I need to add the newly downloaded adapters into the same directory as the Trimmomatic.jar file, or does it not necessarily matter?