Question: Trimmomatic becomes sleeping process
3.1 years ago by
Germany, Mannheim, UMM
marongiu.luigi380 wrote:

Hello,
I am running trimmomatic to remove TruSeq adapters with the following command:

java -jar /usr/bin/trimmomatic.jar PE -threads 16 -phred33 input1.fastq input2.fastq 1_paired 1_unpaired 2_paired 2_unpaired ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true

where the TruSeq_RNA.fa file I created contains:

>PrefixNX/1 
ATCACGAC
>PrefixNX/2 
ACAGTGGT
>PrefixNX/3 
CAGATCCA
>PrefixNX/4 
ACAAACGG
>PrefixNX/5 
ACCCAGCA
>PrefixNX/6 
AACCCCTC
>PrefixNX/7 
CCCAACCT
>PrefixNX/8 
CACCACAC
>PrefixNX/9 
GAAACCCA
>PrefixNX/10 
TGTGACCA
>PrefixNX/11 
AGGGTCAA
>PrefixNX/12 
AGGAGTGG
>A501 
TGAACCTT 
>A501_rc
AAGGTTCA
>A502
TGCTAAGT 
>A502_rc
ACTTAGCA
>A503 
TGTTCTCT 
>A503_rc
AGAGAACA
>A504
TAAGACAC 
>A504_rc
GTGTCTTA
>A505 
CTAATCGA 
>A505_rc
TCGATTAG
>A506 
CTAGAACA 
>A506_rc
TGTTCTAG
>A507 
TAAGTTCC 
>A507_rc
GGAACTTA
>A508 
TAGACCTA 
>A508_rc
TAGGTCTA

I get the expected startup output:

TrimmomaticPE: Started with arguments: -threads 16 -phred33 SRR364001_ni_1.fastq SRR364001_ni_2.fastq ni1_paired ni1_unpaired ni2_paired ni2_unpaired ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true
Using PrefixPair: 'ATCACGAC' and 'ACAGTGGT'
Using Short Clipping Sequence: 'CTAGAACA'
Using Short Clipping Sequence: 'TAAGTTCC'
Using Short Clipping Sequence: 'TAAGACAC'
Using Short Clipping Sequence: 'CTAATCGA'
Using Short Clipping Sequence: 'GGAACTTA'
Using Short Clipping Sequence: 'TAGACCTA'
Using Short Clipping Sequence: 'AGGGTCAA'
Using Short Clipping Sequence: 'TGTGACCA'
Using Short Clipping Sequence: 'TCGATTAG'
Using Short Clipping Sequence: 'TGTTCTAG'
Using Short Clipping Sequence: 'TGCTAAGT'
Using Short Clipping Sequence: 'TGTTCTCT'
Using Short Clipping Sequence: 'ACTTAGCA'
Using Short Clipping Sequence: 'AGGAGTGG'
Using Short Clipping Sequence: 'TGAACCTT'
Using Short Clipping Sequence: 'AGAGAACA'
Using Short Clipping Sequence: 'TAGGTCTA'
Using Short Clipping Sequence: 'AAGGTTCA'
Using Short Clipping Sequence: 'ACCCAGCA'
Using Short Clipping Sequence: 'AACCCCTC'
Using Short Clipping Sequence: 'CAGATCCA'
Using Short Clipping Sequence: 'ACAAACGG'
Using Short Clipping Sequence: 'GAAACCCA'
Using Short Clipping Sequence: 'CCCAACCT'
Using Short Clipping Sequence: 'GTGTCTTA'
Using Short Clipping Sequence: 'CACCACAC'
ILLUMINACLIP: Using 1 prefix pairs, 26 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

but then the process is marked as sleeping in Ubuntu's System Monitor, and after hours nothing has happened.
Could you tell me what I am doing wrong?
Thank you
L

assembly • 1.2k views
written 3.1 years ago by marongiu.luigi380

Are you giving it barcode sequences? Why not just give it the beginning of the TruSeq adapter sequence, which should be the same regardless of barcode (i.e., you'll only need a single sequence)?

written 3.1 years ago by Devon Ryan90k

I got these from the manual that Illumina provides on request (page 16). I named the i7 adapters 'prefix' and kept the others under their own names; I probably should not use the '*_rc' entries because they come from other kits. I am not sure about any of this and am just guessing; I built the file by analogy with the Nextera adapters. An alternative is a file I found online, where the TruSeq adapter is simply GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNN[NN]ATCTCGTATGCCGTCTTCTGCTTG. So shall I just use this sequence?

modified 3.1 years ago • written 3.1 years ago by marongiu.luigi380

You need the part of the adapter sequence that is next to the end of the read when the insert size is shorter than the read length. As you see in the sequence you gave above, there is quite a stretch of adapter sequence before the stretch of Ns for the barcode. If you give Trimmomatic only the barcode sequence, it will not remove any adapter sequence upstream of (to the 5' side of) the barcode.
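
As a minimal sketch of the single-sequence approach discussed above, the adapter file could hold just the stretch that precedes the barcode, taken here from the sequence quoted earlier in the thread. The file name TruSeq_single.fa and the record name "adapter" are hypothetical, and whether this stretch matches a given library kit is an assumption to check against Illumina's documentation.

```shell
# Write a one-record adapter FASTA (file and record names are hypothetical).
# The sequence is the part of the indexed adapter before the NNNNNN barcode,
# as quoted in this thread; confirm it for your own kit.
cat > TruSeq_single.fa <<'EOF'
>adapter
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
EOF

# Sanity check: count the FASTA records in the file.
grep -c '^>' TruSeq_single.fa
```

The resulting file would then be named in the ILLUMINACLIP step in place of TruSeq_RNA.fa.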

The diagrams on the U. Texas at Austin genome center and Tufts Uni webpages should help you understand how the Illumina sequencing constructs work:

https://wikis.utexas.edu/display/GSAF/Illumina+-+all+flavors

https://www.med.unc.edu/pharm/calabreselab/files/tufts-sequencing-primer

written 3.1 years ago by mastal5112.0k

OK, so if I understand correctly, the Illumina Universal adapter is at the 5' end of the insert/sequence of interest (SOI), while the Indexed adapter with the embedded barcode is at the 3' end of the SOI. But how shall I make the TruSeq_RNA.fa file? Can I just use:

adapter
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
or shall I add the whole combination, for instance:
adapter1
GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
and shall I also add the universal adapter:
uni
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
with all the reverse complements?

written 3.1 years ago by marongiu.luigi380

Always try a small test file (e.g. cut a hundred reads out of a file and use that) when you are new to this and running things for the first time. Use the -trimlog <logfile> option to write a log file so that the program's actions can be followed. If the trimmed output files are increasing in size, don't go by the "S" (sleeping) designation in the system monitor.
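
The small-test advice above can be sketched as follows, assuming the usual FASTQ layout of four lines per read, so that the first 100 reads are the first 400 lines. The helper name subset_fastq and the test_*, t*_paired, t*_unpaired and trim.log file names are hypothetical; the Trimmomatic call is left commented out and reuses the paths from the question.

```shell
# Copy the first N reads (4 lines each) of a FASTQ file into a small test file.
# $1 = input FASTQ, $2 = output FASTQ, $3 = number of reads (default 100)
subset_fastq() {
    head -n "$(( ${3:-100} * 4 ))" "$1" > "$2"
}

# Hypothetical usage with the inputs from the question:
# subset_fastq input1.fastq test_1.fastq 100
# subset_fastq input2.fastq test_2.fastq 100
# java -jar /usr/bin/trimmomatic.jar PE -threads 16 -phred33 -trimlog trim.log \
#     test_1.fastq test_2.fastq t1_paired t1_unpaired t2_paired t2_unpaired \
#     ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true
```

Taking the same number of reads from the top of both files keeps the mates in step, provided the two inputs are in the same order. On the full run, listing the output files a few times (for example with ls -l) shows whether they are still growing.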

written 3.1 years ago by genomax67k

Actually it worked before and was even a fast process -- compared to alignments and indexing -- I don't know what happened now.

written 3.1 years ago by marongiu.luigi380

I agree with you, trimmomatic is usually fast compared to alignments or running tophat. When you ran it before and it ran in a shorter time, did you also give it such a long list of sequences in the adapter.fasta file?

written 3.1 years ago by mastal5112.0k

Yes, I did; that's why this run looked odd.

written 3.1 years ago by marongiu.luigi380