Trimmomatic becomes a sleeping process
8.6 years ago

Hello,
I am running trimmomatic to remove TruSeq adapters with the following command:

java -jar /usr/bin/trimmomatic.jar PE -threads 16 -phred33 input1.fastq input2.fastq 1_paired 1_unpaired 2_paired 2_unpaired ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true

where the TruSeq_RNA.fa file I created contains:

>PrefixNX/1 
ATCACGAC
>PrefixNX/2 
ACAGTGGT
>PrefixNX/3 
CAGATCCA
>PrefixNX/4 
ACAAACGG
>PrefixNX/5 
ACCCAGCA
>PrefixNX/6 
AACCCCTC
>PrefixNX/7 
CCCAACCT
>PrefixNX/8 
CACCACAC
>PrefixNX/9 
GAAACCCA
>PrefixNX/10 
TGTGACCA
>PrefixNX/11 
AGGGTCAA
>PrefixNX/12 
AGGAGTGG
>A501 
TGAACCTT 
>A501_rc
AAGGTTCA
>A502
TGCTAAGT 
>A502_rc
ACTTAGCA
>A503 
TGTTCTCT 
>A503_rc
AGAGAACA
>A504
TAAGACAC 
>A504_rc
GTGTCTTA
>A505 
CTAATCGA 
>A505_rc
TCGATTAG
>A506 
CTAGAACA 
>A506_rc
TGTTCTAG
>A507 
TAAGTTCC 
>A507_rc
GGAACTTA
>A508 
TAGACCTA 
>A508_rc
TAGGTCTA

I get the expected startup output:

TrimmomaticPE: Started with arguments: -threads 16 -phred33 SRR364001_ni_1.fastq SRR364001_ni_2.fastq ni1_paired ni1_unpaired ni2_paired ni2_unpaired ILLUMINACLIP:./IlluminaTags/TruSeq_RNA.fa:2:30:10:1:true
Using PrefixPair: 'ATCACGAC' and 'ACAGTGGT'
Using Short Clipping Sequence: 'CTAGAACA'
Using Short Clipping Sequence: 'TAAGTTCC'
Using Short Clipping Sequence: 'TAAGACAC'
Using Short Clipping Sequence: 'CTAATCGA'
Using Short Clipping Sequence: 'GGAACTTA'
Using Short Clipping Sequence: 'TAGACCTA'
Using Short Clipping Sequence: 'AGGGTCAA'
Using Short Clipping Sequence: 'TGTGACCA'
Using Short Clipping Sequence: 'TCGATTAG'
Using Short Clipping Sequence: 'TGTTCTAG'
Using Short Clipping Sequence: 'TGCTAAGT'
Using Short Clipping Sequence: 'TGTTCTCT'
Using Short Clipping Sequence: 'ACTTAGCA'
Using Short Clipping Sequence: 'AGGAGTGG'
Using Short Clipping Sequence: 'TGAACCTT'
Using Short Clipping Sequence: 'AGAGAACA'
Using Short Clipping Sequence: 'TAGGTCTA'
Using Short Clipping Sequence: 'AAGGTTCA'
Using Short Clipping Sequence: 'ACCCAGCA'
Using Short Clipping Sequence: 'AACCCCTC'
Using Short Clipping Sequence: 'CAGATCCA'
Using Short Clipping Sequence: 'ACAAACGG'
Using Short Clipping Sequence: 'GAAACCCA'
Using Short Clipping Sequence: 'CCCAACCT'
Using Short Clipping Sequence: 'GTGTCTTA'
Using Short Clipping Sequence: 'CACCACAC'
ILLUMINACLIP: Using 1 prefix pairs, 26 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

but then the process is marked as sleeping in Ubuntu's System Monitor, and after several hours nothing has happened.
Could you tell me what I am doing wrong?
Thank you
L

Assembly • 2.4k views

Are you giving it barcode sequences? Why not just give it the beginning of the truseq adapter sequence, which should be the same regardless of barcode (i.e., you'll only need a single sequence).
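
A rough sketch of why short barcode entries are unlikely to help, assuming the scoring described in the Trimmomatic manual (each matching base adds log10(4), roughly 0.6, to the alignment score). The function names are made up for illustration and are not part of Trimmomatic.

```python
import math

# Assumption (from the Trimmomatic manual's description of ILLUMINACLIP):
# every matching base contributes log10(4), about 0.6, to the score.
MATCH_SCORE = math.log10(4)


def max_simple_clip_score(adapter_length):
    """Best score a perfect match of this length could reach."""
    return adapter_length * MATCH_SCORE


def min_bases_for_threshold(threshold):
    """Smallest number of perfectly matching bases that reaches the threshold."""
    return math.ceil(threshold / MATCH_SCORE)


# Third numeric field of ILLUMINACLIP:<fasta>:2:30:10 in the question.
simple_clip_threshold = 10

print(max_simple_clip_score(8))                      # about 4.8, below 10
print(min_bases_for_threshold(simple_clip_threshold))
```

If that assumption holds, an 8-base entry cannot reach a simple clip threshold of 10 even with a perfect match, which is one more reason to supply the longer common part of the adapter.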


I got these from the manual provided by Illumina on request (page 16). I named the i7 adapters 'Prefix' and kept the original names for the others; I probably should not use the '*_rc' entries because they come from other kits. I am not sure about any of this, I am just guessing: I built the file by analogy with the Nextera adapters. An alternative is a file I found online, where the TruSeq adapter is simply GATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNN[NN]ATCTCGTATGCCGTCTTCTGCTTG. So shall I just use this sequence?


You need the part of the adapter sequence that sits right after the end of the insert, because that is what gets read when the insert is shorter than the read length. As you can see in the sequence you gave above, there is quite a stretch of adapter sequence before the run of Ns that stands for the barcode. If you give Trimmomatic only the barcode sequence, it will not remove the adapter sequence upstream of (to the 5' side of) the barcode.
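
A toy illustration of that point, using the adapter start quoted in this thread. The insert and the 6-base barcode are made up, and the exact-match search below is only a stand-in; it is not how Trimmomatic aligns adapters.

```python
# Adapter start as quoted earlier in this thread.
ADAPTER_START = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
# Hypothetical 6-base barcode, for illustration only.
BARCODE = "ATCACG"


def simulate_read(insert, read_length):
    """Read through the insert into the adapter, as happens with short inserts."""
    construct = insert + ADAPTER_START + BARCODE
    return construct[:read_length]


def trim_at(read, query, min_overlap):
    """Cut the read at the first exact match to the start of `query`.

    A match may be partial if it runs off the 3' end of the read,
    provided at least `min_overlap` bases overlap.
    """
    for start in range(len(read)):
        segment = read[start:start + len(query)]
        if len(segment) < min_overlap:
            break
        if query.startswith(segment):
            return read[:start]
    return read


insert = "ACGTTGCAAGCTTAGGCTAA"  # made-up 20-base insert

# 50-base read: 20 bases of insert followed by 30 bases of adapter.
read = simulate_read(insert, read_length=50)
trimmed_with_adapter = trim_at(read, ADAPTER_START, min_overlap=8)
trimmed_with_barcode = trim_at(read, BARCODE, min_overlap=len(BARCODE))

# 59-base read: long enough to contain the barcode as well.
long_read = simulate_read(insert, read_length=59)
long_trimmed_with_barcode = trim_at(long_read, BARCODE, min_overlap=len(BARCODE))

print(trimmed_with_adapter == insert)                          # adapter start removes everything after the insert
print(trimmed_with_barcode == read)                            # barcode never seen, nothing trimmed
print(long_trimmed_with_barcode == insert + ADAPTER_START)     # adapter left in place
```

Searching for the adapter start recovers the insert, whereas searching for the barcode alone either finds nothing or leaves the 33 adapter bases attached to the read.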

The diagrams on the U. Texas at Austin genome center and Tufts Uni webpages should help you understand how the Illumina sequencing constructs work:

https://wikis.utexas.edu/display/GSAF/Illumina+-+all+flavors

https://www.med.unc.edu/pharm/calabreselab/files/tufts-sequencing-primer


OK, so if I understand correctly, the Illumina Universal adapter is at the 5' end of the insert/sequence of interest (SOI), while the Indexed adapter with the embedded barcode is at the 3' end of the SOI. But how should I make the TruSeq_RNA.fa file? Can I just use:

>adapter
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
or shall I add the whole combination, for instance:
>adapter1
GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
and shall I also add the universal adapter:
>uni
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
with all the reverse complements?
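
Whichever set of sequences turns out to be right, a file like this could be generated rather than typed by hand. This is a minimal sketch: the entry names and the `_rc` suffix are just labels chosen here, and it does not settle which entries Trimmomatic actually needs.

```python
# Complement table for DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_adapter_fasta(adapters, add_reverse_complements=True):
    """Format a name -> sequence mapping as FASTA text.

    With add_reverse_complements=True, each entry is followed by a
    '<name>_rc' entry holding its reverse complement.
    """
    lines = []
    for name, seq in adapters.items():
        lines.append(">" + name)
        lines.append(seq)
        if add_reverse_complements:
            lines.append(">" + name + "_rc")
            lines.append(reverse_complement(seq))
    return "\n".join(lines) + "\n"


# Sequences copied from this post; the names are placeholders.
adapters = {
    "adapter": "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
    "uni": "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT",
}

fasta_text = build_adapter_fasta(adapters)
print(fasta_text)
```

Trimmomatic distributions normally also ship ready-made adapter files (for example TruSeq3-PE.fa) in an adapters directory, which may be simpler than building a custom one.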


Always try a small test file (e.g. cut a hundred reads out of a file and use that) when you are new to a tool and running it for the first time. Use the -trimlog <logfile> option to write a log file so that the actions of the program can be followed. If the trimmed output files keep growing in size, the job is still working, so don't go by the "S" (sleeping) designation in the system monitor.
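
A minimal sketch of both checks, assuming uncompressed FASTQ with exactly four lines per read. The function names and file names are placeholders, not part of Trimmomatic.

```python
import os
import time
from itertools import islice


def take_first_reads(src_path, dst_path, n_reads=100):
    """Copy the first n_reads records (4 lines each) into a small test file.

    Returns the number of complete reads written.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        lines = list(islice(src, 4 * n_reads))
        dst.writelines(lines)
    return len(lines) // 4


def file_is_growing(path, interval_seconds=30):
    """Return True if the file is larger after waiting interval_seconds."""
    size_before = os.path.getsize(path)
    time.sleep(interval_seconds)
    return os.path.getsize(path) > size_before


# Example usage (placeholder file names):
# take_first_reads("input1.fastq", "test1.fastq", n_reads=100)
# take_first_reads("input2.fastq", "test2.fastq", n_reads=100)
# file_is_growing("1_paired", interval_seconds=30)
```

For paired-end data, take the same number of reads from the start of both files so the mates stay in step, then run the command from the question on the test files with -trimlog added.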


Actually it worked before and was even a fast process -- compared to alignments and indexing -- I don't know what happened now.


I agree with you, trimmomatic is usually fast compared to alignments or running tophat. When you ran it before and it ran in a shorter time, did you also give it such a long list of sequences in the adapter.fasta file?


Yes, I did; that's why it looks odd this time.

