Question: How to determine which adapter to use while trimming?
2.1 years ago by
MAPK1.6k wrote:

Hi all, I have several small RNA-seq datasets (single-end FASTQ files) and I would like to trim the adapters using Trimmomatic, but I am not sure which adapter they contain. Is there a good way to figure out whether they contain NEB-SE, Nextera or TruSeq2 adapters?

Here is the command I am using, but I am not sure whether the adapter file should be the NEB, Nextera or TruSeq one.

java -jar trimmomatic-0.36.jar SE -phred33 /media/owner/SeqL008_001.fastq Trimmed_SeqL008_001.fq.gz ILLUMINACLIP:NEB-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:18

It looks like there is no adapter in my sequences below (they may already be trimmed). Can someone please confirm? I also want to know whether CATGGC is a barcode sequence and whether it needs to be trimmed as well.

@K00363:128:HV3CJBBXX:3:1101:2240:1859 1:N:0:CATGGC
@K00363:128:HV3CJBBXX:3:1101:2706:1859 1:N:0:CATGGC
@K00363:128:HV3CJBBXX:3:1101:2909:1859 1:N:0:CATGGC
@K00363:128:HV3CJBBXX:3:1101:2717:1877 1:N:0:CATGGC
Tags: adapter, fastq, ngs, trimmomatic
2.1 years ago by
United States
genomax87k wrote:

If smRNAseq stands for small RNA-seq data, then there could be kit-specific adapters that you would need to look for (e.g. there are kits that ligate an adapter directly onto the 3'-end of the RNA). For this to work you would need to know the name of the kit, along with the instructions for processing the data, which are included in the kit manual. CATGGC is the Illumina index sequence that was transferred to the FASTQ header during demultiplexing. You don't need to do anything to it.
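
To illustrate where the index sits, here is a minimal Python sketch that splits one of the headers from the question into its fields. It assumes the usual Illumina (CASAVA 1.8+) header layout; the function name parse_illumina_header and the field names are made up for this example and are not part of any tool.

```python
def parse_illumina_header(header: str) -> dict:
    """Split a CASAVA 1.8+ style FASTQ header into named fields.

    Assumed layout (field names are illustrative):
    @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    """
    name, _, comment = header.lstrip("@").partition(" ")
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, filtered, control, index = comment.split(":")
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
        "tile": tile,
        "x": x,
        "y": y,
        "read": read,
        "filtered": filtered,
        "control": control,
        "index": index,  # sample index (barcode) written at demultiplexing
    }


fields = parse_illumina_header(
    "@K00363:128:HV3CJBBXX:3:1101:2240:1859 1:N:0:CATGGC"
)
print(fields["index"])  # CATGGC
```

The index is only in the header line, not in the read sequence itself, which is why nothing has to be trimmed for it.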


Thanks for the helpful answer.

Reply written 2.1 years ago by MAPK
2.1 years ago by
New Zealand
sschmeier80 wrote:
  1. You can use a tool like FastQC to see which sequences are over-represented in your data. That should give you an idea of which adapter you are dealing with. You can then compare the over-represented sequences with the sequences in the Illumina adapter file.

  2. For trimming, you could also extract all adapter sequences from that Illumina adapter file and run your data through fastq-mcf (from ea-utils), which accepts an adapter file and removes the sequences in that file from your reads.

  3. Re-run FastQC afterwards to check that the over-represented sequences are gone from your data.
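
As a rough complement to the FastQC check in the steps above, the following Python sketch counts how many reads contain the start of each candidate adapter. It does not replace FastQC or a real trimmer. The function names, the 12-base seed length and the toy reads are assumptions made for this example, and the adapter sequences are the commonly cited ones, so verify them against your kit manual or adapter FASTA file before relying on them.

```python
from collections import Counter
from itertools import islice


def read_fastq_sequences(path, limit=100000):
    """Yield up to `limit` read sequences from an uncompressed FASTQ file."""
    with open(path) as handle:
        # In a 4-line FASTQ record the sequence is the second line.
        for i, line in enumerate(islice(handle, limit * 4)):
            if i % 4 == 1:
                yield line.strip()


def count_adapter_hits(reads, adapters, seed_len=12):
    """Count reads that contain the first `seed_len` bases of each adapter."""
    hits = Counter()
    for seq in reads:
        for name, adapter in adapters.items():
            if adapter[:seed_len] in seq:
                hits[name] += 1
    return hits


# Candidate 3' adapters (assumed; check against your kit documentation).
adapters = {
    "truseq_small_rna": "TGGAATTCTCGGGTGCCAAGG",
    "neb_small_rna": "AGATCGGAAGAGCACACGTCT",
    "nextera": "CTGTCTCTTATACACATCT",
}

# Toy reads for illustration only; in practice use
# read_fastq_sequences("your_file.fastq") instead.
reads = [
    "TAGCTTATCAGACTGATGTTGACTGGAATTCTCGGGTGCCAAGG",
    "ACGTACGTACGTACGTACGT",
    "TGAGGTAGTAGGTTGTATAGTTAGATCGGAAGAGCACACGTCT",
]

hits = count_adapter_hits(reads, adapters)
for name in adapters:
    print(name, hits[name])
```

If one candidate matches a large fraction of reads and the others match almost none, that is a strong hint about which adapter file to pass to the trimmer. If nothing matches, the reads may already be trimmed or the kit may use a different adapter.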
