Cutadapt for paired-end sequencing reads
7.3 years ago
flsnike ▴ 10

Hi all, I have recently been doing RNA-seq analysis. The FastQC report shows overrepresented sequences (1.7%), but with "No Hit" in the "Possible Source" column. Do I need to trim them off? Since these are paired-end reads, how should I specify the adapter when using cutadapt?

For instance, the sequence I want to trim is "GTGCTCTT", so the command would be something like:

cutadapt -a GTGCTCTT -A AAGAGCAC -o out.1.fastq -p out.2.fastq reads.1.fastq reads.2.fastq

where the sequence following "-A" is the reverse complement of the sequence following "-a".
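A quick way to double-check a reverse complement like the one above, as a minimal sketch assuming the standard `rev` and `tr` utilities are available (the `revcomp` helper name is made up for illustration):

```shell
# Hypothetical helper: print the reverse complement of a DNA sequence.
revcomp() {
    # Reverse the string, then swap each base for its complement.
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

revcomp GTGCTCTT   # prints AAGAGCAC
```

This only confirms that the two strings are reverse complements of each other; it says nothing about whether they are the right adapter sequences to trim.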

Can anyone tell me whether the command above is correct? If not, please show me the correct command.

Thanks in advance!!

7.3 years ago

Those sequences are too short for confident adapter identification. For paired-end reads, the best approach is to use BBDuk as illustrated at the top of this thread. This will use all commonly used Illumina adapter sequences for trimming, as well as overlap detection to trim adapter fragments that are too short to catch via sequence matching. If you believe that your data contains adapters not present in BBMap's adapters.fa file, you can determine the sequence yourself like this:

bbmerge.sh in=r1.fq in2=r2.fq outa=adapters.fa reads=2m

The sequences you mention ARE part of normal TruSeq adapter sequences, by the way - just not at the beginning. So, in addition to being too short, they are probably not the correct sequences to use for good adapter trimming.
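For reference, the BBDuk adapter trimming recommended above is commonly run along the following lines. This is a sketch rather than the exact command from the linked thread: the `trim_adapters` wrapper name, the file names, and the location of adapters.fa are all placeholders to adapt to your own setup.

```shell
# Hypothetical wrapper around bbduk.sh; all file names are placeholders.
trim_adapters() {
    # $1, $2: paired input FASTQ files
    # $3, $4: trimmed output FASTQ files
    # $5:     adapter FASTA (e.g. BBMap's adapters.fa, or the bbmerge.sh output)
    bbduk.sh in1="$1" in2="$2" out1="$3" out2="$4" \
        ref="$5" ktrim=r k=23 mink=11 hdist=1 tpe tbo
}

# Example call (not executed here):
# trim_adapters reads.1.fastq reads.2.fastq out.1.fastq out.2.fastq adapters.fa
```

Here `ktrim=r` trims the matched adapter and everything to its right, `k=23` with `mink=11` allows shorter matches at the read ends, `hdist=1` tolerates one mismatch, `tbo` additionally trims adapters detected by read-pair overlap, and `tpe` trims both reads of a pair to the same length.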
