paired end reads cutadapt
1
0
Entering edit mode
4.2 years ago
anna ▴ 10

I have several paired-end read files. After running FastQC, I found that some pairs have one or more overrepresented sequences. Should I trim these sequences from both files of the pair (1_1 and 1_2), or only from the file in which they are overrepresented (1_2)? Here is an example:

  1. File 1_1.fq: No Overrepresented sequences
  2. File 1_2.fq: two overrepresented sequences:
     AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTT
     TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

I am considering the following command line:

cutadapt -b AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTT \
         -b TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT \
         -B AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTT \
         -B TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT \
         -o out_1.fq \
         -p out_2.fq \
         1_1.fq 1_2.fq
cutadapt overrepresented sequences • 1.8k views

Please use the formatting bar (especially the code option) to present your post better. It is difficult to see what you are presenting.

Thank you!


I formatted OP's code this time. I think we should add splitting up long one-liners to the (upcoming) formatting manual.


How many reads are affected (%)?
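
One rough way to answer this is to count how many reads contain the sequence. The following is a minimal sketch, assuming a plain (uncompressed) FASTQ with four lines per record; the function name `pct_reads_with` is made up for illustration and is not part of cutadapt or FastQC.

```shell
# Print the percentage of reads in a FASTQ file that contain a given sequence.
# Assumption: uncompressed FASTQ, exactly four lines per record.
# pct_reads_with is a hypothetical helper name.
pct_reads_with() {
    seq=$1
    fq=$2
    awk -v s="$seq" '
        # Line 2 of every 4-line record holds the bases.
        NR % 4 == 2 { total++; if (index($0, s)) hits++ }
        END {
            if (total) printf "%.2f\n", 100 * hits / total
            else print "0.00"
        }
    ' "$fq"
}
```

For example, `pct_reads_with AAGCAGTGGTATCAACGCAGAGTAC 1_2.fq` would report the share of reads in 1_2.fq that contain the non-poly-T part of the first overrepresented sequence. This only finds exact matches, so it may underestimate the number of affected reads.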

4.2 years ago

There have been a few threads on this topic already:

In conclusion, I would just remove the standard adapters that are known to CutAdapt from the sequences, filter and trim reads based on length and quality, and then proceed to alignment. My feeling is that trimming and filtering reads mainly affects quality metrics such as percent alignment. Most 'junk' reads, including poly-A and poly-T, will not align anyway.
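
The steps above might look roughly like the sketch below. It rests on several assumptions: the library uses Illumina TruSeq-style adapters (`AGATCGGAAGAGC` is their common prefix), and the cutoffs `-q 20` and `-m 20` are placeholder values rather than recommendations. The wrapper name `trim_pair`, the `DRY_RUN` switch, and the `trimmed_` output prefix are invented for illustration.

```shell
# Trim 3' adapters, quality-trim, and length-filter one read pair with cutadapt.
# Assumptions: TruSeq-style adapter prefix; example cutoffs of 20.
# trim_pair, DRY_RUN and the trimmed_ prefix are hypothetical names.
trim_pair() {
    r1=$1
    r2=$2
    # With DRY_RUN set to a non-empty value the command is printed, not run.
    #   -a / -A : 3' adapter for read 1 / read 2
    #   -q 20   : trim low-quality 3' ends
    #   -m 20   : discard pairs in which a read ends up shorter than 20 nt
    ${DRY_RUN:+echo} cutadapt \
        -a AGATCGGAAGAGC -A AGATCGGAAGAGC \
        -q 20 -m 20 \
        -o "trimmed_$r1" -p "trimmed_$r2" \
        "$r1" "$r2"
}
```

Calling `DRY_RUN=1 trim_pair 1_1.fq 1_2.fq` prints the command for inspection; without `DRY_RUN` it runs cutadapt on the pair, so both files are processed together and the reads stay in sync.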

Kevin


Thanks for your answer. I will consider removing the overrepresented sequences and comparing the "clean" data against the raw data, whose reads are already of very good quality.
