Question: How to remove overrepresented sequences from paired-end reads with cutadapt?
4 weeks ago by
Lila M 370
UK
Lila M 370 wrote:

Hi guys, I have a question. After running FastQC, I discovered that some of my reads contain overrepresented sequences, as follows:

fastq.R1
Sequence    Count   Percentage  Possible Source
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA  141860  0.4972976921930607  TruSeq Adapter, Index 13 (97% over 40bp)
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 55712   0.19530134659142676 No Hit
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTAT  50886   0.17838354973167975 TruSeq Adapter, Index 13 (97% over 40bp)

fastq.R2
Sequence    Count   Percentage  Possible Source
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG  72457   0.2540018249205738  No Hit
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 61284   0.21483428569265142 No Hit

However, the adapter content looks perfect. I would like to remove these adapters or overrepresented sequences. I have never done this with paired-end (PE) data before, so I'm trying to figure it out. At the moment I'm trying:

cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA -a NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTAT -A GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG -A NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -o out.1.fastq -p out.2.fastq R1.fastq R2.fastq

Could anyone with experience tell me if this is right?

Thank you in advance!

modified 4 weeks ago • written 4 weeks ago by Lila M 370

If someone else runs into the same issue, here is some more information. If you plan to map the reads with STAR, for example, you may get an error like EXITING because of FATAL ERROR in reads input: short read sequence line . This can be solved by adding the cutadapt parameter -m N (minimum read length); in my case I chose N based on the minimum sequence length reported by FastQC. I hope this helps!
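As a rough cross-check of the minimum sequence length mentioned above, the shortest read can also be computed directly from a FASTQ file. This is only a sketch: the function name min_read_len is made up for illustration, and it assumes an uncompressed FASTQ with exactly four lines per record.

```shell
# Print the shortest read length found in a FASTQ file.
# Assumes an uncompressed file with exactly 4 lines per record,
# so every sequence sits on line 2, 6, 10, ...
min_read_len() {
    awk 'NR % 4 == 2 {
             n = length($0)
             if (!seen || n < min) { min = n; seen = 1 }
         }
         END { if (seen) print min }' "$1"
}
```

For example, min_read_len out.1.fastq prints a single number. As far as I know, in paired-end mode cutadapt by default drops the whole pair when either mate is shorter than the -m value, so the two output files stay in sync.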

written 4 weeks ago by Lila M 370
4 weeks ago by
glihm 530
France
glihm 530 wrote:

Hello Lila M,

As described in the cutadapt documentation, your approach is correct.

You can specify several adapters by repeating -a/-g, and you can restrict the adapter search to one mate of the pair (-a/-g for R1 and -A/-G for R2).

So, if you try the command you mentioned it should work as you are expecting. ;)
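
One caveat may be worth checking before running this on real data. As far as I know, cutadapt treats N in an adapter sequence as a wildcard by default, so an adapter made only of N could match almost anywhere and trim reads down to nothing; that would also fit the STAR short-read error mentioned in the comment above. A hedged alternative sketch: keep the TruSeq sequence from the FastQC table as the R1 adapter, and handle the N and poly-G reads with cutadapt's --trim-n and --nextseq-trim options instead. The value 20 for both --nextseq-trim and -m is an illustrative assumption, and --nextseq-trim only makes sense if the poly-G reads come from a two-colour instrument, which the thread does not say. The helper name build_cutadapt_cmd is made up, and the snippet only prints the command (dry run) so it can be inspected first.

```shell
# R1 adapter copied from the FastQC table in the question.
R1_ADAPTER=GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTAT

# Dry run: print the command instead of executing it.
# --nextseq-trim=20 : quality-trim 3' ends, treating trailing G as low quality (assumed cutoff)
# --trim-n          : remove N bases from the ends of reads
# -m 20             : drop pairs with a mate shorter than 20 bases (assumed value)
build_cutadapt_cmd() {
    echo cutadapt \
        -a "$R1_ADAPTER" \
        --nextseq-trim=20 \
        --trim-n \
        -m 20 \
        -o out.1.fastq -p out.2.fastq \
        R1.fastq R2.fastq
}

build_cutadapt_cmd
```

To actually run it, copy the printed line into the shell or replace echo with a direct call, then re-run FastQC on the outputs to see whether the overrepresented sequences are gone.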

modified 4 weeks ago • written 4 weeks ago by glihm 530

Thank you very much!

written 4 weeks ago by Lila M 370