Question

Finding Adapters For Illumina Reads

6

10.7 years ago

figo ▴ 220

Hi All

I have some Illumina reads, but I don't know which adapters were used. How can I find out whether my reads contain adapters, and of which type?

Best

fastqc • 18k views

updated 10.0 years ago by Irsan ★ 7.8k • written 10.7 years ago by figo ▴ 220

0

Hi, I have the same problem. How did you resolve it?

10.7 years ago by xiaojuhu13 ▴ 150

score 1 · Answer 1 · 2013-11-28

1

10.7 years ago

Phil S. ▴ 700

Hi,

Import them into FastQC and look at the "Overrepresented sequences" module. If adapters are present, it should state which ones were used.
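
If you prefer the command line to the GUI, here is a rough sketch for pulling that module out of FastQC's text report. It assumes FastQC was run with `--extract`, so that a `fastqc_data.txt` file exists in the extracted report folder. The function name `show_overrepresented` and the paths in the comments are made up for illustration.

```shell
# Print the "Overrepresented sequences" module from a FastQC text report.
# Modules in fastqc_data.txt start with ">>Module name" and end with ">>END_MODULE".
show_overrepresented() {
    sed -n '/^>>Overrepresented sequences/,/^>>END_MODULE/p' "$1"
}

# Example usage (hypothetical paths; the output directory must already exist):
#   mkdir -p fastqc_out
#   fastqc --extract -o fastqc_out read1.fastq
#   show_overrepresented fastqc_out/read1_fastqc/fastqc_data.txt
```

The last column of that table ("Possible Source") is where FastQC names a matching adapter, if it finds one.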

1

That will only happen if there is adapter contamination. Usually that is not the case, as people generally try to sequence fragments longer than the read length.

10.7 years ago by gammyknee ▴ 210

score 0 · Answer 2 · 2013-12-01

FastQC only gives a warning when the overrepresented sequences occur within the first 200,000 sequences of the file; see the FastQC documentation.

Suppose read1.fastq and read2.fastq are the paired-end data, with 4 lines per read.

Download common Illumina adapters from https://github.com/vsbuffalo/scythe/blob/master/illumina_adapters.fa

Go through each adapter, e.g. sample the first and the last 1 million reads of read1.fastq for the truseq-forward-contam adapter:

head -n 4000000 read1.fastq | grep -c AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
tail -n 4000000 read1.fastq | grep -c AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC

If the above commands output more than 100, then sampling 1 million reads of read2.fastq for the truseq-reverse-contam adapter should give a similar number:

head -n 4000000 read2.fastq | grep -c AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
tail -n 4000000 read2.fastq | grep -c AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
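
To avoid typing one grep per adapter, the same check can be looped over every entry of the adapter FASTA. This is only a sketch: the function name `count_adapters` is made up, it looks at the first N reads only (default 1 million), and like the one-liners above it counts exact forward-strand matches, so partial or error-containing adapters are missed. Unlike the one-liners, it greps only the sequence line of each record.

```shell
# Usage: count_adapters adapters.fa reads.fastq [number_of_reads]
# Prints "<hits><TAB><adapter name>" for every adapter in the FASTA file.
count_adapters() {
    adapters_fa=$1
    fastq=$2
    n_reads=${3:-1000000}
    seqs=$(mktemp)
    # A FASTQ record is 4 lines; line 2 of each record is the sequence.
    head -n "$((n_reads * 4))" "$fastq" | awk 'NR % 4 == 2' > "$seqs"
    # Flatten the FASTA to "sequence name" pairs (joins wrapped sequence lines),
    # then count the reads that contain each adapter as a fixed string.
    awk '/^>/ { if (seq != "") print seq, name; name = substr($0, 2); seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print seq, name }' "$adapters_fa" |
    while read -r seq name; do
        hits=$(grep -c -F -- "$seq" "$seqs" || true)
        printf '%s\t%s\n' "$hits" "$name"
    done
    rm -f "$seqs"
}

# Example usage, assuming illumina_adapters.fa was saved from the link above:
#   count_adapters illumina_adapters.fa read1.fastq | sort -rn | head
```

Piping the output through `sort -rn` puts the adapters with the most hits at the top.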

score 0 · Answer 3 · 2014-07-28

0

10.0 years ago

rse ▴ 100

In reference to the answer above, I tried the following commands:

cat read1.fastq | head -4000000 | grep <Adaptersequence> | wc -l 
cat read1.fastq | tail -4000000 | grep <Adaptersequence> | wc -l

on my fastq file and got different output numbers (2731 and 1818 respectively).

What does this signify?

0

It means the adapter sequence was found in 2731 of the first 1 million reads of that file and in 1818 of the last 1 million. The numbers do not have to be identical.
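
To put those counts in perspective, they can be turned into a percentage of the sampled reads. A small sketch, assuming each sample really covered 1 million reads (i.e. the file has at least 4,000,000 lines); the helper name `adapter_pct` is made up.

```shell
# Usage: adapter_pct <hits> <reads_sampled>
adapter_pct() {
    awk -v h="$1" -v n="$2" 'BEGIN { printf "%.2f%%\n", 100 * h / n }'
}

adapter_pct 2731 1000000   # prints 0.27%
adapter_pct 1818 1000000   # prints 0.18%
```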

10.0 years ago by Adrian Pelin ★ 2.6k

score 0 · Answer 4 · 2014-07-28

0

10.0 years ago

Irsan ★ 7.8k

You can find the Illumina adapters on the Illumina website.

0

The link doesn't work.

8.1 years ago by lxu16 • 0