TruSeq Illumina adapter sequences BLAST with high confidence to unrelated genes
15 months ago
e.r.zakiev ▴ 210

A FastQC report shows an overrepresented sequence labelled "TruSeq Adapter", but when I BLAST its nucleotide sequence (GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCGGAGCATCTCGTAT), it aligns with high confidence to, say, coronavirus genes. Why is that? I was expecting the top hit to say something like "Illumina Adapter".

Also, why does the nucleotide sequence that FastQC reports as "TruSeq adapter 18" not match its namesake listed in Illumina's official document? The sequence in the document is not the same as the one in the FastQC report.

I am asking because I wanted to BLAST the overrepresented sequences in my data to see whether they come from ribosomal RNA contamination. With results like this even for the adapter sequences, I am clearly misunderstanding something.

RNAseq adapters Truseq BLAST • 579 views
15 months ago
ATpoint 82k

The sequence you provide matches the beginning of the Illumina adapter listed in the document you link, so I see no problem here. It makes up only 0.27% of reads, so why bother?
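A quick way to check the "beginning of the adapter" claim yourself is to count how many leading bases the overrepresented sequence shares with the adapter. The sketch below is only illustrative: the `adapter_constant` string is an assumption (the constant region that precedes the index bases in TruSeq index adapters) and should be checked against Illumina's document, and `shared_prefix_length` is a hypothetical helper, not part of FastQC or BLAST.

```python
# Overrepresented sequence copied from the question.
overrepresented = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCGGAGCATCTCGTAT"

# ASSUMPTION: constant region of the TruSeq index adapter that precedes the
# index bases. Verify it against Illumina's adapter sequences document.
adapter_constant = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"


def shared_prefix_length(a: str, b: str) -> int:
    """Count identical leading bases of two sequences (case-insensitive)."""
    n = 0
    for x, y in zip(a.upper(), b.upper()):
        if x != y:
            break
        n += 1
    return n


shared = shared_prefix_length(overrepresented, adapter_constant)
print(f"{shared} of {len(adapter_constant)} adapter bases match at the start")
```

If the whole constant region matches, the bases that follow it are where the sample index sits, which would explain why the full sequence does not have to be identical to any single named adapter.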

As for the BLAST hit, I would first check whether the genome assembly it points to contains leftover adapter sequence. But again, this is far less than 1% of reads, so just ignore it and continue with the analysis.
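If you do want to test an assembly for leftover adapter sequence, a rough sketch is to search its FASTA for the adapter on both strands. Everything here is an assumption for illustration: the adapter string is the same constant region as discussed above (to be verified against Illumina's document), `find_adapter_hits` is a hypothetical helper, only exact matches are reported, and the toy FASTA stands in for a real downloaded assembly.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_adapter_hits(fasta_text: str, adapter: str):
    """Return (record, strand, 0-based position) for each exact adapter match."""
    targets = {"+": adapter.upper(), "-": reverse_complement(adapter)}
    records = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif line and name is not None:
            records[name].append(line.upper())
    hits = []
    for name, chunks in records.items():
        seq = "".join(chunks)  # join wrapped FASTA lines before searching
        for strand, target in targets.items():
            pos = seq.find(target)
            while pos != -1:
                hits.append((name, strand, pos))
                pos = seq.find(target, pos + 1)
    return hits


# ASSUMPTION: adapter constant region; verify against Illumina's document.
adapter = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

# Toy assembly: contig1 carries the adapter, contig2 its reverse complement.
toy_fasta = (
    ">contig1\nAAAA" + adapter + "TTTT\n"
    ">contig2\nCCCC" + reverse_complement(adapter) + "GGGG\n"
    ">contig3\nACGTACGT\n"
)

hits = find_adapter_hits(toy_fasta, adapter)
for record, strand, position in hits:
    print(record, strand, position)
```

Any record that shows up in the hit list is a candidate for adapter contamination in the database entry rather than a real biological match.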

