Why does a TruSeq Illumina adapter sequence BLAST with high confidence to unrelated genes?
14 months ago
e.r.zakiev ▴ 200

A FastQC report shows an overrepresented sequence identified as a "TruSeq adapter", but when I BLAST its nucleotide sequence (GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCGGAGCATCTCGTAT), it aligns with high confidence to, for example, coronavirus genes. Why is that? I was expecting the top hit to say something like "Illumina Adapter".

Also, why doesn't the nucleotide sequence reported by FastQC as "TruSeq adapter 18" match its namesake listed in Illumina's official document? The sequence in the document is not the same as the one in the FastQC report.

I am asking because I wanted to BLAST the overrepresented sequences in my data to see whether they come from ribosomal RNA contamination. With results like this even for the adapter sequences, I am clearly misunderstanding something.

RNAseq adapters Truseq BLAST • 556 views
14 months ago
ATpoint 81k

The sequence you provide is the beginning of the Illumina adapter as listed in the document you link, so I see no problem here. It accounts for only 0.27% of reads, so why bother?

As for the BLAST search, I would first check whether the genome assembly it hits contains leftover adapter sequence. But again, this is well under 1% of reads, so just ignore it and continue with the analysis.
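
The "beginning of the adapter" point can be checked directly with a minimal sketch like the one below. It assumes the commonly cited 33-base stem that TruSeq index adapters share before the index bases; `adapter_stem` and the helper `shared_prefix_length` are illustrative names, and the stem should be verified against Illumina's adapter sequences document before relying on it.

```python
# Overrepresented sequence quoted in the question.
overrepresented = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCGGAGCATCTCGTAT"

# ASSUMPTION: stem shared by TruSeq index adapters (the bases before the index).
# Confirm this against Illumina's official adapter sequences document.
adapter_stem = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"


def shared_prefix_length(a: str, b: str) -> int:
    """Return how many leading bases two sequences have in common."""
    n = 0
    for x, y in zip(a.upper(), b.upper()):
        if x != y:
            break
        n += 1
    return n


n = shared_prefix_length(overrepresented, adapter_stem)
print(f"{n} of {len(adapter_stem)} stem bases match at the start of the sequence")
# Whatever follows the stem is the index region plus the downstream adapter bases.
print("bases after the stem:", overrepresented[n:])
```

If the whole stem matches, the sequence is adapter-derived, and any difference from a specific numbered adapter would be confined to the bases printed after the stem.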
