6.0 years ago
jophy63
Why do I find, in the overrepresented sequences reported by FastQC / read QC analysis, adapters with different indexes from the ones I used to build my libraries? To be clearer: I sequenced 2 libraries, one marked with index I1 and the other with index I2. In the FastQC / read QC results, the overrepresented sequences include adapters carrying indexes I4, I7, I14, etc. Is this contamination from other libraries? If so, what is an acceptable threshold for this contamination? JBH
Hopefully this is a harmless side effect of how FastQC scans your data. FastQC may be recognizing the core sequence that all of these adapters share and showing you one representative match. If the data is demultiplexed, there should be only one index per sample file. If you had 2 libraries, then there should be 2 demultiplexed files.
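To make the "shared core" point concrete, here is a toy sketch in Python. It is not how FastQC is implemented, and every sequence in it is made up for illustration (these are not real Illumina adapters). It only shows that a read ending inside the shared core matches every indexed adapter equally well, so whichever index gets reported is arbitrary, while a read that runs through the index matches one adapter best.

```python
# Toy sequences only: hypothetical, NOT real Illumina adapter sequences.
CORE = "ACGTTGCAAGGCTTAACCGT"   # hypothetical core shared by all indexed adapters
TAIL = "TTGACCAGTA"             # hypothetical shared tail after the index
ADAPTERS = {
    "Index 1": CORE + "ATCACG" + TAIL,
    "Index 4": CORE + "TGACCA" + TAIL,
    "Index 7": CORE + "CAGATC" + TAIL,
}


def longest_shared_run(a, b):
    """Length of the longest substring common to a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def adapter_hits(seq, adapters, min_len=20):
    """Return {adapter name: shared run length} for adapters sharing >= min_len bases."""
    hits = {}
    for name, adapter in adapters.items():
        run = longest_shared_run(seq, adapter)
        if run >= min_len:
            hits[name] = run
    return hits


INSERT = "GGATCCTTAGGA"  # hypothetical small RNA insert

# Read that stops inside the shared core: ties across all indexes.
core_only_hits = adapter_hits(INSERT + CORE, ADAPTERS)

# Read that continues through index 4: one adapter clearly matches best.
through_index_hits = adapter_hits(INSERT + CORE + "TGACCA" + TAIL[:4], ADAPTERS)

print(core_only_hits)
print(through_index_hits)
```

In the first case all three adapters tie, which is the situation where a QC tool can only show you a representative name. In the second case the read itself tells you which index was present.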
Maybe post a screenshot?
How to add images to a Biostars post
Thanks for your answer. I am just starting out in small RNA-seq. To begin, I chose to work on data published in Science (Chen Q et al. 2016). I downloaded the fastq files for SRX1457529 and SRX1457530 and simply analyzed them with FastQC to determine the trimming conditions. Unfortunately, there is little information in the materials and methods of the article. They built the libraries with TruSeq for small RNA (Illumina), and the quality review was done by the Beijing Genomics Institute, with no further detail except that the sequencing was done on a HiSeq 2000. Thank you for your help. JBH
For small RNAseq you should definitely scan/trim the data. I recommend using bbduk.sh from the BBMap suite. BBMap includes all Illumina kit adapter sequences in a file called adapters.fa in the resources directory in the software distribution.
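As a rough starting point, the sketch below builds a bbduk.sh adapter-trimming command from Python. The file paths are hypothetical placeholders, and the values for k, mink and hdist are common starting values rather than settings tuned for this dataset, so treat them as assumptions to adjust (small RNA reads are short, so check how many reads survive trimming).

```python
import shlex


def bbduk_trim_command(fastq_in, fastq_out, adapters_fa):
    """Build a bbduk.sh command that trims adapters from the 3' end of reads."""
    return [
        "bbduk.sh",
        f"in={fastq_in}",
        f"out={fastq_out}",
        f"ref={adapters_fa}",  # adapters.fa from the BBMap resources directory
        "ktrim=r",             # trim the matching k-mer and everything to its right
        "k=23",                # k-mer length used for matching (assumed starting value)
        "mink=11",             # allow shorter k-mers at read ends (assumed starting value)
        "hdist=1",             # allow one mismatch per k-mer (assumed starting value)
    ]


# Hypothetical paths: replace with your own files and BBMap install location.
cmd = bbduk_trim_command(
    "reads.fastq",
    "reads.trimmed.fastq",
    "bbmap/resources/adapters.fa",
)

# Print the command so it can be inspected or pasted into a shell.
# To run it from Python instead: subprocess.run(cmd, check=True)
print(shlex.join(cmd))
```

Running FastQC again on the trimmed file should show whether the adapter hits have disappeared from the overrepresented sequences.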