Hello,
I'm new to RNA-Seq analysis and couldn't find an answer to this question elsewhere. I have an overrepresented sequence in one of my FASTQ files (I did paired-end sequencing), from one of five drug-treated samples (five control samples and the same five drug-treated). This sequence makes up 0.107% of total reads, which triggers an amber warning. It was not overrepresented in any of the other control or drug-treated samples, so I'm unsure whether this is contamination or something about this sample in particular (the FastQC report did not identify the source of the sequence; it lists 'No Hit'). I BLASTed the sequence and the hit was an inflammatory mediator, which isn't totally unrelated to what the research question is addressing.
Should I remove this sequence? Or is it likely this is just biological variance in this one sample responding more profoundly to the treatment?
Thanks in advance!
The only over-represented sequences you'd have to remove are adapter dimers (if there are any).
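If you want to see how this sequence compares across all ten samples before deciding, a quick count per file can help. Here is a minimal Python sketch, not a replacement for FastQC: the function names are my own, and it assumes standard four-line FASTQ records and a plain substring match against each read, which is not exactly how FastQC tallies its overrepresented sequences.

```python
import gzip


def read_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of every record holds the bases
            yield line.strip()


def fraction_containing(sequences, query):
    """Return the fraction of reads that contain `query` as a substring."""
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if query in seq:
            hits += 1
    return hits / total if total else 0.0


def fraction_in_file(path, query):
    """Hypothetical helper: open a plain or gzipped FASTQ and compute the fraction."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return fraction_containing(read_sequences(handle), query)
```

Running `fraction_in_file` on each of your FASTQ files with the flagged sequence gives you a per-sample fraction. If the other drug-treated samples show the same sequence at a somewhat lower level, just under the 0.1% warning threshold, that would point towards biology rather than contamination.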