I am working on a set of microRNA RNA-seq data. While checking the data quality with FastQC, we noticed a strange problem: in every sample, a large portion of the reads (roughly 40% to 60%, which comes to about 2-4 million reads) are duplicates of a single sequence. FastQC flags this sequence as a possible PCR primer. We BLASTed the sequence against miRBase (after removing the adapter) but could not find a matching microRNA. My colleagues suggest that this could be biological, but I am not convinced. So my questions are: assuming that FastQC's flagging of this read as a PCR primer is a false positive, is it possible that one microRNA is dominant in all the sequenced samples? And how can we confirm whether this is biological or a problem introduced during sequencing?
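For anyone who wants to inspect this kind of duplication independently of FastQC, here is a rough sketch of counting exact duplicate sequences in a FASTQ file after a naive adapter trim. It is only an illustration, not what we actually ran: the function names, the `min_match` cutoff, and the exact-match trimming are all assumptions, and a dedicated trimmer would handle mismatches and partial adapters much better. It also assumes standard, unwrapped 4-line FASTQ records.

```python
from collections import Counter
from itertools import islice


def read_sequences(lines):
    """Yield the sequence line of each FASTQ record (assumes 4 lines per record)."""
    # In an unwrapped FASTQ file the sequence is the 2nd line of every 4.
    for line in islice(lines, 1, None, 4):
        yield line.strip()


def trim_adapter(seq, adapter, min_match=8):
    """Naive trim: cut the read at the first exact hit of the adapter's
    first `min_match` bases (hypothetical cutoff; no mismatches allowed)."""
    pos = seq.find(adapter[:min_match])
    return seq if pos == -1 else seq[:pos]


def top_sequences(lines, adapter=None, n=5):
    """Return the n most common sequences as (sequence, count, fraction of reads)."""
    counts = Counter()
    for seq in read_sequences(lines):
        if adapter:
            seq = trim_adapter(seq, adapter)
        counts[seq] += 1
    total = sum(counts.values())
    return [(seq, c, c / total) for seq, c in counts.most_common(n)]


if __name__ == "__main__":
    # Toy in-memory records; the adapter here is a made-up placeholder.
    demo = [
        "@r1", "ACGTACGTAATTTTCCCCGG", "+", "IIIIIIIIIIIIIIIIIIII",
        "@r2", "ACGTACGTAATTTTCCCCGG", "+", "IIIIIIIIIIIIIIIIIIII",
        "@r3", "GGGGAAAACCTTTTCCCCGG", "+", "IIIIIIIIIIIIIIIIIIII",
    ]
    for seq, count, frac in top_sequences(demo, adapter="TTTTCCCCGGGG"):
        print(f"{seq or '<empty after trimming>'}\t{count}\t{frac:.1%}")
```

For a real file you would pass an open handle instead of the toy list, e.g. `gzip.open(path, "rt")` for a gzipped FASTQ, together with the 3' adapter from your library kit. If the dominant sequence comes out empty or only a few bases long after trimming, that would point towards something like adapter dimers rather than a real microRNA, whereas a sequence of typical microRNA length that maps to the reference genome would be more consistent with a biological signal.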
We contacted the folks who sequenced our samples (this was done externally) about the problem I mentioned. After some checking (I don't know the details yet), they informed us that it was an error in the library preparation/sequencing step, and agreed to re-sequence our samples. So, thank you all for taking an interest.