I have a few E. coli small RNome samples that I tried to align both to the full E. coli genome and to the E. coli non-coding RNAs only. In both cases, a high percentage of reads (~90%) failed to align.
I BLASTed a few of the unaligned reads, and sometimes I got hits to phages, other E. coli strains and, curiously, Cyprinus carpio (all of them with a query cover of 40-60%). I also noticed that many of the unaligned reads end with some variation of a long run like AAAGGGGGGG, which does not seem to be the case for the reads that did align. This happened in every sample.
Has anyone experienced something similar?
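
In case it helps anyone reproduce the observation, here is a rough sketch of how the tail could be counted and stripped before re-aligning. It assumes the tail is a run of at least three A's followed by at least five G's at the 3' end of the read; those thresholds, the function names and the FASTQ path are my own illustrative choices, not something taken from a specific tool.

```python
import re

# Assumed tail pattern: >= 3 A's followed by >= 5 G's at the 3' end.
# The thresholds are guesses and should be tuned against the real reads.
TAIL_RE = re.compile(r"A{3,}G{5,}$")


def trim_tail(seq):
    """Return (trimmed_sequence, had_tail) for a single read."""
    match = TAIL_RE.search(seq)
    if match is None:
        return seq, False
    return seq[:match.start()], True


def tail_fraction(seqs):
    """Fraction of reads that end with the assumed A/G tail."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    return sum(trim_tail(s)[1] for s in seqs) / len(seqs)


def read_fastq_sequences(path):
    """Yield sequence lines from an uncompressed 4-line-per-record FASTQ."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip().upper()
```

Running `tail_fraction(read_fastq_sequences("unaligned.fastq"))` (hypothetical file name) on the aligned and unaligned reads separately would show whether the tail really is specific to the unaligned set.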