Question: Looking for a database of murine piRNA sequences
23 months ago by
giovanni.birolo0 wrote:

Hello, I have a problem with a differential expression analysis of small non-coding RNAs from sequencing data. I am trying to adapt a pipeline that I use for human small non-coding RNAs to mouse. For human, my set of reference sequences includes piRNA sequences from the piRBase database. The problem is that while piRBase has around 50-60 thousand sequences for human, it has 50 million sequences for mouse.

This makes little sense to me from a biological perspective: why should mouse have so many more piRNAs than human? Consider that rat has around 120 thousand sequences in piRBase, a count much closer to the human one than to the mouse one.

This is also a problem when I align reads against this reference, since BWA appears to use a lot of memory and then crashes. I think the problem is related to the sheer number of sequences in the reference.
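To put numbers on the size of the reference before indexing it, a minimal sketch like the one below could count the records, the distinct sequences and the total bases in a FASTA file. It assumes a plain FASTA export of the database; the function names `read_fasta` and `summarize_reference` are made up for illustration and are not part of piRBase or BWA. It also keeps every distinct sequence in memory, so for tens of millions of records it needs a few GB of RAM.

```python
import io
from collections import Counter


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_reference(handle):
    """Count records, distinct sequences and total bases in a FASTA reference."""
    counts = Counter()
    total_bases = 0
    for _, seq in read_fasta(handle):
        # Normalise case and RNA/DNA alphabet so identical sequences collapse.
        seq = seq.upper().replace("U", "T")
        counts[seq] += 1
        total_bases += len(seq)
    return {
        "records": sum(counts.values()),
        "unique": len(counts),
        "total_bases": total_bases,
    }


# Tiny in-memory example; for a real file use: with open(path) as handle: ...
demo = io.StringIO(">a\nTGAGGTAGTAGG\n>b\nugagguaguagg\n>c\nACGTACGTAC\n")
print(summarize_reference(demo))
```

If the number of distinct sequences is much lower than the number of records, collapsing duplicates before building the index may already shrink the reference considerably.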

Does anyone know anything about this? Should I avoid using piRBase, at least for murine piRNA?

Any help in making sense of this is appreciated, thanks!

rna-seq • 418 views

RNAcentral has ~73K piRNA for mouse. If that helps any.

modified 23 months ago • written 23 months ago by genomax76k

Thanks, that is a possible solution to the technical problem. 73k sequences are much more manageable, but I am still puzzled by the wildly different numbers in these databases...

written 23 months ago by giovanni.birolo0