Question: Looking for a database of murine piRNA sequences
2.4 years ago, giovanni.birolo wrote:

Hello, I have a problem with a differential expression analysis of small non-coding RNAs from sequencing data. I am trying to adapt a pipeline that I use for human small non-coding RNAs to mouse. For human, my set of reference sequences includes piRNA sequences from the piRBase database. The problem is that while piRBase has around 50-60 thousand sequences for human, it has 50 million for mouse.

This makes little sense to me from a biological perspective: why should mouse have so many more piRNAs than human? Consider that rat has around 120 thousand sequences in piRBase, a count much closer to the human one than to the mouse one.

This is also a problem when I perform read alignment against this reference, since BWA appears to use a lot of memory and then crashes. I think the problem is related to the sheer number of sequences in the reference.

Does anyone know anything about this? Should I avoid using piRBase, at least for murine piRNA?

Any help in making sense of this is appreciated, thanks!

rna-seq

RNAcentral has ~73K piRNA sequences for mouse, if that helps.

written 2.4 years ago by genomax

Thanks, that is a possible solution to the technical problem. 73k sequences are much more manageable, but I am still puzzled by the wildly different numbers in these databases...

written 2.4 years ago by giovanni.birolo
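
For anyone comparing these references before building an aligner index, a quick check of how many records and how many total bases each FASTA file holds can help. The following is only a minimal sketch, not part of anyone's pipeline in this thread: the function name `fasta_stats` is made up for illustration, and it assumes the downloads are ordinary FASTA files, plain or gzipped.

```python
import gzip
import sys


def fasta_stats(path):
    """Return (number of records, total bases) for a FASTA file.

    Files whose name ends in .gz are read as gzip; anything else as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    n_records = 0
    n_bases = 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_records += 1  # each header line starts a new record
            else:
                n_bases += len(line)  # sequence lines may wrap; sum them all
    return n_records, n_bases


if __name__ == "__main__":
    # Paths are supplied by the user on the command line.
    for fasta_path in sys.argv[1:]:
        records, bases = fasta_stats(fasta_path)
        print(f"{fasta_path}\t{records} records\t{bases} bases")
```

Run it with one or more FASTA paths as arguments to get the record and base counts side by side. These counts only describe the size of each reference; they do not explain why the databases differ.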