Sorry if this seems very basic. I searched for answers, but I only found posts asking about ID conversion or about downloading sequences for already identified miRNAs.
I am trying to find a way to identify tens of thousands of miRNA sequences in mouse miRNA-seq data. I tried miRBase, which uses BLASTN to identify a microRNA from an input sequence, but it only allows one sequence at a time. It works but is pretty slow. The data look like the lines below: just two columns, the miRNA sequence and the number of reads. Is there a simpler or more efficient method? I looked through the miRBase package on Bioconductor, but I don't think it has a utility for bulk identification, though I am not sure.
AAGCTGCCAGTTGAAGAACTGT 791753
TTCAAGTAATCCAGGATAGGCT 659068
TACCCTGTAGAACCGAATTTGT 595526
TTCACAGTGGCTAAGTTCTG 483846
TTCACAGTGGCTAAGTTCTGC 323892
You can always do this search locally by downloading the mature miRNA sequences from miRBase and then running blat on your own machine. If you would rather use a GUI, use the sequence search at RNAcentral (50 sequences at a time) or the one at miRBase, which uses the same tool.
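If most of your reads are canonical mature sequences, a plain exact-match lookup against the downloaded file may already get you a long way before you need blat. Below is a minimal Python sketch of that idea, not a replacement for an alignment. It assumes the miRBase mature FASTA uses the RNA alphabet (U) with the miRNA name as the first word of each header, that mouse entries start with "mmu-", and that your table is whitespace-separated sequence and count. The function names and the file names in the usage comment are made up for illustration.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (name, sequence) pairs from FASTA lines.

    The name is the first whitespace-delimited word after '>'.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def build_index(fasta_lines, species_prefix="mmu-"):
    """Map DNA-alphabet sequence -> list of miRNA names for one species.

    Assumption: reference sequences use U, reads use T, so U is converted to T.
    Several names can share one mature sequence, hence a list.
    """
    index = defaultdict(list)
    for name, seq in read_fasta(fasta_lines):
        if name.startswith(species_prefix):
            index[seq.upper().replace("U", "T")].append(name)
    return dict(index)


def annotate(read_lines, index):
    """Yield (sequence, count, matching_names) for each 'SEQ COUNT' line."""
    for line in read_lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        seq, count = fields[0].upper(), int(fields[1])
        yield seq, count, index.get(seq, [])


# Usage sketch (hypothetical file names):
# with open("mature.fa") as fa, open("reads.txt") as reads:
#     index = build_index(fa)
#     for seq, count, names in annotate(reads, index):
#         print(seq, count, ",".join(names) or "unmatched", sep="\t")
```

Exact matching will leave length variants unmatched. For example, the last two sequences in the question differ only by a trailing C, so at most one of them can equal a given reference sequence. Whatever comes out as "unmatched" is what you would then pass to blat or another aligner.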
Thanks so much! I feel really dumb, I didn't realize that you can just download the mature miRNA sequence file!
The way I do this analysis is to align the reads to the genome, then pull the Ensembl annotations, including the known/annotated miRNAs (you can use the biomaRt package for this, or even just the Ensembl GTF files), and count reads over them as standard features, just like normal RNA-seq.
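For the GTF route, the annotation step can be sketched in a few lines of Python. This is only an illustration under some assumptions: that the file is a tab-separated Ensembl-style GTF with nine columns, that gene records carry a gene_biotype attribute, and that miRNA genes have the value "miRNA" there. The function name and the file name in the usage comment are hypothetical, and the counting itself would still be done with whatever read counter you normally use for RNA-seq.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def mirna_genes(gtf_lines):
    """Yield one dict per gene record whose gene_biotype is 'miRNA'.

    Assumes Ensembl-style GTF: 9 tab-separated columns, attributes in column 9.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        if attrs.get("gene_biotype") != "miRNA":
            continue
        yield {
            "chrom": cols[0],
            "start": int(cols[3]),  # GTF coordinates are 1-based, inclusive
            "end": int(cols[4]),
            "strand": cols[6],
            "gene_id": attrs.get("gene_id"),
            "gene_name": attrs.get("gene_name"),
        }


# Usage sketch (hypothetical file name):
# with open("annotation.gtf") as gtf:
#     for gene in mirna_genes(gtf):
#         print(gene["chrom"], gene["start"], gene["end"],
#               gene["gene_name"], gene["strand"], sep="\t")
```

One thing to keep in mind: as far as I know, these gene records describe the precursor (hairpin) locus rather than the individual mature arms, so counts made this way are per locus.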