How to trim miRNA reads?
4 months ago
Sanjukta • 0

Hi there,

I am new to bioinformatics. I am trying to prepare fasta.gz files to upload to CPSS, a web server for miRNA-seq datasets. My data is from the Gene Expression Omnibus (GEO) database. The sample FASTA file looks like this:

>SRR1658346.1 HISEQ1:187:D0NWFACXX:3:1101:2565:2050 length=51
ATCATACAAGGACAATTTCTTTTAACGTCGTATGCCGTCTTCTGCTTGNAA
>SRR1658346.2 HISEQ1:187:D0NWFACXX:3:1101:2654:2232 length=51
TCGAGGAGCTCACAGTCTAGTATAACGTCGTATGCCGTCTTCTGCTTGAAA
>SRR1658346.3 HISEQ1:187:D0NWFACXX:3:1101:2870:2103 length=51
TTCAAGTAATCCAGGATAGGCTAACGTCGTATGCCGTCTTCTGCTTGAAAA
>SRR1658346.4 HISEQ1:187:D0NWFACXX:3:1101:3001:2147 length=51
TAGCACCATCCGAAATCAGTTTAACGTCGTATGCCGGCTTCTGCTTGAAAA

And my clean file should be like this (an example from CPSS):

>t0000001_823508
TGAGGTAGTAGATTGTATAGTT
>t0000002_757054
TGAGGTAGTAGGTTGTATAGTT
>t0000003_252586
ACAGTAGTCTGCACATTGGTT

With my limited knowledge, I can guess that the reads contain adapters in addition to the typical 21 nt long miRNA sequence. But I am not sure how to trim them, as the terminal sequences vary in composition.

(edited) I am trying to re-analyse an miRNA dataset to discover some desirable miRNAs which are not reported in the relevant publication. Here's a link to the webtool.

mirna adapter-trimming fastq • 825 views

The question needs some clarification:

  • What is the purpose of your analysis?
  • Are you sure the tool is still relevant for your question? The web page I found shows an error (Bad gateway).
  • [Why do you want to trim the reads to the mature miRNA?] (Sorry, I missed the point here; of course you should trim the adapter sequences.)
  • Did you consider miRDeep2 or similar alternatives?

I have replied to your queries in the main post. And no, I have not tried miRDeep2 yet.


Hi Michael,

I went with Galaxy for now rather than a proper miRDeep2 installation. The installation file is pretty large, and temporary internet issues are making the download take a long time.

I ran QC on Galaxy and, to my surprise, it could not detect an adapter. I am not sure what can be done. I am writing another post; any pointers will be appreciated.


sRNAtoolbox is also an option.

4 months ago
GenoMax 141k

It would be ideal to know which kit was used, so that you know the specific adapter that was added to the miRNAs. But looking at the reads above, you can see that TAACGTCGTATGCCGTCTTCTGC is likely a safe bet for trimming your data.

ATCATACAAGGACAATTTCTTT TAACGTCGTATGCCGTCTTCTGCTTGNAA
TCGAGGAGCTCACAGTCTAGTA TAACGTCGTATGCCGTCTTCTGCTTGAAA
 TTCAAGTAATCCAGGATAGGC TAACGTCGTATGCCGTCTTCTGCTTGAAAA
 TAGCACCATCCGAAATCAGTT TAACGTCGTATGCCGGCTTCTGCTTGAAAA
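
To make the split shown above concrete, here is a minimal Python sketch of 3' adapter trimming. It is not how any particular trimmer works internally, only an illustration under some assumptions: the function names and the `max_mismatch` and `min_overlap` parameters are made up for this example, and the adapter is the sequence suggested above. One mismatch is tolerated because the fourth read carries a G where the adapter has a T. For real data a dedicated trimming tool would be the more robust choice.

```python
# Adapter suggested in this answer; the tolerance settings below are
# illustrative assumptions, not values taken from any kit or tool.
ADAPTER = "TAACGTCGTATGCCGTCTTCTGC"


def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def trim_3prime_adapter(read, adapter=ADAPTER, max_mismatch=1, min_overlap=10):
    """Cut the read at the first position where the adapter (or a prefix of
    it, near the read end) matches with at most `max_mismatch` mismatches."""
    for start in range(len(read)):
        window = read[start:start + len(adapter)]
        if len(window) < min_overlap:
            break  # too little sequence left to call an adapter reliably
        if count_mismatches(window, adapter[:len(window)]) <= max_mismatch:
            return read[:start]
    return read  # no adapter found, return the read unchanged


# The four example reads from the question.
READS = [
    "ATCATACAAGGACAATTTCTTTTAACGTCGTATGCCGTCTTCTGCTTGNAA",
    "TCGAGGAGCTCACAGTCTAGTATAACGTCGTATGCCGTCTTCTGCTTGAAA",
    "TTCAAGTAATCCAGGATAGGCTAACGTCGTATGCCGTCTTCTGCTTGAAAA",
    "TAGCACCATCCGAAATCAGTTTAACGTCGTATGCCGGCTTCTGCTTGAAAA",
]

if __name__ == "__main__":
    for read in READS:
        print(trim_3prime_adapter(read))
```

Run on the four reads from the question, this leaves inserts of 22, 22, 21 and 21 nt, matching the left-hand column above.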

Thank you so much for pointing the sequence out.

4 months ago
Michael 54k

If you are looking for an all-in-one QC and adapter-trimming pipeline, fastp should do well. It should also be able to detect the adapter sequences automatically, or you can use the sequence given by GenoMax. If you use miRDeep2, it also has a built-in trimming step.
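
Whichever trimmer you use, the trimmed reads would probably still need to be collapsed into the `>t0000001_823508` style shown in the question. The Python sketch below shows one way to do that. It rests on assumptions that should be checked against the CPSS documentation: that the number after the underscore is the read count of that unique sequence, and that IDs are numbered by decreasing abundance. The function name and the 18 to 30 nt length filter are hypothetical choices for illustration.

```python
from collections import Counter


def collapse_reads(sequences, min_len=18, max_len=30):
    """Collapse identical trimmed reads into FASTA records named
    t<rank>_<count>, most abundant first.

    Assumption: the suffix after the underscore is the read count.
    The length limits are illustrative defaults, not CPSS requirements.
    """
    counts = Counter(s for s in sequences if min_len <= len(s) <= max_len)
    # Sort by count (descending), then by sequence, so output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "".join(
        f">t{rank:07d}_{n}\n{seq}\n"
        for rank, (seq, n) in enumerate(ranked, start=1)
    )


if __name__ == "__main__":
    trimmed = (
        ["TGAGGTAGTAGATTGTATAGTT"] * 3
        + ["ACAGTAGTCTGCACATTGGTT"]
        + ["ACGT"]  # too short, dropped by the length filter
    )
    print(collapse_reads(trimmed), end="")
```

The resulting text can then be written out with `gzip.open(path, "wt")` to get a fasta.gz file.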
