Dear all, I am new to bioinformatics and I am trying to work with some miRNA data. There are a couple of things I am not sure about. I downloaded the human 3'UTR sequences from BioMart. Since a gene can have several transcripts, I am not sure whether to use all of them or only the shortest UTR. The other thing I am not sure about is that some genes are on the reverse strand: should I reverse the UTR sequences for those? Thanks for your replies.
This is a real issue in miRNA research and is often overlooked.
The proper thing to do is to identify which isoforms (primary and/or secondary isoforms) are expressed in your cell line or tissue and utilize the exact 3'UTR and CDS for the target identification.
In many cases this is very difficult.
If you have supporting NGS data (good-quality RNA-Seq, or RNA-Seq combined with PolyA-Seq or 3P-Seq), then this kind of isoform-level analysis is feasible.
If you are working with only a small number of genes, it is also manageable.
If not, then people usually take the dominant transcript from the available annotation (e.g. ENSEMBL) or the transcript with the longest 3'UTR.
Most target prediction programs follow one of these two approaches.
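For the "longest 3'UTR" fallback, here is a minimal Python sketch of how the selection could be done on a downloaded FASTA file. It assumes each header has the form `>gene_id|transcript_id` (this depends on the attributes you picked when exporting, so adjust the split accordingly) and that transcripts without a UTR appear with the text "Sequence unavailable" in place of a sequence. The function names and the IDs in the example are made up for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_utr_per_gene(handle):
    """Return {gene_id: (transcript_id, sequence)} keeping the longest 3'UTR.

    Assumes headers look like 'gene_id|transcript_id' (an assumption about
    the export, not a fixed format).
    """
    best = {}
    for header, seq in read_fasta(handle):
        # Skip placeholder records for transcripts with no annotated UTR.
        if seq.lower().startswith("sequence unavailable"):
            continue
        gene_id, transcript_id = header.split("|")[:2]
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Toy input with hypothetical IDs; replace with open("your_utrs.fasta").
example = StringIO(
    ">GENE_A|TX_A1\nACGTACGT\n"
    ">GENE_A|TX_A2\nACGTACGTACGT\nACGT\n"
    ">GENE_B|TX_B1\nSequence unavailable\n"
    ">GENE_B|TX_B2\nTTTT\n"
)
for gene, (tx, seq) in longest_utr_per_gene(example).items():
    print(gene, tx, len(seq))
```

Ties are resolved in favour of the first transcript seen. If you have expression data, you would replace the length comparison with whichever criterion identifies the expressed isoform.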