Some questions about 3'UTR regions from rat 6.0 fasta and gtf files
4 months ago

I have a problem with miRNA-seq and miRNA target gene prediction.

I know the basic workflow, and I tried a first method:

I downloaded the 3'UTR FASTA files (version: rat 6.0) from Ensembl BioMart and from UCSC.

I ran miranda on Linux with each of these files and got a different number of predicted target genes from each.

However, I was not satisfied with the number of target genes in any of the results (with the parameters set as below):

miranda rno_DEGs.fasta  GCF_000001895.5_Rnor_6.0_3'UTR.fa -sc 150 -en -30 -strict | grep ">>" > 

Can I relax the two parameters -sc 150 -en -30 to lower thresholds? The defaults are -sc 140 -en 1.
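One way to see how much the thresholds matter, without rerunning miranda for every setting, is to run it once with relaxed cut-offs and filter the ">>" summary lines afterwards. The sketch below assumes the summary columns are query, reference, total score, total energy, max score, max energy (followed by strand, lengths and positions); that layout should be checked against your own output file. The function names and the example lines are made up for illustration.

```python
def parse_summary(lines):
    """Yield (mirna, target, max_score, max_energy) from miranda '>>' summary lines."""
    for line in lines:
        if not line.startswith(">>"):
            continue
        fields = line[2:].split()
        if len(fields) < 6:
            continue
        # ASSUMED column order: query, reference, total score, total energy,
        # max score, max energy, ... (verify against your own output)
        yield fields[0], fields[1], float(fields[4]), float(fields[5])


def count_targets(lines, min_score, max_energy):
    """Count distinct targets with score >= min_score and energy <= max_energy."""
    targets = set()
    for _mirna, target, score, energy in parse_summary(lines):
        if score >= min_score and energy <= max_energy:
            targets.add(target)
    return len(targets)


# Made-up summary lines, for illustration only
example = [
    ">>miR-a\tgeneA\t160.0\t-31.5\t160.0\t-31.5\t1\t22\t900\t120\n",
    ">>miR-a\tgeneB\t145.0\t-22.0\t145.0\t-22.0\t1\t22\t700\t88\n",
    ">>miR-b\tgeneC\t152.0\t-18.4\t152.0\t-18.4\t2\t21\t500\t301\n",
]

for sc, en in [(150, -30), (140, -20), (140, 1)]:
    print(sc, en, count_targets(example, sc, en))
```

On a real run, `example` would be replaced by the lines of the miranda output file. Looser thresholds will always return more targets, so a larger count by itself does not mean a better prediction.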

Here are the links to the FASTA and GTF files:

annotation files:

So I tried a second method:

I used the rat 7.0 3'UTR FASTA file from Ensembl BioMart and ran the same workflow. To my surprise, this gave the largest number of target genes, which I thought was a good result.

My question is this: my mRNA counts were generated with the rat 6.0 FASTA described above, so I don't know whether it is appropriate to use a different version of the FASTA file to analyse target genes.

If not, I have another method:

I think I could extract the 3'UTR sequences from the rat 6.0 FASTA and GTF files. But I could not find the coordinates of the 3'UTRs, or any other information about them, in the 6.0 GTF file, and I don't know why. Because of this, I have no idea how to extract 3'UTR information from the FASTA and GTF files.
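When a GTF has no explicit 3'UTR features, the coordinates can often still be derived from the exon, CDS and stop_codon lines: the 3'UTR is the exonic sequence downstream of the end of the coding region. The following is only a rough sketch under that assumption. It requires every relevant line to carry a transcript_id attribute, and the function name and the example records are hypothetical.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def three_prime_utrs(gtf_lines):
    """Return BED-style tuples (chrom, start0, end, transcript_id, strand)."""
    exons = defaultdict(list)   # transcript_id -> [(start, end)], 1-based inclusive
    coding = defaultdict(list)  # CDS and stop_codon intervals per transcript
    where = {}                  # transcript_id -> (chrom, strand)
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] not in ("exon", "CDS", "stop_codon"):
            continue
        m = TRANSCRIPT_ID.search(f[8])
        if not m:
            continue
        tid = m.group(1)
        where[tid] = (f[0], f[6])
        target = exons if f[2] == "exon" else coding
        target[tid].append((int(f[3]), int(f[4])))

    out = []
    for tid, cds in coding.items():  # non-coding transcripts are skipped
        chrom, strand = where[tid]
        for start, end in sorted(exons.get(tid, [])):
            if strand == "+":
                # keep only the part of the exon after the last coding base
                start = max(start, max(e for _, e in cds) + 1)
            else:
                # minus strand: the 3' end lies at the lower coordinates
                end = min(end, min(s for s, _ in cds) - 1)
            if start <= end:
                out.append((chrom, start - 1, end, tid, strand))  # BED is 0-based
    return out


# Made-up records, for illustration only
example_gtf = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t150\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t300\t347\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tstop_codon\t348\t350\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t1000\t1100\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
    'chr1\tsrc\texon\t1200\t1300\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
    'chr1\tsrc\tCDS\t1050\t1100\t.\t-\t0\tgene_id "g2"; transcript_id "t2";',
    'chr1\tsrc\tCDS\t1200\t1250\t.\t-\t0\tgene_id "g2"; transcript_id "t2";',
    'chr1\tsrc\tstop_codon\t1047\t1049\t.\t-\t0\tgene_id "g2"; transcript_id "t2";',
]

for chrom, start, end, tid, strand in three_prime_utrs(example_gtf):
    print(chrom, start, end, tid, ".", strand, sep="\t")
```

A UTR that spans several exons comes out as several BED lines sharing one transcript ID, so the pieces would still need to be joined per transcript before target prediction.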

And a final method:

I contacted the sequencing company. They told me they use the whole genomic FASTA file in place of the 3'UTR sequences to get the target gene predictions. I still don't understand why they do it this way. Can I do this too?

I looked into many methods, including the R biomaRt package, but most of them were not suitable for me.

So I really hope somebody can give me some advice or suggest a method. Many thanks.

UTR miRNAseq • 372 views

Can anyone help me?

4 months ago
Shred ▴ 870

A quick script to get just 3' UTR in BED format

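import sys

# Print every three_prime_utr feature of a GTF file as a 6-column BED line
with open(sys.argv[1], 'r') as gtf_file:
    for line in gtf_file:
        if line.startswith('#'):
            continue  # skip header/comment lines
        fields = line.rstrip().split('\t')
        if len(fields) >= 9 and fields[2] == "three_prime_utr":
            # GTF starts are 1-based, BED starts are 0-based
            print(fields[0], int(fields[3]) - 1, fields[4], '.', '.', fields[6], sep='\t')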

Save the script (for example as utr_to_bed.py; the name is arbitrary) and launch it with

python3 utr_to_bed.py your_annotation.gtf > three_prime_utr.BED

Then use bedtools getfasta to extract the FASTA sequences of those regions

bedtools getfasta [OPTIONS] -fi <input FASTA> -bed three_prime_utr.BED
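
For readers who want to see what this extraction step amounts to, here is a small pure-Python sketch of the same idea: slice each BED interval out of the genome and reverse-complement it when the strand column is "-" (comparable to running getfasta with its -s option). It loads the whole FASTA into memory, so it is meant as an illustration, not a replacement for bedtools on a full genome. The header format and the function names are choices made for this example.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Read FASTA lines into a dict: sequence name -> sequence string."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # keep only the first word of the header as the sequence name
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract(genome, bed_lines):
    """Yield (header, sequence) for each BED interval, honouring the strand."""
    for line in bed_lines:
        f = line.rstrip("\n").split("\t")
        chrom, start, end = f[0], int(f[1]), int(f[2])
        strand = f[5] if len(f) > 5 else "+"  # no strand column: assume plus
        seq = genome[chrom][start:end]        # BED is 0-based, half-open
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        yield f"{chrom}:{start}-{end}({strand})", seq


# Made-up genome and intervals, for illustration only
genome = read_fasta([">chr1 toy sequence", "ACGTTGCA", "AACCGG"])
bed = ["chr1\t0\t4\t.\t.\t+", "chr1\t8\t14\t.\t.\t-"]

for header, seq in extract(genome, bed):
    print(f">{header}\n{seq}")
```

Getting the strand right matters here, because miRNA target sites are searched on the mRNA sequence, so minus-strand UTRs must be reverse-complemented.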

Thanks. I am not familiar with Python, but I saw the term "three_prime_utr". Does that mean I should look for "three_prime_utr" in the GTF?


That's how the 3'UTR regions are encoded inside the GTF file.
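
Whether a particular GTF contains such lines depends on where it came from: Ensembl GTFs use three_prime_utr in the third column, while files from other providers may label UTRs differently or leave them out. A quick way to check, sketched here with an illustrative helper name and made-up records, is to count the feature types that occur in the file:

```python
from collections import Counter


def feature_counts(gtf_lines):
    """Count how often each feature type (GTF column 3) occurs."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Made-up records, for illustration only; for a real file use
#     with open("your_annotation.gtf") as handle:
#         counts = feature_counts(handle)
example = [
    "#!genome-build example\n",
    'chr1\tsrc\tgene\t100\t400\t.\t+\t.\tgene_id "g1";\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\tthree_prime_utr\t351\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]

counts = feature_counts(example)
for feature, n in sorted(counts.items()):
    print(feature, n, sep="\t")
```

If three_prime_utr does not appear in the list, the script above will print nothing for that file, and the UTRs would have to be obtained some other way.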

