Question: How to extract 3'UTR regions
tianshenbio50 wrote 4 months ago:

I have a FASTA file of the genome and a GFF file (CDS features only) for a non-model organism, Bicyclus anynana. I need to extract the 3'UTR region of each gene listed in the GFF file. Is there a program I can use?

rna-seq 3'utr assembly genome
modified 4 months ago by lieven.sterck8.0k • written 4 months ago by tianshenbio50

It would help if you posted the head of the FASTA and GFF files and also mentioned the organism name. Anyway, you can use bedtools getfasta to extract regions from a FASTA file using coordinates specified in BED/GFF format.
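
For readers who want to see what coordinate-based extraction involves, here is a minimal pure-Python sketch of the same idea. It is not bedtools itself; the function names `read_fasta` and `extract_gff_features` are made up for illustration. It assumes GFF coordinates are 1-based and inclusive, and it reverse-complements minus-strand features (bedtools getfasta only does that when run with -s).

```python
# Complement table for reverse-complementing minus-strand features.
_COMP = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_name: sequence}."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence name.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {key: "".join(parts) for key, parts in seqs.items()}


def extract_gff_features(fasta_lines, gff_lines, feature="CDS"):
    """Return (seqid, start, end, strand, sequence) for each matching GFF row."""
    genome = read_fasta(fasta_lines)
    records = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # GFF is 1-based inclusive; Python slices are 0-based, end-exclusive.
        seq = genome[seqid][start - 1:end]
        if strand == "-":
            seq = seq.translate(_COMP)[::-1]
        records.append((seqid, start, end, strand, seq))
    return records
```

With real files, the two arguments could simply be open file handles, e.g. `extract_gff_features(open("genome.fa"), open("cds.gff"))` (file names hypothetical).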

written 4 months ago by ashish470

Thanks Ashish, my gff file only contains coordinates of CDS regions. So I do not have any coordinates for 3'UTR in the gff file. What I want is to identify and extract the 3'UTR region from the fasta file based on the coordinates of the genes (CDSs) from the gff file.

written 4 months ago by tianshenbio50

Use bedtools complement. It will give you all the regions not represented in your GFF file. You can then use that output to extract all the "non-CDS" regions from your genome, but identifying UTRs within those regions will take more downstream steps. How did you end up with a GFF file containing only CDS coordinates?
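
To make the "complement" operation concrete, here is a small Python sketch of the interval logic for a single sequence. It is only an illustration of the idea, not the bedtools implementation; the function name `complement_intervals` is hypothetical, and coordinates are assumed to be 1-based and inclusive as in GFF.

```python
def complement_intervals(intervals, seq_length):
    """Return the gaps on one sequence that are not covered by `intervals`.

    `intervals` is a list of (start, end) pairs, 1-based inclusive.
    Overlapping or unsorted input is handled by sorting first.
    """
    gaps = []
    next_free = 1  # first position not yet covered by any interval
    for start, end in sorted(intervals):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= seq_length:
        gaps.append((next_free, seq_length))
    return gaps
```

Note that the resulting gaps mix intergenic sequence, introns, and both UTRs, which is why the complement alone cannot identify the 3'UTR.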

written 4 months ago by ashish470
lieven.sterck8.0k (VIB, Ghent, Belgium) wrote 4 months ago:

To determine the UTR regions of a gene, you will need to align the transcript data to the genome and link it to the CDS coordinates you have. The part that is covered by the transcript but is not within the CDS is UTR.

There is no way to locate them without transcript evidence (there have been some attempts to predict them 'in silico', but the success rate of those approaches is really low).
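
The "covered by transcript but not within CDS" rule can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not an established tool: the function name `three_prime_utr` is hypothetical, coordinates are 1-based inclusive, and the exons and CDS segments are assumed to have already been matched to the same transcript. The 3'UTR is taken to be the exonic sequence downstream of the CDS, which means higher coordinates on the plus strand and lower coordinates on the minus strand.

```python
def three_prime_utr(exons, cds, strand):
    """Return the exon segments lying downstream of the CDS.

    exons, cds: lists of (start, end) pairs for ONE transcript,
    1-based inclusive. strand: "+" or "-".
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    utr = []
    for start, end in sorted(exons):
        if strand == "+" and end > cds_end:
            # Keep only the part of the exon after the last CDS base.
            utr.append((max(start, cds_end + 1), end))
        elif strand == "-" and start < cds_start:
            # On the minus strand, downstream means smaller coordinates.
            utr.append((start, min(end, cds_start - 1)))
    return utr
```

One caveat: some annotations exclude the stop codon from CDS features. If that is the case for your file, the first three bases of the reported 3'UTR will be the stop codon and should be trimmed.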

written 4 months ago by lieven.sterck8.0k

Hi, actually I have the transcriptome which is already mapped to the genome. Are there any tools I can use to extract the UTR part?

written 4 months ago by tianshenbio50

Give StringTie2 a go. It will assemble transcripts from mapped RNA-seq data.
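
Assembled transcripts are typically written as GTF, so the next step is to collect the exon coordinates of each transcript so they can be compared with the CDS coordinates. Below is a minimal Python sketch of that parsing step. The function name `exons_by_transcript` is hypothetical, and it assumes standard GTF rows with an `exon` feature type and a `transcript_id "..."` attribute in the ninth column.

```python
import re
from collections import defaultdict

# GTF attributes look like: gene_id "G1"; transcript_id "G1.1"; ...
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_by_transcript(gtf_lines):
    """Group exon rows of a GTF by transcript_id.

    Returns {transcript_id: {"chrom": str, "strand": str,
                             "exons": [(start, end), ...]}}
    with exons sorted by start (1-based inclusive coordinates).
    """
    transcripts = defaultdict(lambda: {"chrom": None, "strand": None, "exons": []})
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = _TRANSCRIPT_ID.search(cols[8])
        if match is None:
            continue
        record = transcripts[match.group(1)]
        record["chrom"], record["strand"] = cols[0], cols[6]
        record["exons"].append((int(cols[3]), int(cols[4])))
    for record in transcripts.values():
        record["exons"].sort()
    return dict(transcripts)
```

The per-transcript exon lists can then be intersected with the CDS coordinates from the original GFF file to decide which transcript supports which gene.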

written 4 months ago by i.sudbery8.2k

I'm no expert on that part, but I assume it should be possible to get that done using bedtools and related software.

written 4 months ago by lieven.sterck8.0k