retrieval of upstream non-coding sequences
2
0
9.0 years ago
zizigolu ★ 4.3k

Hello

I would like to retrieve upstream non-coding sequences of some Saccharomyces genes from Ensembl, but I do not know how to do it, as I am a beginner in working with databases. I would appreciate any help.

Thank you friends

sequence • 4.5k views
ADD COMMENT
0

Hi all,

How do bioinformaticians actually annotate UTRs? Is there software available for this?

The species I am working on (Oreochromis niloticus) has a GENSCAN-annotated reference genome. However, the UTRs were not annotated. I intend to annotate the UTRs myself. Any pointers on where I should start?

Regards,
Ziyi

ADD REPLY
3
9.0 years ago
  1. Go to http://www.ensembl.org/biomart/martview/
  2. Choose Ensembl Genes 79
  3. Choose Saccharomyces cerevisiae genes (R64-1-1)
  4. Click Attributes and select "Sequences"
  5. Select regions you are interested in. You can put 2000 or 5000 in Upstream Flank.
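
The same request can also be expressed as a BioMart XML query and sent to the martservice endpoint instead of clicking through the form. The following is only a minimal sketch: the dataset name `scerevisiae_gene_ensembl`, the filter names `ensembl_gene_id` and `upstream_flank`, and the attribute names `gene_flank` and `external_gene_name` are assumptions that should be checked against the XML that martview shows for your own query (the "XML" button), since names can differ between Ensembl releases. The helper names `build_flank_query` and `build_url` are made up for illustration.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Public BioMart web service endpoint of the main Ensembl site.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_flank_query(gene_ids, upstream_bp=2000,
                      dataset="scerevisiae_gene_ensembl"):
    """Return a BioMart XML query asking for the upstream flank of each gene.

    Dataset, filter and attribute names are assumptions; verify them in martview.
    """
    ids = ",".join(gene_ids)  # BioMart takes several IDs as a comma-separated value
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<!DOCTYPE Query>'
        '<Query virtualSchemaName="default" formatter="FASTA" '
        'header="0" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="ensembl_gene_id" value={quoteattr(ids)}/>'
        f'<Filter name="upstream_flank" value="{int(upstream_bp)}"/>'
        '<Attribute name="gene_flank"/>'
        '<Attribute name="ensembl_gene_id"/>'
        '<Attribute name="external_gene_name"/>'
        '</Dataset></Query>'
    )


def build_url(xml_query):
    """URL-encode the XML query into a martservice request URL."""
    return MARTSERVICE + "?" + urlencode({"query": xml_query})


query = build_flank_query(["YKL140W"], upstream_bp=2000)
url = build_url(query)
print(url)  # paste into a browser, or fetch with urllib.request.urlopen(url)
```

The sketch only builds the request and does not contact the server, so it can be inspected before anything is downloaded.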
ADD COMMENT
0

Sorry, I followed your advice step by step, entered 2000 bp under Gene Start in the filters, and clicked Count, but I can't find anything.

I am confused.

ADD REPLY
0

Sorry, I mean: after doing the steps you listed, where should I enter my gene name or sequence to retrieve just the 2000 bp of non-coding sequence upstream of the gene?

ADD REPLY
0

If you click on Filters, you can enter the Ensembl Gene IDs of the genes you are interested in, OR you can download the 2 kb upstream region for all yeast genes and then select your genes of interest from the big file. Make sure you select the gene name in the header information so that you can use it to pick out your genes later on. Keep fooling around, I am sure you will get it. This ain't rocket science :-).

PS: Just realised that you have 10 genes in total. Retrieve their Ensembl Gene IDs and give them as input in the filter section to retrieve sequences for those genes only.
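
For the "download everything, then select" route, picking your genes out of the big FASTA file takes only a few lines of Python. This is a rough sketch that assumes the headers are "|"-separated fields, one of which is the gene ID or gene name, as in BioMart FASTA exports. The function names and the toy records are invented for illustration and are not real yeast sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_genes(lines, wanted):
    """Keep records whose '|'-separated header contains a wanted ID or name."""
    wanted = set(wanted)
    for header, seq in read_fasta(lines):
        if wanted.intersection(header.split("|")):
            yield header, seq


# Toy input standing in for the big download (sequences are made up).
toy = """>GENE_A|nameA
ACGTACGT
ACGT
>GENE_B|nameB
TTTTGGGG
""".splitlines()

for header, seq in select_genes(toy, ["nameB"]):
    print(">" + header)
    print(seq)
```

With a real file, pass the open file handle instead of `toy`, for example `select_genes(open("upstream_2kb.fa"), my_ids)`, where the file name is whatever you saved the BioMart result as.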

ADD REPLY
0

Thank you very much

I was really getting discouraged because I am just a beginner in bioinformatics.

ADD REPLY
0

I am so sorry. I followed the steps as you told me and entered the gene ID and chromosome location, but when I clicked Go, the downloaded FASTA file contained only this:

>YKL140W|YKL140W
Sequence unavailable

I don't know the reason.

ADD REPLY
0

Hey, take it easy. It may take some time to get used to the BioMart site. Please follow the instructions below.

  1. For every gene symbol, look up the SGD ID at http://www.yeastgenome.org. For example, gene symbol YKL140W gives the SGD ID S000001623.
  2. Now go to BioMart as I explained above. Under Filters, select Gene, then "Input external references ID list [Max 500 advised]", then "SGD Gene ID(s)", and paste the SGD IDs for all the genes you are interested in.
  3. Click Attributes and select "Sequences". Select the regions you are interested in. You can put 2000 or 5000 in Upstream Flank. Also select "Associated Gene Name" under Gene Information.
  4. Click Results in the left corner.
ADD REPLY
0

"YKL140W|YKL140W" is not a correct gene symbol; it should be "YKL140W". You will also have to select the appropriate external reference ID type in BioMart. For example, you can't use these gene symbols as Ensembl Gene IDs. See above for a new answer.
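
When a download looks wrong, it can help to check the file programmatically before retrying. The sketch below splits each FASTA header on "|" and flags records whose body is the literal text "Sequence unavailable", as in the file quoted in this thread. Treating the "|"-separated fields as the header attributes chosen in BioMart is an assumption, and `summarize_fasta` is a made-up helper name.

```python
def summarize_fasta(text):
    """Return one dict per FASTA record: header fields, availability, length."""
    records = []
    for block in text.split(">")[1:]:  # text before the first '>' is ignored
        lines = block.strip().splitlines()
        fields = lines[0].split("|")
        body = "".join(line.strip() for line in lines[1:])
        available = body != "Sequence unavailable"
        records.append({
            "fields": fields,
            "available": available,
            "length": len(body) if available else 0,
        })
    return records


# The record quoted earlier in the thread.
example = ">YKL140W|YKL140W\nSequence unavailable\n"
for rec in summarize_fasta(example):
    print(rec)
```

A record flagged as unavailable means the server returned no sequence for that combination of filters and sequence type, so the filters and the selected sequence option are the things to re-check.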

ADD REPLY
0

I am really thankful for your patience and kindness. In Attributes, under Sequences, I selected Flank (Gene), 5' UTR Start and 5' UTR End. This link was my result: http://asia.ensembl.org/biomart/martview/8fc2d044ce4f09bd12d4f4a02ee52703

I only saw your other post after getting this result. I will try your tips. Thanks a lot.

ADD REPLY
0

No problem. I am glad that I could help. The link you provided is a temporary one, but I assume that you are close to what you actually wanted. Keep working and you should be able to figure it out.

ADD REPLY
0

thanks again

ADD REPLY
0

Good morning Ashutosh,

You taught me how to search for and retrieve genes from Ensembl. I went to http://www.ensembl.org/biomart/martview/ and followed the steps you mentioned, but I still have some doubts. If possible, please help me again.

Suppose I need to get the 2000 bp of non-coding sequence upstream of the initiating ATG for my genes of interest.

In the Attributes section, under Sequences, should I select the Flank (Gene) option or 5' UTR?

In the header information (gene information and transcript information), which options should I select?

I am confused again because the result differs depending on which of these options I select, and I don't know which one is the right sequence. I need to predict promoters with NNPP and confirm them with YASS, so retrieving exactly the right sequence is important.

Thank you

ADD REPLY
1
9.0 years ago

Depending on the actual number of genes you mean by "some", you can follow one of at least two strategies:

  • If that "some" is high, I would go for a Saccharomyces GFF and/or GTF file and process it with R and the GenomicRanges package, or even with my own code.
  • If that "some" is low, I would go to places such as the NCBI database that allow you to "Change the region shown". You look into the annotations and select the desired region. Then you can save that file in any of the available formats.
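
As an illustration of the "my own code" route, here is a minimal Python sketch of the coordinate logic, written in Python rather than R. It assumes 1-based inclusive start/end coordinates as used in GFF/GTF and a chromosome sequence already loaded as a string; `upstream_region` is a hypothetical helper and the toy sequence is made up. To get the region upstream of the initiating ATG, the coordinates passed in should be those of the CDS rather than of the whole gene or transcript.

```python
# Translation table for complementing DNA (upper and lower case, N kept as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start, end, strand, length=2000):
    """Return up to `length` bases upstream of a feature, in feature orientation.

    start/end are 1-based inclusive (GFF/GTF convention). The region is
    truncated at the chromosome ends instead of raising an error.
    """
    if strand == "+":
        lo = max(start - 1 - length, 0)  # 0-based slice start
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(end + length, len(chrom_seq))
        # On the minus strand, "upstream" lies after the end coordinate.
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError("strand must be '+' or '-'")


chrom = "ACGTTGCAAGGC"  # toy chromosome
print(upstream_region(chrom, 5, 8, "+", length=3))
print(upstream_region(chrom, 5, 8, "-", length=3))
```

The minus-strand branch is the part that is easy to get wrong by hand: the bases are taken from after the feature's end coordinate and then reverse-complemented so that the result reads 5' to 3' relative to the gene.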
ADD COMMENT
0

Thank you very much

I have fewer than ten genes. Sorry, I searched NCBI but could not find the option you described. Could you please tell me how I can retrieve this region from Ensembl?

ADD REPLY
0

Can you give the names of the ten genes?

ADD REPLY
0

Yes of course

YLL012/YEH1 (steryl ester hydrolase 1), YLR020/YEH2, TGL1, SWE1, VHS1, KCC4, and YNR047

ADD REPLY
