Question: Obtaining Downstream Non-Coding Sequences For A Gene From UCSC Or Ensembl
6.1 years ago by
Dhillonv10100 wrote:

Background: In the study "Transposable elements have rewired the core regulatory network of human embryonic stem cells" [1], the authors mention that they obtained 20 kb of both upstream and downstream non-coding sequence for certain genes.

Question: To keep this brief, let's generalize. Say I would like to obtain 20 kb of non-coding sequence flanking a given gene (say SFRP2), or several genes. How would I do this through UCSC or Ensembl?

I have tried this in the UCSC Table Browser as follows: enter one sample gene, select the "Gene and Gene Prediction Tracks" group, set the output format to return sequence, and then, on the sequence retrieval page, under "Sequence Retrieval Region Options", pick "Downstream by 20000 bases".

Is this the correct way to obtain the non-coding sequences? Thank you for the help.
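
To make explicit what "Downstream by 20000 bases" means in coordinates, here is a minimal sketch. It assumes 1-based inclusive coordinates and that "downstream" means past the 3' end of the gene, so the window lies at higher coordinates for a plus-strand gene and at lower coordinates for a minus-strand gene. The function name `downstream_window` and the example coordinates are hypothetical (they are not the real SFRP2 coordinates), and the sketch does not check whether the window overlaps exons of a neighboring gene.

```python
def downstream_window(gene_start, gene_end, strand, flank=20000):
    """Return (start, end) of the flank-bp window downstream of a gene.

    Coordinates are assumed to be 1-based and inclusive, with
    gene_start <= gene_end regardless of strand.
    """
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    if strand == "+":
        # 3' end is at gene_end, so downstream runs to higher coordinates.
        return gene_end + 1, gene_end + flank
    if strand == "-":
        # 3' end is at gene_start, so downstream runs to lower coordinates.
        # Clamp at 1 so the window never runs off the chromosome start.
        return max(1, gene_start - flank), gene_start - 1
    raise ValueError("strand must be '+' or '-'")


# Hypothetical gene spanning 100000-108000:
plus_window = downstream_window(100000, 108000, "+")
minus_window = downstream_window(100000, 108000, "-")
```

Comparing a window like this against the coordinates in the header of the returned sequence is one way to confirm that the strand was handled as expected.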


ensembl ucsc • 1.6k views
It does sound like you are using the UCSC Table Browser appropriately.

written 6.1 years ago by Obi Griffith 17k
6.1 years ago by
Emily_Ensembl 18k wrote:


You can get what you need using BioMart on Ensembl. There's a video on using BioMart here:

Enter your gene names under "ID list limit" and select the type of gene name they are. In your attributes, choose sequence, then select the upstream or downstream sequence and enter the length you want. You can also choose which fields go in the header of each sequence.
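
The same BioMart selection can also be written as an XML query, which helps when repeating it for many genes. The sketch below only builds the XML and does not contact Ensembl. The dataset, filter, and attribute names used here (`hsapiens_gene_ensembl`, `external_gene_name`, `downstream_flank`, `gene_flank`, `ensembl_gene_id`) are assumptions, as is the helper name `build_flank_query`; compare them with the XML that the BioMart web interface shows for your own query (via its "XML" button) before relying on them.

```python
import xml.etree.ElementTree as ET


def build_flank_query(gene_names, flank=20000, side="downstream"):
    """Build a BioMart XML query string for gene flanking sequence.

    All dataset, filter, and attribute names below are assumptions and
    should be checked against the XML exported by the BioMart web page.
    """
    if side not in ("upstream", "downstream"):
        raise ValueError("side must be 'upstream' or 'downstream'")

    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "FASTA",
        "header": "0",
        "uniqueRows": "0",
        "datasetConfigVersion": "0.6",
    })
    dataset = ET.SubElement(query, "Dataset", {
        "name": "hsapiens_gene_ensembl",  # assumed human gene dataset name
        "interface": "default",
    })
    # Restrict the query to the requested genes (comma-separated list).
    ET.SubElement(dataset, "Filter", {
        "name": "external_gene_name",  # assumed filter for gene names
        "value": ",".join(gene_names),
    })
    # Flank length in bases, on the chosen side of the gene.
    ET.SubElement(dataset, "Filter", {
        "name": f"{side}_flank",
        "value": str(flank),
    })
    # Sequence attribute first, then the field used in the FASTA header.
    ET.SubElement(dataset, "Attribute", {"name": "gene_flank"})
    ET.SubElement(dataset, "Attribute", {"name": "ensembl_gene_id"})
    return ET.tostring(query, encoding="unicode")


query_xml = build_flank_query(["SFRP2"], flank=20000, side="downstream")
print(query_xml)
```

Passing several names, for example `build_flank_query(["SFRP2", "SFRP1"])`, puts them in one comma-separated filter value, mirroring a pasted list of gene names in the web form.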


