Question: Extracting upstream and Downstream of a Gene based on different intervals
11 months ago by
Doha, Qatar
always_learning (960) wrote:

Hi All,

I want to extract the regions upstream and downstream of a few genes, split by distance interval, for example 0-50, 50-5000, and 5000-50000 base pairs. What would be the best way to do that? One way I can think of is to make a separate BED file for each interval and then query it with existing tools. Is there a smarter way to do this?


upstream intervals • 878 views
modified 11 months ago by sacha (1.7k) • written 11 months ago by always_learning (960)

You are on the right track. The general approach is described here: "retrieving sequences of a upstream and downstream of a coordinate for hg19"

modified 11 months ago • written 11 months ago by genomax (67k)

When I did something like this in the past, I began by retrieving the transcription start site (TSS) locations for all genes from Ensembl BioMart. I then wrote a simple AWK command, like the one in the link provided by @genomax, to extract exactly the regions I wanted, using the TSS positions as the starting location.
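A minimal Python sketch of that TSS-based approach, for illustration only. It assumes 0-based, half-open BED coordinates and a gene described by start, end, and strand. The names `tss_of` and `tss_windows` and the example gene are hypothetical and are not taken from the linked post. The high-coordinate side is not clipped to the chromosome length, which real data would need.

```python
# Distance bins from the question: 0-50, 50-5000 and 5000-50000 bp.
BINS = [(0, 50), (50, 5000), (5000, 50000)]


def tss_of(start, end, strand):
    """Return the 0-based TSS position of a BED-style gene."""
    # On the minus strand the gene is transcribed from its highest coordinate.
    return start if strand == "+" else end - 1


def tss_windows(chrom, tss, strand, bins=BINS):
    """Yield (chrom, start, end, label) BED windows around a TSS.

    "Upstream" follows the strand: lower coordinates for "+", higher for "-".
    Windows that fall entirely before position 0 are skipped.
    """
    for near, far in bins:
        # Window on the low-coordinate side of the TSS, clipped at 0.
        low = (max(tss - far, 0), max(tss - near, 0))
        # Window on the high-coordinate side, starting just past the TSS base.
        high = (tss + 1 + near, tss + 1 + far)
        up, down = (low, high) if strand == "+" else (high, low)
        for (start, end), side in ((up, "up"), (down, "down")):
            if start < end:
                yield (chrom, start, end, f"{side}_{near}_{far}")


if __name__ == "__main__":
    # Hypothetical plus-strand gene at chr1:10000-20000.
    for row in tss_windows("chr1", tss_of(10000, 20000, "+"), "+"):
        print(*row, sep="\t")
```

Each yielded tuple is one BED line, so the output can be written to a file and used as input to whichever extraction tool you prefer.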

EDIT: With sacha's answer below, you now have the two main ways to do this. Both begin with getting the coordinates of your gene and then selecting what you want from a reference file, either with a scripted command (e.g. AWK) or with a bedtools function.

modified 11 months ago • written 11 months ago by YaGalbi (1.4k)

Thanks !!

How would I extract something like the region 50-5000 bp away from a gene?

written 11 months ago by always_learning (960)
11 months ago by
sacha (1.7k) wrote:
  • Create a BED file with your gene (using refSeq.txt, for instance): gene.bed
  • Use bedtools slop to enlarge the regions by however many base pairs you want: gene_slop.bed
  • Use bedtools intersect between gene.bed and gene_slop.bed to keep only the upstream and downstream regions
  • Use bedtools getfasta to extract the sequence
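The interval arithmetic behind these steps can be sketched in plain Python. This is an illustration under stated assumptions, not bedtools itself: `slop`, `subtract`, and `band` are hypothetical helper names that mimic what `bedtools slop` and `bedtools subtract` do to a single interval, coordinates are 0-based and half-open as in BED, and the gene and chromosome length are made up. The sketch drops the gene body with a subtract rather than an intersect, because a plain intersect of gene.bed with gene_slop.bed would return the gene itself. A 50-5000 bp band is then the 5000 bp expansion minus the 50 bp expansion.

```python
def slop(start, end, pad, chrom_len):
    """Expand an interval by `pad` bp on both sides, clipped to the chromosome."""
    return max(start - pad, 0), min(end + pad, chrom_len)


def subtract(outer, inner):
    """Return the parts of `outer` not covered by `inner`.

    Assumes `inner` lies inside `outer`, which holds for nested expansions.
    """
    pieces = [(outer[0], inner[0]), (inner[1], outer[1])]
    return [(s, e) for s, e in pieces if s < e]


def band(start, end, near, far, chrom_len):
    """Regions between `near` and `far` bp away from the gene, on either side."""
    return subtract(slop(start, end, far, chrom_len),
                    slop(start, end, near, chrom_len))


gene = ("chr1", 10000, 20000)   # hypothetical gene
chrom_len = 1_000_000           # hypothetical chromosome length
for near, far in [(0, 50), (50, 5000), (5000, 50000)]:
    for s, e in band(gene[1], gene[2], near, far, chrom_len):
        print(gene[0], s, e, f"{near}-{far}", sep="\t")
```

For a plus-strand gene the lower-coordinate piece of each band is upstream; for a minus-strand gene it is downstream. The printed lines are BED records, so they can be saved to a file and passed to `bedtools getfasta -fi genome.fa -bed regions.bed`, where `genome.fa` and `regions.bed` are placeholder file names.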
written 11 months ago by sacha (1.7k)