Question: Identify genes near occurrences of a TF binding sequence
codybj wrote:

Hello, I have a consensus binding sequence that has been defined for a certain transcription factor. I would like to search for matches to this consensus sequence in the human genome and then in some way export or make a list of gene symbols / GI numbers / some identifiers for the gene nearest to the putative TF binding hit. In this way I am hoping to create a preliminary list of genes that may be affected by this transcription factor.

I tried to use BLAST but I can't figure out how to get from the results (i.e. nucleotide coordinates on a chromosome assembly) to identifying the nearest named gene in an automated fashion.

I'd appreciate any guidance and I apologize if this is a foolish question!

modified 2.9 years ago by Ar • written 2.9 years ago by codybj

Try BLAST/BLAT from Ensembl. From the results page, click on the links under 'Genomic Location' to visualise the neighbouring genes, or check the column 'Overlapping gene(s)'. Check the 'Ensembl BLAST and BLAT tools' video for more details on how to run the online interface and explore the results.

written 2.9 years ago by Denise - Open Targets

I have a similar problem. If you manage to solve it, I would like to hear how you did it.

written 2.8 years ago by bharata1803420
Ar wrote:

Use FIMO (part of the MEME Suite). Upload your motif and choose the sequence database you want to scan. It reports the position of each match; you then need to map those positions to the nearest gene.
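To make concrete what "gives you the position" means, here is a minimal sketch that scans a sequence for an IUPAC consensus string with a regular expression and reports BED-style coordinates. This is not FIMO and does no statistical scoring; it is an exact-match stand-in. The function names (`scan`, `consensus_to_regex`, `reverse_complement`) and the example sequence are made up for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    return "".join(IUPAC[base] for base in consensus.upper())


def scan(chrom, sequence, consensus):
    """Yield (chrom, start, end, strand) hits in 0-based, half-open coordinates."""
    sequence = sequence.upper()
    consensus = consensus.upper()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead is used so that overlapping matches are reported too
        pattern = re.compile("(?=(" + consensus_to_regex(motif) + "))")
        for match in pattern.finditer(sequence):
            yield (chrom, match.start(), match.start() + len(consensus), strand)


# Hypothetical toy sequence; in practice you would loop over chromosomes from a FASTA file
for hit in scan("chr1", "CCGATAAGGTTATCCC", "GATAR"):
    print(*hit, sep="\t")
```

Note that a palindromic consensus will be reported once per strand at the same position, so you may want to deduplicate by coordinates before looking up genes.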

written 2.9 years ago by Ar

Ar, thanks for your help. It's that second part that I was actually having a problem with... How can I programmatically map these positions to the nearest gene and produce a list of gene symbols or IDs?

written 2.9 years ago by codybj
Ar wrote:

The fastest way is to use GREAT. You need to provide the genomic locations in BED format. Another way is to compute it yourself:

  1. Get the genomic locations of all the genes.
  2. Subtract the position of each TF binding site from each gene position.
  3. Take the gene with the smallest absolute distance.

However, be careful with the strand of each gene when deciding which coordinate is the TSS and which is the transcript end: for a gene on the negative strand, the TSS is the higher coordinate.
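The manual approach can be sketched roughly as follows. This assumes gene records as (chrom, start, end, strand, symbol) tuples in BED-style 0-based, half-open coordinates, and measures distance from the midpoint of each hit to the strand-aware TSS. The gene names and coordinates below are invented for illustration; in practice they would come from an annotation file such as a GTF or a BED export.

```python
import bisect
from collections import defaultdict


def tss(start, end, strand):
    # For '+' genes the TSS is the first base; for '-' genes it is the last base
    return start if strand == "+" else end - 1


def build_index(genes):
    """genes: iterable of (chrom, start, end, strand, symbol) tuples."""
    index = defaultdict(list)
    for chrom, start, end, strand, symbol in genes:
        index[chrom].append((tss(start, end, strand), symbol))
    for entries in index.values():
        entries.sort()  # sorted by TSS so we can binary-search
    return index


def nearest_gene(index, chrom, start, end):
    """Return (symbol, distance) of the gene whose TSS is closest to the hit midpoint."""
    entries = index.get(chrom)
    if not entries:
        return None
    mid = (start + end) // 2
    i = bisect.bisect_left(entries, (mid,))
    # Only the TSS just before and just after the midpoint can be the nearest
    candidates = entries[max(i - 1, 0): i + 1]
    pos, symbol = min(candidates, key=lambda entry: abs(entry[0] - mid))
    return symbol, abs(pos - mid)


# Hypothetical annotation and hits
genes = [
    ("chr1", 1000, 5000, "+", "GENE_A"),
    ("chr1", 8000, 12000, "-", "GENE_B"),
    ("chr2", 500, 900, "+", "GENE_C"),
]
index = build_index(genes)
for hit in [("chr1", 2000, 2010), ("chr1", 7000, 7010)]:
    print(hit, nearest_gene(index, *hit))
```

In the second hit, GENE_B is chosen at a distance of 4994 because its TSS sits at the far end of the gene on the minus strand; using the gene start naively would have reported a misleading distance of 995. For whole-genome work, tools such as bedtools closest do the same kind of lookup at scale.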

written 2.9 years ago by Ar

GREAT looks great, and I've decided to give it a go using a region that Ensembl annotates as a CTCF binding site on GRCh37 (i.e. chr17 62225957 62226356). GREAT does report that TEX2 falls in that region of human chromosome 17. It took me a little while to find the results, though: the 'no terms' next to GO made me think the job had not returned anything. One needs to click on 'Job description' to get the list of genes mapped to the input coordinates. I did not find a way to run the job on GRCh38, and the input must use the format chr17 62225957 62226356; 17 62225957 62226356 (without the 'chr' prefix) does not work.
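If your coordinates come from Ensembl, where chromosomes are named without the 'chr' prefix, a small helper along these lines could add it before upload. This is only a sketch: `to_bed_line` is a made-up name, and the tab separator and the MT to chrM mapping are assumptions based on the usual UCSC naming.

```python
def to_bed_line(chrom, start, end):
    """Format one region as a BED line with a UCSC-style chromosome name."""
    chrom = str(chrom)
    if chrom == "MT":
        # Ensembl calls the mitochondrial genome MT, UCSC calls it chrM
        chrom = "chrM"
    elif not chrom.startswith("chr"):
        chrom = "chr" + chrom
    return f"{chrom}\t{start}\t{end}"


print(to_bed_line(17, 62225957, 62226356))
print(to_bed_line("chr17", 62225957, 62226356))
```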

written 2.9 years ago by Denise - Open Targets