Question: Find Out The Genes That Correspond To My Coordinates
e.karasmani wrote 6.5 years ago:

Dear All,

I have the following coordinates:

 1         chr1 [  9933699,   9934385]   |  
 2         chr1 [ 88255056,  88257357]   |

How can I find out which genes are located in or next to the aforementioned coordinates? I would like to get a RefSeq name and not Ensembl names such as ENSMUSG00000093178 or NM_00234.

Could you please give me a guideline for that?

Thank you in advance

Best regards, Lena

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 6.5 years ago:

Using the public MySQL server of the UCSC:

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e '
select distinct
    name,
    chrom,
    txStart,
    txEnd,
    IF(NOT(txEnd < 9933699 OR txStart > 9934385), 0, IF(txStart > 9934385, txStart-9934385, 9933699-txEnd)) as distance
from refGene where chrom="chr1" order by distance limit 20'
+--------------+-------+----------+----------+----------+
| name         | chrom | txStart  | txEnd    | distance |
+--------------+-------+----------+----------+----------+
| NM_001012329 | chr1  |  9908333 |  9970316 |        0 |
| NM_020248    | chr1  |  9908333 |  9970316 |        0 |
| NM_001009566 | chr1  |  9789078 |  9884550 |    49149 |
| NM_014944    | chr1  |  9789078 |  9884550 |    49149 |
| NM_032368    | chr1  |  9989775 | 10002826 |    55390 |
| NM_022787    | chr1  | 10003485 | 10045556 |    69100 |
| NM_052960    | chr1  | 10057254 | 10076078 |   122869 |
| NM_005026    | chr1  |  9711789 |  9789172 |   144527 |
| NM_001105562 | chr1  | 10093040 | 10241296 |   158655 |
| NM_006048    | chr1  | 10093040 | 10241296 |   158655 |
| NR_027045    | chr1  |  9712667 |  9714644 |   219055 |
| NM_001130924 | chr1  |  9648931 |  9674935 |   258764 |
| NM_001010866 | chr1  |  9648931 |  9665020 |   268679 |
| NM_032315    | chr1  |  9599527 |  9642831 |   290868 |
| NM_015074    | chr1  | 10270763 | 10441661 |   336378 |
| NM_183416    | chr1  | 10270763 | 10368655 |   336378 |
| NM_025106    | chr1  |  9352940 |  9429590 |   504109 |
| NM_002631    | chr1  | 10459084 | 10480201 |   524699 |
| NM_198544    | chr1  | 10490158 | 10512060 |   555773 |
| NM_199006    | chr1  | 10490158 | 10512060 |   555773 |
+--------------+-------+----------+----------+----------+
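
The distance column in the query above is 0 when a transcript overlaps the query region, and otherwise the size of the gap between the two. As a rough sketch of the same logic outside SQL (the function name distance_to_region is made up for illustration, and the three rows are copied from the result table), it could look like this in Python:

```python
def distance_to_region(tx_start, tx_end, region_start, region_end):
    """Mirror the SQL IF(...) expression: 0 on overlap, else the gap size."""
    if not (tx_end < region_start or tx_start > region_end):
        return 0  # transcript overlaps the query region
    if tx_start > region_end:
        return tx_start - region_end  # transcript lies at higher coordinates
    return region_start - tx_end  # transcript lies at lower coordinates


# A few rows copied from the result table: (name, txStart, txEnd)
rows = [
    ("NM_032368", 9989775, 10002826),
    ("NM_020248", 9908333, 9970316),
    ("NM_001009566", 9789078, 9884550),
]

region = (9933699, 9934385)  # first query region from the question

# Rank transcripts by distance, like "order by distance" in the query.
ranked = sorted(
    ((name, distance_to_region(start, end, *region)) for name, start, end in rows),
    key=lambda pair: pair[1],
)
for name, dist in ranked:
    print(name, dist)
```

This only reproduces the ranking on rows already fetched; it does not replace the database query itself.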

Would it be possible to print the gene name TMEM201 instead of NM_001130924?

Reply written 19 months ago by Tommy Carstensen

Yes, use name2 instead of name.

Reply written 19 months ago by Pierre Lindenbaum
Vikas Bansal (Berlin, Germany) wrote 6.5 years ago:

Use bedtools: download the RefSeq genes from UCSC, then have a look at closestBed and intersectBed.

EDIT: First, you have to convert your input (chr, coordinates) into a BED file.
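
The BED conversion mentioned above could be sketched as follows in Python. This assumes the coordinates in the question are 1-based and inclusive (as they would be in an IRanges object); BED starts are 0-based and BED ends are exclusive, so only the start is shifted. The helper name to_bed_line and the labels region1, region2 are invented for the example.

```python
def to_bed_line(chrom, start, end, name="."):
    """Format one region as a BED line, assuming 1-based inclusive input."""
    if start < 1 or end < start:
        raise ValueError(f"invalid region: {chrom}:{start}-{end}")
    # BED starts are 0-based; BED ends are exclusive, so the end is unchanged.
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


# The two regions from the question.
regions = [
    ("chr1", 9933699, 9934385),
    ("chr1", 88255056, 88257357),
]

bed_lines = [
    to_bed_line(chrom, start, end, name=f"region{i}")
    for i, (chrom, start, end) in enumerate(regions, start=1)
]
print("\n".join(bed_lines))
```

If the coordinates are already 0-based, the subtraction should be dropped. The printed lines can be saved to a file and passed to closestBed or intersectBed together with the RefSeq gene file.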

Treylathe (San Francisco) wrote 6.5 years ago:

A simple Table Browser search of these regions does the trick, unless you need something more robust or have larger sets of data. (NM_ is the RefSeq prefix, as mentioned above.)

table browser: http://genome.ucsc.edu/cgi-bin/hgTables?command=start

choose species and assembly
choose genes and gene prediction
choose refseq and ref gene
define regions above
output format: selected fields (choose at minimum gene name and alternative)

This gives a tab-delimited text file of gene names. For example, the region above, chr1:9933699-9934385 (assuming human, hg19), gives:

name          chrom  txStart  txEnd    name2
NM_020248     chr1   9908333  9970316  CTNNBIP1
NM_001012329  chr1   9908333  9970316  CTNNBIP1

You could use related tables to pull out other IDs and GO terms, etc.
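
If the tab-delimited output needs further processing, a small script can group the RefSeq accessions (name) under their gene symbol (name2). The Python sketch below assumes the header line has exactly the column names shown in the example above; a real Table Browser download may format its header differently (for instance with a leading #), so treat the column names as an assumption. The function name transcripts_by_gene is made up for the example.

```python
import csv
import io

# Tab-delimited text in the shape of the example output in this answer.
table_text = (
    "name\tchrom\ttxStart\ttxEnd\tname2\n"
    "NM_020248\tchr1\t9908333\t9970316\tCTNNBIP1\n"
    "NM_001012329\tchr1\t9908333\t9970316\tCTNNBIP1\n"
)


def transcripts_by_gene(text):
    """Group RefSeq accessions (name) under their gene symbol (name2)."""
    genes = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        genes.setdefault(row["name2"], []).append(row["name"])
    return genes


genes = transcripts_by_gene(table_text)
print(genes)
```

Both transcripts end up under the single gene CTNNBIP1, which is usually what is wanted when several NM_ accessions cover the same region.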

Leonor Palmeira (Liège, Belgium) wrote 6.5 years ago:

NM_002341 is a RefSeq accession number.

If you want an official gene name rather than an accession number, then (assuming these coordinates are from Homo sapiens) you could have a look at this.

e.karasmani wrote 6.5 years ago:

Thank you very much!

However, is there a way to do this in R (since everything that I am doing is in R)?

I have my coordinates in an IRanges object or a data.frame (if this helps).

Thank you in advance.

Best regards, Lena

Ian (University of Manchester, UK) wrote 6.5 years ago:

An R-specific method is the Bioconductor package ChIPpeakAnno.
