Map Genomic Positions Onto Rat Genome
12.6 years ago
Emempe ▴ 50

I have a large list of genomic positions for the latest assembly of the rat genome (rn4). I would like to map those positions onto the genome and get a list of protein coding genes that lie in the same regions.

More specifically, my questions are:

  • I understand that the Rat Genome Sequencing Consortium (http://www.hgsc.bcm.tmc.edu/project-species-m-Rat.hgsc?pageLocation=Rat) provides the reference assembly for the rat genome. How do the annotations available at the UCSC Genome Browser, NCBI Genome and Ensembl compare, and what kind of annotation does each offer?

  • Which database would you use for the task described above (return genes for genomic positions)?

  • What program would you use to do that (e.g. R/BioConductor ...)?

I haven't worked much with sequence data before and I am a little confused with the diversity of annotation databases. Any help would be great!
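To make the task concrete, here is a minimal Python sketch of the lookup being asked for: given a gene table with coordinates, report which protein-coding genes cover each position. The gene table, the gene names and the `genes_at` helper are all hypothetical placeholders; in practice the table would come from an annotation export for the same assembly as the positions.

```python
from collections import defaultdict

# Hypothetical gene table: (name, chromosome, start, end, biotype).
# Coordinates are assumed to be 1-based and inclusive.
GENES = [
    ("GeneA", "1", 1000, 5000, "protein_coding"),
    ("GeneB", "1", 4000, 9000, "protein_coding"),
    ("GeneC", "1", 20000, 25000, "miRNA"),
    ("GeneD", "2", 500, 1500, "protein_coding"),
]


def genes_at(positions, genes=GENES, biotype="protein_coding"):
    """Map each (chromosome, position) to the genes of `biotype` that cover it."""
    # Group the genes of interest by chromosome, sorted by start coordinate.
    by_chrom = defaultdict(list)
    for name, chrom, start, end, kind in genes:
        if kind == biotype:
            by_chrom[chrom].append((start, end, name))
    for intervals in by_chrom.values():
        intervals.sort()

    # Simple linear scan per position; adequate for a sketch, slow for huge inputs.
    hits = {}
    for chrom, pos in positions:
        hits[(chrom, pos)] = [
            name
            for start, end, name in by_chrom.get(chrom, [])
            if start <= pos <= end
        ]
    return hits


if __name__ == "__main__":
    for key, names in genes_at([("1", 4500), ("1", 22000), ("2", 1500)]).items():
        print(key, names)
```

For a very large list of positions, an interval tree or a sorted sweep would likely be a better fit than the linear scan used here.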

genome annotation mapping • 3.1k views
12.6 years ago
Bert Overduin ★ 3.7k

As I believe that the rn4 assembly is the same as the Baylor 3.4 assembly, you can easily retrieve the genes (plus annotation) in your regions of interest using Ensembl BioMart:

  • Go to the Ensembl homepage.
  • Click on the ‘BioMart’ link on the toolbar.
  • Choose the ‘Ensembl Genes 63’ database.
  • Choose the ‘Rattus norvegicus genes (RGSC3.4)’ dataset.
  • Click on ‘Filters’ in the left panel.
  • Expand the ‘REGION’ section by clicking on the + box.
  • Enter your list of genomic regions in the 'Multiple chromosomal regions' text box.
  • Click on ‘Attributes’ in the left panel.
  • Select any attributes you want to output.
  • Click the [Results] button on the toolbar.
  • Check 'Unique results only'.
  • Select ‘View All rows as HTML’ or export all results to a file (note that you can export to an Excel spreadsheet by choosing 'XLS' as your file format).
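If your positions come from a UCSC-style file, they may need reformatting before you paste them into the 'Multiple chromosomal regions' box. The sketch below assumes that the box takes one `chromosome:start:end` entry per line and that Ensembl chromosome names carry no `chr` prefix; please check both against the example shown next to the text box. The `to_biomart_regions` helper is a made-up name for illustration.

```python
def to_biomart_regions(regions):
    """Format (chromosome, start, end) tuples as one 'chrom:start:end' line each.

    Assumption: the target text box expects this layout and chromosome
    names without a leading 'chr' (e.g. '1' rather than 'chr1').
    """
    lines = []
    for chrom, start, end in regions:
        chrom = str(chrom)
        # Drop a UCSC-style 'chr' prefix if present.
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]
        # Make sure start <= end, in case the input is reversed.
        if start > end:
            start, end = end, start
        lines.append(f"{chrom}:{start}:{end}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_biomart_regions([("chr1", 100, 200), ("X", 500, 300)]))
```

The printed text can then be pasted into the text box as-is.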

You can also find a video on how to use BioMart on YouTube.

Hope this helps.


Thanks, that helps a lot! Do you know if BioMart is accessible via Bioconductor/Matlab Bioinformatics Toolbox to put this in a script?
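On scripting: Bioconductor has the biomaRt package for querying BioMart from R, and BioMart servers also accept queries written as XML. As a rough, language-neutral illustration, the Python sketch below only builds such an XML query and does not send it. The dataset name `rnorvegicus_gene_ensembl`, the filter names `chromosomal_region` and `biotype`, and the attribute names are assumptions that should be checked against the mart you actually use; an RGSC3.4 dataset may only be available from an archive site, whose address is not given in this thread.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(
    regions,
    dataset="rnorvegicus_gene_ensembl",  # assumed dataset name
    attributes=(
        "ensembl_gene_id",
        "chromosome_name",
        "start_position",
        "end_position",
    ),  # assumed attribute names
):
    """Build a BioMart XML query string for protein-coding genes in `regions`.

    `regions` is a list of 'chrom:start:end' strings.
    """
    query = ET.Element(
        "Query",
        {
            "virtualSchemaName": "default",
            "formatter": "TSV",
            "header": "0",
            "uniqueRows": "1",  # same idea as ticking 'Unique results only'
            "datasetConfigVersion": "0.6",
        },
    )
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Assumed filter names; multiple regions are joined with commas.
    ET.SubElement(ds, "Filter", {"name": "chromosomal_region", "value": ",".join(regions)})
    ET.SubElement(ds, "Filter", {"name": "biotype", "value": "protein_coding"})
    for attr in attributes:
        ET.SubElement(ds, "Attribute", {"name": attr})
    prolog = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return prolog + ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    print(build_biomart_query(["1:100:200", "2:300:400"]))
```

The resulting string would typically be submitted as the `query` parameter of a BioMart `martservice` request; the same XML can also be obtained from the web interface after setting up a query by hand, which is a good way to verify the names used above.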

12.6 years ago

As far as annotation is concerned, I would use as many sources as is practical. Here, annotation means the data, such as function, associated with each gene. Once you have a list of genes, you'd like to know their attributes, or have those handy in a data table. In this regard, it could be quite informative to grab similar data for the mouse genes defined by these regions in rat. Two examples: mouse knockouts will give you important functional data, and mouse QTLs will help link genes to disease.

For rat, I don't do much genome-wide work; I mostly look at single genes. I prefer the Rat Genome Database at the Medical College of Wisconsin for that information.


Thanks for your reply, but my question is about finding genes within the rat genome (rather than associating more information with these genes). Once I have the genes, I will look into that. I think mapping to mouse/human could be interesting.


Sure, I understand what you need to do. I thought that I gave a response pertinent to your specific question #1, regarding the annotation offered.
