Using the IMGT/GENE-DB Service to Find RSS
12.0 years ago
Faheemmitha ▴ 210

I'm trying to get the data for the human and mouse 12 and 23 Recombination Signal Sequences (RSS), so that I can run a classification algorithm on them. I'm not a biologist, so I apologise in advance for my misunderstandings and confusion.

A version of the data is available here. There is also another slightly different version available, but only for the mouse, here. However, I thought I would try to get it from www.imgt.org, if possible. One reason is that they might have a bigger data set available. If anyone can suggest other places that I could get it from, that would be fine as well.

I'm trying to follow the instructions at http://www.imgt.org/FAQ/#question43 to obtain Recombination Signal Sequences for the mouse.

Here is what I have selected at the [search page](http://www.imgt.org/IMGT_GENE-DB/GENElect?livret=0):

Identification:
Species: Mus musculus
GeneType: any
Functionality: functional
MolecularComponent: any
Clone name: <blank>

IMGT group: IGHV
IMGT subgroup: any
IMGT gene: <blank>

I'm not clear what Locus, Main locus, and IMGT group mean here exactly. Specifically, what is the difference between Locus and Main locus?

I think, but am not sure, that IGHV corresponds to V genes in the Immunoglobulin heavy locus (IGH@) on chromosome 14, where locus here denotes a collection of genes. Clarifications and corrections appreciated.

I would have expected that the IGH locus would correspond to IMGT group entries like IGHJ, IGHV etc, and the IGK locus would correspond to IMGT group entries like IGK, IGKJ, IGKV, but no matter what I select for Locus, it does not change the possible entries for IMGT group.

Running the search gives

Number of resulting genes : 218 Number of resulting alleles : 350

As instructed, I went to the bottom, selected Select all genes, clicked on Choose label(s) for extraction, and selected V-RS.

I got

Number of results=98

The first few results were

>X02459|IGHV1-4*02|Mus musculus_BALB/c|F|V-RS|395..432|38 nt|NR| | | | |38+0=38| | |
cacagtggtgcaaccacatcccgactgtgtcagaaacc

>X02064|IGHV1-54*02|Mus musculus|F|V-RS|295..332|38 nt|NR| | | | |38+0=38| | |
cacagtgttgcaaccacatcctgagtgtgtcagaaatc

>M34978|IGHV1-58*02|Mus musculus_A/J|P|V-RS|554..560|7 nt|NR| | | | |7+0=7|partial in 3'| |
cacagtg
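
For anyone wanting to tabulate these results rather than count by eye, here is a minimal Python sketch that reads this kind of FASTA output. The field meanings (accession, allele, species, functionality, label) are assumptions inferred only from the three records shown above, and the function names `parse_imgt_fasta` and `_make_record` are made up for illustration. It tolerates a header that wraps onto a second line, as can happen when the output is copied from the browser.

```python
from collections import Counter


def _make_record(header, sequence):
    """Build a record from a '|'-separated header (field order is assumed)."""
    fields = [f.strip() for f in header.split("|")]
    return {
        "accession": fields[0],
        "allele": fields[1],
        "species": fields[2],
        "functionality": fields[3],
        "label": fields[4],
        "partial": "partial" in header,  # e.g. "partial in 3'"
        "sequence": sequence.lower(),
    }


def parse_imgt_fasta(text):
    """Parse FASTA-style text whose headers may wrap onto a second line."""
    records = []
    header, seq = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append(_make_record(header, "".join(seq)))
            header, seq = line[1:], []
        elif header is None:
            continue  # ignore anything before the first header
        elif "|" in line and not seq:
            header += " " + line  # continuation of a wrapped header
        else:
            seq.append(line)
    if header is not None:
        records.append(_make_record(header, "".join(seq)))
    return records


sample = """
>X02459|IGHV1-4*02|Mus musculus_BALB/c|F|V-RS|395..432|38 nt|NR| | | |
|38+0=38| | |
cacagtggtgcaaccacatcccgactgtgtcagaaacc

>X02064|IGHV1-54*02|Mus musculus|F|V-RS|295..332|38 nt|NR| | | | |38+0=38| | |
cacagtgttgcaaccacatcctgagtgtgtcagaaatc

>M34978|IGHV1-58*02|Mus musculus_A/J|P|V-RS|554..560|7 nt|NR| | | | |7+0=7|partial in 3'| |
cacagtg
"""

records = parse_imgt_fasta(sample)
length_counts = Counter(len(r["sequence"]) for r in records)
for length, count in sorted(length_counts.items()):
    print(length, count)
```

Counting the actual sequence length, rather than trusting the "38 nt" field in the header, also makes it easy to separate out the entries flagged as partial before comparing lengths.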

OK, now I'm confused. The lengths of the RSS should be 28 or 39 (a 7-nt heptamer and a 9-nt nonamer separated by a 12- or 23-nt spacer), but I counted lengths of 4, 7, 31, 38, and 39. Are the results here not supposed to contain the 12 and 23 RSS?
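
To make the length arithmetic concrete, here is a rough Python sketch that splits a candidate RSS into heptamer, spacer, and nonamer and labels it by spacer length. The function name `split_rss` and the `tolerance` parameter are hypothetical. Allowing the spacer to be off by one nucleotide is an assumption of this sketch, not something the IMGT documentation quoted here states; with `tolerance=0` only lengths of exactly 28 and 39 are accepted.

```python
HEPTAMER_LEN = 7
NONAMER_LEN = 9


def split_rss(seq, tolerance=1):
    """Split an RSS into heptamer/spacer/nonamer, or return None.

    `tolerance` is how far the spacer may deviate from the nominal
    12 or 23 nt (an assumption of this sketch).
    """
    seq = seq.lower()
    spacer_len = len(seq) - HEPTAMER_LEN - NONAMER_LEN
    kind = None
    for nominal in (12, 23):
        if abs(spacer_len - nominal) <= tolerance:
            kind = nominal
    if kind is None:
        return None  # too short (e.g. partial) or an unexpected length
    return {
        "kind": kind,
        "heptamer": seq[:HEPTAMER_LEN],
        "spacer": seq[HEPTAMER_LEN:-NONAMER_LEN],
        "nonamer": seq[-NONAMER_LEN:],
    }


full = split_rss("cacagtggtgcaaccacatcccgactgtgtcagaaacc")
partial = split_rss("cacagtg")
print(full)
print(partial)
```

Under that assumption, the 38-nt entries would be treated as 23-RSS with a 22-nt spacer, while the 4- and 7-nt entries are rejected as incomplete.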

So, I must be misunderstanding things here. Possibly many things. Any explanations and clarifications are appreciated.
