Genes with multiple chromosome locations
8 months ago
Pac314 ▴ 10

Hi, I am trying to annotate my list of gene IDs, some of which have multiple loci, e.g.:

   refseq_mrna hgnc_symbol   gene_biotype chromosome_name start_position end_position
7    NM_000076      CDKN1C protein_coding  HSCHR11_1_CTG7         115392       118091
8    NM_000076      CDKN1C protein_coding              11        2883213      2885775

What is the recommended practice for collapsing the annotation when a gene has multiple entries at alternative loci?


Where did this annotation come from?

CDKN1C seems to be annotated at only one location; see https://www.ncbi.nlm.nih.gov/gene/1028/ and https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:1786


Thanks for your reply. I obtained this annotation using the biomaRt R library:

library(biomaRt)

ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

attrs <- c('refseq_mrna', 'hgnc_symbol', 'gene_biotype', 'chromosome_name', 
           'start_position', 'end_position')

annot_mrna <- getBM(attributes = attrs,
                     filters = 'refseq_mrna',
                     values = rownames(cts_mat),
                     mart = ensembl)

Several of my gene IDs come back with more than one chromosome name, as in the example above.


You might look for a canonical gene isoform. UCSC may have some useful advice: https://genome.ucsc.edu/FAQ/FAQgenes.html#singledownload
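
A related option, offered only as a minimal sketch: chromosome names such as HSCHR11_1_CTG7 look like Ensembl patch or alternate-locus scaffolds rather than primary chromosomes, so one common approach is to keep only rows on the primary assembly. The code below assumes that "primary" means chromosomes 1-22, X, Y and MT, and uses a toy data frame that copies the two CDKN1C rows from the question in place of a live `getBM()` result. The names `primary_chr`, `annot_primary` and `annot_unique` are made up for illustration.

```r
# Toy stand-in for the getBM() result, copied from the rows in the question
annot_mrna <- data.frame(
  refseq_mrna     = c("NM_000076", "NM_000076"),
  hgnc_symbol     = c("CDKN1C", "CDKN1C"),
  gene_biotype    = c("protein_coding", "protein_coding"),
  chromosome_name = c("HSCHR11_1_CTG7", "11"),
  start_position  = c(115392L, 2883213L),
  end_position    = c(118091L, 2885775L),
  stringsAsFactors = FALSE
)

# Assumption: the primary assembly is chromosomes 1-22, X, Y and MT
primary_chr <- c(as.character(1:22), "X", "Y", "MT")

# Drop rows located on patch / alternate-locus scaffolds
annot_primary <- annot_mrna[annot_mrna$chromosome_name %in% primary_chr, ]

# Safety net: keep the first row per RefSeq ID if duplicates remain
annot_unique <- annot_primary[!duplicated(annot_primary$refseq_mrna), ]

print(annot_unique)
```

Filtering after the query keeps the original `getBM()` call unchanged. Note that a gene annotated only on a scaffold would be dropped entirely by this filter, so it is worth checking which IDs disappear before relying on the result.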
