How to get the chromosome numbers from RefSeq accession IDs?
4.7 years ago

I have an array of RefSeq accession IDs, which looks like the following:

NC_000001.11 NC_000002.12 NC_000003.12 NC_000004.12 NC_000005.10 NC_000006.12 NC_000007.14 NC_000008.11 NC_000009.12 NC_000010.11 NC_000011.10 . . .

I would like to know which chromosomes they refer to. Is there a way to retrieve this information automatically?

Definitely not a good solution, but you can upload the IDs to Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez) and get the summary as an output text file, which gives the chromosome name.

See my comment on your previous question, "CGAT: Error while running gtf2gtf".

4.7 years ago

It's not ideal, but you can download the assembly report, which has these IDs and the associated chromosome names (in UCSC and Ensembl nomenclatures). I have to do this sort of thing when I update the chromosome name conversion tables :(
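As a rough sketch of how the assembly report could be used, the Python below builds a lookup from RefSeq accession to chromosome name. It assumes the usual tab-delimited layout of an NCBI `assembly_report.txt` (comment lines starting with `#`, the assigned molecule in the 3rd column, the RefSeq accession in the 7th, and the UCSC-style name in the 10th); check those positions against the header of the file you actually download. The embedded sample rows and the function name `load_refseq_to_chrom` are illustrative only.

```python
import csv
import io

# Sample rows written in the assumed layout of an assembly report.
# Replace this with open("..._assembly_report.txt") for a real file.
SAMPLE_REPORT = (
    "# Sequence-Name\tSequence-Role\tAssigned-Molecule\t"
    "Assigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\t"
    "RefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name\n"
    "1\tassembled-molecule\t1\tChromosome\tCM000663.2\t=\t"
    "NC_000001.11\tPrimary Assembly\t248956422\tchr1\n"
    "2\tassembled-molecule\t2\tChromosome\tCM000664.2\t=\t"
    "NC_000002.12\tPrimary Assembly\t242193529\tchr2\n"
)


def load_refseq_to_chrom(handle):
    """Map each RefSeq accession to its assigned molecule and UCSC name."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment/header lines, and short rows.
        if not row or row[0].startswith("#") or len(row) < 10:
            continue
        refseq_accn = row[6]       # assumed column: RefSeq-Accn
        if refseq_accn == "na":    # some sequences have no RefSeq accession
            continue
        mapping[refseq_accn] = {
            "assigned_molecule": row[2],  # assumed column: Assigned-Molecule
            "ucsc_name": row[9],          # assumed column: UCSC-style-name
        }
    return mapping


refseq_to_chrom = load_refseq_to_chrom(io.StringIO(SAMPLE_REPORT))
ids = ["NC_000001.11", "NC_000002.12"]
chroms = [refseq_to_chrom[i]["assigned_molecule"] for i in ids]
print(chroms)
```

Keying the dictionary on the full accession, version included, keeps the lookup tied to one specific assembly release.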

22 months ago

You can find the chromosomes for the alternative accession numbers (NT_... / NW_...) in this directory.
Download the following files:
1. alts_accessions_GRCh38.p12
2. chr_NC_gi
3. chr_accessions_GRCh38.p12
4. unplaced_accessions_GRCh38.p12
5. unlocalized_accessions_GRCh38.p12

Once you download them, you might be prompted to enter a 'Keychain Access' password. The workaround I found is to convert the downloaded file to '.txt' format, after which you can view what's inside the file.

An extract from the file is given below:

### Chromosome RefSeq Accession.version

1 NW_012132914.1
1 NW_015495298.1
9 NW_009646201.1
10 NW_011332692.1
11 NW_015148966.1
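
A minimal sketch of turning such a file into a lookup table, assuming (as in the extract above) that the first whitespace-separated column is the chromosome, the second is the RefSeq accession, and header lines start with `#`. The function name `load_accession_to_chrom` is hypothetical, and the sample text simply repeats the extract.

```python
# Sample text mirroring the extract; for a real file, pass an open file
# handle such as open("chr_accessions_GRCh38.p12") instead.
SAMPLE = """#Chromosome RefSeq Accession.version
1 NW_012132914.1
1 NW_015495298.1
9 NW_009646201.1
10 NW_011332692.1
11 NW_015148966.1
"""


def load_accession_to_chrom(lines):
    """Map each RefSeq accession (2nd column) to its chromosome (1st column)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        chromosome, accession = fields[0], fields[1]
        mapping[accession] = chromosome
    return mapping


accession_to_chrom = load_accession_to_chrom(SAMPLE.splitlines())
print(accession_to_chrom["NW_009646201.1"])
```

The same function could be applied to each of the downloaded files in turn and the resulting dictionaries merged with `dict.update`.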

18 months ago
vkkodali ★ 2.6k

See this post: A: How to obtain the chromosome out of an accession number? In short, NCBI RefSeq provides assembly_report.txt files with each genome assembly that has the mapping information in a tab-delimited table. That would be the most up-to-date source for this sort of information.