How to get the chromosome numbers from RefSeq accession IDs?
4.7 years ago

I have an array of RefSeq accession IDs, which looks like the following:

NC_000001.11 NC_000002.12 NC_000003.12 NC_000004.12 NC_000005.10 NC_000006.12 NC_000007.14 NC_000008.11 NC_000009.12 NC_000010.11 NC_000011.10 . . .

I would like to know which chromosomes they refer to. Is there a way to retrieve this information automatically?

RefSeq • 3.4k views

Definitely not a good solution, but you can upload the IDs to Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez) and get the summary as an output text file, which gives the chromosome name.


See my comment on your previous question, "CGAT: Error while running gtf2gtf".

4.7 years ago

It's not ideal, but you can download the assembly report, which has these IDs and the associated chromosome names (in UCSC and Ensembl nomenclatures). I have to do this sort of thing when I update the chromosome name conversion tables :(

22 months ago

You can find the chromosomes for the alternative accession numbers (NT_... / NW_...) in this directory.
Download the following files:
1. alts_accessions_GRCh38.p12
2. chr_NC_gi
3. chr_accessions_GRCh38.p12
4. unplaced_accessions_GRCh38.p12
5. unlocalized_accessions_GRCh38.p12

Once you download them, you might be prompted to enter a 'Keychain Access' password. The workaround I found is to convert the downloaded file to '.txt' format, after which you can view what's inside the file.

An extract from the file is given below:

### Chromosome RefSeq Accession.version

1 NW_012132914.1
1 NW_015495298.1
9 NW_009646201.1
10 NW_011332692.1
11 NW_015148966.1
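
For anyone who wants to script the lookup, here is a minimal Python sketch that turns a file laid out like the extract above into an accession-to-chromosome dictionary. It assumes whitespace-separated columns with the chromosome first and the RefSeq accession second, and that header lines start with `#`; the real files may carry extra columns, which this sketch simply ignores. The function name `load_accession_table` is made up for illustration.

```python
def load_accession_table(lines):
    """Build {RefSeq accession: chromosome} from lines shaped like the extract.

    Assumption: column 1 is the chromosome, column 2 the accession.
    Lines starting with '#' and blank lines are skipped.
    """
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # header, blank, or malformed line
        chromosome, accession = fields[0], fields[1]
        mapping[accession] = chromosome
    return mapping


# The extract quoted in this answer, used here as test input.
EXTRACT = """### Chromosome RefSeq Accession.version

1 NW_012132914.1
1 NW_015495298.1
9 NW_009646201.1
10 NW_011332692.1
11 NW_015148966.1
"""

chrom_of = load_accession_table(EXTRACT.splitlines())
print(chrom_of["NW_009646201.1"])
```

With a downloaded file you would pass an open file handle instead of `EXTRACT.splitlines()`.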

18 months ago
vkkodali ★ 2.6k

See this post: "A: How to obtain the chromosome out of an accession number?" In short, NCBI RefSeq provides an assembly_report.txt file with each genome assembly, which holds the mapping information in a tab-delimited table. That would be the most up-to-date source for this sort of information.
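
As a rough illustration of using the assembly report, the Python sketch below reads the tab-delimited table and maps each RefSeq accession to a chromosome name. It assumes the column names sit on the last `#` comment line before the data and include `RefSeq-Accn`, `Assigned-Molecule`, and `UCSC-style-name`, and that missing accessions are written as `na`; check these against the report you actually download. The function name `load_refseq_to_chrom` and the two-row `SAMPLE` are illustrative only.

```python
import io


def load_refseq_to_chrom(handle, name_column="Assigned-Molecule"):
    """Map RefSeq accession -> chromosome name from an assembly report.

    Assumption: the last '#' line before the data rows is the
    tab-delimited header, and it contains a 'RefSeq-Accn' column.
    """
    header = None
    mapping = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep overwriting; the final comment line is the column header.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before data rows")
        row = dict(zip(header, line.split("\t")))
        accession = row.get("RefSeq-Accn", "na")
        if accession != "na":
            mapping[accession] = row[name_column]
    return mapping


# Small excerpt in the layout assumed above (two rows only).
SAMPLE = (
    "# Assembly name:  GRCh38.p14\n"
    "# Sequence-Name\tSequence-Role\tAssigned-Molecule\t"
    "Assigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\t"
    "RefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name\n"
    "1\tassembled-molecule\t1\tChromosome\tCM000663.2\t=\t"
    "NC_000001.11\tPrimary Assembly\t248956422\tchr1\n"
    "2\tassembled-molecule\t2\tChromosome\tCM000664.2\t=\t"
    "NC_000002.12\tPrimary Assembly\t242193529\tchr2\n"
)

ids = ["NC_000001.11", "NC_000002.12"]
chrom_of = load_refseq_to_chrom(io.StringIO(SAMPLE))
print([chrom_of[i] for i in ids])
```

Passing `name_column="UCSC-style-name"` would return the `chr`-prefixed names instead, assuming that column is present in your report.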