How to query GRCh37 on Entrez and get a specific Chromosome?
9.0 years ago
bfeeny ▴ 50

I want to fetch individual chromosomes for GRCh37. I know I can pull one by its RefSeq accession, like so:

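from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder address (assumption); NCBI asks for a contact email
refseq = "NC_000021.8"  # chromosome 21, GRCh37
fileFormat = "fasta"
net_handle = Entrez.efetch(db="nucleotide", id=refseq, rettype=fileFormat, retmode="text")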

But what if I don't know the RefSeq accession, and all I know is that I want chromosome 1, chromosome 2, etc. from GRCh37? Is there some way I can pass in a RefSeq assembly accession (for GRCh37.p13 it's GCF_000001405.25) together with a specific chromosome? I tried the [WORD] field and a few others, but got no results.
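One possible workaround, sketched here under assumptions rather than as a confirmed Entrez feature: NCBI distributes a tab-separated "assembly report" for each assembly, which lists every sequence with its role and its RefSeq accession. If you download the report for GCF_000001405.25 by hand, you can build the chromosome-to-accession lookup yourself and then reuse the efetch call above. The function names `parse_assembly_report` and `fetch_chromosome` are hypothetical, and the embedded `SAMPLE_REPORT` is only an abridged, illustrative excerpt in the report's column layout; check the values against the real file.

```python
# Abridged excerpt in the layout of an NCBI assembly report: '#' comment
# lines, then tab-separated rows. Illustrative only; read the real report
# file for the assembly you need instead of relying on these rows.
SAMPLE_REPORT = (
    "# Assembly name:  GRCh37.p13\n"
    "# Sequence-Name\tSequence-Role\tAssigned-Molecule\t"
    "Assigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\t"
    "RefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name\n"
    "1\tassembled-molecule\t1\tChromosome\tCM000663.1\t=\t"
    "NC_000001.10\tPrimary Assembly\t249250621\tchr1\n"
    "21\tassembled-molecule\t21\tChromosome\tCM000683.1\t=\t"
    "NC_000021.8\tPrimary Assembly\t48129895\tchr21\n"
    "HSCHR1_RANDOM_CTG5\tunlocalized-scaffold\t1\tChromosome\tGL000191.1\t=\t"
    "NT_113878.1\tPrimary Assembly\t106433\tchr1_gl000191_random\n"
)


def parse_assembly_report(text):
    """Return {chromosome name: RefSeq accession} for assembled molecules."""
    mapping = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 7:
            continue  # not a full data row
        name, role, refseq = fields[0], fields[1], fields[6]
        # Keep whole chromosomes only; scaffolds and patches are skipped.
        if role == "assembled-molecule" and refseq != "na":
            mapping[name] = refseq
    return mapping


def fetch_chromosome(chromosome, mapping, email):
    """Fetch one chromosome as FASTA via Entrez (needs Biopython and network)."""
    from Bio import Entrez  # imported here so parsing works without Biopython

    Entrez.email = email
    return Entrez.efetch(
        db="nucleotide", id=mapping[chromosome], rettype="fasta", retmode="text"
    )


if __name__ == "__main__":
    print(parse_assembly_report(SAMPLE_REPORT))
```

With a mapping built from the full report, something like `fetch_chromosome("1", mapping, "you@example.org")` would request chromosome 1 of that specific build, since the versioned accession pins the assembly.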

biopython Entrez • 2.1k views

That doesn't help me. I realize the build has a GenBank assembly accession and a RefSeq assembly accession, but I don't see where I can use them. If I just query an accession such as NC_000001*, it returns GRCh38 (latest patch) by default. I don't see any fields in the build that I can use for the search. I have been playing with the Advanced Search form on the nucleotide database, trying the [ALL] field to search for any unique identifier, with no luck.


If I use the assembly database (as opposed to nucleotide), it looks promising; I think that may be where you were trying to lead me. However, if I query NC_000001*[Accession] in that database, it only returns GRCh38 results! Even setting that aside, I don't see how to drill down to the chromosome RefSeq accessions via the search interface, since I have to click through to the chromosome data manually. Any ideas?
