Right now I am working with GRCh37.p13 RefSeq data from Entrez, querying with efetch in Biopython. Here is a sample of the type of query I am running (chromosome 1):
net_handle = Entrez.efetch(db="nucleotide", id="NC_000001.10", rettype="fasta", retmode="text")
This result is drastically different from the reference chromosome 1 from the 1000 Genomes Project, specifically what is contained in their file human_g1k_v37.fasta.gz, which I obtained from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/
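To make "drastically different" concrete, here is a minimal sketch of how the two records could be compared. It is plain Python, and the helper names read_single_fasta and compare_sequences are hypothetical, not part of Biopython. It assumes that letter case should be ignored and that only the first record of each FASTA source is of interest.

```python
def read_single_fasta(lines):
    """Return (header, sequence) for the first record in an iterable of FASTA lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop reading
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    return header, "".join(chunks)


def compare_sequences(seq_a, seq_b):
    """Summarize how two sequences differ, ignoring letter case (an assumption)."""
    a, b = seq_a.upper(), seq_b.upper()
    # Count mismatching positions over the shared prefix length only.
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return {
        "len_a": len(a),
        "len_b": len(b),
        "same_length": len(a) == len(b),
        "mismatches_in_overlap": mismatches,
        "identical": a == b,
    }
```

In practice the first argument could be the lines of the efetch response and the second the lines of human_g1k_v37.fasta.gz opened with gzip.open(path, "rt"), assuming chromosome 1 is the first record in that file. If the sequences turn out to be identical, the difference is limited to headers, line wrapping, or letter case.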
I am analyzing SNPs, and my understanding is that Ancestry, 23andMe, and others typically use 1000 Genomes data as their reference, which is why I am looking to use it as well. If this is incorrect and they in fact use another reference that you are aware of, please let me know.
Ideally, I could query the data from human_g1k_v37 via a RefSeq accession or some other means using the Entrez service. Does anyone know if this is possible?