I feel like I'm missing something very trivial, but how can I automatically extract the most complete information on genes' chromosomal locations using ENTREZID (preferably in R)? I specifically have problems with uncharacterized loci (and sometimes with ordinary genes as well). When I use typical approaches in R, I get NA for those loci (e.g. when extracting the information from org.Hs.eg.db, v. 3.7.0). For example, this gene: https://www.ncbi.nlm.nih.gov/gene/?term=LOC105370787
Not a solution in R, but you can use the NCBI Unix utilities to get this information.
$ efetch -db gene -id LOC105370787

1. LOC105370787
uncharacterized LOC105370787 [Homo sapiens (human)]
Chromosome: 15; Location: 15q15.1
Annotation: Chromosome 15 NC_000015.10 (40075943..40083225, complement)
ID: 105370787
If you want tab-delimited output that can be easily imported into R, you can use xtract, another tool from the Entrez Direct package, as follows:
$ esearch -db gene -query LOC105370787 | esummary | \
    xtract -pattern DocumentSummary -element Id,Name \
           -group GenomicInfoType -element ChrAccVer,ChrStart,ChrStop

105370787	LOC105370787	NC_000015.10	40083224	40075942
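Note that the ChrStart and ChrStop values in the xtract output are each one less than the coordinates in the efetch text report (40075943..40083225), and that they appear in reversed order for this complement-strand gene. A minimal sketch of how the line could be normalised before importing it into R, assuming that ChrStart/ChrStop are 0-based and that ChrStart > ChrStop indicates the complement strand (both inferred from comparing the two outputs in this thread, not from documentation). The function name `to_one_based` is hypothetical:

```shell
# Sample line copied from the xtract output above (tab-delimited).
line=$(printf '105370787\tLOC105370787\tNC_000015.10\t40083224\t40075942')

# Convert to 1-based, ordered coordinates and add a strand column.
# Assumption: ChrStart/ChrStop are 0-based and ChrStart > ChrStop
# means the gene lies on the complement (minus) strand.
to_one_based() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    strand = ($4 > $5) ? "-" : "+"
    lo = ($4 < $5 ? $4 : $5) + 1   # smaller coordinate, shifted to 1-based
    hi = ($4 < $5 ? $5 : $4) + 1   # larger coordinate, shifted to 1-based
    print $1, $2, $3, lo, hi, strand
  }'
}

result=$(printf '%s\n' "$line" | to_one_based)
printf '%s\n' "$result"
```

For the sample line, this gives the range 40075943 to 40083225 on the minus strand, which agrees with the efetch report. In practice you would pipe the xtract output straight into the function and redirect the result to a file that R can read with read.delim().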
There's additional information in the XML output of esummary that may be of interest to you.