I'm trying to access the complete NCBI tissue expression dataset. For any individual gene, NCBI provides an
expression chart showing RNA-seq counts across tissues. For example, at https://www.ncbi.nlm.nih.gov/gene/6304 you can see higher expression in brain and lymph nodes.
I contacted NCBI, and they told me the data for all genes is available here: https://ftp.ncbi.nih.gov/gene/DATA/expression/
The data is a giant XML file formatted for loading into Apache Solr, and they provide a schema file to help read it.
However, my first attempt at loading the data into Solr failed completely. Has anyone set up scripts for loading and querying this data?
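
For context, one fallback that skips Solr entirely would be to stream the XML with Python's standard library. The rough sketch below assumes the file follows Solr's usual XML update layout (`<add>` wrapping `<doc>` elements that contain `<field name="...">` children), which would need checking against the actual NCBI file and schema. The function name `iter_solr_docs` and the field names in the sample (`gene_id`, `tissue`) are made up for illustration and are not taken from the NCBI data.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def iter_solr_docs(source):
    """Yield one dict per <doc>, mapping field name -> list of string values.

    `source` is a filename or a binary file object. The file is streamed,
    so it does not have to fit in memory.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "doc":
            fields = defaultdict(list)
            # Only direct <field> children; a field may repeat (multi-valued).
            for field in elem.findall("field"):
                fields[field.get("name")].append((field.text or "").strip())
            yield dict(fields)
            root.clear()  # drop already-processed docs to keep memory flat


# Tiny made-up sample in the assumed layout (hypothetical field names).
SAMPLE = b"""<add>
  <doc>
    <field name="gene_id">6304</field>
    <field name="tissue">brain</field>
  </doc>
  <doc>
    <field name="gene_id">6304</field>
    <field name="tissue">lymph node</field>
  </doc>
</add>"""

docs = list(iter_solr_docs(io.BytesIO(SAMPLE)))
```

With the real file, the call would be something like `iter_solr_docs("path/to/downloaded.xml")`, and each yielded dict could be written to a CSV or a SQLite table for querying. If the actual element or attribute names differ from the assumed layout, the tag checks would need adjusting.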