Using GEOquery to download multiple expression data series files and then combine them
2.2 years ago
kmyers2 ▴ 40

I need to combine all available expression data for a number of bacterial species. I have written a Python script that downloads all the relevant files from GEO for each species. However, when I look at the different series files, the gene IDs often differ between them. To combine all the data, I need the same gene ID type in every series data set. I know I can download the platform files and link them, but that seems inefficient.

Is there a way to modify a GEO series file so that it includes a specific gene ID type (or all of them), using GEOquery or any other tool you know of, so that after exporting I can easily combine multiple series files into one by joining on the gene IDs?

RNA-Seq expression array NCBI GEO R • 695 views
2.2 years ago

There is no way to do this with GEOquery. Your best option is to 'harmonise' the gene IDs using biomaRt.
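Since you are already scripting in Python, here is a rough sketch of what the harmonising and combining step could look like once you have an ID mapping for each series. It is only an illustration under assumptions: the function names `harmonise` and `combine`, the sample names, and the expression values are all made up, and the mapping dictionaries stand in for whatever table you export from biomaRt or a platform annotation file. It also assumes sample names are unique across series, and it averages rows that collapse onto the same common ID, which may or may not be appropriate for your data.

```python
def harmonise(series, id_map):
    """Re-key one series onto a common gene ID.

    series: {original_gene_id: {sample_name: value}}
    id_map: {original_gene_id: common_gene_id} (hypothetical mapping table)
    Rows without a mapping are dropped; rows that map to the same
    common ID are averaged per sample.
    """
    grouped = {}
    for gene_id, samples in series.items():
        common = id_map.get(gene_id)
        if common is None:
            continue  # no mapping available for this ID
        grouped.setdefault(common, []).append(samples)

    harmonised = {}
    for common, rows in grouped.items():
        harmonised[common] = {
            sample: sum(row[sample] for row in rows) / len(rows)
            for sample in rows[0]
        }
    return harmonised


def combine(series_list):
    """Outer-join harmonised series on the common gene ID.

    A gene missing from one series simply has no entries for that
    series' samples in the combined result.
    """
    combined = {}
    for series in series_list:
        for gene, samples in series.items():
            combined.setdefault(gene, {}).update(samples)
    return combined


# Illustrative data only: one series keyed by gene symbol,
# another keyed by locus tag.
series_a = {"thrL": {"GSM_A1": 8.0}, "thrA": {"GSM_A1": 10.0}}
series_b = {"b0001": {"GSM_B1": 7.5}, "b0003": {"GSM_B1": 9.0}}

# Hypothetical mappings onto locus tags as the common ID.
map_a = {"thrL": "b0001", "thrA": "b0002"}
map_b = {"b0001": "b0001", "b0003": "b0003"}

merged = combine([harmonise(series_a, map_a), harmonise(series_b, map_b)])
for gene in sorted(merged):
    print(gene, merged[gene])
```

Keeping the mapping as a separate step for each series means you can swap in a different common ID type later without touching the code that combines the series.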

Kevin
