Question

Database In Python For Storing Microarray Files

12.3 years ago

pixie@bioinfo ★ 1.5k

Hi, I would like to create a database using Python to hold some microarray data (like Affy CEL files) and use it for further analysis (differential gene expression, etc.). I know that the MySQLdb module is used in Python to create databases, but is it possible to store files in them and retrieve them as and when required? Any suggestions on how to start would be greatly appreciated. Thanks.

python microarray database • 3.5k views

updated 12.3 years ago by quentin.delettre ▴ 440 • written 12.3 years ago by pixie@bioinfo ★ 1.5k

score 7 · Answer 1 · 2013-03-21

You don't need to use Python to upload files to a MySQL database. It is easier and faster to simply use MySQL's "load data local infile" syntax:

load data local infile 'uniq.csv' into table tblUniq fields terminated by ','
enclosed by '"'
lines terminated by '\n'
(uniqName, uniqCity, uniqComments)

(example taken from http://www.tech-recipes.com/rx/2345/import_csv_file_directly_into_mysql/ )
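
If the import does have to happen from inside a Python pipeline, the same idea can be sketched with the standard library. This is only an illustration under assumptions: SQLite (the sqlite3 module) is swapped in for MySQL so that the snippet runs without a server, the CSV content is made up, and the table and column names are simply reused from the example above.

```python
import csv
import io
import sqlite3

# Made-up CSV content standing in for 'uniq.csv' (hypothetical data).
CSV_TEXT = (
    '"probe_1","Paris","first row"\n'
    '"probe_2","Rome","has, a comma"\n'
)


def load_csv(conn, handle):
    """Insert comma-separated, double-quoted rows into tblUniq."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tblUniq "
        "(uniqName TEXT, uniqCity TEXT, uniqComments TEXT)"
    )
    # csv.reader handles the quoting, like "enclosed by" in the SQL version.
    reader = csv.reader(handle, delimiter=",", quotechar='"')
    rows = [tuple(row) for row in reader]
    # Parameterized insert of all rows in a single call.
    conn.executemany(
        "INSERT INTO tblUniq (uniqName, uniqCity, uniqComments) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
n_rows = load_csv(conn, io.StringIO(CSV_TEXT))
stored = conn.execute(
    "SELECT uniqName, uniqCity, uniqComments FROM tblUniq ORDER BY uniqName"
).fetchall()
print(n_rows, stored)
```

With a real file, `io.StringIO(CSV_TEXT)` would be replaced by `open("uniq.csv", newline="")`. For large files the native MySQL statement above is still likely to be faster than row inserts issued from Python.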

Are you sure that you really need Python and MySQL to store your files? During the first year of my PhD I wasted a lot of time writing a SQLAlchemy/Elixir module to handle genotype data in a MySQL database. In the end the module worked perfectly, but I didn't use it much, because I figured out that R and its standard library are much better for this type of application. Moreover, if you are working with Affy files, you can use the Bioconductor affy package, which doesn't have an equivalent in Python.

score 1 · Answer 2 · 2013-03-21

I agree, Bioconductor and the SQL binding libraries seem the best suited to that task. But if you already have a Python pipeline set up, and/or you don't have time to spend learning R, there are some Pythonic solutions for dealing with databases.

If you just want to store the files in a Python binary format so that they are (much!) faster to load afterwards, I believe you're looking for what Python calls "persistence": http://docs.python.org/2/library/persistence.html You can see that there are several modules (including sqlite and other SQL-ish ones) devoted to that task.

Among the non-SQL ones, I think pickle is the best known (and compatible across Python versions), but I personally found marshal easier to deal with. As an example, you create your file with marshal.dump(myObject, file). To load it: myObject = marshal.load(file). The file has to be opened in binary mode, which means file = open("file.pyc", "wb") to create it and file = open("file.pyc", "rb") to load it.
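
A minimal sketch of that dump/load round trip, with pickle and marshal side by side. The dictionary and the file names are hypothetical stand-ins for already-parsed expression data, not a real CEL parser. Keep in mind that marshal only handles built-in types and that its file format is not guaranteed to stay the same between Python versions, so pickle is probably the safer choice for anything kept long term.

```python
import marshal
import os
import pickle
import tempfile

# Hypothetical stand-in for parsed microarray data: probe ID -> intensities.
expression = {"probe_1": [7.2, 7.9, 8.1], "probe_2": [5.0, 4.8, 5.3]}

# Hypothetical file names inside a fresh temporary directory.
tmp_dir = tempfile.mkdtemp()
pickle_path = os.path.join(tmp_dir, "expression.pkl")
marshal_path = os.path.join(tmp_dir, "expression.marshal")

# pickle: the file must be opened in binary mode ("wb" to write, "rb" to read).
with open(pickle_path, "wb") as handle:
    pickle.dump(expression, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open(pickle_path, "rb") as handle:
    from_pickle = pickle.load(handle)

# marshal: same pattern; note that load() takes only the file object.
with open(marshal_path, "wb") as handle:
    marshal.dump(expression, handle)
with open(marshal_path, "rb") as handle:
    from_marshal = marshal.load(handle)

# Both round trips should give back an equal dictionary.
print(from_pickle == expression, from_marshal == expression)
```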

Good luck.

score 1 · Answer 3 · 2013-03-22

12.3 years ago

quentin.delettre ▴ 440

Just to add that there is a question on Stack Overflow that could be interesting for you.
