Question

How can I process .CEL files with Python?

1

9.1 years ago

zero_hsy ▴ 110

Hello,

I am processing raw data from CMAP, much of which is in the .CEL format. I know I can read the values with R, but I would like to process the .CEL files with Python. I have learned that the Biopython package can read CEL data. Could anyone explain in detail how to process .CEL data with Python? Below is my code, but something goes wrong.

from Bio.Affy import CelFile
with open('AGENT_p_NCLE_RNA6_HG-U133_Plus_2_A01_436578.CEL') as handle:
    c = CelFile.read(handle)

print(c)
print(c.ncols, c.nrows)

The result is as follows:

<Bio.Affy.CelFile.Record object at 0x02534730>
(None, None)

What is wrong with my code? Also, R uses a CDF file for this, but the Python code does not. Why?

I would appreciate any help.

cel affymetrix python R cmap • 9.6k views

ADD COMMENT • link updated 23 months ago by Ram 43k • written 9.1 years ago by zero_hsy ▴ 110

0

Your code looks fine, but are you sure the file exists in the same directory and is not empty?

ADD REPLY • link 9.1 years ago by GouthamAtla 12k
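
The check suggested in the reply above can be done from Python before calling the parser. This is only a minimal sketch using the standard library; the helper name `check_input_file` is hypothetical and not part of Biopython.

```python
import os


def check_input_file(path):
    """Return 'missing', 'empty' or 'ok' for the given path."""
    # The path is resolved against the current working directory,
    # which is not necessarily the directory containing the script.
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "ok"
```

Calling `check_input_file('AGENT_p_NCLE_RNA6_HG-U133_Plus_2_A01_436578.CEL')` from the same working directory as the original script would rule out the two simplest explanations.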

0

I am sure the file is in the same directory. The .CEL file has contents and can be read in R.

ADD REPLY • link 9.1 years ago by zero_hsy ▴ 110

0

Can you try the same code on the CEL file given in the BioPython repo?

https://github.com/biopython/biopython/tree/master/Tests/Affy

Here is the download link: affy_v3_example.CEL

ADD REPLY • link updated 23 months ago by Ram 43k • written 9.1 years ago by GouthamAtla 12k

0

I downloaded affy_v3_example.CEL and ran:

from Bio.Affy import CelFile
with open('affy_v3_example.CEL') as handle:
    c = CelFile.read(handle)

print(c)
print(c.ncols, c.nrows)
print(c.intensities)

The result is as follows:

(5, 5)
[[   234.    170.  22177.    164.  22104.]
 [   188.    188.  21871.    168.  21883.]
 [   188.    193.  21455.    198.  21300.]
 [   188.    182.  21438.    188.  20945.]
 [   193.  20370.    174.  20605.    168.]]

That works, but what is the problem with my data?

ADD REPLY • link updated 23 months ago by Ram 43k • written 9.1 years ago by zero_hsy ▴ 110

0

I also opened affy_v3_example.CEL in a text editor. It looks like processed data to me, whereas a CEL file should contain raw probe-level data. My own CEL file shows up as garbled characters.

ADD REPLY • link 9.1 years ago by zero_hsy ▴ 110
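
A note on the "garbled" file described above: CEL files exist in a plain-text flavour (version 3, which starts with `[CEL]`) and in binary flavours, and a binary file looks like garbage in a text editor. Below is a minimal sketch for telling them apart from the first few bytes. The function name `sniff_cel_format` is hypothetical, and the magic numbers (the integers 64 and 4 for version 4, a leading byte 59 for Command Console files) are assumptions based on the published Affymetrix file-format descriptions, so treat this as a quick diagnostic rather than a validator.

```python
import struct


def sniff_cel_format(path):
    """Guess the CEL flavour from the first 8 bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(8)

    # Version 3 is a text file whose first section header is [CEL].
    if head.startswith(b"[CEL]"):
        return "version 3 (plain text)"

    # Assumption: version 4 starts with two little-endian int32 values,
    # a magic number (64) followed by the version number (4).
    if len(head) >= 8:
        magic, version = struct.unpack("<ii", head)
        if magic == 64 and version == 4:
            return "version 4 (binary)"

    # Assumption: Command Console files start with the single byte 59.
    if head[:1] == bytes([59]):
        return "Command Console (binary)"

    return "unknown"
```

If the CMAP file turns out to be binary, a parser that only understands the text layout could plausibly return an empty record such as `(None, None)`. Newer Biopython releases are reported to read version 4 files when the handle is opened in binary mode (`'rb'`), which is worth checking against the documentation of the installed version.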

0

I am also confused by the approach Python takes. A .CEL file contains many probes, so a CDF file should be needed, and R handles this correctly. In Python, however, the CDF does not seem to matter. How can that work?

ADD REPLY • link 9.1 years ago by zero_hsy ▴ 110
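
On the CDF question: a CEL file stores one intensity per cell of the chip grid, so reading it needs no CDF. The CDF only comes in at the next step, when cells are grouped into probe sets and summarized. The sketch below illustrates that division of labour with the 5x5 matrix printed earlier in this thread. The probe-set names, their (row, column) cell lists and the median-of-log2 summary are all made up for illustration; a real layout would come from the chip's CDF file, and real pipelines use more elaborate summaries such as RMA.

```python
import math
from statistics import median

# Intensities as printed for affy_v3_example.CEL (rows x columns).
intensities = [
    [234.0, 170.0, 22177.0, 164.0, 22104.0],
    [188.0, 188.0, 21871.0, 168.0, 21883.0],
    [188.0, 193.0, 21455.0, 198.0, 21300.0],
    [188.0, 182.0, 21438.0, 188.0, 20945.0],
    [193.0, 20370.0, 174.0, 20605.0, 168.0],
]

# HYPOTHETICAL stand-in for a CDF: probe-set name -> list of (row, col) cells.
probe_sets = {
    "probeset_A": [(0, 2), (1, 2), (2, 2)],
    "probeset_B": [(0, 4), (1, 4), (4, 1)],
}


def summarize(intensities, probe_sets):
    """Return the median log2 intensity of each probe set."""
    summary = {}
    for name, cells in probe_sets.items():
        values = [math.log2(intensities[row][col]) for row, col in cells]
        summary[name] = median(values)
    return summary


summary = summarize(intensities, probe_sets)
```

The point of the sketch is that `intensities` alone is what a CEL parser returns, while `probe_sets` is the extra information that R obtains from the CDF.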