How to use Python to process .CEL files?
7.9 years ago
zero_hsy ▴ 110

Hello:

I am processing raw data from CMAP, much of which is stored in the .CEL format. I know I can read the values in R, but I want to process the .CEL data in Python. I have learned that the Biopython package can read CEL data. Does anyone know in detail how to process .CEL data in Python? Below is my code, but something is wrong with it.

from Bio.Affy import CelFile

with open('AGENT_p_NCLE_RNA6_HG-U133_Plus_2_A01_436578.CEL') as handle:
    c = CelFile.read(handle)  # parse the CEL file into a Record object

print(c)
print(c.ncols, c.nrows)


The result is as follows:

<Bio.Affy.CelFile.Record object at 0x02534730>
(None, None)


What is wrong with my code? Also, R uses the CDF file when processing CEL data, but the Python code does not. Why?

I would appreciate any help.

cel affymetrix python R cmap • 8.3k views

Your code looks fine, but are you sure the file exists in the same directory and is not empty?


I am sure the file is in the same directory. The .CEL file has contents and can be read in R.


Can you try the same code on the example CEL file in the Biopython repo?

https://github.com/biopython/biopython/tree/master/Tests/Affy


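from Bio.Affy import CelFile

with open('affy_v3_example.CEL') as handle:
    c = CelFile.read(handle)  # parse the example CEL file into a Record object

print(c)
print(c.ncols, c.nrows)
print(c.intensities)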


The result is as follows:

(5, 5)
[[   234.    170.  22177.    164.  22104.]
[   188.    188.  21871.    168.  21883.]
[   188.    193.  21455.    198.  21300.]
[   188.    182.  21438.    188.  20945.]
[   193.  20370.    174.  20605.    168.]]


That works fine, so what is the problem with my data?


I have also opened affy_v3_example.CEL. I think it contains processed data, because a CEL file should hold raw probe-level data. My own CEL file appears as garbled characters when I open it.
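
A CEL file that looks unreadable when opened is often the binary variant of the format, not a damaged file. The text (version 3) variant starts with a `[CEL]` section header. As far as I know, the binary version 4 variant starts with the little-endian integers 64 and 4, and the newer Command Console variant starts with the bytes 59 and 1. Below is a minimal sketch that checks those leading bytes using only the standard library. The function name `detect_cel_format` and the returned labels are made up for illustration and are not part of Biopython.

```python
import struct


def detect_cel_format(path):
    """Guess the CEL format variant from the first bytes of the file.

    Hypothetical helper for illustration; the labels returned here are
    arbitrary strings, not values defined by any library.
    """
    with open(path, "rb") as handle:
        head = handle.read(8)

    # Version 3 files are plain text and begin with a "[CEL]" section header.
    if head.startswith(b"[CEL]"):
        return "version 3 (text)"

    # Version 4 files begin with two little-endian 32-bit integers:
    # a magic number (64) followed by the version number (4).
    if len(head) >= 8:
        magic, version = struct.unpack("<ii", head)
        if magic == 64 and version == 4:
            return "version 4 (binary)"

    # Command Console files begin with the single bytes 59 and 1.
    if len(head) >= 2 and head[0] == 59 and head[1] == 1:
        return "Command Console (binary)"

    return "unknown"
```

Calling `detect_cel_format('AGENT_p_NCLE_RNA6_HG-U133_Plus_2_A01_436578.CEL')` on the file from the question would show which variant it is. If it turns out to be binary, it may be worth checking whether the installed Biopython version supports that variant and whether the file needs to be opened in binary mode (`'rb'`).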


I am also confused by the approach Python uses. A .CEL file contains many probes, so I would expect a CDF file to be required, and that is how it works in R. In Python, however, the CDF does not seem to be needed. How is that possible?
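
On the CDF question: reading a CEL file only yields a grid of per-cell intensities, and that step needs no CDF. The CDF comes in later, when the cells are grouped into probe sets and summarized. Here is a rough sketch of that second step, using the 5 x 5 grid printed for affy_v3_example.CEL earlier in this thread. The `probe_sets` layout is entirely hypothetical (a real CDF supplies the actual cell coordinates), the row/column index order is an assumption, and the median is only a stand-in for a real summarization method.

```python
from statistics import median

# 5 x 5 intensity grid printed for affy_v3_example.CEL earlier in this thread.
intensities = [
    [234.0, 170.0, 22177.0, 164.0, 22104.0],
    [188.0, 188.0, 21871.0, 168.0, 21883.0],
    [188.0, 193.0, 21455.0, 198.0, 21300.0],
    [188.0, 182.0, 21438.0, 188.0, 20945.0],
    [193.0, 20370.0, 174.0, 20605.0, 168.0],
]

# HYPOTHETICAL layout: probe-set name -> list of (row, column) cells.
# In practice this mapping comes from the CDF file for the chip type.
probe_sets = {
    "probeset_A": [(0, 2), (1, 2), (2, 2)],
    "probeset_B": [(0, 4), (1, 4), (2, 4)],
}


def summarize(grid, layout):
    """Return one summary value per probe set (median of its cells)."""
    return {
        name: median(grid[row][col] for row, col in cells)
        for name, cells in layout.items()
    }


summary = summarize(intensities, probe_sets)
print(summary)
```

So the Biopython reader can return `c.intensities` without a CDF, but turning those raw cell values into per-probe-set expression values still needs the probe-set layout from somewhere.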