Question

Getting copy number variation from .CHP files for REMBRANDT study

4.6 years ago

PleaseDeleteMe • 0

Hi all,

As a preface: I am not very familiar with genetics, genetic data processing, or bioinformatics, as this is not normally the side of the work I'm on. So if some of the terminology I use is wrong, please feel free to correct me, and I apologize in advance! I've also made this post a bit more extensive to include my research so far, hopefully with some useful resources for other people who might run into the same problem in the future.

I'm trying to work with the REMBRANDT data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108474) and to determine specific genetic mutations. For example, some of the mutations that I am interested in are the 1p/19q co-deletion and IDH1/2 mutation status.

I've had a look around these forums and at different tutorials. In most cases the .CEL files are used, as these contain the raw data, and from there the copy number variations (CNVs) are derived to make the final calls on specific mutations. Although this seems to be the best route, unfortunately the .CEL files for the REMBRANDT study have apparently been lost and only .CHP files remain.

From my research and understanding so far, .CHP files are basically processed from the .CEL files and contain less information. However, it should still be possible to get the CNVs from these files, provided that the specific locations you are interested in were retained in the processing step from the .CEL to the .CHP file.

Normally you could use Affymetrix Genotyping Console to go from .CHP to .CNCHP or .LOCHP files, which contain the copy number variation. However, I never got this to work and couldn't find a good tutorial on how it should work. For a lot of .CHP files it reports that they could not be loaded, but not why. The .CHP files were specified as Affymetrix Human Mapping 50K Hind240 SNP Array and Affymetrix Human Mapping 50K Xba240 SNP Array, which I guessed were the two platforms used for analysis. Even though I downloaded the required files for these platforms, such as the library files, it did not load them.

This led me to the Affymetrix Power Tools, which were quite helpful, as they allowed me to convert the .CHP files to raw text files and inspect them. Everything seemed to be in order, although I did not fully understand the file format. I looked around, but could not find a program that was able to read these converted text files for further processing.
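For anyone in the same position, here is a minimal sketch of loading such a converted text file with plain Python. It assumes the export is tab-delimited, that metadata lines start with `#`, and that the first remaining line holds the column names. The column names (`probeset_id`, `call`, `confidence`), the probe IDs, and the function name `read_apt_text` are made up for illustration, so check them against what your files actually contain.

```python
import csv
import io


def read_apt_text(handle, comment_prefix="#"):
    """Parse a tab-delimited text export into a list of dict rows.

    Assumption: metadata lines start with '#', and the first remaining
    line holds the column names. Adjust if your files differ.
    """
    data_lines = (
        line for line in handle
        if line.strip() and not line.startswith(comment_prefix)
    )
    reader = csv.DictReader(data_lines, delimiter="\t")
    return list(reader)


# Tiny made-up example standing in for one converted file.
example = io.StringIO(
    "#%chip_type=Mapping50K_Xba240\n"
    "probeset_id\tcall\tconfidence\n"
    "SNP_A-0000001\tAB\t0.01\n"
    "SNP_A-0000002\tAA\t0.20\n"
)
rows = read_apt_text(example)
print(rows[0]["probeset_id"], rows[0]["call"])
```

For a real file, `open(path, newline="")` can be passed in place of the `io.StringIO` object. All values come back as strings, so numeric columns still need converting.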

I've tried the affxparser package in R, which can read .CHP files. However, from there I'm still stuck, as I couldn't find out how to process these data further.

So ultimately this is where I'm stuck, and I'll try to summarize the problem here:

I have .CHP files from patients, from which I would like to determine the copy number variation at specific genetic loci. The ultimate goal is actually only to get the final 'call', e.g. is IDH1/2 mutated or not, so if this can be achieved while skipping the CNV step, that would be fine as well. How can I analyze these files to get either the copy number variation or the final calls for specific mutations, given that the original .CEL files are not available?
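To make the 1p/19q part of the goal concrete, here is a rough sketch of how an arm-level call could be made once per-probe log2 copy-number ratios are available. It is not an established pipeline, and whether the REMBRANDT .CHP files can provide such ratios is not settled here. The centromere positions are approximate, the `-0.3` loss threshold is an arbitrary placeholder, and the function names and the probe values are all made up. The sketch covers only arm-level copy number and does not address IDH1/2 status.

```python
from statistics import median

# Approximate centromere positions in base pairs (assumption; check them
# against the genome build your probe annotation uses).
CENTROMERE = {"1": 123_000_000, "19": 26_000_000}


def arm_of(chrom, pos):
    """Return an arm label such as '1p' or '19q', or None if not tracked."""
    if chrom not in CENTROMERE:
        return None
    return chrom + ("p" if pos < CENTROMERE[chrom] else "q")


def arm_medians(probes):
    """Median log2 ratio per arm from (chrom, pos, log2_ratio) tuples."""
    by_arm = {}
    for chrom, pos, log2_ratio in probes:
        arm = arm_of(chrom, pos)
        if arm is not None:
            by_arm.setdefault(arm, []).append(log2_ratio)
    return {arm: median(values) for arm, values in by_arm.items()}


def is_codeleted(probes, loss_threshold=-0.3):
    """Flag 1p/19q co-deletion if both arm medians fall below the threshold.

    The threshold is a hypothetical placeholder, not a validated cutoff.
    An arm with no probes is treated as not lost.
    """
    medians = arm_medians(probes)
    return all(medians.get(arm, 0.0) < loss_threshold for arm in ("1p", "19q"))


# Made-up probes: 1p and 19q lost, 1q and 19p near neutral.
probes = [
    ("1", 10_000_000, -0.6), ("1", 50_000_000, -0.5), ("1", 200_000_000, 0.05),
    ("19", 5_000_000, 0.0), ("19", 40_000_000, -0.55), ("19", 55_000_000, -0.45),
]
print(arm_medians(probes))
print(is_codeleted(probes))
```

The median is used instead of the mean so that a few noisy probes do not dominate the arm-level value. A real analysis would normally segment the data first instead of summarizing whole arms directly.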

R snp SNP • 774 views

Updated 4.4 years ago by Biostar 20 • written 4.6 years ago by PleaseDeleteMe • 0

I think I am too late with my answer, but try the rCGH R package.

4.4 years ago by German.M.Demidov ★ 2.9k