Hi all,
I am working on a large dataset that was genotyped on an Affymetrix SNP array. I have never worked with Affymetrix genotyping data before. What I want to do is identify copy number variations (CNVs) on chromosome 10 in n = 120,000 individuals.
Let's say I have SNP intensity data (chromosome 10) from the Affymetrix platform, as shown below:
Person   Sex  SNP1_A  SNP1_B   SNP2_A   SNP2_B  ...  SNP30385_A  SNP30385_B
Person1  F    505.87  2824.05  305.99   810.14  ...  2322.44     380.69
Person2  M    578.64  2541.85  218.58   968.26  ...  375.97      2329.70
Person3  F    663.55  1539.48  4498.34  968.96  ...  1334.80     1726.83
.
.
.
Person120000
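To make the layout concrete, here is a minimal Python sketch that flips a table like the one above into one row per SNP allele and one column per person. I am assuming (not confirmed from the PennCNV documentation) that the PennCNV-Affy scripts want allele-level rows rather than person-level rows. The function name `wide_to_probeset_rows` and the `SNP1-A` style identifiers are made up for illustration only.

```python
def wide_to_probeset_rows(lines):
    """Transpose a person-by-SNP intensity table into allele-by-person rows.

    `lines` is a list of whitespace-separated strings: a header line
    (Person, Sex, then <SNP>_A / <SNP>_B columns) followed by one line
    per person.
    """
    header = lines[0].split()
    allele_cols = header[2:]  # skip the Person and Sex columns
    samples = []
    columns = {name: [] for name in allele_cols}

    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}")
        samples.append(fields[0])
        for name, value in zip(allele_cols, fields[2:]):
            columns[name].append(float(value))

    # First output row is the header; then one row per SNP allele.
    out = [["probeset_id"] + samples]
    for name in allele_cols:
        snp, allele = name.rsplit("_", 1)
        # Hypothetical ID scheme: real arrays use their own probeset IDs.
        out.append([f"{snp}-{allele}"] + columns[name])
    return out


# First two SNPs of the three example individuals shown above.
example = [
    "Person Sex SNP1_A SNP1_B SNP2_A SNP2_B",
    "Person1 F 505.87 2824.05 305.99 810.14",
    "Person2 M 578.64 2541.85 218.58 968.26",
    "Person3 F 663.55 1539.48 4498.34 968.96",
]
table = wide_to_probeset_rows(example)
for row in table:
    print("\t".join(str(v) for v in row))
```

This holds everything in memory, so for 120,000 people it would need to be done in chunks of SNP columns rather than in one pass.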
One way to convert Affymetrix intensity data into LRR/BAF (Log R Ratio / B Allele Frequency) data is to use the PennCNV software. I have read the PennCNV-Affy guidelines for Axiom arrays, but I could not work out which step I should start from…
http://penncnv.openbioinformatics.org/en/latest/user-guide/affy/
- Substep 1.4, LRR and BAF calculation:
I suspect this is where I have to start in order to obtain the LRR/BAF data. However, when I ran the script normalize_affy_geno_cluster.pl on the raw intensity data above, my file turned out not to be in the format that script expects as input.
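For context on what I am trying to end up with, here is a rough Python sketch of the polar transform that is commonly described for two-channel SNP intensities: total intensity R = A + B and allelic angle theta = (2/pi) * arctan(B/A). This is not LRR/BAF itself and not what PennCNV-Affy does internally. As far as I understand, LRR compares the observed R with the R expected from the genotype clusters, and BAF is theta rescaled against the cluster centres, so both need per-SNP cluster information that I do not have here. The function name `polar_transform` is my own.

```python
import math


def polar_transform(a, b):
    """Return (R, theta) for one SNP's A and B allele intensities.

    R is the summed intensity; theta is scaled to [0, 1] for
    non-negative inputs, with 0 meaning pure A signal and 1 pure B.
    """
    r = a + b
    # atan2 equals arctan(b / a) for positive a and avoids dividing by zero.
    theta = (2.0 / math.pi) * math.atan2(b, a)
    return r, theta


# SNP1 for Person1 in the example table: B signal dominates.
r, theta = polar_transform(505.87, 2824.05)
print(f"R = {r:.2f}, theta = {theta:.3f}")
```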
Can someone suggest a more direct way to obtain LRR/BAF data from Affymetrix intensity data? If the PennCNV-Affy workflow is the best option, which input files do I need in order to run it? Any help would be very much appreciated, and I am sorry if this is confusing to read. Thank you.
Best wishes, Nikman