illumina gene expression
3
1
Entering edit mode
5.8 years ago
Kritika ▴ 260

Hello all,

I am working with some Illumina microarray data.

I am using GenomeStudio and followed the steps in the user guide, but I have a problem loading the .idat files in the repository tab: when I click on the barcode folder that appears under the Sentrix array, it does not recognize the samples (.idat files). Which files need to be in the folder where my .idat files are saved, and why are my files not being recognized?

genomestudio microarray geneexpression illumina • 2.9k views
0
Entering edit mode

Hi poisonAlien,

I am trying to use your script (AnalyzeBead.R), but I am running into this error. What could be the problem?

result = beadAnalyze(idats = c("4487653088_J_Grn.idat","4487653088_K_Grn.idat","4487653088_L_Grn.idat","4487653151_A_Grn.idat"),
names = c("4487653088_J","4487653088_K","4487653088_L","4487653151_A"),
condition = c("day0","day0","day2","day2"),
ref.condition = "day0", fdr = 0.05, plotPCA = T)

Annotating control probes using package illuminaHumanv3.db Version:1.26.0
Calculating array weights
Array weights

Error in levels<-(*tmp*, value = if (nl == nL) as.character(labels) else paste0(labels,  :
factor level [4] is duplicated


Hope to hear from you.

Cheers

0
Entering edit mode

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

This should be posted as a comment under poisonAlien's answer.

3
Entering edit mode
5.8 years ago
poisonAlien ★ 3.1k

Not sure about GenomeStudio, but if you are comfortable using R, use this script. It takes idat files as input, normalizes them and performs differential expression analysis between two groups (assuming there are no batch effects).

Usage:

source("AnalyzeBead.R")
result = beadAnalyze(idats = c("file1.idat","file2.idat","file3.idat","file4.idat"),
names = c("control1","control2","treated1","treated2"),
condition = c("control","control","treated","treated"),
ref.condition = "treated")
1
Entering edit mode

Just to tack onto poisonAlien's answer: the behaviour you're seeing in GenomeStudio is just a quirk of the software, and I'm sure there was a reason for it once upon a time. You need the IDATs to be in folders separated by chip ID (each folder is named with the chip ID number), and you'll need the SDF files in the folder too. GenomeStudio is not as flexible as Bioconductor methods for analysing microarray data, so I'd second poisonAlien's suggestion to try the analysis in R; you'll get more of an appreciation for what actually happens in a typical differential expression analysis. If you still have trouble with GenomeStudio, I'd suggest you contact Illumina support. You've paid for a licence, so you should make use of the support they provide.
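The folder layout described above (one folder per chip ID) can be set up with a few lines of base R. This is only a sketch: organize_idats is a hypothetical helper name, and it assumes the file names follow the pattern seen in this thread, where the chip ID is the part before the first underscore (e.g. 9666412702_A_Grn.idat). It copies rather than moves the files, and the SDF files still have to be added to each folder by hand.

```r
# Hypothetical helper (not part of GenomeStudio or any package):
# copy IDAT files into one sub-folder per chip ID.
# Assumes names like "<chipID>_<section>_Grn.idat".
organize_idats <- function(src_dir, dest_dir = src_dir) {
  idats <- list.files(src_dir, pattern = "\\.idat$", full.names = TRUE,
                      ignore.case = TRUE)
  # Chip ID = everything before the first underscore in the file name
  chip_ids <- sub("_.*$", "", basename(idats))
  for (i in seq_along(idats)) {
    chip_dir <- file.path(dest_dir, chip_ids[i])
    dir.create(chip_dir, recursive = TRUE, showWarnings = FALSE)
    file.copy(idats[i], file.path(chip_dir, basename(idats[i])))
  }
  # Return the chip IDs that were found, without printing them
  invisible(unique(chip_ids))
}
```

Calling organize_idats("path/to/idats") leaves the originals in place and creates one sub-folder per chip ID next to them.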

0
Entering edit mode
5.8 years ago
Kritika ▴ 260

Hello poisonAlien,

The script you showed above is giving this error:

Error in idatData$Quants[, "CodesBinData"] : subscript out of bounds

Please tell me what this error means and how to fix it.

Thank you

0
Entering edit mode

What platform are you using? Chip ID?

0
Entering edit mode

Currently I am working on one dummy sample.

0
Entering edit mode

You gotta be more specific. That code assumes that you're working on human arrays (to be specific, the HT12 v4 chip, because that's what we use frequently in our lab). If you're using another array, you will need to change the annotation. Do you have replicates? And do you have all the libraries installed (beadarray, limma, illuminaHumanv4.db)?

0
Entering edit mode

Yes, the chip is HT12 v4; I confirmed this with the people who gave me the samples. Yes, all libraries are installed.

0
Entering edit mode

Can you post your command?

0
Entering edit mode

source("Microarray/AnalyzeBead.R")
result = beadAnalyze(idats = c("/dummy_data/Image Data/9666412702/9666412702_A_Grn.idat" , "/dummy_data/Image Data/9666412702/9666412702_B_Grn.idat"),
names = c("control","treated1"),
condition = c("control","control","treated","treated"),
ref.condition = "treated")

Error in [<-.data.frame(*tmp*, , "sampleFac", value = c("control", :
replacement has 4 rows, data has 2

1
Entering edit mode

Ahh! See, you are providing two idat files (one treated and one control), but your condition vector lists two controls and two treated. That's what the error report says. Try:

result = beadAnalyze(idats = c("/dummy_data/Image Data/9666412702/9666412702_A_Grn.idat" , "/dummy_data/Image Data/9666412702/9666412702_B_Grn.idat"),
names = c("control","treated1"),
condition = c("control","treated"),
ref.condition = "treated")

Note: you don't have replicates, so you won't get any p-values.

0
Entering edit mode

Oh!!!!!!!!!!!
Thanks :) poisonAlien, can you please explain this line:

names = c("control","treated1"), condition = c("control","treated"), ref.condition = "treated")

If I have replicates, what command should I use? The same as the one you provided above? I tried to read the source code of the script, but it is beyond my understanding. Thanks.

0
Entering edit mode

What I understand from this command is that names = c("control","treated1") refers to the control and treated objects. Does condition = c("control","treated") handle errors or warnings? And what is ref.condition?

0
Entering edit mode

According to this command:

result = beadAnalyze(idats = c("file1.idat","file2.idat","file3.idat","file4.idat"),
names = c("control1","control2","treated1","treated2"),
condition = c("control","control","treated","treated"),
ref.condition = "treated")

are file1.idat and file2.idat replicates for treated, and file3.idat and file4.idat replicates of control? Am I correct?

As I said, I am working with dummy data. I tried some more samples, and after running this command I got this message:

Annotating control probes using package illuminaHumanv4.db Version:1.26.0
Calculating array weights
Array weight

After typing result, it shows values with these columns:

ID logFC AveExpr t P.Value adj.P.Val B
ILMN_XXXXX

1
Entering edit mode

idats is a vector of your idat files (in the above example there are 4 idat files).

names is the sample names for those idat files (above they are named control1, control2, treated1 and treated2). Yes, they're replicates.

condition is the sample characteristics. The first two are control and the last two are treated. It can be anything based on your experiment (like knockdown, overexpression, etc.).

ref.condition is which one of the conditions to use as the reference. Here I am comparing everything with treated. All up or down genes are with respect to treated samples.

Output is typical limma results. You may want to read the limma manual.
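Since idats, names and condition are parallel vectors (one entry per array), the "replacement has 4 rows, data has 2" error earlier in this thread can be caught before the script is called. Here is a minimal sketch of such a check. check_design is a hypothetical helper and is not part of AnalyzeBead.R; it only encodes the rules described in this thread.

```r
# Hypothetical pre-flight check, not part of AnalyzeBead.R.
# Verifies that the three per-sample vectors line up and that the
# reference condition is actually one of the conditions supplied.
check_design <- function(idats, names, condition, ref.condition) {
  n <- length(idats)
  if (length(names) != n || length(condition) != n) {
    stop(sprintf(
      "idats (%d), names (%d) and condition (%d) must have the same length",
      n, length(names), length(condition)))
  }
  if (!ref.condition %in% condition) {
    stop("ref.condition must be one of the values in condition")
  }
  # Number of arrays per condition; a count of 1 means no replicates,
  # in which case no p-values can be expected (as noted in the thread).
  table(condition)
}
```

For example, check_design(idats, names, condition, "treated") with the four-file call above returns a count of 2 for each condition, while the two-file call with four conditions stops with a length error.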
In short, logFC is the log fold change with respect to control samples, AveExpr is the average expression across all your samples, t is the t-statistic, P.Value is the p-value, adj.P.Val is the FDR, and B is the log-odds of differential expression. There are also other columns, such as probe sequence, probe quality, its locus on the genome, where it lies on the transcript, etc. The script itself is well commented, so you should be able to follow it. However, life will be easier if you know how an ExpressionSet object is represented in Bioconductor and what its slots are. Tomorrow I will update the script with PCA; you can check again.

0
Entering edit mode

Very useful information. Thanks a lot, poisonAlien.

0
Entering edit mode
5.8 years ago
Kritika ▴ 260

Actually, when I first went through the GenomeStudio gene expression manual, I had kept only the idat files in one folder. After I put all the files in the same folder, it worked. I will try Bioconductor for my data as well. Anyway, thanks Andrew and poisonAlien for helping me.

0
Entering edit mode

Hi Kritika,

Like you, I need to do gene expression analysis with IDAT files. Could you please tell me how you did your analysis (workflow and packages)?

Thank you