3.7 years ago

dp0b

Hi,

I have a trait that was measured using an assay, but a large proportion of the samples were below the assay's detection threshold, so my phenotype isn't normally distributed and transformation (log, sqrt, Box-Cox) isn't successful. Is it better, then, to treat the data as continuous for GWAS, or as categorical (high/low)? The only issue with high/low is that there would be individuals with phenotype values close to the cut-off between the two groups. Advice would be much appreciated. The distribution is below.

I was hoping to use the GCTA software for the GWAS.

Thanks

Can you show the distribution here? - you can share images by uploading and obtaining a URL from here: https://imgbb.com/

How are you aiming to conduct the analysis - PLINK or SAS or R or ... ? If the distribution follows a Poisson or Inverse Gaussian, you would be able to select that in SAS or R. Cannot confirm if it's possible with PLINK.

Thanks for getting back to me. I have edited the question and shown the distribution as you suggested. I was hoping to do the GWAS using GCTA, but I can use SAS to transform the data.

Thanks for that. So, it looks like an Inverse Gaussian. You're using the genotypes to predict the phenotype, I imagine?

I'm not a SAS programmer (more R), so I don't know the exact way to code for an Inverse Gaussian, but I'm sure that it's not difficult.

In R, it would be something like:
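
The snippet from the original reply is not preserved here, so what follows is only a minimal sketch of what such a model might look like, not the code originally posted. It fits a single-variant generalised linear model with `glm()` and the `inverse.gaussian` family from base R. The data are simulated, the names `phenotype`, `genotype` and `dat` are hypothetical, and the log link is an assumption chosen for numerical stability (the family's default link is `1/mu^2`). The Inverse Gaussian family needs strictly positive phenotype values, so zeros or values recorded as below the detection limit would have to be handled first.

```r
# Minimal sketch on simulated data; all object names are hypothetical.
set.seed(1)
n <- 500

# Genotype coded as minor-allele count (0, 1 or 2)
genotype <- rbinom(n, size = 2, prob = 0.3)

# Right-skewed, strictly positive phenotype. A gamma draw is used purely as a
# stand-in for real assay values; its mean is exp(0.5 + 0.2 * genotype).
phenotype <- rgamma(n, shape = 2, rate = 2 / exp(0.5 + 0.2 * genotype))

dat <- data.frame(phenotype = phenotype, genotype = genotype)

# Inverse Gaussian GLM; log link is an assumption (default is "1/mu^2")
fit <- glm(phenotype ~ genotype,
           family = inverse.gaussian(link = "log"),
           data = dat)

# Per-variant effect estimate, standard error, t value and p-value
summary(fit)$coefficients
```

In a real analysis this would be repeated for each variant, with covariates such as age, sex and principal components added to the formula. Note that a plain `glm()` does not account for relatedness through a GRM.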

I've put parallelised code for running these models in R on my GitHub page, if you wanted to try that out. Go to my profile here on Biostars and get the link. It's also here: R functions edited for parallel processing

Kevin

Thanks for your help. I'll have a look, but my preference is a mixed model that fits the GRM.