Question: How to perform GWAS using BOLT-LMM iteratively for many phenotypes in bash
kl10 wrote:

Hello,

Does anyone know/have any code to perform GWAS using BOLT-LMM for many phenotypes iteratively in bash so it is more automated, rather than running a GWAS for each phenotype at a time?

modified 4 months ago • written 5 months ago by kl10
Sam3.3k wrote:

If all your samples have every phenotype, the easiest way is to generate one phenotype file containing all the phenotype information; you can then provide all the phenotype names to --phenoCol. If, however, some of your samples have missing data, you can still generate one phenotype file and then loop over the phenotypes:

pheno=( "A" "B" "C" )
# Iterate over the array elements directly; quoting keeps names with spaces intact.
for p in "${pheno[@]}"; do
    bolt-lmm --phenoCol "${p}" .....;
done

Here you fill in the ..... with the other relevant options.
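As a fuller illustration of the loop above, here is a minimal sketch that gives each phenotype its own stats file and log. The option names are taken from the skeleton command posted later in this thread; the binary path, the output directory `gwas_out`, the `DRY_RUN` switch and the helper names `build_cmd` and `run_all` are assumptions made for the example, not part of BOLT-LMM.

```shell
#!/usr/bin/env bash
set -euo pipefail

BOLT="${BOLT:-../bolt}"         # path to the BOLT-LMM binary (assumption)
DRY_RUN="${DRY_RUN:-1}"         # 1 = print the commands only, 0 = run them
OUT_DIR="${OUT_DIR:-gwas_out}"  # hypothetical output directory
pheno=( "A" "B" "C" )           # column headers in the phenotype file

# Fill the global array `cmd` with the BOLT-LMM command for one phenotype.
build_cmd() {
    local p="$1"
    cmd=(
        "$BOLT"
        --bfile=EUR_subset
        --phenoFile=EUR_subset.pheno.covars
        --phenoCol="$p"
        --covarFile=EUR_subset.pheno.covars
        --covarCol=CAT_COV
        "--qCovarCol=QCOV{1:2}"
        --lmm
        --LDscoresFile=../tables/LDSCORE.1000G_EUR.tab.gz
        --numThreads=2
        --statsFile="$OUT_DIR/$p.stats"
    )
}

# Loop over all phenotypes, printing or running one command per phenotype.
run_all() {
    mkdir -p "$OUT_DIR"
    local p
    for p in "${pheno[@]}"; do
        build_cmd "$p"
        if [ "$DRY_RUN" = "1" ]; then
            echo "${cmd[*]}"
        else
            # One log per phenotype; a failed run does not stop the loop.
            "${cmd[@]}" > "$OUT_DIR/$p.log" 2>&1 \
                || echo "BOLT-LMM failed for $p, see $OUT_DIR/$p.log" >&2
        fi
    done
}

run_all
```

By default the script only prints the three commands so they can be checked first; running it with `DRY_RUN=0` would execute them one after another.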

written 5 months ago by Sam3.3k
kl10 wrote:

Hi Sam,

Thanks for your response. I was wondering if you could also advise on the following, if you have used BOLT-LMM. I have hard-called SNPs in PLINK .bim/.bed/.fam files, with my directly genotyped and imputed SNPs combined in the same files. For example, chr1.bim contains both the directly genotyped and the imputed SNPs.

For the --modelSnps flag, do we provide the .bim files all over again (they went through QC before imputation, so poor-quality SNPs have already been removed)?

Do I need to provide files for the following arguments, given that they are all about dosages? I only had dosages when I downloaded my imputed data from the Michigan server, but I then converted them to PLINK format...

--dosageFile=EUR_subset.dosage.chr17first100 \
--dosageFile=EUR_subset.dosage.chr22last100.gz \
--dosageFidIidFile=EUR_subset.dosage.indivs \
--statsFileDosageSnps=example.dosageSnps.stats \
--impute2FileList=EUR_subset.impute2FileList.txt \
--impute2FidIidFile=EUR_subset.impute2.indivs \
--statsFileImpute2Snps=example.impute2Snps.stats \
--dosage2FileList=EUR_subset.dosage2FileList.txt \
--statsFileDosage2Snps=example.dosage2Snps.stats \

I've pasted the code I would use for my data type below. I would really appreciate your advice. Thanks!

SKELETON OF CODE I WOULD USE:

../bolt \
    --bfile=EUR_subset \
    --remove=EUR_subset.remove \
    --exclude=EUR_subset.exclude \
    --phenoFile=EUR_subset.pheno.covars \
    --phenoCol=PHENO \
    --covarFile=EUR_subset.pheno.covars \
    --covarCol=CAT_COV \
    --qCovarCol=QCOV{1:2} \
    --modelSnps=EUR_subset \
    --lmm \
    --LDscoresFile=../tables/LDSCORE.1000G_EUR.tab.gz \
    --numThreads=2 \
    --statsFile=example.stats

modified 4 months ago • written 4 months ago by kl10

Once you convert the files to PLINK format, the dosage information is lost. As a result, you can use the PLINK files as if you only had genotype data, and the dosage-related arguments are not needed.

written 4 months ago by Sam3.3k

Great, thanks. For the --modelSnps file, can I include all the SNPs, or would it be best to go into each of my pre-processed per-chromosome dosage files and include only those SNPs with a good INFO score?

written 4 months ago by kl10

Filtering will help to remove problematic SNPs, so do try to filter by INFO score first.
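One way to do such a filter is sketched below, assuming a whitespace-delimited, uncompressed per-chromosome info file with a header row, the SNP ID in the first column, and a named quality column (for example `Rsq` in Minimac-style info files). The function name `filter_by_info`, the file names and the 0.8 cut-off in the usage note are hypothetical; check the actual layout of your own files before relying on it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the IDs of SNPs whose imputation quality is at least `min`.
# Arguments: info file, name of the quality column, minimum value.
filter_by_info() {
    local info_file="$1" col_name="$2" min="$3"
    awk -v col="$col_name" -v min="$min" '
        NR == 1 {
            # Locate the quality column by its header name.
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column " col " not found" > "/dev/stderr"; exit 1 }
            next
        }
        # Skip missing values ("-"); the SNP ID is assumed to be in column 1.
        $c != "-" && $c + 0 >= min + 0 { print $1 }
    ' "$info_file"
}
```

For example, `filter_by_info chr1.info Rsq 0.8 > chr1.keep.snps` would write the passing SNP IDs to a list. Such a list is the kind of file --modelSnps takes, provided the IDs match those in the .bim file.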

written 4 months ago by Sam3.3k

Is there a way to create a separate job report for each phenotype?

written 4 months ago by kl10

That depends on your job submission system.
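As one illustration, assuming the scheduler is SLURM (an assumption, since the thread does not say which system is in use), a small generator can write one job script per phenotype, each with its own output and error report. The directory `jobs`, the job names and the helper `write_job` are hypothetical; the BOLT-LMM options are a subset of those in the skeleton command above.

```shell
#!/usr/bin/env bash
set -euo pipefail

JOB_DIR="${JOB_DIR:-jobs}"   # hypothetical directory for the generated scripts
pheno=( "A" "B" "C" )        # column headers in the phenotype file

# Write a SLURM job script for one phenotype. In an unquoted here-document,
# "\\" at the end of a line produces a literal backslash in the output file.
write_job() {
    local p="$1"
    cat > "$JOB_DIR/gwas_$p.sh" <<EOF
#!/usr/bin/env bash
#SBATCH --job-name=gwas_$p
#SBATCH --output=gwas_$p.%j.out
#SBATCH --error=gwas_$p.%j.err
#SBATCH --cpus-per-task=2
../bolt \\
    --bfile=EUR_subset \\
    --phenoFile=EUR_subset.pheno.covars \\
    --phenoCol=$p \\
    --lmm \\
    --LDscoresFile=../tables/LDSCORE.1000G_EUR.tab.gz \\
    --numThreads=2 \\
    --statsFile=$p.stats
EOF
}

mkdir -p "$JOB_DIR"
for p in "${pheno[@]}"; do
    write_job "$p"
    # Submit with: sbatch "$JOB_DIR/gwas_$p.sh"
done
```

SLURM replaces `%j` with the job ID, so each phenotype ends up with separate `.out` and `.err` reports. Other schedulers have their own equivalents of these directives.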

written 4 months ago by Sam3.3k