19 months ago
antoton
I have completed a bunch of GWAS, using the same genotype data each time but with different traits. I now want to make a Manhattan plot illustrating all the results, with all traits in a single plot. For this I need a single "summary statistics" file in which each variant has only the lowest p-value obtained across all the GWAS.
What is the best way of accomplishing this? Does anyone know of any tool that might do this? I'm a bit frustrated because I only needed it for a throwaway figure, but I feel like any script I could write for this would take far too long to run.
Thanks in advance!
It would be handy if you provided the format of the summary file(s) you are trying to further summarize; and whether the variants in the files are all the same, and occur in identical orders.
Thanks for your reply! I realize now that I didn't give much info above. The files are simple text files with the following columns:
rsid chromosome position A1 A2 pval beta tstat n
Every file has the exact same variants in the exact same order. There are about 7.4M variants. Each file is 580MB. I have a few hundred files but I can subset to those showing significant hits to make it more reasonable.
I'm thinking that if I tabix-index each of the files, I might be able to do a lookup across all files for each position in a reasonable time frame. I would gladly hear any opinions or suggestions. Thanks again!
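Since every file has the same variants in the same order, one option that avoids indexing altogether is to read all the files in lockstep and keep, for each variant, the row with the smallest p-value. Below is a minimal Python sketch of that idea, not a tested pipeline. It assumes the files are whitespace-delimited, that `pval` is the sixth column as in the layout above, and that each file starts with a header line (set `has_header=False` if not). The function name `merge_min_pval` and the `PVAL_COL` constant are made up for illustration.

```python
import contextlib

# 0-based index of "pval" in: rsid chromosome position A1 A2 pval beta tstat n
PVAL_COL = 5


def merge_min_pval(paths, out_path, has_header=True):
    """For each variant, write the row from whichever input has the lowest p-value.

    Assumes every input lists the same variants in the same order.
    """
    with contextlib.ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in paths]
        out = stack.enter_context(open(out_path, "w"))

        if has_header:
            # Consume one header line per file; keep the first as the output header.
            headers = [f.readline() for f in files]
            out.write(headers[0])

        # zip() advances all files one line at a time, so memory use stays small.
        for lines in zip(*files):
            rsid = lines[0].split(None, 1)[0]
            best_line, best_p = lines[0], float("inf")
            for line in lines:
                fields = line.split()
                if fields[0] != rsid:
                    raise ValueError(f"variant order differs: {fields[0]} vs {rsid}")
                try:
                    p = float(fields[PVAL_COL])
                except ValueError:
                    continue  # non-numeric p-value such as "NA"
                if p < best_p:
                    best_p, best_line = p, line
            # If no file had a usable p-value, the first file's row is kept.
            out.write(best_line)
```

Calling it would look like `merge_min_pval(["trait1.txt", "trait2.txt"], "min_pval.txt")`, where the file names are placeholders. Each input is read once, sequentially, so the cost should scale with the total size of the inputs. With a few hundred files open at once you may run into the per-process open-file limit (`ulimit -n` on Linux/macOS); in that case, merging in batches and then merging the batch outputs gives the same result.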