6.3 years ago
kakukeshi
Hi guys,
Is there any way to split the bed, bim and fam files by a specific number of SNPs? Let's say the files for chr1 contain 86,000 SNPs. I would like to generate 86 sets of files with 1,000 SNPs each.
Many thanks
You can extract the SNPs of interest using the --extract flag, looping over every 1,000 SNPs.
Thanks, that's exactly what I'm doing. I created x files with different subsets of SNPs and then extracted them using the --extract flag.
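For anyone landing here later, the --extract loop described above could be sketched roughly like this in Python. This is only a sketch under some assumptions: the variant IDs are taken from column 2 of the .bim file, and the function names, the chunk_NNNN.snplist naming, and the output prefixes are made up for illustration. The PLINK flags used (--bfile, --extract, --make-bed, --out) are the standard ones, but the command is only built here, not run.

```python
from pathlib import Path


def write_snp_chunks(bim_path, out_dir, chunk_size=1000):
    """Write variant IDs from a .bim file into lists of at most chunk_size IDs.

    Hypothetical helper: file names like chunk_0001.snplist are an assumption.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(bim_path) as bim:
        # Column 2 of a .bim file holds the variant ID.
        snp_ids = [line.split()[1] for line in bim if line.strip()]
    chunk_files = []
    for start in range(0, len(snp_ids), chunk_size):
        chunk_file = out_dir / f"chunk_{start // chunk_size + 1:04d}.snplist"
        chunk_file.write_text("\n".join(snp_ids[start:start + chunk_size]) + "\n")
        chunk_files.append(chunk_file)
    return chunk_files


def plink_extract_command(bfile_prefix, chunk_file):
    """Build (but do not run) the PLINK call that writes one chunk as bed/bim/fam."""
    out_prefix = str(Path(chunk_file).with_suffix(""))
    return ["plink", "--bfile", bfile_prefix, "--extract", str(chunk_file),
            "--make-bed", "--out", out_prefix]
```

Each returned command can then be passed to subprocess.run(cmd, check=True), or handed to whatever job scheduler is in use, so that every chunk ends up as its own bed/bim/fam set.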
Please use tags appropriately. That way, experts can quickly find your question. In this case "plink" would have been relevant, so I have edited your post to add it. Please keep this in mind for future posts.
If you have one SNP per line, you can use the split command in Linux. You can split the file either by the number of lines per output file or by the number of output files you want.
If I'm not mistaken, some of these plink files are binary. Would that work with split?
I am not sure if binary files can be split with split. I confused bed with binary ped. @WouterDeCoster
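To make the binary-file concern concrete: a PLINK binary .bed file starts with the two magic bytes 0x6c 0x1b, followed by a mode byte, and in the usual SNP-major layout each variant then takes ceil(n_samples / 4) bytes (2 bits per genotype). There are no line breaks to split on, so a line-based split only makes sense for the text .bim file. A small sketch of checking this in Python follows; the function names are invented for illustration.

```python
PLINK_BED_MAGIC = bytes([0x6C, 0x1B])  # first two bytes of a PLINK binary .bed


def is_plink_binary_bed(path):
    """Return True if the file starts with the PLINK .bed magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == PLINK_BED_MAGIC


def bed_bytes_per_variant(n_samples):
    """Bytes used per variant in a SNP-major .bed: 2 bits per sample, rounded up."""
    return (n_samples + 3) // 4
```

Because the genotype data is a fixed-width byte block per variant rather than a line per SNP, letting PLINK itself write the subsets with --extract avoids having to handle the header and block boundaries by hand.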
What is the next step, what is the purpose? I worked with millions of SNPs per chromosome and never really hit the wall using plink.
The reason is that I'm running a software called "Trinculo" to perform GWAS on an ordinal variable with multiple categories and the calculations are slow, so I need to split the files into chunks to parallelize the calculation.
There are tools/scripts to split chromosome-wise. Can you not use them, kakukeshi?
Out of curiosity, why would someone ask how to do something if they already knew how to do it or which tools to use?
Also, the OP is not asking how to split chromosome-wise; they're asking how to split SNP-wise.