Merging bed/bim/fam files for chromosomes 1 to 22 in one step
3.3 years ago
khn ▴ 130

Hello,

I am trying to merge multiple bed/bim/fam filesets (each chromosome has a separate fileset) into one.

plink --allow-no-sex --bfile XX.chr1 --merge-list mylist.txt --make-bed --out XX.ch1-22

This does not work.

mylist.txt contains:

XX.chr2.bed XX.chr2.bim XX.chr2.fam
XX.chr3.bed XX.chr3.bim XX.chr3.fam
XX.chr4.bed XX.chr4.bim XX.chr4.fam
...
XX.chr22.bed XX.chr22.bim XX.chr22.fam

Thank you in advance!

SNP genome snp • 8.5k views

Is plink giving you an error? Otherwise, please define "does not work".


Yes, it says "cannot open the file XX.chr2.fam". What do you think is wrong?


Does that file exist?


Yes. I changed the list from a .txt file to a .csv file, but it still says "Error: Failed to open XX.chr2.fam."


The message is below. What do you think is wrong?

*** MB RAM detected; reserving *** MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (*** variants, *** people).
--file: XX.ch1-22-temporary.bed + XX.ch1-22-temporary.bim + xx.ch1-22-temporary.fam written.
Error: Failed to open XX.ch2.fam.
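
A "Failed to open" error usually means the name in the merge list does not resolve to a readable file from the directory where plink is run. Note that the error above names XX.ch2.fam, while the list in the question shows XX.chr2.fam; a mismatch like that would produce this error. The following is a minimal sketch, not part of the original thread, that checks every entry of a merge list before plink is run. The function name check_merge_list is hypothetical, and it assumes whitespace-separated file names on each line, as in the question.

```shell
#!/usr/bin/env bash
# Sketch: report merge-list entries that are missing or unreadable.
# Assumes each line holds whitespace-separated file names, as in the question.
check_merge_list() {
    local list="$1" missing=0 line f
    while read -r line || [ -n "$line" ]; do
        # Intentionally unquoted: split the line into its file names.
        for f in $line; do
            if [ ! -r "$f" ]; then
                echo "missing or unreadable: $f"
                missing=1
            fi
        done
    done < "$list"
    # Exit status 0 if every file was found, 1 otherwise.
    return "$missing"
}

# Usage, from the directory where plink will be run:
# check_merge_list mylist.txt
```

Names are resolved relative to the current directory, so the check should be run from the same place as the plink command.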

@khn How did you solve this problem? I am also new to plink. I am facing the same problem. How can we merge multiple binary files of different chromosomes using plink?


I have a similar problem, but it seems to be a different one.

After running the command described on this page, only

XX.ch1-22-merge.fam
XX.ch1-22-merge.missnp
XX.ch1-22.log

were generated; XX.ch1-22-merge.bed and XX.ch1-22-merge.bim were not.

Is there a solution?
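
As far as I know, PLINK 1.9 writes a .missnp file when a binary merge fails because at least one variant would end up with more than two alleles; strand flips and duplicated variant IDs are common causes. One possible way forward, sketched here under assumptions, is to exclude the listed variants from every per-chromosome fileset and then merge the cleaned filesets. The function name exclude_missnp, the ".clean" suffix and cleanlist.txt are hypothetical; the XX.chr prefixes follow the question, and only the standard --bfile, --exclude, --make-bed, --out and --merge-list options are used.

```shell
#!/usr/bin/env bash
# Sketch: drop the variants listed in a .missnp file from each per-chromosome
# fileset and write a new merge list of the cleaned prefixes.
# Assumptions: filesets are named XX.chr1 .. XX.chr22 (as in the question);
# the ".clean" suffix is a made-up naming choice.
PLINK="${PLINK:-plink}"   # set PLINK to use a different binary

exclude_missnp() {
    local missnp="$1" list_out="$2" i
    : > "$list_out"       # create the list, or empty it if it already exists
    for i in $(seq 1 22); do
        "$PLINK" --bfile "XX.chr$i" --exclude "$missnp" \
                 --make-bed --out "XX.chr$i.clean" || return 1
        echo "XX.chr$i.clean" >> "$list_out"
    done
}

# Usage, followed by a fresh merge of the cleaned filesets:
# exclude_missnp XX.ch1-22-merge.missnp cleanlist.txt
# plink --merge-list cleanlist.txt --make-bed --out XX.ch1-22
```

Excluding discards those variants entirely. If the cause is a strand flip, flipping the affected fileset may be the better fix, so it is worth inspecting the .missnp list first.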

17 months ago
Shicheng Guo ★ 8.7k

Here is the solution:

Be careful: before you merge different PLINK files, check for duplicate SNPs and remove them. You should also remove non-ACGT variants (for example, entries coded as .) before merging. Suppose you have chr1, chr2, ... and want to merge them:

chr1.bed
chr1.bim
chr1.fam
chr2.bed
chr2.bim
chr2.fam
....


rm mergelist.txt

for i in {1..22}
do
echo chr$i >> mergelist.txt
done

plink --merge-list mergelist.txt --make-bed --out G1000plink
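
The duplicate check mentioned at the top of this answer can be done on the .bim files directly, since the second column of a .bim file holds the variant ID. This is a minimal sketch rather than the answer author's own method; find_duplicate_ids and duplicate_ids.txt are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: print every variant ID that occurs more than once across the
# given .bim files. Column 2 of a .bim file is the variant ID.
find_duplicate_ids() {
    awk '{ print $2 }' "$@" | sort | uniq -d
}

# Usage: save the duplicated IDs, then pass the file to plink's --exclude
# when rewriting each fileset.
# find_duplicate_ids chr*.bim > duplicate_ids.txt
```

Placeholder IDs such as . show up here when they are shared by many variants. Excluding by ID removes every copy of a duplicated ID, not just the extras.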

Why the rm? OP's commands do not show an existing file by that name.


What is in there that justifies your rm command?


Hi Ram, I believe the "rm mergelist.txt" is there to remove mergelist.txt in case that file already exists. The loop appends to the file, so the code needs mergelist.txt either to be absent or to be empty. Running "rm mergelist.txt" first makes sure the file doesn't exist before the rest of the code runs.


If mergelist.txt does not exist, the rm can produce anything from a warning to a script failure, depending on other settings. Either test for the file before removing it, or use something like

seq 1 22 | xargs -I _num echo chr_num >mergelist.txt

so that it overwrites any existing content. My preference is parallel > xargs > for, and scripting a careless non-interactive rm can be dangerous.
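
The same overwrite-instead-of-append idea can also be written without xargs. In this minimal sketch, printf reuses its format for each argument, and a single > redirection replaces whatever the file held before. The function name write_merge_list is hypothetical; the chr prefixes follow the answer above.

```shell
#!/usr/bin/env bash
# Sketch: write chr1 .. chr22, one prefix per line, replacing any old content.
write_merge_list() {
    # printf repeats the format once per argument supplied by seq.
    printf 'chr%s\n' $(seq 1 22) > "$1"
}

# Usage:
# write_merge_list mergelist.txt
```

Because the file is truncated on every run, no rm is needed and rerunning the script cannot leave stale or doubled entries.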

21 months ago
mike313177 ▴ 20

Hi, it seems I just solved my own problem with this. It may be worth checking the file paths in your merge-list text file: make sure each entry really points to the corresponding bed/bim/fam file, giving the complete path to it and not just the file name. Hope it helps. Xiao
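
One way to follow this advice is to generate the merge list with absolute prefixes so that it works from any working directory. This is a minimal sketch under assumptions: the function name write_abs_merge_list is hypothetical, and the filesets are assumed to be named chr1 to chr22 inside a single directory.

```shell
#!/usr/bin/env bash
# Sketch: write a merge list of absolute fileset prefixes.
# Assumes filesets chr1 .. chr22 live in the directory given as $1.
write_abs_merge_list() {
    local dir i
    dir=$(cd "$1" && pwd) || return 1   # resolve $1 to an absolute path
    for i in $(seq 1 22); do
        echo "$dir/chr$i"
    done > "$2"
}

# Usage:
# write_abs_merge_list /path/to/plink/files mergelist.txt
```

The function fails early if the directory does not exist, which catches a wrong path before plink is started.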
