Plink: Number of samples drops after merging multiple BED/BIM/FAM files with Plink's --merge-list option
15 days ago
Swetaleena • 0

I have multiple BED/BIM/FAM files (from separate GWAS experiments) and I wish to create a merged set of these files. I am using the command:

plink --bfile myfile1 --merge-list all_my_files.txt --make-bed --out mymerged

all_my_files.txt has the format:

file2.bed file2.bim file2.fam 
 ... 
fileK.bed fileK.bim fileK.fam

The problem is that I am merging files for about 3000 samples (spread over about 34 GWAS runs), but when I do a line count (wc -l) on the .fam file produced by the merge, I get only about 2400. Why is the information for almost 600 samples lost?

--merge-list fam_file Plink • 144 views

Could you have duplicated IDs across your file*.fam files?

How many unique lines do you get when you run:

cat file*.fam | awk '{print $1,$2}' | sort -u | wc -l 
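
To see which IDs are duplicated, rather than only how many unique ones there are, something like the following might help. It is just a sketch: the function name list_duplicate_ids is made up for illustration, and it assumes the family ID and within-family ID sit in the first two whitespace-separated columns, as in a standard .fam file.

```shell
# Hypothetical helper: print each FID/IID pair that occurs more than once
# across the .fam files given as arguments.
list_duplicate_ids() {
    # Columns 1 and 2 of a .fam file are the family ID and within-family ID.
    # uniq -d prints one copy of every line that is repeated in sorted input.
    cat "$@" | awk '{print $1, $2}' | sort | uniq -d
}

# Example usage (assumes the per-run files match file*.fam):
#   list_duplicate_ids file*.fam | wc -l
```

If this prints anything, the same FID/IID pair appears in more than one input file (or twice in one file), and those lines cannot all end up as separate samples in the merged .fam.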

When I run this command on my merged.fam file, I get 2480. But originally I had 34 sets of BED/BIM/FAM files covering 3061 samples, which I merged to get the merged.bed, merged.bim and merged.fam files.


Also, when I run this command in the folder containing all 34 separate .fam files, I get 2957. That means my merged.fam is missing the information for almost 500 samples (2957 - 2480 = 477).
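
One way to find out exactly which samples went missing is to compare the unique FID/IID pairs in the input .fam files against those in the merged .fam. The following is a rough sketch under the same column assumptions as above; the function name missing_from_merged and the temporary file names are placeholders, not anything Plink provides.

```shell
# Hypothetical helper: print FID/IID pairs that are present in the input
# .fam files but absent from the merged .fam.
# Usage: missing_from_merged merged.fam input1.fam input2.fam ...
missing_from_merged() {
    merged=$1
    shift
    tmpdir=$(mktemp -d)
    # Unique, sorted FID/IID pairs from all input files.
    awk '{print $1, $2}' "$@" | sort -u > "$tmpdir/inputs.txt"
    # Unique, sorted FID/IID pairs from the merged file.
    awk '{print $1, $2}' "$merged" | sort -u > "$tmpdir/merged.txt"
    # comm -23 keeps only the lines found in the first file and not the second.
    comm -23 "$tmpdir/inputs.txt" "$tmpdir/merged.txt"
    rm -rf "$tmpdir"
}

# Example usage (file names as described in this thread):
#   missing_from_merged merged.fam file*.fam
```

Checking which of the original files the listed IDs come from may show whether whole runs were dropped or whether samples are missing from every run.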
