Extra header column stopping EMMAX (C), but not seen in input files
5.6 years ago
michael.nagle ▴ 100

I'm using a C implementation of EMMAX for GWAS and am getting an error that I can't get to the bottom of because I don't know enough about C. I hope there's a C expert who can take a quick look at this and let me know which of my input files has the problem, and what is causing the error.


emmax -v -d 10 -t [tped/tfam file prefix] -p [phenotype file] -k [kinship matrix] -o [output]

Source code: The actual code can be downloaded here (http://csg.sph.umich.edu//kang/emmax/download/index.html), and equivalent code for a slightly different version that can give the same error is on GitHub (https://github.com/slowkoni/EPI-EMMAX).

There's a problem with the variable nheadercols that leads to the error I will show below.

Input files:

Top 5 rows, 7 columns of .tfam file (tab-delimited):

1   .   0   67  0   0   C
1   .   0   92  0   0   0
1   .   0   95  0   0   0
1   .   0   102 0   0   0
1   .   0   103 0   0   0

The kinship matrix looks as expected, with 1.00 down the diagonal because every genotype has perfect kinship with itself. The numbers of rows and columns match the number of genotypes in the phenotype and .tfam files.
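For anyone who wants to repeat the kinship checks described above programmatically, here is a minimal Python sketch. It assumes the kinship file is a plain whitespace-delimited text matrix; the function names `check_kinship` and `read_matrix` and the tolerance `tol` are made up for illustration and are not part of EMMAX.

```python
def check_kinship(rows, n_expected, tol=1e-6):
    """Return a list of problems found in a kinship matrix given as rows of floats."""
    problems = []
    n = len(rows)
    # The matrix should have one row per genotype.
    if n != n_expected:
        problems.append(f"{n} rows, expected {n_expected}")
    for i, row in enumerate(rows):
        # Every row should be as long as the number of rows (square matrix).
        if len(row) != n:
            problems.append(f"row {i + 1} has {len(row)} columns, expected {n}")
        # Each genotype should have kinship 1.0 with itself.
        elif abs(row[i] - 1.0) > tol:
            problems.append(f"diagonal entry {i + 1} is {row[i]}, expected 1.0")
    return problems


def read_matrix(path):
    """Read a whitespace-delimited numeric matrix (assumed text format)."""
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]
```

Calling `check_kinship(read_matrix(path), 882)` on the kinship file should return an empty list if the matrix is square, has 882 rows, and has 1.0 on the diagonal.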

I've tried phenotype files with and without an extra column of genotype labels (the version shown below includes it as the first column). Everything is tab-delimited. The two genotype IDs in the first two columns are the same for this population.

PhenolabelA1 PhenolabelA2   NA
PhenolabelB1 PhenolabelB2   0
PhenolabelC1 PhenolabelC2       1
PhenolabelD1 PhenolabelD2   0
PhenolabelE1 PhenolabelE2   NA

Standard out:

Reading TFAM file [tfam/tped prefix].tfam ....

Reading kinship file [prefix].kinf...
  882 rows and 882 columns were observed with 0 missing values.

Reading the phenotype file [phenotype input].txt...
ERROR: Number of header columns are 2, but only 1 columns were observed

Thanks much for helping me decipher this!

emmax genomics Cpp plink GWAS • 1.8k views

This probably has nothing to do with the code but with the way the file is formatted. If the file is expected to be tab-delimited, check that spaces haven't been used instead of tabs.


The code works. I'm trying to look at the code to figure out what the problem with input is. The files are tab-delimited text as per EMMAX instructions.


Just to clarify, there's most likely nothing to be learned from the code. It expects tab-delimited input, but your input file is not fully tab-delimited. This is what the error message suggests. The most common cause of this kind of thing is tab characters being replaced by some other whitespace characters. You can check with a Perl one-liner whether your input file is really tab-delimited with the expected number of columns, e.g. checking for 7 columns:

perl -ne '@row = split(/\t/); $n++; $l = @row; if ($l != 7) { print "Line $n has $l columns and is probably not tab-delimited.\n";}; END{print "Done. Checked $n lines.\n";}' input.txt
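The same per-line check can be sketched in Python, which also makes it easy to see the full distribution of column counts rather than a single expected value. This is only an illustration of counting tab-separated fields; it makes no claim about how EMMAX itself parses its input, and the function names are hypothetical.

```python
from collections import Counter


def tab_field_counts(lines):
    """Count how many lines have each number of tab-separated fields."""
    return Counter(len(line.rstrip("\r\n").split("\t")) for line in lines)


def lines_with_wrong_count(lines, expected):
    """Return (line_number, field_count) for lines whose count differs from expected."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Strip the line ending so it is not counted as part of the last field.
        count = len(line.rstrip("\r\n").split("\t"))
        if count != expected:
            bad.append((number, count))
    return bad
```

For example, `lines_with_wrong_count(open("input.txt"), 3)` lists every line of a three-column phenotype file that does not split into exactly three fields. Note that an empty line counts as one field, so a stray blank line at the end of a file is also reported.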

This one-liner says the input files are tab-delimited. I also confirmed this by replacing all tabs with $ and by double-checking the Perl code used to format the tab-delimited input files.
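Replacing tabs with a visible character shows where the tabs are, but it does not reveal other characters that are easy to miss by eye. As a complementary check, the Python sketch below flags carriage returns, byte order marks, blank lines, ordinary spaces, and non-breaking spaces. Whether any of these actually triggers the EMMAX error is an assumption that would need testing; `describe_line` and `scan` are hypothetical helper names.

```python
def describe_line(line):
    """Return notes on characters in one raw line that are easy to miss by eye."""
    notes = []
    body = line.rstrip("\n")
    if body.endswith("\r"):
        notes.append("carriage return (Windows line ending)")
        body = body[:-1]
    if body.startswith("\ufeff"):
        notes.append("byte order mark")
    if not body.strip():
        notes.append("blank line")
    if " " in body:
        notes.append("space character")
    if "\u00a0" in body:
        notes.append("non-breaking space")
    return notes


def scan(path):
    """Print repr() of every suspicious line in a text file, with its notes."""
    # newline="" keeps "\r\n" intact so carriage returns can be detected.
    with open(path, encoding="utf-8", newline="") as handle:
        for number, line in enumerate(handle, start=1):
            notes = describe_line(line)
            if notes:
                print(number, repr(line), "; ".join(notes))
```

Running `scan` on the phenotype file prints nothing if every line is clean; otherwise the `repr()` output shows tabs as `\t` and any other whitespace exactly as it occurs.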


Have you solved it? I encountered the same problem when using the EMMAX software. Thanks very much for any help in solving this problem.

