ADMIXTURE: "Invalid chromosome code. Use integers!"
7.0 years ago
beneficii ▴ 60

I'm trying to use admixture but now I have a problem. It tells me that I need to use integers for chromosomes, but does not tell me how I might do that. The same with the admixture guide; it only tells me I need to use numbers, but doesn't tell me how I can change the files to do that.

I have the plink.bed, plink.bim, and plink.fam files produced by the plink program, which were made off the zipped VCF file I received from PGP. They do seem to use a "chr#" naming system, where # is replaced with the autosomal number, or X, Y, or M (for mitochondrial DNA), but I'm not sure how to change that.

What would be the best way of resolving this issue?

admixture genome • 9.0k views

My first guess would be to change chr1 to 1, chrX to 23, chrY to 24, and chrM to 26 (in PLINK's numeric coding, 25 is reserved for the XY pseudo-autosomal region).


I agree, but where? I tried opening the VCF file, unzipped in Wordpad, but it was a binary file, so no go there. Even with a hex editor, I kinda don't wanna mess with a binary file unless I'm really sure of its format.

Other than that, I don't know where I can change the chromosomes' designations.

Do you know?


You can change that in the plink bim file


And chrM becomes 25, right?


This thread seems relevant: VCF files: Change Chromosome Notation


I think the first awk solution wouldn't fix the chrX issue.


Sam seems to have the right idea:

You can change that in the plink bim file

It seems chrX is 23 and chrY is 24. The only thing I'm not sure about is chrM (MT-DNA). Would that be 25?
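On the chrM question: in PLINK 1.x's numeric coding for human data, 23 is X, 24 is Y, 25 is the XY pseudo-autosomal region, and 26 is MT, so chrM would map to 26 rather than 25. A minimal sketch of the rename as a shell function follows. The name fix_bim_chr is made up for illustration, and it assumes the chromosome is the first whitespace-separated column of the .bim file.

```shell
# Rewrite column 1 of a PLINK .bim file from "chrN" names to numeric codes.
# Assumes the PLINK 1.x human convention: X=23, Y=24, XY=25, MT=26.
# fix_bim_chr is a hypothetical helper name, not part of PLINK or ADMIXTURE.
fix_bim_chr() {
    awk 'BEGIN {
             OFS = "\t"
             code["X"] = 23; code["Y"] = 24; code["XY"] = 25
             code["M"] = 26; code["MT"] = 26
         }
         {
             chr = $1
             sub(/^chr/, "", chr)              # drop the "chr" prefix if present
             if (chr in code) chr = code[chr]  # map X/Y/XY/M/MT to numbers
             $1 = chr                          # rebuilds the line tab-separated
             print
         }' "$1"
}
```

Usage would look like `fix_bim_chr plink.bim.backup > plink.bim`, writing to a different file than the one being read. Only the first column changes, so the number and order of lines stay the same.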


Right, it wouldn't. Need some more unix magic, but changing the chromosome identifiers while converting to plink formats would be the most convenient/flexible/error proof.
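One way to do the renaming at conversion time, assuming PLINK 1.9 or later, is the --output-chr flag, which sets the chromosome coding used in the output files; giving it 26 asks for purely numeric codes. This is only a sketch: the function name and the file names in the usage line are placeholders, and it has not been run against the PGP data.

```shell
# Convert a VCF to PLINK binary files with numeric chromosome codes.
# Assumes PLINK 1.9+; vcf_to_numeric_bed is a hypothetical wrapper name.
#   $1 = input VCF (plain or gzipped), $2 = output prefix
vcf_to_numeric_bed() {
    plink --vcf "$1" --output-chr 26 --make-bed --out "$2"
}
```

For example, `vcf_to_numeric_bed genome.vcf.gz plink` would write plink.bed, plink.bim and plink.fam with no editing of the .bim afterwards.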


I see lots of commands in PLINK, but their explanation is quite a thing to wade through. I ended up opening the BIM file in an IDE and using the Replace function.

But now I have a new problem. (Won't I always?) It now ends in a message that says:

"Error: detected that all genotypes are missing for a SNP locus. Please apply quality-control filters to remove such loci."

No idea what to do about this.

If I can solve it by re-running PLINK, what commands should I use?


Try the following command:

sed 's/^chrM\s/26\t/; s/^chrX\s/23\t/; s/^chrY\s/24\t/; s/^chr//' your.bim > fixed.bim

That should fix the chromosome codes in the bim file (PLINK codes MT as 26; 25 is the XY pseudo-autosomal region). As for your new error, your search-and-replace may have changed the number of lines in the bim file, which would cause this problem.
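To check whether an edit changed the number of lines, the variant counts of the backup and the edited .bim can be compared directly. A small sketch, where same_variant_count is a made-up helper name and the file names in the usage line are placeholders:

```shell
# Succeeds (exit status 0) only if both .bim files have the same number of lines.
# same_variant_count is a hypothetical helper, not a PLINK command.
same_variant_count() {
    n_old=$(awk 'END { print NR }' "$1")   # lines in the original .bim
    n_new=$(awk 'END { print NR }' "$2")   # lines in the edited .bim
    [ "$n_old" -eq "$n_new" ]
}
```

For example: `same_variant_count plink.bim.backup plink.bim && echo "line counts match"`. A mismatch would mean the .bim no longer lines up with the genotypes stored in the .bed file.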


Thanks. I had made a backup of plink.bim before modifying it, so I was able to restore it. I ran your sed command on the backup and wrote the result to plink.bim, with the modifications listed above. It succeeded.

Unfortunately, I am still having this error: "Error: detected that all genotypes are missing for a SNP locus. Please apply quality-control filters to remove such loci."

Does PLINK produce empty lines by default, and is there an option to turn it off?


Strange. PLINK does not produce empty lines. Maybe try adding --geno 0.1 to your plink command to remove SNPs with high missingness?
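A sketch of what that PLINK re-run could look like, wrapped in a shell function so the input and output prefixes are explicit. The function name and the prefixes in the usage line are placeholders. The flags themselves (--bfile, --geno, --make-bed, --out) are standard PLINK options, and --geno 0.1 drops variants whose missing-call rate is above 10%.

```shell
# Write a filtered copy of a PLINK binary fileset without high-missingness SNPs.
# filter_missing_snps is a hypothetical wrapper name.
#   $1 = input prefix (expects $1.bed/.bim/.fam), $2 = output prefix
filter_missing_snps() {
    plink --bfile "$1" --geno 0.1 --make-bed --out "$2"
}
```

For example, `filter_missing_snps plink plink_qc` would write plink_qc.bed, plink_qc.bim and plink_qc.fam, and ADMIXTURE would then be pointed at plink_qc.bed instead of plink.bed.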
