Question

NGSadmix program returns -nan as a result for the admixture proportions of each individual

6.1 years ago

bekahlee54 • 0

I have been trying to run the program NGSadmix (http://www.popgen.dk/software/index.php/NgsAdmix) on nine whole-genome sequences. The program runs, but my .fopt.gz and .qopt output files contain only -nan in both columns (I am running with K=2).

I made a beagle input file using the angsd program (v0.918) with this command:

    angsd -GL 1 -out genolike -doGlf 2 -doMajorMinor 1 -SNP_pval 1e-5 -doMaf 1 -bam bam.filelist

where bam.filelist lists the location of each sorted and indexed .bam file, one per line.

I am running NGSadmix with this command:

    ./NGSadmix -likes genolike.beagle.gz -K 2 -o outputadmix

This is the log file I receive:

    Input: lname=genolike.beagle.gz nPop=2, fname=(null) qname=(null) outfiles=outputadmix
    Setup: seed=1520867306 nThreads=1 method=1
    Convergence: maxIter=2000 tol=0.000010 tolLike50=0.100000 dymBound=0
    Filters: misTol=0.050000 minMaf=0.050000 minLrt=0.000000 minInd=0
    Input file has dim: nsites=25438921 nind=9
    Input file has dim (AFTER filtering): nsites=25438921 nind=9
    [ALL done] cpu-time used = 40718.93 sec
    [ALL done] walltime used = 41370.00 sec
    best like = -nan after 2000 iterations

The .qopt output file contains this information:

    $ head outputadmix.qopt
    -nan -nan
    -nan -nan
    -nan -nan
    ...

I've tried re-making the input file, but I consistently get the same results. I have had a hard time troubleshooting this issue because I don't understand why I'm seeing -nan as a result.

Any suggestions for how to solve this problem would be much appreciated.

genome software error NGSadmix angsd • 2.3k views

score 0 · Answer 1 · 2018-03-13

You are running NGSadmix on 25 million sites, which seems like a very large number of SNPs, and you have only 9 individuals. It also appears that you have not set a minimum minor allele frequency threshold in your angsd command.

Given the very low number of individuals, I would try a more stringent filter on SNPs, for example:

    angsd -GL 1 -out genolike -doGlf 2 -doMajorMinor 1 -SNP_pval 1e-6 -doMaf 1 -bam bam.filelist -minMaf 0.1
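To confirm that a stricter filter really reduced the number of sites, and to catch a failed run without reading the whole output, a quick check along these lines may help. This is only a minimal sketch, not part of angsd or NGSadmix: the function names are made up for illustration, and it assumes the beagle file is gzip-compressed with a single header line and that the .qopt file is plain text.

```shell
# Hypothetical helpers (not part of angsd or NGSadmix).

# Count the sites in a gzipped beagle file.
# Assumption: the first line is a header and every other line is one site.
count_beagle_sites() {
  gzip -dc "$1" | awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Succeed (exit status 0) if a .qopt file contains any "nan" entry.
qopt_has_nan() {
  grep -qi 'nan' "$1"
}
```

For example, `count_beagle_sites genolike.beagle.gz` should print a number well below the 25438921 sites reported in the log if the filter had any effect, and `qopt_has_nan outputadmix.qopt && echo "run failed"` flags a bad run.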

Also, I would use only autosomes. This should reduce the number of sites further.
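One way to restrict the analysis to autosomes is to pass angsd a region file. The sketch below builds such a file from a samtools .fai index by dropping the contigs you name. It is an illustration under assumptions, not a tested recipe: the function name, the file names, and the excluded contig names (X, Y, MT) are placeholders, and you should check the angsd documentation for the exact region-file syntax in your version.

```shell
# Hypothetical helper: print one "contig:" region per line for every contig
# in a .fai index, except the contig names given as extra arguments.
make_autosome_regions() {
  fai="$1"
  shift
  awk -v excl="$*" '
    BEGIN {
      # Build a lookup table of contig names to skip.
      n = split(excl, names, " ")
      for (i = 1; i <= n; i++) skip[names[i]] = 1
    }
    !($1 in skip) { print $1 ":" }   # "name:" is meant to select the whole contig
  ' "$fai"
}

# Possible usage (file and contig names are placeholders):
#   make_autosome_regions reference.fa.fai X Y MT > autosomes.txt
#   angsd -GL 1 -out genolike -doGlf 2 -doMajorMinor 1 -SNP_pval 1e-6 \
#         -doMaf 1 -minMaf 0.1 -bam bam.filelist -rf autosomes.txt
```

Unplaced scaffolds, if your assembly has them, would need to be excluded in the same way.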