Question: Poisson - negative binomial mixture with flexmix and countreg
6 months ago, smt8n wrote:

Dear all,

I am testing a finite mixture model with Poisson and negative binomial components on some artificial data, using the flexmix and countreg packages. Here is the core part of my small script:

require("countreg")
require("flexmix")

## generating data
## poisson
lambda <- 20; nptsPois <- 2000
poisData <- rpois(n=nptsPois,lambda=lambda)
## negative binomial
nptsNB <- 2000; nbdisp <- 0.5; nbmean <- 100
nbData <- rnbinom(n=nptsNB,size=nbdisp,mu=nbmean)
## together
dataAll <- c(poisData,nbData)

## plotting data
par(mfrow = c(1, 2))
hist(poisData)
hist(nbData)
## overlay both histograms in a new graphics device
dev.new()
hist(nbData,col="red",breaks=400,xlim=c(0,100))
hist(poisData,col="blue",add=TRUE)

## fitting
modelP <- FLXMRglm(family = "poisson")
modelNB <- FLXMRnegbin()

flexfit <- stepFlexmix(dataAll~1,k=2,model=list(modelP,modelNB),nrep=5,drop=TRUE)
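
As a reference point for judging how a fit assigns points to components, the data-generating mixture itself can be examined in base R. The following is a minimal sketch, assuming the true parameters from the script and equal mixing weights (2000 points per component); the grid `y` and the names `dP`, `dNB`, `postPois`, and `sharePois` are illustrative and not part of flexmix or countreg. It computes the posterior probability that a count came from the Poisson component, and the share of points that would be hard-assigned to that component under the true model.

```r
# True data-generating parameters, copied from the script
lambda <- 20; nbdisp <- 0.5; nbmean <- 100

# Count grid; the upper bound is an illustrative choice, far into the NB tail
y <- 0:2000

# Component densities weighted by the assumed 50/50 mixing proportions
dP  <- 0.5 * dpois(y, lambda = lambda)
dNB <- 0.5 * dnbinom(y, size = nbdisp, mu = nbmean)

# Posterior probability that a count y came from the Poisson component
postPois <- dP / (dP + dNB)

# Expected share of all points hard-assigned to the Poisson component
# (sum of the mixture density over counts where the Poisson posterior wins)
sharePois <- sum((dP + dNB)[postPois > 0.5])
sharePois
```

If `sharePois` differs noticeably from 0.5, a 2000 - 2000 split would not be expected even from a model that recovered the true parameters, because part of the negative binomial mass overlaps the Poisson range.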

The stepFlexmix call fits the generated data, but there are some problems I do not know how to resolve:

  • When I try refit(flexfit), I get the following error, and googling does not help much:

    Error in optim(fn = FLXlogLikfun(object), par = FLXgetParameters(object), :
      gradient in optim evaluated to length 5 not 7
  • When I run parameters(flexfit), I get

    [[1]]
    Comp.1.coef.(Intercept) Comp.2.coef.(Intercept)
                   3.098849                5.483547

    [[2]]
                       Comp.1   Comp.2
    coef.(Intercept) 3.098849 5.483549
    theta            1.909100 3.351303

    I understand that the intercepts here are the log means of the distributions (ok, somewhat off, but not crucial for now) and that theta is the dispersion in the negative binomial case, but what is the "dispersion" for Poisson? Besides confusing me about the meaning, this prevents me from using the parameters in an automated script for plotting the fitted curves. If I got 1 parameter for one part of the mixture, I would send it to the Poisson function, but what should I do with 2?

  • The model is way off in assigning data to the clusters (models): instead of 2000 - 2000 it has 3347 - 653.

Could somebody please advise me on a) what to do about the refit function; b) the meaning of dispersion in the Poisson case, and how to tell which parameters belong to which part of the mixture; c) whether any improvement is possible in the assignment of data points to clusters?
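
To make the reported numbers concrete, they can be back-transformed and turned into density curves in base R. This is a minimal sketch, assuming (as in the question) that the intercepts are on the log scale and that theta plays the role of the `size` argument of `dnbinom`; the values are copied from the parameters(flexfit) output, and the variable names and grid are illustrative only.

```r
# Intercepts (log scale) and theta values as printed by parameters(flexfit)
poisIntercept <- c(Comp.1 = 3.098849, Comp.2 = 5.483547)
nbIntercept   <- c(Comp.1 = 3.098849, Comp.2 = 5.483549)
nbTheta       <- c(Comp.1 = 1.909100, Comp.2 = 3.351303)

# Back-transform through the log link to get means on the count scale
muPois <- exp(poisIntercept)
muNB   <- exp(nbIntercept)
muPois  # roughly 22 and 241

# Density curves on a count grid (illustrative upper bound)
y <- 0:2000
poisCurves <- sapply(muPois, function(m) dpois(y, lambda = m))
nbCurves   <- mapply(function(m, th) dnbinom(y, size = th, mu = m),
                     muNB, nbTheta)

# One column per reported component
dim(poisCurves)
dim(nbCurves)
```

Each column of `poisCurves` and `nbCurves` can then be scaled by the number of points and overlaid on a histogram with `lines()`.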

    sessionInfo()
    R version 3.5.2 (2018-12-20)
    Platform: x86_64-apple-darwin15.6.0 (64-bit)
    Running under: OS X El Capitan 10.11.6

    Matrix products: default
    BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
    LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] flexmix_2.3-15  lattice_0.20-38 countreg_0.2-1  MASS_7.3-51.1

    loaded via a namespace (and not attached):
    [1] compiler_3.5.2    modeltools_0.2-22 nnet_7.3-12
    [4] grid_3.5.2        stats4_3.5.2

Thank you.
