gamglm is C++ software for fitting a Gamma generalized linear model with
a huge number of binary features (several thousand or more).
Because the Gamma generalized linear model is not convex in its parameters,
ordinary optimization (such as L-BFGS) can get stuck when the number of
features is huge. This software employs a simple MCMC algorithm
and an efficient data structure for inference.
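
To make the idea concrete, here is a minimal sketch of one sweep of random-walk Metropolis sampling for a Gamma GLM. It is not the code of gamglm itself. The log links a = exp(wa.x) and b = exp(wb.x), the coordinate-wise update order, and all names (Example, gamma_logpdf, log_posterior, mcmc_sweep) are assumptions made for illustration; eps and sigma play the roles of the -e and -s options. The posterior is recomputed from scratch for every proposal, so this sketch does not reproduce the efficient data structure mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <random>
#include <vector>

// One training example: target y > 0 and the ids of its active binary features.
struct Example {
    double y;
    std::vector<int> features;
};

// Sum of the weights of the active features (dot product with a binary vector).
double dot(const std::vector<double>& w, const std::vector<int>& features) {
    double s = 0.0;
    for (int k : features) s += w[k];
    return s;
}

// Log density of Gam(a, b) with shape a and scale b, evaluated at y > 0.
double gamma_logpdf(double y, double a, double b) {
    assert(y > 0.0);
    return (a - 1.0) * std::log(y) - y / b - a * std::log(b) - std::lgamma(a);
}

// Unnormalized log posterior. ASSUMPTION: log links a = exp(wa.x), b = exp(wb.x)
// and an independent Gaussian prior N(0, sigma^2) on every weight.
double log_posterior(const std::vector<Example>& data,
                     const std::vector<double>& wa,
                     const std::vector<double>& wb, double sigma) {
    double lp = 0.0;
    for (const Example& ex : data)
        lp += gamma_logpdf(ex.y, std::exp(dot(wa, ex.features)),
                           std::exp(dot(wb, ex.features)));
    for (double w : wa) lp -= w * w / (2.0 * sigma * sigma);
    for (double w : wb) lp -= w * w / (2.0 * sigma * sigma);
    return lp;
}

// One sweep of random-walk Metropolis over every coordinate of wa and wb.
// Each proposal adds Gaussian noise with standard deviation eps to one weight.
// Returns the number of accepted proposals.
int mcmc_sweep(const std::vector<Example>& data, std::vector<double>& wa,
               std::vector<double>& wb, double eps, double sigma,
               std::mt19937& rng) {
    std::normal_distribution<double> step(0.0, eps);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    int accepted = 0;
    double current = log_posterior(data, wa, wb, sigma);
    for (std::vector<double>* w : {&wa, &wb}) {
        for (std::size_t k = 0; k < w->size(); ++k) {
            const double old = (*w)[k];
            (*w)[k] = old + step(rng);
            const double proposed = log_posterior(data, wa, wb, sigma);
            if (std::log(unif(rng)) < proposed - current) {
                current = proposed;   // accept the move
                ++accepted;
            } else {
                (*w)[k] = old;        // reject: restore the old weight
            }
        }
    }
    return accepted;
}
```

Calling mcmc_sweep repeatedly with the same rng would correspond to running several iterations. A practical implementation would update only the likelihood terms of the examples that contain the changed feature.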
Edit the Makefile as needed and type make.
    % gamglm -h
    gamglm: Bayesian Gamma generalized linear model.
    $Id: gamglm.cpp,v 1.4 2014/10/27 12:01:07 daichi Exp $
    usage: gamglm [-I iter] [-e eps] [-s sigma] TRAIN MODEL

Options are:
- -I iter: number of MCMC iterations. (default 1)
- -e eps: standard deviation of Gaussian random walk. (default 0.2)
- -s sigma: standard deviation of L2 regularization of weights. (default 0.1)

When the iterations are finished, there will be model files below:
- model.dic: Dictionary of features. Internally each feature is assigned an integer corresponding to its line number.
- model.a: Regression weights wa of Gamma regression for the shape parameter.
- model.b: Regression weights wb of Gamma regression for the scale parameter.
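Each line of the training data has the following format, where y is the
target variable and feature_1 .. feature_n are its binary features:

    y feature_1 feature_2 feature_3 .. feature_n

For an example, see test.dat included in the package.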
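
A data line of the form "y feature_1 .. feature_n" could be read as in the following sketch, which also builds a feature dictionary along the way. This is an illustration only, not the parser used by gamglm. The names Dictionary, ParsedLine and parse_line are hypothetical, features are assumed to be whitespace-separated tokens, and ids are assigned from 0 in order of first appearance; whether gamglm numbers features from 0 or from 1 is not stated here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Maps feature names to consecutive integer ids (0-based in this sketch).
struct Dictionary {
    std::unordered_map<std::string, int> id_of;
    std::vector<std::string> name_of;

    // Returns the id of a feature, registering it first if it is new.
    int lookup_or_add(const std::string& name) {
        assert(id_of.size() == name_of.size());
        auto it = id_of.find(name);
        if (it != id_of.end()) return it->second;
        const int id = static_cast<int>(name_of.size());
        id_of.emplace(name, id);
        name_of.push_back(name);
        return id;
    }
};

// One parsed data line: the target y and the ids of the features on that line.
struct ParsedLine {
    double y;
    std::vector<int> features;
};

// Parses "y feature_1 .. feature_n". Returns false if the line has no
// leading number (for example, an empty line).
bool parse_line(const std::string& line, Dictionary& dic, ParsedLine& out) {
    std::istringstream in(line);
    if (!(in >> out.y)) return false;
    out.features.clear();
    std::string name;
    while (in >> name) out.features.push_back(dic.lookup_or_add(name));
    return true;
}
```

Storing only the ids of the features present on each line keeps memory proportional to the number of non-zero entries, which matters when there are thousands of binary features.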
    % gamglm-predict
    usage: gamglm-predict TEST MODEL
    $Id: gamglm-predict.cpp,v 1.1 2014/10/28 08:39:59 daichi Exp $

TEST is a data file whose format is the same as the training data, but the target variable y is not used and can be any number (such as -1). It will output the prediction of a and b to stdout:
    % gamglm-predict test.dat model
    -1 0.998879 1.026384
    -1 1.108263 0.733772
    -1 1.229099 0.723988
    -1 1.187355 0.708005
    -1 1.290324 0.675810
    -1 1.310694 0.556131
       ^-- parameter a
                ^-- parameter b

Then you can use the predicted a and b above in the Gam(a,b) distribution.
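
As one way of using a predicted pair (a, b), the sketch below computes the mean and variance of Gam(a,b) and draws a random sample from it. It assumes that a is the shape and b is the scale parameter, as the description of the model files suggests; if b were a rate instead, the mean would be a/b rather than a*b. The names GammaSummary, summarize and sample are hypothetical and not part of the package.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Summary statistics of a Gamma distribution.
struct GammaSummary {
    double mean;
    double variance;
    double stddev;
};

// Moments of Gam(a, b) with shape a and scale b: mean a*b, variance a*b^2.
GammaSummary summarize(double a, double b) {
    assert(a > 0.0 && b > 0.0);
    GammaSummary s;
    s.mean = a * b;
    s.variance = a * b * b;
    s.stddev = std::sqrt(s.variance);
    return s;
}

// Draws one sample from Gam(a, b). std::gamma_distribution takes the shape
// as its first parameter and the scale as its second.
double sample(double a, double b, std::mt19937& rng) {
    assert(a > 0.0 && b > 0.0);
    std::gamma_distribution<double> dist(a, b);
    return dist(rng);
}
```

For instance, summarize(0.998879, 1.026384) would give the predictive mean and variance for the first line of the output above under the shape/scale assumption.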