sas notes
NLMIXED
METHOD
- DEFAULT: [GAUSS] adaptive Gauss-Hermite Quadrature
- [FIRO] : First-Order Method : only for a normal response, and a RANDOM statement is required
- [HARDY] : Hardy Quadrature : only ONE Random
For GAUSS and ISAMP, adaptive quadrature and adaptive scaling can be turned off (NOAD and NOADSCALE)
can use y ~ negbin(n, p), or ll = logpdf('negbinomial', y, p, n) and then
model y ~ general(ll)
cannot use PDF; GENERAL needs the log scale. A sketch follows below.
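A minimal sketch of the two equivalent specifications (data set counts, variables y and x, and starting values are made up here):
proc nlmixed data=counts;
parms b0=0 b1=0 k=1;          /* k = NB size parameter */
bounds k > 0;
mu = exp(b0 + b1*x);          /* mean on the log scale */
p  = k/(k + mu);              /* success probability   */
model y ~ negbin(k, p);       /* built-in negative binomial */
/* equivalent user-written log likelihood:
   ll = logpdf('negbinomial', y, p, k);
   model y ~ general(ll);  */
run;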
TROUBLESHOOT
- sort by subject
- no misspellings in PARMS
- parameter names must not duplicate variable names in the data set
- reparametrize: e.g., estimate logsd_u and use exp(2*logsd_u) as the variance (see the sketch after this list)
- rescale
- GCONV = 0 disables the gradient criterion so FCONV is used; alternatively GCONV = 0.01 might help
- look at convergence: change the convergence criteria
- try a simpler model first
- use PARMS for starting values; try estimates from a previously converged model
- try different optimization techniques
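A minimal sketch of the exp(2*logsd_u) reparametrization for a random-intercept model (data set, variables, and starting values are illustrative):
proc nlmixed data=mydata;
parms b0=0 b1=0 logsd_u=0 logsd_e=0;   /* estimate log standard deviations */
eta = b0 + b1*x + u;
model y ~ normal(eta, exp(2*logsd_e)); /* residual variance = exp(2*logsd_e) */
random u ~ normal(0, exp(2*logsd_u)) subject=id;  /* random-intercept variance */
run;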
SORT
PROC SORT DATA = INPUT OUT = OUTPUT [NODUPLICATES];
BY [DESCENDING] VAR [VAR2] [_ALL_];
RUN;
UNIVARIATE
HISTOGRAM
proc univariate data=Steel;
var Length Width;
histogram;
ods output moments = a;
run;
MEANS
PROC MEANS DATA = KISD NWAY;
CLASS GROUP;
VAR age wt;
OUTPUT OUT = outdata MEAN = avgage avgwt;
RUN;
Or use BY (sort in advance):
by group;
output out = a mean=;
Or capture the printed summary and transpose it (see the sketch below):
ods output summary = y;
...
proc transpose;
run;
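A sketch of the ODS OUTPUT / TRANSPOSE route, using the data set from the MEANS example above (column names in the captured Summary table may differ from what you expect):
proc sort data=kisd; by group; run;
proc means data=kisd;
by group;
var age wt;
ods output summary = y;   /* capture the printed summary table */
run;
proc transpose data=y out=ylong;
by group;
run;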
GPLOT
symbol1 i=join v=circle; * i=r would fit a regression line instead;
PROC GPLOT DATA=a;
plot avgfall*time avginj*time;
RUN;
plot (fall inj) * time = categories;
symbol options:
v (value) = dot/circle
h (size) = 0.1/1/3
axis
axis1 order =(1 to 5 by 2, 10 to 100 by 10) label = (a=90 'y axis');
then
plot a*b /haxis = axis1 vaxis = axis2 legend =legend;
Double axis
plot y*x;
plot2 y2*x;
overlay
plot ... /overlay;
legend
title 'sss';
legend1 value = ("va1" "va2");
plot ... /overlay legend = legend1;
FREQ
PROC FREQ DATA=BABA NLEVELS; /* NLEVELS checks how many levels */
TABLES idunit;
RUN;
ods select nlevels;
Options for TABLES:
tables A / missprint; * displays missing frequencies;
tables A / missing;   * treats missing as a level;
Use own format
proc format;
value bedscat
1='<100'
2='100-400'
3='>=400';
value owner
1='Public'
2='For Profit'
3='Not For Profit';
value teaching
1='COTH'
2='Residency, Not COTH'
3='Nonteaching';
run;
proc freq data=firsthosp nlevels;
tables bedscat owner teaching;
format bedscat bedscat. owner owner. teaching teaching.;
run;
SURVEYSELECT
PROC SURVEYSELECT METHOD = SRS (or URS) SAMPSIZE = 10 REP = 1 SEED = 110 OUT = datout;
id _all_ ;
RUN;
or use samprate = 0.4
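A sketch of sampling with replacement (URS); OUTHITS writes one row per selection (input data set name datin is illustrative):
proc surveyselect data=datin method=urs sampsize=10 rep=1 seed=110 out=datout outhits;
id _all_;
run;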
NLMIXED vs GLIMMIX
Syntax
proc glimmix data=fake;
class treat ident;
model y/m = treat / dist=binomial link=logit ddfm=residual;
random ident;
parms (0.25) (1) / hold=2;
run;
proc nlmixed data=fake tech=trureg df=297;
bounds sigma2>0;
parms mu=1 t1=-2 t2=2 sigma2=0.25;
if treat=1 then eta=mu + t1 + e;
else if treat=2 then eta=mu + t2 + e;
else eta=mu + e;
prob=exp(eta)/(1+exp(eta));
model y ~ binomial(m, prob);
random e ~ normal(0, sigma2) subject=ident;
contrast 'treat' t1, t2;
predict prob out=results;
run;
bounds
bounds -1 < rho < 1;
Pros
- Likelihood can be user-specified if distribution is non-standard (ZI-NB)
- More exact than GLIMMIX or NLINMIX; usually runs faster than both, though some report it slower.
- GLIMMIX: pseudo-likelihood vs NLMIXED: true (approximate) maximum likelihood by adaptive Gauss-Hermite quadrature. The same estimation is also available in GLIMMIX (METHOD = QUAD), which then allows nested model comparison
- GLIMMIX dependent variables have to come from an exponential-family distribution
- For approximating the marginal log likelihood: NLMIXED is more accurate (integral approximation by Gaussian quadrature), whereas GLIMMIX uses linearization, giving potentially biased estimates of both fixed effects and covariance parameters, especially for binary data.
- GLIMMIX gives Wald-type tests and confidence intervals, and nested models cannot be compared with a true likelihood ratio test
- INITIAL VALUES: NLMIXED > GLIMMIX; GLIMMIX is more sensitive to initial values
Cons
- Random effects must come from a (multivariate) normal distribution
- All random effects must share the same subject (i.e. cannot have multi-level mixed models)
- Random effects cannot be nested or crossed
- DF for Wald-tests or contrasts generally require manual intervention
- In practice, GLIMMIX can handle more random effects than NLMIXED.
- GLIMMIX allows REML, which is better when the number of higher-level units is small; NLMIXED has no REML
Detail
There are some problems which GLIMMIX can handle with much greater facility than NLMIXED (e.g., models with both random effects and repeated measurements). On the other hand, the NLMIXED procedure may handle some problems which the GLIMMIX procedure cannot handle very well. For instance, you might have a study which is subject to random effects and your response follows a zero-inflated negative binomial distribution. The GLIMMIX procedure would not be able to fit a zero-inflated negative binomial response, but the NLMIXED procedure could be employed.
In general, the NLMIXED procedure is considerably more computationally intensive than is the GLIMMIX procedure, even though the GLIMMIX procedure is doubly iterative. Seldom will the NLMIXED procedure require less computational resources than GLIMMIX for equivalent problems.
The GLIMMIX procedure also fits mixed models for nonnormal data with nonlinearity in the conditional mean function. In contrast to the NLMIXED procedure, PROC GLIMMIX assumes that the model contains a linear predictor that links covariates to the conditional mean of the response. The NLMIXED procedure is designed to handle general conditional mean functions, whether they contain a linear component or not. As mentioned earlier, the GLIMMIX procedure by default estimates parameters in generalized linear mixed models by pseudo-likelihood techniques, whereas PROC NLMIXED by default performs maximum likelihood estimation by adaptive Gauss-Hermite quadrature. This estimation method is also available with the GLIMMIX procedure (METHOD=QUAD in the PROC GLIMMIX statement).
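A sketch of requesting quadrature-based maximum likelihood in GLIMMIX for the earlier fake-data example; note METHOD=QUAD requires subject processing on the RANDOM statement:
proc glimmix data=fake method=quad;
class treat ident;
model y/m = treat / dist=binomial link=logit solution;
random intercept / subject=ident;
run;
Nested fits from this method can then be compared via -2 log likelihood.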
Summary
- NLMIXED:
- model is simple and limited to two levels
- number of random effects is small
- number of level-2 groups is relatively large
- binary data require accurate covariance parameter
- user-defined response distribution
- nested models need to be compared using the likelihood ratio
- GLIMMIX:
- more complex models
- a large number of random effects
- more than two levels
- number of higher-level units is small
GLIMMIX TROUBLESHOOTING
- Use a different METHOD= option, such as LAPLACE, QUAD, or RMPL, in the PROC GLIMMIX statement, if possible
- INITGLM option in the PROC GLIMMIX statement
- TECH = NRRIDG or TECH = NEWRAP in the NLOPTIONS statement: especially useful for binary, binomial, negative binomial, and Poisson distributions
- increase the number of optimizations with the MAXOPT= option (e.g., 50) in the PROC GLIMMIX statement
- increase the number of iterations with the MAXITER= option in NLOPTIONS
- if the convergence criterion stalls above 1e-3, use nloptions gconv = 0
- supply initial starting values in the PARMS statement
Sample:
proc glimmix initglm inititer=10 method=laplace; * or method=rmpl, method=quad;
class clinic A;
model y/n = A / link=logit dist=binomial;
random clinic;
parms (0.4) / noiter;
nloptions technique=newrap;
run;
Memory Issue
- DDFM=RESIDUAL or DDFM=BW in the MODEL statement greatly reduces memory use, especially with a many-level random effect (BW)
- NOTEST in the MODEL statement
- TYPE=VC on the RANDOM statement
- NOCLPRINT in the PROC GLIMMIX statement
- no SOLUTION
- hold the random covariance parameters (PARMS ... / HOLD=)
- try HPMIXED
A combined sketch follows.
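A sketch combining several of the memory-saving options above (names follow the earlier fake-data example; NOTEST omitted here):
proc glimmix data=fake noclprint;
class treat ident;
model y/m = treat / dist=binomial link=logit ddfm=residual;
random intercept / subject=ident type=vc;
parms (0.25) / hold=1;   /* hold the random covariance parameter */
run;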
Model
A|B = A B A*B
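For example (data set and variables are illustrative):
proc glm data=mydata;
class a b;
model y = a|b;   /* expands to a b a*b */
run;
quit;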
Last Obs
data temp;
set mylib.tourrevenue;
retain HoldRevenue;
Revenue = LandCost * NumberOfBookings;
output;
HoldRevenue = Revenue;
run;
Or use the LAG function and DIF:
Data teresaserial;
set teresa;
by idunit yearmonth;
lag_inj = lag(injfallscount);
dif_inj = dif(injfallscount);
if first.idunit then lag_inj = .;
run;
proc expand data=unemp out=unemp_laglead method = none;
id date;
convert rate = rate_lag1 / transformout=(lag 1);
convert rate;
convert rate = rate_lead1 / transformout=(lead 1);
run;
Count
data new;
set temp;
by group;
cnt + 1;
if first.group then cnt = 1;
run;
proc sort;
by group descending cnt;
run;
data new2;
set new;
by group;
retain totcnt;
if first.group then totcnt = cnt;
output;
run;
Beta-Binomial
logit link
proc nlmixed data=cardtox method=GAUSS NOAD fd qpoints=30;
parms b0=-0.24 b_dose=1.06 b_time=-0.07 rho=0.11;
pi_i = exp(b0 + b_dose*dose + b_time*time)/
(1 + exp(b0 + b_dose*dose + b_time*time)) ;
alpha_i1 = pi_i*(1-rho)/rho ;
alpha_i2 = (1- pi_i)*(1-rho)/rho ;
like1 = gamma(alpha_i1 + y)*gamma(alpha_i2+n-y)/
gamma(alpha_i1 + alpha_i2 + n ) ;
like2 = gamma(alpha_i1) * gamma(alpha_i2)/
gamma(alpha_i1 + alpha_i2) ;
llik = log(like1) - log(like2);
model z ~ general(llik);
run;
Whether the left side is z or y does not matter, and there is no need to add the (n choose y) term (it does not involve the parameters). LGAMMA can be used instead of GAMMA for numerical stability.
ODS
cheatsheet: http://support.sas.com/rnd/base/ods/scratch/ods-tips.pdf
can create pdf and html
ods listing close;
ods pdf file="" startpage = no;
ods pdf close;
ods listing;
For HTML, change pdf to html:
ods html style=banker path="C:\" gpath="images" body="mygraph.html";
check table name :
ods trace on;
proc ...
ods trace off;
then check log for table name
otherwise,
ods output name = yourdataset;
Mixture Model
- nlmixed
- FMM (finite mixture model)
- R library : flexmix
- IML
In PROC NLMIXED, specify the mixture likelihood and get the estimates by maximizing it (see the sketch after the pros/cons below)
pros:
- straightforward
- can include random effects
- more control over the component likelihoods (e.g., forcing the same NB alpha in each component)
cons:
- highly sensitive to initial values
- occasional odd behavior (e.g., the mixing proportion estimates come back exactly equal to their initial values)
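A minimal sketch of a user-specified mixture likelihood in NLMIXED, here a zero-inflated Poisson (data set counts, variables y and x, and starting values are hypothetical):
proc nlmixed data=counts;
parms b0=0 b1=0 a0=0;                    /* starting values are arbitrary */
lambda = exp(b0 + b1*x);                 /* Poisson mean component        */
p0 = exp(a0)/(1 + exp(a0));              /* P(structural zero)            */
if y = 0 then ll = log(p0 + (1 - p0)*exp(-lambda));
else ll = log(1 - p0) + y*log(lambda) - lambda - lgamma(y + 1);
model y ~ general(ll);
run;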
FMM:
can fit ZIP, ZINB, and other mixture models; can also specify a criterion to select the best model
pros:
- easy to implement
cons:
- does not fit models for multinomial data or models with random effects
- User-defined distributions or link functions are not supported
initial value :
proc fmm;
model y = x1 x2 / k=3;
probmodel / parms(2, 1);
run;
TECH=keyword
- CONGRA performs a conjugate-gradient optimization.
- DBLDOG performs a version of double-dogleg optimization.
- NEWRAP performs a Newton-Raphson optimization combining a line-search algorithm with ridging.
- NMSIMP performs a Nelder-Mead simplex optimization.
- NONE performs no optimization.
- NRRIDG performs a Newton-Raphson optimization with ridging.
- QUANEW performs a dual quasi-Newton optimization.
- TRUREG performs a trust-region optimization.
Dummy Variable
array studyq(17) studyq_1 - studyq_17;
do i = 1 to 17;
studyq(i) = (studyqtr = i);
end;
IML/MLE
http://www.psych.yorku.ca/lab/sas/iml.htm
Use IML
to do MLE:
call an optimization routine:
call nlpqn(rc, xr, "fun", x, opt, con) grd = "grad";
nlpnra : newton-raphson
con = {lower bound, upper bound}
con = {. 0, . .};
opt = {1 (to find max), 4 (print a lot of output)};
a simple example:
proc iml;
start ll(param) global (x) ;
mu = param[1];
sigma2 = param[2]##2;
n = nrow(x);
c = x - mu;
f = -n/2*log(sigma2) - 1/2/sigma2*sum(c##2);
return (f);
finish;
use sim;
read all var {yi} into x;
close sim;
p = {1, 2};
opt = {1, 4};
call nlpnra(rc, result, "ll", p, opt);
quit;
Read data into iml
use sim ; /* Input the simulated dataset */;
read all var {one u} into x ; /* Read the design matrix into x */;
read all var {yi} into y ; /* Read the outcome data into y */;
read all into mat[rownames=name];
read all into dat;
close sim;
all : all obs
read all into mat[rowname=name] where sex='M';
Output
output to a SAS dataset
xys = yhat || res || weight;
cname = {"_YHAT_" "_RESID_" "_WEIGHT_" };
create out from xys [ colname=cname ];
append from xys;
CREATE libref.dataset <FROM matrix <[r=vector1 c=vector2]>>;
Variable
p = 2;
vec = {1, 2};
mat = j(row, col, num); constant matrix
mat = {2 4, 3 1};
nrow(mat)
mat[i, ]
vec[i]
index = 1:5 ;
col = 4:6` ; 3*1 matrix
vars = 'XX1':'XX7'
I(6);
diag(vector);
diag(mat);
t(mat) or mat` ; transpose
combine data
x = x0 || x1 || x2;
new row
mat = mat//submat[+, 2];
Index
ind = {2 3}
mat[ind, ind];
mat[1, 1:3]
column sum:
b = a[+,]
column sum of squares:
b = a[##,]
row mean:
b = a[, :]
sum(a) is quicker than a[+,+].
LOC
location:
loc(a[,2] ^= .)
Matrix
coffee[r = names c=days]; /* rownames and colnames */
print coffee[r=names c=days] weektot[format=dollar7.2] ,
daytot[c=days f=dollar8.2] ' ' total[f=dollar7.2];
reshape ( stored in row-major order) :
a = shape(x, 4, 3);
x = a || b || c;
math
power: x##2
elementwise product #
matrix product *
Condition (IF)
if expression THEN statement1;
ELSE IF expression THEN statements2;
ELSE statements;
loop (DO)
do i = 1 to p by 2;
end;
DO WHILE (expression);
end;
DO UNTIL (expression);
end;
function (Modules)
start myfun(param) global (x);
return (f);
finish myfun;
eval function:
run myfun(param);
call nlpnra(rc, result, "ll", p, opt);
a = myfun(param)
print 'text' outcome;
Random
RANUNI(SEED) ; uniform(0, 1)
RANNOR(SEED) ; standard normal
RANBIN(SEED, n, p) ; binomial n, p
Random number generator
In IML
n = 5; m = 4;
x = j(n,m);
Mu = 1:m;
Sigma = (1:m)/m;
call randgen(x, "Normal", Mu, Sigma);
print x;
NLMIXED TEST
CONTRAST 'mean' b1-b2, b2-b3;
TROUBLESHOOTING
Unable to read SAS report file
This is an error in Enterprise Guide 4.2.
Ref: http://blogs.sas.com/content/bi/2009/09/03/differences-in-sas-enterprise-guide-4-1-and-4-2/
Change options to show in HTML: Tools > Results > Results General
Aggregate
Means
proc means ;
class ;
var;
output out = dd sum=;
run;
Summary
proc summary data = SampleData;
var Y2010 Y2011 Y2012;
output out=ProcSumOut sum=;
run;
SQL
proc sql outobs=12;
create table name as
select City, Country, mean(AvgHigh, AvgLow)
as MeanTemp
from sql.worldtemps
where calculated MeanTemp gt 75
group by xx
order by MeanTemp desc;
quit;
Count unique levels:
proc sql;
create table new as
select count(distinct(name)) as namecount,
count(distinct(age)) as agecount,
count(distinct(sex)) as sexcount
from sashelp.class;
quit;
Summary of Data
AVG, MEAN
COUNT, FREQ, N
CSS : corrected sum of squares
CV: coefficient of variation
MAX
MIN
NMISS: number of missing
PRT : p-value of the t test that the mean is zero
RANGE
STD
STDERR : se of the mean
SUM
SUMWGT
T: t statistic for testing that the mean is zero
USS: uncorrected sum of squares
VAR
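These keywords can be requested directly in PROC MEANS, e.g. (data set from the MEANS example above):
proc means data=kisd n nmiss mean std stderr cv min max range css uss t prt;
var age wt;
run;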
Optimization
http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml/
- IML for log likelihood
- constraints
- nonlinear optimization (nlpnra (newton-raphson), etc)
available subroutines
- NLPCG: Conjugate Gradient Method
- NLPDD: Double Dogleg Method
- NLPNMS: Nelder-Mead Simplex Method
- NLPNRA: Newton-Raphson Method
- NLPNRR: Newton-Raphson Ridge Method
- NLPQN: (Dual) Quasi-Newton Method
- NLPQUA: Quadratic Optimization Method
- NLPTR: Trust-Region Method
A least squares problem is a special form of minimization problem where the objective function is defined as a sum of squares of other (nonlinear) functions.
- NLPLM: Levenberg-Marquardt Least Squares Method
- NLPHQN: Hybrid Quasi-Newton Least Squares Methods
Usage
call nlpnra(rc, result, "FUN", initial, opt, con);
return codes (rc): rc > 0 = converged; rc < 0 = did not converge
Termination criteria (TC)
local vs global
All the IML optimization algorithms converge toward local rather than global optima.
One way to find out whether the objective function has more than one local optimum is to run various optimizations with a pattern of different starting points. For a more mathematical definition of optimality, refer to the Kuhn-Tucker theorem in standard optimization literature.
OPT
- opt[1]:
- 1: max
- 0: min
- m: least squares
- opt[2]: print options
Merge
data new;
merge old1 (rename = (y = y1))
old2 (rename = (y = y2));
by id;
run;
If you only want observations that appear in old1:
merge old1 (in = x) old2;
if x = 1;
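To keep only IDs present in both data sets, put IN= on both (a sketch; the flag names ina and inb are illustrative):
data matched;
merge old1 (in=ina) old2 (in=inb);
by id;
if ina and inb;
run;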
Macro
%MACRO myfun(a = , b = );
...
%MEND myfun;
%myfun(a = , b= );
Random number
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a001466748.htm
x = RAND('NORMAL' <, theta, lambda>);
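A small sketch of the newer RAND/STREAMINIT approach (seed and parameter values are arbitrary):
data r;
call streaminit(110);              /* set the seed          */
do i = 1 to 5;
z = rand('normal');                /* standard normal       */
u = rand('uniform');               /* uniform(0, 1)         */
b = rand('binomial', 0.3, 10);     /* binomial(p=0.3, n=10) */
output;
end;
run;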
summation by group in IML
proc iml;
use pottery;
read all var{site x fe} into a;
print a;
/*In these statements you should not need to change anything other than the matrix name.*/
unique_rows=uniqueby(a,1,1:nrow(a));
print unique_rows;
do i=1 to nrow(unique_rows);
/* Next line is for the last BY group */
if i=nrow(unique_rows) then index=unique_rows[i]:nrow(a);
else index=unique_rows[i]:unique_rows[i+1]-1;
submat=a[index,];
/*Below is where you put your function. This particular example computes the sum for X and FE in each BY group.*/
/* sum=sum//submat[+,2:3];*/
sum = sum//submat[+, 2];
end;
print sum;
quit;
Or:
proc iml;
a={3 9 3 1,
3 0 2 1,
2 8 3 2};
c=ncol(a);
group=1;
b=a[loc(a[,c]=group),1:(c-1)][+,1:3];
print b;
quit;
Two sample proportion
proc freq data=ownerdata;
weight Count;
table source * owner / chisq;
where source = 'aha' or source = 'uti';
run;
Two sample ttest
proc ttest data=babies plots(unpack)=summary;
class momsmoker;
var bwt;
run;
Transpose
proc transpose;
by group;
id **;
var **;
run;