Title: | Semi-Parametric Variance Regression |
---|---|
Description: | Methods for fitting semi-parametric mean and variance models, with normal or censored data. Extended to allow a regression in the location, scale and shape parameters, and further for multiple regression in each. |
Authors: | Kristy Robledo [aut, cre], Ian Marschner [ths] |
Maintainer: | Kristy Robledo <[email protected]> |
License: | GPL-3 |
Version: | 2.0.1 |
Built: | 2024-11-08 04:12:46 UTC |
Source: | https://github.com/kristyrobledo/varreg |
censlinVarReg
performs censored multivariate mean and multivariate variance regression.
This function is designed to be used by the semiVarReg
function.
censlinVarReg( dat, mean.ind = c(2), var.ind = c(2), cens.ind = c(3), mean.intercept = TRUE, para.space = c("all", "positive", "negative"), mean.init = NULL, var.init = NULL, control = list(...), ... )
censlinVarReg( dat, mean.ind = c(2), var.ind = c(2), cens.ind = c(3), mean.intercept = TRUE, para.space = c("all", "positive", "negative"), mean.init = NULL, var.init = NULL, control = list(...), ... )
dat |
Dataframe containing outcome and covariate data. Outcome data must be in the first column, with censored values set to the limits. Covariates for mean and variance model in next columns. |
mean.ind |
Vector containing the column numbers of the data in 'dat' to be fit as covariates in the mean model. 0 indicates constant mean option. NULL indicates zero mean option. |
var.ind |
Vector containing the column numbers of the data in 'dat' to be fit as covariates in the variance model. FALSE indicates constant variance option. |
cens.ind |
Vector containing the column number of the data in 'dat' to indicate the censored data. 0 indicates no censoring, -1 indicates left (lower) censoring and 1 indicates right (upper) censoring. |
mean.intercept |
Logical to indicate if an intercept is to be included in the mean model. Default is TRUE. |
para.space |
Parameter space to search for variance parameter estimates. "positive" means only search positive parameter space, "negative" means search only negative parameter space and "all" means search all. Default is all. |
mean.init |
Vector of initial estimates to be used for the mean model. |
var.init |
Vector of initial estimates to be used for the variance model. |
control |
List of control parameters. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
censlinVarReg
returns a list of output including:
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed of the EM algorithm.
reldiff
: the positive convergence tolerance that occured at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
boundary
: Logical argument indicating if estimates are on the boundary.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
mean.ind
: Vector of integer(s) indicating the column number(s) in the dataframe
data
that were fit in the mean model.
mean
: Vector of the maximum likelihood estimates of the mean parameters.
var.ind
: Vector of integer(s) indicating the column(s) in the dataframe
data
that were fit in the variance model.
variance
: Vector of the maximum likelihood estimates of the variance parameters.
cens.ind
: Integer indicating the column in the dataframe data
that
corresponds to the censoring indicator.
data
: Dataframe containing the variables included in the model.
censloop_em
is an EM loop function for censored data to be utilised by various other higher level functions.
censloop_em( meanmodel, theta.old, beta.old, p.old, x.0, X, censor.ind, mean.intercept, maxit, eps )
censloop_em( meanmodel, theta.old, beta.old, p.old, x.0, X, censor.ind, mean.intercept, maxit, eps )
meanmodel |
Dataframe containing only the covariates to be fit in the mean model. NULL for zero mean model and FALSE for constant mean model. |
theta.old |
Vector containing the initial variance parameter estimates to be fit in the variance model. |
beta.old |
Vector containing the initial mean parameter estimates to be fit in the mean model. |
p.old |
Vector of length n containing the initial variance estimate. |
x.0 |
Matrix of covariates (length n) to be fit in the variance model. All have been rescaled so zero is the minimum. If NULL, then its a constant variance model. |
X |
Vector of length n of the outcome variable. |
censor.ind |
Vector of length n of the censoring indicator. 0=uncensored, -1=left censored and 1 is right censored. |
mean.intercept |
Logical to indicate if mean intercept is to be included in the model. |
maxit |
Number of maximum iterations for the EM algorithm. |
eps |
Very small number for the convergence criteria. |
A list of the results from the EM algorithm, including:
conv
: Logical argument indicating if convergence occurred
it
: Total iterations performed of the EM algorithm
reldiff
: the positive convergence tolerance that occured at the final iteration.
theta.new
: Vector of variance parameter estimates. Note that these are not yet
transformed back to the appropriate scale
mean
: Vector of mean parameter estimates
fittedmean
: Vector of fitted mean estimates
p.old
: Vector of fitted variance estimates
criterion
calculates various information criterion for the algorithms in this package
criterion(n, loglik, param)
criterion(n, loglik, param)
n |
Number of observations |
loglik |
Loglikelihood from model |
param |
Number of parameters fit in model |
A list of the four IC
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
A dataset containing 221 observations from a light detection and ranging (LIDAR) experiment.
lidar
lidar
A data frame with 221 rows and 2 variables:
distance travelled before the light is reflected back to its source.
logarithm of the ratio of received light from two laser sources.
...
Sigrist, M. (Ed.) (1994). Air Monitoring by Spectroscopic Techniques (Chemical Analysis Series, vol. 197). New York: Wiley.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003) Semiparametric Regression Cambridge University Press. http://stat.tamu.edu/~carroll/semiregbook/
library(VarReg) data(lidar) attach(lidar) plot(range,logratio)
library(VarReg) data(lidar) attach(lidar) plot(range,logratio)
linVarReg
performs multivariate mean and multivariate variance regression. This function is
designed to be used by the semiVarReg
function.
linVarReg( dat, var.ind = c(2), mean.ind = c(2), para.space = c("all", "positive", "negative"), control = list(...), ... )
linVarReg( dat, var.ind = c(2), mean.ind = c(2), para.space = c("all", "positive", "negative"), control = list(...), ... )
dat |
Dataframe containing outcome and covariate data. Outcome data must be in the first column. Covariates for mean and variance model in next columns. |
var.ind |
Vector containing the column numbers of the data in 'dat' to be fit as covariates in the variance model. FALSE indicates constant variance option. |
mean.ind |
Vector containing the column numbers of the data in 'dat' to be fit as covariates in the mean model. 0 indicates constant mean option. NULL indicates zero mean option. |
para.space |
Parameter space to search for variance parameter estimates. "positive" means only search positive parameter space, "negative" means search only negative parameter space and "all" means search all. |
control |
List of control parameters. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
linVarReg
returns a list of output including:
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed of the EM algorithm.
reldiff
: the positive convergence tolerance that occured at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
boundary
: Logical argument indicating if estimates are on the boundary.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
mean.ind
: Vector of integer(s) indicating the column number(s) in the dataframe
data
that were fit in the mean model.
mean
: Vector of the maximum likelihood estimates of the mean parameters.
var.ind
: Vector of integer(s) indicating the column(s) in the dataframe
data
that were fit in the variance model.
variance
: Vector of the maximum likelihood estimates of the variance parameters.
cens.ind
: Integer indicating the column in the dataframe data
that
corresponds to the censoring indicator. Always NULL.
data
: Dataframe containing the variables included in the model.
loop_em
is a basic EM loop function to be utilised by various other higher level functions.
loop_em(meanmodel, theta.old, p.old, x.0, X, maxit, eps)
loop_em(meanmodel, theta.old, p.old, x.0, X, maxit, eps)
meanmodel |
Dataframe containing only the covariates to be fit in the mean model. NULL for zero mean model and FALSE for constant mean model. |
theta.old |
Vector containing the initial variance parameter estimates to be fit in the variance model. |
p.old |
Vector of length n containing the containing the initial variance estimate. |
x.0 |
Matrix of covariates (length n) to be fit in the variance model. All have been rescaled so zero is the minimum. If NULL, then its a constant variance model. |
X |
Vector of length n of the outcome variable. |
maxit |
Number of maximum iterations for the EM algorithm. |
eps |
Very small number for the convergence criteria. |
A list of the results from the EM algorithm, including
conv
: Logical argument indicating if convergence occurred
it
: Total iterations performed of the EM algorithm
reldiff
: the positive convergence tolerance that occured at the final iteration.
theta.new
: Vector of variance parameter estimates. Note that these are not yet
transformed back to the appropriate scale
mean
: Vector of mean parameter estimates
fittedmean
: Vector of fitted mean estimates
p.old
: Vector of fitted variance estimates
loop_lss
is the EM loop function for the LSS model to be utilised by various other higher level functions
loop_lss( alldat, xiold, omega2old, nuold, mean.ind, var.ind, nu.ind, para.space, maxit, eps, int.maxit, print.it )
loop_lss( alldat, xiold, omega2old, nuold, mean.ind, var.ind, nu.ind, para.space, maxit, eps, int.maxit, print.it )
alldat |
Dataframe containing all the data for the models. Outcome in the first column. |
xiold |
Vector of initial location parameter estimates to be fit in the location model. |
omega2old |
Vector of initial scale2 parameter estimates to be fit in the scale2 model. |
nuold |
Vector of initial nu parameter estimates to be fit in the nu model. |
mean.ind |
Vector containing the column numbers of the data in 'alldat' to be fit as covariates in the location model. |
var.ind |
Vector containing the column numbers of the data in 'alldat' to be fit as covariates in the scale2 model. FALSE indicates a constant variance model. |
nu.ind |
Vector containing the column numbers of the data in 'alldat' to be fit as covariates in the nu model. NULL indicates constant model. |
para.space |
Parameter space to search for variance parameter estimates. "positive" means only search positive parameter space, "negative" means search only negative parameter space and "all" means search all. |
maxit |
Number of maximum iterations for the main EM algorithm. |
eps |
Very small number for the convergence criteria. |
int.maxit |
Number of maximum iterations for the internal EM algorithm for the location and scale. |
print.it |
Logical to indicate if the estimates for each iteration should be printed. |
A list of the results from the algorithm, including conv, reldiff, it, mean, xi.new, omega2.new, nu.new, fitted.xi
conv
: Logical argument indicating if convergence occurred
it
: Total iterations performed of the EM algorithm
reldiff
: the positive convergence tolerance that occured at the final iteration
xinew
: Vector of location parameter estimates
omega2new
: Vector of scale squared parameter estimates
nunew
: Vector of shape parameter estimates
fitted.xi
: Vector of fitted location estimates
lss_calc
performs calculations for transforming SN data (location, scale and shape) to mean, variance and skew. This function is utilised by other, higher level functions.
lss_calc(x)
lss_calc(x)
x |
Object of class lssVarReg (output from |
dataframe containing:
y
: y variable
x
: x variable
eta
: or fitted location estimates
omega
: or fitted scale estimates
shape
: or fitted shape estimates
predicted mean
: fitted mean estimates
predicted variance
: fitted variance estimates
Predicted skewness
: fitted skewness estimates
stand.res2
: Squared standardised residuals
lssVarReg
performs a semiparametric location ( or xi), shape (
or nu) and scale (
or omega) regression model. Currently, this is only designed for a single covariate that is fit in the location, scale and shape models.
lssVarReg( y, x, locationmodel = c("constant", "linear", "semi"), scale2model = c("constant", "linear", "semi"), shapemodel = c("constant", "linear"), knots.l = 2, knots.sc = 2, knots.sh = 2, degree = 2, mono.scale = c("none", "inc", "dec"), para.space = c("all", "positive", "negative"), location.init = NULL, scale2.init = NULL, shape.init = NULL, int.maxit = 1000, print.it = FALSE, control = list(...), ... )
lssVarReg( y, x, locationmodel = c("constant", "linear", "semi"), scale2model = c("constant", "linear", "semi"), shapemodel = c("constant", "linear"), knots.l = 2, knots.sc = 2, knots.sh = 2, degree = 2, mono.scale = c("none", "inc", "dec"), para.space = c("all", "positive", "negative"), location.init = NULL, scale2.init = NULL, shape.init = NULL, int.maxit = 1000, print.it = FALSE, control = list(...), ... )
y |
Vector containing outcome data. Must be no missing data. |
x |
Vector containing the covariate data, same length as |
locationmodel |
Text to specify the location model to be fit. Options: |
scale2model |
Text to specify the scale^2 model to be fit. Options: |
shapemodel |
Text to specify the shape model to be fit. Options: |
knots.l |
Integer indicating the number of internal knots to be fit in the location model. Default is '2'. (Note that the knots are placed equidistantly over x.) |
knots.sc |
Integer indicating the number of internal knots to be fit in the scale^2 model. Default is '2'. (Note that the knots are placed equidistantly over x.) |
knots.sh |
Integer indicating the number of internal knots to be fit in the shape model. Default is '2'. (Note that the knots are placed equidistantly over x.) |
degree |
Integer to indicate the degree of the splines fit in the location and scale. Default is '2'. |
mono.scale |
Text to indicate whether the scale2 model is monotonic. Default is |
para.space |
Text to indicate the parameter space to search for scale2 parameter estimates. |
location.init |
Vector of initial parameter estimates for the location model. Defaults to vector of 1's of appropriate length. |
scale2.init |
Vector of initial parameter estimates for the scale^2 model. Defaults to vector of 1's of appropriate length. |
shape.init |
Vector of initial parameter estimates for the shape model. Defaults to vector of 1's of appropriate length. |
int.maxit |
Integer of maximum iterations for the internal location and scale EM algorithm. Default is 1000 iterations. |
print.it |
Logical for printing progress of estimates through each iteration. Default is |
control |
List of control parameters for the algorithm. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
lssVarReg
returns an object of class "lssVarReg"
, which inherits most from class
"VarReg"
. This object of class lssVarReg
is a list of the following components:
modeltype
: Text indicating the model that was fit, always "LSS model".
locationmodel
, scale2model
, shapemodel
, knots.l
, knots.sc
,
knots.sh
, degree
,mono.scale
: Returning the input variables as described above
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed of the main algorithm (not including the
internal EM algorithm).
reldiff
: the positive convergence tolerance that occured at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
location
: Vector of the maximum likelihood estimates of the location parameters.
scale2
: Vector of the maximum likelihood estimates of the scale (squared) parameters.
shape
: Vector of the maximum likelihood estimates of the shape parameters.
data
: Dataframe containing the variables included in the model.
## run a model with linear mean, linear variance and constant shape (not run): ## lssmodel<-lssVarReg(mcycle$accel, mcycle$times, locationmodel="linear", scale2model="linear", ## shapemodel="constant", maxit=10000)
## run a model with linear mean, linear variance and constant shape (not run): ## lssmodel<-lssVarReg(mcycle$accel, mcycle$times, locationmodel="linear", scale2model="linear", ## shapemodel="constant", maxit=10000)
lssVarReg.multi
performs a semiparametric location ( or xi), shape (
or nu) and scale (
or omega) regression model. This is designed for multiple covariates that are fit in the location, scale and shape models.
lssVarReg.multi( y, x, locationmodel = c("constant", "linear", "semi"), location.vars = c(1), scale2model = c("constant", "linear", "semi"), scale2.vars = c(1), shapemodel = c("constant", "linear", "semi"), shape.vars = c(1), knots.l = NULL, knots.sc = NULL, knots.sh = NULL, degree = 2, location.init = NULL, scale2.init = NULL, shape.init = NULL, int.maxit = 1000, print.it = FALSE, control = list(...), ... )
lssVarReg.multi( y, x, locationmodel = c("constant", "linear", "semi"), location.vars = c(1), scale2model = c("constant", "linear", "semi"), scale2.vars = c(1), shapemodel = c("constant", "linear", "semi"), shape.vars = c(1), knots.l = NULL, knots.sc = NULL, knots.sh = NULL, degree = 2, location.init = NULL, scale2.init = NULL, shape.init = NULL, int.maxit = 1000, print.it = FALSE, control = list(...), ... )
y |
Vector containing outcome data. Must be no missing data. |
x |
Matrix containing the covariate data, same length as |
locationmodel |
Vector to specify the location model to be fit for each covariate. Options: |
location.vars |
Vector to specify the column(s) in |
scale2model |
Vector to specify the scale^2 model to be fit for each covariate. Options: |
scale2.vars |
Vector to specify the column(s) in |
shapemodel |
Vector to specify the shape model to be fit for each covariate. Options: |
shape.vars |
Vector to specify the column(s) in |
knots.l |
Vector indicating the number of internal knots to be fit in the location model for each covariate. Default is '2'. (Note that the knots are placed equidistantly over x.) |
knots.sc |
Vector indicating the number of internal knots to be fit in the scale^2 model for each covariate. Default is '2'. (Note that the knots are placed equidistantly over x.) |
knots.sh |
Vector indicating the number of internal knots to be fit in the shape model for each covariate. Default is '2'. (Note that the knots are placed equidistantly over x.) |
degree |
Integer to indicate the degree of the splines fit in the location, scale and shape. Default is '2'. |
location.init |
Vector of initial parameter estimates for the location model. Defaults to vector of 1's of appropriate length. |
scale2.init |
Vector of initial parameter estimates for the scale^2 model. Defaults to vector of 1's of appropriate length. |
shape.init |
Vector of initial parameter estimates for the shape model. Defaults to vector of 1's of appropriate length. |
int.maxit |
Integer of maximum iterations for the internal location and scale EM algorithm. Default is 1000 iterations. |
print.it |
Logical for printing progress of estimates through each iteration. Default is |
control |
List of control parameters for the algorithm. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
lssVarReg
returns an object of class "lssVarReg"
, which inherits most from class
"VarReg"
. This object of class lssVarReg
is a list of the following components:
modeltype
: Text indicating the model that was fit, always "LSS model" for this model.
locationmodel
, scale2model
, shapemodel
, knots.l
, knots.sc
,
knots.sh
, degree
,mono.scale
: Returning the input variables as described above
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed of the main algorithm (not including the
internal EM algorithm).
reldiff
: the positive convergence tolerance that occured at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
location
: Vector of the maximum likelihood estimates of the location parameters.
scale2
: Vector of the maximum likelihood estimates of the scale (squared) parameters.
shape
: Vector of the maximum likelihood estimates of the shape parameters.
data
: Dataframe containing the variables included in the model.
## not run ## library(palmerpenguins) ## cc<-na.omit(penguins) ## y<-cc$body_mass_g ## x<-as.data.frame(cbind(cc$bill_length_mm, cc$flipper_length_mm,cc$bill_depth_mm)) ## colnames(x) <-c("bill length mm", "flipper length mm","bill depth mm") ## model1<-lssVarReg.multi(y, x, ## locationmodel="linear", location.vars = 2, ## scale2model="constant", ## shapemodel=c("linear", "semi"), shape.vars = c(2,3), ## knots.sh = 1, int.maxit=10 ) ## model1[-21] ## print model
## not run ## library(palmerpenguins) ## cc<-na.omit(penguins) ## y<-cc$body_mass_g ## x<-as.data.frame(cbind(cc$bill_length_mm, cc$flipper_length_mm,cc$bill_depth_mm)) ## colnames(x) <-c("bill length mm", "flipper length mm","bill depth mm") ## model1<-lssVarReg.multi(y, x, ## locationmodel="linear", location.vars = 2, ## scale2model="constant", ## shapemodel=c("linear", "semi"), shape.vars = c(2,3), ## knots.sh = 1, int.maxit=10 ) ## model1[-21] ## print model
A dataset containing 133 observations from a simulated motorcycle accident, used to test crash helmets.
mcycle
mcycle
A data frame with 133 rows and 2 variables:
in milliseconds from time of impact
in g, acceleration of the head
...
Silverman, B. W. (1985) Some aspects of the spline smoothing approach to non-parametric curve fitting. Journal of the Royal Statistical Society series B 47, 1-52.
Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. Third Edition. Springer.
library(VarReg) data(mcycle) attach(mcycle) plot(times,accel)
library(VarReg) data(mcycle) attach(mcycle) plot(times,accel)
plotlssVarReg
is used to produce graphics for models fit in the VarReg
package with the
lssVarReg
function. As the skew-normal distribution is used to fit this type of model, the data needs
to be transformed from the SN parameters (location, scale and shape) to the typical mean,
variance and skew parameters.
plotlssVarReg(x, knot.lines = FALSE, xlab = "x", ylab = "y")
plotlssVarReg(x, knot.lines = FALSE, xlab = "x", ylab = "y")
x |
Object of class lssVarReg (output from |
knot.lines |
Logical to show the knot lines on the graphics (if model is type "semi").
Default is |
xlab |
Label to be placed on the x axis of graphics (covariate) |
ylab |
Label to be placed on the y axis of graphics (outcome) |
A graphic is returned, as well as a dataframe. The graphic returned is a 2 by 2 plot of:
the mean function over the x-variable, with or without the knot lines indicated
the variance function over the x-variable, with or without the knot lines indicated
the skew function over the x-variable, with or without the knot lines indicated
a Q-Q plot of the squared residuals from the model, plotted against the Chi-squared (df=1) distribution. For data from a skew-normal distribution, these residuals should follow a Chi-squared (df=1) distribution, regardless of skew.
The dataframe returned contains the following columns:
x
: x variable
y
: y variable
eta
: (), the location parameter
omega
: (), the scale parameter
shape
: (), the shape parameter
predicted~mean
: (), the mean
predicted~variance
: (), the variance
predicted~skewness
: (), the skew
stand.res2
: the standardised residuals squared.
data(mcycle) ## not run. LSS model followed by the basic plot command ##lssmodel<-lssVarReg(mcycle$accel, mcycle$times, locationmodel="linear", scale2model="linear", ##shapemodel="constant", maxit=10000) ##lssplot_out<-plotlssVarReg(lssmodel, xlab="Time in seconds", ylab="Acceleration")
data(mcycle) ## not run. LSS model followed by the basic plot command ##lssmodel<-lssVarReg(mcycle$accel, mcycle$times, locationmodel="linear", scale2model="linear", ##shapemodel="constant", maxit=10000) ##lssplot_out<-plotlssVarReg(lssmodel, xlab="Time in seconds", ylab="Acceleration")
plotVarReg
to produce graphics for models fit in this package.
plotVarReg( x, knot.lines = FALSE, ci = FALSE, ci.type = c("im", "boot"), bootreps = 1000, xlab = "x", ylab = "y", control = list(...), ... )
plotVarReg( x, knot.lines = FALSE, ci = FALSE, ci.type = c("im", "boot"), bootreps = 1000, xlab = "x", ylab = "y", control = list(...), ... )
x |
Object of class |
knot.lines |
Logical to indicate if knot lines should be shown on graphics
(if model is type "semi"). Default is |
ci |
Logical indicate if 95% CI should be shown on the plots. Default is |
ci.type |
Text to indicate the type of CI to plot. Either |
bootreps |
Integer to indicate the number of bootstrap replications to be performed if |
xlab |
Text for the label to be placed on the |
ylab |
Text for the label to be placed on the |
control |
list of control parameters to be used in bootstrapping.
See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
This function returns a 2x2 plot, with slightly different plots given, depending on the outcome data. For uncensored data, the plots are:
the mean function over the x
-variable, with or without 95% CI, and with or
without the knot lines indicated
the variance function over the x
-variable, with or without 95% CI and with or
without the knot lines indicated
a Q-Q plot of the residuals from the model
a histogram of the residuals from the model
If the outcome data is censored, the last two plots are no longer appropriate. Given the censored residuals from the model, we can compare the squared standardised residuals (given in black) with their censoring indicator to the chi-squared distribution with one degree of freedom (given in red). This is one of the plots given for censored data, and the other is a plot of the data, coloured by the censoring status. The triangles with the point at the top are bottom censored and the triangles with the point at the bottom are top censored.
data(mcycle) linmodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="linear", varmodel="linear", maxit=10000) plotVarReg(linmodel) plotVarReg(linmodel, ci=TRUE, ci.type="im", ylab="Range", xlab="Time in seconds") ##not run ##plotVarReg(linmodel, ci=TRUE, ci.type="boot", bootreps=10,ylab="Acceleration", ##xlab="Time in seconds") ##not run ##semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, maxit=10000) ##plotVarReg(semimodel, ci=TRUE, ci.type="boot",bootreps=10,ylab="Acceleration", ##xlab="Time in seconds", maxit=10000)
data(mcycle) linmodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="linear", varmodel="linear", maxit=10000) plotVarReg(linmodel) plotVarReg(linmodel, ci=TRUE, ci.type="im", ylab="Range", xlab="Time in seconds") ##not run ##plotVarReg(linmodel, ci=TRUE, ci.type="boot", bootreps=10,ylab="Acceleration", ##xlab="Time in seconds") ##not run ##semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, maxit=10000) ##plotVarReg(semimodel, ci=TRUE, ci.type="boot",bootreps=10,ylab="Acceleration", ##xlab="Time in seconds", maxit=10000)
searchVarReg
performs multiple semi-parametric mean and variance regression models for a covariate of interest, in order to search for the optimal number of knots. The best model is chosen based on the information criterion of preference ("selection"
). At the moment, this is only designed for a single covariate that is fit in both the mean and variance models.
searchVarReg( y, x, cens.ind = NULL, maxknots.m = 3, maxknots.v = 3, degree = 2, mono.var = c("none", "inc", "dec"), selection = c("AIC", "AICc", "HQC", "BIC"), print.it = FALSE, control = list(...), ... )
searchVarReg( y, x, cens.ind = NULL, maxknots.m = 3, maxknots.v = 3, degree = 2, mono.var = c("none", "inc", "dec"), selection = c("AIC", "AICc", "HQC", "BIC"), print.it = FALSE, control = list(...), ... )
y |
Vector containing outcome data. Must be no missing data and any censored values must be set to the limits of detection. |
x |
Vector containing the covariate data. Must be no missing data and same length as |
cens.ind |
Vector containing the censoring indicator, if applicable. There must be no missing
data contained in the vector and this vector should be the same length as |
maxknots.m |
Integer indicating the maximum number of internal knots to be fit in the mean model. Default is |
maxknots.v |
Integer indicating the maximum number of internal knots to be fit in the variance model. Default is |
degree |
The degree of the splines fit in the mean and variance. Default is |
mono.var |
Text to indicate whether the variance model is monotonic (only applied to 'linear' or
semi-parametric variance models). Default is " |
selection |
Text to indicate which information criteria is to be used for the selection of the
best model. Choices are " |
print.it |
Logical to indicate whether to print progress from each model as the models are
performed. Default is |
control |
list of control parameters. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
A matrix of models are performed, of increasing complexity. Mean models start at a zero mean
model, then constant mean, linear, 0 internal knots, etc, up to a maximum internal knots as specified
in maxknots.m
. Variance models start at constant variance, linear variance, 0 internal knots,
etc, up to max internal knots as specified in maxknots.v
.
Note that this function can take some time to run, due to the number of models to be fit. A window will appear on windows based systems to show a progress bar for the function.
searchVarReg
returns an list, with the following components:
ll
: a dataframe of the log-likelihoods from each of the models that have been fit.
AIC
: a dataframe of the AIC from each of the models that have been fit. The parameters
fit in the mean model are given in the columns, and the parameters in the variance are given
in the rows.
AICc
: a dataframe of the AIC-c from each of the models that have been fit.
BIC
: a dataframe of the BIC from each of the models that have been fit.
HQC
: a dataframe of the HQC from each of the models that have been fit.
best.model
: an object of class VarReg
(see semiVarReg
)
containing the output from the optimal model (that model within the specified models in
the mean and variance with the lowest information criterion according to the criterion selected).
data(mcycle) ### not run ### find<-searchVarReg(mcycle$accel, mcycle$times, maxknots.v=3, maxknots.m=3, ### selection="HQC", maxit=10000)
data(mcycle) ### not run ### find<-searchVarReg(mcycle$accel, mcycle$times, maxknots.v=3, maxknots.m=3, ### selection="HQC", maxit=10000)
semiVarReg
performs semi-parametric mean and variance regression models. Currently, this is
only designed for a single covariate that is fit in the mean and variance models.
semiVarReg( y, x, cens.ind = NULL, meanmodel = c("zero", "constant", "linear", "semi"), mean.intercept = TRUE, varmodel = c("constant", "linear", "semi"), knots.m = 2, knots.v = 2, degree = 2, mono.var = c("none", "inc", "dec"), para.space = c("all", "positive", "negative"), control = list(...), ... )
semiVarReg( y, x, cens.ind = NULL, meanmodel = c("zero", "constant", "linear", "semi"), mean.intercept = TRUE, varmodel = c("constant", "linear", "semi"), knots.m = 2, knots.v = 2, degree = 2, mono.var = c("none", "inc", "dec"), para.space = c("all", "positive", "negative"), control = list(...), ... )
y |
Vector containing outcome data. Must be no missing data and any censored values must be set to the limits of detection. |
x |
Vector containing the covariate data. Must be no missing data and same length as |
cens.ind |
Vector containing the censoring indicator, if applicable. There must be no missing
data contained in the vector and this vector should be the same length as |
meanmodel |
Text to specify the mean model to be fit to the data. The possible inputs are
|
mean.intercept |
Logical argument to indicate if the mean model is to include an intercept
term. This option is only available in the censored mean model, and the default= |
varmodel |
Text to specify the variance model to be fit to the data. The possible inputs are
|
knots.m |
Integer indicating the number of internal knots to be fit in the semi-parametric
mean model. Knots are placed equidistantly over the covariate. The default value is |
knots.v |
Integer indicating the number of internal knots to be fit in the semi-parametric
variance model. Knots are placed equidistantly over the covariate. The default value is |
degree |
Integer indicating the degree of the splines fit in the mean and the variance models.
The default value is |
mono.var |
Text to indicate whether the variance model is monotonic. Note that this is not
available for the |
para.space |
Text to indicate the parameter space to search for scale2 parameter estimates.
|
control |
list of control parameters. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
semiVarReg
returns an object of class "VarReg"
which inherits some components from the class "glm"
. This object of class "VarReg"
is a list containing the following components:
modeltype
: Text indicating the model that was fit, indicating if a censored approach or an uncensored approach was performed.
knots.m
, knots.v
, degree
, meanmodel
, varmodel
: Returning the input variables as described above
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed.
reldiff
: the positive convergence tolerance that occurred at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
boundary
: Logical argument indicating if the MLE is on the boundary of the parameter space.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
mean.ind
: Vector of integer(s) indicating the column number(s) in the dataframe
data
that were fit in the mean model.
mean
: Vector of the maximum likelihood estimates of the mean parameters.
var.ind
: Vector of integer(s) indicating the column(s) in the dataframe
data
that were fit in the variance model.
variance
: Vector of the maximum likelihood estimates of the variance parameters.
cens.ind
: Integer indicating the column in the dataframe data
that
corresponds to the censoring indicator.
data
: Dataframe containing the variables included in the model.
data(mcycle) ## run a model with linear mean and linear variance: linmodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="linear", varmodel="linear", maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric variance (2 knots): ##not run ##semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric monotonic ## variance (2 knots): ## not run ##semimodel_inc<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, mono.var="inc")
data(mcycle) ## run a model with linear mean and linear variance: linmodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="linear", varmodel="linear", maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric variance (2 knots): ##not run ##semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric monotonic ## variance (2 knots): ## not run ##semimodel_inc<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="semi", ##knots.m=4, knots.v=2, mono.var="inc")
semiVarReg.multi
performs semi-parametric mean and variance regression models. This is
designed for multiple covariates fit in the mean and variance models.
semiVarReg.multi( y, x, mean.model = c("zero", "constant", "linear", "semi"), mean.vars = c(1), knots.m = NULL, var.model = c("constant", "linear", "semi"), var.vars = c(1), knots.v = NULL, degree = 2, control = list(...), ... )
semiVarReg.multi( y, x, mean.model = c("zero", "constant", "linear", "semi"), mean.vars = c(1), knots.m = NULL, var.model = c("constant", "linear", "semi"), var.vars = c(1), knots.v = NULL, degree = 2, control = list(...), ... )
y |
Vector containing outcome data. Must be no missing data and any censored values must be set to the limits of detection. |
x |
Matrix containing the covariate data. Must be no missing data and same length as |
mean.model |
Vector to specify the mean model to be fit to the data. The possible inputs are
|
mean.vars |
Vector to specify column(s) in |
knots.m |
Vector indicating the number of internal knots to be fit in each of covariate(s) fit in the semi-parametric
mean model. Must be one entry per |
var.model |
Vector to specify the variance model to be fit to the data. The possible inputs are
|
var.vars |
Vector to specify column(s) in |
knots.v |
Vector indicating the number of internal knots to be fit in the semi-parametric variance model. Knots are placed equidistantly over the covariate. |
degree |
Integer indicating the degree of the splines fit in the mean and the variance models.
The default value is |
control |
list of control parameters. See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
semiVarReg.multi
returns an object of class "VarReg"
which inherits some components from the class "glm"
. This object of class "VarReg"
is a list containing the following components:
modeltype
: Text indicating the model that was fit, indicating an uncensored approach was performed.
knots.m
, knots.v
, degree
, meanmodel
, varmodel
: Returning the input variables as described above
converged
: Logical argument indicating if convergence occurred.
iterations
: Total iterations performed.
reldiff
: the positive convergence tolerance that occurred at the final iteration.
loglik
: Numeric variable of the maximised log-likelihood.
boundary
: Logical argument indicating if the MLE is on the boundary of the parameter space.
aic.c
: Akaike information criterion corrected for small samples
aic
: Akaike information criterion
bic
: Bayesian information criterion
hqc
: Hannan-Quinn information criterion
mean.ind
: Vector of integer(s) indicating the column number(s) in the dataframe
data
that were fit in the mean model.
mean
: Vector of the maximum likelihood estimates of the mean parameters.
var.ind
: Vector of integer(s) indicating the column(s) in the dataframe
data
that were fit in the variance model.
variance
: Vector of the maximum likelihood estimates of the variance parameters.
data
: Dataframe containing the variables included in the model.
data(mcycle) ## run a model with linear mean and linear variance: linmodel<-semiVarReg.multi(mcycle$accel, x=mcycle, mean.model="linear",mean.vars=2, var.model="linear", var.vars=2, maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric variance (2 knots): ##not run ##semimodel<-semiVarReg.multi(mcycle$accel, x=mcycle, meanmodel="semi",mean.vars=2, varmodel="semi", ##var.vars=2,knots.m=4, knots.v=2, maxit=10000)
data(mcycle) ## run a model with linear mean and linear variance: linmodel<-semiVarReg.multi(mcycle$accel, x=mcycle, mean.model="linear",mean.vars=2, var.model="linear", var.vars=2, maxit=10000) ## run a model with semi-parametric mean (4 internal knots) and semi-parametric variance (2 knots): ##not run ##semimodel<-semiVarReg.multi(mcycle$accel, x=mcycle, meanmodel="semi",mean.vars=2, varmodel="semi", ##var.vars=2,knots.m=4, knots.v=2, maxit=10000)
seVarReg
calculates SE for an object of class VarReg
. If the result is not on a
boundary, the Fishers Information matrix SE are given. The bootstrapped 95% CI can also be
calculated. Designed to be called by the plot function plotVarReg
, rather than run by a user.
seVarReg( x, boot = FALSE, bootreps = 1000, vector.mean = x$data[, 2], vector.variance = x$data[, 2], control = list(...), ... )
seVarReg( x, boot = FALSE, bootreps = 1000, vector.mean = x$data[, 2], vector.variance = x$data[, 2], control = list(...), ... )
x |
Object of class |
boot |
Logical to indicate if bootstrapped CI should be calculated. Default is |
bootreps |
Number of bootstraps to be performed if |
vector.mean |
Vector of |
vector.variance |
Vector of |
control |
List of control parameters for the bootstrapped models.
See |
... |
arguments to be used to form the default control argument if it is not supplied directly |
The result is a list of results. This includes:
mean.est
: dataframe of overall results from the mean model, including parameter estimates
from the model, SEs from information matrix (if boundary=FALSE
) and if specified, the SE
from bootstrapping with the bootstrapped 95% CI.
variance.est
: dataframe of overall results from the variance model, including parameter
estimates from the model, SEs from information matrix (if boundary=FALSE
) and if specified,
the SE from bootstrapping with the bootstrapped 95% CI.
mean.im
: dataframe of the expected information matrices for the mean (as appropriate)
variance.im
: dataframe of the expected information matrices for the variance
(as appropriate)
mean.outputs
: dataframe with complete output for mean graphics. Includes the
vector.mean
as input, and the mean vector (mean.mean
) and the SE vector
mean.se.im
, and bootstrapping outputs as appropriate.
variance.outputs
: dataframe with complete output for variance graphics. Includes the
vector.variance
as input, and the mean vector (var.mean
) and the SE vector
var.se.im
, and bootstrapping outputs as appropriate.
data(mcycle) ##Fit model with range as a covariate in the mean and the variance model semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="linear", knots.m=4, maxit=10000) ##Calculate SE se1<-seVarReg(semimodel, boot=FALSE) ##not run: with bootstrapping ##se2<-seVarReg(semimodel, boot=TRUE, bootreps=10) ##not run: calculate mean and SE for a given sequence ##test.seq<-seq(min(mcycle$times), max(mcycle$times), ##by=((max(mcycle$times)-min(mcycle$times))/999)) ##se2<-seVarReg(semimodel, boot=TRUE, bootreps=10, vector.mean=test.seq)
data(mcycle) ##Fit model with range as a covariate in the mean and the variance model semimodel<-semiVarReg(mcycle$accel, mcycle$times, meanmodel="semi", varmodel="linear", knots.m=4, maxit=10000) ##Calculate SE se1<-seVarReg(semimodel, boot=FALSE) ##not run: with bootstrapping ##se2<-seVarReg(semimodel, boot=TRUE, bootreps=10) ##not run: calculate mean and SE for a given sequence ##test.seq<-seq(min(mcycle$times), max(mcycle$times), ##by=((max(mcycle$times)-min(mcycle$times))/999)) ##se2<-seVarReg(semimodel, boot=TRUE, bootreps=10, vector.mean=test.seq)
Methods for fitting semi-parametric mean and variance models, with normal or censored data. Also extended to allow a regression in the location, scale and shape parameters.
This package provides functions to fit semi-parametric mean and variance regression models. These models are based upon EM-type algorithms, which can have more stable convergence properties than other algorithms for additive variance regression models.
The primary function to use for linear and semi-parametric mean and variance models is semiVarReg
.
This function also is able to fit models to censored outcome data. There is also a plot function for these
models called plotVarReg
.
A search function has also been produced in order to assist users to find the optimal number of knots in
the model (searchVarReg
).
The other functions that are of particular use are lssVarReg
and its plot function
plotlssVarReg
. This uses the skew-normal distribution and combines the EM algorithm with
a coordinate-ascent type algorithm in order to fit a regression model in the location, scale and shape,
therefore extending the semi-parametric models to non-normal data.
Multivariate models can be fit with semiVarReg.multi
and lssVarReg.multi
Kristy Robledo [email protected]
Use VarReg.control
to determine parameters for the fitting of semiVarReg
. Typically only used internally within functions.
VarReg.control(bound.tol = 1e-05, epsilon = 1e-06, maxit = 1000)
VarReg.control(bound.tol = 1e-05, epsilon = 1e-06, maxit = 1000)
bound.tol |
Positive tolerance for specifying the interior of the parameter space. This allows the algorithm to terminate early if an interior maximum is found. If set to |
epsilon |
Positive convergence tolerance. If |
maxit |
integer giving the maximum number of EM algorithm iterations for a given parameterisation. |
This is used similarly to glm.control
. If required, it may be internally passed to another function.
A list of the three components: bound.tol
, epsilon
and maxit
.
A dataset containing 100 observations of mean velocity of circumferential fibre shortening (vcf), made by long axis and short axis echocardiography.
vcf
vcf
A data frame with 133 rows and 3 variables:
patient identifier
vcf measurement from long axis
vcf measurement from short axis
...
Data from Bland JM, Altman DG. (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i, 307-310. (Supplied by Paul D'Arbela)
library(VarReg) data(vcf) attach(vcf) plot(rowMeans(vcf[-1]),vcf$vcflong-vcf$vcfshort)
library(VarReg) data(vcf) attach(vcf) plot(rowMeans(vcf[-1]),vcf$vcflong-vcf$vcfshort)