Fitting Function for Optweight for Survey Weights
optweight.svy.fit
performs the optimization (via osqp; Anderson, 2018) for optweight.svy
and should, in most cases, not be used directly. No processing of inputs is performed, so they must be given exactly as described below.
Usage
optweight.svy.fit(covs,
tols = 0,
targets,
s.weights = NULL,
norm = "l2",
std.binary = FALSE,
std.cont = TRUE,
min.w = 1E-8,
verbose = FALSE,
...)
Arguments
- covs
A matrix of covariates to be targeted. Must be numeric but does not have to be full rank.
- tols
A vector of target balance tolerance values.
- targets
A vector of target population mean values for each covariate. The resulting weights will yield sample means within
tols
units of the target values for each covariate. If any target values are NA, the corresponding variable will not be targeted and its weighted mean will be wherever the weights yield the smallest variance. To ensure the weighted mean for a covariate is equal to its unweighted mean (i.e., so that its original mean is its target mean), its original mean must be supplied as a target.
- s.weights
A vector of sampling weights. Optimization occurs on the product of the sampling weights and the estimated weights.
- norm
A string containing the name of the norm corresponding to the objective function to minimize. The options are
"l1"
for the L1 norm, "l2"
for the L2 norm (the default), and "linf"
for the L\(\infty\) norm. The L1 norm minimizes the average absolute distance between each weight and the mean of the weights; the L2 norm minimizes the variance of the weights; the L\(\infty\) norm minimizes the largest weight. The L2 norm has a direct correspondence with the effective sample size, making it ideal if this is your criterion of interest.
- std.binary, std.cont
logical
; whether the tolerances are in standardized mean units (TRUE
) or raw units (FALSE
) for binary variables and continuous variables, respectively. The default is FALSE
for std.binary
because raw proportion differences make more sense than standardized mean differences for binary variables.
- min.w
A single
numeric
value between 0 and 1 for the smallest allowable weight. Some analyses require nonzero weights for all units, so a small, nonzero minimum may be desirable. Doing so will likely (slightly) increase the variance of the resulting weights depending on the magnitude of the minimum. The default is 1e-8, which does not materially change the properties of the weights from a minimum of 0 but prevents warnings in some packages that use weights to estimate treatment effects.
- verbose
Whether information on the optimization problem solution should be printed. This information contains how many iterations it took to estimate the weights and whether the solution is optimal.
- ...
Options that are passed to
osqpSettings
for use in the pars
argument of solve_osqp
.
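The criteria described under norm can be computed directly for any weight vector. The following is a minimal base-R sketch, not part of the package: weight_criteria is a hypothetical helper, and the scaling (weights rescaled to mean 1, variance with denominator n) is an assumption made for illustration. It also shows the correspondence between the variance of the weights and the effective sample size.

```r
# Hypothetical helper (not part of optweight): summarise a weight vector
# under the criterion each norm targets.
weight_criteria <- function(w) {
  w <- w / mean(w)                     # rescale so the weights average 1 (assumption)
  n <- length(w)
  pop_var <- mean((w - 1)^2)           # variance of the weights, denominator n
  c(l1   = mean(abs(w - 1)),           # average absolute distance from the mean weight
    l2   = pop_var,                    # variance of the weights
    linf = max(w),                     # largest weight
    ess  = sum(w)^2 / sum(w^2),        # effective sample size
    ess_from_var = n / (1 + pop_var))  # the same quantity written via the variance
}

w <- c(0.5, 0.8, 1.0, 1.2, 1.5)
weight_criteria(w)
```

Because the effective sample size equals n / (1 + variance) when the weights average 1, lowering the variance of the weights raises the effective sample size.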
Value
An optweight.svy.fit
object with the following elements:
- w
The estimated weights, one for each unit.
- duals
A data.frame containing the dual variables for each covariate. See Zubizarreta (2015) for interpretation of these values.
- info
The
info
component of the output of solve_osqp
, which contains information on the performance of the optimization at termination.
Details
optweight.svy.fit
transforms the inputs into the required inputs for solve_osqp
, which are (sparse) matrices and vectors, and then supplies the outputs (the weights, dual variables, and convergence information) back to optweight.svy
. No processing of inputs is performed, as this is normally handled by optweight.svy
.
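To give a sense of what such a problem looks like, here is a rough sketch of an L2-norm targeting problem passed to solve_osqp. It is not the package's internal code: the toy data, the constraint layout, and the solver settings are assumptions chosen for illustration, and it ignores s.weights and standardized tolerances.

```r
library(Matrix)
library(osqp)

set.seed(1)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))  # toy covariates (made up)
targets <- c(0.2, 0.5)                             # desired weighted means
tols <- c(0, 0)                                    # exact targeting, raw units
min.w <- 1e-8

# Objective (1/2) w'Pw with P = I. Once mean(w) is fixed at 1, minimising
# the sum of squared weights is the same as minimising their variance.
P <- sparseMatrix(i = seq_len(n), j = seq_len(n), x = 1)

# Constraint rows: mean(w) = 1; weighted means within tols of targets; w >= min.w
A <- Matrix(rbind(rep(1 / n, n), t(X) / n, diag(n)), sparse = TRUE)
l <- c(1, targets - tols, rep(min.w, n))
u <- c(1, targets + tols, rep(Inf, n))

fit <- solve_osqp(P = P, q = rep(0, n), A = A, l = l, u = u,
                  pars = osqpSettings(verbose = FALSE, eps_abs = 1e-8,
                                      eps_rel = 1e-8, max_iter = 20000L))

w <- fit$x                  # primal solution: the weights
fit$info$status             # solver status at termination
colSums(w * X) / sum(w)     # weighted means, to compare against targets
```

The y element of the result holds the dual variables for the constraint rows, which is presumably where the duals reported by optweight.svy.fit originate.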
References
Anderson, E. (2018). osqp: Quadratic Programming Solver using the 'OSQP' Library. R package version 0.1.0. https://CRAN.R-project.org/package=osqp
Wang, Y., & Zubizarreta, J. R. (2017). Approximate Balancing Weights: Characterizations from a Shrinkage Estimation Perspective. ArXiv:1705.00998 [Math, Stat]. Retrieved from http://arxiv.org/abs/1705.00998
Zubizarreta, J. R. (2015). Stable Weights that Balance Covariates for Estimation With Incomplete Outcome Data. Journal of the American Statistical Association, 110(511), 910–922. doi:10.1080/01621459.2015.1023805
See also
optweight.svy
which you should use for estimating the balancing weights, unless you know better.
https://osqp.org/docs/index.html for more information on osqp, the underlying solver, and the options for solve_osqp
.
osqpSettings
for details on options for solve_osqp
.
Examples
library("cobalt")
data("lalonde", package = "cobalt")
covs <- splitfactor(lalonde[c("age", "educ", "race",
"married", "nodegree")],
drop.first = FALSE)
targets <- c(23, 9, .3, .3, .4, .2, .5)
tols <- rep(0, 7)
ows.fit <- optweight.svy.fit(covs,
tols = tols,
targets = targets,
norm = "l2")
#Unweighted means
apply(covs, 2, mean)
#> age educ race_black race_hispan race_white married
#> 27.3631922 10.2687296 0.3957655 0.1172638 0.4869707 0.4153094
#> nodegree
#> 0.6302932
#Weighted means; same as targets
apply(covs, 2, weighted.mean, w = ows.fit$w)
#> age educ race_black race_hispan race_white married
#> 27.3631922 10.2687296 0.3957655 0.1172638 0.4869707 0.4153094
#> nodegree
#> 0.6302932
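
A quick way to confirm that a set of weights meets its targets is to tabulate the weighted means next to the targets. The sketch below uses only base R; check_targets is a hypothetical helper (not part of the package), the toy data are made up, and the comparison is in raw units with a small numerical slack of 1e-6 that is an arbitrary choice.

```r
# Hypothetical helper: compare weighted covariate means with their targets.
# Targets given as NA are treated as "not targeted" and always pass.
check_targets <- function(covs, w, targets, tols = 0) {
  covs <- as.matrix(covs)
  wm <- apply(covs, 2, weighted.mean, w = w)
  tols <- rep_len(tols, length(targets))
  data.frame(weighted_mean = wm,
             target = targets,
             diff = wm - targets,
             within_tol = is.na(targets) | abs(wm - targets) <= tols + 1e-6,
             row.names = colnames(covs))
}

# Toy data: target the mean of x only, leave z untargeted
toy_covs <- cbind(x = c(1, 2, 3, 4), z = c(0, 0, 1, 1))
check_targets(toy_covs, w = c(1, 1, 1, 1), targets = c(2.5, NA))
check_targets(toy_covs, w = c(2, 1, 1, 0), targets = c(2.5, NA))
```

With the objects from the example above, the equivalent call would be check_targets(covs, ows.fit$w, targets, tols).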