# Generate Balancing Weights

`weightit()` allows for the easy generation of balancing weights using a variety of available methods for binary, continuous, and multi-category treatments. Many of these methods exist in other packages, which `weightit()` calls; these packages must be installed to use the desired method. Also included are `print()` and `summary()` methods for examining the output.

## Usage

```
weightit(formula,
         data = NULL,
         method = "glm",
         estimand = "ATE",
         stabilize = FALSE,
         focal = NULL,
         by = NULL,
         s.weights = NULL,
         ps = NULL,
         moments = NULL,
         int = FALSE,
         subclass = NULL,
         missing = NULL,
         verbose = FALSE,
         include.obj = FALSE,
         ...)

# S3 method for weightit
print(x, ...)
```

## Arguments

- formula
a formula with a treatment variable on the left-hand side and the covariates to be balanced on the right-hand side. See `glm()` for more details. Interactions and functions of covariates are allowed.

- data
an optional data set in the form of a data frame that contains the variables in `formula`.

- method
a string of length 1 containing the name of the method that will be used to estimate weights. See Details below for allowable options. The default is `"glm"` for propensity score weighting using a generalized linear model to estimate the propensity score.

- estimand
the desired estimand. For binary and multi-category treatments, can be `"ATE"`, `"ATT"`, `"ATC"`, and, for some methods, `"ATO"`, `"ATM"`, or `"ATOS"`. The default for both treatment types is `"ATE"`. This argument is ignored for continuous treatments. See the individual pages for each method for more information on which estimands are allowed with each method and what literature to read to interpret these estimands.

- stabilize
`logical`; whether or not to stabilize the weights. For the methods that involve estimating propensity scores, this involves multiplying each unit's weight by the proportion of units in their treatment group. Default is `FALSE`.

- focal
when multi-category treatments are used and ATT weights are requested, which group to consider the "treated" or focal group. This group will not be weighted, and the other groups will be weighted to be more like the focal group. If specified, `estimand` will automatically be set to `"ATT"`.

- by
a string containing the name of the variable in `data` for which weighting is to be done within categories or a one-sided formula with the stratifying variable on the right-hand side. For example, if `by = "gender"` or `by = ~gender`, weights will be generated separately within each level of the variable `"gender"`. (The argument used to be called `exact`, which will still work but with a message.) Only one `by` variable is allowed; to stratify by multiple variables simultaneously, create a new variable that is a full cross of those variables using `interaction()`.

- s.weights
A vector of sampling weights or the name of a variable in `data` that contains sampling weights. These can also be matching weights if weighting is to be used on matched data. See the individual pages for each method for information on whether sampling weights can be supplied.

- ps
A vector of propensity scores or the name of a variable in `data` containing propensity scores. If not `NULL`, `method` is ignored, and the propensity scores will be used to create weights. `formula` must include the treatment variable in `data`, but the listed covariates will play no role in the weight estimation. Using `ps` is similar to calling `get_w_from_ps()` directly, but produces a full `weightit` object rather than just producing weights.

- moments
`numeric`; for some methods, the greatest power of each covariate to be balanced. For example, if `moments = 3`, for each non-categorical covariate, the covariate, its square, and its cube will be balanced. This argument is ignored for other methods; to balance powers of the covariates, appropriate functions must be entered in `formula`. See the individual pages for each method for information on whether they accept `moments`.

- int
`logical`; for some methods, whether first-order interactions of the covariates are to be balanced. This argument is ignored for other methods; to balance interactions between the variables, appropriate functions must be entered in `formula`. See the individual pages for each method for information on whether they accept `int`.

- subclass
`numeric`; the number of subclasses to use for computing weights using marginal mean weighting with subclasses (MMWS). If `NULL`, standard inverse probability weights (and their extensions) will be computed; if a number greater than 1, subclasses will be formed and weights will be computed based on subclass membership. Attempting to set a non-`NULL` value for methods that don't compute a propensity score will result in an error; see each method's help page for information on whether MMWS weights are compatible with the method. See `get_w_from_ps()` for details and references.

- missing
`character`; how missing data should be handled. The options and defaults depend on the `method` used. Ignored if no missing data is present. It should be noted that multiple imputation outperforms all missingness methods available in `weightit()` and should probably be used instead. Consider the MatchThem package for the use of `weightit()` with multiply imputed data.

- verbose
`logical`; whether to print additional information output by the fitting function.

- include.obj
`logical`; whether to include in the output any fit objects created in the process of estimating the weights. For example, with `method = "glm"`, the `glm` objects containing the propensity score model will be included. See the individual pages for each method for information on what object will be included if `TRUE`.

- ...
other arguments for functions called by `weightit()` that control aspects of fitting that are not covered by the above arguments. See Details.

- x
a `weightit` object; the output of a call to `weightit()`.
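
As a rough illustration of what the `stabilize` description means for propensity score methods, the sketch below computes ATE inverse probability weights by hand from a logistic regression and then multiplies each unit's weight by the proportion of units in its treatment group. The simulated data and the names `x`, `a`, `p`, `w`, and `sw` are assumptions made for this example. It uses base R only and does not call `weightit()`, whose internals may differ.

```r
# Simulated data; everything here is hypothetical and for illustration only.
set.seed(123)
n <- 500
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))  # binary treatment that depends on x

# Propensity score from a logistic regression (a generalized linear model)
fit <- glm(a ~ x, family = binomial)
p <- fitted(fit)

# Unstabilized ATE weights: 1/p for treated units, 1/(1 - p) for controls
w <- ifelse(a == 1, 1 / p, 1 / (1 - p))

# Stabilization as described above: multiply each weight by the
# proportion of units in that unit's treatment group
prop_treated <- mean(a == 1)
sw <- w * ifelse(a == 1, prop_treated, 1 - prop_treated)

# Average rescaling factor per group; it equals that group's proportion
tapply(sw / w, a, mean)
```

Because the rescaling is constant within each treatment group, stabilization changes the scale of the weights but not their relative sizes within a group.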

## Value

A `weightit` object with the following elements:

- weights
The estimated weights, one for each unit.

- treat
The values of the treatment variable.

- covs
The covariates used in the fitting. Only includes the raw covariates, which may have been altered in the fitting process.

- estimand
The estimand requested.

- method
The weight estimation method specified.

- ps
The estimated or provided propensity scores. Estimated propensity scores are returned for binary treatments and only when `method` is `"glm"`, `"gbm"`, `"cbps"`, `"super"`, or `"bart"`.

- s.weights
The provided sampling weights.

- focal
The focal group if the ATT was requested with a multi-category treatment.

- by
A data.frame containing the `by` variable when specified.

- obj
When `include.obj = TRUE`, the fit object.

- info
Additional information about the fitting. See the individual methods pages for what is included.

## Details

The primary purpose of `weightit()` is as a dispatcher to functions that perform the estimation of balancing weights using the requested `method`. Below are the methods allowed and links to pages containing more information about them, including additional arguments and outputs (e.g., when `include.obj = TRUE`), how missing values are treated, which estimands are allowed, and whether sampling weights are allowed.

- `"glm"`: Propensity score weighting using generalized linear models.
- `"gbm"`: Propensity score weighting using generalized boosted modeling.
- `"cbps"`: Covariate Balancing Propensity Score weighting.
- `"npcbps"`: Non-parametric Covariate Balancing Propensity Score weighting.
- `"ebal"`: Entropy balancing.
- `"optweight"`: Optimization-based weighting.
- `"super"`: Propensity score weighting using SuperLearner.
- `"bart"`: Propensity score weighting using Bayesian additive regression trees (BART).
- `"energy"`: Energy balancing.

`method` can also be supplied as a user-defined function; see `method_user` for instructions and examples.

When using `weightit()`, please cite both the WeightIt package (using `citation("WeightIt")`) and the paper(s) in the references section of the method used.

## See also

`weightitMSM()` for estimating weights with sequential (i.e., longitudinal) treatments for use in estimating marginal structural models (MSMs).

`weightit.fit()`, which is a lower-level dispatcher function that accepts a matrix of covariates and a vector of treatment statuses rather than a formula and data frame and performs minimal argument checking and processing. It may be useful for speeding up simulation studies for which the correct arguments are known. In general, `weightit()` should be used.

## Examples

```
library("WeightIt")
library("cobalt")
data("lalonde", package = "cobalt")

# Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "glm", estimand = "ATT"))
#> A weightit object
#> - method: "glm" (propensity score weighting with GLM)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: 2-category
#> - estimand: ATT (focal: 1)
#> - covariates: age, educ, married, nodegree, re74
summary(W1)
#> Summary of weights
#>
#> - Weight ranges:
#>
#>            Min                                  Max
#> treated 1.0000 || 1.0000
#> control 0.0222 |---------------------------| 2.0438
#>
#> - Units with the 5 most extreme weights by group:
#>
#>             10      8      4      3      1
#> treated      1      1      1      1      1
#>            411    595    269    409    296
#> control 1.3303 1.4365 1.5005 1.6369 2.0438
#>
#> - Weight statistics:
#>
#>         Coef of Var   MAD Entropy # Zeros
#> treated       0.000 0.000   -0.00       0
#> control       0.823 0.701    0.33       0
#>
#> - Effective Sample Sizes:
#>
#>            Control Treated
#> Unweighted  429.       185
#> Weighted    255.99     185
bal.tab(W1)
#> Call
#>  weightit(formula = treat ~ age + educ + married + nodegree +
#>     re74, data = lalonde, method = "glm", estimand = "ATT")
#>
#> Balance Measures
#>                Type Diff.Adj
#> prop.score Distance   0.0199
#> age         Contin.   0.0459
#> educ        Contin.  -0.0360
#> married      Binary   0.0044
#> nodegree     Binary   0.0080
#> re74        Contin.  -0.0275
#>
#> Effective sample sizes
#>            Control Treated
#> Unadjusted  429.       185
#> Adjusted    255.99     185

# Balancing covariates with respect to race (multi-category)
(W2 <- weightit(race ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "ebal", estimand = "ATE"))
#> A weightit object
#> - method: "ebal" (entropy balancing)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: 3-category (black, hispan, white)
#> - estimand: ATE
#> - covariates: age, educ, married, nodegree, re74
summary(W2)
#> Summary of weights
#>
#> - Weight ranges:
#>
#>           Min                                Max
#> black  0.5530 |-------------------------| 5.3496
#> hispan 0.1409 |----------------| 3.3322
#> white  0.3979 |-------| 1.9224
#>
#> - Units with the 5 most extreme weights by group:
#>
#>           226    244    485    181    182
#> black  2.5215 2.5491 2.8059 3.5551 5.3496
#>           392    564    269    345    371
#> hispan 2.0467   2.53 2.6322 2.7049 3.3322
#>            68    457    599    589    531
#> white  1.7106 1.7226 1.7426 1.7743 1.9224
#>
#> - Weight statistics:
#>
#>        Coef of Var   MAD Entropy # Zeros
#> black        0.590 0.413   0.131       0
#> hispan       0.609 0.440   0.163       0
#> white        0.371 0.306   0.068       0
#>
#> - Effective Sample Sizes:
#>
#>             black hispan  white
#> Unweighted 243.    72.   299.
#> Weighted   180.47  52.71 262.93
bal.tab(W2)
#> Call
#>  weightit(formula = race ~ age + educ + married + nodegree + re74,
#>     data = lalonde, method = "ebal", estimand = "ATE")
#>
#> Balance summary across all treatment pairs
#>             Type Max.Diff.Adj
#> age      Contin.       0.0001
#> educ     Contin.       0.0000
#> married   Binary       0.0001
#> nodegree  Binary       0.0001
#> re74     Contin.       0.0001
#>
#> Effective sample sizes
#>             black hispan  white
#> Unadjusted 243.    72.   299.
#> Adjusted   180.47  52.71 262.93

# Balancing covariates with respect to re75 (continuous)
(W3 <- weightit(re75 ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", over = FALSE))
#> A weightit object
#> - method: "cbps" (covariate balancing propensity score weighting)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: continuous
#> - covariates: age, educ, married, nodegree, re74
summary(W3)
#> Summary of weights
#>
#> - Weight ranges:
#>
#>        Min                                   Max
#> all 0.0153 |---------------------------| 13.1539
#>
#> - Units with the 5 most extreme weights:
#>
#>        484     482     180     481     483
#> all 9.2281 10.8362 11.0895 11.9898 13.1539
#>
#> - Weight statistics:
#>
#>     Coef of Var   MAD Entropy # Zeros
#> all       1.151 0.449   0.288       0
#>
#> - Effective Sample Sizes:
#>
#>             Total
#> Unweighted 614.
#> Weighted   264.37
bal.tab(W3)
#> Call
#>  weightit(formula = re75 ~ age + educ + married + nodegree + re74,
#>     data = lalonde, method = "cbps", over = FALSE)
#>
#> Balance Measures
#>             Type Corr.Adj
#> age      Contin.       -0
#> educ     Contin.       -0
#> married   Binary       -0
#> nodegree  Binary        0
#> re74     Contin.       -0
#>
#> Effective sample sizes
#>             Total
#> Unadjusted 614.
#> Adjusted   264.37
```
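
The effective sample sizes reported in the output above can be approximated by hand. The following is a minimal sketch, assuming the usual approximation ESS = (sum of weights)^2 / (sum of squared weights) applied within each group. The helper name `ess` is hypothetical, and the weight vector is made up rather than taken from `lalonde`.

```r
# Hypothetical helper: approximate effective sample size of a weight vector
ess <- function(w) sum(w)^2 / sum(w^2)

# Equal weights leave the sample size unchanged
ess(rep(1, 185))

# Unequal weights reduce it below the number of units
# (made-up weights for illustration)
w <- c(0.2, 0.5, 1, 1.5, 3)
ess(w)
```

Given the `weights` and `treat` elements described under Value, the same calculation could presumably be applied to one group of a `weightit` object, for example `ess(W1$weights[W1$treat == 0])` for the control units.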