`method_cbps.Rd`

This page explains the details of estimating weights from covariate balancing propensity scores by setting `method = "cbps"` in the call to `weightit()` or `weightitMSM()`. This method can be used with binary, multinomial, and continuous treatments.

In general, this method relies on estimating propensity scores using generalized method of moments and then converting those propensity scores into weights using a formula that depends on the desired estimand. This method relies on `CBPS::CBPS()` from the CBPS package.
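To make the "propensity scores, then weights" step concrete, here is a minimal base-R sketch of the standard estimand-specific formulas for a binary treatment. It is an illustration only: the propensity scores come from ordinary logistic regression rather than from the generalized method of moments, and `ps_to_weights()` is a hypothetical helper, not a function in WeightIt or CBPS.

```r
# Illustrative data; all names here are made up for this sketch.
set.seed(123)
n <- 500
x <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.4 * x))

# Propensity scores from ordinary logistic regression
# (a stand-in for the GMM-based CBPS fit, which is NOT reproduced here).
ps <- fitted(glm(treat ~ x, family = binomial))

# Hypothetical helper: convert propensity scores to weights by estimand.
ps_to_weights <- function(ps, treat, estimand = c("ATE", "ATT", "ATC")) {
  estimand <- match.arg(estimand)
  switch(estimand,
         ATE = ifelse(treat == 1, 1 / ps, 1 / (1 - ps)),
         ATT = ifelse(treat == 1, 1, ps / (1 - ps)),
         ATC = ifelse(treat == 1, (1 - ps) / ps, 1))
}

w_ate <- ps_to_weights(ps, treat, "ATE")
w_att <- ps_to_weights(ps, treat, "ATT")
w_atc <- ps_to_weights(ps, treat, "ATC")

# Under the ATT the treated units keep weight 1;
# under the ATC the control units keep weight 1.
range(w_att[treat == 1])
range(w_atc[treat == 0])
```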

For binary treatments, this method estimates the propensity scores and weights using `CBPS::CBPS()`. The following estimands are allowed: ATE, ATT, and ATC. The weights are taken from the output of the `CBPS` fit object. When the estimand is the ATE, the returned propensity score is the probability of being in the "second" treatment group, i.e., `levels(factor(treat))[2]`; when the estimand is the ATC, the returned propensity score is the probability of being in the control (i.e., non-focal) group.
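As a quick check of which group counts as the "second" one, the expression can be evaluated directly on a treatment vector. The vector below is made up for illustration; factor levels are sorted, so with these labels the second level is `"treated"`.

```r
# Hypothetical treatment vector with character labels
treat <- c("treated", "control", "control", "treated", "control")

# Factor levels are sorted, so "control" comes before "treated"
levels(factor(treat))

# The group whose membership probability is returned under the ATE
second_group <- levels(factor(treat))[2]
second_group
```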

For multinomial treatments with three or four categories and when the estimand is the ATE, this method estimates the propensity scores and weights using one call to `CBPS::CBPS()`. For multinomial treatments with more than four categories or when the estimand is the ATT, this method estimates the propensity scores and weights using multiple calls to `CBPS::CBPS()`. The following estimands are allowed: ATE and ATT. The weights are taken from the output of the `CBPS` fit objects.

For continuous treatments, the generalized propensity score and weights are estimated using `CBPS::CBPS()`.

For longitudinal treatments, the weights are the product of the weights estimated at each time point. This is not how `CBPS::CBMSM()` in the CBPS package estimates weights for longitudinal treatments.
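The product rule for longitudinal treatments can be sketched in a few lines of base R. The per-time-point weights below are invented numbers, not output from `weightitMSM()`; the point is only that each unit's final weight is the product across time points.

```r
# Hypothetical per-time-point weights for 4 units at 3 time points
w_by_time <- cbind(
  t1 = c(1.2, 0.8, 1.0, 1.5),
  t2 = c(0.9, 1.1, 1.3, 1.0),
  t3 = c(1.0, 1.4, 0.7, 2.0)
)

# Final weight for each unit: product of its weights across time points
w_msm <- apply(w_by_time, 1, prod)
w_msm
```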

Sampling weights are supported through `s.weights` in all scenarios. See Note about sampling weights.

In the presence of missing data, the following value(s) for `missing` are allowed:

`"ind"` (default)

First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is `NA` and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with 0s (this value is arbitrary and does not affect estimation). The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting `weightit` object will be the original covariates with the `NA`s.
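The claim that the fill-in value is arbitrary can be checked with a small base-R sketch. It mimics the `"ind"` strategy for one covariate and uses ordinary logistic regression as a stand-in for the CBPS fit, so it illustrates the idea rather than WeightIt's internals; `fit_with_indicator()` and all variable names are hypothetical. Because the indicator enters as a main effect, changing the fill value only reparameterizes the model and leaves the fitted propensity scores unchanged.

```r
set.seed(42)
n <- 300
x <- rnorm(n)
z <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.5 * x - 0.3 * z))
x[sample(n, 40)] <- NA  # introduce missingness in x

# Hypothetical helper mimicking the "ind" strategy for a single covariate:
# add a missingness indicator and replace NAs with an arbitrary fill value.
fit_with_indicator <- function(fill) {
  x_miss <- as.numeric(is.na(x))      # 1 if x is NA, 0 otherwise
  x_fill <- ifelse(is.na(x), fill, x) # NAs replaced by `fill`
  glm(treat ~ x_fill + x_miss + z, family = binomial)
}

ps_fill0 <- fitted(fit_with_indicator(0))
ps_fill10 <- fitted(fit_with_indicator(10))

# The fitted propensity scores agree regardless of the fill value
all.equal(ps_fill0, ps_fill10, tolerance = 1e-6)
```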

All arguments to `CBPS()` can be passed through `weightit()` or `weightitMSM()`, with the following exceptions:

`method` in `CBPS()` is replaced with the argument `over` in `weightit()`. Setting `over = FALSE` in `weightit()` is the equivalent of setting `method = "exact"` in `CBPS()`.

`sample.weights` is ignored because sampling weights are passed using `s.weights`.

`standardize` is ignored.

All arguments take on the defaults of those in `CBPS()`. It may be useful in many cases to set `over = FALSE`, especially with continuous treatments.

`obj`

When `include.obj = TRUE`, the CB(G)PS model fit. For binary treatments, multinomial treatments with `estimand = "ATE"` and four or fewer treatment levels, and continuous treatments, the output of the call to `CBPS::CBPS()`. For multinomial treatments with `estimand = "ATT"` or with more than four treatment levels, a list of `CBPS` fit objects.

CBPS estimates the coefficients of a logistic regression model (for binary treatments), multinomial logistic regression model (for multinomial treatments), or linear regression model (for continuous treatments) that is used to compute (generalized) propensity scores, from which the weights are computed. It involves augmenting the standard regression score equations with the balance constraints in an over-identified generalized method of moments estimation. The idea is to nudge the estimation of the coefficients toward those that produce balance in the weighted sample. The just-identified version (with `over = FALSE`) does away with the score equations for the coefficients so that only the balance constraints (and the score equation for the variance of the error with a continuous treatment) are used. The just-identified version will therefore produce superior balance on the means (i.e., corresponding to the balance constraints) for binary and multinomial treatments and linear terms for continuous treatments than will the over-identified version.
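The just-identified idea can be sketched for a binary treatment and the ATE in base R. This is a simplified illustration under stated assumptions, not the algorithm used by `CBPS::CBPS()`: it keeps only the balance conditions, solves them by minimizing their squared norm with `optim()`, and starts from ordinary logistic regression coefficients. All names (`balance_moments()`, `objective()`, `wmd()`) are hypothetical.

```r
set.seed(2024)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))
X <- cbind(1, x1, x2)  # design matrix with an intercept column

# Balance conditions for the ATE: the sample mean of
# (T - p) / (p * (1 - p)) * X should be zero at the solution.
balance_moments <- function(beta) {
  p <- plogis(drop(X %*% beta))
  colMeans((treat - p) / (p * (1 - p)) * X)
}

# Just-identified case: as many conditions as coefficients, so minimizing
# the squared norm drives every condition to (approximately) zero.
objective <- function(beta) sum(balance_moments(beta)^2)

# Start from ordinary logistic regression coefficients
start <- coef(glm(treat ~ x1 + x2, family = binomial))
fit <- optim(start, objective, method = "BFGS",
             control = list(reltol = 1e-14, maxit = 1000))

# ATE weights implied by the balancing propensity scores
p_hat <- plogis(drop(X %*% fit$par))
w <- ifelse(treat == 1, 1 / p_hat, 1 / (1 - p_hat))

# Weighted mean difference between groups for a covariate
wmd <- function(x) {
  weighted.mean(x[treat == 1], w[treat == 1]) -
    weighted.mean(x[treat == 0], w[treat == 0])
}
c(x1 = wmd(x1), x2 = wmd(x2))
```

Because the intercept is among the balance conditions, the weights sum to the same total in both groups, which is why the weighted mean differences shrink toward zero along with the conditions themselves.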

Note that WeightIt provides less functionality than does the CBPS package in terms of the versions of CBPS available; for extensions to CBPS, the CBPS package may be preferred.

**Binary treatments**

Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 243–263.

**Multinomial Treatments**

Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 243–263.

**Continuous treatments**

Fong, C., Hazlett, C., & Imai, K. (2018). Covariate balancing propensity score for a continuous treatment: Application to the efficacy of political advertisements. The Annals of Applied Statistics, 12(1), 156–177. doi:10.1214/17-AOAS1101

`CBPS::CBPS()` for the fitting function

When sampling weights are used with `CBPS::CBPS()`, the estimated weights already incorporate the sampling weights. When `weightit()` is used with `method = "cbps"`, the estimated weights are separated from the sampling weights, as they are with all other methods.
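In practice, keeping the two sets of weights separate means an analysis that needs a single weight per unit has to combine them. A minimal sketch, assuming the combined weight is the product of the estimated weights and the sampling weights; the numbers are invented and stand in for the `weights` and `s.weights` components of a `weightit` object.

```r
# Hypothetical values standing in for components of a weightit object
est_w  <- c(1.0, 0.6, 1.8, 1.0, 0.4)  # estimated weights, as in W$weights
samp_w <- c(2.0, 1.0, 1.5, 0.5, 1.0)  # sampling weights, as in W$s.weights
y      <- c(3.1, 2.4, 5.0, 4.2, 1.9)  # made-up outcome

# Assumption of this sketch: the combined weight is the product
final_w <- est_w * samp_w

# Example use: a weighted mean of the outcome
weighted.mean(y, final_w)
```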

```
library("WeightIt")  # provides weightit()
library("cobalt")    # provides bal.tab() and the lalonde data
data("lalonde", package = "cobalt")
#Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", estimand = "ATT"))
#> A weightit object
#> - method: "cbps" (covariate balancing propensity score weighting)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: 2-category
#> - estimand: ATT (focal: 1)
#> - covariates: age, educ, married, nodegree, re74
summary(W1)
#> Summary of weights
#>
#> - Weight ranges:
#>
#>           Min                                  Max
#> treated 1.000                ||             1.0000
#> control 0.017 |---------------------------| 2.2742
#>
#> - Units with 5 most extreme weights by group:
#>
#>              5      4      3      2      1
#> treated      1      1      1      1      1
#>            589    595    269    409    296
#> control 1.4755 1.4873 1.5799 1.7484 2.2742
#>
#> - Weight statistics:
#>
#>         Coef of Var   MAD Entropy # Zeros
#> treated       0.000 0.000  -0.000       0
#> control       0.839 0.707   0.341       0
#>
#> - Effective Sample Sizes:
#>
#>            Control Treated
#> Unweighted  429.       185
#> Weighted    251.99     185
bal.tab(W1)
#> Call
#> weightit(formula = treat ~ age + educ + married + nodegree +
#> re74, data = lalonde, method = "cbps", estimand = "ATT")
#>
#> Balance Measures
#>                Type Diff.Adj
#> prop.score Distance   0.0163
#> age         Contin.  -0.0032
#> educ        Contin.   0.0017
#> married      Binary  -0.0003
#> nodegree     Binary  -0.0003
#> re74        Contin.   0.0005
#>
#> Effective sample sizes
#>            Control Treated
#> Unadjusted  429.       185
#> Adjusted    251.99     185
if (FALSE) {
  #Balancing covariates with respect to race (multinomial)
  (W2 <- weightit(race ~ age + educ + married +
                    nodegree + re74, data = lalonde,
                  method = "cbps", estimand = "ATE"))
  summary(W2)
  bal.tab(W2)
}
#Balancing covariates with respect to re75 (continuous)
(W3 <- weightit(re75 ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", over = FALSE))
#> A weightit object
#> - method: "cbps" (covariate balancing propensity score weighting)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: continuous
#> - covariates: age, educ, married, nodegree, re74
summary(W3)
#> Summary of weights
#>
#> - Weight ranges:
#>
#>        Min                                   Max
#> all 0.0153 |---------------------------| 13.1553
#>
#> - Units with 5 most extreme weights by group:
#>
#>        484     482     180     481     483
#> all 9.2225 10.8379 11.1353 11.9904 13.1553
#>
#> - Weight statistics:
#>
#>     Coef of Var   MAD Entropy # Zeros
#> all       1.152 0.449   0.288       0
#>
#> - Effective Sample Sizes:
#>
#>             Total
#> Unweighted 614.
#> Weighted   264.09
bal.tab(W3)
#> Call
#> weightit(formula = re75 ~ age + educ + married + nodegree + re74,
#> data = lalonde, method = "cbps", over = FALSE)
#>
#> Balance Measures
#>             Type Corr.Adj
#> age      Contin.  -0.0002
#> educ     Contin.  -0.0000
#> married   Binary  -0.0002
#> nodegree  Binary   0.0000
#> re74     Contin.  -0.0003
#>
#> Effective sample sizes
#>             Total
#> Unadjusted 614.
#> Adjusted   264.09
```