Nonparametric Covariate Balancing Propensity Score Weighting
Source: R/weightit2npcbps.R
method_npcbps.Rd
This page explains the details of estimating weights from nonparametric covariate balancing propensity scores by setting method = "npcbps" in the call to weightit() or weightitMSM(). This method can be used with binary, multi-category, and continuous treatments.
In general, this method estimates weights by maximizing the empirical likelihood of the data subject to balance constraints. It relies on CBPS::npCBPS() from the CBPS package.
Binary Treatments
For binary treatments, this method estimates the weights using CBPS::npCBPS(). The ATE is the only estimand allowed. The weights are taken from the output of the npCBPS fit object.
Multi-Category Treatments
For multi-category treatments, this method estimates the weights using CBPS::npCBPS(). The ATE is the only estimand allowed. The weights are taken from the output of the npCBPS fit object.
Continuous Treatments
For continuous treatments, this method estimates the weights using CBPS::npCBPS(). The weights are taken from the output of the npCBPS fit object.
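A minimal sketch of a continuous-treatment call, assuming the WeightIt, CBPS, and cobalt packages are installed. Treating re75 from the lalonde data as the continuous treatment is an illustrative choice made here, not one taken from this page, and the fit may be slow.

```r
library("WeightIt")
data("lalonde", package = "cobalt")

# re75 is used as a continuous "treatment" purely for illustration
W3 <- weightit(re75 ~ age + educ + married + nodegree + re74,
               data = lalonde, method = "npcbps")

# Inspect the distribution of the estimated weights
summary(W3)

# For a continuous treatment, balance is assessed through
# treatment-covariate correlations in the weighted sample
cobalt::bal.tab(W3)
```

No estimand is supplied because the estimand restriction described above applies to binary and multi-category treatments.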
Longitudinal Treatments
For longitudinal treatments, the weights are the product of the weights estimated at each time point. This is not how CBPS::CBMSM() estimates weights for longitudinal treatments.
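The product rule can be illustrated with a small base R sketch. The per-time-point weights below are made-up numbers standing in for the weights that would be estimated separately at each time point. The object names are hypothetical.

```r
# Hypothetical per-time-point weights for 5 units at 3 time points
# (made-up values; in practice each column would come from a separate fit)
w_by_time <- cbind(t1 = c(1.2, 0.8, 1.0, 1.5, 0.9),
                   t2 = c(0.9, 1.1, 1.0, 0.7, 1.3),
                   t3 = c(1.0, 1.0, 2.0, 1.1, 0.8))

# The longitudinal weight for each unit is the product across time points
w_msm <- apply(w_by_time, 1, prod)
w_msm
```

A unit with a large weight at any single time point carries that weight into the final product, so extreme weights can compound over time.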
Missing Data
In the presence of missing data, the following value(s) for missing are allowed:
"ind" (default)
First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is NA and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with the covariate medians (this value is arbitrary and does not affect estimation). The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting weightit object will be the original covariates with the NAs.
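The "ind" steps can be sketched in base R as follows. This is an illustration of the idea only, not the package's internal code: the helper add_missing_indicators() and the "_NA" suffix for indicator names are hypothetical, and the data are made up.

```r
# Toy covariates with missing values (made-up data)
covs <- data.frame(x1 = c(2, NA, 5, 7, NA),
                   x2 = c(1, 0, 1, NA, 0))

# Hypothetical helper mirroring the steps described above
add_missing_indicators <- function(d) {
  for (v in names(d)) {
    miss <- is.na(d[[v]])
    if (any(miss)) {
      # Indicator: 1 where the original value was NA, 0 otherwise
      d[[paste0(v, "_NA")]] <- as.integer(miss)
      # Replace the NAs with the median of the observed values
      d[[v]][miss] <- median(d[[v]], na.rm = TRUE)
    }
  }
  d
}

covs_ind <- add_missing_indicators(covs)
covs_ind
```

The indicator columns would then enter the model formula as main effects alongside the filled-in covariates.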
Details
Nonparametric CBPS involves the specification of a constrained optimization problem over the weights. The constraints correspond to covariate balance, and the loss function is the empirical likelihood of the data given the weights. npCBPS is similar to entropy balancing and will generally produce similar results. Because the optimization problem of npCBPS is not convex, it can be slow to converge or fail to converge at all. For this reason, approximate balance is allowed instead through the cor.prior argument, which controls the average deviation from zero correlation between the treatment and covariates that is allowed.
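As a rough sketch of the exact-balance version of this problem, following the general form given in Fong, Hazlett, and Imai (2018) for a continuous treatment, the weights can be written as the solution to the program below. The notation is introduced here for illustration: $w_i$ is the weight for unit $i$, $n$ is the sample size, and $X_i^{*}$ and $T_i^{*}$ are assumed to be the centered and scaled covariates and treatment.

```latex
\max_{w_1, \dots, w_n} \; \sum_{i=1}^{n} \log w_i
\quad \text{subject to} \quad
\sum_{i=1}^{n} w_i X_i^{*} T_i^{*} = 0, \quad
\sum_{i=1}^{n} w_i X_i^{*} = 0, \quad
\sum_{i=1}^{n} w_i T_i^{*} = 0, \quad
\sum_{i=1}^{n} w_i = n, \quad
w_i > 0 .
```

The first constraint sets the weighted correlation between the treatment and each covariate to zero, and the next two preserve the marginal means. Allowing approximate balance amounts to relaxing the first constraint so that small nonzero correlations are permitted.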
Additional Arguments
moments and int are accepted. See weightit() for details.
quantile
A named list of quantiles (values between 0 and 1) for each continuous covariate, which are used to create additional variables that, when balanced, ensure balance on the corresponding quantile of the variable. For example, setting quantile = list(x1 = c(.25, .5, .75)) ensures the 25th, 50th, and 75th percentiles of x1 in each treatment group will be balanced in the weighted sample. Can also be a single number (e.g., .5) or an unnamed list of length 1 (e.g., list(c(.25, .5, .75))) to request the same quantile(s) for all continuous covariates, or a named vector (e.g., c(x1 = .5, x2 = .75)) to request one quantile for each covariate. Only allowed with binary and multi-category treatments.
All arguments to npCBPS() can be passed through weightit() or weightitMSM().
All arguments take on the defaults of those in npCBPS().
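A hedged sketch of how the moments and quantile arguments described above might be supplied for a binary treatment, assuming the WeightIt, CBPS, and cobalt packages are installed. The choice of covariates and quantiles is illustrative only. Adding balance constraints can make the optimization slower or less likely to converge.

```r
library("WeightIt")
data("lalonde", package = "cobalt")

# Request balance on the first two moments of the covariates (moments = 2)
# and on the quartiles of age (quantile), for a binary treatment
W4 <- weightit(treat ~ age + educ + married + nodegree + re74,
               data = lalonde, method = "npcbps", estimand = "ATE",
               moments = 2,
               quantile = list(age = c(.25, .5, .75)))

summary(W4)
```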
Additional Outputs
obj
When include.obj = TRUE, the nonparametric CB(G)PS model fit. The output of the call to CBPS::npCBPS().
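A short sketch of retrieving the fit object, assuming the WeightIt, CBPS, and cobalt packages are installed. The model formula is the same illustrative one used in the Examples section.

```r
library("WeightIt")
data("lalonde", package = "cobalt")

# Ask weightit() to keep the underlying fit alongside the weights
W5 <- weightit(treat ~ age + educ + married + nodegree + re74,
               data = lalonde, method = "npcbps", estimand = "ATE",
               include.obj = TRUE)

# The fit returned by the fitting function is stored in the obj component
class(W5$obj)
names(W5$obj)
```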
References
Fong, C., Hazlett, C., & Imai, K. (2018). Covariate balancing propensity score for a continuous treatment: Application to the efficacy of political advertisements. The Annals of Applied Statistics, 12(1), 156–177. doi:10.1214/17-AOAS1101
See also
weightit(), weightitMSM(), method_cbps
CBPS::npCBPS() for the fitting function
Examples
# Examples take a long time to run
library("cobalt")
data("lalonde", package = "cobalt")
# Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "npcbps", estimand = "ATE"))
#> A weightit object
#> - method: "npcbps" (non-parametric covariate balancing propensity score weighting)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: 2-category
#> - estimand: ATE
#> - covariates: age, educ, married, nodegree, re74
summary(W1)
#>                  Summary of weights
#>
#> - Weight ranges:
#>
#>            Min                                  Max
#> treated 0.5587 |---------------------------| 9.8862
#> control 0.5594 |---|                         2.1293
#>
#> - Units with the 5 most extreme weights by group:
#>
#>            172     69     58    181    182
#> treated 3.3634  4.199 8.3691 8.4396 9.8862
#>            411    595    269    409    296
#> control 1.6451 1.6633 1.7413 1.8249 2.1293
#>
#> - Weight statistics:
#>
#>         Coef of Var   MAD Entropy # Zeros
#> treated       1.143 0.512   0.302       0
#> control       0.269 0.230   0.035       0
#>
#> - Effective Sample Sizes:
#>
#>            Control Treated
#> Unweighted  429.    185.
#> Weighted    400.19   80.48
bal.tab(W1)
#> Balance Measures
#>             Type Diff.Adj
#> age      Contin.   0.0295
#> educ     Contin.  -0.0135
#> married   Binary   0.0407
#> nodegree  Binary  -0.0121
#> re74     Contin.   0.0704
#>
#> Effective sample sizes
#>            Control Treated
#> Unadjusted  429.    185.
#> Adjusted    400.19   80.48
# Balancing covariates with respect to race (multi-category)
(W2 <- weightit(race ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "npcbps", estimand = "ATE"))
#> A weightit object
#> - method: "npcbps" (non-parametric covariate balancing propensity score weighting)
#> - number of obs.: 614
#> - sampling weights: none
#> - treatment: 3-category (black, hispan, white)
#> - estimand: ATE
#> - covariates: age, educ, married, nodegree, re74
summary(W2)
#>                  Summary of weights
#>
#> - Weight ranges:
#>
#>           Min                                 Max
#> black  0.6417 |--------------------------| 9.3668
#> hispan 0.2853 |---------------|            5.5253
#> white  0.4681 |-----|                      2.5147
#>
#> - Units with the 5 most extreme weights by group:
#>
#>           226    244    605    181    182
#> black  2.5367 2.7284 3.0888 4.5045 9.3668
#>           392    564    269    371    345
#> hispan 2.0057 2.2325 3.0749 4.2753 5.5253
#>            68    457    599    589    531
#> white  1.9505 1.9707 2.0169 2.1335 2.5147
#>
#> - Weight statistics:
#>
#>        Coef of Var   MAD Entropy # Zeros
#> black        0.753 0.413   0.160       0
#> hispan       0.820 0.462   0.220       0
#> white        0.414 0.331   0.079       0
#>
#> - Effective Sample Sizes:
#>
#>            black hispan  white
#> Unweighted 243.   72.   299.
#> Weighted   155.3  43.31 255.36
bal.tab(W2)
#> Balance summary across all treatment pairs
#>             Type Max.Diff.Adj
#> age      Contin.       0.0310
#> educ     Contin.       0.0431
#> married   Binary       0.0225
#> nodegree  Binary       0.0158
#> re74     Contin.       0.0432
#>
#> Effective sample sizes
#>            black hispan  white
#> Unadjusted 243.   72.   299.
#> Adjusted   155.3  43.31 255.36