Some methods involve tuning parameters and use a measure of covariate balance as the criterion for selecting the optimal parameter values. For example, with method = "gbm", a balance criterion can be used to select the optimal number of trees. In WeightIt, the argument stop.method controls which criterion is optimized, following the naming conventions originally used in twang. This page documents the values available for stop.method. There has been some research on which balance criteria perform better than others; see the References section for some articles. How each method fares in any given dataset depends on a variety of factors, so multiple methods should be tried and compared before moving forward with a set of weights.

Each treatment type has its own balance criteria available.

### Binary Treatments

"es.mean", "es.max", "es.rms"

The average, maximum, or root mean squared absolute standardized mean difference (ASMD) among the covariates, respectively. The ASMD is computed using col_w_smd() in cobalt. All covariates are standardized, including binary covariates (note that in cobalt, raw mean differences are used for binary variables by default). The standardization factor (i.e., s.d.denom) depends on the estimand requested and follows the conventions used in cobalt.
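
As an illustration of how the three summaries relate, the following minimal Python sketch computes them for a binary treatment. It is not the col_w_smd() implementation: the function names are hypothetical, and the denominator is assumed to be the square root of the average of the unweighted group variances (the ATE-style convention).

```python
import math

def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)

def sample_variance(x):
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

def asmd(x, treat, w):
    """Absolute standardized mean difference for one covariate."""
    x1 = [xi for xi, t in zip(x, treat) if t == 1]
    x0 = [xi for xi, t in zip(x, treat) if t == 0]
    w1 = [wi for wi, t in zip(w, treat) if t == 1]
    w0 = [wi for wi, t in zip(w, treat) if t == 0]
    # Assumption: ATE-style denominator built from unweighted group variances.
    sd = math.sqrt((sample_variance(x1) + sample_variance(x0)) / 2)
    return abs(weighted_mean(x1, w1) - weighted_mean(x0, w0)) / sd

def es_criteria(covariates, treat, w):
    """Summarize the per-covariate ASMDs as mean, max, and root mean square."""
    d = [asmd(x, treat, w) for x in covariates.values()]
    return {
        "es.mean": sum(d) / len(d),
        "es.max": max(d),
        "es.rms": math.sqrt(sum(v * v for v in d) / len(d)),
    }

# Toy data (made up for illustration): one imbalanced and one balanced covariate.
treat = [1, 1, 1, 0, 0, 0]
covariates = {"age": [1, 2, 3, 2, 3, 4], "score": [0, 1, 2, 0, 1, 2]}
weights = [1.0] * 6
criteria = es_criteria(covariates, treat, weights)
```

Because "es.max" tracks only the worst covariate while "es.mean" and "es.rms" aggregate over all of them, the three criteria can select different parameter values on the same data.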

"ks.mean", "ks.max", "ks.rms"

The average, maximum, or root mean squared Kolmogorov-Smirnov (KS) statistic among the covariates, respectively. The KS statistic is computed using col_w_ks() in cobalt.
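
For a single covariate, the KS statistic is the largest gap between the weighted empirical distribution functions of the two groups. The sketch below illustrates that definition in plain Python; it is not the col_w_ks() implementation, and the function name is hypothetical.

```python
def weighted_ks(x, treat, w):
    """Largest absolute gap between the weighted ECDFs of the two groups."""
    total1 = sum(wi for wi, t in zip(w, treat) if t == 1)
    total0 = sum(wi for wi, t in zip(w, treat) if t == 0)
    ks = 0.0
    # The ECDFs only change at observed values, so those are the only cut points to check.
    for cut in sorted(set(x)):
        f1 = sum(wi for xi, t, wi in zip(x, treat, w) if t == 1 and xi <= cut) / total1
        f0 = sum(wi for xi, t, wi in zip(x, treat, w) if t == 0 and xi <= cut) / total0
        ks = max(ks, abs(f1 - f0))
    return ks

# Toy data (made up for illustration).
ks_separated = weighted_ks([1, 2, 3, 4, 5, 6], [1, 1, 1, 0, 0, 0], [1.0] * 6)
ks_identical = weighted_ks([1, 2, 3, 1, 2, 3], [1, 1, 1, 0, 0, 0], [1.0] * 6)
ks_partial = weighted_ks([1, 3, 2, 4], [1, 1, 0, 0], [1.0] * 4)
```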

"mahalanobis"

The sample Mahalanobis distance in the weighted sample. This is similar to "es.rms" except that variables that are redundant with each other will be downweighted. The sample Mahalanobis distance is computed using the (generalized) inverse of the unweighted covariance matrix computed in the focal group when the estimand is the ATT or ATC, and using the (generalized) inverse of the average of the unweighted covariance matrices computed within each treatment group otherwise (analogous to the ASMD, which uses the average of the group variances in its denominator).
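
One way to write this down, as a sketch rather than the package's exact formula, is the following. The symbols are introduced here for illustration: $\bar{x}_{1,w}$ and $\bar{x}_{0,w}$ are the weighted covariate mean vectors in the two groups, $S_1$ and $S_0$ are the unweighted within-group covariance matrices, and $\Sigma^{-}$ is a generalized inverse. Whether a square root is applied is an assumption; it does not change which parameter value minimizes the criterion.

$$
d_M = \sqrt{(\bar{x}_{1,w} - \bar{x}_{0,w})^\top \, \Sigma^{-} \, (\bar{x}_{1,w} - \bar{x}_{0,w})},
\qquad
\Sigma =
\begin{cases}
S_{\text{focal}} & \text{ATT or ATC} \\
\tfrac{1}{2}(S_1 + S_0) & \text{otherwise}
\end{cases}
$$

If $\Sigma$ were diagonal, $d_M^2$ would reduce to the sum of squared standardized mean differences, which is the number of covariates times the square of "es.rms". The off-diagonal terms are what downweight redundant covariates.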

"energy.dist"

The energy distance between the weighted samples, as described by Huling & Mak (2020). The "improved" energy distance is used when the estimand is not the ATT. (Note that weights directly minimizing the energy distance can be found using method = "energy".)
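
To give a sense of what is being measured, the sketch below computes the basic energy distance between two weighted samples using Euclidean distances. It is a simplified illustration with hypothetical function names: the "improved" variant and any covariate scaling applied by the package are not shown.

```python
import math

def normalize(w):
    total = sum(w)
    return [wi / total for wi in w]

def mean_distance(xa, wa, xb, wb):
    """Weighted average Euclidean distance over all pairs (a, b)."""
    return sum(wi * wj * math.dist(a, b)
               for a, wi in zip(xa, wa)
               for b, wj in zip(xb, wb))

def energy_distance(xa, wa, xb, wb):
    """2 E|A - B| - E|A - A'| - E|B - B'| with weights normalized within each sample."""
    wa, wb = normalize(wa), normalize(wb)
    return (2 * mean_distance(xa, wa, xb, wb)
            - mean_distance(xa, wa, xa, wa)
            - mean_distance(xb, wb, xb, wb))

# Toy data (made up for illustration): each unit is a tuple of covariate values.
same = energy_distance([(0.0, 0.0), (1.0, 0.0)], [1, 1],
                       [(0.0, 0.0), (1.0, 0.0)], [1, 1])
shifted = energy_distance([(0.0, 0.0)], [1], [(3.0, 4.0)], [1])
```

The distance is zero when the two weighted distributions coincide and grows as they separate, so it is sensitive to differences in the full joint distribution rather than in the means alone.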

"r2"

The pseudo-R2 of a logistic regression of the treatment on the covariates with the weights applied. Franklin et al. (2014) consider a similar metric, the post-matching C-statistic, but the pseudo-R2 accomplishes the same goal without requiring a decision boundary. The pseudo-R2 used is the McKelvey & Zavoina pseudo-R2.
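
For reference, the McKelvey & Zavoina pseudo-R2 for a logistic model is usually written as below, where $\hat{\eta}_i$ denotes the fitted linear predictor (notation introduced here) and $\pi^2/3$ is the variance of the standard logistic distribution. Exactly how the weights enter the variance term in the package is not stated on this page, so this should be read as the general form only.

$$
R^2_{MZ} = \frac{\widehat{\operatorname{Var}}(\hat{\eta})}{\widehat{\operatorname{Var}}(\hat{\eta}) + \pi^2/3}
$$

When the weighted covariates carry no information about treatment, the linear predictor is nearly constant and the statistic is close to 0.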

"L1.med"

The L1 statistic of the weighted samples, which is half the average absolute difference in proportion for categories of a multidimensional histogram formed by coarsening the covariates. The coarsening used is the one that yields the median unweighted L1 statistic among 101 random coarsenings of the data. Each continuous covariate is coarsened into between 2 and 12 bins, and each categorical covariate is combined into between 2 and 12 levels (or however many levels are available). Because the coarsening is random, a seed should be set to ensure results are replicable.
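
The sketch below illustrates the statistic for one fixed coarsening, assuming the commonly used form: half the sum, over histogram cells, of the absolute difference between the two groups' weighted cell proportions. The random coarsening step and the selection of the median coarsening are not shown, and the function name is hypothetical.

```python
from collections import defaultdict

def l1_statistic(cells, treat, w):
    """Half the summed absolute difference in weighted cell proportions.

    `cells` holds one hashable label per unit identifying its histogram cell,
    for example a tuple of bin indices, one per coarsened covariate.
    """
    totals = {0: 0.0, 1: 0.0}
    mass = defaultdict(lambda: {0: 0.0, 1: 0.0})
    for c, t, wi in zip(cells, treat, w):
        mass[c][t] += wi
        totals[t] += wi
    return 0.5 * sum(abs(m[1] / totals[1] - m[0] / totals[0]) for m in mass.values())

# Toy data (made up for illustration): two covariates, each coarsened into bins 0/1.
l1_overlap = l1_statistic([(0, 0), (0, 0), (1, 1), (1, 1)], [1, 0, 1, 0], [1.0] * 4)
l1_disjoint = l1_statistic([(0, 0), (0, 0), (1, 1), (1, 1)], [1, 1, 0, 0], [1.0] * 4)
```

Under this form the statistic ranges from 0 (identical cell proportions) to 1 (no shared cells).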

### Multinomial Treatments

"es.mean", "es.max", "es.rms"

The average, maximum, or root mean squared absolute standardized mean difference (ASMD) among the covariates, respectively, across all pairs of treatments. The ASMD is computed using col_w_smd() in cobalt. All covariates are standardized, including binary covariates (note that in cobalt, raw mean differences are used for binary variables by default). The standardization factor (i.e., s.d.denom) depends on the estimand requested and follows the conventions used in cobalt. The same standardization factor is used across treatment pairs for each covariate. When the estimand is the ATT, only differences between the focal group and each other group are computed.

"ks.mean", "ks.max", "ks.rms"

The average, maximum, or root mean squared Kolmogorov-Smirnov (KS) statistic among the covariates, respectively, across all pairs of treatments. The KS statistic is computed using col_w_ks() in cobalt. When the estimand is the ATT, only differences between the focal group and each other group are computed.

"energy.dist"

The total energy distance among the weighted samples, as described by Huling & Mak (2020). The "improved" energy distance is used when the estimand is not the ATT. (Note that weights directly minimizing the energy distance can be found using method = "energy".)

"L1.med"

The L1 statistic of the weighted samples, which is the average absolute difference in proportion for categories of a multidimensional histogram formed by coarsening the covariates, divided by the number of treatment levels. The coarsening used is the one that yields the median unweighted L1 statistic among 101 random coarsenings of the data. Each continuous covariate is coarsened into between 2 and 12 bins, and each categorical covariate is combined into between 2 and 12 levels (or however many levels are available). Because the coarsening is random, a seed should be set to ensure results are replicable.

### Continuous Treatments

"p.mean", "p.max", "p.rms"

The average, maximum, or root mean squared absolute Pearson correlation between the treatment and covariates, respectively. The Pearson correlation is computed using col_w_cov() in cobalt. The correlation uses the unweighted standard deviations of the treatment and covariates in the denominator.
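
The following sketch shows the idea of a weighted covariance divided by unweighted standard deviations. It is not the col_w_cov() implementation: the function names are hypothetical, and the choice to divide by n (rather than n - 1) in the standard deviations is an assumption made so that uniform weights reproduce the ordinary Pearson correlation.

```python
import math

def weighted_corr(t, x, w):
    """Weighted covariance of t and x divided by their unweighted SDs."""
    n, sw = len(t), sum(w)
    t_bar_w = sum(wi * ti for ti, wi in zip(t, w)) / sw
    x_bar_w = sum(wi * xi for xi, wi in zip(x, w)) / sw
    cov_w = sum(wi * (ti - t_bar_w) * (xi - x_bar_w)
                for ti, xi, wi in zip(t, x, w)) / sw
    # Assumption: unweighted SDs use the divide-by-n form.
    t_bar, x_bar = sum(t) / n, sum(x) / n
    sd_t = math.sqrt(sum((ti - t_bar) ** 2 for ti in t) / n)
    sd_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / n)
    return cov_w / (sd_t * sd_x)

def p_criteria(covariates, t, w):
    """Summarize the absolute treatment-covariate correlations."""
    r = [abs(weighted_corr(t, x, w)) for x in covariates.values()]
    return {
        "p.mean": sum(r) / len(r),
        "p.max": max(r),
        "p.rms": math.sqrt(sum(v * v for v in r) / len(r)),
    }

# Toy data (made up for illustration).
dose = [1, 2, 3, 4]
covs = {"linear": [2, 4, 6, 8], "unrelated": [1, -1, -1, 1]}
p_out = p_criteria(covs, dose, [1.0] * 4)
```

Keeping the denominator unweighted means the criterion can only be reduced by shrinking the weighted covariance, not by weights that happen to inflate the variances.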

"s.mean", "s.max", "s.rms"

The average, maximum, or root mean squared absolute Spearman correlation between the treatment and covariates, respectively. The Spearman correlation is computed using col_w_cov() in cobalt. The correlation uses the unweighted standard deviations of the rank-transformed treatment and covariates in the denominator.
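
The Spearman version can be illustrated by rank-transforming the treatment and covariate first and then applying the same weighted-covariance calculation. The sketch below is an illustration with hypothetical function names, not the col_w_cov() implementation; ties receive average ranks, and the divide-by-n form of the standard deviation is an assumption.

```python
import math

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def weighted_spearman(t, x, w):
    """Weighted covariance of the ranks divided by the unweighted SDs of the ranks."""
    rt, rx = average_ranks(t), average_ranks(x)
    n, sw = len(rt), sum(w)
    rt_bar_w = sum(wi * a for a, wi in zip(rt, w)) / sw
    rx_bar_w = sum(wi * b for b, wi in zip(rx, w)) / sw
    cov_w = sum(wi * (a - rt_bar_w) * (b - rx_bar_w)
                for a, b, wi in zip(rt, rx, w)) / sw
    # Assumption: unweighted SDs of the ranks use the divide-by-n form.
    sd_t = math.sqrt(sum((a - sum(rt) / n) ** 2 for a in rt) / n)
    sd_x = math.sqrt(sum((b - sum(rx) / n) ** 2 for b in rx) / n)
    return cov_w / (sd_t * sd_x)

# Toy data (made up for illustration): a monotone but nonlinear relationship.
tied_ranks = average_ranks([10, 20, 20, 30])
rho = weighted_spearman([1, 2, 3, 4], [1, 4, 9, 100], [1.0] * 4)
```

Because only ranks are used, a monotone but nonlinear dependence between treatment and covariate is still detected at full strength.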

"r2"

The model R2 of a linear regression of the treatment on the covariates with the weights applied. The standard R2 is used.
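
For a weighted linear regression with an intercept, the standard R2 can be written as below. The notation is introduced here for illustration: $w_i$ are the weights, $t_i$ the treatment values, $\hat{t}_i$ the fitted values, and $\bar{t}_w$ the weighted mean of the treatment.

$$
R^2 = 1 - \frac{\sum_i w_i \,(t_i - \hat{t}_i)^2}{\sum_i w_i \,(t_i - \bar{t}_w)^2}
$$

A value near 0 indicates that, in the weighted sample, the covariates jointly explain almost none of the variation in the treatment.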

"L1.med"

The L1 statistic of the weighted samples, which is the average absolute difference in proportion for categories of a multidimensional histogram formed by coarsening the covariates and treatment, divided by the number of levels of the coarsened treatment. The coarsening used is the one that yields the median unweighted L1 statistic among 101 random coarsenings of the data. The treatment and each continuous covariate are coarsened into between 2 and 12 bins, and each categorical covariate is combined into between 2 and 12 levels (or however many levels are available). Because the coarsening is random, a seed should be set to ensure results are replicable.

## References

Ali, M. S., Groenwold, R. H. H., Pestman, W. R., Belitser, S. V., Roes, K. C. B., Hoes, A. W., de Boer, A., & Klungel, O. H. (2014). Propensity score balance measures in pharmacoepidemiology: A simulation study. Pharmacoepidemiology and Drug Safety, 23(8), 802–811. doi:10.1002/pds.3574

Belitser, S. V., Martens, E. P., Pestman, W. R., Groenwold, R. H. H., de Boer, A., & Klungel, O. H. (2011). Measuring balance and model selection in propensity score methods. Pharmacoepidemiology and Drug Safety, 20(11), 1115–1129. doi:10.1002/pds.2188

Franklin, J. M., Rassen, J. A., Ackermann, D., Bartels, D. B., & Schneeweiss, S. (2014). Metrics for covariate balance in cohort studies of causal effects. Statistics in Medicine, 33(10), 1685–1699. doi:10.1002/sim.6058

Griffin, B. A., McCaffrey, D. F., Almirall, D., Burgette, L. F., & Setodji, C. M. (2017). Chasing Balance and Other Recommendations for Improving Nonparametric Propensity Score Models. Journal of Causal Inference, 5(2). doi:10.1515/jci-2015-0026

Huling, J. D., & Mak, S. (2020). Energy Balancing of Covariate Distributions. ArXiv:2004.13962 [Stat]. https://arxiv.org/abs/2004.13962

## See Also

method_gbm and method_super, which use these balance criteria.

balance summary in cobalt, for details of some calculations.