Algorithms Reference

This page provides detailed documentation for all algorithms available in CausalBoundingEngine.

Note

All algorithms are implementations of published research. For complete citations, references, and proper attribution, see the References and Credits page. Please cite the original papers when using these algorithms in your research.

Algorithm Overview

Algorithm Comparison

Algorithm

ATE

PNS

Scenarios

Dependencies

Notes

Manski

BinaryConf

Core

Most conservative

TianPearl

BinaryConf

Core

PNS only

Autobound

BinaryConf, BinaryIV

Core

Optimization-based

EntropyBounds

BinaryConf

Core

Requires theta parameter

Causaloptim

BinaryConf, BinaryIV

R

Symbolic derivation

Zaffalonbounds

BinaryConf, BinaryIV

Java

Credal networks

ZhangBareinboim

ContIV

Core

Continuous IV

Core Algorithms

Manski Bounds

Reference: Manski, C. F. (1990). Nonparametric Bounds on Treatment Effects. The American Economic Review, 80(2), 319-323. See References and Credits for complete citation.

Description: Provides the most conservative bounds on the ATE under no additional assumptions beyond the observed data. These bounds are derived by considering the worst-case scenarios for the unobserved potential outcomes.

Mathematical Foundation:

For binary outcomes, the ATE bounds are:

\[ \begin{align}\begin{aligned}ATE_{lower} = P(Y=1 \mid X=1) \cdot P(X=1) - P(Y=1 \mid X=0) \cdot P(X=0) - P(X=1)\\ATE_{upper} = P(Y=1 \mid X=1) \cdot P(X=1) + P(X=0) - P(Y=1 \mid X=0) \cdot P(X=0)\end{aligned}\end{align} \]

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

bounds = scenario.ATE.manski()
print(f"Manski bounds: {bounds}")
Properties:
  • No additional assumptions required

  • Always provides valid bounds

  • Most conservative (widest intervals)

  • Fast computation

  • Only available for ATE

When to use:
  • As a baseline for comparison

  • When no additional assumptions can be justified

  • For quick exploratory analysis

Tian-Pearl Bounds

Reference: Tian, J., & Pearl, J. (2000). Probabilities of Causation: Bounds and Identification. Annals of Mathematics and Artificial Intelligence, 28(1-4), 287-313. See References and Credits for complete citation.

Description: Nonparametric bounds that use the joint distribution of treatment and outcome to derive bounds for PNS (Probability of Necessity and Sufficiency).

Mathematical Foundation:

For PNS:

\[ \begin{align}\begin{aligned}PNS_{lower} = 0\\PNS_{upper} = P(Y=1, X=1) + P(Y=0, X=0)\end{aligned}\end{align} \]

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

pns_bounds = scenario.PNS.tianpearl()
print(f"Tian-Pearl PNS: {pns_bounds}")
Properties:
  • Only available for PNS

  • Fast computation

  • No additional parameters

  • Nonparametric approach

When to use:
  • For PNS estimation

  • When you need nonparametric PNS bounds

Autobound

Reference: Duarte, G., Finkelstein, N., Knox, D., Mummolo, J., & Shpitser, I. (2023). An Automated Approach to Causal Inference in Discrete Settings. Journal of the American Statistical Association, 1-12. See References and Credits for complete citation.

Description: A general-purpose algorithm that formulates causal bounding as a linear programming problem. Can handle complex causal graphs and both confounded and IV settings.

Mathematical Foundation:

Autobound represents the causal problem using:
  • Decision variables for each potential outcome type

  • Constraints matching observed distributions

  • Linear programming optimization

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)
bounds = scenario.ATE.autobound()

# IV setting
Z = np.array([0, 1, 1, 0, 1])
scenario_iv = BinaryIV(X, Y, Z)
bounds_iv = scenario_iv.ATE.autobound()
Properties:
  • Works with both confounded and IV settings

  • Available for both ATE and PNS

  • Principled optimization approach

  • Moderate computation time

When to use:
  • When you need a general-purpose algorithm

  • For IV settings where other algorithms aren’t available

  • When you want theoretically grounded bounds

EntropyBounds

Reference: Jiang, Z., Wei, L., & Kocaoglu, M. (2023). Approximate Causal Effect Identification under Weak Confounding. Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15125-15143. See References and Credits for complete citation.

Description: Uses mutual information constraints to bound causal effects under the assumption of “weak confounding” - The confounder entropy is lower than some θ.

Mathematical Foundation:

The algorithm constrains the mutual information between potential outcomes and treatment:

\[H(U) \leq \theta\]

where θ is a user-specified parameter controlling the strength of confounding (i.e. it’s entropy).

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

# Different theta values give different bounds
strict_bounds = scenario.ATE.entropybounds(theta=0.1)  # Strong assumption
loose_bounds = scenario.ATE.entropybounds(theta=0.9)   # Weak assumption

print(f"Strict bounds (θ=0.1): {strict_bounds}")
print(f"Loose bounds (θ=0.9): {loose_bounds}")
Parameters:
  • theta (float): Information constraint level in [0, 1]. Lower values give tighter bounds but require stronger assumptions.

Properties:
  • Requires theta parameter (no default)

  • Available for both ATE and PNS

  • Uses convex optimization

  • Sensitive to theta choice

When to use:
  • When you can justify weak confounding assumptions

  • For sensitivity analysis across different theta values

  • When domain knowledge suggests limited confounding

External Engine Algorithms

Causaloptim

Dependencies: R, rpy2, causaloptim R package

Reference: Sachs, M. C., Sjölander, A., & Gabriel, E. E. (2022). A General Method for Deriving Tight Symbolic Bounds on Causal Effects. Journal of Computational and Graphical Statistics, 31(2), 496-510. See References and Credits for complete citation.

Description: Uses symbolic computation to derive analytic bounds on causal effects. Integrates with the R package causaloptim for graph specification and optimization.

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

try:
    bounds = scenario.ATE.causaloptim()
    print(f"Causaloptim bounds: {bounds}")
except ImportError:
    print("R support not available")

# IV setting
Z = np.array([0, 1, 1, 0, 1])
scenario_iv = BinaryIV(X, Y, Z)
bounds_iv = scenario_iv.ATE.causaloptim()
Parameters:
  • r_path (str, optional): Custom path to R executable

Properties:
  • Symbolic derivation of bounds

  • Works with both confounded and IV settings

  • Available for both ATE and PNS

  • Requires R installation

Installation:

# Install R support
pip install causalboundingengine[r]
When to use:
  • When you want symbolically derived bounds

  • For complex causal graphs

  • When R environment is available

Zaffalonbounds

Dependencies: Java, jpype1, CREMA/CREDICI libraries

Reference: Zaffalon, M., Antonucci, A., Cabañas, R., Huber, D., & Azzimonti, D. (2022). Bounding Counterfactuals under Selection Bias. Proceedings of The 11th International Conference on Probabilistic Graphical Models, 289-300. Uses CREMA and CREDICI libraries. See References and Credits for complete citation.

Description: Uses credal networks and EM-based learning to compute bounds. Based on the CREMA and CREDICI Java libraries developed at IDSIA.

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

try:
    bounds = scenario.ATE.zaffalonbounds()
    print(f"Zaffalonbounds: {bounds}")
except ImportError:
    print("Java support not available")
Properties:
  • Uses credal network inference

  • EM-based parameter learning

  • Works with both confounded and IV settings

  • Available for both ATE and PNS

  • Requires Java installation

Installation:

# Install Java support
pip install causalboundingengine[java]
When to use:
  • When you want Bayesian-style bounds

  • For complex probabilistic reasoning

  • When Java environment is available

Specialized Algorithms

ZhangBareinboim

Reference: Zhang, J., & Bareinboim, E. (2021). Bounding Causal Effects on Continuous Outcome. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 12207-12215. See References and Credits for complete citation.

Description: Designed specifically for continuous instrumental variable settings. Uses linear programming to handle compliance types in IV analysis.

Usage:

from causalboundingengine.scenarios import ContIV
import numpy as np

# Continuous data (will be discretized internally)
Z = np.random.normal(0, 1, 100)  # Instrument
X = Z + np.random.normal(0, 0.5, 100)  # Treatment
Y = X + np.random.normal(0, 0.5, 100)  # Outcome

scenario = ContIV(X, Y, Z)
bounds = scenario.ATE.zhangbareinboim()
Properties:
  • Specifically for continuous IV settings

  • Handles compliance types automatically

  • Only available for ATE

  • Uses linear programming

When to use:
  • With continuous instrumental variables

  • When compliance patterns are complex

  • For rigorous IV analysis

Algorithm Implementation Details

Error Handling

All algorithms implement consistent error handling:

import logging
logging.basicConfig(level=logging.WARNING)

# Failed algorithms return trivial bounds
scenario = BinaryConf(X, Y)
bounds = scenario.ATE.some_algorithm()

# Check for trivial bounds
if bounds == (-1.0, 1.0):  # ATE trivial bounds
    print("Algorithm failed, returned trivial bounds")

if bounds == (0.0, 1.0):   # PNS trivial bounds
    print("Algorithm failed, returned trivial bounds")

Performance Characteristics

Typical Performance

Algorithm

Speed

Notes

Manski

Very Fast

Simple calculations

TianPearl

Very Fast

Simple calculations (PNS only)

Autobound

Moderate

Linear programming

EntropyBounds

Moderate

Convex optimization

Causaloptim

Slow

R interface overhead

Zaffalonbounds

Very Slow

Java interface + EM algorithm

ZhangBareinboim

Moderate

Linear programming

Memory Usage

Most algorithms have modest memory requirements, but some considerations:

  • Zaffalonbounds: May need increased JVM heap size for large datasets

  • Autobound: Linear programming may use significant memory

  • EntropyBounds: Convex optimization scales with data size

# For large datasets with Java algorithms
import jpype
jpype.startJVM("-Xmx4g")  # 4GB heap size

Choosing the Right Algorithm

Decision Tree

  1. What type of data do you have?

    • Binary treatment/outcome → Continue to step 2

    • Continuous variables → Use ZhangBareinboim (if IV available)

  2. Do you have an instrument?

    • Yes → Use Autobound, Causaloptim, or Zaffalonbounds

    • No → Continue to step 3

  3. What are your computational constraints?

    • Need fast results → Use Manski (ATE) or TianPearl (PNS)

    • Have more time → Consider Autobound, Causaloptim, or Zaffalonbounds

  4. What (further) assumptions can you make?

    • Weak confounding → Use EntropyBounds with appropriate theta

  5. What external dependencies do you have?

    • Core Python only → Use Manski, Autobound, or EntropyBounds (ATE); TianPearl (PNS)

    • R available → Consider Causaloptim

    • Java available → Consider Zaffalonbounds

Robustness Strategy

For important analyses, consider using multiple algorithms:

def robust_analysis(X, Y, Z=None):
    \"\"\"Run multiple algorithms for robustness.\"\"\"
    if Z is None:
        scenario = BinaryConf(X, Y)
        algorithms = ['manski', 'autobound']
    else:
        scenario = BinaryIV(X, Y, Z)
        algorithms = ['autobound', 'causaloptim', 'zaffalonbounds']

    results = {}
    for alg in algorithms:
        try:
            results[alg] = getattr(scenario.ATE, alg)()
        except Exception as e:
            print(f"Failed {alg}: {e}")

    return results

# Compare results
bounds_dict = robust_analysis(X, Y)
for alg, bounds in bounds_dict.items():
    print(f"{alg}: {bounds}")
This approach helps identify:
  • Consensus across methods

  • Algorithms that may be failing

  • Sensitivity to different assumptions