Algorithms Reference

This page provides detailed documentation for all algorithms available in CausalBoundingEngine.

Note

All algorithms are implementations of published research. For complete citations, references, and proper attribution, see the References and Credits page. Please cite the original papers when using these algorithms in your research.

Algorithm Overview

Algorithm Comparison
Algorithm	ATE	PNS	Scenarios	Dependencies	Notes
Manski	✓	✗	BinaryConf	Core	Most conservative
TianPearl	✗	✓	BinaryConf	Core	PNS only
Autobound	✓	✓	BinaryConf, BinaryIV	Core	Optimization-based
EntropyBounds	✓	✓	BinaryConf	Core	Requires theta parameter
Causaloptim	✓	✓	BinaryConf, BinaryIV	R	Symbolic derivation
Zaffalonbounds	✓	✓	BinaryConf, BinaryIV	Java	Credal networks
ZhangBareinboim	✓	✗	ContIV	Core	Continuous IV

Core Algorithms

Manski Bounds

Reference: Manski, C. F. (1990). Nonparametric Bounds on Treatment Effects. The American Economic Review, 80(2), 319-323. See References and Credits for complete citation.

Description: Provides the most conservative bounds on the ATE under no additional assumptions beyond the observed data. These bounds are derived by considering the worst-case scenarios for the unobserved potential outcomes.

Mathematical Foundation:

For binary outcomes, the ATE bounds are:

\[ \begin{align}\begin{aligned}ATE_{lower} = P(Y=1 \mid X=1) \cdot P(X=1) - P(Y=1 \mid X=0) \cdot P(X=0) - P(X=1)\\ATE_{upper} = P(Y=1 \mid X=1) \cdot P(X=1) + P(X=0) - P(Y=1 \mid X=0) \cdot P(X=0)\end{aligned}\end{align} \]

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

bounds = scenario.ATE.manski()
print(f"Manski bounds: {bounds}")

Properties:

No additional assumptions required
Always provides valid bounds
Most conservative (widest intervals)
Fast computation
Only available for ATE

When to use:

As a baseline for comparison
When no additional assumptions can be justified
For quick exploratory analysis

Tian-Pearl Bounds

Reference: Tian, J., & Pearl, J. (2000). Probabilities of Causation: Bounds and Identification. Annals of Mathematics and Artificial Intelligence, 28(1-4), 287-313. See References and Credits for complete citation.

Description: Nonparametric bounds that use the joint distribution of treatment and outcome to derive bounds for PNS (Probability of Necessity and Sufficiency).

Mathematical Foundation:

For PNS:

\[ \begin{align}\begin{aligned}PNS_{lower} = 0\\PNS_{upper} = P(Y=1, X=1) + P(Y=0, X=0)\end{aligned}\end{align} \]

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

pns_bounds = scenario.PNS.tianpearl()
print(f"Tian-Pearl PNS: {pns_bounds}")

Properties:

Only available for PNS
Fast computation
No additional parameters
Nonparametric approach

When to use:

For PNS estimation
When you need nonparametric PNS bounds

Autobound

Reference: Duarte, G., Finkelstein, N., Knox, D., Mummolo, J., & Shpitser, I. (2023). An Automated Approach to Causal Inference in Discrete Settings. Journal of the American Statistical Association, 1-12. See References and Credits for complete citation.

Description: A general-purpose algorithm that formulates causal bounding as a linear programming problem. Can handle complex causal graphs and both confounded and IV settings.

Mathematical Foundation:

Autobound represents the causal problem using:

Decision variables for each potential outcome type
Constraints matching observed distributions
Linear programming optimization

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)
bounds = scenario.ATE.autobound()

# IV setting
Z = np.array([0, 1, 1, 0, 1])
scenario_iv = BinaryIV(X, Y, Z)
bounds_iv = scenario_iv.ATE.autobound()

Properties:

Works with both confounded and IV settings
Available for both ATE and PNS
Principled optimization approach
Moderate computation time

When to use:

When you need a general-purpose algorithm
For IV settings where other algorithms aren’t available
When you want theoretically grounded bounds

EntropyBounds

Reference: Jiang, Z., Wei, L., & Kocaoglu, M. (2023). Approximate Causal Effect Identification under Weak Confounding. Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15125-15143. See References and Credits for complete citation.

Description: Uses mutual information constraints to bound causal effects under the assumption of “weak confounding” - The confounder entropy is lower than some θ.

Mathematical Foundation:

The algorithm constrains the mutual information between potential outcomes and treatment:

\[H(U) \leq \theta\]

where θ is a user-specified parameter controlling the strength of confounding (i.e. it’s entropy).

Usage:

from causalboundingengine.scenarios import BinaryConf
import numpy as np

X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

# Different theta values give different bounds
strict_bounds = scenario.ATE.entropybounds(theta=0.1)  # Strong assumption
loose_bounds = scenario.ATE.entropybounds(theta=0.9)   # Weak assumption

print(f"Strict bounds (θ=0.1): {strict_bounds}")
print(f"Loose bounds (θ=0.9): {loose_bounds}")

Parameters:

theta (float): Information constraint level in [0, 1]. Lower values give tighter bounds but require stronger assumptions.

Properties:

Requires theta parameter (no default)
Available for both ATE and PNS
Uses convex optimization
Sensitive to theta choice

When to use:

When you can justify weak confounding assumptions
For sensitivity analysis across different theta values
When domain knowledge suggests limited confounding

External Engine Algorithms

Causaloptim

Dependencies: R, rpy2, causaloptim R package

Reference: Sachs, M. C., Sjölander, A., & Gabriel, E. E. (2022). A General Method for Deriving Tight Symbolic Bounds on Causal Effects. Journal of Computational and Graphical Statistics, 31(2), 496-510. See References and Credits for complete citation.

Description: Uses symbolic computation to derive analytic bounds on causal effects. Integrates with the R package causaloptim for graph specification and optimization.

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

try:
    bounds = scenario.ATE.causaloptim()
    print(f"Causaloptim bounds: {bounds}")
except ImportError:
    print("R support not available")

# IV setting
Z = np.array([0, 1, 1, 0, 1])
scenario_iv = BinaryIV(X, Y, Z)
bounds_iv = scenario_iv.ATE.causaloptim()

Parameters:

r_path (str, optional): Custom path to R executable

Properties:

Symbolic derivation of bounds
Works with both confounded and IV settings
Available for both ATE and PNS
Requires R installation

Installation:

# Install R support
pip install causalboundingengine[r]

When to use:

When you want symbolically derived bounds
For complex causal graphs
When R environment is available

Zaffalonbounds

Dependencies: Java, jpype1, CREMA/CREDICI libraries

Reference: Zaffalon, M., Antonucci, A., Cabañas, R., Huber, D., & Azzimonti, D. (2022). Bounding Counterfactuals under Selection Bias. Proceedings of The 11th International Conference on Probabilistic Graphical Models, 289-300. Uses CREMA and CREDICI libraries. See References and Credits for complete citation.

Description: Uses credal networks and EM-based learning to compute bounds. Based on the CREMA and CREDICI Java libraries developed at IDSIA.

Usage:

from causalboundingengine.scenarios import BinaryConf, BinaryIV
import numpy as np

# Confounded setting
X = np.array([0, 1, 1, 0, 1])
Y = np.array([1, 0, 1, 0, 1])
scenario = BinaryConf(X, Y)

try:
    bounds = scenario.ATE.zaffalonbounds()
    print(f"Zaffalonbounds: {bounds}")
except ImportError:
    print("Java support not available")

Properties:

Uses credal network inference
EM-based parameter learning
Works with both confounded and IV settings
Available for both ATE and PNS
Requires Java installation

Installation:

# Install Java support
pip install causalboundingengine[java]

When to use:

When you want Bayesian-style bounds
For complex probabilistic reasoning
When Java environment is available

Specialized Algorithms

ZhangBareinboim

Reference: Zhang, J., & Bareinboim, E. (2021). Bounding Causal Effects on Continuous Outcome. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 12207-12215. See References and Credits for complete citation.

Description: Designed specifically for continuous instrumental variable settings. Uses linear programming to handle compliance types in IV analysis.

Usage:

from causalboundingengine.scenarios import ContIV
import numpy as np

# Continuous data (will be discretized internally)
Z = np.random.normal(0, 1, 100)  # Instrument
X = Z + np.random.normal(0, 0.5, 100)  # Treatment
Y = X + np.random.normal(0, 0.5, 100)  # Outcome

scenario = ContIV(X, Y, Z)
bounds = scenario.ATE.zhangbareinboim()

Properties:

Specifically for continuous IV settings
Handles compliance types automatically
Only available for ATE
Uses linear programming

When to use:

With continuous instrumental variables
When compliance patterns are complex
For rigorous IV analysis

Algorithm Implementation Details

Error Handling

All algorithms implement consistent error handling:

import logging
logging.basicConfig(level=logging.WARNING)

# Failed algorithms return trivial bounds
scenario = BinaryConf(X, Y)
bounds = scenario.ATE.some_algorithm()

# Check for trivial bounds
if bounds == (-1.0, 1.0):  # ATE trivial bounds
    print("Algorithm failed, returned trivial bounds")

if bounds == (0.0, 1.0):   # PNS trivial bounds
    print("Algorithm failed, returned trivial bounds")

Performance Characteristics

Typical Performance
Algorithm	Speed	Notes
Manski	Very Fast	Simple calculations
TianPearl	Very Fast	Simple calculations (PNS only)
Autobound	Moderate	Linear programming
EntropyBounds	Moderate	Convex optimization
Causaloptim	Slow	R interface overhead
Zaffalonbounds	Very Slow	Java interface + EM algorithm
ZhangBareinboim	Moderate	Linear programming

Memory Usage

Most algorithms have modest memory requirements, but some considerations:

Zaffalonbounds: May need increased JVM heap size for large datasets
Autobound: Linear programming may use significant memory
EntropyBounds: Convex optimization scales with data size

# For large datasets with Java algorithms
import jpype
jpype.startJVM("-Xmx4g")  # 4GB heap size

Choosing the Right Algorithm

Decision Tree

What type of data do you have?
- Binary treatment/outcome → Continue to step 2
- Continuous variables → Use ZhangBareinboim (if IV available)
Do you have an instrument?
- Yes → Use Autobound, Causaloptim, or Zaffalonbounds
- No → Continue to step 3
What are your computational constraints?
- Need fast results → Use Manski (ATE) or TianPearl (PNS)
- Have more time → Consider Autobound, Causaloptim, or Zaffalonbounds
What (further) assumptions can you make?
- Weak confounding → Use EntropyBounds with appropriate theta
What external dependencies do you have?
- Core Python only → Use Manski, Autobound, or EntropyBounds (ATE); TianPearl (PNS)
- R available → Consider Causaloptim
- Java available → Consider Zaffalonbounds

Robustness Strategy

For important analyses, consider using multiple algorithms:

def robust_analysis(X, Y, Z=None):
    \"\"\"Run multiple algorithms for robustness.\"\"\"
    if Z is None:
        scenario = BinaryConf(X, Y)
        algorithms = ['manski', 'autobound']
    else:
        scenario = BinaryIV(X, Y, Z)
        algorithms = ['autobound', 'causaloptim', 'zaffalonbounds']

    results = {}
    for alg in algorithms:
        try:
            results[alg] = getattr(scenario.ATE, alg)()
        except Exception as e:
            print(f"Failed {alg}: {e}")

    return results

# Compare results
bounds_dict = robust_analysis(X, Y)
for alg, bounds in bounds_dict.items():
    print(f"{alg}: {bounds}")

This approach helps identify:

Consensus across methods
Algorithms that may be failing
Sensitivity to different assumptions