Conclusion

Co-π-tree shows that LLM reasoning can be distilled into executable and interpretable policy trees for human-AI collaboration. The learned policy tree executes directly at test time, reduces online LLM dependence, and supports local branch-level inspection and refinement.

Interpretability

Partner prediction and action selection remain explicit in the policy tree.

Efficiency

Once learned, the policy tree runs at test time without repeated LLM calls.
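As a rough illustration of what direct execution could look like, the sketch below walks a small hand-written policy tree with no model call in the loop. It is not the Co-π-tree implementation: the names `Leaf`, `Branch`, and `execute`, the observation keys, and the kitchen-style actions are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Observation = Dict[str, object]


@dataclass
class Leaf:
    # Hypothetical leaf: an explicit partner prediction plus the action
    # chosen conditioned on that prediction.
    predicted_partner_action: str
    action: str


@dataclass
class Branch:
    # Hypothetical internal node: a readable condition on the observation.
    description: str
    test: Callable[[Observation], bool]
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Branch]


def execute(tree: Node, obs: Observation) -> Tuple[Leaf, List[str]]:
    """Walk the tree for one observation; no LLM is queried here."""
    path: List[str] = []
    node = tree
    while isinstance(node, Branch):
        outcome = node.test(obs)
        # Record each decision so the branch taken can be inspected later.
        path.append(f"{node.description} -> {outcome}")
        node = node.if_true if outcome else node.if_false
    return node, path


# Illustrative tree (assumed task and action names, not from the paper).
tree = Branch(
    description="partner is holding an ingredient",
    test=lambda o: bool(o["partner_holding_ingredient"]),
    if_true=Leaf(predicted_partner_action="add_to_pot", action="fetch_plate"),
    if_false=Branch(
        description="pot is ready",
        test=lambda o: bool(o["pot_ready"]),
        if_true=Leaf(predicted_partner_action="wait", action="serve_dish"),
        if_false=Leaf(predicted_partner_action="wait", action="fetch_ingredient"),
    ),
)

leaf, path = execute(tree, {"partner_holding_ingredient": True, "pot_ready": False})
print(leaf.predicted_partner_action, leaf.action, path)
```

Because every decision is recorded in `path`, a single branch can be read and edited without touching the rest of the tree, which is the kind of local inspection and refinement described above.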

Collaboration

Partner-behavior prediction and partner-conditioned action selection improve coordination with AI and human partners.