Qualitative comparisons with a human partner across five MARL methods (SP, PBT, FCP, MEP, COLE) and three Co-π-tree variants. Layout names are written in full; the source files in the assets folder keep their original abbreviated names.
Compact coordination with dense interaction and fast handoffs between the human partner and the evaluated policy.
SP controlling player 0 (blue).
PBT controlling player 0 (blue).
FCP controlling player 0 (blue).
MEP controlling player 0 (blue).
COLE controlling player 0 (blue).
Co-π-tree controlling player 0 (blue).
Co-π-tree-PI controlling player 0 (blue).
Co-π-tree w/o P controlling player 0 (blue).
SP controlling player 1 (green).
PBT controlling player 1 (green).
FCP controlling player 1 (green).
MEP controlling player 1 (green).
COLE controlling player 1 (green).
Co-π-tree controlling player 1 (green).
Co-π-tree-PI controlling player 1 (green).
Co-π-tree w/o P controlling player 1 (green).
Long-horizon movement in the ring highlights how clearly each policy communicates routing intent to a human teammate.
SP controlling player 0 (blue).
PBT controlling player 0 (blue).
FCP controlling player 0 (blue).
MEP controlling player 0 (blue).
COLE controlling player 0 (blue).
Co-π-tree controlling player 0 (blue).
Co-π-tree-PI controlling player 0 (blue).
Co-π-tree w/o P controlling player 0 (blue).
SP controlling player 1 (green).
PBT controlling player 1 (green).
FCP controlling player 1 (green).
MEP controlling player 1 (green).
COLE controlling player 1 (green).
Co-π-tree controlling player 1 (green).
Co-π-tree-PI controlling player 1 (green).
Co-π-tree w/o P controlling player 1 (green).
Counter bottlenecks make this layout especially useful for comparing how well the human partner can read and trust each method's intent.
SP controlling player 0 (blue).
PBT controlling player 0 (blue).
FCP controlling player 0 (blue).
MEP controlling player 0 (blue).
COLE controlling player 0 (blue).
Co-π-tree controlling player 0 (blue).
Co-π-tree-PI controlling player 0 (blue).
Co-π-tree w/o P controlling player 0 (blue).
SP controlling player 1 (green).
PBT controlling player 1 (green).
FCP controlling player 1 (green).
MEP controlling player 1 (green).
COLE controlling player 1 (green).
Co-π-tree controlling player 1 (green).
Co-π-tree-PI controlling player 1 (green).
Co-π-tree w/o P controlling player 1 (green).
Specialized zones create strong role asymmetries, making it easy to inspect how each policy coordinates with a human under structural imbalance.
SP controlling player 0 (blue).
PBT controlling player 0 (blue).
FCP controlling player 0 (blue).
MEP controlling player 0 (blue).
COLE controlling player 0 (blue).
Co-π-tree controlling player 0 (blue).
Co-π-tree-PI controlling player 0 (blue).
Co-π-tree w/o P controlling player 0 (blue).
SP controlling player 1 (green).
PBT controlling player 1 (green).
FCP controlling player 1 (green).
MEP controlling player 1 (green).
COLE controlling player 1 (green).
Co-π-tree controlling player 1 (green).
Co-π-tree-PI controlling player 1 (green).
Co-π-tree w/o P controlling player 1 (green).
Separated workspaces emphasize complementary timing and reveal whether a human can easily anticipate each method's next move.
SP controlling player 0 (blue).
PBT controlling player 0 (blue).
FCP controlling player 0 (blue).
MEP controlling player 0 (blue).
COLE controlling player 0 (blue).
Co-π-tree controlling player 0 (blue).
Co-π-tree-PI controlling player 0 (blue).
Co-π-tree w/o P controlling player 0 (blue).
SP controlling player 1 (green).
PBT controlling player 1 (green).
FCP controlling player 1 (green).
MEP controlling player 1 (green).
COLE controlling player 1 (green).
Co-π-tree controlling player 1 (green).
Co-π-tree-PI controlling player 1 (green).
Co-π-tree w/o P controlling player 1 (green).