Video: ZSC with human partners

Qualitative human-partner comparisons across five MARL methods and three Co-π-tree variants. Layout names are written in full here, while the source files in the assets folder keep their original abbreviated names.

5 layouts, 8 method slots per role, player 0 (blue) / player 1 (green)
Each panel presents replay footage for qualitative comparison with a human partner. The page reserves a fixed slot for every method-role combination, so the comparison grid stays aligned across layouts.
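The fixed-slot grid can be sketched as a simple enumeration. The snippet below is only an illustration of the structure described above, not the page's actual implementation: the names `LAYOUTS`, `ROLES`, `METHODS`, and `build_grid` are hypothetical, and no video filenames are generated because the abbreviated asset names are not listed here.

```python
from itertools import product

# Values copied from this page; the variable names are illustrative assumptions.
LAYOUTS = [
    "Cramped Room",
    "Coordination Ring",
    "Counter Circuit",
    "Asymmetric Advantages",
    "Forced Coordination",
]
ROLES = [(0, "blue"), (1, "green")]
METHODS = [
    "SP", "PBT", "FCP", "MEP", "COLE",
    "Co-π-tree", "Co-π-tree-PI", "Co-π-tree w/o P",
]


def build_grid():
    """Return one slot per (layout, role, method), in page order."""
    return [
        {
            "layout": layout,
            "player": index,
            "color": color,
            "method": method,
            "caption": f"{method} controlling player {index} ({color}).",
        }
        for layout, (index, color), method in product(LAYOUTS, ROLES, METHODS)
    ]


grid = build_grid()
slots_per_layout = len(ROLES) * len(METHODS)
print(len(grid), slots_per_layout)  # 80 slots in total, 16 per layout
```

Because every layout iterates over the same role and method lists in the same order, a given method always occupies the same position in each layout's grid, which is what keeps the panels aligned for side-by-side viewing.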

Cramped Room

Compact coordination with dense interaction and fast handoffs between the human partner and the evaluated policy.

16 slots

Player 0 (blue)

SP

SP controlling player 0 (blue).

PBT

PBT controlling player 0 (blue).

FCP

FCP controlling player 0 (blue).

MEP

MEP controlling player 0 (blue).

COLE

COLE controlling player 0 (blue).

Co-π-tree

Co-π-tree controlling player 0 (blue).

Co-π-tree-PI

Co-π-tree-PI controlling player 0 (blue).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 0 (blue).

Player 1 (green)

SP

SP controlling player 1 (green).

PBT

PBT controlling player 1 (green).

FCP

FCP controlling player 1 (green).

MEP

MEP controlling player 1 (green).

COLE

COLE controlling player 1 (green).

Co-π-tree

Co-π-tree controlling player 1 (green).

Co-π-tree-PI

Co-π-tree-PI controlling player 1 (green).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 1 (green).

Coordination Ring

Long-horizon movement in the ring highlights how clearly each policy communicates routing intent to a human teammate.

16 slots

Player 0 (blue)

SP

SP controlling player 0 (blue).

PBT

PBT controlling player 0 (blue).

FCP

FCP controlling player 0 (blue).

MEP

MEP controlling player 0 (blue).

COLE

COLE controlling player 0 (blue).

Co-π-tree

Co-π-tree controlling player 0 (blue).

Co-π-tree-PI

Co-π-tree-PI controlling player 0 (blue).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 0 (blue).

Player 1 (green)

SP

SP controlling player 1 (green).

PBT

PBT controlling player 1 (green).

FCP

FCP controlling player 1 (green).

MEP

MEP controlling player 1 (green).

COLE

COLE controlling player 1 (green).

Co-π-tree

Co-π-tree controlling player 1 (green).

Co-π-tree-PI

Co-π-tree-PI controlling player 1 (green).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 1 (green).

Counter Circuit

Counter bottlenecks make this layout especially useful for comparing whether the human partner can read and trust each method's intent.

16 slots

Player 0 (blue)

SP

SP controlling player 0 (blue).

PBT

PBT controlling player 0 (blue).

FCP

FCP controlling player 0 (blue).

MEP

MEP controlling player 0 (blue).

COLE

COLE controlling player 0 (blue).

Co-π-tree

Co-π-tree controlling player 0 (blue).

Co-π-tree-PI

Co-π-tree-PI controlling player 0 (blue).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 0 (blue).

Player 1 (green)

SP

SP controlling player 1 (green).

PBT

PBT controlling player 1 (green).

FCP

FCP controlling player 1 (green).

MEP

MEP controlling player 1 (green).

COLE

COLE controlling player 1 (green).

Co-π-tree

Co-π-tree controlling player 1 (green).

Co-π-tree-PI

Co-π-tree-PI controlling player 1 (green).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 1 (green).

Asymmetric Advantages

Specialized zones create strong role asymmetries, making it easy to inspect how each policy coordinates with a human under structural imbalance.

16 slots

Player 0 (blue)

SP

SP controlling player 0 (blue).

PBT

PBT controlling player 0 (blue).

FCP

FCP controlling player 0 (blue).

MEP

MEP controlling player 0 (blue).

COLE

COLE controlling player 0 (blue).

Co-π-tree

Co-π-tree controlling player 0 (blue).

Co-π-tree-PI

Co-π-tree-PI controlling player 0 (blue).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 0 (blue).

Player 1 (green)

SP

SP controlling player 1 (green).

PBT

PBT controlling player 1 (green).

FCP

FCP controlling player 1 (green).

MEP

MEP controlling player 1 (green).

COLE

COLE controlling player 1 (green).

Co-π-tree

Co-π-tree controlling player 1 (green).

Co-π-tree-PI

Co-π-tree-PI controlling player 1 (green).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 1 (green).

Forced Coordination

Separated workspaces emphasize complementary timing and reveal whether a human can easily anticipate each method's next move.

16 slots

Player 0 (blue)

SP

SP controlling player 0 (blue).

PBT

PBT controlling player 0 (blue).

FCP

FCP controlling player 0 (blue).

MEP

MEP controlling player 0 (blue).

COLE

COLE controlling player 0 (blue).

Co-π-tree

Co-π-tree controlling player 0 (blue).

Co-π-tree-PI

Co-π-tree-PI controlling player 0 (blue).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 0 (blue).

Player 1 (green)

SP

SP controlling player 1 (green).

PBT

PBT controlling player 1 (green).

FCP

FCP controlling player 1 (green).

MEP

MEP controlling player 1 (green).

COLE

COLE controlling player 1 (green).

Co-π-tree

Co-π-tree controlling player 1 (green).

Co-π-tree-PI

Co-π-tree-PI controlling player 1 (green).

Co-π-tree w/o P

Co-π-tree w/o P controlling player 1 (green).