Spatial-temporal multiscale prediction
Predict not only which genes are expressed, but where in the embryo they are expressed. Models must jointly produce expression, cell-type composition, and 3D spatial organization at held-out stages, evaluated against a 4D (3D space plus developmental time) MERFISH atlas. This is the multiscale core of the challenge: getting global expression right while putting cells in the right physical place.
3D MERFISH atlas: serial coronal sections decoded into per-cell 3D coordinates plus measured RNA across the same E6.75 → E12.5 developmental window as Task 1. Each cell carries (x, y, z), expression vector, anatomical region, and cell-type label. This is a separate atlas from the single-cell Multiome used in Task 1.
What the model gets, what it must predict
Full 3D MERFISH atlas (coordinates + RNA + annotations) for all stages other than the four held-out sub-splits.
Future-extrapolation validation. Models trained on earlier stages must predict the held-out E10.5 spatial snapshot.
Future-extrapolation final test. Ground truth withheld until competition close.
Intermediate-stage interpolation validation — models must recover a stage between two observed stages.
Intermediate-stage interpolation final test. Tests whether models capture the continuous trajectory rather than memorising observed stages.
Input: public MERFISH stages with full 3D coordinates, expression, and annotations. The model is asked to emit a full 4D snapshot (per-cell 3D positions + expression + cell-type labels) at each held-out stage.
Output: a 4D prediction consisting of per-cell (x, y, z) positions, a predicted expression vector, and a predicted cell-type label, with the cell count matched to the released schema for the target stage.
Metrics & formulas
Final ranking is a composite over these sub-scores; sub-scores are also reported individually so participants can diagnose where their model is strong and weak. Hidden labels are never released; all evaluation runs on the organisers' platform.
Fused Gromov-Wasserstein (FGW) distance. T is the coupling matrix between the predicted and observed cell sets, constrained to the valid transport plans Π(P, Q). α ∈ [0, 1] trades off feature similarity (expression) against geometric structure (cell-cell distances). FGW penalises predictions that get the marginals right but the geometry wrong.
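To make the two terms concrete, here is a minimal sketch that evaluates the FGW cost of one fixed coupling T. It assumes the common convention (1 − α) · feature term + α · structure term with a squared loss on intra-set distances; the organisers' exact loss, normalisation, and which side α weights are not stated here. The helper names `pairwise_dist` and `fgw_objective` are hypothetical. The real metric minimises over T, so a fixed T only gives an upper bound.

```python
import numpy as np


def pairwise_dist(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def fgw_objective(T, feature_cost, C_pred, C_obs, alpha=0.5):
    """FGW cost of a fixed coupling T (no optimisation over T).

    Assumed convention: (1 - alpha) * feature term + alpha * structure term,
    where the structure term is
        sum_{i,j,k,l} (C_pred[i,k] - C_obs[j,l])**2 * T[i,j] * T[k,l].
    """
    feature_term = np.sum(feature_cost * T)
    p = T.sum(axis=1)  # marginal over predicted cells
    q = T.sum(axis=0)  # marginal over observed cells
    # Expanded square, which avoids building the 4-index tensor.
    structure_term = (
        p @ (C_pred ** 2) @ p
        + q @ (C_obs ** 2) @ q
        - 2.0 * np.sum((C_pred @ T @ C_obs.T) * T)
    )
    return (1.0 - alpha) * feature_term + alpha * structure_term


rng = np.random.default_rng(0)
n = 6
expr = rng.random((n, 4))        # toy expression, identical in both sets
coords_obs = rng.random((n, 3))  # observed positions
coords_bad = rng.random((n, 3))  # same cells, wrong geometry

T = np.eye(n) / n  # identity coupling: cell i is matched to cell i
feature_cost = np.sum((expr[:, None, :] - expr[None, :, :]) ** 2, axis=-1)

perfect = fgw_objective(
    T, feature_cost, pairwise_dist(coords_obs), pairwise_dist(coords_obs)
)
wrong_geometry = fgw_objective(
    T, feature_cost, pairwise_dist(coords_bad), pairwise_dist(coords_obs)
)
```

In this toy case the expression term is zero for both predictions, so any cost in `wrong_geometry` comes from the structure term alone.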
Pseudobulk expression metrics: same definitions as Task 1, but reported both globally and per anatomical region. Per-region pseudobulk reveals when a model is right on average but wrong about where the right cells are.
Cell-type composition JSD (Jensen-Shannon divergence): computed with the base-2 logarithm so that JSD ∈ [0, 1]. Evaluated at global, regional, and per-condition levels using a frozen cell-type probe classifier locked before evaluation.
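The base-2 bound can be checked with a short sketch. It assumes the inputs are cell-type composition vectors (counts or proportions per cell type); how the probe classifier's outputs are aggregated into such vectors is not specified here, and `jsd_base2` is a hypothetical helper name.

```python
import numpy as np


def jsd_base2(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the result is in [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalise counts to proportions
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute 0 by convention
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


same = jsd_base2([10, 30, 60], [1, 3, 6])    # identical proportions
disjoint = jsd_base2([1, 0], [0, 1])         # no shared cell types
partial = jsd_base2([0.5, 0.5], [0.9, 0.1])  # somewhere in between
```

Identical compositions give 0 and fully disjoint compositions give 1, which is the range the base-2 logarithm guarantees.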
Neighbourhood distribution distance: MMD / energy / sliced-Wasserstein distances computed on the k-nearest-neighbour expression distributions around coordinate-matched cells. This localises spatial accuracy: a prediction can match global marginals but miss the local cell-cell co-occurrence pattern that defines tissue identity.
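A toy illustration of why a neighbourhood metric catches what global marginals miss. This sketch uses a biased RBF-kernel MMD² on the expression of the k nearest cells around a query point; the kernel, bandwidth, k, and the way query points are matched are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def knn_indices(coords, query, k):
    """Indices of the k cells closest to `query` in coordinate space."""
    d2 = np.sum((coords - query) ** 2, axis=1)
    return np.argsort(d2)[:k]


def rbf_mmd2(X, Y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets X and Y (RBF kernel)."""
    def kernel(A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))

    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()


def neighbourhood_mmd2(coords_pred, expr_pred, coords_obs, expr_obs, query, k=4):
    """MMD^2 between predicted and observed expression in the k-NN around `query`."""
    idx_pred = knn_indices(coords_pred, query, k)
    idx_obs = knn_indices(coords_obs, query, k)
    return rbf_mmd2(expr_pred[idx_pred], expr_obs[idx_obs])


# Ten cells on a line. Observed tissue has two clean domains.
coords = np.stack([np.arange(10.0), np.zeros(10), np.zeros(10)], axis=1)
expr_obs = np.array([[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5)
# Prediction has the same global composition but alternates cell by cell.
expr_pred = np.array([[1.0, 0.0], [0.0, 1.0]] * 5)

query = np.array([0.0, 0.0, 0.0])
local_gap = neighbourhood_mmd2(coords, expr_pred, coords, expr_obs, query)
self_gap = neighbourhood_mmd2(coords, expr_obs, coords, expr_obs, query)
```

Both expression matrices have the same mean profile, so a global pseudobulk comparison sees no difference, while `local_gap` is clearly above zero.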
File contract
Submissions are validated against this schema before scoring; mismatches result in a validation error rather than a low score. The starter kit ships a self-check script.
- Format: AnnData .h5ad with X = expression and .obsm["spatial"] = (n, 3) coords
- Cells: match the released cell count for the target stage (within tolerance)
- Genes: MERFISH gene panel, fixed order in starter kit
- Coordinates: µm, embryo-centred frame, axes documented in starter kit
- Obs columns: `cell_type` (required), `region` (recommended)
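Before the official self-check script is available, a rough local check along the lines of the contract above might look like the sketch below. It is not the starter-kit script: `check_submission` and `rel_tol` are hypothetical, the actual cell-count tolerance is not stated, and the gene names are placeholders. With an AnnData object the arguments would come from `adata.X`, `adata.obsm["spatial"]`, `adata.obs`, and `adata.var_names`.

```python
import numpy as np


def check_submission(X, spatial, obs, gene_names, expected_genes,
                     expected_n_cells, rel_tol=0.05):
    """Return a list of human-readable problems; an empty list means no issue found.

    `rel_tol` is a hypothetical relative tolerance on the cell count.
    """
    X = np.asarray(X)
    spatial = np.asarray(spatial)
    if X.ndim != 2:
        return ["X must be a 2D (cells x genes) matrix"]

    problems = []
    n_cells, _ = X.shape
    if spatial.shape != (n_cells, 3):
        problems.append(f"spatial must have shape ({n_cells}, 3), got {spatial.shape}")
    elif not np.all(np.isfinite(spatial)):
        problems.append("spatial contains NaN or infinite coordinates")
    if list(gene_names) != list(expected_genes):
        problems.append("gene names or gene order differ from the panel")
    if abs(n_cells - expected_n_cells) > rel_tol * expected_n_cells:
        problems.append(f"cell count {n_cells} is too far from {expected_n_cells}")
    if "cell_type" not in obs:
        problems.append("missing required obs column 'cell_type'")
    elif len(obs["cell_type"]) != n_cells:
        problems.append("'cell_type' length does not match the number of cells")
    return problems


panel = ["gene_a", "gene_b", "gene_c"]  # placeholder panel, not the real one
n = 100
good = check_submission(
    X=np.zeros((n, 3)),
    spatial=np.zeros((n, 3)),
    obs={"cell_type": ["unknown"] * n},
    gene_names=panel,
    expected_genes=panel,
    expected_n_cells=102,
)
bad = check_submission(
    X=np.zeros((n, 3)),
    spatial=np.zeros((n, 2)),  # only 2D coordinates
    obs={},                    # no cell_type column
    gene_names=panel,
    expected_genes=panel,
    expected_n_cells=102,
)
```

Returning every problem at once, rather than stopping at the first, makes it easier to fix a submission in a single pass.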
Starting points the starter kit will provide
- ▸ Stage-warp baseline: morph the nearest observed stage onto the target via affine + non-rigid registration.
- ▸ Spatial VAE conditioned on stage with separate spatial and expression decoders.
- ▸ Diffusion model over (position, expression) joint with developmental-time conditioning.
- ▸ Graph neural network over cell-cell neighbourhood with message-passing across pseudo-time.
- ▸ Spateo-style flow + reaction-diffusion priors on the spatial decoder.
- ! Joint prediction of expression and 3D structure is fundamentally multiscale: get the marginals right and you still fail FGW if the geometry is wrong.
- ! Intermediate-stage prediction (E7.5, E8.5) is interpolation in time but extrapolation in cell-type composition: new lineages have appeared by E8.5 that are not just averages of neighbours.
- ! MERFISH gene panels are smaller than full transcriptome, so models must reason carefully about which expression dimensions matter for tissue identity.
- ! The frozen probe classifier for JSD is unknown to participants, so predictions tuned to the wrong cell-type vocabulary will be re-labelled by the probe and scored unfavourably.
- × Treating spatial coordinates as a regression target with MSE: this scores reasonably on coordinate distance but tanks FGW because the relational structure is wrong.
- × Ignoring the embryo-centred coordinate convention; raw stage coordinates will mis-register against the evaluation frame.
- × Producing too few cells (or too many): the FGW coupling penalises mismatched marginals.
- × Memorising MERFISH gene-set means and emitting a flat prediction: this clears pseudobulk Pearson but fails neighbourhood MMD.
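As a rough illustration of the affine half of the stage-warp baseline listed above, the sketch below fits a least-squares affine map between matched 3D landmark pairs. Having such matched landmarks between stages is an assumption, the non-rigid step is omitted, and `fit_affine` and `apply_affine` are hypothetical names; the starter kit's own baseline may be built differently.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map (A, t) such that dst ≈ src @ A.T + t.

    `src` and `dst` are (n, 3) arrays of matched landmark coordinates.
    """
    # Homogeneous coordinates let one lstsq call solve for A and t together.
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])
    params, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    A = params[:3].T  # (3, 3) linear part
    t = params[3]     # (3,) translation
    return A, t


def apply_affine(points, A, t):
    """Warp (n, 3) points with the fitted affine map."""
    return points @ A.T + t


rng = np.random.default_rng(1)
A_true = np.array([[1.2, 0.1, 0.0],
                   [0.0, 0.9, 0.2],
                   [0.1, 0.0, 1.5]])
t_true = np.array([10.0, -5.0, 2.0])

landmarks_src = rng.random((20, 3))
landmarks_dst = landmarks_src @ A_true.T + t_true  # synthetic noise-free target

A_fit, t_fit = fit_affine(landmarks_src, landmarks_dst)
warped = apply_affine(landmarks_src, A_fit, t_fit)
```

An affine warp only moves coordinates, so on its own it leaves expression and cell-type composition unchanged from the source stage.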
Detailed task documents, the starter-kit repo, the submission portal, and the discussion forum land alongside the P1 launch (2026-07-30). Until then, reach the organisers at [email protected].