This page displays the submitted results for the CW Panoptic Segmentation leaderboard. For each submission, the main table shows several key metrics. For detailed information, additional metrics, per-sequence results, and visualisations (coming soon), click the submission name. In any table, you can click a column header to sort the results. You can also download the submission zip file. Legends, metric descriptions, and references are displayed after the leaderboard table. For more information on submission preparation, click here.

The challenge is open and running again (after the CVPR 2024 workshop). We show the leading submission from each group on the Closed-world Panoptic Segmentation leaderboard. For more information on the dataset, metrics, and benchmark, please refer to the JRDB-PanoTrack paper.

CW Panoptic Segmentation Submissions


0.666 0.726 0.511 32.857
Feng Li, Hao Zhang, Huaizhe Xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum. Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation. In CVPR, 2023.

Additional Information Used

| Symbol | Description |
| --- | --- |
| Individual Image | Method uses individual images from each camera |
| Stitched Image | Method uses stitched images combined from the individual cameras |
| Pointcloud | Method uses 3D pointcloud data |
| Online Tracking | Method does frame-by-frame processing with no lookahead |
| Offline Tracking | Method does not do in-order frame processing |
| Public Detections | Method uses publicly available detections |
| Private Detections | Method uses its own private detections |

Evaluation Measures [1]

| Measure | Better | Perfect | Description |
| --- | --- | --- | --- |
| OSPA | lower | 0.0 | OSPA is a set-based metric that directly captures a distance between two sets of mask tracks without a thresholding parameter [2,3]. |
| OSPA Localization | lower | 0.0 | Represents prediction errors such as displacement, track ID switches, track fragmentation, or late track initiation/early termination [2,3]. |
| OSPA Cardinality | lower | 0.0 | Represents the cardinality mismatch between two sets, penalizing missed or false tracks without an explicit definition for them [2,3]. |
| PQ | higher | 1.0 | Measures how closely matched segments agree with the ground truth [6]. |
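To make the PQ row concrete, here is a minimal sketch of Panoptic Quality as defined in [6]: a predicted and a ground-truth segment match when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + 0.5|FP| + 0.5|FN|. This is not the benchmark's evaluation code. The function names, the representation of segments as sets of pixel coordinates, and the single-class setting are assumptions made for illustration; [6] computes PQ per class and averages over classes.

```python
from typing import FrozenSet, List, Tuple

Pixel = Tuple[int, int]
Segment = FrozenSet[Pixel]  # hypothetical representation: a segment is a set of pixels


def iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred: List[Segment], gt: List[Segment]) -> float:
    """Single-class PQ: sum of matched IoUs / (TP + 0.5*FP + 0.5*FN)."""
    matched_gt = set()
    iou_sum = 0.0
    tp = 0
    for p in pred:
        for gi, g in enumerate(gt):
            if gi in matched_gt:
                continue
            v = iou(p, g)
            # IoU > 0.5 is the matching rule from [6]; for non-overlapping
            # segments it guarantees each segment has at most one match.
            if v > 0.5:
                matched_gt.add(gi)
                iou_sum += v
                tp += 1
                break
    fp = len(pred) - tp  # predictions with no matching ground truth
    fn = len(gt) - tp    # ground-truth segments with no matching prediction
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Toy example: one good match (IoU 0.75), one false positive, one false negative.
gt_segments = [
    frozenset({(0, 0), (0, 1), (0, 2), (0, 3)}),
    frozenset({(1, 0), (1, 1)}),
]
pred_segments = [
    frozenset({(0, 0), (0, 1), (0, 2)}),
    frozenset({(2, 0)}),
]
print(panoptic_quality(pred_segments, gt_segments))  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

A prediction identical to the ground truth gives 1.0, the "Perfect" value listed in the table.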


  1. The style and content of the Evaluation Measures section are adapted from the MOT Challenge.
  2. Duy-Tho Le, Chenhui Gou, Stavya Datta, Hengcan Shi, Ian Reid, Jianfei Cai, Hamid Rezatofighi. JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments. In CVPR, 2024.
  3. Hamid Rezatofighi∗, Tran Thien Dat Nguyen∗, Ba-Ngu Vo, Ba-Tuong Vo, Silvio Savarese, and Ian Reid. How Trustworthy are Performance Evaluations for Basic Vision Tasks? IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2023.
  4. Keni Bernardin and Rainer Stiefelhagen. Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. EURASIP Journal on Image and Video Processing, 2008(1):1-10, 2008.
  5. Yuan Li, Chang Huang, and Ram Nevatia. Learning to Associate: HybridBoosted Multi-Target Tracker for Crowded Scene. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009.
  6. Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, Piotr Dollár. Panoptic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.