JRDB Monash

August 20

Submission Deadline
August 22

Dataset Released
September 6

Acceptance Notifications
September 22

Camera Ready Data
October 22

Challenge Deadline
October 28

Date of Workshop

Workshop Goals

This is the second workshop in the JRDB workshop series, which addresses the perceptual problems an autonomous robot must solve to operate, interact and navigate in human environments. These perception tasks include any 2D or 3D visual scene understanding problem, as well as other problems pertinent to understanding human action, intention and social behaviour, such as 2D-3D human detection, tracking and forecasting; 2D-3D human body skeleton pose estimation, tracking and forecasting; and human social grouping and activity recognition.

Recently, the community has paid increasing attention to the human activity understanding problem, owing to the availability of several large-scale annotated datasets for this computer vision and robotics task. However, the existing datasets for this problem are often collected from platforms such as YouTube and are limited to 2D annotations of individual actions and activities. The main focus of our CVPR workshop is the novel problem of social human activity understanding, which consists of three sub-tasks: individual action detection, social group identification and social activity recognition. We also introduce JRDB-Act, a large-scale, egocentric and multi-modal dataset.

The JRDB dataset contains 67 minutes of annotated sensory data acquired from the JackRabbot mobile manipulator and includes 54 indoor and outdoor sequences in a university campus environment. The sensory data includes a stereo RGB 360° cylindrical video stream, 3D point clouds from two LiDAR sensors, audio and GPS positions. In addition to the current 2D-3D person detection and tracking annotations, we will release a new set of annotations for this dataset: 1) individual actions, 2) human social group formation, and 3) the social activity of each social group. Using these unique annotations, we will launch two new benchmarks and challenges for this workshop. Our invited speakers are world-renowned experts in visual perception for understanding human action, intention and social behaviour. Finally, we aim to foster discussion among the attendees to find useful synergies and applications of the solutions to these (or similar) perceptual tasks.

Dataset

The currently available annotations on the JackRabbot dataset and benchmark (JRDB) include:

2D bounding box annotations around all the pedestrians visible in five RGB streams and their cylindrical composition.

3D oriented cuboid annotations around pedestrians in two Velodyne-16 LiDAR streams.

2D-3D associations between bounding boxes and cuboids.

Time consistent trajectories (tracks) for all annotated individuals in both 2D and 3D.

In addition to the above, we have provided a new set of annotations, including:

Individual action labels for all the individuals visible in the RGB 360° cylindrical video stream, representing:

Human pose actions (total 11 exclusive categories including a miscellaneous class).

Human-to-human interaction actions (total 3 categories including a miscellaneous class).

Human-to-object interaction actions (total 12 categories including a miscellaneous class).

The action label difficulty level (4 categories) for each attribute.

Social group formation of all the individuals visible in the RGB 360° cylindrical video stream (ranging from groups of 1 person to groups of 29 people).

Social activity labels for each social group, as the accumulation of the majority of individual action labels in that group.
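
As an illustration of how a group-level label could be accumulated from individual labels, here is a minimal sketch. The exact aggregation rule used for JRDB-Act is not specified here, so the strict-majority threshold, the function name and the example label strings are all assumptions made for illustration only.

```python
from collections import Counter


def social_activity_labels(member_actions):
    """Return the action labels held by a strict majority of group members.

    member_actions: list of sets, one set of action labels per person.
    The strict-majority rule (> 50% of members) is an assumption.
    """
    if not member_actions:
        return set()
    # Count, for each label, how many members carry it.
    counts = Counter(
        label for actions in member_actions for label in set(actions)
    )
    threshold = len(member_actions) / 2
    return {label for label, n in counts.items() if n > threshold}


# Hypothetical three-person group with illustrative label names.
group = [{"walking", "talking"}, {"walking"}, {"standing", "talking"}]
print(sorted(social_activity_labels(group)))
```

With this rule, a group of one person simply inherits that person's action labels.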

Open Challenge

In addition to the existing four benchmarks and challenges on JRDB, i.e. 2D-3D person detection and tracking challenges, in this workshop, we organise three new challenges using our new annotations:

Human social group identification

Individual action detection

Social activity recognition

The first-place winner of each of the seven challenges will be awarded a $100-$300 Amazon gift card and a certificate. The winners will also have the opportunity to present their work as a spotlight (5 minutes) and poster presentation during the workshop.
Guidelines for participation: Participants should strictly follow the submission policy provided on the main JRDB webpage, which can be found here. To distinguish challenge submissions from regular submissions, each submission name should be followed by a CVPR21 tag, e.g., "submissionname_CVPR21"; submissions without this tag will be ignored for the challenge. Each challenge submission should be accompanied by an extended abstract submitted via our CMT webpage (details are available below) or a link to an existing arXiv preprint or publication.
Evaluation: We use the first metric after "name" in each leaderboard as the main evaluation metric for ranking the entries. The evaluation instructions and toolkits for all seven benchmarks are available here.
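
The tagging and ranking rules above can be sketched as follows. This is not the official evaluation toolkit: the row layout, the metric column names and the assumption that a higher value of the main metric is better are all hypothetical and used only for illustration.

```python
CHALLENGE_TAG = "_CVPR21"


def rank_challenge_entries(rows, higher_is_better=True):
    """Keep only tagged submissions and rank them by the main metric.

    rows: list of dicts in leaderboard column order, containing a "name"
    column followed by metric columns (hypothetical layout).
    higher_is_better: assumption; some metrics may need the reverse order.
    """
    # Submissions without the tag are ignored for the challenge.
    entries = [r for r in rows if r["name"].endswith(CHALLENGE_TAG)]
    if not entries:
        return []
    # The main metric is the first column after "name".
    columns = list(entries[0].keys())
    main_metric = columns[columns.index("name") + 1]
    return sorted(entries, key=lambda r: r[main_metric], reverse=higher_is_better)


# Hypothetical leaderboard rows with placeholder metric names.
leaderboard = [
    {"name": "methodA_CVPR21", "metric_a": 0.61, "metric_b": 0.40},
    {"name": "methodB", "metric_a": 0.70, "metric_b": 0.55},
    {"name": "methodC_CVPR21", "metric_a": 0.66, "metric_b": 0.35},
]
for row in rank_challenge_entries(leaderboard):
    print(row["name"], row["metric_a"])
```

Here "methodB" is excluded despite its higher score because its name lacks the CVPR21 tag.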

Call for Papers

We invite researchers to submit papers addressing topics related to autonomous (robot) navigation in human environments. Relevant topics include, but are not limited to:

Social group formation and identification

Individual, group and social activity recognition

2D or 3D human detection and tracking

2D or 3D skeleton pose estimation and tracking

Human body reconstruction

Motion prediction and social models

2D or 3D human face detection

Human gaze estimation

Visual and social navigation in crowded scenes

Traversability estimation

Dataset proposals and bias analysis

New metrics and performance measures for different visual perception problems related to autonomous navigation

Submissions can follow either the CVPR format (4-8 double-column pages excluding references), with a submission deadline of April 12, or the extended abstract format (1 double-column page excluding references), with a submission deadline of May 30. Accepted papers will have the opportunity to be presented as posters during the workshop. However, only papers in CVPR format will appear in the proceedings. By submitting to this workshop, the authors agree to the review process and understand that we will do our best to match papers to the best possible reviewers. The reviewing process is double-blind. Submission to the challenge is independent of the paper submission, but we encourage authors to submit to one of the challenges.

Important dates:

Submission deadline for the full papers: April 13, 23:59 PST

Acceptance notification of full papers: April 17, 23:59 PST

Camera-ready deadline for the full papers: April 19, 23:59 PST

Submission deadline for the extended abstracts: May 30, 23:59 PST

Acceptance notification of the extended abstracts: June 10, 23:59 PST

Submissions can be made here. If you have any questions about submitting, please contact us here.

Program

Start Time	End Time	Description
12:30 PM	12:40 PM	Introduction
12:40 PM	1:10 PM	Invited Talk: Bastian Leibe, RWTH Aachen University - Mobile Person Detection and Tracking using 2D and 3D Sensor Data
1:10 PM	1:30 PM	Full Papers' Oral Presentations: Yuhang He et al., "Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration"; Emre Hatay et al., "Learning to Detect Phone-related Pedestrian Distracted Behaviors with Synthetic Data"
1:30 PM	2:00 PM	Invited Talk: Laura Leal-Taixé/Aljosa Osep, Technical University Munich - Tracking Every Object and Pixel
2:00 PM	2:30 PM	Coffee Break & Video Demo Session
2:30 PM	3:00 PM	Invited Talk: Marynel Vázquez, Yale University - Applications of Graph Neural Networks to Spatial Reasoning in Robotics
3:00 PM	3:30 PM	Invited Talk: Kris Kitani, Carnegie Mellon University - Modeling Attention in Social Group Interactions
3:30 PM	3:45 PM	Introduction to JRDB Activity Dataset and Challenge
3:45 PM	4:10 PM	Challenge Winners' Presentation
4:40 PM	4:50 PM	Discussion, Closing Remarks and Awards

Invited Speakers

Juan Carlos Niebles

Bastian Leibe

Marynel Vázquez

Kris Kitani

Aljosa Osep