
3rd Workshop on Visual Perception for Navigation in Human Environments:

The JackRabbot Human Body Pose Dataset and Benchmark

Workshop Goals

This is the third workshop in the JRDB workshop series, addressing the perceptual problems an autonomous robot must solve to operate, interact, and navigate in human environments. These perception tasks include any 2D or 3D visual scene understanding problem, as well as problems pertinent to understanding human action, intention, and social behaviour, such as 2D-3D human detection, tracking, and forecasting; 2D-3D human body skeleton pose estimation, tracking, and forecasting; and human social grouping and activity recognition.

The JRDB dataset contains 67 minutes of annotated sensory data acquired from the JackRabbot mobile manipulator, comprising 54 indoor and outdoor sequences captured in a university campus environment. The sensory data includes a stereo RGB 360° cylindrical video stream, 3D point clouds from two LiDAR sensors, audio, and GPS positions. In our first workshop, we introduced JRDB with annotations for 2D bounding boxes and 3D oriented cuboids around pedestrians. In our second workshop, we further introduced new annotations for individual actions, human social group formation, and the social activity of each social group. In this workshop, we additionally release annotations for 2D human body pose, comprising 800,000+ annotated human body skeletons with visibility and occlusion labels. We also have, as invited speakers, world-renowned experts in visual perception for understanding human action, behaviour, shape, and pose.

Call for Papers


We invite researchers to submit papers addressing topics related to autonomous (robot) navigation in human environments. Relevant topics include, but are not limited to:

  • 2D or 3D human detection and tracking
  • 2D or 3D skeleton pose estimation and tracking
  • Human motion/body skeleton pose prediction
  • Social group formation and identification
  • Individual, group and social activity recognition
  • Abnormal activity recognition
  • Motion prediction and social models
  • Human walking behaviour analysis
  • 2D and 3D semantic, instance or panoptic segmentation
  • Visual and social navigation in crowded scenes
  • Dataset proposals and bias analysis
  • New metrics and performance measures for visual perception problems related to autonomous navigation


  • Submission deadline for the full papers:
    July 20 (Anywhere on Earth)
  • Acceptance notification of full papers:
    August 4
  • Camera-ready deadline for the full papers:
    August 18
  • Submission deadline for the extended abstracts:
    TBD
  • Acceptance notification of the extended abstracts:
    TBD


Submissions can follow the ECCV format (maximum 14 single-column pages excluding references), with a submission deadline of June 29, or be an extended abstract (maximum 2 single-column pages excluding references), with a submission deadline of TBD. Accepted papers will have the opportunity to be presented as posters during the workshop; however, only papers in the ECCV format will appear in the proceedings. By submitting to this workshop, the authors agree to the review process and understand that we will do our best to match papers to the best possible reviewers. The reviewing process is double-blind. Submission to the challenges is independent of paper submission, but we encourage authors to submit to one of the challenges.

Submissions can be made here. If you have any questions about submitting, please contact us here.


Start Time  End Time  Description
12:30 PM    12:40 PM  Introduction
12:40 PM    1:10 PM   Invited Talk
1:10 PM     1:30 PM   Full Paper Oral Presentations
1:30 PM     2:00 PM   Invited Talk
2:00 PM     2:30 PM   Coffee Break & Video Demo Session
2:30 PM     3:00 PM   Invited Talk
3:00 PM     3:30 PM   Invited Talk
3:30 PM     3:45 PM   Introduction to JRDB Pose Dataset and Challenge
3:45 PM     4:10 PM   Challenge Winners' Presentations
4:40 PM     4:50 PM   Discussion, Closing Remarks and Awards


In addition to the existing benchmarks and challenges on JRDB (2D-3D person detection and tracking, human social group identification, individual action detection, and social activity recognition), in this workshop, we organise two new challenges using our new annotations:

  • 2D Human skeleton pose estimation
  • 2D Human skeleton pose tracking

The winner of each challenge will be awarded a prize (TBD) and a certificate. The winners will also have the opportunity to present their work as a spotlight (5 minutes) and a poster during the workshop.


In our previous two workshops, at ICCV 2019 and CVPR 2021, we introduced annotations for the JackRabbot dataset and benchmark (JRDB), including:

  • 2D bounding boxes and 3D oriented cuboids
  • 2D-3D associations between bounding boxes and cuboids
  • Individual action labels for all individuals, representing:
    • Human pose actions
    • Human-to-human and human-to-object interaction actions
    • The difficulty level of each action label
  • Social group formation of all individuals and social activity labels for each social group
  • Time-consistent trajectories (tracks) for all annotated individuals in both 2D and 3D

Now, in addition to all of the annotations above, we introduce a new set of annotations for human body pose, including:

  • Human body pose annotations, representing:
    • 17 keypoints per person
    • Occlusion severity for each keypoint (Visible, Occluded, Invisible)
  • Time-consistent trajectories (tracks) for all annotated individuals
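To make the annotation structure concrete, the sketch below shows how a single annotated skeleton (17 keypoints, each with an occlusion severity label) might be handled. This is a minimal illustration only: the list-of-triples layout, the integer label encoding, and the function name are assumptions for the example, not the official JRDB file schema, which is documented on the dataset page.

```python
# Hypothetical sketch of working with JRDB-style pose annotations.
# The data layout and label encoding below are illustrative assumptions,
# not the official schema.

# Assumed encoding of the three occlusion severity labels.
OCCLUSION_LABELS = {0: "Visible", 1: "Occluded", 2: "Invisible"}

def count_visible(keypoints):
    """Count the keypoints labelled Visible in one skeleton.

    `keypoints` is assumed to be a list of 17 (x, y, occlusion) triples,
    one per annotated body keypoint.
    """
    assert len(keypoints) == 17, "JRDB pose annotations use 17 keypoints per person"
    return sum(1 for (_, _, occ) in keypoints if OCCLUSION_LABELS[occ] == "Visible")

# Example: a skeleton with 15 visible, 1 occluded, and 1 invisible keypoint.
skeleton = [(10.0 + i, 20.0 + i, 0) for i in range(15)] + [(5.0, 5.0, 1), (0.0, 0.0, 2)]
print(count_visible(skeleton))  # → 15
```

A per-keypoint visibility count like this is a common first step when filtering heavily occluded skeletons before training or evaluation.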

For more details about the dataset, see here.

Guidelines for participation

Participants should strictly follow the submission policy provided on the main JRDB webpage, which can be found here. In addition, to distinguish challenge submissions from regular submissions, each submission name should be followed by an ECCV22 tag, e.g., "submissionname_ECCV22"; submissions without this tag will not be considered for the challenge. Each challenge submission should be accompanied by an extended abstract submitted via our CMT webpage (details are available below) or a link to an existing arXiv preprint/publication.

Evaluation & Toolkit

We use the first metric after "name" in each leaderboard as the main evaluation metric for ranking entries. For each benchmark, we have also created toolkits to work with the dataset, perform evaluation, and create submissions. These toolkits are available here.

Invited Speakers

Gerard Pons-Moll

Professor of Computer Science, University of Tübingen

Yaser Sheikh

Associate Professor in the Robotics Institute, Carnegie Mellon University and Director, Facebook Reality Lab

Dima Damen

Professor of Computer Vision at the University of Bristol

Andreas Geiger

Professor of Computer Science, University of Tübingen

Dana Kulić

Professor at Monash University

Adrien Gaidon

Head of Machine Learning Research at Toyota Research Institute (TRI)

Program Committee

Name Organization
Aakash Kumar University of Central Florida
Dan Jia RWTH Aachen University
Edwin Pan Stanford University
Ehsan Adeli Stanford University
Haofei Xu University of Tübingen
Hsu-Kuang Chiu Waymo
Huangying Zhan The University of Adelaide
Karttikeya Mangalam UC Berkeley
Michael Wray University of Bristol
Michael Villamizar Idiap Research Institute
Mohsen Fayyaz Microsoft
Nathan Tsoi Yale University
Nikos Athanasiou Max Planck Institute for Intelligent Systems
Saeed Saadatnejad EPFL
Sandika Biswas TCS
Shyamal Buch Stanford University
Tianyu Zhu Monash University
Vida Adeli University of Toronto
Vineet Kosaraju Stanford University
Ye Yuan Carnegie Mellon University

Organizers


Hamid Rezatofighi

Monash University

Edward Vendrow

Stanford University

Ian Reid

The University of Adelaide

Silvio Savarese

Stanford University