ECCV 2022 Oral Presentation

OPD: Single-view 3D Openable Part Detection

What can open—and how will it move—from just one image?

Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang

01 / Problem

See the part. Understand the motion.

An input cabinet image followed by detected openable parts and predicted motion axes and origins across object categories

We address the task of predicting which parts of an object can open and how those parts move. Given one image, OPD detects every openable part and estimates the parameters that describe its articulation.

To study this problem, we introduce two complementary datasets: OPDSynth, built from existing synthetic objects, and OPDReal, built from RGB-D reconstructions of real objects. We also introduce OPDRCNN, a neural architecture that jointly detects openable parts and predicts their motion parameters.

The experiments expose a difficult generalization problem: a model must infer functional structure from limited visual evidence, often on object categories it did not see during training. OPDRCNN improves over prior work and baselines, especially when using RGB input.

Localize

Find every openable part with a bounding box and segmentation mask.

Classify

Identify the part category and whether its motion is rotational or translational.

Parameterize

Recover a 3D motion axis and origin for each detected part.
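To make the three outputs concrete, here is a minimal Python sketch of what one detection might carry and how the motion parameters describe movement. It is an illustration, not the OPD code: the `OpenablePart` class, its field names, and the `move_point` helper are hypothetical, the segmentation mask is omitted for brevity, and the axis and origin are assumed to be expressed in whichever 3D frame the model predicts in. A translational part slides along the axis; a rotational part turns about the axis passing through the origin (Rodrigues' rotation formula).

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class OpenablePart:
    """Hypothetical container for one detected part (names are illustrative)."""
    box: Tuple[float, float, float, float]  # 2D box (x_min, y_min, x_max, y_max) in pixels
    category: str      # part category, e.g. "drawer", "door", "lid"
    motion_type: str   # "rotation" or "translation"
    axis: Vec3         # 3D direction of the motion axis
    origin: Vec3       # 3D point on the axis (only matters for rotation)


def normalize(v: Vec3) -> Vec3:
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("motion axis must be non-zero")
    return (v[0] / n, v[1] / n, v[2] / n)


def move_point(part: OpenablePart, point: Vec3, amount: float) -> Vec3:
    """Move a 3D point attached to the part.

    `amount` is an angle in radians for rotation, or a distance for translation.
    """
    k = normalize(part.axis)
    if part.motion_type == "translation":
        # Slide along the axis; the origin plays no role.
        return tuple(p + amount * c for p, c in zip(point, k))
    if part.motion_type == "rotation":
        # Rodrigues' formula, applied relative to the motion origin.
        v = tuple(p - o for p, o in zip(point, part.origin))
        cos_t, sin_t = math.cos(amount), math.sin(amount)
        k_cross_v = (
            k[1] * v[2] - k[2] * v[1],
            k[2] * v[0] - k[0] * v[2],
            k[0] * v[1] - k[1] * v[0],
        )
        k_dot_v = sum(a * b for a, b in zip(k, v))
        rotated = tuple(
            v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
            for i in range(3)
        )
        return tuple(r + o for r, o in zip(rotated, part.origin))
    raise ValueError(f"unknown motion type: {part.motion_type}")
```

For example, a door hinged on a vertical axis through (1, 0, 0) carries a point at (2, 0, 0) to approximately (1, 1, 0) when opened by 90 degrees, while a drawer simply shifts every point by the same offset along its axis.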

02 / Approach

From articulated objects to motion-aware predictions.

Dataset / 01

OPDSynth + OPDReal

OPDSynth covers varied part categories and kinematic structures at scale. OPDReal complements it with RGB-D reconstructions, semantic part annotations, and real-world appearance.

Examples from OPDSynth and OPDReal, including lids, drawers, doors, cabinet kinematic structures, and reconstructed real objects. Synthetic variation above; reconstructed real objects below.

Architecture / 02

OPDRCNN-C + OPDRCNN-O

Both architectures extend Mask R-CNN with additional motion prediction heads. Alongside detection and segmentation, the network predicts the motion type, 3D axis orientation, and motion origin of each openable part.

Network architecture for OPDRCNN-C and OPDRCNN-O with Mask R-CNN detection and additional motion prediction heads. Joint part detection, segmentation, and motion estimation.
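One way to picture how motion heads could be trained next to the detection heads is a per-part loss with one term per motion output. The sketch below is an assumption for illustration, not the OPDRCNN training code: the function names, the choice of cross-entropy for motion type and smooth L1 for axis and origin, the unit weights, and the decision to supervise the origin only for rotational parts (where the origin actually constrains the motion) are all hypothetical.

```python
import math
from typing import Sequence

ROTATION, TRANSLATION = 0, 1  # hypothetical class indices for the motion type


def cross_entropy(logits: Sequence[float], target_index: int) -> float:
    """Softmax cross-entropy for one sample, computed stably in log space."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Mean smooth L1 (Huber-style) distance between two vectors."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def motion_loss(type_logits, axis_pred, origin_pred,
                type_gt, axis_gt, origin_gt,
                weights=(1.0, 1.0, 1.0)) -> float:
    """Combine the three motion terms for a single matched part."""
    w_type, w_axis, w_origin = weights
    loss = w_type * cross_entropy(type_logits, type_gt)
    loss += w_axis * smooth_l1(axis_pred, axis_gt)
    # Assumption: a translational part has no meaningful origin,
    # so the origin term is applied to rotational parts only.
    if type_gt == ROTATION:
        loss += w_origin * smooth_l1(origin_pred, origin_gt)
    return loss
```

In a full detector this term would be added to the usual box, class, and mask losses for each part matched to a ground-truth annotation; the relative weights are a tuning choice.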

03 / Open resources

Code, datasets, and pretrained models.

Main project / OPD

OPDRCNN

The full implementation, OPDSynth and OPDReal data, plus pretrained RGB, depth, and RGB-D models.

Code Dataset Models

Baseline / OPDPN

OPDPN baseline

The project implementation, prepared dataset, and pretrained models used for the OPDPN comparison.

Code Dataset Models

Baseline / ANCSH

Category-level baseline

Our PyTorch implementation of ANCSH with the converted dataset and pretrained comparison models.

Code Dataset Models

04 / Publication

Read the OPD paper.

ECCV 2022 / Paper

Acknowledgements

Made possible by people and infrastructure.

This work was funded in part by a Canada CIFAR AI Chair, a Canada Research Chair, and an NSERC Discovery Grant, and enabled in part by support from WestGrid and Compute Canada.

We thank Sanjay Haresh for scanning support and video narration; Yue Ruan for scanning and data annotation; and Supriya Pandhre, Xiaohao Sun, and Qirui Wu for annotation support.

05 / Citation

Cite OPD.

@inproceedings{jiang2022opd,
  title={OPD: Single-view 3D Openable Part Detection},
  author={Jiang, Hanxiao and Mao, Yongsen and Savva, Manolis and Chang, Angel X.},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXIX},
  pages={410--426},
  year={2022},
  organization={Springer}
}