S2O: Static to Openable Enhancement for Articulated 3D Objects

Despite much progress in large 3D datasets, there are currently few interactive 3D object datasets, and their scale is limited by the manual effort required to construct them. We introduce the static to openable (S2O) task, which creates interactive articulated 3D objects from static counterparts through openable part detection, motion prediction, and interior geometry completion.

We formulate a unified framework to tackle this task, and curate a challenging dataset of openable 3D objects that serves as a test bed for systematic evaluation. Our experiments benchmark methods from prior work, extended and improved methods, and simple yet effective heuristics for the S2O task.

We find that turning static 3D objects into interactively openable counterparts is possible, but that all methods struggle to generalize to realistic settings of the task, and we highlight promising directions for future work.

Overview

The input to our static to openable (S2O) task is a static 3D triangle mesh of an openable container. The output is an articulated 3D object that can be interactively opened and closed, with explicitly predicted part segmentation and articulation parameters. We distinguish three part types: drawers, doors, and lids.
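As a rough illustration of this input/output contract, the sketch below shows one way the output could be represented in Python. All class and field names (OpenablePart, ArticulatedObject, axis, origin, and so on) are assumptions made for this example and are not taken from the S2O codebase.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


class PartType(Enum):
    # The three openable part types distinguished in the task.
    DRAWER = "drawer"
    DOOR = "door"
    LID = "lid"


class MotionType(Enum):
    TRANSLATION = "translation"
    ROTATION = "rotation"


@dataclass
class OpenablePart:
    # Hypothetical record: which triangles belong to the part and how it moves.
    part_type: PartType
    triangle_ids: List[int]  # indices into ArticulatedObject.triangles
    motion_type: MotionType
    axis: Vec3               # unit direction of motion
    origin: Vec3             # a point on the rotation axis (unused for translation)


@dataclass
class ArticulatedObject:
    vertices: List[Vec3]
    triangles: List[Tuple[int, int, int]]
    parts: List[OpenablePart] = field(default_factory=list)

    def base_triangle_ids(self) -> List[int]:
        # Triangles not claimed by any openable part form the static base.
        moving = {t for part in self.parts for t in part.triangle_ids}
        return [i for i in range(len(self.triangles)) if i not in moving]


# Toy example: a quad made of two triangles, one of which is labeled a drawer.
quad = ArticulatedObject(
    vertices=[(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)],
    triangles=[(0, 1, 2), (0, 2, 3)],
    parts=[OpenablePart(PartType.DRAWER, [0], MotionType.TRANSLATION,
                        axis=(0.0, 1.0, 0.0), origin=(0.0, 0.0, 0.0))],
)
```

Storing part membership as triangle indices keeps the original mesh untouched, which matches the goal of enhancing a static asset rather than regenerating it.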

Data

We start with a subset of PartNet-Mobility, the most commonly used dataset of articulated objects, curating a split of openable containers (PM-Openable). However, we find it to be repetitive and limited in complexity and realism. To address this, we introduce the Articulated Containers Dataset (ACD), which is more diverse and challenging.

ACD is curated by selecting 354 shapes from the ABO, HSSD, and 3D-FUTURE datasets and manually annotating their parts and articulation parameters. The dataset features more complex objects, with a higher number of parts per object and more complex part arrangements and articulations, as well as diverse object categories, and serves as a more realistic testbed for the S2O task. ACD objects rarely include interior geometry because they were designed to be static; we apply our interior completion heuristic to prepare simulator-ready assets.

Pipeline

Our pipeline consists of three main components: openable part segmentation, motion prediction, and interior geometry completion.

For part segmentation, we benchmark a number of state-of-the-art methods across three modalities: meshes (MeshWalker), point clouds (PointGroup, Mask3D), and images (OPDFormer). Finding that point-cloud-based methods work best, we propose FPNGroup, which further improves the segmentation. For images and point clouds, we use a set of heuristics to map predictions back onto the mesh.
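The mapping heuristics are not detailed here. As one plausible illustration for the point cloud case, the sketch below assigns each triangle the majority label of the points sampled from it. It assumes that each sampled point remembers its source triangle; the function name and the background label of -1 are choices made for this example, not the exact procedure used in S2O.

```python
from collections import Counter, defaultdict
from typing import List


def points_to_triangle_labels(point_triangle_ids: List[int],
                              point_labels: List[int],
                              num_triangles: int,
                              background: int = -1) -> List[int]:
    """Majority-vote per-point instance labels onto mesh triangles."""
    votes = defaultdict(Counter)
    for tri, label in zip(point_triangle_ids, point_labels):
        votes[tri][label] += 1

    labels = []
    for tri in range(num_triangles):
        if tri in votes:
            # Most frequent label among the points sampled from this triangle.
            labels.append(votes[tri].most_common(1)[0][0])
        else:
            # No point was sampled from this triangle; treat it as background.
            labels.append(background)
    return labels


# Three points on triangle 0 (two vote for part 1), none on triangle 1,
# and one point on triangle 2 (votes for part 2).
tri_labels = points_to_triangle_labels([0, 0, 0, 2], [1, 1, -1, 2], num_triangles=3)
```

Triangles that receive no samples default to the background label here; a fuller implementation might instead propagate labels from neighboring triangles.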

For motion prediction, we propose a simple yet effective heuristic, as well as FPNGroupMot, an extension of our segmentation method that predicts mobility parameters directly.
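To make concrete what mobility parameters encode, the following sketch applies a predicted motion to a single part vertex: translation along an axis (typical for a sliding part) or rotation about an axis through an origin point (typical for a hinged part), using Rodrigues' rotation formula. This is generic rigid-motion math under the assumption of a unit-length axis, and the function names are illustrative; it is not the prediction heuristic itself.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def translate(point: Vec3, axis: Vec3, distance: float) -> Vec3:
    # Slide the point along the (unit) axis by the given distance.
    return tuple(p + distance * a for p, a in zip(point, axis))


def rotate(point: Vec3, axis: Vec3, origin: Vec3, angle: float) -> Vec3:
    # Rodrigues' formula: v' = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t)),
    # where k is the unit axis and v is the point relative to the axis origin.
    v = tuple(p - o for p, o in zip(point, origin))
    kx, ky, kz = axis
    cross = (ky * v[2] - kz * v[1],
             kz * v[0] - kx * v[2],
             kx * v[1] - ky * v[0])
    dot = kx * v[0] + ky * v[1] + kz * v[2]
    c, s = math.cos(angle), math.sin(angle)
    return tuple(o + vi * c + ci * s + ki * dot * (1.0 - c)
                 for o, vi, ci, ki in zip(origin, v, cross, axis))


# Pull a drawer vertex 0.3 units outward along +y.
opened_drawer_vertex = translate((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.3)

# Swing a door vertex 90 degrees about a vertical hinge through the origin.
opened_door_vertex = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), math.pi / 2)
```

Applying the same transform to every vertex of a segmented part yields the opened state, which is why errors in segmentation carry over directly into the motion results.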

For interior geometry completion, we propose a simple heuristic that targets drawer box completion.
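The details of the heuristic are not given here, so the sketch below shows only one hypothetical way a drawer box could be completed from a segmented drawer front. It assumes an axis-aligned front panel, a drawer that opens along +y, and a known interior depth and wall thickness; the function name, parameters, and panel layout are all assumptions for illustration.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # axis-aligned box given as (min corner, max corner)


def complete_drawer_box(front_min: Vec3, front_max: Vec3,
                        depth: float, thickness: float) -> Dict[str, Box]:
    """Return the four panels that, with the existing front, form an open-top box."""
    x0, y0, z0 = front_min
    x1, _, z1 = front_max
    y_back = y0 - depth  # the box extends behind the front panel, along -y
    return {
        "bottom": ((x0, y_back, z0), (x1, y0, z0 + thickness)),
        "left":   ((x0, y_back, z0), (x0 + thickness, y0, z1)),
        "right":  ((x1 - thickness, y_back, z0), (x1, y0, z1)),
        "back":   ((x0, y_back, z0), (x1, y_back + thickness, z1)),
    }


# A drawer front 1.0 wide and 0.2 tall, completed with a 0.5-deep box.
panels = complete_drawer_box((0.0, 0.0, 0.0), (1.0, 0.02, 0.2),
                             depth=0.5, thickness=0.01)
```

Because the new panels are attached to the existing front and move with it, the original exterior geometry is left unchanged.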

Results

Segmentation

Here we show openable part segmentation results; drawers are visualized in blue and doors in orange. We note that the task remains challenging, especially on the more realistic ACD shapes. We attribute this in part to missing interiors in ACD, whereas interiors are present in the PM-Openable training data. We find that our FPNGroup outperforms the baselines, and that adding a version of PM-Openable without interiors (PM-Openable-ext) to the training split improves the results further.

Motion Prediction

Here we show motion prediction results on the PM-Openable and ACD datasets with our heuristic and learned mobility predictions. Quantitatively, we find that our heuristic outperforms the learned method. Overall, mobility prediction results depend heavily on segmentation quality. On PM-Openable, results are strong with FPNGroup and our heuristic, owing to good segmentation quality and a reasonable heuristic. On ACD, the weaker segmentation results make mobility prediction challenging.

Interior Completion

Here we compare our interior completion heuristic with other methods potentially capable of outputting interior geometry. We find that the image-conditioned articulated object generation method SINGAPO can output interiors well but fails to preserve the geometry of the original input shape because of its retrieval-based nature. The image-to-3D generative model TRELLIS is not capable of outputting interiors. Our heuristic effectively produces interiors while preserving the base part geometry, which follows from the formulation of our task.

BibTeX


        @misc{iliash2024s2ostaticopenableenhancement,
          title={S2O: Static to Openable Enhancement for Articulated 3D Objects}, 
          author={Denys Iliash and Hanxiao Jiang and Yiming Zhang and Manolis Savva and Angel X. Chang},
          year={2024},
          eprint={2409.18896},
          archivePrefix={arXiv},
          primaryClass={cs.CV},
          url={https://arxiv.org/abs/2409.18896}, 
        }