Mikaela Angelina Uy

ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Process

Jan 18, 2024
Kiyohiro Nakayama, Mikaela Angelina Uy, Yang You, Ke Li, Leonidas Guibas

Neural radiance fields (NeRFs) have gained popularity across various applications. However, they face challenges in the sparse view setting, lacking sufficient constraints from volume rendering. Reconstructing and understanding a 3D scene from sparse and unconstrained cameras is a long-standing problem in classical computer vision with diverse applications. While recent works have explored NeRFs in sparse, unconstrained view scenarios, their focus has been primarily on enhancing reconstruction and novel view synthesis. Our approach takes a broader perspective by posing the question: "from where has each point been seen?" -- which gates how well we can understand and reconstruct it. In other words, we aim to determine the origin or provenance of each 3D point and its associated information under sparse, unconstrained views. We introduce ProvNeRF, a model that enriches a traditional NeRF representation by incorporating per-point provenance, modeling likely source locations for each point. We achieve this by extending implicit maximum likelihood estimation (IMLE) for stochastic processes. Notably, our method is compatible with any pre-trained NeRF model and the associated training camera poses. We demonstrate that modeling per-point provenance offers several advantages, including uncertainty estimation, criteria-based view selection, and improved novel view synthesis, compared to state-of-the-art methods. Please visit our project page at https://provnerf.github.io

NeRF Revisited: Fixing Quadrature Instability in Volume Rendering

Oct 31, 2023
Mikaela Angelina Uy, Kiyohiro Nakayama, Guandao Yang, Rahul Krishna Thomas, Leonidas Guibas, Ke Li

Neural radiance fields (NeRF) rely on volume rendering to synthesize novel views. Volume rendering requires evaluating an integral along each ray, which is numerically approximated with a finite sum that corresponds to the exact integral along the ray under piecewise constant volume density. As a consequence, the rendered result is unstable w.r.t. the choice of samples along the ray, a phenomenon that we dub quadrature instability. We propose a mathematically principled solution by reformulating the sample-based rendering equation so that it corresponds to the exact integral under piecewise linear volume density. This simultaneously resolves multiple issues: conflicts between samples along different rays, imprecise hierarchical sampling, and non-differentiability of quantiles of ray termination distances w.r.t. model parameters. We demonstrate several benefits over the classical sample-based rendering equation, such as sharper textures, better geometric reconstruction, and stronger depth supervision. Our proposed formulation can also be used as a drop-in replacement for the volume rendering equation of existing NeRF-based methods. Our project page can be found at pl-nerf.github.io.

* NeurIPS 2023
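
For readers unfamiliar with the finite sum the abstract refers to, here is a minimal sketch of the classical sample-based quadrature under piecewise constant density. It illustrates only the baseline that the paper reformulates, not the proposed piecewise linear version. The function names `render_weights` and `render_color` and the scalar "color" are assumptions made for this illustration.

```python
import math

def render_weights(sigmas, ts):
    """Classical quadrature weights under piecewise-constant density.

    sigmas[i] is the density assumed constant on [ts[i], ts[i+1]).
    Returns one weight per interval: T_i * (1 - exp(-sigma_i * delta_i)),
    where T_i is the transmittance accumulated before interval i.
    """
    assert len(ts) == len(sigmas) + 1
    weights = []
    transmittance = 1.0  # probability that the ray reaches the current interval
    for sigma, t0, t1 in zip(sigmas, ts[:-1], ts[1:]):
        alpha = 1.0 - math.exp(-sigma * (t1 - t0))  # opacity of this interval
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def render_color(sigmas, colors, ts):
    """Weighted sum of per-interval colors (scalars here for simplicity)."""
    return sum(w * c for w, c in zip(render_weights(sigmas, ts), colors))
```

Because each weight depends on where the interval boundaries `ts` fall, moving the samples changes which stretch of the ray a given density value is assumed to cover. That dependence on sample placement is the sensitivity the abstract calls quadrature instability.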

DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

May 04, 2023
Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J Guibas

While 3D point cloud generation has seen substantial growth in recent years, an effective way to enable intuitive user control in the generation process is still lacking, which limits the general utility of such methods. Since an intuitive way of decomposing a shape is through its parts, we propose to tackle the task of controllable part-based point cloud generation. We introduce DiffFacto, a novel probabilistic generative model that learns the distribution of shapes with part-level control. We propose a factorization that models independent part style and part configuration distributions, and present a novel cross-diffusion network that enables us to generate coherent and plausible shapes under our proposed factorization. Experiments show that our method is able to generate novel shapes with multiple axes of control. It achieves state-of-the-art part-level generation quality and generates plausible and coherent shapes while enabling various downstream editing applications such as shape interpolation, mixing, and transformation editing. Project website: https://difffacto.github.io/

SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates

Mar 23, 2023
Mikaela Angelina Uy, Ricardo Martin-Brualla, Leonidas Guibas, Ke Li

Neural radiance fields (NeRFs) have enabled high fidelity 3D reconstruction from multiple 2D input views. However, a well-known drawback of NeRFs is the less-than-ideal performance under a small number of views, due to insufficient constraints enforced by volumetric rendering. To address this issue, we introduce SCADE, a novel technique that improves NeRF reconstruction quality on sparse, unconstrained input views for in-the-wild indoor scenes. To constrain NeRF reconstruction, we leverage geometric priors in the form of per-view depth estimates produced with state-of-the-art monocular depth estimation models, which can generalize across scenes. A key challenge is that monocular depth estimation is an ill-posed problem, with inherent ambiguities. To handle this issue, we propose a new method that learns to predict, for each view, a continuous, multimodal distribution of depth estimates using conditional Implicit Maximum Likelihood Estimation (cIMLE). To resolve these ambiguities by exploiting multiple views, we introduce an original space carving loss that guides the NeRF representation to fuse multiple hypothesized depth maps from each view and distill from them a common geometry that is consistent with all views. Experiments show that our approach enables higher fidelity novel view synthesis from sparse views. Our project page can be found at https://scade-spacecarving-nerfs.github.io.

* CVPR 2023

PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision

Mar 21, 2023
Konstantinos Tertikas, Despoina Paschalidou, Boxiao Pan, Jeong Joon Park, Mikaela Angelina Uy, Ioannis Emiris, Yannis Avrithis, Leonidas Guibas

Impressive progress in generative models and implicit representations has given rise to methods that can generate 3D shapes of high quality. However, being able to locally control and edit shapes is another essential property that can unlock several content creation applications. Local control can be achieved with part-aware models, but existing methods require 3D supervision and cannot produce textures. In this work, we devise PartNeRF, a novel part-aware generative model for editable 3D shape synthesis that does not require any explicit 3D supervision. Our model generates objects as a set of locally defined NeRFs, augmented with an affine transformation. This enables several editing operations, such as applying transformations on parts and mixing parts from different objects. To ensure distinct, manipulable parts, we enforce a hard assignment of rays to parts, which ensures that the color of each ray is determined by a single NeRF only. As a result, altering one part does not affect the appearance of the others. Evaluations on various ShapeNet categories demonstrate the ability of our model to generate editable 3D objects of improved fidelity, compared to previous part-based generative approaches that require 3D supervision or models relying on NeRFs.

* To appear in CVPR 2023, Project Page: https://ktertikas.github.io/part_nerf

Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders

Dec 17, 2021
Mikaela Angelina Uy, Yen-yu Chang, Minhyuk Sung, Purvi Goel, Joseph Lambourne, Tolga Birdal, Leonidas Guibas

We propose Point2Cyl, a supervised network transforming a raw 3D point cloud to a set of extrusion cylinders. Reverse engineering from a raw geometry to a CAD model is an essential task to enable manipulation of the 3D data in shape editing software and thus expand its use in many downstream applications. In particular, the form of CAD models having a sequence of extrusion cylinders -- a 2D sketch plus an extrusion axis and range -- and their boolean combinations is not only widely used in the CAD community/software but also has great expressivity of shapes, compared to having limited types of primitives (e.g., planes, spheres, and cylinders). In this work, we introduce a neural network that solves the extrusion cylinder decomposition problem in a geometry-grounded way by first learning underlying geometric proxies. Precisely, our approach first predicts per-point segmentation, base/barrel labels and normals, then estimates the underlying extrusion parameters in differentiable and closed-form formulations. Our experiments show that our approach demonstrates the best performance on two recent CAD datasets, Fusion Gallery and DeepCAD, and we further showcase our approach on reverse engineering and editing.

Joint Learning of 3D Shape Retrieval and Deformation

Jan 19, 2021
Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri, Leonidas Guibas

We propose a novel technique for producing high-quality 3D models that match a given target object image or scan. Our method is based on retrieving an existing shape from a database of 3D models and then deforming its parts to match the target shape. Unlike previous approaches that independently focus on either shape retrieval or deformation, we propose a joint learning procedure that simultaneously trains the neural deformation module along with the embedding space used by the retrieval module. This enables our network to learn a deformation-aware embedding space, so that retrieved models are more amenable to match the target after an appropriate deformation. In fact, we use the embedding space to guide the shape pairs used to train the deformation module, so that it invests its capacity in learning deformations between meaningful shape pairs. Furthermore, our novel part-aware deformation module can work with inconsistent and diverse part-structures on the source shapes. We demonstrate the benefits of our joint training not only on our novel framework, but also on other state-of-the-art neural deformation modules proposed in recent years. Lastly, we also show that our jointly-trained method outperforms a two-step deformation-aware retrieval that uses direct optimization instead of neural deformation or a pre-trained deformation module.

Deformation-Aware 3D Model Embedding and Retrieval

Apr 02, 2020
Mikaela Angelina Uy, Jingwei Huang, Minhyuk Sung, Tolga Birdal, Leonidas Guibas

We introduce a new problem of retrieving 3D models that are not just similar but are deformable to a given query shape. We then present a novel deep deformation-aware embedding to solve this retrieval task. 3D model retrieval is a fundamental operation for recovering a clean and complete 3D model from a noisy and partial 3D scan. However, given a finite collection of 3D shapes, even the closest model to a query may not be a satisfactory reconstruction. This motivates us to apply 3D model deformation techniques to adapt the retrieved model so as to better fit the query. Yet, certain restrictions are enforced in most 3D deformation techniques to preserve important features of the original model that prevent a perfect fitting of the deformed model to the query. This gap between the deformed model and the query induces asymmetric relationships among the models, which cannot be handled by typical metric learning techniques. Thus, to retrieve the best models for fitting, we propose a novel deep embedding approach that learns the asymmetric relationships by leveraging location-dependent egocentric distance fields. We also propose two strategies for training the embedding network. We demonstrate that both of these approaches outperform other baselines in both synthetic evaluations and real 3D object reconstruction.
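
To make the asymmetry concrete, here is a toy sketch of a location-dependent distance in which each source embedding carries its own weighting of the space. The per-dimension weights, the function name `egocentric_distance`, and the numbers are illustrative assumptions only; this is not the paper's formulation or its learned network.

```python
def egocentric_distance(source, scales, query):
    """Weighted squared distance measured from the source's point of view.

    `scales` belongs to the source, so swapping the roles of source and
    query generally changes the result (unlike a standard metric).
    """
    return sum(w * (q - s) ** 2 for s, w, q in zip(source, scales, query))

# Two hypothetical embedded shapes, each with its own distance field.
a, scales_a = [0.0, 0.0], [1.0, 1.0]
b, scales_b = [1.0, 0.0], [4.0, 1.0]

d_ab = egocentric_distance(a, scales_a, b)  # how well a can be fit to b
d_ba = egocentric_distance(b, scales_b, a)  # how well b can be fit to a
```

Here `d_ab` and `d_ba` differ because each is computed with a different source's weights, which is the kind of relationship a symmetric metric cannot express.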

LCD: Learned Cross-Domain Descriptors for 2D-3D Matching

Nov 21, 2019
Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung

In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching. Our proposed method is a dual auto-encoder neural network that maps 2D and 3D input into a shared latent space representation. We show that such local cross-domain descriptors in the shared embedding are more discriminative than those obtained from individual training in 2D and 3D domains. To facilitate the training process, we built a new dataset by collecting approximately 1.4 million 2D-3D correspondences with various lighting conditions and settings from publicly available RGB-D scenes. Our descriptor is evaluated in three main experiments: 2D-3D matching, cross-domain retrieval, and sparse-to-dense depth estimation. Experimental results confirm the robustness of our approach as well as its competitive performance, not only in solving cross-domain tasks but also in generalizing to single-domain 2D and 3D tasks. Our dataset and code are released publicly at https://hkust-vgd.github.io/lcd.

* Accepted to AAAI 2020 (Oral)

Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data

Aug 19, 2019
Mikaela Angelina Uy, Quang-Hieu Pham, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

Deep learning techniques for point cloud data have demonstrated great potential in solving classical problems in 3D computer vision such as 3D object classification and segmentation. Several recent 3D object classification methods have reported state-of-the-art performance on CAD model datasets such as ModelNet40 with high accuracy (~92%). Despite such impressive results, in this paper, we argue that object classification is still a challenging task when objects are captured in real-world settings. To show this, we introduce ScanObjectNN, a new real-world point cloud object dataset based on scanned indoor scene data. From our comprehensive benchmark, we show that our dataset poses great challenges to existing point cloud classification techniques, as objects from real-world scans are often cluttered with background and/or are partial due to occlusions. We identify three key open problems for point cloud object classification, and propose new point cloud classification neural networks that achieve state-of-the-art performance on classifying objects with cluttered background. Our dataset and code are publicly available on our project page at https://hkust-vgd.github.io/scanobjectnn/.

* ICCV 2019 Oral
