Models, code, and papers for "Vladimir G. Kim":

Tags2Parts: Discovering Semantic Regions from Shape Tags

Apr 30, 2018
Sanjeev Muralikrishnan, Vladimir G. Kim, Siddhartha Chaudhuri

We propose a novel method for discovering shape regions that strongly correlate with user-prescribed tags. For example, given a collection of chairs tagged as either "has armrest" or "lacks armrest", our system correctly highlights the armrest regions as the main distinctive parts between the two chair types. To obtain point-wise predictions from shape-wise tags we develop a novel neural network architecture that is trained with tag classification loss, but is designed to rely on segmentation to predict the tag. Our network is inspired by U-Net, but we replicate shallow U structures several times with new skip connections and pooling layers, and call the resulting architecture "WU-Net". We test our method on segmentation benchmarks and show that even with weak supervision of whole shape tags, our method can infer meaningful semantic regions, without ever observing shape segmentations. Further, once trained, the model can process shapes for which the tag is entirely unknown. As a bonus, our architecture is directly operational under full supervision and performs strongly on standard benchmarks. We validate our method through experiments with many variant architectures and prior baselines, and demonstrate several applications.

* To appear at CVPR 2018 

  Click for Model/Code and Paper
Neural Puppet: Generative Layered Cartoon Characters

Oct 04, 2019
Omid Poursaeed, Vladimir G. Kim, Eli Shechtman, Jun Saito, Serge Belongie

We propose a learning based method for generating new animations of a cartoon character given a few example images. Our method is designed to learn from a traditionally animated sequence, where each frame is drawn by an artist, and thus the input images lack any common structure, correspondences, or labels. We express pose changes as a deformation of a layered 2.5D template mesh, and devise a novel architecture that learns to predict mesh deformations matching the template to a target image. This enables us to extract a common low-dimensional structure from a diverse set of character poses. We combine recent advances in differentiable rendering as well as mesh-aware models to successfully align common template even if only a few character images are available during training. In addition to coarse poses, character appearance also varies due to shading, out-of-plane motions, and artistic effects. We capture these subtle changes by applying an image translation network to refine the mesh rendering, providing an end-to-end model to generate new animations of a character with high visual quality. We demonstrate that our generative model can be used to synthesize in-between frames and to create data-driven deformation. Our template fitting procedure outperforms state-of-the-art generic techniques for detecting image correspondences.


  Click for Model/Code and Paper
Deep Parametric Shape Predictions using Distance Fields

Apr 18, 2019
Dmitriy Smirnov, Matthew Fisher, Vladimir G. Kim, Richard Zhang, Justin Solomon

Many tasks in graphics and vision demand machinery for converting shapes into representations with sparse sets of parameters; these representations facilitate rendering, editing, and storage. When the source data is noisy or ambiguous, however, artists and engineers often manually construct such representations, a tedious and potentially time-consuming process. While advances in deep learning have been successfully applied to noisy geometric data, the task of generating parametric shapes has so far been difficult for these methods. Hence, we propose a new framework for predicting parametric shape primitives using deep learning. We use distance fields to transition between shape parameters like control points and input data on a raster grid. We demonstrate efficacy on 2D and 3D tasks, including font vectorization and surface abstraction.


  Click for Model/Code and Paper
Unsupervised cycle-consistent deformation for shape matching

Jul 06, 2019
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, Mathieu Aubry

We propose a self-supervised approach to deep surface deformation. Given a pair of shapes, our algorithm directly predicts a parametric transformation from one shape to the other respecting correspondences. Our insight is to use cycle-consistency to define a notion of good correspondences in groups of objects and use it as a supervisory signal to train our network. Our method does not rely on a template, assume near isometric deformations or rely on point-correspondence supervision. We demonstrate the efficacy of our approach by using it to transfer segmentation across shapes. We show, on Shapenet, that our approach is competitive with comparable state-of-the-art methods when annotated training data is readily available, but outperforms them by a large margin in the few-shot segmentation scenario.


  Click for Model/Code and Paper
3D-CODED : 3D Correspondences by Deep Deformation

Jul 27, 2018
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, Mathieu Aubry

We present a new deep learning approach for matching deformable shapes by introducing {\it Shape Deformation Networks} which jointly encode 3D shapes and correspondences. This is achieved by factoring the surface representation into (i) a template, that parameterizes the surface, and (ii) a learnt global feature vector that parameterizes the transformation of the template into the input surface. By predicting this feature for a new shape, we implicitly predict correspondences between this shape and the template. We show that these correspondences can be improved by an additional step which improves the shape feature by minimizing the Chamfer distance between the input and transformed template. We demonstrate that our simple approach improves on state-of-the-art results on the difficult FAUST-inter challenge, with an average correspondence error of 2.88cm. We show, on the TOSCA dataset, that our method is robust to many types of perturbations, and generalizes to non-human shapes. This robustness allows it to perform well on real unclean, meshes from the the SCAPE dataset.


  Click for Model/Code and Paper
AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation

Jul 20, 2018
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, Mathieu Aubry

We introduce a method for learning to generate the surface of 3D shapes. Our approach represents a 3D shape as a collection of parametric surface elements and, in contrast to methods generating voxel grids or point clouds, naturally infers a surface representation of the shape. Beyond its novelty, our new shape generation framework, AtlasNet, comes with significant advantages, such as improved precision and generalization capabilities, and the possibility to generate a shape of arbitrary resolution without memory issues. We demonstrate these benefits and compare to strong baselines on the ShapeNet benchmark for two applications: (i) auto-encoding shapes, and (ii) single-view reconstruction from a still image. We also provide results showing its potential for other applications, such as morphing, parametrization, super-resolution, matching, and co-segmentation.


  Click for Model/Code and Paper
Learning Local Shape Descriptors from Part Correspondences With Multi-view Convolutional Networks

Sep 05, 2017
Haibin Huang, Evangelos Kalogerakis, Siddhartha Chaudhuri, Duygu Ceylan, Vladimir G. Kim, Ersin Yumer

We present a new local descriptor for 3D shapes, directly applicable to a wide range of shape analysis problems such as point correspondences, semantic segmentation, affordance prediction, and shape-to-scan matching. The descriptor is produced by a convolutional network that is trained to embed geometrically and semantically similar points close to one another in descriptor space. The network processes surface neighborhoods around points on a shape that are captured at multiple scales by a succession of progressively zoomed out views, taken from carefully selected camera positions. We leverage two extremely large sources of data to train our network. First, since our network processes rendered views in the form of 2D images, we repurpose architectures pre-trained on massive image datasets. Second, we automatically generate a synthetic dense point correspondence dataset by non-rigid alignment of corresponding shape parts in a large collection of segmented 3D models. As a result of these design choices, our network effectively encodes multi-scale local context and fine-grained surface detail. Our network can be trained to produce either category-specific descriptors or more generic descriptors by learning from multiple shape categories. Once trained, at test time, the network extracts local descriptors for shapes without requiring any part segmentation as input. Our method can produce effective local descriptors even for shapes whose category is unknown or different from the ones used while training. We demonstrate through several experiments that our learned local descriptors are more discriminative compared to state of the art alternatives, and are effective in a variety of shape analysis applications.


  Click for Model/Code and Paper
Learning elementary structures for 3D shape generation and matching

Aug 14, 2019
Theo Deprelle, Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, Mathieu Aubry

We propose to represent shapes as the deformation and combination of learnable elementary 3D structures, which are primitives resulting from training over a collection of shape. We demonstrate that the learned elementary 3D structures lead to clear improvements in 3D shape generation and matching. More precisely, we present two complementary approaches for learning elementary structures: (i) patch deformation learning and (ii) point translation learning. Both approaches can be extended to abstract structures of higher dimensions for improved results. We evaluate our method on two tasks: reconstructing ShapeNet objects and estimating dense correspondences between human scans (FAUST inter challenge). We show 16% improvement over surface deformation approaches for shape reconstruction and outperform FAUST inter challenge state of the art by 6%.


  Click for Model/Code and Paper
Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction

Mar 20, 2019
Chen-Hsuan Lin, Oliver Wang, Bryan C. Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, Simon Lucey

In this paper, we address the problem of 3D object mesh reconstruction from RGB videos. Our approach combines the best of multi-view geometric and data-driven methods for 3D reconstruction by optimizing object meshes for multi-view photometric consistency while constraining mesh deformations with a shape prior. We pose this as a piecewise image alignment problem for each mesh face projection. Our approach allows us to update shape parameters from the photometric error without any depth or mask information. Moreover, we show how to avoid a degeneracy of zero photometric gradients via rasterizing from a virtual viewpoint. We demonstrate 3D object mesh reconstruction results from both synthetic and real-world videos with our photometric mesh optimization, which is unachievable with either na\"ive mesh generation networks or traditional pipelines of surface reconstruction without heavy manual post-processing.

* Accepted to CVPR 2019 (project page & code: https://chenhsuanlin.bitbucket.io/photometric-mesh-optim/) 

  Click for Model/Code and Paper
Learning Material-Aware Local Descriptors for 3D Shapes

Oct 20, 2018
Hubert Lin, Melinos Averkiou, Evangelos Kalogerakis, Balazs Kovacs, Siddhant Ranade, Vladimir G. Kim, Siddhartha Chaudhuri, Kavita Bala

Material understanding is critical for design, geometric modeling, and analysis of functional objects. We enable material-aware 3D shape analysis by employing a projective convolutional neural network architecture to learn material- aware descriptors from view-based representations of 3D points for point-wise material classification or material- aware retrieval. Unfortunately, only a small fraction of shapes in 3D repositories are labeled with physical mate- rials, posing a challenge for learning methods. To address this challenge, we crowdsource a dataset of 3080 3D shapes with part-wise material labels. We focus on furniture models which exhibit interesting structure and material variabil- ity. In addition, we also contribute a high-quality expert- labeled benchmark of 115 shapes from Herman-Miller and IKEA for evaluation. We further apply a mesh-aware con- ditional random field, which incorporates rotational and reflective symmetries, to smooth our local material predic- tions across neighboring surface patches. We demonstrate the effectiveness of our learned descriptors for automatic texturing, material-aware retrieval, and physical simulation. The dataset and code will be publicly available.

* 3DV 2018 

  Click for Model/Code and Paper
Task-Discriminative Domain Alignment for Unsupervised Domain Adaptation

Sep 26, 2019
Behnam Gholami, Pritish Sahu, Minyoung Kim, Vladimir Pavlovic

Domain Adaptation (DA), the process of effectively adapting task models learned on one domain, the source, to other related but distinct domains, the targets, with no or minimal retraining, is typically accomplished using the process of source-to-target manifold alignment. However, this process often leads to unsatisfactory adaptation performance, in part because it ignores the task-specific structure of the data. In this paper, we improve the performance of DA by introducing a discriminative discrepancy measure which takes advantage of auxiliary information available in the source and the target domains to better align the source and target distributions. Specifically, we leverage the cohesive clustering structure within individual data manifolds, associated with different tasks, to improve the alignment. This structure is explicit in the source, where the task labels are available, but is implicit in the target, making the problem challenging. We address the challenge by devising a deep DA framework, which combines a new task-driven domain alignment discriminator with domain regularizers that encourage the shared features as task-specific and domain invariant, and prompt the task model to be data structure preserving, guiding its decision boundaries through the low density data regions. We validate our framework on standard benchmarks, including Digits (MNIST, USPS, SVHN, MNIST-M), PACS, and VisDA. Our results show that our proposed model consistently outperforms the state-of-the-art in unsupervised domain adaptation.

* This paper is accepted for ORAL presentation at the ICCV 2019 MDALC Workshop 

  Click for Model/Code and Paper
Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach

Feb 23, 2019
Minyoung Kim, Pritish Sahu, Behnam Gholami, Vladimir Pavlovic

In unsupervised domain adaptation, it is widely known that the target domain error can be provably reduced by having a shared input representation that makes the source and target domains indistinguishable from each other. Very recently it has been studied that not just matching the marginal input distributions, but the alignment of output (class) distributions is also critical. The latter can be achieved by minimizing the maximum discrepancy of predictors (classifiers). In this paper, we adopt this principle, but propose a more systematic and effective way to achieve hypothesis consistency via Gaussian processes (GP). The GP allows us to define/induce a hypothesis space of the classifiers from the posterior distribution of the latent random functions, turning the learning into a simple large-margin posterior separation problem, far easier to solve than previous approaches based on adversarial minimax optimization. We formulate a learning objective that effectively pushes the posterior to minimize the maximum discrepancy. This is further shown to be equivalent to maximizing margins and minimizing uncertainty of the class predictions in the target domain, a well-established principle in classical (semi-)supervised learning. Empirical results demonstrate that our approach is comparable or superior to the existing methods on several benchmark domain adaptation datasets.


  Click for Model/Code and Paper
FAN: Focused Attention Networks

Jun 01, 2019
Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi

Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through appropriate weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence, represented by word embedding. Thus far, however, the learning of attention weights has been driven solely by the minimization of task specific loss functions. We here introduce a method of learning attention weights to better emphasize informative pair-wise relations between entities. The key idea is to use a novel center-mass cross entropy loss, which can be applied in conjunction with the task specific ones. We then introduce a focused attention backbone to learn these attention weights for general tasks. We demonstrate that the focused attention module leads to a new state-of-the-art for the recovery of relations in a relationship proposal task. Our experiments show that it also boosts performance for diverse vision and language tasks, including object detection, scene categorization and document classification.


  Click for Model/Code and Paper
Defensive forecasting for linear protocols

Sep 24, 2005
Vladimir Vovk, Ilia Nouretdinov, Akimichi Takemura, Glenn Shafer

We consider a general class of forecasting protocols, called "linear protocols", and discuss several important special cases, including multi-class forecasting. Forecasting is formalized as a game between three players: Reality, whose role is to generate observations; Forecaster, whose goal is to predict the observations; and Skeptic, who tries to make money on any lack of agreement between Forecaster's predictions and the actual observations. Our main mathematical result is that for any continuous strategy for Skeptic in a linear protocol there exists a strategy for Forecaster that does not allow Skeptic's capital to grow. This result is a meta-theorem that allows one to transform any continuous law of probability in a linear protocol into a forecasting strategy whose predictions are guaranteed to satisfy this law. We apply this meta-theorem to a weak law of large numbers in Hilbert spaces to obtain a version of the K29 prediction algorithm for linear protocols and show that this version also satisfies the attractive properties of proper calibration and resolution under a suitable choice of its kernel parameter, with no assumptions about the way the data is generated.

* 16 pages 

  Click for Model/Code and Paper