Point cloud registration is a crucial technique in 3D computer vision with a wide range of applications. However, this task can be challenging, particularly in large fields of view with dynamic objects, environmental noise, or other perturbations. To address this challenge, we propose a model called PosDiffNet. Our approach performs hierarchical registration based on window-level, patch-level, and point-level correspondence. We leverage a graph neural partial differential equation (PDE) based on Beltrami flow to obtain high-dimensional features and position embeddings for point clouds. We incorporate position embeddings into a Transformer module based on a neural ordinary differential equation (ODE) to efficiently represent patches within points. We employ the multi-level correspondence derived from the high feature similarity scores to facilitate alignment between point clouds. Subsequently, we use registration methods such as SVD-based algorithms to predict the transformation using corresponding point pairs. We evaluate PosDiffNet on several 3D point cloud datasets, verifying that it achieves state-of-the-art (SOTA) performance for point cloud registration in large fields of view with perturbations. The implementation code of experiments is available at https://github.com/AI-IT-AVs/PosDiffNet.
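The final step named above, predicting a transformation from corresponding point pairs with an SVD-based algorithm, can be sketched as follows. This is a minimal illustration using the standard Kabsch procedure, not the PosDiffNet implementation; the function name `estimate_rigid_transform` is hypothetical, and the correspondences are assumed to be given and outlier-free.

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding points (assumed already matched).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that det(R) = +1 (rotation, not reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

In a hierarchical pipeline, a routine like this would typically be applied to the point pairs that survive the similarity-based matching stages, possibly with per-pair weights, which this sketch omits.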
Graphons are limit objects of sequences of graphs, used to analyze the behavior of large graphs. Recently, graphon signal processing has been developed to study large graphs from the signal processing perspective. However, it has the shortcoming that any sparse sequence of graphs always converges to the zero graphon, and the resulting signal processing theory is trivial. In this paper, we propose a signal processing framework based on the generalized graphon theory. The main ingredient is to use the stretched cut distance to compare these graphons. We focus on sampling graph sequences from generalized graphons, and discuss convergence results for the associated operators, spectra, and signals. Though the paper is theoretical, we also discuss what the theory implies for real large networks.
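The shortcoming mentioned above can be seen numerically. For a nonnegative kernel, the cut norm equals the integral of the kernel over the unit square, so the cut distance between a graph's step-function graphon and the zero graphon is simply the mean adjacency entry. The sketch below assumes an Erdős–Rényi model with a fixed average degree as the sparse sequence; the function names are hypothetical and the model is chosen only for illustration.

```python
import numpy as np

def sample_sparse_graph(n, avg_degree, rng):
    """Symmetric 0/1 adjacency matrix of G(n, p) with p = avg_degree / n."""
    p = avg_degree / n
    upper = np.triu(rng.random((n, n)) < p, k=1)  # strictly upper triangle
    return (upper | upper.T).astype(float)

def cut_distance_to_zero(adj):
    """Cut norm of the step-function graphon induced by adj.

    For a nonnegative kernel the supremum in the cut norm is attained on the
    whole unit square, so it reduces to the mean entry of the matrix.
    """
    n = adj.shape[0]
    return adj.sum() / n**2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (100, 400, 1600):
        A = sample_sparse_graph(n, avg_degree=4.0, rng=rng)
        print(n, cut_distance_to_zero(A))
```

With the average degree held fixed, the printed values shrink roughly like 1/n, which is why the classical graphon limit carries no information for such sequences.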
The utilization of multi-modal sensor data in visual place recognition (VPR) has demonstrated enhanced performance compared to single-modal counterparts. Nonetheless, integrating additional sensors comes with elevated costs and may not be feasible for systems that demand lightweight operation, thereby impacting the practical deployment of VPR. To address this issue, we resort to knowledge distillation, which empowers single-modal students to learn from cross-modal teachers without introducing additional sensors during inference. Despite the notable advancements achieved by current distillation approaches, feature relationships remain under-explored. To tackle the challenge of cross-modal distillation in VPR, we present DistilVPR, a novel distillation pipeline for VPR. We propose leveraging feature relationships from multiple agents, including self-agents and cross-agents for teacher and student neural networks. Furthermore, we integrate various manifolds, characterized by different space curvatures, for exploring feature relationships. This approach increases the diversity of feature relationships through Euclidean, spherical, and hyperbolic relationship modules, thereby enhancing the overall representational capacity. The experiments demonstrate that our proposed pipeline achieves state-of-the-art performance compared to other distillation baselines. We also conduct ablation studies to demonstrate the effectiveness of the design. The code is released at: https://github.com/sijieaaa/DistilVPR
This correspondence points out a technical error in Proposition 4 of the paper [1]. Because of this error, the proofs of Lemma 3, Theorem 1, Theorem 3, Proposition 2, and Theorem 4 in that paper are no longer valid. We provide counterexamples to Proposition 4 and discuss where the flaw in its proof lies. We also provide numerical evidence indicating that Lemma 3, Theorem 1, and Proposition 2 are likely to be false. Since the proof of Theorem 4 depends on the validity of Proposition 4, we propose an amendment to the statement of Theorem 4 of the paper using convergence in operator norm and prove this rigorously. In addition, we also provide a construction that guarantees convergence in the sense of Proposition 4.
Topological Signal Processing (TSP) utilizes simplicial complexes to model structures with higher order than vertices and edges. In this paper, we study the transferability of TSP via a generalized higher-order version of graphon, known as complexon. We recall the notion of a complexon as the limit of a simplicial complex sequence. Inspired by the integral operator form of graphon shift operators, we construct a marginal complexon and complexon shift operator (CSO) according to components of all possible dimensions from the complexon. We investigate the CSO's eigenvalues and eigenvectors, and relate them to a new family of weighted adjacency matrices. We prove that when a simplicial complex sequence converges to a complexon, the eigenvalues of the corresponding CSOs converge to those of the limit complexon. This conclusion is further verified by a numerical experiment. These results hint at learning transferability on large simplicial complexes or simplicial complex sequences, generalizing the graphon signal processing framework.
Graphons have traditionally served as limit objects for dense graph sequences, with the cut distance serving as the metric for convergence. However, sparse graph sequences converge to the trivial graphon under the conventional definition of cut distance, which makes this framework inadequate for many practical applications. In this paper, we utilize the concepts of generalized graphons and stretched cut distance to describe the convergence of sparse graph sequences. Specifically, we consider a random graph process generated from a generalized graphon. This random graph process converges to the generalized graphon in stretched cut distance. We use this random graph process to model the growing sparse graph, and prove the convergence of the adjacency matrices' eigenvalues. We supplement our findings with experimental validation. Our results indicate the possibility of transfer learning between sparse graphs.
In generalized graph signal processing (GGSP), the signal associated with each vertex in a graph is an element from a Hilbert space. In this paper, we study GGSP signal reconstruction as a kernel ridge regression (KRR) problem. By devising an appropriate kernel, we show that this problem has a solution that can be evaluated in a distributed way. We interpret the problem and solution using both deterministic and Bayesian perspectives and link them to existing graph signal processing and GGSP frameworks. We then provide an online implementation via random Fourier features. Under the Bayesian framework, we investigate the statistical performance under the asymptotic sampling scheme. Finally, we validate our theory and methods on real-world datasets.
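To make the combination of kernel ridge regression and random Fourier features concrete, here is a minimal batch sketch. It assumes a plain Gaussian kernel exp(-gamma * ||x - x'||^2) on Euclidean inputs, which is not the kernel devised in the paper, and all function names are hypothetical. The random features turn the kernel problem into an ordinary ridge regression in a finite-dimensional feature space.

```python
import numpy as np

def make_rff(dim, num_features, gamma, rng):
    """Draw random frequencies and phases for the kernel exp(-gamma * ||x - y||^2).

    The spectral density of this kernel is Gaussian with covariance 2 * gamma * I.
    """
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return W, b

def rff_features(X, W, b):
    """Map inputs X of shape (n, dim) to features Z with Z @ Z.T approximating K."""
    num_features = W.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def fit_ridge(Z, y, lam):
    """Ridge regression in feature space: solve (Z^T Z + lam * I) theta = Z^T y."""
    num_features = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(num_features), Z.T @ y)

def predict(X, W, b, theta):
    """Evaluate the fitted model at new inputs X."""
    return rff_features(X, W, b) @ theta
```

Because the model is linear in `theta`, the same feature map lends itself to sample-by-sample updates (for example, recursive least squares), which is one reason random features are a common route to online variants.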
Topological signal processing (TSP) over simplicial complexes typically assumes observations associated with the simplicial complexes are real scalars. In this paper, we develop TSP theories for the case where observations belong to abelian groups more general than real numbers, including function spaces that are commonly used to represent time-varying signals. Our approach generalizes the Hodge decomposition and allows for signal processing tasks to be performed on these more complex observations. We propose a unified and flexible framework for TSP that expands its applicability to a wider range of signal processing applications. Numerical results demonstrate the effectiveness of this approach and provide a foundation for future research in this area.
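For reference, the classical real-valued Hodge decomposition that this work generalizes can be sketched on a toy complex. The example below is an assumption-laden illustration, not the paper's construction: it uses a hypothetical complex with four vertices, five edges, and one filled triangle, and splits an edge signal into gradient, curl, and harmonic parts via orthogonal projections.

```python
import numpy as np

# Toy complex (hypothetical): edges oriented from lower to higher vertex index.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
num_vertices = 4

# B1: vertex-to-edge boundary matrix (-1 at the tail, +1 at the head).
B1 = np.zeros((num_vertices, len(edges)))
for k, (i, j) in enumerate(edges):
    B1[i, k] = -1.0
    B1[j, k] = 1.0

# B2: edge-to-triangle boundary matrix for the single triangle (0, 1, 2),
# whose boundary is (0,1) - (0,2) + (1,2).
B2 = np.array([[1.0], [-1.0], [1.0], [0.0], [0.0]])

def hodge_decompose(f, B1, B2):
    """Split an edge signal f into gradient, curl, and harmonic components."""
    grad = B1.T @ np.linalg.pinv(B1.T) @ f   # projection onto im(B1^T)
    curl = B2 @ np.linalg.pinv(B2) @ f       # projection onto im(B2)
    harm = f - grad - curl                   # remainder lies in ker(B1) and ker(B2^T)
    return grad, curl, harm

if __name__ == "__main__":
    f = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    grad, curl, harm = hodge_decompose(f, B1, B2)
    print(grad, curl, harm)
```

The three components are mutually orthogonal because B1 @ B2 = 0; the harmonic part here captures circulation around the unfilled cycle through vertices 1, 2, and 3.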
In this paper, we propose a framework for graph signal processing using category theory. The aim is to generalize a few recent works on probabilistic approaches to graph signal processing, which handle signal and graph uncertainties.
Graph signal processing (GSP) studies graph-structured data, where the central concept is the vector space of graph signals. To study a vector space, we have many useful tools up our sleeves. However, uncertainty is omnipresent in practice, and using a vector to model a real signal can be erroneous in some situations. In this paper, we want to use the Wasserstein space as a replacement for the vector space of graph signals, to account for signal stochasticity. The Wasserstein space is strictly more general, and the classical graph signal space embeds into it isometrically. An element in the Wasserstein space is called a distributional graph signal. On the other hand, signal processing for a probability space of graphs has been proposed in the literature. In this work, we propose a unified framework that also encompasses existing theories regarding graph uncertainty. We develop signal processing tools to study the new notion of distributional graph signals. We also demonstrate how the theory can be applied by using real datasets.
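The isometric embedding mentioned above sends a deterministic signal x to the point mass at x, and the 2-Wasserstein distance between two point masses is the Euclidean distance between the signals. One way to see this numerically is through the closed-form 2-Wasserstein distance between Gaussians, which reduces to the Euclidean distance as the covariances vanish. The sketch below is only an illustration under the assumption of Gaussian distributional signals; the function names are hypothetical and nothing here is taken from the paper's toolset.

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T

def gaussian_w2(m1, S1, m2, S2):
    """2-Wasserstein distance between N(m1, S1) and N(m2, S2).

    W2^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 * (S1^{1/2} S2 S1^{1/2})^{1/2}).
    """
    r = psd_sqrt(S1)
    cross = psd_sqrt(r @ S2 @ r)
    bures = np.trace(S1 + S2 - 2.0 * cross)
    return np.sqrt(np.sum((m1 - m2) ** 2) + max(bures, 0.0))

if __name__ == "__main__":
    x = np.array([1.0, 0.0, 2.0])
    y = np.array([0.0, 1.0, 2.0])
    zero = np.zeros((3, 3))
    # Degenerate (zero-covariance) Gaussians behave like ordinary graph signals.
    print(gaussian_w2(x, zero, y, zero), np.linalg.norm(x - y))
```

With nonzero covariances, the extra trace term measures how much the two signals differ in their uncertainty, which is information that a plain vector model cannot carry.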