Wei Wang

School of Information and Communication Engineering, Xidian University, Xi'an, China

M2NO: Multiresolution Operator Learning with Multiwavelet-based Algebraic Multigrid Method

Jun 07, 2024

Zhihao Li, Zhilu Lai, Xiaobo Wang, Wei Wang

Solving partial differential equations (PDEs) effectively necessitates a multi-scale approach, particularly critical in high-dimensional scenarios characterized by increasing grid points or resolution. Traditional methods often fail to capture the detailed features necessary for accurate modeling, presenting a significant challenge in scientific computing. In response, we introduce the Multiwavelet-based Algebraic Multigrid Neural Operator (M2NO), a novel deep learning framework that synergistically combines multiwavelet transformations and algebraic multigrid (AMG) techniques. By exploiting the inherent similarities between these two approaches, M2NO overcomes their individual limitations and enhances precision and flexibility across various PDE benchmarks. Employing Multiresolution Analysis (MRA) with high-pass and low-pass filters, the model executes hierarchical decomposition to accurately delineate both global trends and localized details within PDE solutions, supporting adaptive data representation at multiple scales. M2NO also automates node selection and adeptly manages complex boundary conditions through its multiwavelet-based operators. Extensive evaluations on a diverse array of PDE datasets with different boundary conditions confirm M2NO's superior performance. Furthermore, M2NO excels in handling high-resolution and super-resolution tasks, consistently outperforming competing models and demonstrating robust adaptability in complex computational scenarios.
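The hierarchical decomposition with low-pass and high-pass filters mentioned in the abstract can be illustrated with a minimal sketch. This example swaps in the plain Haar wavelet, which is much simpler than the multiwavelet bases the paper refers to, and it is not the authors' implementation. The function names `haar_step`, `haar_inverse_step`, and `haar_decompose` are hypothetical.

```python
import math

# Assumption: Haar filters stand in for the paper's multiwavelet filters.
_S = 1.0 / math.sqrt(2.0)


def haar_step(signal):
    """One analysis level: low-pass gives the coarse trend, high-pass the local detail."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [_S * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [_S * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Undo one analysis level by recombining coarse and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append(_S * (a + d))
        out.append(_S * (a - d))
    return out


def haar_decompose(signal, levels):
    """Repeatedly split the coarse part, collecting details from fine to coarse."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    coarse, details = haar_decompose([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0], levels=2)
    print("coarse:", coarse)
    print("details (fine to coarse):", details)
```

The coarse coefficients carry the global trend and each detail list carries localized variation at one scale, which is the split between global and local structure that the abstract describes.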

FourierKAN-GCF: Fourier Kolmogorov-Arnold Network -- An Effective and Efficient Feature Transformation for Graph Collaborative Filtering

Jun 04, 2024

Jinfeng Xu, Zheyu Chen, Jinze Li, Shuo Yang, Wei Wang, Xiping Hu, Edith C. -H. Ngai

Graph Collaborative Filtering (GCF) has achieved state-of-the-art performance for recommendation tasks. However, most GCF structures simplify the feature transformation and nonlinear operation during message passing in the graph convolution network (GCN). We revisit these two components and discover that a part of feature transformation and nonlinear operation during message passing in GCN can improve the representation of GCF, but increase the difficulty of training. In this work, we propose a simple and effective graph-based recommendation model called FourierKAN-GCF. Specifically, it utilizes a novel Fourier Kolmogorov-Arnold Network (KAN) to replace the multilayer perceptron (MLP) as a part of the feature transformation during message passing in GCN, which improves the representation power of GCF and is easy to train. We further employ message dropout and node dropout strategies to improve the representation power and robustness of the model. Extensive experiments on two public datasets demonstrate the superiority of FourierKAN-GCF over most state-of-the-art methods. The implementation code is available at https://github.com/Jinfeng-Xu/FKAN-GCF.
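As a rough illustration of what a Fourier-based KAN feature transformation might look like, the sketch below replaces each input-to-output weight with a learnable truncated Fourier series. This form is an assumption based on the model's name, not taken from the abstract or from the linked repository. The function name `fourier_kan_layer` and the coefficient layout are hypothetical.

```python
import math


def fourier_kan_layer(x, cos_coef, sin_coef):
    """Apply a Fourier-series feature transformation to one feature vector.

    x        : list of d_in floats
    cos_coef : nested list of shape [d_out][d_in][K] (cosine coefficients)
    sin_coef : nested list of shape [d_out][d_in][K] (sine coefficients)

    Assumed form: out[j] = sum_i sum_k ( a[j][i][k] * cos((k+1) * x[i])
                                        + b[j][i][k] * sin((k+1) * x[i]) )
    """
    out = []
    for j in range(len(cos_coef)):
        total = 0.0
        for i, xi in enumerate(x):
            for k in range(len(cos_coef[j][i])):
                freq = k + 1  # frequencies 1..K
                total += cos_coef[j][i][k] * math.cos(freq * xi)
                total += sin_coef[j][i][k] * math.sin(freq * xi)
        out.append(total)
    return out


if __name__ == "__main__":
    # Two input features, one output feature, K = 2 frequencies.
    a = [[[0.5, 0.0], [0.0, 0.1]]]
    b = [[[1.0, 0.0], [0.2, 0.0]]]
    print(fourier_kan_layer([0.3, -1.2], a, b))
```

In a trained model the coefficients would be learned parameters, and the layer would be applied to node embeddings during message passing.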

Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks

Jun 02, 2024

Cong Wang, Wei Wang, Chengjin Yu, Jie Mu

Patch-level non-local self-similarity is an important property of natural images. However, most existing methods do not incorporate this property into neural networks for image deraining, which limits recovery performance. Motivated by this property, we find that a rainy image exhibits significant patch recurrence: similar patches tend to recur many times in one image, its multi-scale images, and external images. To better model this property for image deraining, we develop a multi-scale graph network with exemplars, called MSGNN, that contains two branches: 1) an internal data-based supervised branch that models the internal relations of similar patches from the rainy image itself and its multi-scale images, and 2) an external data-participated unsupervised branch that models the external relations of the similar patches in the rainy image and the exemplar. Specifically, we construct a graph model by searching the k-nearest neighboring patches from both the rainy images in a multi-scale framework and the exemplar. After obtaining the corresponding k neighboring patches from the multi-scale images and exemplar, we build a graph and aggregate them in an attentional manner so that the graph can provide more information from similar patches for image deraining. We embed the proposed graph in a deep neural network and train it in an end-to-end manner. Extensive experiments demonstrate that the proposed algorithm performs favorably against eight state-of-the-art methods on five public synthetic datasets and one real-world dataset. The source codes will be available at https://github.com/supersupercong/MSGNN.

* IJCAI-24; Project Page: https://github.com/supersupercong/MSGNN

Multipath Exploitation for Fluctuating Target Detection in RIS-Assisted ISAC Systems

Jun 02, 2024

Shoushuo Zhang, Zichao Xiao, Rang Liu, Ming Li, Wei Wang, Qian Liu

Integrated sensing and communication (ISAC) systems are typically deployed in multipath environments, which is usually deemed as a challenging issue for wireless communications. However, the multipath propagation can also provide extra illumination and observation perspectives for radar sensing, which offers spatial diversity gain for detecting targets with spatial radar cross-section (RCS) fluctuations. In this letter, we propose to utilize reconfigurable intelligent surfaces (RIS) in ISAC systems to provide high-quality and controllable multipath propagation for improving the performance of fluctuating target detection and simultaneously enhancing the quality of communication services. To effectively exploit the spatial diversity offered by RIS-empowered multipath, the dual-functional transmit beamforming and the RIS reflection beamforming are jointly designed to maximize the expectation of radar signal-to-noise ratio (SNR). To solve the resulting complex non-convex optimization problem, we develop an efficient alternating optimization algorithm that utilizes majorization-minimization (MM) and alternating direction method of multipliers (ADMM) algorithms. Simulation results illustrate the advantages of multipath exploitation and the proposed beamforming design algorithm for fluctuating target detection in RIS-assisted ISAC systems.

* submitted to IEEE WCL

Correlation Matching Transformation Transformers for UHD Image Restoration

Jun 02, 2024

Cong Wang, Jinshan Pan, Wei Wang, Gang Fu, Siyuan Liang, Mengzhu Wang, Xiao-Ming Wu, Jun Liu

This paper proposes UHDformer, a general Transformer for Ultra-High-Definition (UHD) image restoration. UHDformer contains two learning spaces: (a) learning in high-resolution space and (b) learning in low-resolution space. The former learns multi-level high-resolution features, fuses low-high features, and reconstructs the residual images, while the latter explores more representative features learned from the high-resolution ones to facilitate better restoration. To improve feature representation in low-resolution space, we propose to build feature transformation from the high-resolution space to the low-resolution one. To that end, we propose two new modules: Dual-path Correlation Matching Transformation module (DualCMT) and Adaptive Channel Modulator (ACM). The DualCMT selects the top C/r correlation channels (r is greater than or equal to 1 and controls the squeezing level) from the max-pooling/mean-pooling high-resolution features to replace low-resolution ones in Transformers, which can effectively squeeze out useless content to improve the feature representation in low-resolution space and facilitate better recovery. The ACM is exploited to adaptively modulate multi-level high-resolution features, enabling it to provide more useful features to low-resolution space for better learning. Experimental results show that our UHDformer reduces model size by about 97% compared with most state-of-the-art methods while significantly improving performance under different training sets on 3 UHD image restoration tasks, including low-light image enhancement, image dehazing, and image deblurring. The source codes will be made available at https://github.com/supersupercong/UHDformer.

* AAAI-24; Source codes, datasets, visual results, and pre-trained models are: https://github.com/supersupercong/UHDformer

Exploring Channel Estimation and Signal Detection for ODDM-based ISAC Systems

Jun 01, 2024

Dezhi Wang, Chongwen Huang, Lei Liu, Xiaoming Chen, Wei Wang, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

Inspired by providing reliable communications for high-mobility scenarios, in this letter, we investigate the channel estimation and signal detection in integrated sensing and communication (ISAC) systems based on the orthogonal delay-Doppler multiplexing (ODDM) modulation, which consists of a pulse-train that can achieve the orthogonality with respect to the resolution of the delay-Doppler (DD) plane. To enhance the communication performance in the ODDM-based ISAC systems, we first propose a low-complexity approximation algorithm for channel estimation, which addresses the challenge of the high complexity from high resolution in the ODDM modulation, and achieves performance close to that of the maximum likelihood estimator scheme. Then, we employ the orthogonal approximate message-passing scheme to detect the symbols in the communication process based on the estimated channel information. Finally, simulation results show that the detection performance of ODDM is better than other multi-carrier modulation schemes. Specifically, the ODDM outperforms the orthogonal time frequency space scheme by 2.3 dB when the bit error ratio is $10^{-6}$.

* accepted by IEEE Wireless Communications Letters

Enhancing Large Vision Language Models with Self-Training on Image Comprehension

May 30, 2024

Yihe Deng, Pan Lu, Fan Yin, Ziniu Hu, Sheng Shen, James Zou, Kai-Wei Chang, Wei Wang

Large vision language models (LVLMs) integrate large language models (LLMs) with pre-trained vision encoders, thereby activating the perception capability of the model to understand image inputs for different queries and conduct subsequent reasoning. Improving this capability requires high-quality vision-language data, which is costly and labor-intensive to acquire. Self-training approaches have been effective in single-modal settings to alleviate the need for labeled data by leveraging model's own generation. However, effective self-training remains a challenge regarding the unique visual perception and reasoning capability of LVLMs. To address this, we introduce Self-Training on Image Comprehension (STIC), which emphasizes a self-training approach specifically for image comprehension. First, the model self-constructs a preference dataset for image descriptions using unlabeled images. Preferred responses are generated through a step-by-step prompt, while dis-preferred responses are generated from either corrupted images or misleading prompts. To further self-improve reasoning on the extracted visual information, we let the model reuse a small portion of existing instruction-tuning data and append its self-generated image descriptions to the prompts. We validate the effectiveness of STIC across seven different benchmarks, demonstrating substantial performance gains of 4.0% on average while using 70% less supervised fine-tuning data than the current method. Further studies investigate various components of STIC and highlight its potential to leverage vast quantities of unlabeled images for self-training. Code and data are made publicly available.

* 19 pages, 14 figures, 6 tables

Cross-Context Backdoor Attacks against Graph Prompt Learning

May 28, 2024

Xiaoting Lyu, Yufei Han, Wei Wang, Hangwei Qian, Ivor Tsang, Xiangliang Zhang

Graph Prompt Learning (GPL) bridges significant disparities between pretraining and downstream applications to alleviate the knowledge transfer bottleneck in real-world graph learning. While GPL offers superior effectiveness in graph knowledge transfer and computational efficiency, the security risks posed by backdoor poisoning effects embedded in pretrained models remain largely unexplored. Our study provides a comprehensive analysis of GPL's vulnerability to backdoor attacks. We introduce CrossBA, the first cross-context backdoor attack against GPL, which manipulates only the pretraining phase without requiring knowledge of downstream applications. Our investigation reveals both theoretically and empirically that tuning trigger graphs, combined with prompt transformations, can seamlessly transfer the backdoor threat from pretrained encoders to downstream applications. Through extensive experiments involving 3 representative GPL methods across 5 distinct cross-context scenarios and 5 benchmark datasets of node and graph classification tasks, we demonstrate that CrossBA consistently achieves high attack success rates while preserving the functionality of downstream applications over clean input. We also explore potential countermeasures against CrossBA and conclude that current defenses are insufficient to mitigate CrossBA. Our study highlights the persistent backdoor threats to GPL systems, raising trustworthiness concerns in the practices of GPL techniques.

* Accepted by KDD 2024

Mixture of In-Context Prompters for Tabular PFNs

May 25, 2024

Derek Xu, Olcay Cirit, Reza Asadi, Yizhou Sun, Wei Wang

Recent benchmarks found In-Context Learning (ICL) outperforms both deep learning and tree-based algorithms on small tabular datasets. However, on larger datasets, ICL for tabular learning cannot run without severely compromising performance, due to its quadratic space and time complexity w.r.t. dataset size. We propose MIXTUREPFN, which both extends nearest-neighbor sampling to the state-of-the-art ICL for tabular learning model and uses bootstrapping to finetune said model on the inference-time dataset. MIXTUREPFN is the Condorcet winner across 36 diverse tabular datasets against 19 strong deep learning and tree-based baselines, achieving the highest mean rank among Top-10 aforementioned algorithms with statistical significance.

* 32 pages, 16 figures

Fighter flight trajectory prediction based on spatio-temporal graphical attention network

May 13, 2024

Yao Sun, Tengyu Jing, Jiapeng Wang, Wei Wang

Quickly and accurately predicting the flight trajectory of a blue army fighter in close-range air combat helps a red army fighter gain a dominant situation, which is the winning factor in later air combat. However, due to the high speed and even hypersonic capabilities of advanced fighters, the diversity of tactical maneuvers, and the instantaneous nature of situational transitions, it is difficult to meet the requirements of practical combat applications in terms of prediction accuracy. To improve prediction accuracy, this paper proposes a spatio-temporal graph attention network (ST-GAT) using encoding and decoding structures to predict the flight trajectory. The encoder adopts a parallel structure of Transformer and GAT branches embedded with the multi-head self-attention mechanism in each front end. The Transformer branch network is used to extract the temporal characteristics of historical trajectories and capture the impact of the fighter's historical state on future trajectories, while the GAT branch network is used to extract spatial features in historical trajectories and capture potential spatial correlations between fighters. Then we concatenate the outputs of the two branches into a new feature vector and input it into a decoder composed of a fully connected network to predict the future position coordinates of the blue army fighter. The computer simulation results show that the proposed network significantly improves the prediction accuracy of flight trajectories compared to the enhanced CNN-LSTM network (ECNN-LSTM), with improvements of 47% and 34% in the ADE and FDE indicators, providing strong support for subsequent autonomous combat missions.
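The abstract does not expand ADE and FDE. Assuming they carry their usual meaning in trajectory prediction (average displacement error and final displacement error), the two indicators could be computed as in the sketch below. The function name `displacement_errors` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for one predicted trajectory.

    predicted, actual : equal-length sequences of position tuples, e.g. (x, y, z).
    ADE is the mean Euclidean distance over all time steps.
    FDE is the Euclidean distance at the last time step only.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


if __name__ == "__main__":
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    true = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 2.0)]
    ade, fde = displacement_errors(pred, true)
    print(f"ADE={ade:.3f}, FDE={fde:.3f}")
```

Under this reading, ADE summarizes accuracy along the whole predicted path, while FDE isolates the error at the prediction horizon.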
