Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yu Cheng

Optimization of Worker Scheduling at Logistics Depots Using Genetic Algorithms and Simulated Annealing

May 20, 2024

Jinxin Xu, Haixin Wu, Yu Cheng, Liyang Wang, Xin Yang, Xintong Fu, Yuelong Su

This paper addresses the optimization of scheduling for workers at a logistics depot using a combination of genetic algorithm and simulated annealing algorithm. The efficient scheduling of permanent and temporary workers is crucial for optimizing the efficiency of the logistics depot while minimizing labor usage. The study begins by establishing a 0-1 integer linear programming model, with decision variables determining the scheduling of permanent and temporary workers for each time slot on a given day. The objective function aims to minimize person-days, while constraints ensure fulfillment of hourly labor requirements, limit workers to one time slot per day, cap consecutive working days for permanent workers, and maintain non-negativity and integer constraints. The model is then solved using genetic algorithms and simulated annealing. Results indicate that, for this problem, genetic algorithms outperform simulated annealing in terms of solution quality. The optimal solution reveals a minimum of 29857 person-days.

Via

Access Paper or Ask Questions

Research on Credit Risk Early Warning Model of Commercial Banks Based on Neural Network Algorithm

May 17, 2024

Yu Cheng, Qin Yang, Liyang Wang, Ao Xiang, Jingyu Zhang

In the realm of globalized financial markets, commercial banks are confronted with an escalating magnitude of credit risk, thereby imposing heightened requisites upon the security of bank assets and financial stability. This study harnesses advanced neural network techniques, notably the Backpropagation (BP) neural network, to pioneer a novel model for preempting credit risk in commercial banks. The discourse initially scrutinizes conventional financial risk preemptive models, such as ARMA, ARCH, and Logistic regression models, critically analyzing their real-world applications. Subsequently, the exposition elaborates on the construction process of the BP neural network model, encompassing network architecture design, activation function selection, parameter initialization, and objective function construction. Through comparative analysis, the superiority of neural network models in preempting credit risk in commercial banks is elucidated. The experimental segment selects specific bank data, validating the model's predictive accuracy and practicality. Research findings evince that this model efficaciously enhances the foresight and precision of credit risk management.

Via

Access Paper or Ask Questions

Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics

Apr 26, 2024

Ao Xiang, Jingyu Zhang, Qin Yang, Liyang Wang, Yu Cheng

With the development and widespread application of digital image processing technology, image splicing has become a common method of image manipulation, raising numerous security and legal issues. This paper introduces a new splicing image detection algorithm based on the statistical characteristics of natural images, aimed at improving the accuracy and efficiency of splicing image detection. By analyzing the limitations of traditional methods, we have developed a detection framework that integrates advanced statistical analysis techniques and machine learning methods. The algorithm has been validated using multiple public datasets, showing high accuracy in detecting spliced edges and locating tampered areas, as well as good robustness. Additionally, we explore the potential applications and challenges faced by the algorithm in real-world scenarios. This research not only provides an effective technological means for the field of image tampering detection but also offers new ideas and methods for future related research.

Via

Access Paper or Ask Questions

Research on Detection of Floating Objects in River and Lake Based on AI Intelligent Image Recognition

Apr 10, 2024

Jingyu Zhang, Ao Xiang, Yu Cheng, Qin Yang, Liyang Wang

With the rapid advancement of artificial intelligence technology, AI-enabled image recognition has emerged as a potent tool for addressing challenges in traditional environmental monitoring. This study focuses on the detection of floating objects in river and lake environments, exploring an innovative approach based on deep learning. By intricately analyzing the technical pathways for detecting static and dynamic features and considering the characteristics of river and lake debris, a comprehensive image acquisition and processing workflow has been developed. The study highlights the application and performance comparison of three mainstream deep learning models -SSD, Faster-RCNN, and YOLOv5- in debris identification. Additionally, a detection system for floating objects has been designed and implemented, encompassing both hardware platform construction and software framework development. Through rigorous experimental validation, the proposed system has demonstrated its ability to significantly enhance the accuracy and efficiency of debris detection, thus offering a new technological avenue for water quality monitoring in rivers and lakes

Via

Access Paper or Ask Questions

MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

Mar 26, 2024

Wei Tao, Yucheng Zhou, Wenqiang Zhang, Yu Cheng

Figure 1 for MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

Figure 2 for MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

Figure 3 for MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

Figure 4 for MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

In software evolution, resolving the emergent issues within GitHub repositories is a complex challenge that involves not only the incorporation of new code but also the maintenance of existing functionalities. Large Language Models (LLMs) have shown promise in code generation and understanding but face difficulties in code change, particularly at the repository level. To overcome these challenges, we empirically study the reason why LLMs mostly fail to resolve GitHub issues and analyze some impact factors. Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four kinds of agents customized for the software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. This framework leverages the collaboration of various agents in the planning and coding process to unlock the potential of LLMs to resolve GitHub issues. In experiments, we employ the SWE-bench benchmark to compare MAGIS with popular LLMs, including GPT-3.5, GPT-4, and Claude-2. MAGIS can resolve 13.94% GitHub issues, which significantly outperforms the baselines. Specifically, MAGIS achieves an eight-fold increase in resolved ratio over the direct application of GPT-4, the based LLM of our method. We also analyze the factors for improving GitHub issue resolution rates, such as line location, task allocation, etc.

* work in progress

Via

Access Paper or Ask Questions

Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Mar 18, 2024

Wendi Li, Wei Wei, Kaihe Xu, Wenfeng Xie, Dangyang Chen, Yu Cheng

Figure 1 for Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Figure 2 for Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Figure 3 for Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Figure 4 for Reinforcement Learning with Token-level Feedback for Controllable Text Generation

To meet the requirements of real-world applications, it is essential to control generations of large language models (LLMs). Prior research has tried to introduce reinforcement learning (RL) into controllable text generation while most existing methods suffer from overfitting issues (finetuning-based methods) or semantic collapse (post-processing methods). However, current RL methods are generally guided by coarse-grained (sentence/paragraph-level) feedback, which may lead to suboptimal performance owing to semantic twists or progressions within sentences. To tackle that, we propose a novel reinforcement learning algorithm named TOLE which formulates TOken-LEvel rewards for controllable text generation, and employs a "first-quantize-then-noise" paradigm to enhance the robustness of the RL algorithm.Furthermore, TOLE can be flexibly extended to multiple constraints with little computational expense. Experimental results show that our algorithm can achieve superior performance on both single-attribute and multi-attribute control tasks. We have released our codes at https://github.com/WindyLee0822/CTG

* Accepted to NAACL 2024 Findings

Via

Access Paper or Ask Questions

Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing

Mar 12, 2024

Shuyao Li, Yu Cheng, Ilias Diakonikolas, Jelena Diakonikolas, Rong Ge, Stephen J. Wright

Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong contamination model, where a constant fraction of datapoints are arbitrarily corrupted. We introduce a general framework for efficiently finding an approximate SOSP with \emph{dimension-independent} accuracy guarantees, using $\widetilde{O}({D^2}/{\epsilon})$ samples where $D$ is the ambient dimension and $\epsilon$ is the fraction of corrupted datapoints. As a concrete application of our framework, we apply it to the problem of low rank matrix sensing, developing efficient and provably robust algorithms that can tolerate corruptions in both the sensing matrices and the measurements. In addition, we establish a Statistical Query lower bound providing evidence that the quadratic dependence on $D$ in the sample complexity is necessary for computationally efficient algorithms.

Via

Access Paper or Ask Questions

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Feb 24, 2024

Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang

Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in diverse tasks across different domains, with an increasing focus on improving their zero-shot generalization capabilities for unseen multimodal tasks. Multimodal instruction tuning has emerged as a successful strategy for achieving zero-shot generalization by fine-tuning pre-trained models on diverse multimodal tasks through instructions. As MLLMs grow in complexity and size, the need for parameter-efficient fine-tuning methods like Low-Rank Adaption (LoRA), which fine-tunes with a minimal set of parameters, becomes essential. However, applying LoRA in multimodal instruction tuning presents the challenge of task interference, which leads to performance degradation, especially when dealing with a broad array of multimodal tasks. To address this, this paper introduces a novel approach that integrates multimodal instruction tuning with Conditional Mixture-of-LoRA (MixLoRA). It innovates upon LoRA by dynamically constructing low-rank adaptation matrices tailored to the unique demands of each input instance, aiming to mitigate task interference. Experimental results on various multimodal evaluation datasets indicate that MixLoRA not only outperforms the conventional LoRA with the same or even higher ranks, demonstrating its efficacy and adaptability in diverse multimodal tasks.

* 8 pages, multimodal instruction tuning

Via

Access Paper or Ask Questions

Avoiding Feature Suppression in Contrastive Learning: Learning What Has Not Been Learned Before

Feb 19, 2024

Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi

Self-Supervised contrastive learning has emerged as a powerful method for obtaining high-quality representations from unlabeled data. However, feature suppression has recently been identified in standard contrastive learning ($e.g.$, SimCLR, CLIP): in a single end-to-end training stage, the contrastive model captures only parts of the shared information across contrasting views, while ignore the other potentially useful information. With feature suppression, contrastive models often fail to learn sufficient representations capable for various downstream tasks. To mitigate the feature suppression problem and ensure the contrastive model to learn comprehensive representations, we develop a novel Multistage Contrastive Learning (MCL) framework. Unlike standard contrastive learning that often result in feature suppression, MCL progressively learn new features that have not been explored in the previous stage, while maintaining the well-learned features. Extensive experiments conducted on various publicly available benchmarks validate the effectiveness of our proposed framework. In addition, we demonstrate that the proposed MCL can be adapted to a variety of popular contrastive learning backbones and boost their performance by learning features that could not be gained from standard contrastive learning procedures.

* A preprint version of an ongoing work

Via

Access Paper or Ask Questions

Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning

Feb 18, 2024

Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang

Despite vision-language models' (VLMs) remarkable capabilities as versatile visual assistants, two substantial challenges persist within the existing VLM frameworks: (1) lacking task diversity in pretraining and visual instruction tuning, and (2) annotation error and bias in GPT-4 synthesized instruction tuning data. Both challenges lead to issues such as poor generalizability, hallucination, and catastrophic forgetting. To address these challenges, we construct Vision-Flan, the most diverse publicly available visual instruction tuning dataset to date, comprising 187 diverse tasks and 1,664,261 instances sourced from academic datasets, and each task is accompanied by an expert-written instruction. In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks. Finally, we conduct in-depth analyses to understand visual instruction tuning and our findings reveal that: (1) GPT-4 synthesized data does not substantially enhance VLMs' capabilities but rather modulates the model's responses to human-preferred formats; (2) A minimal quantity (e.g., 1,000) of GPT-4 synthesized data can effectively align VLM responses with human-preference; (3) Visual instruction tuning mainly helps large-language models (LLMs) to understand visual features.

* 8 Pages, visual instruction tuning

Via

Access Paper or Ask Questions