Jiahe Lei

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

Feb 20, 2024
Tongxu Luo, Jiahe Lei, Fangyu Lei, Weihao Liu, Shizhu He, Jun Zhao, Kang Liu

Fine-tuning is often necessary to enhance the adaptability of Large Language Models (LLMs) to downstream tasks. Nonetheless, updating billions of parameters demands significant computational resources and training time, which poses a substantial obstacle to the widespread application of large-scale models in various scenarios. To address this issue, Parameter-Efficient Fine-Tuning (PEFT) has emerged as a prominent paradigm in recent research. However, current PEFT approaches that employ a limited set of global parameters (such as LoRA, which adds low-rank approximation matrices to all weights) face challenges in flexibly combining different computational modules in downstream tasks. In this work, we introduce a novel PEFT method: MoELoRA. We treat LoRA as a Mixture of Experts (MoE), and to mitigate the random routing phenomenon observed in MoE, we propose using contrastive learning to encourage experts to learn distinct features. We conducted experiments on 11 tasks in math reasoning and common-sense reasoning benchmarks. With the same number of parameters, our approach outperforms LoRA significantly. In math reasoning, MoELoRA achieved an average performance that was 4.2% higher than LoRA, and demonstrated competitive performance compared to the 175B GPT-3.5 on several benchmarks.
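
The idea in the abstract above, several LoRA adapters acting as experts behind a router plus a contrastive term that pushes experts apart, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class name `MoELoRALayer`, the top-k softmax routing, the cosine-similarity InfoNCE-style loss, the temperature, and the random initialisation of `B` are all assumptions made for illustration.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(z):
    peak = max(z)
    e = [math.exp(x - peak) for x in z]
    total = sum(e)
    return [x / total for x in e]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


class MoELoRALayer:
    """Hypothetical layer: frozen weight W plus routed low-rank experts B_e A_e."""

    def __init__(self, d_in, d_out, n_experts=4, rank=2, top_k=2, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.W = rand(d_out, d_in)  # stands in for the frozen pretrained weight
        self.A = [rand(rank, d_in) for _ in range(n_experts)]
        # Assumption: B is random so the demo output is non-trivial;
        # LoRA normally initialises B to zero.
        self.B = [rand(d_out, rank) for _ in range(n_experts)]
        self.router = rand(n_experts, d_in)
        self.top_k = top_k

    def expert_outputs(self, x):
        """Low-rank update B_e (A_e x) of every expert for one token."""
        return [matvec(B, matvec(A, x)) for A, B in zip(self.A, self.B)]

    def forward(self, x):
        gates = softmax(matvec(self.router, x))
        order = sorted(range(len(gates)), key=gates.__getitem__, reverse=True)
        top = order[: self.top_k]
        norm = sum(gates[i] for i in top)
        outs = self.expert_outputs(x)
        y = matvec(self.W, x)
        for i in top:  # add the gate-weighted updates of the selected experts
            y = [yi + (gates[i] / norm) * oi for yi, oi in zip(y, outs[i])]
        return y, top


def expert_contrastive_loss(outputs, temperature=0.5):
    """InfoNCE-style loss; outputs[t][e] is expert e's output for token t.

    Assumption: outputs of the same expert on different tokens are positives,
    outputs of different experts are negatives.
    """
    items = [(e, vec) for token in outputs for e, vec in enumerate(token)]
    losses = []
    for i, (ei, vi) in enumerate(items):
        sims = [(ej, math.exp(cosine(vi, vj) / temperature))
                for j, (ej, vj) in enumerate(items) if j != i]
        pos = sum(s for ej, s in sims if ej == ei)
        if pos > 0:
            losses.append(-math.log(pos / sum(s for _, s in sims)))
    return sum(losses) / len(losses) if losses else 0.0


layer = MoELoRALayer(d_in=4, d_out=3)
tokens = [[1.0, 0.5, -0.5, 2.0], [0.2, -1.0, 0.3, 0.7], [-0.4, 0.9, 1.1, -0.6]]
y, chosen = layer.forward(tokens[0])
loss = expert_contrastive_loss([layer.expert_outputs(t) for t in tokens])
```

In a real training loop this auxiliary loss would be added to the task loss with a weighting coefficient, so that gradient updates make each expert's outputs more similar to one another than to those of other experts.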

TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering

Oct 23, 2023
Fangyu Lei, Tongxu Luo, Pengqi Yang, Weihao Liu, Hanwen Liu, Jiahe Lei, Yiming Huang, Yifan Wei, Shizhu He, Jun Zhao, Kang Liu

Table-based question answering (TableQA) is an important task in natural language processing, which requires comprehending tables and employing various types of reasoning to answer questions. This paper introduces TableQAKit, the first comprehensive toolkit designed specifically for TableQA. The toolkit provides a unified platform that includes numerous TableQA datasets and integrates popular methods for this task as well as large language models (LLMs). Users can add their own datasets and methods through the friendly interface. Moreover, using the modules in this toolkit achieves new SOTA on some datasets. Finally, TableQAKit also provides an LLM-based TableQA Benchmark for evaluating the role of LLMs in TableQA. TableQAKit is open-source, with an interactive interface that includes visual operations and comprehensive data for ease of use.

* Work in progress

HRoT: Hybrid prompt strategy and Retrieval of Thought for Table-Text Hybrid Question Answering

Sep 22, 2023
Tongxu Luo, Fangyu Lei, Jiahe Lei, Weihao Liu, Shihu He, Jun Zhao, Kang Liu

Answering numerical questions over hybrid content from given tables and text (TextTableQA) is a challenging task. Recently, Large Language Models (LLMs) have gained significant attention in the NLP community. With their emergence, In-Context Learning and Chain-of-Thought prompting have become two particularly popular research topics in this field. In this paper, we introduce a new prompting strategy called Hybrid prompt strategy and Retrieval of Thought for TextTableQA. Through In-Context Learning, we prompt the model to develop the ability of retrieval thinking when dealing with hybrid data. Our method achieves superior performance compared to the fully-supervised SOTA on the MultiHiertt dataset in the few-shot setting.

MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images

Sep 09, 2023
Weihao Liu, Fangyu Lei, Tongxu Luo, Jiahe Lei, Shizhu He, Jun Zhao, Kang Liu

In the real world, knowledge often exists in a multimodal and heterogeneous form. Question answering over hybrid data types, including text, tables, and images (MMHQA), is a challenging task. Recently, with the rise of large language models (LLMs), in-context learning (ICL) has become the most popular way to solve QA problems. We propose the MMHQA-ICL framework for addressing this problem, which includes a stronger heterogeneous data retriever and an image caption module. Most importantly, we propose a Type-specific In-context Learning Strategy for MMHQA, enabling LLMs to leverage their powerful performance in this task. We are the first to use an end-to-end LLM prompting method for this task. Experimental results demonstrate that our framework outperforms all baselines and methods trained on the full dataset, achieving state-of-the-art results under the few-shot setting on the MultimodalQA dataset.
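
One way to picture a type-specific in-context learning strategy is a prompt builder that selects few-shot demonstrations by question type, with images represented by their captions. The sketch below is only an illustration under that assumption: the `Demo` class, the `build_prompt` function, the three type labels, and the toy demonstrations are invented here and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    context: str
    question: str
    answer: str


# Toy demonstrations, invented for illustration only.
DEMOS = {
    "text": [Demo("Passage: The bridge opened in 1932.",
                  "When did the bridge open?", "1932")],
    "table": [Demo("Table: city | population\nOslo | 700000",
                   "What is the population of Oslo?", "700000")],
    "image": [Demo("Image caption: a red bus parked beside a station.",
                   "What colour is the bus?", "red")],
}


def build_prompt(question_type, context, question, demos=DEMOS):
    """Build a few-shot prompt using only demonstrations of the given type."""
    if question_type not in demos:
        raise ValueError(f"unknown question type: {question_type!r}")
    parts = [
        f"{d.context}\nQuestion: {d.question}\nAnswer: {d.answer}"
        for d in demos[question_type]
    ]
    # The test instance comes last, with the answer left for the LLM to fill in.
    parts.append(f"{context}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)


prompt = build_prompt(
    "image",
    "Image caption: a dog running on a beach.",
    "Where is the dog?",
)
```

The resulting string would be sent to an LLM as-is. How the question type is predicted and how retrieved evidence is serialised are left open here, since the abstract does not specify them.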
