Fangyu Lei

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024
Via arXiv

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Mar 01, 2024
Via arXiv

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

Feb 20, 2024
Via arXiv

Competition-Level Problems are Effective LLM Evaluators

Dec 05, 2023
Via arXiv

Assessing Knowledge Editing in Language Models via Relation Perspective

Nov 15, 2023
Via arXiv

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

Oct 23, 2023
Via arXiv

TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering

Oct 23, 2023
Via arXiv

MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models

Oct 08, 2023
Via arXiv

HRoT: Hybrid prompt strategy and Retrieval of Thought for Table-Text Hybrid Question Answering

Sep 22, 2023
Via arXiv

MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images

Sep 09, 2023
Via arXiv