Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jaewook Lee

Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions?

May 10, 2024
Hunter McNichols, Jaewook Lee, Stephen Fancsali, Steve Ritter, Andrew Lan

Intelligent Tutoring Systems (ITSs) often contain an automated feedback component, which provides a predefined feedback message to students when they detect a predefined error. To such a feedback component, we often resort to template-based approaches. These approaches require significant effort from human experts to detect a limited number of possible student errors and provide corresponding feedback. This limitation is exemplified in open-ended math questions, where there can be a large number of different incorrect errors. In our work, we examine the capabilities of large language models (LLMs) to generate feedback for open-ended math questions, similar to that of an established ITS that uses a template-based approach. We fine-tune both open-source and proprietary LLMs on real student responses and corresponding ITS-provided feedback. We measure the quality of the generated feedback using text similarity metrics. We find that open-source and proprietary models both show promise in replicating the feedback they see during training, but do not generalize well to previously unseen student errors. These results suggest that despite being able to learn the formatting of feedback, LLMs are not able to fully understand mathematical errors made by students.

* Educational Data Mining 2024

Via

Access Paper or Ask Questions

Math Multiple Choice Question Generation via Human-Large Language Model Collaboration

May 01, 2024
Jaewook Lee, Digory Smith, Simon Woodhead, Andrew Lan

Multiple choice questions (MCQs) are a popular method for evaluating students' knowledge due to their efficiency in administration and grading. Crafting high-quality math MCQs is a labor-intensive process that requires educators to formulate precise stems and plausible distractors. Recent advances in large language models (LLMs) have sparked interest in automating MCQ creation, but challenges persist in ensuring mathematical accuracy and addressing student errors. This paper introduces a prototype tool designed to facilitate collaboration between LLMs and educators for streamlining the math MCQ generation process. We conduct a pilot study involving math educators to investigate how the tool can help them simplify the process of crafting high-quality math MCQs. We found that while LLMs can generate well-formulated question stems, their ability to generate distractors that capture common student errors and misconceptions is limited. Nevertheless, a human-AI collaboration has the potential to enhance the efficiency and effectiveness of MCQ generation.

* 17th International Conference on Educational Data Mining (EDM 2024)

Via

Access Paper or Ask Questions

Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models

Apr 05, 2024
Wanyong Feng, Jaewook Lee, Hunter McNichols, Alexander Scarlatos, Digory Smith, Simon Woodhead, Nancy Otero Ornelas, Andrew Lan

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable format in assessments and practices. One of the most important aspects of MCQs is the distractors, i.e., incorrect options that are designed to target common errors or misconceptions among real students. To date, the task of crafting high-quality distractors largely remains a labor and time-intensive process for teachers and learning content designers, which has limited scalability. In this work, we study the task of automated distractor generation in the domain of math MCQs and explore a wide variety of large language model (LLM)-based approaches, from in-context learning to fine-tuning. We conduct extensive experiments using a real-world math MCQ dataset and find that although LLMs can generate some mathematically valid distractors, they are less adept at anticipating common errors or misconceptions among real students.

* NAACL 2024 findings

Via

Access Paper or Ask Questions

Fundamental Benefit of Alternating Updates in Minimax Optimization

Feb 16, 2024
Jaewook Lee, Hanseul Cho, Chulhee Yun

The Gradient Descent-Ascent (GDA) algorithm, designed to solve minimax optimization problems, takes the descent and ascent steps either simultaneously (Sim-GDA) or alternately (Alt-GDA). While Alt-GDA is commonly observed to converge faster, the performance gap between the two is not yet well understood theoretically, especially in terms of global convergence rates. To address this theory-practice gap, we present fine-grained convergence analyses of both algorithms for strongly-convex-strongly-concave and Lipschitz-gradient objectives. Our new iteration complexity upper bound of Alt-GDA is strictly smaller than the lower bound of Sim-GDA; i.e., Alt-GDA is provably faster. Moreover, we propose Alternating-Extrapolation GDA (Alex-GDA), a general algorithmic framework that subsumes Sim-GDA and Alt-GDA, for which the main idea is to alternately take gradients from extrapolations of the iterates. We show that Alex-GDA satisfies a smaller iteration complexity bound, identical to that of the Extra-gradient method, while requiring less gradient computations. We also prove that Alex-GDA enjoys linear convergence for bilinear problems, for which both Sim-GDA and Alt-GDA fail to converge at all.

* 77 pages, 2 figures

Via

Access Paper or Ask Questions

Fair Sampling in Diffusion Models through Switching Mechanism

Jan 09, 2024
Yujin Choi, Jinseong Park, Hoki Kim, Jaewook Lee, Saeroom Park

Diffusion models have shown their effectiveness in generation tasks by well-approximating the underlying probability distribution. However, diffusion models are known to suffer from an amplified inherent bias from the training data in terms of fairness. While the sampling process of diffusion models can be controlled by conditional guidance, previous works have attempted to find empirical guidance to achieve quantitative fairness. To address this limitation, we propose a fairness-aware sampling method called \textit{attribute switching} mechanism for diffusion models. Without additional training, the proposed sampling can obfuscate sensitive attributes in generated data without relying on classifiers. We mathematically prove and experimentally demonstrate the effectiveness of the proposed method on two key aspects: (i) the generation of fair data and (ii) the preservation of the utility of the generated data.

* AAAI 2024

Via

Access Paper or Ask Questions

Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention

Nov 17, 2023
Jaewook Lee, Seongmin Heo, Jay H. Lee

Figure 1 for Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention

Figure 2 for Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention

Figure 3 for Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention

Figure 4 for Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention

Accurately predicting the lifespan of lithium-ion batteries (LIBs) is pivotal for optimizing usage and preventing accidents. Previous studies in constructing prediction models often relied on inputs challenging to measure in real-time operations and failed to capture intra-cycle and inter-cycle data patterns, essential features for accurate predictions, comprehensively. In this study, we employ attention mechanisms (AM) to develop data-driven models for predicting LIB lifespan using easily measurable inputs such as voltage, current, temperature, and capacity data. The developed model integrates recurrent neural network (RNN) and convolutional neural network (CNN) components, featuring two types of attention mechanisms: temporal attention (TA) and cyclic attention (CA). The inclusion of TA aims to identify important time steps within each cycle by scoring the hidden states of the RNN, whereas CA strives to capture key features of inter-cycle correlations through self-attention (SA). This enhances model accuracy and elucidates critical features in the input data. To validate our method, we apply it to publicly available cycling data consisting of three batches of cycling modes. The calculated TA scores highlight the rest phase as a key characteristic distinguishing LIB data among different batches. Additionally, CA scores reveal variations in the importance of cycles across batches. By leveraging CA scores, we explore the potential to reduce the number of cycles in the input data. The single-head and multi-head attentions enable us to decrease the input dimension from 100 to 50 and 30 cycles, respectively.

Via

Access Paper or Ask Questions

Exploring Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

Aug 07, 2023
Hunter McNichols, Wanyong Feng, Jaewook Lee, Alexander Scarlatos, Digory Smith, Simon Woodhead, Andrew Lan

Figure 1 for Exploring Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

Figure 2 for Exploring Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

Figure 3 for Exploring Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

Figure 4 for Exploring Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable format in both assessments and practices. An important aspect of MCQs is the distractors, i.e., incorrect options that are designed to target specific misconceptions or insufficient knowledge among students. To date, the task of crafting high-quality distractors has largely remained a labor-intensive process for teachers and learning content designers, which has limited scalability. In this work, we explore the task of automated distractor and corresponding feedback message generation in math MCQs using large language models. We establish a formulation of these two tasks and propose a simple, in-context learning-based solution. Moreover, we explore using two non-standard metrics to evaluate the quality of the generated distractors and feedback messages. We conduct extensive experiments on these tasks using a real-world MCQ dataset that contains student response information. Our findings suggest that there is a lot of room for improvement in automated distractor and feedback generation. We also outline several directions for future work

Via

Access Paper or Ask Questions

Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

Jun 26, 2023
Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim

Figure 1 for Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

Figure 2 for Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

Generative commonsense reasoning refers to the task of generating acceptable and logical assumptions about everyday situations based on commonsense understanding. By utilizing an existing dataset such as Korean CommonGen, language generation models can learn commonsense reasoning specific to the Korean language. However, language models often fail to consider the relationships between concepts and the deep knowledge inherent to concepts. To address these limitations, we propose a method to utilize the Korean knowledge graph data for text generation. Our experimental result shows that the proposed method can enhance the efficiency of Korean commonsense inference, thereby underlining the significance of employing supplementary data.

* Accepted for Data-centric Machine Learning Research (DMLR) Workshop at ICML 2023

Via

Access Paper or Ask Questions

Differentially Private Sharpness-Aware Training

Jun 09, 2023
Jinseong Park, Hoki Kim, Yujin Choi, Jaewook Lee

Figure 1 for Differentially Private Sharpness-Aware Training

Figure 2 for Differentially Private Sharpness-Aware Training

Figure 3 for Differentially Private Sharpness-Aware Training

Figure 4 for Differentially Private Sharpness-Aware Training

Training deep learning models with differential privacy (DP) results in a degradation of performance. The training dynamics of models with DP show a significant difference from standard training, whereas understanding the geometric properties of private learning remains largely unexplored. In this paper, we investigate sharpness, a key factor in achieving better generalization, in private learning. We show that flat minima can help reduce the negative effects of per-example gradient clipping and the addition of Gaussian noise. We then verify the effectiveness of Sharpness-Aware Minimization (SAM) for seeking flat minima in private learning. However, we also discover that SAM is detrimental to the privacy budget and computational time due to its two-step optimization. Thus, we propose a new sharpness-aware training method that mitigates the privacy-optimization trade-off. Our experimental results demonstrate that the proposed method improves the performance of deep learning models with DP from both scratch and fine-tuning. Code is available at https://github.com/jinseongP/DPSAT.

* ICML 2023

Via

Access Paper or Ask Questions

A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

May 11, 2023
Nischal Ashok Kumar, Wanyong Feng, Jaewook Lee, Hunter McNichols, Aritra Ghosh, Andrew Lan

Figure 1 for A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

Figure 2 for A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

Figure 3 for A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

Figure 4 for A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

In this paper, we take a preliminary step towards solving the problem of causal discovery in knowledge tracing, i.e., finding the underlying causal relationship among different skills from real-world student response data. This problem is important since it can potentially help us understand the causal relationship between different skills without extensive A/B testing, which can potentially help educators to design better curricula according to skill prerequisite information. Specifically, we propose a conceptual solution, a novel causal gated recurrent unit (GRU) module in a modified deep knowledge tracing model, which uses i) a learnable permutation matrix for causal ordering among skills and ii) an optionally learnable lower-triangular matrix for causal structure among skills. We also detail how to learn the model parameters in an end-to-end, differentiable way. Our solution placed among the top entries in Task 3 of the NeurIPS 2022 Challenge on Causal Insights for Learning Paths in Education. We detail preliminary experiments as evaluated on the challenge's public leaderboard since the ground truth causal structure has not been publicly released, making detailed local evaluation impossible.

* 16th International Conference on Educational Data Mining (EDM 2023)

Via

Access Paper or Ask Questions