Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthew England

Constrained Neural Networks for Interpretable Heuristic Creation to Optimise Computer Algebra Systems

Apr 26, 2024
Dorian Florescu, Matthew England

We present a new methodology for utilising machine learning technology in symbolic computation research. We explain how a well known human-designed heuristic to make the choice of variable ordering in cylindrical algebraic decomposition may be represented as a constrained neural network. This allows us to then use machine learning methods to further optimise the heuristic, leading to new networks of similar size, representing new heuristics of similar complexity as the original human-designed one. We present this as a form of ante-hoc explainability for use in computer algebra development.

* Accepted for presentation at ICMS 2024

Via

Access Paper or Ask Questions

Symbolic Integration Algorithm Selection with Machine Learning: LSTMs vs Tree LSTMs

Apr 23, 2024
Rashid Barket, Matthew England, Jürgen Gerhard

Computer Algebra Systems (e.g. Maple) are used in research, education, and industrial settings. One of their key functionalities is symbolic integration, where there are many sub-algorithms to choose from that can affect the form of the output integral, and the runtime. Choosing the right sub-algorithm for a given problem is challenging: we hypothesise that Machine Learning can guide this sub-algorithm choice. A key consideration of our methodology is how to represent the mathematics to the ML model: we hypothesise that a representation which encodes the tree structure of mathematical expressions would be well suited. We trained both an LSTM and a TreeLSTM model for sub-algorithm prediction and compared them to Maple's existing approach. Our TreeLSTM performs much better than the LSTM, highlighting the benefit of using an informed representation of mathematical expressions. It is able to produce better outputs than Maple's current state-of-the-art meta-algorithm, giving a strong basis for further research.

Via

Access Paper or Ask Questions

Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD

Jan 24, 2024
Tereso del Río, Matthew England

Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from having them made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications which was found to be imbalanced with respect to the variable ordering decision. We introduce an augmentation technique for polynomial systems problems that allows us to balance and further augment the dataset, improving the machine learning results by 28\% and 38\% on average, respectively. We then demonstrate how the existing machine learning methodology used for the problem $-$ classification $-$ might be recast into the regression paradigm. While this does not have a radical change on the performance, it does widen the scope in which the methodology can be applied to make choices.

Via

Access Paper or Ask Questions

Data Augmentation for Mathematical Objects

Jul 13, 2023
Tereso del Rio, Matthew England

Figure 1 for Data Augmentation for Mathematical Objects

Figure 2 for Data Augmentation for Mathematical Objects

Figure 3 for Data Augmentation for Mathematical Objects

Figure 4 for Data Augmentation for Mathematical Objects

This paper discusses and evaluates ideas of data balancing and data augmentation in the context of mathematical objects: an important topic for both the symbolic computation and satisfiability checking communities, when they are making use of machine learning techniques to optimise their tools. We consider a dataset of non-linear polynomial problems and the problem of selecting a variable ordering for cylindrical algebraic decomposition to tackle these with. By swapping the variable names in already labelled problems, we generate new problem instances that do not require any further labelling when viewing the selection as a classification problem. We find this augmentation increases the accuracy of ML models by 63% on average. We study what part of this improvement is due to the balancing of the dataset and what is achieved thanks to further increasing the size of the dataset, concluding that both have a very significant effect. We finish the paper by reflecting on how this idea could be applied in other uses of machine learning in mathematics.

* 10 pages. To be presented at the 2023 SC-Square Workshop

Via

Access Paper or Ask Questions

Generating Elementary Integrable Expressions

Jun 27, 2023
Rashid Barket, Matthew England, Jürgen Gerhard

There has been an increasing number of applications of machine learning to the field of Computer Algebra in recent years, including to the prominent sub-field of Symbolic Integration. However, machine learning models require an abundance of data for them to be successful and there exist few benchmarks on the scale required. While methods to generate new data already exist, they are flawed in several ways which may lead to bias in machine learning models trained upon them. In this paper, we describe how to use the Risch Algorithm for symbolic integration to create a dataset of elementary integrable expressions. Further, we show that data generated this way alleviates some of the flaws found in earlier methods.

* To appear in proceedings of CASC 2023. This version of the contribution has been accepted for publication, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections

Via

Access Paper or Ask Questions

Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

Apr 24, 2023
Lynn Pickering, Tereso Del Rio Almajano, Matthew England, Kelly Cohen

Figure 1 for Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

Figure 2 for Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

Figure 3 for Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

Figure 4 for Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition

In recent years there has been increased use of machine learning (ML) techniques within mathematics, including symbolic computation where it may be applied safely to optimise or select algorithms. This paper explores whether using explainable AI (XAI) techniques on such ML models can offer new insight for symbolic computation, inspiring new implementations within computer algebra systems that do not directly call upon AI tools. We present a case study on the use of ML to select the variable ordering for cylindrical algebraic decomposition. It has already been demonstrated that ML can make the choice well, but here we show how the SHAP tool for explainability can be used to inform new heuristics of a size and complexity similar to those human-designed heuristics currently commonly used in symbolic computation.

* 38 pages

Via

Access Paper or Ask Questions

SC-Square: Future Progress with Machine Learning?

Sep 09, 2022
Matthew England

The algorithms employed by our communities are often underspecified, and thus have multiple implementation choices, which do not effect the correctness of the output, but do impact the efficiency or even tractability of its production. In this extended abstract, to accompany a keynote talk at the 2021 SC-Square Workshop, we survey recent work (both the author's and from the literature) on the use of Machine Learning technology to improve algorithms of interest to SC-Square.

* 10 pages. Survey Paper. Accepted into SC-Square 2021 Workshop Proceedings

Via

Access Paper or Ask Questions

A machine learning based software pipeline to pick the variable ordering for algorithms with polynomial inputs

May 22, 2020
Dorian Florescu, Matthew England

Figure 1 for A machine learning based software pipeline to pick the variable ordering for algorithms with polynomial inputs

Figure 2 for A machine learning based software pipeline to pick the variable ordering for algorithms with polynomial inputs

We are interested in the application of Machine Learning (ML) technology to improve mathematical software. It may seem that the probabilistic nature of ML tools would invalidate the exact results prized by such software, however, the algorithms which underpin the software often come with a range of choices which are good candidates for ML application. We refer to choices which have no effect on the mathematical correctness of the software, but do impact its performance. In the past we experimented with one such choice: the variable ordering to use when building a Cylindrical Algebraic Decomposition (CAD). We used the Python library Scikit-Learn (sklearn) to experiment with different ML models, and developed new techniques for feature generation and hyper-parameter selection. These techniques could easily be adapted for making decisions other than our immediate application of CAD variable ordering. Hence in this paper we present a software pipeline to use sklearn to pick the variable ordering for an algorithm that acts on a polynomial system. The code described is freely available online.

* Accepted into Proc ICMS 2020

Via

Access Paper or Ask Questions

Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness

Nov 28, 2019
Dorian Florescu, Matthew England

Figure 1 for Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness

Figure 2 for Improved cross-validation for classifiers that make algorithmic choices to minimise runtime without compromising output correctness

Our topic is the use of machine learning to improve software by making choices which do not compromise the correctness of the output, but do affect the time taken to produce such output. We are particularly concerned with computer algebra systems (CASs), and in particular, our experiments are for selecting the variable ordering to use when performing a cylindrical algebraic decomposition of $n$-dimensional real space with respect to the signs of a set of polynomials. In our prior work we explored the different ML models that could be used, and how to identify suitable features of the input polynomials. In the present paper we both repeat our prior experiments on problems which have more variables (and thus exponentially more possible orderings), and examine the metric which our ML classifiers targets. The natural metric is computational runtime, with classifiers trained to pick the ordering which minimises this. However, this leads to the situation were models do not distinguish between any of the non-optimal orderings, whose runtimes may still vary dramatically. In this paper we investigate a modification to the cross-validation algorithms of the classifiers so that they do distinguish these cases, leading to improved results.

* 16 pages. Accepted into the Proceedings of MACIS 2019. arXiv admin note: text overlap with arXiv:1906.01455

Via

Access Paper or Ask Questions

Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition

Jun 05, 2019
Matthew England, Dorian Florescu

Figure 1 for Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition

Figure 2 for Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition

Figure 3 for Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition

Figure 4 for Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition

There has been recent interest in the use of machine learning (ML) approaches within mathematical software to make choices that impact on the computing performance without affecting the mathematical correctness of the result. We address the problem of selecting the variable ordering for cylindrical algebraic decomposition (CAD), an important algorithm in Symbolic Computation. Prior work to apply ML on this problem implemented a Support Vector Machine (SVM) to select between three existing human-made heuristics, which did better than anyone heuristic alone. The present work extends to have ML select the variable ordering directly, and to try a wider variety of ML techniques. We experimented with the NLSAT dataset and the Regular Chains Library CAD function for Maple 2018. For each problem, the variable ordering leading to the shortest computing time was selected as the target class for ML. Features were generated from the polynomial input and used to train the following ML models: k-nearest neighbours (KNN) classifier, multi-layer perceptron (MLP), decision tree (DT) and SVM, as implemented in the Python scikit-learn package. We also compared these with the two leading human constructed heuristics for the problem: Brown's heuristic and sotd. On this dataset all of the ML approaches outperformed the human made heuristics, some by a large margin.

* Accepted into CICM 2019

Via

Access Paper or Ask Questions