Models, code, and papers for "Markus Wagner":
To make sense of large amounts of textual data, topic modelling is frequently used as a text-mining tool for the discovery of hidden semantic structures in text bodies. Latent Dirichlet allocation (LDA) is a commonly used topic model that aims to explain the structure of a corpus by grouping texts. LDA requires multiple parameters to work well, and there are only rough and sometimes conflicting guidelines available on how these parameters should be set. In this paper, we contribute (i) a broad study of parameters to arrive at good local optima, (ii) an a-posteriori characterisation of text corpora related to eight programming languages from GitHub and Stack Overflow, and (iii) an analysis of corpus feature importance via per-corpus LDA configuration.
Despite significant empirical and theoretically supported evidence that non-static parameter choices can be strongly beneficial in evolutionary computation, the question of how best to adjust parameter values plays only a marginal role in contemporary research on discrete black-box optimization. This has led to the unsatisfactory situation in which feedback-free parameter selection rules such as the cooling schedule of Simulated Annealing are predominant in state-of-the-art heuristics, while, at the same time, we understand very well that such time-dependent selection rules can only perform worse than adjustment rules that do take into account the evolution of the optimization process. A number of adaptive and self-adaptive parameter control strategies have been proposed in the literature, but have not (yet) made their way to a broader public. A key obstacle seems to lie in their rather complex update rules. The purpose of our work is to demonstrate that high-performing online parameter selection rules do not have to be very complicated. More precisely, we experiment with a multiplicative, comparison-based update rule to adjust the mutation probability of a (1+1) Evolutionary Algorithm. We show that this simple self-adjusting rule outperforms the best static unary unbiased black-box algorithm on LeadingOnes, achieving an almost optimal speedup of about $18\%$.
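The multiplicative, comparison-based rule described above might look roughly like the following Python sketch. The abstract does not give the update factors, the bounds on the mutation probability, or the exact success criterion, so `up`, `down`, `p_min`, `p_max`, and the "offspring at least as good as parent" test are illustrative assumptions, not the settings used in the paper.

```python
import random

def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit string x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count

def self_adjusting_one_plus_one_ea(n, up=2.0, down=0.5, seed=0, max_evals=200_000):
    """(1+1) EA on LeadingOnes with a multiplicative success-based mutation rate.

    Assumed rule (hypothetical constants): multiply p by `up` when the offspring
    is at least as good as the parent, otherwise multiply p by `down`.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = leading_ones(x)
    p = 1.0 / n                        # start from the classic static rate
    p_min, p_max = 1.0 / n**2, 0.5     # assumed bounds on the mutation probability
    evals = 1
    while fx < n and evals < max_evals:
        # Standard bit mutation: flip each bit independently with probability p.
        y = [1 - b if rng.random() < p else b for b in x]
        fy = leading_ones(y)
        evals += 1
        if fy >= fx:
            x, fx = y, fy
            p = min(up * p, p_max)     # success: mutate more aggressively
        else:
            p = max(down * p, p_min)   # failure: mutate more cautiously
    return x, fx, evals

if __name__ == "__main__":
    best, fitness, used = self_adjusting_one_plus_one_ea(30, seed=1)
    print(fitness, used)
```

The only feedback the rule uses is a single comparison per iteration, which is what keeps the update this simple.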
Most experimental studies initialize the population of evolutionary algorithms with random genotypes. In practice, however, optimizers are typically seeded with good candidate solutions either previously known or created according to some problem-specific method. This "seeding" has been studied extensively for single-objective problems. For multi-objective problems, however, very little literature is available on the approaches to seeding and their individual benefits and disadvantages. In this article, we are trying to narrow this gap via a comprehensive computational study on common real-valued test functions. We investigate the effect of two seeding techniques for five algorithms on 48 optimization problems with 2, 3, 4, 6, and 8 objectives. We observe that some functions (e.g., DTLZ4 and the LZ family) benefit significantly from seeding, while others (e.g., WFG) profit less. The advantage of seeding also depends on the examined algorithm.
Genetic Programming (GP) has found various applications. Understanding this type of algorithm from a theoretical point of view is a challenging task. The first results on the computational complexity of GP have been obtained for problems with isolated program semantics. With this paper, we push forward the computational complexity analysis of GP on a problem with dependent program semantics. We study the well-known sorting problem in this context and analyze rigorously how GP can deal with different measures of sortedness.
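The abstract does not name the measures of sortedness it analyzes. As an illustration only, the Python sketch below computes two measures that are commonly used to quantify how sorted a sequence is: the number of pairs in correct order and the length of the longest ascending subsequence. Both the choice of measures and the function names are assumptions made for this example.

```python
from bisect import bisect_left

def pairs_in_order(seq):
    """Count index pairs (i, j) with i < j and seq[i] < seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] < seq[j])

def longest_ascending_subsequence(seq):
    """Length of the longest strictly ascending (not necessarily contiguous) subsequence."""
    tails = []  # tails[k] = smallest possible last element of an ascending subsequence of length k + 1
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)

if __name__ == "__main__":
    for perm in ([1, 2, 3, 4], [2, 1, 3], [4, 3, 2, 1]):
        print(perm, pairs_in_order(perm), longest_ascending_subsequence(perm))
```

Both measures are maximal exactly on the sorted sequence, but they reward partial progress differently, which is why the choice of measure can matter to a search algorithm.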
Wave energy technologies have the potential to play a significant role in the supply of renewable energy on a world scale. One of the most promising designs for wave energy converters (WECs) is the fully submerged buoy. In this work, we explore the optimisation of WEC arrays consisting of a three-tether buoy model called CETO. Such arrays can be optimised for total energy output by adjusting both the relative positions of buoys in farms and the power-take-off (PTO) parameters for each buoy. The search space for these parameters is complex and multi-modal. Moreover, the evaluation of each parameter setting is computationally expensive -- limiting the number of full model evaluations that can be made. To handle this problem, we propose a new hybrid cooperative co-evolution algorithm (HCCA). HCCA consists of a symmetric local search plus Nelder-Mead and a cooperative co-evolution algorithm (CC) with a backtracking strategy for optimising the positions and PTO settings of WECs, respectively. Moreover, a new adaptive scheme is proposed for tuning the hyper-parameter of the grey wolf optimiser (AGWO). AGWO contributes notably alongside the other optimisers applied in HCCA. To assess the effectiveness of the proposed approach, five popular Evolutionary Algorithms (EAs), four alternating optimisation methods and two modern hybrid ideas (LS-NM and SLS-NM-B) are carefully compared in four real wave situations (Adelaide, Tasmania, Sydney and Perth) with two wave farm sizes (4 and 16). According to the experimental outcomes, the hybrid cooperative framework exhibits better performance in terms of both runtime and quality of obtained solutions.
A commonly used strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. Building on the recent success of Bet-and-Run approaches for restarted local search solvers, we introduce an improved generic Bet-and-Run strategy. The goal is to obtain the best possible results within a given time budget t using a given black-box optimization algorithm. If no prior knowledge about problem features and algorithm behavior is available, the question arises of how to use the time budget most efficiently. We propose to first start k>=1 independent runs of the algorithm during an initialization budget t1<t, pause these runs, then apply a decision maker D to choose 1<=m<=k runs from them (consuming t2>=0 time units in doing so), and then continue these runs for the remaining t3=t-t1-t2 time units. In previous Bet-and-Run strategies, the decision maker D=currentBest would simply select the run with the best-so-far results at negligible time. We propose using more advanced methods to discriminate between "good" and "bad" sample runs, with the goal of increasing the correlation of the chosen run with the a-posteriori best one. We test several different approaches, including neural networks trained or polynomials fitted on the current trace of the algorithm to predict which run may yield the best results if granted the remaining budget. We show with extensive experiments that this approach can yield better results than the previous methods, but also find that the currentBest method is a very reliable and robust baseline approach.
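The three-phase budget split (t1 for k initial runs, t2 for the decision maker, t3 = t - t1 - t2 for the m chosen runs) can be sketched in Python as follows. The toy optimizer `RandomSearchRun`, the use of steps as time units, the even split of t1 across the k runs and of t3 across the m chosen runs, and the assumption that currentBest costs t2 = 0 are all choices made for this example rather than details taken from the paper.

```python
import random

class RandomSearchRun:
    """Hypothetical resumable run: minimises f(x) = (x - 3)^2 by random perturbation."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.x = self.rng.uniform(-100.0, 100.0)
        self.best = (self.x - 3.0) ** 2   # best-so-far objective value
        self.steps = 0

    def step(self):
        candidate = self.x + self.rng.gauss(0.0, 1.0)
        value = (candidate - 3.0) ** 2
        if value <= self.best:
            self.x, self.best = candidate, value
        self.steps += 1

def current_best(runs, m=1):
    """Decision maker D = currentBest: keep the m runs with the best value so far."""
    return sorted(runs, key=lambda r: r.best)[:m]

def bet_and_run(make_run, k, t1, t, decision_maker=current_best, m=1):
    """Run k short runs for t1 steps in total, pick m of them, continue those for t3 steps."""
    assert k >= 1 and 0 < t1 < t and 1 <= m <= k
    runs = [make_run(seed) for seed in range(k)]
    for run in runs:                      # phase 1: initialization budget t1
        for _ in range(t1 // k):
            run.step()
    chosen = decision_maker(runs, m)      # phase 2: decision, assumed to cost t2 = 0
    t2 = 0
    t3 = t - t1 - t2
    for run in chosen:                    # phase 3: continue only the chosen runs
        for _ in range(t3 // m):
            run.step()
    return min(chosen, key=lambda r: r.best)

if __name__ == "__main__":
    winner = bet_and_run(RandomSearchRun, k=4, t1=40, t=200)
    print(winner.steps, winner.best)
```

A more advanced decision maker, such as one that extrapolates each run's trace, would plug in through the `decision_maker` argument and would charge its own cost to t2.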
A common strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. However, while specific restart strategies have been developed for specific problems (and specific algorithms), restarts are typically not regarded as a general tool to speed up an optimization algorithm. In fact, many optimization algorithms do not employ restarts at all. Recently, "bet-and-run" was introduced in the context of mixed-integer programming, where first a number of short runs with randomized initial conditions is made, and then the most promising run of these is continued. In this article, we consider two classical NP-complete combinatorial optimization problems, traveling salesperson and minimum vertex cover, and study the effectiveness of different bet-and-run strategies. In particular, our restart strategies do not take any problem knowledge into account, nor are tailored to the optimization algorithm. Therefore, they can be used off-the-shelf. We observe that state-of-the-art solvers for these problems can benefit significantly from restarts on standard benchmark instances.
The placement of wind turbines on a given area of land such that the wind farm produces a maximum amount of energy is a challenging optimization problem. In this article, we tackle this problem, taking into account wake effects that are produced by the different turbines on the wind farm. We significantly improve upon existing results for the minimization of wake effects by developing a new problem-specific local search algorithm. One key step in the speed-up of our algorithm is the reduction in computation time needed to assess a given wind farm layout compared to previous approaches. Our new method allows the optimization of large real-world scenarios within a single night on a standard computer, whereas weeks on specialized computing servers were required for previous approaches.
Evolutionary diversity optimization aims to compute a diverse set of solutions where all solutions meet a given quality criterion. With this paper, we bridge the areas of evolutionary diversity optimization and evolutionary multi-objective optimization. We show how popular indicators frequently used in the area of multi-objective optimization can be used for evolutionary diversity optimization. Our experimental investigations for evolving diverse sets of TSP instances and images according to various features show that two of the most prominent multi-objective indicators, namely the hypervolume indicator and the inverted generational distance, provide excellent results in terms of visualization and various diversity indicators.
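For readers unfamiliar with the hypervolume indicator mentioned above, the Python sketch below computes it for a set of two-dimensional points. It assumes both coordinates are minimised and that a reference point `ref` bounds the region; the abstract does not say how the indicator is applied to feature vectors, so this illustrates only the indicator itself, and the function name is hypothetical.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref`, assuming minimisation in both coordinates."""
    # Only points that strictly dominate the reference point contribute.
    candidates = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]
    for x, y in candidates:
        if y < best_y:
            # Add the horizontal strip between the previous best y and this y.
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area

if __name__ == "__main__":
    front = [(1, 3), (2, 2), (3, 1)]
    print(hypervolume_2d(front, ref=(4, 4)))
```

A set spread evenly along a front covers more area than a clustered set of the same size, which is the intuition for using the indicator as a diversity measure.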
Subjective perceptual image quality can be assessed in lab studies by human observers. Objective image quality assessment (IQA) refers to algorithms for estimation of the mean subjective quality ratings. Many such methods have been proposed, both for blind IQA in which no original reference image is available as well as for the full-reference case. We compared 8 state-of-the-art algorithms for blind IQA and showed that an oracle, able to predict the best performing method for any given input image, yields a hybrid method that could outperform even the best single existing method by a large margin. In this contribution we address the research question of whether established methods to learn such an oracle can improve blind IQA. We applied AutoFolio, a state-of-the-art system that trains an algorithm selector to choose a well-performing algorithm for a given instance. We also trained deep neural networks to predict the best method. Our results did not give a positive answer: algorithm selection did not yield a significant improvement over the single best method. Looking into the results in depth, we observed that the noise in images may have played a role in why our trained classifiers could not predict the oracle. This motivates the consideration of noisiness in IQA methods, a property that has so far not been observed and that opens up several interesting new research questions and applications.
Renewable energy, such as ocean wave energy, plays a pivotal role in addressing the tremendous growth of global energy demand. It is expected that wave energy will be one of the fastest-growing energy resources in the next decade, offering an enormous potential source of sustainable energy. This research investigates the placement optimization of oscillating buoy-type wave energy converters (WECs). The design of a wave farm consisting of an array of fully submerged three-tether buoys is evaluated. In a wave farm, buoy positions have a notable impact on the farm's output. Optimizing the buoy positions is a challenging research problem because of very complex interactions (constructive and destructive) between buoys. The main purpose of this research is to maximize the power output of the farm through the placement of buoys in a size-constrained environment. This paper proposes a new hybrid approach that combines a heuristic local search with a numerical optimization method that utilizes a knowledge-based surrogate power model. We compare the proposed hybrid method with other state-of-the-art search methods in five different wave scenarios -- one simplified irregular wave model and four real wave climates. Our method considerably outperforms all previous heuristic methods in terms of both the quality of achieved solutions and the convergence rate of the search in all tested wave regimes.
Ocean wave energy is a source of renewable energy that has gained much attention for its potential to contribute significantly to meeting the global energy demand. In this research, we investigate the problem of maximising the energy delivered by farms of wave energy converters (WECs). We consider state-of-the-art fully submerged three-tether converters deployed in arrays. The goal of this work is to use heuristic search to optimise the power output of arrays in a size-constrained environment by configuring WEC locations and the power-take-off (PTO) settings for each WEC. Modelling the complex hydrodynamic interactions in wave farms is expensive, which constrains search to only a few thousand model evaluations. We explore a variety of heuristic approaches including cooperative and hybrid methods. The effectiveness of these approaches is assessed in two real wave scenarios (Sydney and Perth) with farms of two different scales. We find that a combination of symmetric local search and Nelder-Mead Simplex direct search, coupled with a back-tracking optimization strategy, is able to outperform previously defined search techniques by up to 3%.
This research proposes a novel indicator-based hybrid evolutionary approach that combines approximate and exact algorithms. We apply it to a new bi-criteria formulation of the travelling thief problem, which is known to the Evolutionary Computation community as a benchmark multi-component optimisation problem that interconnects two classical NP-hard problems: the travelling salesman problem and the 0-1 knapsack problem. Our approach employs the exact dynamic programming algorithm for the underlying Packing-While-Travelling (PWT) problem as a subroutine within a bi-objective evolutionary algorithm. This design takes advantage of the data extracted from Pareto fronts generated by the dynamic program to achieve better solutions. Furthermore, we develop a number of novel indicators and selection mechanisms to strengthen the synergy of the two algorithmic components of our approach. The results of computational experiments show that the approach is capable of outperforming the state-of-the-art results for the single-objective case of the problem.
Many evolutionary and constructive heuristic approaches have been introduced in order to solve the Traveling Thief Problem (TTP). However, the accuracy of such approaches is unknown due to their inability to find global optima. In this paper, we propose three exact algorithms and a hybrid approach to the TTP. We compare these with state-of-the-art approaches to gather a comprehensive overview on the accuracy of heuristic methods for solving small TTP instances.
Wind energy plays an increasing role in the supply of energy world-wide. The energy output of a wind farm is highly dependent on the weather conditions present at the wind farm. If the output can be predicted more accurately, energy suppliers can coordinate the collaborative production of different energy sources more efficiently to avoid costly overproduction. With this paper, we take a computer science perspective on energy prediction based on weather data and analyze the important parameters as well as their correlation with the energy output. To deal with the interaction of the different parameters we use symbolic regression based on the genetic programming tool DataModeler. Our studies are carried out on publicly available weather and energy data for a wind farm in Australia. We reveal the correlation of the different variables with the energy output. The model obtained for energy prediction gives a very reliable prediction of the energy output for newly given weather data.
With this paper, we contribute to the understanding of ant colony optimization (ACO) algorithms by formally analyzing their runtime behavior. We study simple MAX-MIN ant systems on the class of linear pseudo-Boolean functions defined on binary strings of length $n$. Our investigations point out how the progress according to function values is stored in pheromone. We provide a general upper bound of $O((n^3 \log n)/\rho)$ for two ACO variants on all linear functions, where $\rho$ determines the pheromone update strength. Furthermore, we show improved bounds for two well-known linear pseudo-Boolean functions called OneMax and BinVal and give additional insights using an experimental study.
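To make the role of $\rho$ concrete, here is a minimal Python sketch of a MAX-MIN ant system on OneMax in its commonly described single-ant form: one pheromone value per bit, bounded to $[1/n, 1 - 1/n]$, reinforced towards the best-so-far solution with strength $\rho$. The abstract does not define the two variants it analyzes, so the `strict` flag (accept only strict improvements versus also accepting equal fitness), the initial pheromone value of 0.5, and the iteration cap are assumptions for this example.

```python
import random

def one_max(x):
    """Number of 1-bits in x."""
    return sum(x)

def mmas(n, rho, strict=True, seed=0, max_iters=100_000):
    """Single-ant MAX-MIN ant system on OneMax (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = 1.0 / n, 1.0 - 1.0 / n          # pheromone bounds
    tau = [0.5] * n                           # tau[i]: probability of setting bit i to 1
    best = [1 if rng.random() < t else 0 for t in tau]
    f_best = one_max(best)
    iters = 0
    while f_best < n and iters < max_iters:
        # Evaporate all pheromones and reinforce the bits of the best-so-far solution.
        tau = [min((1 - rho) * t + rho, hi) if b else max((1 - rho) * t, lo)
               for t, b in zip(tau, best)]
        # Construct a new solution by sampling each bit from its pheromone value.
        x = [1 if rng.random() < t else 0 for t in tau]
        fx = one_max(x)
        if fx > f_best or (not strict and fx == f_best):
            best, f_best = x, fx
        iters += 1
    return best, f_best, iters, tau

if __name__ == "__main__":
    solution, fitness, iterations, pheromones = mmas(30, rho=0.5, seed=1)
    print(fitness, iterations)
```

A small $\rho$ makes the pheromones change slowly, so progress in function value takes longer to be stored; this is consistent with the $1/\rho$ factor in the upper bound above.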
Diversity plays a crucial role in evolutionary computation. While diversity has been mainly used to prevent the population of an evolutionary algorithm from premature convergence, the use of evolutionary algorithms to obtain a diverse set of solutions has gained increasing attention in recent years. Diversity optimization in terms of features of the underlying problem makes it possible to obtain a better understanding of possible solutions to the problem at hand and can be used for algorithm selection when dealing with combinatorial optimization problems such as the Traveling Salesperson Problem. We explore the use of the star-discrepancy measure to guide the diversity optimization process of an evolutionary algorithm. In our experimental investigations, we consider our discrepancy-based diversity optimization approaches for evolving diverse sets of images as well as instances of the Traveling Salesperson Problem where a local search is not able to find near-optimal solutions. Our experimental investigations comparing three diversity optimization approaches show that a discrepancy-based diversity optimization approach using a tie-breaking rule based on weighted differences to surrounding feature points provides the best results in terms of the star-discrepancy measure.
Over the past 30 years, many researchers in the field of evolutionary computation have put a lot of effort into introducing various approaches for solving hard problems. Most of these problems have been inspired by major industries, so that solving them, by providing either optimal or near-optimal solutions, was of major significance. Indeed, this was a very promising trajectory, as advances in these problem-solving approaches could add value to major industries. In this paper we revisit this trajectory to find out whether the attempts that started three decades ago are still aligned with the same goal, as the complexity of real-world problems has increased significantly. We present some examples of modern real-world problems, discuss why they might be difficult to solve, and ask whether there is any mismatch between these examples and the problems that are investigated in the evolutionary computation area.
The installed amount of renewable energy has expanded massively in recent years. Wave energy, with its high capacity factors, has great potential to complement established sources of solar and wind energy. This study explores the problem of optimising the layout of advanced, three-tether wave energy converters in a size-constrained farm in a numerically modelled ocean environment. Simulating and computing the complicated hydrodynamic interactions in wave farms can be computationally costly, which limits optimisation methods to just a few thousand evaluations. To deal with this expensive optimisation problem, an adaptive neuro-surrogate optimisation (ANSO) method is proposed that consists of a surrogate Recurrent Neural Network (RNN) model trained with a very limited number of observations. This model is coupled with a fast meta-heuristic optimiser for adjusting the model's hyper-parameters. The trained model is applied using a greedy local search with a backtracking optimisation strategy. To evaluate the performance of the proposed approach, some of the more popular and successful Evolutionary Algorithms (EAs) are compared in four real wave scenarios (Sydney, Perth, Adelaide and Tasmania). Experimental results show that the adaptive neuro model is competitive with other optimisation methods in terms of total harnessed power output and faster in terms of total computational costs.
Substitution Boxes (S-boxes) are nonlinear objects often used in the design of cryptographic algorithms. The design of high-quality S-boxes is an interesting problem that attracts a lot of attention. Many attempts have been made in recent years to use heuristics to design S-boxes, but the results were often far from the best previously known ones. Unfortunately, most of the effort went into exploring different algorithms and fitness functions, while little attention has been given to understanding why this problem is so difficult for heuristics. In this paper, we conduct a fitness landscape analysis to better understand why this problem can be difficult. Among other things, we find that almost every initial starting point has its own local optimum, even though the networks are highly interconnected.