This thesis investigates the use of problem-specific knowledge to enhance a genetic algorithm approach to multiple-choice optimisation problems. It shows that such information can significantly enhance performance, but that the choice of information and the way it is included are important factors for success.

* Diplomarbeit, in German, Universitaet Mannheim, 1996
This thesis investigates the use of problem-specific knowledge to enhance a genetic algorithm approach to multiple-choice optimisation problems. It shows that such information can significantly enhance performance, but that the choice of information and the way it is included are important factors for success. Two multiple-choice problems are considered. The first is constructing a feasible nurse roster that considers as many requests as possible. In the second problem, shops are allocated to locations in a mall subject to constraints and maximising the overall income. Genetic algorithms are chosen for their well-known robustness and ability to solve large and complex discrete optimisation problems. However, a survey of the literature reveals room for further research into generic ways to include constraints into a genetic algorithm framework. Hence, the main theme of this work is to balance feasibility and cost of solutions. In particular, co-operative co-evolution with hierarchical sub-populations, problem structure exploiting repair schemes and indirect genetic algorithms with self-adjusting decoder functions are identified as promising approaches. The research starts by applying standard genetic algorithms to the problems and explaining the failure of such approaches due to epistasis. To overcome this, problem-specific information is added in a variety of ways, some of which are designed to increase the number of feasible solutions found whilst others are intended to improve the quality of such solutions. As well as a theoretical discussion as to the underlying reasons for using each operator, extensive computational experiments are carried out on a variety of data. These show that the indirect approach relies less on problem structure and hence is easier to implement and superior in solution quality.

* PhD thesis, University of Wales (Swansea), 1999
* 258 pages, PhD thesis, University of Wales (Swansea)
In recent years genetic algorithms have emerged as a useful tool for the heuristic solution of complex discrete optimisation problems. In particular there has been considerable interest in their use in tackling problems arising in the areas of scheduling and timetabling. However, the classical genetic algorithm paradigm is not well equipped to handle constraints and successful implementations usually require some sort of modification to enable the search to exploit problem specific knowledge in order to overcome this shortcoming. This paper is concerned with the development of a family of genetic algorithms for the solution of a nurse rostering problem at a major UK hospital. The hospital is made up of wards of up to 30 nurses. Each ward has its own group of nurses whose shifts have to be scheduled on a weekly basis. In addition to fulfilling the minimum demand for staff over three daily shifts, nurses' wishes and qualifications have to be taken into account. The schedules must also be seen to be fair, in that unpopular shifts have to be spread evenly amongst all nurses, and other restrictions, such as team nursing and special conditions for senior staff, have to be satisfied. The basis of the family of genetic algorithms is a classical genetic algorithm consisting of n-point crossover, single-bit mutation and a rank-based selection. The solution space consists of all schedules in which each nurse works the required number of shifts, but the remaining constraints, both hard and soft, are relaxed and penalised in the fitness function. The talk will start with a detailed description of the problem and the initial implementation and will go on to highlight the shortcomings of such an approach, in terms of the key element of balancing feasibility, i.e. covering the demand and work regulations, and quality, as measured by the nurses' preferences. 
A series of experiments involving parameter adaptation, niching, intelligent weights, delta coding, local hill climbing, migration and special selection rules will then be outlined and it will be shown how a series of these enhancements were able to eradicate these difficulties. Results based on several months' real data will be used to measure the impact of each modification, and to show that the final algorithm is able to compete with a tabu search approach currently employed at the hospital. The talk will conclude with some observations as to the overall quality of this approach to this and similar problems.
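The idea of relaxing constraints and penalising them in the fitness function, as described in the abstract above, can be illustrated with a minimal sketch. The encoding (one shift-pattern index per nurse), the function name `penalised_cost`, and all parameter names are assumptions made for illustration; the abstract does not give the actual fitness function, and only the demand-cover constraint is modelled here.

```python
from typing import Sequence


def penalised_cost(
    roster: Sequence[int],
    patterns: Sequence[Sequence[int]],
    preference_cost: Sequence[Sequence[float]],
    demand: Sequence[int],
    penalty_weight: float,
) -> float:
    """Return preference cost plus a weighted penalty for uncovered demand.

    roster[i]             index of the shift pattern assigned to nurse i
    patterns[p][s]        1 if pattern p covers shift s, else 0
    preference_cost[i][p] how much nurse i dislikes pattern p (lower is better)
    demand[s]             minimum number of nurses required on shift s
    """
    cover = [0] * len(demand)
    total_preference = 0.0
    for nurse, pattern_index in enumerate(roster):
        total_preference += preference_cost[nurse][pattern_index]
        for shift, works in enumerate(patterns[pattern_index]):
            cover[shift] += works
    # Infeasibility is not forbidden; it is measured and added to the cost.
    shortfall = sum(max(0, d - c) for d, c in zip(demand, cover))
    return total_preference + penalty_weight * shortfall


# Hypothetical two-nurse, two-shift instance.
patterns = [[1, 0], [0, 1]]
preference_cost = [[0.0, 5.0], [0.0, 5.0]]
demand = [1, 1]
print(penalised_cost([0, 0], patterns, preference_cost, demand, 10.0))
print(penalised_cost([0, 1], patterns, preference_cost, demand, 10.0))
```

In this toy instance the roster that every nurse prefers leaves one shift uncovered, so a sufficiently large `penalty_weight` makes the feasible roster cheaper. Choosing that weight is one face of the feasibility versus quality balance the abstract refers to.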

* Young Operational Research Conference 12, 1998
* 22 pages, Young Operational Research Conference 12
Over the last few years, more and more heuristic decision making techniques have been inspired by nature, e.g. evolutionary algorithms, ant colony optimisation and simulated annealing. More recently, a novel computational intelligence technique inspired by immunology has emerged, called Artificial Immune Systems (AIS). This immune system inspired technique has already been useful in solving some computational problems. In this keynote, we will very briefly describe the immune system metaphors that are relevant to AIS. We will then give some illustrative real-world problems suitable for AIS use and show a step-by-step algorithm walkthrough. A comparison of AIS to other well-known algorithms and areas for future work will round this keynote off. It should be noted that as AIS is still a young and evolving field, there is not yet a fixed algorithm template and hence actual implementations might differ somewhat from the examples given here.

* Invited Keynote Talk, Annual Operational Research Conference 46, York, UK, 2004
This paper presents a new type of genetic algorithm for the set covering problem. It differs from previous evolutionary approaches first because it is an indirect algorithm, i.e. the actual solutions are found by an external decoder function. The genetic algorithm itself provides this decoder with permutations of the solution variables and other parameters. Second, it will be shown that results can be further improved by adding another indirect optimisation layer. The decoder will not directly seek out low cost solutions but instead aims for good exploitable solutions. These are then post optimised by another hill-climbing algorithm. Although seemingly more complicated, we will show that this three-stage approach has advantages in terms of solution quality, speed and adaptability to new types of problems over more direct approaches. Extensive computational results are presented and compared to the latest evolutionary and other heuristic approaches to the same data instances.
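The indirect scheme described above, in which the genetic algorithm evolves permutations and a separate decoder turns them into covers, might be sketched as follows. This is an illustrative simplification, not the decoder from the paper: the function names `decode` and `drop_redundant`, the greedy acceptance rule, and the simple redundancy-removal pass standing in for the hill-climbing stage are all assumptions.

```python
from typing import Dict, List, Sequence, Set


def decode(
    permutation: Sequence[int],
    columns: Dict[int, Set[int]],
    rows: Set[int],
) -> List[int]:
    """Build a cover by scanning columns in the order given by the permutation.

    A column is accepted only if it covers at least one still-uncovered row.
    """
    uncovered = set(rows)
    chosen: List[int] = []
    for col in permutation:
        if not uncovered:
            break
        if columns[col] & uncovered:
            chosen.append(col)
            uncovered -= columns[col]
    return chosen


def drop_redundant(
    chosen: Sequence[int],
    columns: Dict[int, Set[int]],
    costs: Dict[int, float],
    rows: Set[int],
) -> List[int]:
    """Post-optimise a cover: try removing columns, most expensive first."""
    kept = list(chosen)
    for col in sorted(chosen, key=lambda c: costs[c], reverse=True):
        others = [c for c in kept if c != col]
        covered = set().union(*(columns[c] for c in others))
        if rows <= covered:
            kept = others
    return kept


# Hypothetical instance: three rows, three candidate columns.
rows = {1, 2, 3}
columns = {0: {1, 2}, 1: {2, 3}, 2: {1, 2, 3}}
costs = {0: 1.0, 1: 1.0, 2: 3.0}

for perm in ([0, 1, 2], [0, 2, 1]):
    cover = drop_redundant(decode(perm, columns, rows), columns, costs, rows)
    print(perm, cover, sum(costs[c] for c in cover))
```

The genetic algorithm would treat the total cost of the decoded, post-optimised cover as the fitness of the permutation. Because every permutation decodes to a feasible cover whenever one exists, the search never has to handle infeasible solutions directly.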

* Journal of the Operational Research Society, 53(10), pp 1118-1126, 2002
This paper combines the idea of a hierarchical distributed genetic algorithm with different inter-agent partnering strategies. Cascading clusters of sub-populations are built from bottom up, with higher-level sub-populations optimising larger parts of the problem. Hence higher-level sub-populations search a larger search space with a lower resolution whilst lower-level sub-populations search a smaller search space with a higher resolution. The effects of different partner selection schemes amongst the agents on solution quality are examined for two multiple-choice optimisation problems. It is shown that partnering strategies that exploit problem-specific knowledge are superior and can counter inappropriate (sub-) fitness measurements.

* Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), late-breaking papers volume, pp 1-8, San Francisco, USA
We combine Artificial Immune Systems (AIS) technology with Collaborative Filtering (CF) and use it to build a movie recommendation system. We already know that Artificial Immune Systems work well as movie recommenders from previous work by Cayzer and Aickelin [3, 4, 5]. Here our aim is to investigate the effect of different affinity measure algorithms for the AIS. Two different affinity measures, Kendall's Tau and Weighted Kappa, are used to calculate the correlation coefficients for the movie recommender. We compare the results with those published previously and show that Weighted Kappa is more suitable than others for movie problems. We also show that AIS are generally robust movie recommenders and that, as long as a suitable affinity measure is chosen, results are good.
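As an illustration of what a rank-correlation affinity measure between two users could look like, here is a minimal sketch of Kendall's tau computed over the movies both users have rated. It uses the tau-a variant, which counts tied pairs as neither concordant nor discordant; the paper may use a different variant. The function name and the dictionary-based rating format are assumptions.

```python
from itertools import combinations
from typing import Dict, Optional


def kendall_tau_affinity(
    ratings_a: Dict[str, float],
    ratings_b: Dict[str, float],
) -> Optional[float]:
    """Kendall's tau-a over the movies rated by both users.

    Returns a value in [-1, 1], or None when fewer than two movies overlap.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None
    concordant = 0
    discordant = 0
    for m1, m2 in combinations(common, 2):
        product = (ratings_a[m1] - ratings_a[m2]) * (ratings_b[m1] - ratings_b[m2])
        if product > 0:
            concordant += 1  # both users order the pair the same way
        elif product < 0:
            discordant += 1  # the users order the pair oppositely
    n_pairs = len(common) * (len(common) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ratings on a 1-5 scale.
alice = {"A": 5, "B": 3, "C": 1}
bob = {"A": 4, "B": 2, "C": 1, "D": 5}
carol = {"A": 1, "B": 3, "C": 5}

print(kendall_tau_affinity(alice, bob))
print(kendall_tau_affinity(alice, carol))
```

Only the relative ordering of ratings matters here, so two users who use different parts of the rating scale but rank movies identically still receive the maximum affinity.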

* Proceedings of the 5th International Conference on Recent Advances in Soft Computing (RASC 2004), Nottingham, UK
Some argue that biologically inspired algorithms are the future of solving difficult problems in computer science. Others strongly believe that the future lies in the exploration of mathematical foundations of problems at hand. The field of computer security tends to accept the latter view as a more appropriate approach due to its more workable validation and verification possibilities. The lack of rigorous scientific practices prevalent in biologically inspired security research does not aid in presenting bio-inspired security approaches as a viable way of dealing with complex security problems. This chapter introduces a biologically inspired algorithm, called the Self Organising Map (SOM), that was developed by Teuvo Kohonen in 1981. Since the algorithm's inception it has been scrutinised by the scientific community and analysed in more than 4000 research papers, many of which dealt with various computer security issues, from anomaly detection and analysis of executables all the way to wireless network monitoring. In this chapter a review of security related SOM research undertaken in the past is presented and analysed. The algorithm's biological analogies are detailed and the authors' views on the future possibilities of this successful bio-inspired approach are given. The SOM algorithm's close relation to a number of vital functions of the human brain and the emergence of multi-core computer architectures are the two main reasons behind our assumption that the future of the SOM algorithm and its variations is promising, notably in the field of computer security.

* pp. 1-30, Computer Security: Intrusion, Detection and Prevention, 2009
Adverse drug reactions (ADRs) are a widely recognised public health concern. ADRs are one of the most common reasons for withdrawing drugs from the market. Prescription event monitoring (PEM) is an important approach to detecting adverse drug reactions. The main difficulty with this method is how to automatically extract the medical events or side effects from high-throughput medical data collected in day-to-day clinical practice. In this study we propose a novel concept, the feature matrix, to detect ADRs. The feature matrix, which is extracted from big medical data in The Health Improvement Network (THIN) database, is created to characterize the medical events for patients who take drugs. The feature matrix provides a foundation for handling the irregular and big medical data. Feature selection methods are then applied to the feature matrix to detect the significant features. Finally, the ADRs can be located based on the significant features. The experiments are carried out on three drugs: Atorvastatin, Alendronate, and Metoclopramide. Major side effects for each drug are detected and better performance is achieved compared to other computerized methods. Because the detected ADRs are based on computerized methods, further investigation is needed.

* International Journal of Information Technology and Computer Science (IJITCS), in print, 2014
Adverse drug reactions (ADRs) are a widely recognised public health concern. In this study we propose an original approach to detecting ADRs using a feature matrix and feature selection. The experiments are carried out on the drug Simvastatin. Major side effects for the drug are detected and better performance is achieved compared to other computerized methods. Because the detected ADRs are based on the computerized method, further investigation is needed.

* Second International Conference on Business Computing and Global Informatization (BCGIN), pp 820-823, 2012
Many machine learning algorithms assume that all input samples are independently and identically distributed from some common distribution on either the input space X, in the case of unsupervised learning, or the input and output space X x Y, in the case of supervised and semi-supervised learning. In recent years the relaxation of this assumption has been explored, and the importance of incorporating additional information within machine learning algorithms has become more apparent. Traditionally such fusion of information was the domain of semi-supervised learning. More recently the inclusion of knowledge from separate hypothetical spaces has been proposed by Vapnik as part of the supervised setting. In this work we are interested in exploring Vapnik's idea of master-class learning and the associated learning using privileged information, but within the unsupervised setting. Adopting the advanced supervised learning paradigm for the unsupervised setting prompts an investigation into the difference between privileged and technical data. By means of our proposed aRi-MAX method, the stability of the KMeans algorithm is improved and identification of the best clustering solution is achieved on an artificial dataset. Subsequently an information theoretic dot product based algorithm called P-Dot is proposed. This method has the ability to utilize a wide variety of clustering techniques, individually or in combination, while fusing privileged and technical data for improved clustering. Application of the P-Dot method to the task of digit recognition confirms our findings in a real-world scenario.

* Information Sciences 194, 4-23, 2012
Memory can be defined as the ability to retain and recall information in a diverse range of forms. It is a vital component of the way in which we as human beings operate on a day to day basis. Given a particular situation, decisions are made and actions undertaken in response to that situation based on our memory of related prior events and experiences. By utilising our memory we can anticipate the outcome of our chosen actions to avoid unexpected or unwanted events. In addition, as we subtly alter our actions and recognise altered outcomes we learn and create new memories, enabling us to improve the efficiency of our actions over time. However, as this process occurs so naturally in the subconscious its importance is often overlooked.

* University of Nottingham, 2005
Innate immunity now occupies a central role in immunology. However, artificial immune system models have largely been inspired by adaptive not innate immunity. This paper reviews the biological principles and properties of innate immunity and, adopting a conceptual framework, asks how these can be incorporated into artificial models. The aim is to outline a meta-framework for models of innate immunity.

* Proceedings of the 4th International Conference on Artificial Immune Systems (ICARIS 2005), Lecture Notes in Computer Science 3627, Banff, Canada, p 112-125
* 14 pages, 5 figures, 2 tables, 4th International Conference on Artificial Immune Systems (ICARIS 2005)
A new emerging paradigm of Uncertain Risk of Suspicion, Threat and Danger, observed across the field of information security, is described. Based on this paradigm a novel approach to anomaly detection is presented. Our approach is based on a simple yet powerful analogy from the innate part of the human immune system, the Toll-Like Receptors. We argue that such receptors incorporated as part of an anomaly detector enhance the detector's ability to distinguish normal and anomalous behaviour. In addition we propose that Toll-Like Receptors enable the classification of detected anomalies based on the types of attacks that perpetrate the anomalous behaviour. Classification of such type is either missing in existing literature or is not fit for the purpose of reducing the burden of an administrator of an intrusion detection system. For our model to work, we propose the creation of a taxonomy of the digital Acytota, based on which our receptors are created.

* Proceedings of the 2nd International Conference on Emerging Security Information, Systems and Technologies, Cap Esterel, France, p 287-293, 2008
* 7 pages, 4 figures, 1 table, 2nd International Conference on Emerging Security Information, Systems and Technologies
The Dendritic Cell Algorithm is an immune-inspired algorithm originally based on the function of natural dendritic cells. The original instantiation of the algorithm is a highly stochastic algorithm. While the performance of the algorithm is good when applied to large real-time datasets, it is difficult to analyse due to the number of random-based elements. In this paper a deterministic version of the algorithm is proposed, implemented and tested using a port scan dataset to provide a controllable system. This version consists of a controllable number of parameters, which are experimented with in this paper. In addition, the effects of the use of time windows and of variation in the number of cells are examined, both of which are shown to influence the algorithm. Finally, a novel metric for the assessment of the algorithm's output is introduced and proves to be a more sensitive metric than the metric used with the original Dendritic Cell Algorithm.

* Proceedings of the 7th International Conference on Artificial Immune Systems (ICARIS 2008), Phuket, Thailand, p 291-303
* 12 pages, 1 algorithm, 1 figure, 2 tables, 7th International Conference on Artificial Immune Systems (ICARIS 2008)
Analysis of data without labels is commonly subject to scrutiny by unsupervised machine learning techniques. Such techniques provide more meaningful representations, useful for better understanding of a problem at hand, than by looking only at the data itself. Although abundant expert knowledge exists in many areas where unlabelled data is examined, such knowledge is rarely incorporated into automatic analysis. Incorporation of expert knowledge is frequently a matter of combining multiple data sources from disparate hypothetical spaces. In cases where such spaces belong to different data types, this task becomes even more challenging. In this paper we present a novel immune-inspired method that enables the fusion of such disparate types of data for a specific set of problems. We show that our method provides a better visual understanding of one hypothetical space with the help of data from another hypothetical space. We believe that our model has implications for the field of exploratory data analysis and knowledge discovery.

* Proceedings of the 10th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 09), Lecture Notes in Computer Science 5788, Burgos, Spain, 2009, p208-218
* 11 pages, 2 figures, 10th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 09)
The premise of automated alert correlation is to accept that false alerts from a low level intrusion detection system are inevitable and use attack models to explain the output in an understandable way. Several algorithms exist for this purpose which use attack graphs to model the ways in which attacks can be combined. These algorithms can be classified into two broad categories, namely scenario-graph approaches, which create an attack model starting from a vulnerability assessment, and type-graph approaches, which rely on an abstract model of the relations between attack types. Some research into improving the efficiency of type-graph correlation has been carried out, but this research has ignored the hypothesizing of missing alerts. Our work presents a novel type-graph algorithm which unifies correlation and hypothesizing into a single operation. Our experimental results indicate that the approach is extremely efficient in the face of intensive alerts and produces compact output graphs comparable to other techniques.

* Proceedings of the 4th International Conference on Information Systems Security (ICISS 2008), Lecture Notes in Computer Science 5352, Hyderabad, India, 2008, p173-187
* 15 pages, 3 tables, (ICISS 2008)
In a previous paper the authors argued the case for incorporating ideas from innate immunity into artificial immune systems (AISs) and presented an outline for a conceptual framework for such systems. A number of key general properties observed in the biological innate and adaptive immune systems were highlighted, and how such properties might be instantiated in artificial systems was discussed in detail. The next logical step is to take these ideas and build a software system with which AISs with these properties can be implemented and experimentally evaluated. This paper reports on the results of that step - the libtissue system.

* Proceedings of the Workshop on Artificial Immune Systems and Immune System Modelling (AISB06), Bristol, UK, p 18-19, 2006
* 8 pages, 5 figures, 4 tables, Workshop on Artificial Immune Systems and Immune System Modelling (AISB06)
This paper reports on continuing research into the modelling of an order picking process within a Crossdocking distribution centre using Simulation Optimisation. The aim of this project is to optimise a discrete event simulation model and to understand factors that affect finding its optimal performance. Our initial investigation revealed that the precision of the selected simulation output performance measure and the number of replications required for the evaluation of the optimisation objective function through simulation influence the ability of the optimisation technique. We experimented with Common Random Numbers in order to improve the precision of our simulation output performance measure, and intended to use the number of replications utilised for this purpose as the initial number of replications for the optimisation of our Crossdocking distribution centre simulation model. Our results demonstrate that we can improve the precision of our selected simulation output performance measure value using Common Random Numbers at various levels of replications. Furthermore, after optimising our Crossdocking distribution centre simulation model, we are able to achieve optimal performance using fewer simulation runs for the simulation model which uses Common Random Numbers as compared to the simulation model which does not use Common Random Numbers.
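The effect of Common Random Numbers on precision can be illustrated with a toy comparison of two configurations. The model below is a deliberately simple, hypothetical stand-in (total exponential workload divided by the number of pickers) and is not the Crossdocking model from the paper; the function names and seeds are assumptions chosen for illustration.

```python
import random
import statistics
from typing import List


def simulate_picking_time(n_pickers: int, rng: random.Random, n_orders: int = 50) -> float:
    """Toy model: total order workload shared equally among the pickers."""
    total_work = sum(rng.expovariate(1.0) for _ in range(n_orders))
    return total_work / n_pickers


def difference_samples(replications: int, common: bool) -> List[float]:
    """Per-replication difference in picking time between 2 and 3 pickers.

    With common=True both configurations see the same random number stream
    in each replication; otherwise they use independent streams.
    """
    diffs = []
    for rep in range(replications):
        rng_a = random.Random(rep)
        rng_b = random.Random(rep if common else 10_000 + rep)
        diffs.append(simulate_picking_time(2, rng_a) - simulate_picking_time(3, rng_b))
    return diffs


var_crn = statistics.variance(difference_samples(200, common=True))
var_ind = statistics.variance(difference_samples(200, common=False))
print(f"variance with common random numbers: {var_crn:.3f}")
print(f"variance with independent streams:   {var_ind:.3f}")
```

Because both configurations face the same workload within a replication, much of the noise cancels in the difference, so the same precision can be reached with fewer replications. In a real discrete event model, the streams also need to be synchronised so that each random number serves the same purpose in both configurations.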

* Proceedings of 2008 International Simulation Multi-Conference (SCS), San Diego, USA, 434-439
* 6 pages, 7 tables, 2008 International Simulation Multi-Conference (SCS), San Diego, USA
Artificial immune systems (AISs) to date have generally been inspired by naive biological metaphors. This has limited the effectiveness of these systems. In this position paper two ways in which AISs could be made more biologically realistic are discussed. We propose that AISs should draw their inspiration from organisms which possess only innate immune systems, and that AISs should employ systemic models of the immune system to structure their overall design. An outline of plant and invertebrate immune systems is presented, and a number of contemporary research that more biologically-realistic AISs could have is also discussed.

* Proceedings of the 6th International Conference on Artificial Immune Systems (ICARIS2007), Lecture Notes in Computer Science 4628, Santos, Brazil
* 12 pages, 6th International Conference on Artificial Immune Systems (ICARIS2007), Lecture Notes in Computer Science 4628, Santos, Brazil