PhD Project Descriptions:
- Early detection of anomalies in multi-variate longitudinal data
- Dr H Mitchell, Professor S McLoone, Dr K J Cairns
Longitudinal data sources exist across many research fields and involve repeated measurements of covariates on a set of subjects over time. Detecting anomalies in these patterns, such as short-term outliers or more subtle trends and interactions over longer time scales, can be a key objective and is important in applications such as credit card fraud detection, medical diagnostics, incipient fault detection in industrial equipment, structural defect detection, network intrusion detection, and text mining. Outliers may be seen as indicators that a process is leaving its stable state, and thus they should ideally be detected as early as possible.
This research project will focus on the computational and methodological challenges that exist in anomaly detection. The PhD will commence with a review of anomaly detection methods, assessing their ability to handle missingness, particularly when the data are high-dimensional and have a low signal-to-noise ratio. In certain cases, the implications of combining multiple imputation with methods which require complete information will be evaluated. Combining results from many weak learners into one high-quality ensemble predictor may also be considered.
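As a point of reference for the kind of method such a review might start from, the following is a minimal sketch of a robust outlier flag for a single series with missing readings. It is not the project's method: the function names, the threshold of 3.5 and the example readings are illustrative assumptions, and missing values are simply skipped rather than imputed.

```python
from statistics import median


def robust_z_scores(values):
    """Return a robust z-score per reading; missing readings (None) stay None."""
    observed = [v for v in values if v is not None]
    centre = median(observed)
    # Median absolute deviation, rescaled by the usual consistency factor
    # so that it estimates the standard deviation under normality.
    mad = 1.4826 * median(abs(v - centre) for v in observed)
    if mad == 0:
        raise ValueError("MAD is zero; readings are too concentrated to scale")
    return [None if v is None else (v - centre) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Indices of readings whose absolute robust z-score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if z is not None and abs(z) > threshold]


# Hypothetical sensor readings with two missing values and one spike.
readings = [10.1, 9.8, None, 10.3, 10.0, 14.9, 9.9, None, 10.2]
print(flag_anomalies(readings))
```

Using the median and the median absolute deviation keeps the centre and scale from being dragged towards the very outliers being sought, which is why this baseline is often preferred to a mean and standard deviation rule.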
The primary application area to be considered by the PhD research studentship is semiconductor manufacturing, using data obtained from industrial partners Intel/Seagate. Semiconductor manufacturing is an expensive multi-step process used to create the integrated circuits present in everyday electronic devices. Numerous processing steps can be applied to wafers and there is increased demand for metrology between the various steps to verify that wafers have not been damaged by previous processing steps. Virtual metrology can eliminate or reduce costly metrology steps by using statistical methods to predict measurements based on previous metrology outputs and information from current and previous steps of fabrication.
Krylov subspace methods have already demonstrated their capability and efficiency in an alternative computationally-intensive application in theoretical physics, and may also be useful in the high-dimensional virtual metrology setting. For example, the PhD aims to evaluate the use of Krylov subspace methods within algorithms such as the hidden Markov model.
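For readers unfamiliar with the term, the conjugate gradient method is one of the best-known Krylov subspace methods. The sketch below solves a small symmetric positive-definite system using only matrix-vector products, which is the property that makes such methods attractive in high dimensions. How these methods would be embedded in a hidden Markov model is the subject of the project and is not shown here; the function name and the 2 x 2 example are illustrative assumptions.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A, given only x -> A x."""
    n = len(b)
    max_iter = 10 * n if max_iter is None else max_iter
    x = [0.0] * n
    r = list(b)  # residual b - A x for the starting guess x = 0
    p = list(r)  # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        # New direction: residual plus a multiple of the previous direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical 2 x 2 example; the matrix is never factorised, only multiplied.
A = [[4.0, 1.0], [1.0, 3.0]]


def matvec(v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


solution = conjugate_gradient(matvec, [1.0, 2.0])
print(solution)
```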
- A robust approach to multivariate joint modelling
- Dr L McCrink, Professor A H Marshall
In recent years it has become common practice to gather longitudinal data in which information on multiple longitudinal responses is collected concurrently. These longitudinal responses frequently impact time-to-event processes, with the related survival data often being collected alongside the repeated measurements. Consequently, joint modelling techniques which simultaneously analyse a longitudinal and survival process have recently been extended to handle multivariate longitudinal responses.
However, such developments commonly assume normality for the longitudinal random terms, an assumption that can be violated by the presence of longitudinal outliers, introducing bias into the analysis. Previous research has verified such negative impacts on the estimation of parameters, and thus on the interpretation of the estimated joint model, when analysing a single longitudinal process within a joint model setting.
This research proposes the development of a more robust joint modelling approach which down-weights the impact of longitudinal outliers within multiple longitudinal processes simultaneously. This longitudinal sub-model would then be linked to the survival process through a joint likelihood approach. The research may be applied in a wide range of areas, including medical research and astrostatistics.
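To illustrate what down-weighting means in practice, the sketch below fits a location and scale under a Student's t model by the standard EM iteration. The description does not state which distribution the project would use, so the t model, the fixed degrees of freedom and the example data are all assumptions made purely for illustration.

```python
def t_location_scale(values, dof=4.0, n_iter=100):
    """EM estimates of location and scale under a Student's t model with fixed dof."""
    n = len(values)
    mu = sum(values) / n
    sigma2 = sum((v - mu) ** 2 for v in values) / n
    weights = [1.0] * n
    for _ in range(n_iter):
        # E-step: observations far from the current centre receive small weights.
        weights = [(dof + 1.0) / (dof + (v - mu) ** 2 / sigma2) for v in values]
        # M-step: weighted mean, then weighted variance about the new mean.
        mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        sigma2 = sum(w * (v - mu) ** 2 for w, v in zip(weights, values)) / n
    return mu, sigma2 ** 0.5, weights


# Hypothetical repeated measurements with one outlying value.
measurements = [4.9, 5.1, 5.0, 5.2, 4.8, 12.0]
location, scale, weights = t_location_scale(measurements)
print(location, weights[-1])
```

Under a normal model every observation carries weight one, so the single outlying value pulls the mean well above five; under the t model its weight shrinks towards zero and the location estimate stays close to the bulk of the data.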
- Robust joint modelling incorporating a stochastic component
- Dr L McCrink, Professor A H Marshall
With advances in technology, it is becoming common practice for repeated measurements of patients to be collected over time, regularly alongside their survival data. This, together with improvements in computing, has driven recent developments in techniques to analyse longitudinal data, a field that has received growing attention over the past two decades. One such technique under active development is the joint modelling approach, which simultaneously accounts for the relationship commonly found between medical longitudinal data and patients’ survival.
These joint models not only simultaneously estimate the parameters from the longitudinal and survival components but also account for both outlying individuals and outlying observations within individuals. The focus of this research would be the further development of such robust techniques.
To do so, this research would incorporate a stochastic component within the linear mixed effects model that represents the longitudinal process. This would represent fluctuations in an individual’s own average longitudinal response over time while also accounting for longitudinal outliers, a combination that would be novel within a robust joint model setting. In doing so, it would better represent the true longitudinal process of individuals and thus their survival, providing more precise estimates and interpretations.
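One way such a stochastic component is often written, given here only as an illustrative assumption and not as the project's specification, is to add a zero-mean stationary Gaussian process $W_i(t)$ to the linear mixed effects model for subject $i$:

```latex
\begin{align*}
y_i(t) &= x_i(t)^\top \beta + z_i(t)^\top b_i + W_i(t) + \varepsilon_i(t), \\
\operatorname{Cov}\{W_i(s), W_i(t)\} &= \sigma_W^2 \exp(-\phi\,|t - s|).
\end{align*}
```

Here $\beta$ are fixed effects, $b_i$ are subject-specific random effects and $\varepsilon_i(t)$ is measurement error. The exponential correlation function is one common choice among several; $\sigma_W^2$ controls the size of the within-subject fluctuations and $\phi$ how quickly they decorrelate over time.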
- Area-wide road traffic control using macroscopic traffic flow models
- Dr S Moutari, Professor A H Marshall
Area-wide traffic control refers to sets of actions that aim to coordinate traffic flow information in road networks in order to address frequent occurrences of congestion. When properly implemented, area-wide traffic control schemes can have significant impacts on sustainable transportation in terms of safety, economic efficiency, air quality, etc.
Recent research based on microscopic traffic simulation has shown that such schemes can achieve significant improvement over an optimized fixed-time control. Although microscopic models can be regarded as an appropriate response in some specific situations, they are impractical for large, crowded road networks. Macroscopic traffic models, which provide a global view of the traffic in the area, could be an appropriate alternative that overcomes the limitations of microscopic models.
Assessments of large urban road networks have recently become feasible owing to the availability of comprehensive sets of area-wide traffic monitoring data. The aim of this proposal is to use such data to derive robust area-wide traffic control schemes within the macroscopic framework.
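To make the term concrete, a classical macroscopic model is the first-order Lighthill-Whitham-Richards model, which tracks vehicle density along a road rather than individual vehicles. The sketch below is a minimal Godunov (cell-transmission style) discretisation on a ring road with a Greenshields flow-density relation. The choice of model, the parameter values and the function names are assumptions for illustration and are not taken from the proposal.

```python
def greenshields_flow(density, v_max=30.0, rho_max=0.2):
    """Flow (veh/s) for a given density (veh/m) under the Greenshields relation."""
    return v_max * density * (1.0 - density / rho_max)


def godunov_step(rho, dt, dx, v_max=30.0, rho_max=0.2):
    """Advance cell densities by one time step on a ring road."""
    rho_crit = rho_max / 2.0  # density at which flow is maximal
    n = len(rho)
    flux = []
    for i in range(n):
        # Flux across the interface between cell i and cell i + 1:
        # what cell i can send, capped by what the next cell can receive.
        demand = greenshields_flow(min(rho[i], rho_crit), v_max, rho_max)
        supply = greenshields_flow(max(rho[(i + 1) % n], rho_crit), v_max, rho_max)
        flux.append(min(demand, supply))
    # flux[i - 1] is the inflow to cell i (index -1 wraps around the ring).
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


# Hypothetical scenario: a dense queue in the middle of light traffic.
initial = [0.02] * 20 + [0.18] * 10 + [0.02] * 20
density = list(initial)
for _ in range(100):
    # dt * v_max / dx = 0.6, which satisfies the stability (CFL) condition.
    density = godunov_step(density, dt=2.0, dx=100.0)
print(max(density))
```

Because each update only needs the densities of neighbouring cells, the cost grows with the number of road cells rather than the number of vehicles, which is the practical appeal of macroscopic models for large networks.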
- Modelling Disease Misclassification using hidden Markov models
- Dr H Mitchell, Professor A H Marshall
Understanding the dynamic nature of a disease process can be vital in early detection, diagnosis and progression. Early disease detection can improve medical treatment, giving a greater chance of curing the patient, delaying disease progression and/or enhancing the patient’s quality of life. Disease states and the transitions between them can often be used to model disease progression; however, in some cases the disease state may be misclassified, either because true events are not directly observable or because of the measurement procedure.
Transitional models are often used to model disease progression where interest lies in the disease states and transition rates. To examine potential misclassification, hidden Markov models can be used.
Hidden Markov models are a family of versatile statistical models that have been used in a variety of applications from speech recognition through to financial fraud, with their use within healthcare modelling growing. The hidden Markov model is a probabilistic model consisting of a hidden process and an observed stochastic process. These models are a form of mixture model but are more general in that the hidden states are assumed to have a Markovian structure.
Typically, within disease misclassification, the hidden states of the hidden Markov model represent the absence or presence of a particular disease. These hidden states cannot be observed directly but can be inferred from the observed outcomes, which tend to be subject to error.
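The inference described above can be sketched with the standard forward algorithm for a two-state model, where the emission matrix plays the role of the misclassification probabilities of a diagnostic test. All numerical values below are hypothetical and chosen only for illustration.

```python
import math


def forward_filter(observations, initial, transition, emission):
    """Scaled forward algorithm.

    Returns the filtered state probabilities after each observation and the
    log-likelihood of the whole observation sequence.
    """
    n_states = len(initial)
    filtered = []
    log_likelihood = 0.0
    prior = list(initial)
    for obs in observations:
        # Weight each hidden state by how likely it is to produce this result.
        joint = [prior[s] * emission[s][obs] for s in range(n_states)]
        norm = sum(joint)
        log_likelihood += math.log(norm)
        posterior = [j / norm for j in joint]
        filtered.append(posterior)
        # Propagate one step through the Markov chain for the next visit.
        prior = [
            sum(posterior[r] * transition[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return filtered, log_likelihood


# Hidden states: 0 = disease absent, 1 = disease present (hypothetical values).
initial = [0.9, 0.1]
transition = [[0.90, 0.10],   # absent  -> absent / present
              [0.05, 0.95]]   # present -> absent / present
# Observations: 0 = test negative, 1 = test positive.
emission = [[0.9, 0.1],       # absent:  10% false positives
            [0.2, 0.8]]       # present: 20% false negatives
tests = [0, 0, 1, 1, 1]

filtered, log_likelihood = forward_filter(tests, initial, transition, emission)
print([round(p[1], 3) for p in filtered])
```

A single positive test moves the probability of disease only part of the way, because it might be a false positive; repeated positives push it close to one. This is the sense in which the hidden state is inferred from error-prone outcomes.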
This PhD will aim to develop the use of hidden Markov models for misclassification of diseases, investigating the different types of hidden Markov models and their ability to capture misclassification, particularly if there are multiple disease outcomes measured on a patient and the interactions within these.
- Everything Changes Over Time: Transforming Joint Modelling Methodology
- Dr L McCrink, Dr K J Cairns
Clinicians typically collect data from patients regularly throughout the treatment of an illness. Such longitudinal data provides valuable insights into how things change over time for patients – tracking the progression of the disease, a patient’s reaction to particular treatments, the usefulness of intervention strategies, for example. It is common that such data will be collected alongside key event information such as the time to recovery, time to relapse or time to death of patients. Joint models enable the relationships between this survival and longitudinal data to be mathematically represented, frequently linking a linear mixed effects model to a Cox proportional hazards model.
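A commonly used form of such a joint model, given here as a sketch in generic notation that is not taken from the project description, links the two sub-models through the subject's underlying longitudinal trajectory $m_i(t)$:

```latex
\begin{align*}
y_i(t) &= m_i(t) + \varepsilon_i(t), \qquad \varepsilon_i(t) \sim N(0, \sigma^2), \\
m_i(t) &= x_i(t)^\top \beta + z_i(t)^\top b_i, \qquad b_i \sim N(0, D), \\
h_i(t) &= h_0(t) \exp\{\gamma^\top w_i + \alpha\, m_i(t)\}.
\end{align*}
```

Here $y_i(t)$ is the observed longitudinal response for subject $i$, $\beta$ and $b_i$ are fixed and random effects, $h_i(t)$ is the hazard of the event with baseline hazard $h_0(t)$, $w_i$ are baseline covariates, and $\alpha$ measures how strongly the current value of the longitudinal process is associated with the risk of the event.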
Despite the significant growth in this field of research in recent years, a wider array of models is needed to truly represent natural biological changes over time. With joint models being first introduced in 1996, this relatively young field of research has many opportunities in which novel approaches can be explored. This project will tackle one such avenue of research – the transformation of joint modelling methodology to allow a better representation of changing effects over time.
Within the current literature, joint models assume that the effect of covariates is constant over time, a potentially unrealistic assumption which this research would relax. By borrowing theory from time-varying parameter models, this project would take into account the likely situation that, as a disease progresses, the relationship between the response and covariates (e.g. biomarkers or drug effects) tends to strengthen and change over time. Examples arise in many medical fields, such as the analysis of biomarkers associated with Parkinson’s disease, antiviral treatment effects for HIV patients and the analysis of people trying to quit smoking, to name but a few.
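In generic notation, and assuming for illustration that the time variation enters through the fixed effects of the longitudinal sub-model, the relaxation amounts to replacing a constant coefficient vector $\beta$ with a function of time $\beta(t)$:

```latex
\begin{align*}
\text{constant effects:} \quad y_i(t) &= x_i(t)^\top \beta + z_i(t)^\top b_i + \varepsilon_i(t), \\
\text{time-varying effects:} \quad y_i(t) &= x_i(t)^\top \beta(t) + z_i(t)^\top b_i + \varepsilon_i(t).
\end{align*}
```

Here $y_i(t)$ is the response of subject $i$ at time $t$, $x_i(t)$ and $z_i(t)$ are covariate vectors, $b_i$ are subject-specific random effects and $\varepsilon_i(t)$ is measurement error. Where exactly the time-varying terms would enter the joint model is a matter for the project itself.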
This PhD would feed into a user-friendly software package, complementing other active research projects currently being undertaken by the primary supervisor in conjunction with both national and international collaborators.
- A framework for effective traffic congestion management in road networks
- Dr S Moutari, Professor A H Marshall
Road traffic congestion will continue to be a major social and economic problem due to the rapid growth of traffic density. The problem has been extensively investigated over the last few decades, and a wide range of theories and techniques to address it have been suggested in the literature. However, owing to constraints on physical space and financial resources, as well as environmental concerns, the traditional approach of expanding highway infrastructure is no longer viable. Currently, the most prominent alternatives for easing traffic conditions on highways rely on optimal exploitation of the existing highway infrastructure via Intelligent Transport Systems (ITSs).
The effectiveness of an ITS in improving traffic conditions in road networks depends highly upon the efficiency of its operational models. One of the cornerstones of these models, which plays an essential role in traffic congestion management, is the method of traffic control. Traffic lights at intersections have been the major tool used to control traffic flow in urban road networks, and a wide range of models to optimise such control have been suggested in the literature e.g.  . Recently, congestion pricing strategies   have been used as an alternative to tackle traffic congestion in some major cities around the world. Congestion pricing is a mechanism that shifts purely discretionary rush-hour highway travel to other transportation modes or to off-peak periods by surcharging users of a road network during periods of peak traffic, via toll-like road pricing fees, in order to reduce traffic congestion.
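The pricing mechanism can be illustrated with a textbook two-route example, which is a toy network and not one of the models from the cited references. One route is short but congestible, the other slow but uncongested, and a marginal-cost toll on the congestible route changes how travellers split between them. The function names and travel-time functions are illustrative assumptions.

```python
def equilibrium_flow(cost_a, cost_b, demand=1.0, tol=1e-10):
    """Flow on route A at which no traveller gains by switching routes.

    Assumes cost_a(x) - cost_b(demand - x) is non-decreasing in x.
    """
    def gap(x):
        return cost_a(x) - cost_b(demand - x)

    if gap(demand) <= 0:  # route A is never worse: everyone uses it
        return demand
    if gap(0.0) >= 0:     # route A is never better: nobody uses it
        return 0.0
    lo, hi = 0.0, demand
    while hi - lo > tol:  # bisection on the cost difference
        mid = (lo + hi) / 2.0
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def total_travel_time(x, time_a, time_b, demand=1.0):
    """Total time spent by all travellers when x of them use route A."""
    return x * time_a(x) + (demand - x) * time_b(demand - x)


def time_a(x):
    return x      # congestible route: travel time grows with its flow


def time_b(x):
    return 1.0    # uncongested but slow route


def tolled_cost_a(x):
    # Marginal-cost toll: flow times the derivative of time_a, here x * 1.
    return time_a(x) + x


untolled = equilibrium_flow(time_a, time_b)
tolled = equilibrium_flow(tolled_cost_a, time_b)
print(total_travel_time(untolled, time_a, time_b))
print(total_travel_time(tolled, time_a, time_b))
```

Without the toll everyone crowds onto the congestible route; with it, half of the travellers switch and the total travel time falls, even though the toll itself is a transfer rather than a time cost.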
The purpose of this project is first to investigate the effectiveness of congestion pricing strategies e.g.   in easing traffic conditions compared to classical traffic control models e.g.  . Then, the insight gained from this investigation will be used to develop an integrated framework, which combines classical traffic control with congestion pricing, for effective traffic congestion management and control in road networks. The developed framework is expected to be computationally efficient as well as suitable for real-time applications.
This PhD project requires high-level skills in mathematics, probability and stochastic process theory, as well as computer programming. Experience in programming and an interest in numerical simulation would be an asset.
Chiou S-W (2007). A hybrid optimization algorithm for area traffic control problem. Journal of the Operational Research Society 58: 816–823.
Cipriani E and Fusco G (2004). Combined signal setting design and traffic assignment problem. European Journal of Operational Research 155: 569–583.
Dial R (2000). Minimal Revenue Congestion Pricing Part II: an efficient algorithm for the general case. Transportation Research Part B 34: 645–665.
Zhang X and Yang H (2004). The optimal cordon-based network congestion pricing problem. Transportation Research Part B 38: 517–537.
- Coxian phase-type classification of cancer behaviours
- Professor A H Marshall, Dr L Knight