Lookup NU author(s): Dr Hugo Hiden, Dr Simon Woodman, Professor Paul Watson
The use of cloud resources for processing and analysing medical data has the potential to revolutionise the treatment of a number of chronic conditions. For example, it has been shown that conditions such as diabetes, obesity and cardiovascular disease can be managed by increasing the right forms of physical activity for the patient. Typically, movement data is collected for a patient over a period of several weeks using a wrist-worn accelerometer. This data, however, is large and its analysis can require significant computational resources. Cloud computing offers a convenient solution, as it can be paid for as needed and is capable of scaling to store and process large numbers of data sets simultaneously. However, because the charging model for the cloud represents, to some extent, an unknown cost and therefore a risk to project managers, it is important to have an estimate of the likely processing and storage costs required to analyse a set of data. Such a data set could be the data collected from a single patient in clinic or entire cohorts of data collected from large studies. If an accurate model were available that could predict the compute and storage requirements associated with a piece of analysis code, decisions could be made as to the scale of resources required to obtain results within a known timescale. This paper makes use of provenance and performance data collected as part of routine e-Science Central workflow executions to examine the feasibility of automatically generating predictive models for workflow execution times based solely on observed characteristics such as data volumes processed, algorithm settings and execution durations. The utility of this approach is demonstrated via a set of benchmarking examples before it is used to model workflow executions performed as part of two large medical movement analysis studies.
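As a rough illustration of the idea in the abstract, the sketch below fits a straight line relating input data volume to observed execution time and uses it to estimate the duration of a larger run. It is not the modelling approach used in the report, which is not described in this record. The function names, the single-predictor linear form and the observation values are all hypothetical assumptions chosen for illustration.

```python
from statistics import mean


def fit_linear(sizes_mb, durations_s):
    """Ordinary least-squares fit of duration = intercept + slope * size.

    Assumption: execution time grows roughly linearly with data volume.
    Real workflows may need other predictors (e.g. algorithm settings).
    """
    x_bar, y_bar = mean(sizes_mb), mean(durations_s)
    sxx = sum((x - x_bar) ** 2 for x in sizes_mb)
    if sxx == 0:
        raise ValueError("need at least two distinct data sizes")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sizes_mb, durations_s))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_duration(model, size_mb):
    """Estimate the execution time in seconds for a given input size."""
    intercept, slope = model
    return intercept + slope * size_mb


# Hypothetical observations, standing in for logged workflow executions.
observed_sizes_mb = [10.0, 20.0, 40.0, 80.0]
observed_durations_s = [25.0, 45.0, 85.0, 165.0]

model = fit_linear(observed_sizes_mb, observed_durations_s)
estimate_s = predict_duration(model, 160.0)
```

Multiplying such an estimate by a provider's per-second price would give a first-order cost figure, although any extrapolation beyond the observed range of data sizes should be treated with caution.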
Author(s): Hiden H, Woodman S, Watson P
Publication type: Report
Publication status: Published
Series Title: School of Computing Science Technical Report Series
Year: 2016
Pages: 10
Print publication date: 14/09/2016
Acceptance date: 14/09/2016
Report Number: 1499
Institution: School of Computing Science, University of Newcastle upon Tyne
Place Published: Newcastle upon Tyne