Description
Aims:
The module aims to teach students the theoretical background of software engineering as it applies to commissioning cloud and parallel systems. Students will be taught the applied, technical details of deploying and maintaining data science applications. Students will learn to develop and write their own large-scale, state-of-the-art machine learning analyses.
Intended learning outcomes:
On successful completion of the module, a student will be able to:
- Evaluate and interpret the utility of different containerisation solutions.
- Evaluate different cloud and distributed computing platforms and use this information to commission and administer such a resource.
- Explain how to deploy, and be capable of deploying, a high-throughput data analysis application to a distributed computing resource.
- Critique strategies and choices for deploying high-throughput data analysis applications.
- Evaluate and implement different strategies for recognising and diagnosing issues in data analysis pipelines.
- Remediate faults in data analysis pipelines using structural and statistical testing.
Indicative content:
The following are indicative of the topics the module will typically cover:
- DevOps for data: infrastructure as code, Continuous Integration/Continuous Deployment (CI/CD) for data infrastructures, container orchestration.
- Cloud and parallel infrastructure.
- Data pipeline lifecycles.
- Reliability engineering for data pipelines (on-line data issue detection, including structural and statistical testing).
Requisites:
To be eligible to select this module as optional or elective, a student must: (1) be registered on a programme and year of study on which the module is formally available; and (2) either:
- Have taken Software Development Practice (COMP0104) or Research Software Engineering with Python (COMP0233) or an equivalent module at FHEQ level 6 (or higher), or
- Demonstrate basic software engineering experience in either a commercial or academic setting and provide evidence of their experience, such as a hyperlink to an appropriate code repository (e.g., a GitHub repo), or
- Attend the Software Carpentry Software Development short course in Term 1, offered by Advanced Research Computing (ARC).
Module deliveries for 2024/25 academic year
Last updated
This module description was last updated on 19th August 2024.