CSMBD: Big Data and Cloud Computing
Module code: CSMBD
Module provider: Computer Science; School of Mathematical, Physical and Computational Sciences
Credits: 20
Level: 7
When you’ll be taught: Semester 2
Module convenor: Dr Xiang Li, email: x.li7@reading.ac.uk
Pre-requisite module(s):
Co-requisite module(s):
Pre-requisite or Co-requisite module(s):
Module(s) excluded:
Placement information: NA
Academic year: 2025/6
Available to visiting students: No
Talis reading list: No
Last updated: 3 April 2025
Overview
Module aims and purpose
The massively increased uptake of computing, with devices at all scales of operation, has driven the development of large-scale distributed systems that can handle scalable parallel data analysis and processing and support the execution of analytical algorithms on computer clusters running frameworks such as Hadoop. This module aims to introduce the concepts and design principles of big data management and advanced network-centric computing platforms.
This module also encourages students to develop a set of professional skills, such as software development documentation and project management. Students will also be able to demonstrate their ability to write professionally and effectively when communicating data science concepts, solutions and outputs in technical reports, and to use their knowledge and skills to continue learning and adapting to new data science technologies.
Module learning outcomes
By the end of the module, it is expected that students will be able to:
- Describe concepts and models of distributed systems and cloud computing, as well as cloud computing design principles;
- Identify and describe the challenges of big data management and appraise relevant tools and techniques to tackle such challenges;
- Acquire an integrated perspective on big data processing in cloud computing platforms, including handling and processing large-scale data using big data frameworks and distributed computing technologies, and designing, implementing and validating cloud-based solutions to big data problems; and
- Address socio-legal, security, privacy and trust issues involved in operating and using cloud services.
Module content
The module covers the following topics:
- Introduction to cloud computing (e.g., IaaS, PaaS, SaaS, and AI-as-a-Service) and big data
- Principles of item-based and user-based recommendation systems, building parallel recommendation systems, and evaluating their performance.
- Principles of data distribution, workload balancing, and applications of data stream mining in real-time systems.
- Cloud computing middleware, including Hadoop, MapReduce, and their key design features such as consistent hashing and data partitioning.
- Big data platforms, including Apache Spark, for handling large-scale semi-structured and unstructured data.
- Strategies for efficient cloud-based big data access and processing, including governance, security, and compliance in platform design.
Structure
Teaching and learning methods
Material will be delivered weekly through lectures and practical classes. The lectures will introduce students to the theories, concepts and underpinning principles specified in the indicative content. Students will be supervised in the practical sessions to apply the concepts and principles to a