Internal

CS3DS19NU - Data Science Algorithms and Tools

Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Level: 6
Semesters in which taught: Semester 2 module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2023/4

Module Convenor: Dr Carmen Lam
Email: carmen.lam@reading.ac.uk

NUIST Module Lead: Xiaohe Zhang
Email: xiaohe.zhang@nuist.edu.cn

Type of module:

Summary module description:

Automated data collection and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. In this context, automated data analysis and data modelling tools and algorithms (Data Mining) are becoming essential components of any information system. Application areas of these techniques include scientific computing, intelligent business, direct marketing, customer relationship management, market segmentation, store shelf management, data warehouse management, and fraud detection in e-commerce and in credit card transactions.


Aims:

The study of fundamental techniques and tools for data manipulation and transformation, and of data mining algorithms for classification, regression, clustering and association rule mining. In particular, KNIME, one of the leading platforms for Data Science and Machine Learning, will be introduced and adopted for practical activities.



This module also encourages students to develop a set of professional skills, such as problem solving, critical analysis of published literature, creativity, technical report writing for technical and non-technical audiences, professional communication (emails, letters, minutes, etc.), self-reflection and effective use of commercial software.


Assessable learning outcomes:

Students are expected to understand the general Data Mining principles and techniques, and to be able to apply them in different contexts. In a practical project, students design and develop a data workflow using advanced data science tools, combining data mining algorithms to analyse real-world datasets.


Additional outcomes:

Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests to evaluate algorithm performance.


Outline content:


  • Introduction to Data Mining;

  • Introduction to Data Science and Machine Learning platforms;

  • KNIME;

  • Data preprocessing;

  • Proximity measures;

  • Regression, Classification and model evaluation;

  • Clustering and cluster validity;

  • Decision Tree Induction;

  • Association Rule Mining.


Brief description of teaching and learning methods:

The module comprises lectures and practical activities. The lectures introduce the basic concepts, the tools, and the algorithms used to build Data Science applications. The assessment is based on multiple choice questionnaires and a data science project that allows the students to apply theoretical concepts to a practical case.


Contact hours:
                                   Semester 1   Semester 2
Lectures                                        10
Practical classes and workshops                 10
Guided independent study:
  Other                                         80