Computational Statistics and Visualisation
An intensive course in data analysis primarily for non-mathematics/statistics graduates. Covers fundamentals of descriptive statistics, probability and applications.
Visualisation is a central theme and is incorporated in all three content sections.
- Data Visualisation [20%] - Methods of sampling. Data representation: pie and bar charts; scatterplots; histograms; cumulative (relative) frequency curves; dot plots; box-whisker plots; stem-and-leaf displays. Measures of central tendency and variability for sample and grouped data. Psychological aspects.
- Probability [30%] - Definitions and fundamental laws; counting techniques; conditional probability; Bayes' theorem; the concept of a discrete probability distribution; expectation and variance; some standard discrete distributions: Geometric, Binomial, Poisson. The concept of a continuous distribution; the Normal distribution and its properties; use of Normal tables. Continuous probability distributions and their properties; expectation and variance. Some standard continuous distributions: Normal and related distributions.
- Statistical Applications [50%] - The concept of a sampling distribution; point and interval estimation; hypothesis testing; Type I and Type II errors; p-values; determination of sample size; confidence intervals and significance tests for means and for proportions; single, paired and unpaired samples; Normal and t tests. F-test. Normal probability plot. Introduction to one-way Analysis of Variance. Hartley's test, Bartlett's test. Confidence intervals for treatment means and differences between treatment means. Introduction to simple linear regression. ANOVA table. Confidence intervals and prediction intervals. Correlation and rank correlation. Chi-square as a test of association and as a test of model fit. Non-parametric tests (Wilcoxon signed-rank test, Mann-Whitney-Wilcoxon test, Kruskal-Wallis test and Friedman test).
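As an illustration only (it is not part of the unit specification), interval estimation for a mean might be sketched in Python as below. The function name and the sample values are assumptions invented for this example. The sketch uses the Normal critical value from the standard library; for a small sample such as this one, a t critical value would normally be used instead, which the standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean.

    Hypothetical helper for illustration; a t-based interval is more
    appropriate when the sample is small.
    """
    n = len(sample)
    centre = mean(sample)
    # Estimated standard error of the sample mean.
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error


# Invented measurements, used purely to exercise the function.
sample = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.1, 4.9]
low, high = normal_confidence_interval(sample)
print(f"sample mean = {mean(sample):.2f}")
print(f"approximate 95% interval = ({low:.2f}, {high:.2f})")
```

The interval is symmetric about the sample mean, and it narrows as the sample size grows or as the requested confidence level falls.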
Business Intelligence (with SAS)
Business Intelligence is seen as the tools and systems that play a key role in the strategic planning process of an organisation. These tools and systems use statistical methods to gather, store, access and analyse data in support of the organisation's decision-making process. The aim of the unit is to provide the ability, using commercial statistical software, to analyse and interpret real-life business data, thus equipping students with a structured approach to using data to identify meaningful and useful information for business analysis purposes.
- Fundamentals of business intelligence [10%] - What is Business intelligence, what is useful data/information, structured/unstructured data/information, handling large quantities of data collected from business operations, giving/using real-life business data as examples.
- Statistical techniques that have a real-life business intelligence application [90%]
- Variable selection: Used to help understand relationships between dependent and independent variables, and to select the best data/transformations for objectives.
- Multivariate Analysis Of Variance, Discriminant Analysis, Principal Component Analysis, Variable Clustering, Factor analysis.
- Forecasting: Used to predict and simulate demand.
- Time series analysis, Box-Jenkins methodology, regression models with autocorrelated errors, intervention analysis and outlier analysis. ARCH/GARCH modelling.
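As an illustration only (the unit itself uses SAS, and this is not taken from the unit materials), the sample autocorrelation that is commonly inspected when identifying a time series model can be sketched in Python. The function name and the demand figures are assumptions invented for this example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag.

    Hypothetical helper for illustration: the lagged sum of products of
    deviations from the mean, divided by the total sum of squared deviations.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    centre = sum(series) / n
    deviations = [x - centre for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Invented monthly demand figures with an upward trend.
demand = [20, 22, 25, 24, 27, 30, 29, 33, 35, 34, 38, 40]
for lag in (1, 2, 3):
    print(f"lag {lag}: r = {autocorrelation(demand, lag):.3f}")
```

A trending series like this one gives a positive lag-1 autocorrelation, whereas a series that alternates in sign gives a negative one.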
Data Analytics Project
Each individual project will investigate a challenging but constrained Data Analytics problem.
The project will involve performing an end-to-end data analytics task pipeline, including data collection, formulation of one or more questions to be asked about the data, typical preprocessing steps (e.g. cleaning, transforming and exploring), analysis, application of applicable machine learning methods, modelling, visualisation, interpretation and assessment of whether models are meaningful and relevant to the field. Students will be required to demonstrate understanding of experimental design, including validation and evaluation of models using appropriate statistical methods.
The project will involve practical experimentation work on live data. The project may also involve practical implementation. The project will provide each student with the opportunity to develop independent practical and analytical skills using proven methods and techniques.
Students will be able to produce well-substantiated and validated results within the limits imposed by the time constraint. They will be able to demonstrate their investigative ability but will not necessarily be able to produce a complete piece of research or make a significant contribution to knowledge. They will, however, be expected to critically examine their work and be able to place it in context.
Each student will be allocated a Project Supervisor from the academic staff. The main function of a Project Supervisor is to offer general advice and guidance to the student. Students will submit a proposal to their Project Supervisor which will be scrutinised by at least one other academic member of staff.
Supporting seminars (5%), commencing before the start of the project, will be used to reinforce the students' knowledge of research methods and to discuss personal organisation and time management. Students need support to develop the communication and other generic skills they require to become effective researchers, to enhance their employability and to assist their career progress after completing their degree. These skills may be present on commencement or developed during the project. The need for dissertations to address, as appropriate, legal, ethical, professional and social issues will be emphasised.
Students on the MSc will also attend a seminar which will be dedicated to examining current professional, legal, ethical, social and cultural issues in data analytics.
As the project is the most distinctive part of postgraduate study, there will be a strong element of personal development planning, both during the support seminars and during the supervision sessions with individual project supervisors, as students are invited to reflect on their progress during the project's execution and write-up.
At the end of the project, the student will be required to submit a project dissertation and undertake a Viva examination to present the project work to the Project Supervisor and a designated Second Reader allocated by the Project Tutor.
Where appropriate, the project may be undertaken with an industry partner (e.g. existing employer or internship), with system creation or experimentation being work-based.
Likely Optional Units
Data Management and Machine Learning
The aim of this unit is to develop the student's knowledge of data management, including online analytical processing; data architectures such as data warehousing; and the process and application of machine learning algorithms to data.
- Data Management Overview [15%] - Example content includes database modelling/querying (relational/noSQL), graph data modelling, applications.
- Online Analytical Processing (OLAP) [15%] - Including the representation of multi-dimensional views of data; Technologies and Architectures; Categories of OLAP tools, Business Intelligence Tools.
- Data Warehousing [10%] - Methodologies, architectures, modelling techniques; Data Warehousing Project Management; the Extraction, Transformation and Loading (ETL) process.
- Machine Learning Overview [10%] - The machine learning process; applications of machine learning.
- Machine Learning Algorithms [50%] - For example, artificial neural networks, naïve Bayes, decision trees, clustering, association rules, text mining, fuzzy systems; application, analysis and validation.
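As an illustration only (it is not drawn from the unit materials), clustering, one of the algorithm families listed above, can be sketched in Python with a minimal one-dimensional k-means (Lloyd's algorithm). The function names, the starting centroids and the data values are all assumptions invented for this example; practical work would normally use an established library and multi-dimensional data.

```python
def assign(values, centroids):
    """Group each value with its nearest centroid (ties go to the first)."""
    clusters = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        clusters[nearest].append(v)
    return clusters


def k_means_1d(values, centroids, max_iterations=100):
    """Minimal k-means on one-dimensional data from given starting centroids."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        clusters = assign(values, centroids)
        # Move each centroid to the mean of its cluster; an empty cluster
        # keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # Converged: another pass would not move any centroid.
        centroids = updated
    return centroids, assign(values, centroids)


# Invented values forming two visibly separate groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = k_means_1d(values, centroids=[0.0, 6.0])
print("centroids:", [round(c, 2) for c in centroids])
print("clusters:", clusters)
```

The result depends on the starting centroids, which is one reason validation of the fitted model, as listed in the unit content, matters in practice.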
Data Modelling and Analysis
The unit will equip students with skills and knowledge relating to the handling and analysis of data typically generated by organisations. Students will be introduced to the concept of self-service business intelligence and what impact this will have on their future career path.
The overarching theme of the unit will be to consider the core principles of business analytics: How can organisations make sense of all their data? How can data be harnessed to effectively support the decision-making process? In what ways can actionable information be created and communicated?
Ethics, Security and Sustainability
The aim of this unit is to introduce students to the area of Strategic Information Systems. The core aim is to understand the nature of problems and issues faced by organisations at the operational and strategic level, and examine how Information Systems and Information Technology help to overcome such issues.
Statistics is a key element of business analytics, ranging from the description of and summarising of data to advanced modelling of both cross-sectional and time series information. Students will receive a firm grounding in the most widely utilised statistical techniques in modern organisations.
The overarching theme of the unit will be to explore how various statistical techniques can aid business decision-making. Consideration will also be given to the factors driving adoption and on-going usage of management information derived from statistical analysis.
Programming fundamentals: Control constructs, operators, procedural abstraction, simple I/O and use of libraries; Data types: primitive types, constants, variables and arrays.
Exploratory programming environments: Integrated Development Environments; notebooks; workbenches; read-eval-print loops; interactive shells.
The aim of this unit is to provide the students with an understanding of the skills and language around financial analysis, to demonstrate examples of good and bad practice, and for the students to be able to perform financial analysis and clearly present their results.