Module Content - Honours
Introduction to R (13074-723)
Objectives and content: This
module is an introduction to programming and data analysis within the R
open source environment. It is presented as a block course in the first
two weeks of the first semester, starting in the week before classes
generally commence. The viewpoint of this module, and of all modules in
which R plays a role, agrees with the stated aim of the
R computer language: "R has a simple goal: To turn ideas into software, quickly and faithfully".
Biostatistics (10408-712)
Objectives and content: Biostatistics
may be regarded as the study of the application of statistics to
medicine. It covers medical terminology, the design of clinical trials,
the collection and numerical analysis of data, the interpretation of
the analyses and the drawing of conclusions. Particular emphasis is
given to skills relevant to medical literature (both writing it and
understanding the writing of others) and to statistical techniques and
software that are widely used in medical research. It is not a
mathematically strenuous course. It deals primarily with the philosophy
and terminology of medical research, as well as the statistical
techniques and problems encountered in the medical field in particular. Topics that will be covered are: SAS, Clinical trials, Power and sample size analysis, Longitudinal data analysis, Handling missing data and Statistical genetics.
Multivariate Methods in Statistics A & B (10600-721 & 10601-751)
Objectives and content: The objective of the course is to teach students the practical application of multivariate analysis. Various multivariate methods are dealt with, and students learn when and where to apply them. The consequences of the assumptions underlying some of these techniques are also studied. The following topics are studied: Matrix algebra, Characterising and displaying multivariate data, The multivariate normal distribution, Inferences on one or two mean vectors, Multivariate analysis of variance, Inferences on the covariance matrix, Discriminant analysis, Classification analysis, Multivariate regression, Canonical correlation, Principal component analysis, Factor analysis and Cluster analysis. R and SAS are used in all applications to datasets. The MMS A module is a prerequisite for the MMS B module.
Experimental Design (10440-713)
Objectives and content: This
module does not require advanced mathematics and is an option for both
statistics and mathematical statistics students. The focus is mainly on
the practical implementation of techniques together with computer packages
from a consultancy perspective. Attention is given to modelling, design
matrices, least squares and diagnostics.
Data Mining (58777-741)
Objectives and content: Statistical learning is a relatively new area in statistics. It is concerned with modelling and understanding patterns in complex datasets. With the explosion of "Big Data", there is currently a high demand for individuals with expertise in statistical learning. The methods studied in this module include regularised regression by means of ridge regression and the lasso; classification using linear discriminant analysis, logistic regression, quadratic discriminant analysis and k-nearest neighbours; resampling methods such as k-fold cross-validation, leave-one-out cross-validation and the bootstrap; linear model selection and dimension reduction methods; handling non-linearity via regression splines, smoothing splines, local regression, generalised additive models, bagging, random forests and boosting; and non-linear classification and regression by means of support vector machines. The objectives of the module are to equip students with the following knowledge and skills:
- the theory underlying the above statistical learning techniques;
- application of statistical learning methods in a programming environment;
- assessment and comparison of various models;
- interpretation and effective (written and verbal) communication of results.
We make extensive use of the R programming language; note that the Introduction to R module is therefore a prerequisite.
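As an illustration of one of the resampling methods listed above, the following is a minimal sketch of how k-fold cross-validation partitions the observations into training and test sets. It is written in Python purely for illustration (the module itself uses R), and the function name `k_fold_indices` is hypothetical and not taken from the module material. In practice the observations would usually be shuffled before splitting.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation.

    The n observations are split into k contiguous folds whose sizes
    differ by at most one. Each fold serves once as the test set.
    """
    # The first (n % k) folds receive one extra observation.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Example: 10 observations, 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Setting k equal to n gives leave-one-out cross-validation, since every test set then contains a single observation.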
Sampling Techniques (10705-742)
Objectives and content: The
design of a sample is one of the most important aspects of any survey:
no amount of statistical analysis can compensate for a badly designed
sample. Therefore, the emphasis of this course is on the scientific design
of samples, the determination of sample sizes and the related methods for
analysing the data from a survey. Contents include: Questionnaire
design, sampling techniques (simple random, stratified, systematic,
cluster, complex), proportional vs disproportional allocation for
stratified sampling, ratio and regression estimation, estimation of
means, totals and proportions and their variances, weighting of survey
data, and dealing with non-response.
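To make the contrast between proportional and disproportional allocation concrete, the sketch below allocates a total sample across strata in two ways: in proportion to stratum size, and by Neyman allocation (in proportion to stratum size times stratum standard deviation), a common disproportional scheme. It is written in Python purely for illustration. The stratum names, sizes and standard deviations are made-up assumptions, the function names are hypothetical, and the module may well treat other disproportional schemes.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units in proportion to stratum population size."""
    total = sum(stratum_sizes.values())
    return {name: n * size / total for name, size in stratum_sizes.items()}


def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n sample units in proportion to N_h * S_h (Neyman allocation)."""
    weights = {name: stratum_sizes[name] * stratum_sds[name]
               for name in stratum_sizes}
    total = sum(weights.values())
    return {name: n * w / total for name, w in weights.items()}


# Hypothetical population: two strata with different variability.
sizes = {"urban": 6000, "rural": 4000}
sds = {"urban": 10.0, "rural": 30.0}

prop = proportional_allocation(sizes, 100)
ney = neyman_allocation(sizes, sds, 100)
print(prop)
print(ney)
```

With these made-up figures, proportional allocation gives the larger urban stratum the larger share of the sample, whereas Neyman allocation shifts most of the sample to the more variable rural stratum. Allocations are left unrounded here; a real design would round them to whole sample units.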