Statistics and Actuarial Science
Welcome to Stellenbosch University

Postgraduate Mathematical Statistics

Applications for postgraduate studies in 2020 are now closed. The closing date for postgraduate studies in 2021 is 31 October 2020. Visit www.sun.ac.za for further details about the application procedures.

Here you will find information on the following postgraduate programmes in Mathematical Statistics:

A short description and summary of each postgraduate module can be found by clicking on the module or by scrolling down this page. For further enquiries about these programmes, contact Prof Lubbe at slubbe@sun.ac.za.

Honours programmes in Mathematical Statistics

56928 – 779 (164) HonsBCom in Economics and Mathematical Statistics

See the Yearbook for a complete description of the programme.

22853 – 778 (120) HonsBCom in Mathematical Statistics (Focus: Data Science)

See the Yearbook for a complete description of the programme.

This programme is presented jointly by the Department of Statistics and Actuarial Science and the Department of Computer Science. Consequently, students must be admitted to honours study by both departments. A bachelor's degree with an average mark of at least 65% in Mathematical Statistics 3 is required, as well as a satisfactory mark in Computer Science at second-year level or higher.

Compulsory modules presented in this programme:

Module | Code | Semester (2020) | Credits | Compulsory
Introduction to R Programming | 13074-723 | 1 | 6 | ****
Data Mining | 58777-741 | 1 | 12 | ****
Introduction to Statistical Learning Theory | 13360-771 | 2 | 12 | ****
Research Assignment: Mathematical Statistics | 11228-791 | 1 & 2 | 30 | ****

Elective modules (chosen with due regard to the modules taken in Computer Science)

Module | Code | Semester (2020) | Credits
Bayesian Statistics | 10394-711 | N/A | 12
Multivariate Statistical Analysis A | 10602-715 | 1 | 12
Multivariate Statistical Analysis B * | 10603-745 | 2 | 12
Stochastic Simulation | 65250-718 | 1 | 12
Time Series Analysis | 10751-747 | 1 | 12

* The B module follows directly after the A module has been completed. The MSA A module is a prerequisite for the MSA B module.

22853 – 778 (120) HonsBCom in Mathematical Statistics

See the Yearbook for a complete description of the programme. Modules presented in this programme:

Module | Code | Semester (2020) | Credits | Compulsory
Biostatistics | 10408-712 | 1 | 12 |
Capita Selecta in Mathematical Statistics A | 11922-724 | N/A | 12 |
Capita Selecta in Mathematical Statistics B | 11923-754 | N/A | 12 |
Data Mining | 58777-741 | 1 | 12 |
Experimental Design | 10440-713 | N/A | 12 |
Introduction to R Programming | 13074-723 | 1 | 6 | ****
Multivariate Statistical Analysis A | 10602-715 | 1 | 12 | ****
Multivariate Statistical Analysis B * | 10603-745 | 2 | 12 | ****
Survival Analysis | 10636-746 | 2 | 12 |
Sampling Techniques | 10705-742 | N/A | 12 |
Stochastic Simulation | 65250-718 | 1 | 12 | ****
Time Series Analysis | 10751-747 | 1 | 12 | ****
Introduction to Statistical Learning Theory ** | 13360-771 | 2 | 12 |
Research Assignment: Mathematical Statistics | 11228-791 | 1 & 2 | 30 | ****

N/A - This module is not offered in 2020.

* The B module follows directly after the A module has been completed. The MSA A module is a prerequisite for the MSA B module.

** The prerequisite for this module is Data Mining 741. Data Mining must be completed before you may register for the Statistical Learning Theory module.



Master's programmes in Mathematical Statistics

22853 – 879 (180) MCom (Mathematical Statistics) – thesis option

22853 – 889 (180) MCom (Mathematical Statistics) – research assignment option

See the Yearbook for a complete description of the programmes. Modules presented in these programmes:

Module | Code | Semester (2020) | Credits | Compulsory
Extreme Value Theory A (§) | 10441-813 | 1 | 15 |
Extreme Value Theory B (§) | 10442-843 | 2 | 15 |
Advanced Sampling Techniques | 10523-818 | N/A | 15 |
Advanced Mathematical Statistics A | 10524-819 | N/A | 15 |
Advanced Mathematical Statistics B | 11173-849 | N/A | 15 |
Multi-dimensional Scaling A | 10597-822 | 1 | 15 |
Multi-dimensional Scaling B | 11910-852 | 2 | 15 |
Bootstrap and other Resampling Techniques A | 10694-811 | 1 | 15 |
Bootstrap and other Resampling Techniques B | 10695-841 | 2 | 15 |
Statistical Learning Theory A (§) | 10703-812 | 1 | 15 |
Statistical Learning Theory B (§) | 10704-842 | 2 | 15 |
Thesis: Mathematical Statistics | 11246-891 | 1 & 2 | 90 | with 879
Research Assignment: Mathematical Statistics | 11228-895 | 1 & 2 | 60 | with 889

N/A - This module is not offered in 2020.

§ These modules are year modules. This means that the A module must be completed before the B module can be taken. There is only one assessment opportunity for a year module, at the end of the year. The A and B modules are not independent, separate semester modules.


Doctoral programme in Mathematical Statistics

22853 – 978 (240) PhD (Mathematical Statistics)

See the Yearbook for a complete description of the programme.


 

Module Content - Honours

Bayesian Statistics (10394-711)

Objectives and content: The aim of the module is to introduce students to the basic principles of Bayesian Statistics and its applications. Students will be able to identify the application areas of Bayesian Statistics. The numerical methods often used in Bayesian analysis will also be demonstrated. Topics: Decision theory in general; risk and Bayesian risk in Bayesian decisions; use of non-negative loss functions; construction of Bayesian decision functions; determining posteriors; sufficient statistics; class of natural conjugate priors; marginal posteriors; class of non-informative priors; estimation under squared and absolute error loss; Bayesian inference of parameters; Bayesian hypothesis testing; various simulation algorithms for posteriors in open source software; numerical techniques such as Gibbs sampling, the Metropolis-Hastings algorithm and other MCMC methods for simulating posteriors.
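As an illustration of two of the topics listed above (natural conjugate priors and estimation under squared error loss), the following minimal Python sketch updates a Beta prior with binomial data and reports the posterior mean, which is the Bayes estimate under squared error loss. The function names, the uniform Beta(1, 1) prior and the data (7 successes in 10 trials) are assumptions made for this example only; they are not taken from the module material.

```python
from fractions import Fraction


def beta_binomial_update(a, b, successes, trials):
    """Posterior Beta parameters for a Beta(a, b) prior and binomial data."""
    return a + successes, b + (trials - successes)


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution: the Bayes estimate under squared error loss."""
    return Fraction(a, a + b)


# Hypothetical data: uniform Beta(1, 1) prior, 7 successes in 10 trials.
a_post, b_post = beta_binomial_update(1, 1, 7, 10)
bayes_estimate = posterior_mean(a_post, b_post)
print(a_post, b_post, bayes_estimate)
```

Because the Beta family is conjugate to the binomial likelihood, the posterior is available in closed form and no simulation is needed here; simulation methods such as Gibbs sampling become relevant when no such closed form exists.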

Biostatistics (10408-712)

Objectives and content: Biostatistics may be regarded as the study of the application of statistics to medicine. It covers medical terminology, the design of clinical trials, the collection and numerical analysis of data, the interpretation of the analyses and the drawing of conclusions. Particular emphasis is given to skills relevant to medical literature (both writing it and understanding the writing of others) and to statistical techniques and software that are widely used in medical research. It is not a mathematically strenuous course. It deals primarily with the philosophy and terminology of medical research, as well as the statistical techniques and problems encountered in the medical field in particular. Topics that will be covered are: SAS, clinical trials, power and sample size analysis, longitudinal data analysis, handling missing data and statistical genetics.

Survival Analysis (10636-746)

Objectives and content: A problem frequently faced by applied statisticians is the analysis of time-to-event data. Examples of such data arise in diverse fields, such as medicine, biology, public health, epidemiology, engineering, economics and demography. Our focus in this course, however, will be on applications of the techniques in biology and medicine. Interest is in analysing data on the time to death from a certain cause, duration of response to treatment, time to recurrence of a disease, time to development of a disease, or simply time to death. Various non-parametric, semi-parametric and parametric techniques are introduced, and the course also addresses residual analysis and goodness-of-fit in survival data. The emphasis of this course is on the practical analysis of survival data, with the necessary underlying theoretical background. SAS and R are used extensively to analyse the data.

Experimental Design (10440-713)

Objectives and content: This module does not require advanced mathematics and is an option for both statistics and mathematical statistics students. The focus is mainly on the practical implementation of techniques, together with computer packages, from a consultancy perspective. Attention is given to modelling, design matrices, least squares and diagnostics.

Introduction to R Programming (13074-723)

Objectives and content: This module is an introduction to programming and data analysis within the R open source environment. It is presented as a block course in the first two weeks of the first semester and commences the week preceding general commencement of classes. The viewpoint of this module as well as of all modules where R plays a role is in agreement with the aim of the R computer language:  "R has a simple goal: To turn ideas into software, quickly and faithfully".

Multivariate Statistical Analysis A & B (10602-715 & 10603-745)

Objectives and content: Data collected in practice rarely consist of one isolated variable. Mostly, data consist of many variables influencing one another. If only one variable at a time is singled out for analysis, the data analyst is in danger of arriving at completely wrong conclusions. Multivariate statistical analysis entails the study of techniques for analysing data sets consisting of several variables influencing one another. This module aims to provide students with the expertise to come confidently to the right conclusions when analysing multivariate data. MSA A is a prerequisite for MSA B.

Data mining (58777-741)

Objectives and content: Statistical learning is a relatively new area in statistics. It is concerned with modelling and understanding patterns in complex datasets. With the explosion of "Big Data", there is currently a high demand for individuals with expertise in statistical learning. The methods studied in this module include regularised regression by means of ridge regression and the lasso; classification using linear discriminant analysis, logistic regression, quadratic discriminant analysis and k-nearest neighbours; resampling methods such as k-fold cross-validation, leave-one-out cross-validation and the bootstrap; linear model selection and dimension reduction methods; handling non-linearity via regression splines, smoothing splines, local regression, generalised additive models, bagging, random forests and boosting; and non-linear classification and regression by means of support vector machines. The objectives of the module are to equip students with the following knowledge and skills:

  • the theory underlying the above statistical learning techniques;
  • application of statistical learning methods in a programming environment;
  • assessment and comparison of various models;
  • interpretation and effective (written and verbal) communication of results.

We make extensive use of the R programming language; note therefore that the R course is a prerequisite.


 

Time Series Analysis (10751-747) & Applied Time Series Analysis (10748-722)

Objectives and content: This module is a continuation of undergraduate time series analysis and concentrates on more advanced forecasting techniques. Topics that are covered include:

  • The Box & Jenkins methodology of tentative model identification, conditional and unconditional parameter estimation and diagnostic methods for checking the fit of the series.
  • ARIMA and Seasonal ARIMA-processes.
  • Introduction to Fourier Analysis, spectrum of a periodic time series, estimation of the spectrum, periodogram analysis, smoothing of the spectrum.
  • Case studies using STATISTICA, R and SAS.
  • Forecasting with ARMA models and prediction intervals for forecasts.
  • Transfer function models and intervention analysis.
  • Multiple regression with ARMA errors, cointegration of non-stationary time series.
  • Conditional heteroscedastic time series models, ARCH and GARCH.


Stochastic Simulation (65250-718)

Objectives and content: This module on probability models and stochastic simulation is devoted to a study of the theory and applications of important probability models and stochastic processes. Applications are studied analytically, by means of the techniques of mathematical statistics, and are also illustrated by means of stochastic computer simulation. The broad aim of the module is to make students aware of the following important concepts:

  • the way in which probability models and stochastic processes can be used to model phenomena containing a random or stochastic component;
  • the important role played by assumptions in identification of an appropriate model for a given practical situation;
  • the standard techniques of mathematical statistics that can be used in the analysis of probability models;
  • the wide applicability of stochastic simulation in the analysis of probability models.

The specific outcomes of the module are related to the specific topics that receive attention. These topics include the following: Methods for generating random variables from distributions; Monte Carlo integration; Markov chains (including applications to Metropolis-Hastings and Gibbs sampler methods); Homogeneous and non-homogeneous Poisson processes; Markov processes; variance reduction techniques in stochastic simulation.
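To make one of the listed topics concrete, the following minimal Python sketch shows plain Monte Carlo integration: the integral of a function over an interval is estimated by averaging the function at uniformly drawn points, and the sample variance gives a standard error for the estimate. The integrand (x squared on [0, 1], whose exact integral is 1/3), the sample size, the seed and the function name `mc_integrate` are all assumptions chosen for illustration; they do not come from the module material.

```python
import math
import random


def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] from n uniform draws."""
    values = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    estimate = (b - a) * mean
    std_error = (b - a) * math.sqrt(var / n)
    return estimate, std_error


rng = random.Random(0)  # fixed seed so that the run is repeatable
estimate, std_error = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000, rng)
print(f"estimate = {estimate:.4f}, standard error = {std_error:.4f}")
```

The standard error shrinks at rate 1/sqrt(n), which is the motivation for the variance reduction techniques mentioned among the topics.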

Sampling Techniques (10705-742)

Objectives and content: The design of a sample is one of the most important aspects of any survey: no amount of statistical analysis can compensate for a badly designed sample. Therefore, the emphasis of this course is on the scientific design of samples, the determination of sample sizes and related methods for analysing the data from a survey. Contents include: questionnaire design, sampling techniques (simple random, stratified, systematic, cluster, complex), proportional vs disproportional allocation for stratified sampling, ratio and regression estimation, estimation of means, totals, proportions and their variances, weighting of survey data, and dealing with non-response.

Introduction to Statistical Learning Theory (13360-771)

Objectives and content: Statistical Learning Theory is a module presented to honours students in Data Science and in Mathematical Statistics. The module extends over one semester and entails 13 contact sessions of approximately two hours each. It has the Honours Data Mining module as a prerequisite and is in turn followed by the Statistical Learning Theory module presented at Master's level.

The following outcomes are envisaged in this module.

  • The students should develop a holistic view of the subject of statistics, gaining an appreciation of the general principles underlying many (seemingly unrelated) statistical methods.
  • The students should develop an awareness of the size, complexity and diversity of data sets which one encounters in practice.
  • The students should develop an appreciation for the challenges and problems posed to a statistician wishing to analyse a data set, especially the problem referred to as the "curse of dimensionality" in high-dimensional problems.
  • The students must gain knowledge of the various approaches to the "curse of dimensionality" problem and the manner in which these approaches compromise between underlying assumptions and sample size requirements.
  • The students must understand the important role played by more traditional, established statistical procedures such as multiple regression analysis, logistic regression analysis and linear discriminant analysis in modern data mining.
  • The students should become sensitive to and appreciative of the valuable contributions made to the development of data mining procedures in areas such as computer science and machine learning.

Regarding the content of the module, the following topics are discussed: linear algebra background, including inner products and orthogonal projections; ridge regression and the lasso; linear basis function expansions, regression splines and smoothing splines; reproducing kernel Hilbert spaces and kernel methods; the perceptron and the support vector classifier; projection pursuit regression and neural networks; feature extraction methods; probabilistic graphical models.

Capita Selecta in Mathematical Statistics A & B (11922-724 & 11923-754)

Objectives and content: Selected and specialised topics to be followed in Mathematical Statistics. Content varies from year to year when offered.


 


 

Module Content - Master's

Bootstrap and other Resampling techniques A & B (10694-811 & 10695-841)

Objectives and content: Traditional procedures of statistical inference are in many cases valid only asymptotically, or under strict assumptions for small samples. For many problems it is impossible to find solutions analytically. Resampling techniques are computer-intensive methods that repeatedly resample from the original sample in order to obtain solutions to inferential statistical problems. The aim of this module is to introduce the student to the bootstrap and related computer-intensive methods, enabling him or her to use these methods correctly and with confidence in practice. The A module is a prerequisite for the B module.
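The resampling idea described above can be sketched in a few lines of Python. The example draws bootstrap samples with replacement from an observed sample, recomputes the median on each one, and uses the spread of these replicates to obtain a standard error and a simple percentile interval. The data values, the number of bootstrap replicates, the seed and the helper names are made up for illustration, and the percentile interval is only one of several bootstrap interval constructions.

```python
import random
import statistics


def bootstrap_replicates(sample, statistic, n_boot, rng):
    """Recompute `statistic` on n_boot resamples drawn with replacement."""
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]


def percentile_interval(replicates, level=0.95):
    """Simple percentile interval read off the sorted replicates."""
    ordered = sorted(replicates)
    alpha = (1 - level) / 2
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1 - alpha) * len(ordered)) - 1]
    return lower, upper


# Made-up observations, used only to demonstrate the mechanics.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]
rng = random.Random(1)  # fixed seed so that the run is repeatable
reps = bootstrap_replicates(sample, statistics.median, 2000, rng)
std_error = statistics.stdev(reps)
lower, upper = percentile_interval(reps)
print(f"bootstrap SE of the median = {std_error:.3f}")
print(f"95% percentile interval = ({lower:.2f}, {upper:.2f})")
```

The median is used here because its sampling distribution has no convenient closed form for small samples, which is the kind of situation in which resampling is attractive.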

Statistical Learning Theory A & B (10703-812 & 10704-842)

Objectives and content: The outcomes of these modules can be summarized as follows:

  • The students should develop a holistic view of the subject of statistics, gaining an appreciation of the general principles underlying many (seemingly unrelated) statistical methods.
  • The students should develop an awareness of the size, complexity and diversity of data sets which one encounters in practice.
  • The students should develop an appreciation of the challenges and problems posed to a statistician wishing to analyse a data set, especially the problem referred to as the "curse of dimensionality" in high-dimensional problems.
  • The students must gain knowledge of the various approaches to the "curse of dimensionality" problem and the manner in which these approaches compromise between underlying assumptions and sample size requirements.
  • The students must understand the important role played by more traditional, established statistical procedures such as multiple regression analysis, logistic regression analysis and linear discriminant analysis in modern data mining.
  • The students should become sensitive to and appreciative of the valuable contributions made to the development of data mining procedures in areas such as computer science and machine learning.
  • The students should be granted the opportunity to enhance their programming skills through writing appropriate programs to solve various data analysis problems.

Regarding the content of the modules, statistical learning theory is a collective term for a variety of techniques that can be used to identify, describe and model important patterns and trends in data sets. The topics studied in these modules include techniques that are well established in traditional statistics, namely regression analysis, discriminant analysis, spline models and smoothing splines, as well as more recently developed approaches such as regression and classification trees, additive models, bagging, boosting (also from a functional gradient descent point of view), neural networks, random forests and support vector machines. These modules form a year module. The A module is a prerequisite for the B module and there is a single assessment at the end of the year for this year module.

Advanced Sampling Techniques (10523-818)

Objectives and content: In practice, complex sampling techniques are usually applied to design sample surveys. Furthermore, nonresponse and skewness, which generally manifest in sample surveys, need to be addressed scientifically. This course covers both theoretical and practical aspects of sampling and includes the following: two-stage cluster sampling; design and estimation of complex sample surveys; design effects; dealing with nonresponse and missing data; weighting of surveys; inferential statistics for complex survey data.

Multi-dimensional Scaling A & B (10597-822 & 11910-852)

Objectives and content: Multi-dimensional scaling (MDS) consists of various techniques from the field of multivariate statistical analysis. MDS focuses on dimension reduction and graphical displays of multi-dimensional data. This module introduces the theory and practical implementation of classical metrical scaling, non-metrical scaling, various forms of Procrustes analysis, unfolding techniques, individual differences models, as well as m-mode n-way models. Biplot methodology is emphasised. MDS techniques for both quantitative and qualitative data are considered. Correspondence analysis, multiple correspondence analysis, homogeneity analysis, analysis of distance as well as non-linear principal component analysis and canonical variate analysis are also discussed. The A module is a prerequisite for the B module.

Extreme Value Theory A & B (10441-813 & 10442-843)

Objectives and content: Extreme Value Theory (EVT) entails the study of extreme events, i.e. unusual events rather than the usual events studied in more traditional statistics. To this end, theory has been developed that describes behaviour in the tails of distributions. These results are analogous to the results of central limit theory and, in a similar way, transform problems with unknown underlying distributions into parametric problems where only the parameters are unknown. Techniques have been developed to carry out inference on these parameters and to apply them to data sets where understanding behaviour in the tails of distributions is important. In these modules the mathematical and practical aspects of the theory and inference techniques will be studied. These modules form a year module. The A module is a prerequisite for the B module and there is a single assessment at the end of the year for this year module.

Advanced Mathematical Statistics A & B (10524-819 & 11173-849)

Objectives and content: Selected and specialised topics to be followed in Mathematical Statistics. Content varies from year to year when offered.