Universiteit Stellenbosch
Workshop in multivariate data analysis hosted by Vibrational Spectroscopy Unit
Author: Dr J Colling & E Els
Published: 17/02/2020

A three-day workshop on multivariate data analysis was organized by the CAF Vibrational Spectroscopy Unit and the Department of Food Science. The workshop was presented by an invited international researcher, Professor Federico Marini, an associate professor in the Department of Chemistry at the Sapienza University of Rome. He is an expert in chemometrics, and during his PhD he focused on developing chemometric tools to evaluate the quality of food products.

 

Photo: Participants of the multivariate data analysis workshop. Photo by Prof Marena Manley

Multivariate data analysis can be described as the 'simultaneous analysis of multiple measurements collected from individuals or objects'. Interpreting the results and understanding the applications of these analyses are important skills.

The aim of the training initiative was to provide the 30 participants, who had different levels of experience in data analysis, with an overview of the techniques involved. The course was packed with information, from the history of chemometrics and its fundamental principles to the more challenging aspects of data pretreatment and predictive modeling. Morning sessions focused on the theory behind various analytical methods. Afternoon sessions were dedicated to practical examples that demonstrated real-world applications. The workshop also gave participants the opportunity to get assistance with their own data analysis and to network with colleagues who have similar goals. The workshop was made possible by the National Research Foundation (NRF), which provided funding through the Knowledge Interchange and Collaboration Grant.

More about multivariate data analysis

People interact with and benefit from the results of multivariate data analysis, or data science, more often than they realize. The weather application on a cellphone analyzes the wind, temperature and air pressure measurements collected by weather stations to predict today's weather. When a consumer swipes a loyalty card, companies can collect information to study purchasing behavior, which in turn assists with personalized advertising. The application of multivariate data analysis to study the chemically relevant information produced by chemical experiments is termed chemometrics (Wold, 1995).

The data generated with the near infrared (NIR) hyperspectral imaging cameras at the CAF Vibrational Spectroscopy Unit is often of a multivariate nature. In this instance, the absorption of 'light' at various wavelengths in the NIR range is measured with a spectrometer. As the absorption of 'light' is influenced by the compounds in the sample, the data can be used to evaluate certain chemical properties of the samples. Because a camera is used, an image of the object is collected, and an absorption plot, or spectrum, is available for every pixel in the image.

This three-dimensional dataset, consisting of the pixel position (x and y) in the image and the absorption spectrum of every pixel, is called a hypercube. The data can be used for exploratory analysis to look for trends or patterns. Alternatively, prior knowledge of the samples, such as the measured quantity of a compound (for example, protein content) or the class of a sample (diseased or healthy), can be used for predictive modeling. This makes it possible to predict specific measurable chemical properties (quantitative prediction) or to classify samples according to their chemical nature (qualitative prediction).
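To make the hypercube idea concrete, the sketch below stores a tiny hypercube as nested Python lists, unfolds it into a table with one spectrum per pixel, and then uses a few labelled pixels to classify the rest. Everything in it is an assumption made for illustration: the absorption values and labels are invented, the function names are hypothetical, and the nearest-class-mean rule is only a simple stand-in for qualitative prediction, not a method the workshop or the unit is known to use.

```python
# Toy hypercube: 2 rows (y) x 3 columns (x) of pixels, 4 wavelength bands each.
# All absorption values are invented for illustration.
hypercube = [
    [[0.10, 0.20, 0.30, 0.40], [0.12, 0.22, 0.31, 0.41], [0.50, 0.40, 0.30, 0.20]],
    [[0.11, 0.21, 0.29, 0.39], [0.52, 0.41, 0.29, 0.19], [0.49, 0.39, 0.31, 0.21]],
]


def unfold(cube):
    """Flatten a (rows, cols, bands) cube into pixel positions and a pixels x bands table."""
    positions, spectra = [], []
    for y, row in enumerate(cube):
        for x, spectrum in enumerate(row):
            positions.append((x, y))
            spectra.append(list(spectrum))
    return positions, spectra


def mean_spectrum(spectra):
    """Average a list of spectra band by band."""
    return [sum(band) / len(spectra) for band in zip(*spectra)]


def nearest_class(spectrum, class_means):
    """Return the label whose mean spectrum is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(class_means, key=lambda label: sq_dist(spectrum, class_means[label]))


positions, spectra = unfold(hypercube)

# Hypothetical prior knowledge: the class of four pixels, keyed by (x, y).
labelled = {(0, 0): "healthy", (1, 0): "healthy", (2, 0): "diseased", (1, 1): "diseased"}

# One mean spectrum per class, built from the labelled pixels only.
class_means = {}
for label in sorted(set(labelled.values())):
    members = [hypercube[y][x] for (x, y), lab in labelled.items() if lab == label]
    class_means[label] = mean_spectrum(members)

# Qualitative prediction: assign every pixel to the nearest class mean.
predicted = {pos: nearest_class(s, class_means) for pos, s in zip(positions, spectra)}
```

Unfolding is the step that turns the image into an ordinary data table, so that each pixel can be treated as one sample with many measured variables. The `predicted` dictionary can be mapped back onto the image through the stored `(x, y)` positions.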

A short workshop on NIR hyperspectral imaging is planned for April 2020. It will be hosted by Dr José Amigo Rubio. Contact Dr Janine Colling (jcolling@sun.ac.za) to attend this workshop. Another hyperspectral imaging training opportunity will follow in June.

Reference:

Wold S (1995). Chemometrics; what do we mean with it, and what do we want from it? Chemometrics and Intelligent Laboratory Systems 30: 109-115.