Exploring the Steps of Infrared (IR) Spectral Analysis: Pre-Processing, (Classical) Data Modelling, and Deep Learning
Infrared (IR) spectroscopy has greatly improved the ability to study biomedical samples. It measures how molecules interact with infrared light and thereby probes their vibrational states, so the resulting IR spectrum provides a unique vibrational fingerprint of the sample. This characteristic makes IR spectroscopy an invaluable and versatile technology for detecting a wide variety of chemicals, and it is widely used in biological, chemical, and medical scenarios, including, but not limited to, micro-organism identification, clinical diagnosis, and explosive detection. However, IR spectroscopy is susceptible to various interfering factors such as scattering, reflection, and interference, which manifest themselves as baseline effects, band distortions, and intensity changes in the measured IR spectra. Because these interferences are superimposed on the absorption information of the molecules of interest, they prevent direct data interpretation based on the Beer–Lambert law. Instead, more advanced data analysis approaches, particularly artificial intelligence (AI)-based algorithms, are required to remove the interfering contributions and, more importantly, to translate the spectral signals into high-level biological/chemical information. This leads to the tasks of spectral pre-processing and data modeling, the main topics of this review. In particular, we discuss recent developments in both tasks from the perspectives of classical machine learning and deep learning.