Application of ATR-FTIR to sensory prediction of Chinese Moutai-flavour liquor

ATR-FTIR combined with chemometrics was applied to establish SVM classification models for evaluating the sensory quality of Chinese Moutai-flavour liquor. Transformation of the ATR-FTIR data, selection of effective wavenumbers, and determination of c and gamma were performed in succession, and the models were then verified with unknown samples. The taste-prediction models for raw grain and cleanliness reached an accuracy of 90%, the after-taste model reached 80%, and the others were below 70%. For some flavours, ATR-FTIR combined with chemometrics therefore provides an effective method for quality analysis of Chinese Moutai-flavour liquor.


Introduction
Moutai-flavour spirit is a typical Chinese liquor with a soy sauce aroma, made from grains and daqu. During fermentation, the metabolism of the functional microorganisms gives rise to highly stable heat- and acid-resistant enzymes such as amylases, proteases, glucoamylases, cellulases, glucosidases and xylanases, as well as various dehydrogenases and phosphoenolpyruvate carboxykinases involved in oxidation-reduction reactions. Many kinds of extreme microorganisms (such as Bacillus licheniformis [1] and Bacillus subtilis [2][3][4], which produce the soy sauce aroma) and enzymes, favoured by the long production process, help to produce the rich flavour compounds of the liquor (including esters, aldehydes, and sulphur and nitrogen compounds). These flavour compounds enter the base distillates through different processes and contribute to the unique taste and quality of Moutai-flavour liquor, whose sensory characteristics are as follows: a) Smell: featuring soy Moutai-flavour aroma, fine and strong tasting, with enduring aroma lingering in empty glasses. b) Taste: tasting pleasantly strong and full, with lingering aftertaste. c) Style: featuring soy Moutai-flavour aroma, refined and delicate, pleasantly strong and full, with enduring aftertaste and lingering aroma in empty glasses [5].
The assessment of Chinese spirits is traditionally performed by professional tasters. The outcome depends mainly on the tasters' acute sense of smell, and the approach is time-consuming, labour-intensive, and affected by subjective factors or even bias. For decades, researchers have therefore sought sophisticated instruments combined with multiple analysis methods. So far, technologies such as GC [6], GC-MS [7], GC-O [8] and FTIR [9], usually combined with chemometric methods including PLS [10], SVM [11], ANN [12] and CA, have been applied to the quantitative or qualitative analysis of Chinese spirits to determine components and concentrations [13], or to identify source, geographic origin [14], brand, authenticity [15] or storage time [16]. Qin Ouyang studied the prediction of the overall sensory score of Chinese rice wine using NIR spectroscopy and a back-propagation artificial neural network (BPANN) combined with the adaptive boosting (AdaBoost) algorithm [17]. Daniel Cozzolino [18] investigated the relationship between sensory analysis and visible and near-infrared spectroscopy in two Australian white wine varieties using PLS regression. The correlation coefficients were greater than 0.70 for estery, lemon and honey, and less than 0.50 for passionfruit, overall flavour and sweetness in both calibration and cross-validation.
This paper aims to predict the sensory quality of Moutai-flavour base liquor using ATR-FTIR and SVM.

Sample
Moutai-flavour base liquor samples (different batches from 2015-2018) and their sensory analyses were provided by Guotai Liquor Co. Ltd. Sensory quality was evaluated with a score from 0 to 10 for each flavour. Prediction models were established using data collected in 2015-2017, and the 2018 data were used to test these models.

Methods
IR spectra were recorded with an accumulation of 32 scans in the 4000-650 cm⁻¹ range at a resolution of 4 cm⁻¹.
The score given to each sample by the sensory analysis panel was treated as a category in order to establish a Support Vector Machine (SVM) model. The models were built in The Unscrambler (version 10.3, CAMO Software AS, Norway), with the infrared absorbance values as independent variables.
An appropriate preprocessing method for the infrared data should be selected according to the accuracy of test models built with different sets of parameters. The effective wavenumber (ew) range is suggested by Principal Component Analysis (PCA), whose x-loadings plot highlights the most important regions and serves as an indicator of the wavenumbers that carry the most significant information in the spectra. With the best set of parameters (preprocessing and ew), a grid search is then required to find the best gamma and c values, the two key parameters of the Radial Basis Function (RBF) kernel SVM. C controls overfitting of the model, and gamma controls its degree of nonlinearity. Gamma is inversely related to sigma, a measure of spread around a mean in statistics: the higher the value of gamma, the lower the value of sigma, and thus the narrower the spread and the more nonlinear the behaviour of the kernel. Finally, unknown samples are predicted by the SVM model to confirm the accuracy of the models.
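The inverse relation between gamma and sigma can be illustrated with a minimal sketch. It assumes the common convention K(x, y) = exp(-gamma * ||x - y||^2) with gamma = 1 / (2 * sigma^2); the exact parameterisation used by The Unscrambler is not stated in the text, and the two short "spectra" are invented values for illustration only.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gamma_from_sigma(sigma):
    """Assumed convention: gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)

# Two hypothetical three-point "spectra" (illustrative values, not measured data).
a = [0.10, 0.20, 0.30]
b = [0.15, 0.10, 0.35]

# A smaller sigma means a larger gamma and a kernel value that falls off faster.
for sigma in (1.0, 0.1):
    g = gamma_from_sigma(sigma)
    print(f"sigma={sigma}, gamma={g:.1f}, K(a, b)={rbf_kernel(a, b, g):.4f}")
```

With a large gamma, only very similar spectra yield a kernel value near 1, which is why the decision boundary can become more nonlinear.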

Detrend
Removes the effects of baseline shift and curvilinearity.
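As a rough illustration of detrending, the sketch below fits a straight line to a spectrum by least squares and subtracts it. This is only a first-order version written for this example: removing curvilinearity would need a higher-order polynomial fit, and the function name `detrend_linear` is hypothetical rather than part of The Unscrambler.

```python
def detrend_linear(wavenumbers, absorbance):
    """Subtract the least-squares straight line from a spectrum."""
    n = len(wavenumbers)
    mean_x = sum(wavenumbers) / n
    mean_y = sum(absorbance) / n
    sxx = sum((x - mean_x) ** 2 for x in wavenumbers)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(wavenumbers, absorbance))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The residuals are the spectrum with offset and tilt removed.
    return [y - (slope * x + intercept)
            for x, y in zip(wavenumbers, absorbance)]

# Hypothetical spectrum: a small peak on top of a sloping baseline.
x = [650.0, 750.0, 850.0, 950.0, 1050.0]
y = [0.10, 0.12, 0.20, 0.16, 0.18]
print(detrend_linear(x, y))
```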

Gap derivatives
Baseline drifts are reduced and slight spectral differences are enhanced by computing the 1st and 2nd derivatives. To avoid amplifying the noise, which is a consequence of differentiation, the spectra are first smoothed.
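The smooth-then-differentiate idea can be sketched as follows. This is a simplified gap derivative, assuming a centred moving average for smoothing and a symmetric difference across `gap` points; the window and gap sizes are arbitrary illustrative choices, not the settings used in this work.

```python
def moving_average(values, window):
    """Centred moving average; the window is truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def gap_derivative(values, gap):
    """Difference across `gap` points on each side, per point spacing."""
    return [
        (values[i + gap] - values[i - gap]) / (2 * gap)
        for i in range(gap, len(values) - gap)
    ]

# Hypothetical absorbance values (illustrative only).
spectrum = [0.10, 0.11, 0.13, 0.18, 0.25, 0.19, 0.14, 0.12, 0.11, 0.10]
smoothed = moving_average(spectrum, window=3)
first = gap_derivative(smoothed, gap=1)
second = gap_derivative(first, gap=1)  # 2nd derivative: apply the step twice
print(first)
print(second)
```

Each differentiation shortens the series by `2 * gap` points, so the wavenumber axis has to be trimmed accordingly.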

Spectroscopic
Converts units from absorbance by taking the inverse logarithm to give reflectance or transmittance, respectively.
Chinese spirit contains water, ethanol and flavouring components. However, it is the flavouring components, which make up only 1%~2% of the liquor, that determine the aroma, taste and quality of the product. Ethanol was therefore subtracted as background when the spectra were collected, which caused the negative peaks (shown in Fig. 2). An FTIR spectrum reflects the holistic characteristics of a sample and contains overall information on its components, so the spectral data were treated as multivariate and entered into the software to analyse the relationship between the FTIR information and the sensory quality represented by the scores.
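The conversion from absorbance described under the Spectroscopic transformation can be written out in a few lines, assuming the usual definition A = -log10(T), so that T = 10^(-A). The function name is hypothetical and the values are illustrative.

```python
def absorbance_to_transmittance(absorbance):
    """Invert A = -log10(T) to obtain T = 10 ** (-A) for each point."""
    return [10.0 ** (-a) for a in absorbance]

# Illustrative absorbance values, including a negative one such as can arise
# when the background absorbs more strongly than the sample.
values = [0.0, 0.3, 1.0, -0.1]
print(absorbance_to_transmittance(values))
```

A negative absorbance maps to a transmittance above 1, which is how the negative peaks caused by ethanol background subtraction would appear after this transformation.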

Optimization of model
A taste-prediction model was established for each flavour after successive selection of the preprocessing method, ew and SVM parameters.
For raw grain, preprocessing the FTIR data with the baseline or Spectroscopic transformation could lead to better validation accuracy. A PCA based on the transformed FTIR data was then performed to select the ew. The scores plot and loadings plot are shown in Fig. 3. The closer the samples are in the scores plot, the more similar they are with respect to the two components concerned. The plot can be used to interpret differences and similarities among samples and to help determine which variables are responsible for the differences between samples. The two components together explain 98% of the variance, so the plot captures most of the information in the data and the relationships can be interpreted with a high degree of certainty. PCA is a good way to detect important variables.
Samples were divided into two groups by the ordinate axis. The vast majority of samples with scores of 5 and 6 are located in the second and third quadrants, and most of those with scores of 7 and 8 are in the first and fourth quadrants. This distribution indicates that PC1 is the major factor distinguishing higher from lower scores, but no apparent trend was found between 5 and 6 or between 7 and 8. The loadings plot on PC1 was then checked to find the ew, and 1122-920 cm⁻¹ was a more significant range than others for classification owing to its high values on the Y-axis. A grid search, an orthogonal test from 0.01 to 100 on 5 levels, was then conducted to find the optimal gamma and c values. When c was 100 and gamma was 0.01, the model had the highest validation accuracy, so this combination was selected for the SVM model of raw grain flavour.
Other flavour models were established following the same process, and their performance is shown in Table 6. The models of cleanliness, grain and after-taste have better predictive capability, with accuracy reaching 80%. Some flavours cannot be evaluated precisely, and their models need to be updated and corrected as more experimental data accumulate.
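How PCA loadings can point to important variables may be sketched with a small pure-Python example. It estimates the first principal component by power iteration on the mean-centred data and keeps the variables whose absolute loading is at least a chosen fraction of the largest one. The 0.5 cutoff, the function names and the tiny data matrix are assumptions made for illustration; in this work the range was read from the loadings plot rather than computed this way.

```python
import math

def first_pc_loadings(X, iters=200):
    """Estimate PC1 loadings by power iteration on mean-centred data."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
        w = [sum(Xc[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("data have no variance along the current direction")
        v = [x / norm for x in w]
    return v

def select_effective_indices(loadings, fraction=0.5):
    """Keep variables whose |loading| is at least `fraction` of the maximum."""
    cutoff = fraction * max(abs(v) for v in loadings)
    return [j for j, v in enumerate(loadings) if abs(v) >= cutoff]

# Hypothetical data: 4 samples x 3 variables; only the middle variable varies.
X = [[1.0, 0.0, 5.0], [1.0, 2.0, 5.0], [1.0, 4.0, 5.0], [1.0, 6.0, 5.0]]
loadings = first_pc_loadings(X)
print(loadings, select_effective_indices(loadings))
```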
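The grid search over c and gamma described for the raw grain model can be sketched as follows. The five levels are assumed to be logarithmically spaced between 0.01 and 100, which the text does not state explicitly, and `toy_validation_accuracy` is a made-up placeholder for the validation accuracy that would come from training the SVM at each grid point. It is not the data from this study.

```python
import itertools
import math

# Assumed logarithmic spacing of the 5 levels between 0.01 and 100.
LEVELS = [0.01, 0.1, 1.0, 10.0, 100.0]

def grid_search(score_fn, c_levels=LEVELS, gamma_levels=LEVELS):
    """Evaluate every (c, gamma) pair and return (best_score, c, gamma)."""
    best = None
    for c, gamma in itertools.product(c_levels, gamma_levels):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best

def toy_validation_accuracy(c, gamma):
    """Hypothetical stand-in for SVM validation accuracy (not real results)."""
    return 0.5 + 0.05 * (math.log10(c) - math.log10(gamma))

score, best_c, best_gamma = grid_search(toy_validation_accuracy)
print(best_c, best_gamma)
```

In practice, `score_fn` would train an RBF-kernel SVM with the given c and gamma on the calibration set and return its validation accuracy.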

External validation
Additional samples were predicted as unknown samples by the completed models to confirm accuracy (the consistency between predicted and sensory scores). The results are listed in Table 7. They indicate that grain and cleanliness have the best accuracy, reaching 90%. After-taste has an accuracy of 80%, and the others are lower than 70%.
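The accuracy figure used in this validation can be computed as in the short sketch below, assuming that a prediction counts as correct only when the predicted score equals the panel score exactly. The text describes accuracy as consistency between the two scorings, so the exact-match rule is an assumption, and the score lists are invented for illustration.

```python
def prediction_accuracy(predicted, sensory):
    """Fraction of samples whose predicted score equals the sensory score."""
    if not predicted or len(predicted) != len(sensory):
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(1 for p, s in zip(predicted, sensory) if p == s)
    return matches / len(predicted)

# Hypothetical scores for five unknown samples (not taken from Table 7).
predicted_scores = [5, 6, 7, 8, 6]
sensory_scores = [5, 6, 7, 7, 6]
print(prediction_accuracy(predicted_scores, sensory_scores))
```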