Study of Structure Activity Correlation of 7-O-Amide Hesperetine Derivative based on Descriptor Calculation by Using AM1 as Anti-Inflammatory Candidate



Introduction
Inflammation is the response of the immune system to infection and tissue damage [1][2][3]. It is also involved in the pathogenesis of various conditions, such as arthritis, cancer, stroke, and neurodegenerative and cardiovascular diseases [4,5]. The main signs of inflammation are rubor (redness), calor (heat), tumor (swelling), and dolor (pain) [6,7]. In Indonesia, the prevalence of diseases accompanied by inflammatory reactions, such as joint disease, diabetes mellitus, respiratory infection, and asthma, is quite high. According to RISKESDAS (2018), joint disease occurred especially in patients aged >75 years (18.9%), diabetes at the age of 55-64 years (6.3%) occurred mainly in DKI Jakarta, acute respiratory infection (ISPA) was most common in Papua (10.2%), and asthma in Yogyakarta (4.5%) [8]. Therefore, preventive measures are needed to minimize the occurrence of these diseases. One such measure is the development of new anti-inflammatory drugs.
The discovery and development of new drugs requires a great deal of time and money, and draws on various scientific disciplines to minimize errors. To reduce cost and time, experimental methods need to be supported by a theoretical or modeling approach. The relationship between the electronic and geometric structure of a molecule and its activity can be investigated through a quantum chemical approach, which offers an alternative in the search for new compounds by estimating the activity of a compound before it is synthesized [9][10][11]. This approach is known as the Quantitative Structure-Activity Relationship (QSAR) [12,13].
QSAR is a method of building computational or mathematical models to find statistically significant correlations between structure and function using chemometric techniques [14]. One such correlation is the relationship between descriptors and the bioactivity of a molecule based on quantum chemistry. Based on the equations obtained from QSAR, the active site of a molecule can be identified and used as the basis for new molecular designs [15]. The quality of the QSAR equation depends strongly on the descriptors (parameters) that affect the activity of the drug molecules. Descriptors are obtained from quantum mechanical calculations, usually with a semi-empirical method such as Austin Model 1 (AM1) [16]. Statistical methods are then used to produce the QSAR equation from the electronic and molecular descriptors that affect the biological activity of drugs [19]. Dawood et al. (2015) succeeded in synthesizing new coumarin-derived compounds and calculating the percentage of anti-inflammatory activity of these compounds. The present study examined the relationship between the structure and anti-inflammatory activity of 7-O-amide hesperetin derivatives using electronic and molecular descriptors calculated with the semi-empirical AM1 method. To the best of our knowledge, the anti-inflammatory properties of these derivatives have not previously been studied or predicted.

Research Methods
This research was conducted at the Computational Chemistry Laboratory, Department of Chemistry, Faculty of Mathematics and Natural Sciences, Universitas Negeri Gorontalo. The equipment consisted of hardware and software. The hardware was a personal computer (PC) with an 8th Generation Intel® Core™ i5 processor, 4 GB DDR4-2400 SDRAM (1 × 4 GB) memory, and a 256 GB PCIe® NVMe™ M.2 SSD for internal storage. The software was the Microsoft® Windows 10 Pro 32-bit operating system (OS), ChemDraw Professional 15.0 (2015), SPSS version 21.0, and HyperChem 18.0.
The sample of this study was a series of 7-O-amide hesperetin derivatives and their anti-inflammatory biological activity (IC50) values, obtained from the publication of Yilong (2019) [20]. The parent structure of the 7-O-amide hesperetin compound is shown in Figure 1. A total of 23 derivative compounds of 7-O-amide hesperetin (Table 1), taken from the literature [4], were drawn in 2D and then converted to 3D using ChemDraw Professional 15.0 (2015). The 3D structures were geometry-optimized using HyperChem, and the descriptors were then calculated using the Austin Model 1 (AM1) method. The descriptors used were parameters according to the Hansch QSAR model, namely HOMO energy, LUMO energy, log P, hydration energy, polarizability, and net atomic charge. The data analysis technique used to determine the QSAR equations was multiple linear regression (MLR), calculated using SPSS 21.0. The equations were then selected based on statistical parameters that describe significance, namely the correlation coefficient (R), the square of the correlation coefficient (R²), the Fcount/Ftable ratio, and the standard error (SE) [12]. The resulting equations were then tested against the test compounds by calculating the PRESS (Predicted Residual Sum of Squares) value. The best equation model was the one with the smallest PRESS value [21]. The best anti-inflammatory compounds were those with the lowest IC50 [22,23].

Calculation of Descriptors
The molecular descriptors referred to in this study were the physicochemical properties of the 7-O-amide hesperetin derivatives, used as parameters to study the quantitative structure-activity relationship and to determine pharmacological characteristics. These descriptor data were used as the independent variables in the statistical analysis to find the QSAR equation for the 7-O-amide hesperetin derivatives. The descriptor calculation was carried out using Austin Model 1 (AM1). The AM1 semi-empirical method was well suited to this study because it is applicable to most organic compounds. The method also has good prediction accuracy, requires a relatively short calculation time, and does not require large memory for data storage [24]. A descriptor is a parameter that represents the relationship between the structure of a molecule and its biological activity [21]. In the design of drug molecules, atomic charge is also used to describe the polarity of a compound. Atomic charge descriptors are also commonly used in calculating the chemical reactivity index as a measure of weak intermolecular interactions [25]. In this study, the electronic descriptors used were the net atomic charges (q) in the main framework of the 7-O-amide hesperetin compound, with a total of twenty-seven net atomic charges, namely qC1, qC2, qC3, qC4, qC5, qC6, qC8, qC10, qC11, qO14, qO16, qO17, qC19, qC20, qC21, qC22, qC24, qC26, qO28, qO30, qC31, qO35, qC36, qC39, qO40, and qN41, together with the HOMO (highest occupied molecular orbital) energy and the LUMO (lowest unoccupied molecular orbital) energy. The molecular descriptors, namely the partition coefficient (log P), polarizability, and hydration energy, were obtained from the QSAR Properties calculation on the Compute menu in HyperChem. The results of the descriptor calculations are shown in Table 2.

Determination of Training Set and Test Set
To find the best form of the QSAR equation and to validate the equation model, it was necessary to separate the 23 7-O-amide hesperetin derivatives into a training set and a test set. The results are shown in Table 3. The training set compounds were analyzed to generate the QSAR model, while the test set compounds were used to validate the QSAR model generated from the training set. Before the compounds were divided into the two groups, the IC50 values of all compounds were converted to logarithms (log). This was done so that the IC50 values of different compounds did not differ too widely and their distribution was more even. The test set consisted of 6 compounds: the 2 compounds with the smallest log IC50 values, 2 compounds with medium log IC50 values, and the 2 compounds with the largest log IC50 values. The remaining 17 compounds were used as the training set.
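The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_by_activity`, the compound labels, and the IC50 values are hypothetical and are not the data of Table 1, and "medium" is interpreted here as the two compounds nearest the middle of the ranking, which the text does not specify.

```python
import math


def split_by_activity(ic50_by_compound, n_each=2):
    """Return (training, test) lists of compound labels.

    The test set holds the n_each lowest, n_each middle and n_each highest
    compounds after ranking by log10(IC50); the rest form the training set.
    """
    # Rank compounds by log10(IC50), smallest first.
    ranked = sorted(ic50_by_compound,
                    key=lambda c: math.log10(ic50_by_compound[c]))
    n = len(ranked)
    # Assumption: "medium" means the block centred on the middle of the ranking.
    mid_start = (n - n_each) // 2
    test_idx = (set(range(n_each))
                | set(range(mid_start, mid_start + n_each))
                | set(range(n - n_each, n)))
    test = [c for i, c in enumerate(ranked) if i in test_idx]
    train = [c for i, c in enumerate(ranked) if i not in test_idx]
    return train, test


# Hypothetical IC50 values for 23 compounds (NOT the values of Table 1).
example = {f"cpd{i}": float(i) for i in range(1, 24)}
train, test = split_by_activity(example)
```

With 23 compounds this yields 6 test compounds and 17 training compounds, matching the split sizes reported in the text.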

Statistical Analysis
Multiple linear regression (MLR) was the statistical analysis used to examine the quantitative relationship between the structure of the 7-O-amide hesperetin compounds and their anti-inflammatory activity. The analysis was carried out with SPSS version 21.0, where the dependent variable (y) was the biological activity value (IC50) of the 7-O-amide hesperetin derivative and the independent variables (x) were the descriptors. The analysis was initially carried out only on the training set data. The aim was to determine the descriptors that had a significant effect on the IC50 value, as shown by the best QSAR equation resulting from this stage. The method used in SPSS for this stage was the Backward method. The output of the multiple linear regression analysis was the set of statistical parameters of the various descriptor combination models associated with the IC50 value, namely the correlation coefficient (R), the coefficient of determination (R²), the standard error of estimate (SEE), and the Fisher value (Fcount).
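For readers who want to see how these four parameters relate to one another, the following minimal Python sketch computes R, R², SEE, and Fcount from observed and fitted activity values. It assumes the fitted values come from an ordinary least-squares model with an intercept; the function name `regression_statistics` and the numbers are hypothetical and do not reproduce the SPSS output of this study.

```python
import math


def regression_statistics(observed, predicted, n_descriptors):
    """Return R, R2, SEE and F for a least-squares model with an intercept."""
    n = len(observed)
    dof = n - n_descriptors - 1          # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more compounds than descriptors + 1")
    mean_obs = sum(observed) / n
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_regr = ss_total - ss_resid        # valid for an OLS fit with intercept
    r2 = 1.0 - ss_resid / ss_total
    see = math.sqrt(ss_resid / dof)      # standard error of estimate
    f_count = (ss_regr / n_descriptors) / (ss_resid / dof)
    return {"R": math.sqrt(max(r2, 0.0)), "R2": r2, "SEE": see, "F": f_count}


# Hypothetical log IC50 values (NOT taken from the tables of this study).
obs = [1.1, 1.8, 3.1, 4.0, 5.0]
fit = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = regression_statistics(obs, fit, n_descriptors=1)
```

The Fcount returned here would then be compared with the tabulated F value for the same degrees of freedom (n_descriptors and n - n_descriptors - 1).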
The best QSAR equation was the one that met the following criteria: an R² value greater than 0.6, an SEE less than 0.3, and an Fcount/Ftable ratio greater than or equal to 1. When the results of the statistical analysis meet these criteria, it can be concluded that there is a correlation between the structure and the activity of the compounds being analyzed. Based on the data in Table 4, there were 8 QSAR models whose R² values met the criterion (greater than 0.6). This indicated that the effect of the independent variables (descriptors) on the anti-inflammatory activity was quite large (more than 60%). The SEE values also met the criterion (less than 0.3). This indicated that the accuracy of the resulting models for predicting new anti-inflammatory compounds was very good; the closer the SEE is to 0, the more accurate the model. The Fcount/Ftable ratio, which indicates the level of significance of the effect of the descriptors on activity, also met the criterion (more than 1). This showed that the influence of the descriptors on activity was statistically significant. Based on the specified statistical parameters, all models were acceptable and were further validated using the test set.
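The three acceptance criteria can be expressed as a simple filter. The sketch below is illustrative only: the helper `meets_criteria`, the model labels, and all statistics are hypothetical placeholders, not the values of Table 4, and the tabulated F value is assumed to be supplied by the user.

```python
def meets_criteria(r2, see, f_count, f_table):
    """True when a model satisfies all three acceptance criteria."""
    return r2 > 0.6 and see < 0.3 and (f_count / f_table) >= 1.0


# Hypothetical statistics for illustration (NOT the values of Table 4).
candidates = {
    "model_A": {"r2": 0.85, "see": 0.12, "f_count": 9.4, "f_table": 3.2},
    "model_B": {"r2": 0.55, "see": 0.10, "f_count": 6.0, "f_table": 3.2},
    "model_C": {"r2": 0.70, "see": 0.35, "f_count": 5.0, "f_table": 3.2},
}
# Keep only the models that pass every criterion.
accepted = [name for name, s in candidates.items() if meets_criteria(**s)]
```

In this made-up example, model_B fails on R² and model_C fails on SEE, so only model_A would go on to test-set validation.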

Validation of QSAR Equation Models
Validation of the QSAR equation model aimed to ascertain whether an equation was able to predict the biological activity of a series of compounds with a low probability of error. If the QSAR equation is valid, it can represent mathematically the quantitative relationship between the descriptors and the biological activity of the compounds studied. Validation was carried out on the test set compounds by calculating the predicted residual sum of squares (PRESS) values of the accepted equations. The PRESS value is the sum of the squared differences between the experimental biological activity and the biological activity predicted by the selected equation model. A good equation is characterized by a small PRESS value because this indicates a small error in the calculated biological activity. Models 6, 7, and 8 were validated because they had the fewest descriptors compared to the other models. The PRESS values and the experimental and predicted log IC50 values of the test set for the selected models are presented in Table 5. The plot of the experimental and predicted log IC50 values for all compounds can be seen in Figure 5. The prediction R² value was 0.678. This value indicated a good correlation between the experimental and predicted anti-inflammatory activity. The model generated from the data of all 23 compounds, involving the descriptors qC2, qC3, qC4, qC8, and qC11, gave the following equation: In the above equation, a negative coefficient indicates that increasing the descriptor results in a lower log IC50 value, that is, a more active compound. The statistical parameters of the above equation met the predetermined criteria, so the equation could be used to predict the anti-inflammatory activity of 7-O-amide hesperetin derivatives.
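The PRESS calculation described above amounts to a sum of squared differences over the test set. A minimal Python sketch follows; the model labels and log IC50 values are hypothetical and are not the data of Table 5.

```python
def press(observed, predicted):
    """Predicted residual sum of squares over the test-set compounds."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical experimental log IC50 values of four test compounds.
observed = [0.5, 1.0, 1.5, 2.0]
# Hypothetical predictions from three candidate equations.
predictions = {
    "model_A": [0.75, 1.0, 1.5, 2.0],
    "model_B": [0.5, 1.5, 1.5, 2.0],
    "model_C": [0.5, 1.0, 1.0, 1.0],
}
scores = {name: press(observed, p) for name, p in predictions.items()}
# The preferred equation is the one with the smallest PRESS.
best = min(scores, key=scores.get)
```

Because PRESS is computed on compounds that were not used to fit the equation, it measures predictive error rather than goodness of fit.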

Proposed Anti-Inflammatory Compounds
The proposed anti-inflammatory compounds in this study were designed by replacing the substituents R, R1, R2, R3, R4, R5, and X in the four series of 7-O-amide hesperetin parent compounds. The designed compounds were expected to show smaller log IC50 values than the lead compound series, indicating better anti-inflammatory activity. Replacement of substituents was based on the isosteric properties of the original group and the replacement group. In this study, 21 proposed anti-inflammatory compounds derived from 7-O-amide hesperetin were designed, as presented in table 5. The proposed anti-inflammatory compound was chosen based on the lowest IC50 value, because IC50 measures the effectiveness of a compound in inhibiting a reaction, in this case inflammation. The lower the IC50 value of a compound, the more effective the compound is at inhibiting inflammation. Based on the calculated activity values of the designed compounds, the proposed compound with the best anti-inflammatory activity relative to the previous 7-O-amide hesperetin derivatives was compound 38, which carried an N atom and three H atoms bound to X, R3, R4, and R5, respectively. According to Hidayati (2008), flavonoids act by inhibiting cyclooxygenase and lipoxygenase enzymes, which offers hope for the treatment of symptoms of inflammation and allergies. Flavonoids are a phenolic group that can act as a poison inhibitor and act slowly on the nervous system [27]. The 7-O-amide hesperetin compound is a flavonoid that could play a role in overcoming inflammation; accordingly, the descriptors that influenced the anti-inflammatory activity of its derivatives were qC2, qC3, qC4, qC8, and qC11, all of which are located in the chroman core that characterizes flavonoid compounds. The net atomic charges of the five descriptors were different, which showed that the electron density of each of these atoms differs.
An atom with a negative net atomic charge has a higher electron density than an atom with a positive net atomic charge. The proposed anti-inflammatory compound with the most favorable IC50 for preventing inflammation was compound 38, a derivative of compound 6a in which an N atom was substituted in substituent X, changing the end of the compound from a piperidine group to a piperazine group. The group changes are shown in Figures 3(a) and (b). Substitution of the CH group with an N atom made the anti-inflammatory activity of this compound more effective because the piperazine group is richer in electrons than the piperidine group. Hatnapure (2012) showed that electron-rich piperazine could increase the activity of chromone units in flavonoids in inhibiting inflammation [28]. This explains why the IC50 value of compound 38 was lower than those of the other 7-O-amide hesperetin derivatives. The N atom was chosen to replace the CH group because N is isosteric with CH: the N atom has the same number of electrons as the CH group, namely 7. The systematic name of proposed compound 38 is (S)-5-hydroxy-2-(3-hydroxy-4-methoxyphenyl)-7-(2-oxo-2-(piperazin-1-yl)ethoxy)chroman-4-one.