Developing a DNA Marker Approach for the Sustainable Production of D-Tagatose

D-tagatose is a low-calorie sugar with numerous benefits and considerable potential for the food industry. D-tagatose can be produced biologically using the L-arabinose isomerase (L-AI) enzyme. However, sustainable production of D-tagatose still faces challenges because of the limited specificity of the enzyme and the high temperature required for large-scale production. This study aims to develop an approach for discovering new bacteria that have the L-AI enzyme by implementing the DNA marker technique. We collected protein sequences from a public biological database and performed a multiple sequence alignment. Degenerate primers were then designed based on the aligned sequences. The primers were characterized using Oligo Calc, and in-silico PCR amplification was performed to test their specificity. Overall, the primers’ properties met the criteria for optimally working primers. In addition, simulated gel electrophoresis confirmed the successful in-silico amplification of the gene encoding the L-AI enzyme from several bacteria. Our approach could be used to discover L-AI enzymes with the desired characteristics, which would support the sustainable production of D-tagatose.


Introduction
Sugar is considered one of the most important elements of our diet. It is present in many of the foods and drinks we normally consume because it enhances their taste [1]. However, consuming sugar with a relatively high calorie content could increase the risk of several health problems, such as diabetes, obesity, cardiovascular disease, and dental cavities [2]. Thus, replacing high-calorie sugar with an alternative, such as a low-calorie sugar or sweetener, could mitigate these problems. D-tagatose is a monosaccharide sugar with a structure similar to that of galactose. D-tagatose occurs naturally in fruits and dairy products, albeit only in small amounts. Compared to sucrose (common sugar), D-tagatose has approximately 92% of the sweetness and around 40% of the calories, making it a low-calorie sugar [3]. Several studies have shown that D-tagatose has many health benefits, including anti-aging and antioxidant effects, improving the digestive system, improving fertility, and supporting healthy fetal development [4]. Furthermore, D-tagatose is reported not to significantly increase the blood glucose level and may promote weight loss, which makes it attractive for patients with diabetes or obesity [5]. In addition, the US Food and Drug Administration (FDA) has given D-tagatose the generally recognized as safe (GRAS) status, indicating that it is considered safe for general consumption [6].
Given these benefits, D-tagatose possesses great potential to be used in the food industry.
D-tagatose can be produced via two different routes: the chemical route and the biological route. The chemical route uses galactose as the raw material and consists of two steps: first, an isomerization step that converts D-galactose into a D-tagatose complex, and second, a filtration step that separates D-tagatose from other compounds [7]. Meanwhile, the biological route of D-tagatose production relies on one of two enzymes, galactitol dehydrogenase or L-arabinose isomerase [7]. The galactitol dehydrogenase enzyme, which is present in several microorganisms, catalyzes the conversion of galactitol to D-tagatose [8]. This approach can produce a high yield of D-tagatose, but galactitol (the substrate) is quite expensive, which hinders large-scale production [7]. The L-arabinose isomerase (L-AI) enzyme, in contrast, catalyzes the conversion of D-galactose into D-tagatose [8]. As a biological route, the L-AI enzyme offers several advantages over the chemical route, including simpler purification steps and being environmentally friendly.
Producing D-tagatose with the L-AI enzyme is considered the most realistic method for large-scale production. This is mainly because the method is environmentally friendly and more economical than the other biological route. However, the main issue with using the L-AI enzyme for large-scale production is its specificity, as the enzyme can also convert L-arabinose into L-ribulose. Furthermore, large-scale production requires certain physical settings, such as high-temperature conditions [9]. High temperatures can speed up the enzymatic process and prevent bacterial or fungal contamination [10]. Hence, using an L-AI enzyme that has high specificity and withstands high temperatures could address these issues.
One favorable approach for the large-scale production of D-tagatose is to use an L-AI enzyme obtained from organisms living in extreme conditions, such as thermophilic bacteria. This includes optimizing the enzyme activity in a bioreactor [11][12]. To date, researchers have characterized L-AI enzymes from various bacteria, including mesophilic and thermophilic bacteria [12][13][14]. Nevertheless, many thermophilic bacteria, which may possess L-AI enzymes with better characteristics, have not yet been discovered. The exploration of potential thermophilic bacteria could be undertaken using a DNA marker. Our work developed an approach to discover new L-AI enzymes from bacteria. We implemented a DNA marker technique, which includes mining and analyzing L-AI sequences, and then designing and testing degenerate primers. Our findings could serve as a reference for discovering L-AI enzymes with favorable characteristics that are expressed by thermophilic bacteria.

Materials and Methods
In general, the DNA marker technique that was used in this study consists of several steps (Fig. 1): mining DNA sequences, sequence analysis, DNA marker (primer) design, marker characterization, and testing.

Data acquisition
The sequences of L-AI were obtained from the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov). L-AI sequences from various bacteria, including mesophilic and thermophilic species, were retrieved and used in this study.

Sequence analysis
The Basic Local Alignment Search Tool (BLAST) program was used to retrieve L-AI protein sequences from different bacteria (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The PSI-BLAST (Position-Specific Iterated BLAST) algorithm was selected to search for similarities between protein sequences, as it filters all related sequences present in the database [15]. Several taxa were chosen to guide PSI-BLAST toward the most relevant bacteria, including Escherichia coli, Lactobacillus, Alicyclobacillus, Geobacillus, Thermotoga, and Thermus sp. Then, the EMBL-EBI online tools were used to analyze the sequences. The MUSCLE algorithm was used for the multiple sequence alignment because it is fast and accurate. After the alignment, a consensus sequence was created and used for designing the degenerate primers.
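The consensus step described above can be illustrated with a minimal sketch. It assumes the alignment has already been produced (for example by MUSCLE) and is available as equal-length strings; the toy sequences and the simple majority rule are illustrative assumptions, not the study's data or the exact rule used by the EMBL-EBI tools.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent symbol in each column of an alignment.

    `aligned` is a list of equal-length strings, with "-" marking gaps.
    Ties go to the symbol that appears first in the column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        result.append(symbol)
    return "".join(result)


# Hypothetical toy alignment, not taken from the L-AI data set.
toy_alignment = [
    "MLT-KQ",
    "MLTAKQ",
    "MITAKE",
]
print(consensus(toy_alignment))
```

A majority rule is the simplest choice; a real workflow might instead report ambiguous positions explicitly, which is where degenerate bases become useful at the primer design stage.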

Degenerate primer design
The degenerate primers for the L-AI enzyme were designed using SnapGene Viewer software version 5.2.5, based on the consensus sequence generated from the multiple sequence alignment. Several degenerate bases were used, based on the one-letter codes from the International Union of Biochemistry (IUB). Next, the designed primers were characterized using Oligo Calc software (http://biotools.nubic.northwestern.edu/OligoCalc.html), which calculates the properties of a primer, such as melting temperature (Tm), GC content, length, molecular weight, hairpin formation, and self-complementarity [16].
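To make the role of the degenerate one-letter codes concrete, the sketch below expands a degenerate primer into the sequences it encodes and estimates the GC content and Tm of each variant. The primer shown is hypothetical (the study's primer sequences are not reproduced here), and the salt-adjusted Tm approximation with 50 mM Na+ is an assumed choice; Oligo Calc offers several Tm estimates, and this sketch does not claim to reproduce the one used in the study.

```python
import math
from itertools import product

# IUB/IUPAC one-letter nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


def gc_percent(seq):
    """GC content of a non-degenerate sequence, in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def salt_adjusted_tm(seq, na_molar=0.05):
    """Salt-adjusted Tm approximation in degrees Celsius (assumed formula)."""
    n = len(seq)
    gc = sum(base in "GC" for base in seq)
    return 100.5 + 41.0 * gc / n - 820.0 / n + 16.6 * math.log10(na_molar)


# Hypothetical 22-nt degenerate primer, for illustration only.
primer = "ATGGCNTAYGARCTGTGGTTYC"
variants = expand(primer)
gc_values = [gc_percent(v) for v in variants]
tm_values = [salt_adjusted_tm(v) for v in variants]

print(len(primer), "nt,", len(variants), "variants")
print(f"GC content: {min(gc_values):.1f}-{max(gc_values):.1f} %")
print(f"Tm estimate: {min(tm_values):.1f}-{max(tm_values):.1f} C")
```

Reporting a range rather than a single value reflects that a degenerate primer is a mixture of sequences, each with its own GC content and Tm.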

Primer Testing with in-silico Polymerase Chain Reaction (PCR) amplification
To test whether the designed primers could amplify the gene encoding the L-AI enzyme, in-silico PCR amplification was undertaken using SnapGene® software version 6.1 (from Insightful Science; available at www.snapgene.com). The result of the amplification was visualized using the Simulate Agarose Gel tool, with 1% agarose and the 1 kb DNA ladder from New England Biolabs as the marker.
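The idea behind in-silico PCR can be sketched in a few lines: locate the forward primer on the template, locate the reverse complement of the reverse primer downstream of it, and report the sequence in between. The sketch below is a simplified stand-in for the SnapGene workflow, not a reproduction of it; it requires exact matches (apart from degenerate positions), considers only the first binding site of each primer, and uses a made-up template and primers.

```python
import re

# IUB/IUPAC codes and their complements (degenerate codes included).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def to_regex(primer):
    """Turn a degenerate primer into a regular expression."""
    return "".join(f"[{IUPAC[base]}]" for base in primer.upper())


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, forward, reverse):
    """Return the predicted amplicon, or None if either primer has no site."""
    template = template.upper()
    fwd_site = re.compile(to_regex(forward)).search(template)
    if fwd_site is None:
        return None
    # The reverse primer binds the opposite strand, so on the given strand
    # its site reads as the reverse complement of the primer.
    rev_pattern = re.compile(to_regex(reverse_complement(reverse)))
    rev_site = rev_pattern.search(template, fwd_site.end())
    if rev_site is None:
        return None
    return template[fwd_site.start():rev_site.end()]


# Hypothetical template and primers, for illustration only.
template = "GGATCC" + "ATGGCATACG" + "TTTTTTTTTT" + "CGAGCTCAAG" + "AATT"
forward_primer = "ATGGCNTAYG"
reverse_primer = "CTTGRGCTCG"

amplicon = in_silico_pcr(template, forward_primer, reverse_primer)
print(len(amplicon), "bp:", amplicon)
```

The amplicon length is what a simulated agarose gel displays as a band position, so a single fragment of the expected size per template corresponds to a single band.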

Sequence analysis of L-AI enzyme
We first chose an L-AI protein sequence as a reference sequence. Using this reference sequence, data mining was performed with PSI-BLAST to retrieve sequences relevant to the reference. With the PSI-BLAST algorithm, L-AI protein sequences were retrieved from various bacterial taxa, including Escherichia coli, Klebsiella, Lactobacillus, Alicyclobacillus, Geobacillus, Thermotoga, and Thermus sp. The retrieved sequences showed a relatively high similarity to the reference sequence. The selected sequences were then subjected to multiple sequence alignment with the MUSCLE algorithm. In general, multiple sequence alignment examines changes in the nucleotide sequence, such as deletions, insertions, and substitutions [17].
The multiple sequence alignment revealed the aligned L-AI sequences from different bacteria (Fig. 2), including E. coli, K. pneumoniae, T. neapolitana, P. thermoglucosidasius, G. thermodenitrificans, Thermus sp., and A. acidocaldarius. All sequences were successfully aligned, although some mismatches were still present. These mismatches may be due to the different lengths of the L-AI sequences from different bacteria. From the sequence analysis, we then searched for relatively conserved regions. We identified two regions (blue boxes in Fig. 2) that contain similar sequences and are suitable for designing degenerate primers. Overall, the multiple sequence alignment yielded two candidate regions for DNA markers of the L-AI enzyme.
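A search for relatively conserved regions can also be automated. The sketch below scores each alignment column by the fraction of sequences sharing the most common residue and reports windows whose mean score reaches a threshold. The window width, the threshold, the rule of scoring gapped columns as zero, and the toy alignment are all assumptions made for illustration; the text above does not state how the two regions were selected.

```python
from collections import Counter


def column_identity(aligned):
    """Fraction of sequences sharing the most common residue, per column.

    Columns that contain a gap ("-") are scored 0.0, since a primer site
    should be present in every sequence.
    """
    scores = []
    for column in zip(*aligned):
        if "-" in column:
            scores.append(0.0)
        else:
            top_count = Counter(column).most_common(1)[0][1]
            scores.append(top_count / len(column))
    return scores


def conserved_windows(aligned, width, min_identity):
    """Return (start, end) column ranges whose mean identity is high enough."""
    scores = column_identity(aligned)
    hits = []
    for start in range(len(scores) - width + 1):
        window = scores[start:start + width]
        if sum(window) / width >= min_identity:
            hits.append((start, start + width))
    return hits


# Hypothetical toy alignment, not the L-AI alignment of Fig. 2.
toy_alignment = [
    "MKWFDGHA-L",
    "MRWFDGHASL",
    "MKWFEGHATL",
]
print(conserved_windows(toy_alignment, width=3, min_identity=1.0))
```

Lowering `min_identity` admits windows with a few variable positions, which is the situation degenerate bases are meant to handle.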

Characterization of L-AI primers
Next, the multiple sequence alignment result was used to design the degenerate primers. One set of primers is necessary for the DNA marker of the L-AI enzyme. Table 1 shows the properties of the degenerate primers as calculated by Oligo Calc. Both the L-AI_1 and L-AI_2 primers had the same length (22 nucleotides). The melting temperatures (Tm) of the L-AI_1 and L-AI_2 primers were 60.25°C and 59.45°C, respectively. The GC content of the two primers was also similar, at 45.5% for L-AI_1 and 40% for L-AI_2. Furthermore, Oligo Calc predicted that neither primer has any potential self-complementarity or hairpin formation, supporting the suitability of both primers for targeting the L-AI enzyme.
To design optimally working primers, several criteria must be addressed: sequence length, melting temperature (Tm), GC content, and any potential self-complementarity and hairpin formation [18]. For the sequence length, the primers should ideally be between 18 and 30 nucleotides long [18]. Meanwhile, the Tm should allow the primers to anneal efficiently to the target DNA region [18]. The GC content is also essential, as it influences the binding between the primers and the DNA region. A GC content of 40-60% is required for optimum binding to the DNA region [19]. In addition, self-complementarity and hairpin formation should be avoided because they could impede effective binding to the DNA region; therefore, a value as close to 0 as possible is preferable [19]. The primer characterization results from Oligo Calc showed that the designed primers met the requirements for optimally working primers. We then performed primer testing to find out whether the primers could amplify the gene encoding the L-AI enzyme from various bacteria and could be used as a DNA marker. The L-AI_1 and L-AI_2 primers were used to perform in-silico PCR amplification. Nine bacterial species, including mesophilic and thermophilic species, were chosen as the samples.
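The length and GC criteria cited in this section (18 to 30 nucleotides, 40-60% GC content, no hairpin or self-complementarity) can be written as a simple check and applied to the values reported for L-AI_1 and L-AI_2. The function name and the treatment of the range limits as inclusive are assumptions of this sketch; no Tm threshold is checked because none is stated in the text.

```python
def check_primer(length, gc_percent, hairpin=False, self_complementary=False):
    """Return a list of criteria the primer fails (empty if it passes all)."""
    problems = []
    if not 18 <= length <= 30:
        problems.append("length outside 18-30 nt")
    if not 40.0 <= gc_percent <= 60.0:
        problems.append("GC content outside 40-60%")
    if hairpin:
        problems.append("potential hairpin formation")
    if self_complementary:
        problems.append("potential self-complementarity")
    return problems


# Values reported in this section for the two designed primers.
reported = {
    "L-AI_1": {"length": 22, "gc_percent": 45.5},
    "L-AI_2": {"length": 22, "gc_percent": 40.0},
}

for name, properties in reported.items():
    problems = check_primer(**properties)
    print(name, "passes" if not problems else problems)
```

Note that L-AI_2 sits exactly at the lower GC limit, so whether it passes depends on treating that limit as inclusive.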

Testing L-AI primers
To observe the result of the in-silico PCR, gel electrophoresis was subsequently simulated. The simulated gel revealed the expected single fragment of the L-AI gene from all bacterial samples (Fig. 3), suggesting that the in-silico PCR amplification was successful. Overall, the result supported the specificity of the designed primers. The results show that the designed primers could serve as a DNA marker to amplify the gene encoding the L-AI enzyme from various bacteria. The primers can be used to discover L-AI enzymes with favorable characteristics, so that the desired L-AI enzyme could be used for large-scale production of D-tagatose. In future work, protein engineering could also be applied to further improve the characteristics of the L-AI enzyme. This could contribute to the sustainable large-scale production of D-tagatose, which has numerous benefits for humans.

Conclusion
Our work implemented a DNA marker technique for identifying the L-AI enzyme from various bacteria. Using protein sequences obtained from a public biological database, we analyzed the sequences, identified conserved regions, and designed degenerate primers. Primer characterization and testing were performed to confirm the properties of the primers. Overall, the designed primers successfully amplified the gene encoding the L-AI enzyme from different bacteria in silico. Hence, the designed primers could be used to discover L-AI enzymes with favorable characteristics, supporting the large-scale and sustainable production of D-tagatose.

Fig. 1. The workflow used in this study.

Fig. 2. Multiple sequence alignment analysis of L-AI protein sequences from various bacteria. The alignment is shown only partially to simplify the figure. The blue boxes show the regions that have a relatively conserved sequence.

Fig. 3. Gel electrophoresis of in-silico PCR amplification of various bacteria using the L-AI primers.

Table 1. The properties of primers for L-AI