The compost metagenome as a source of T4 bacteriophage pyrimidine dimer glycosylase homologues

Compost is a promising source of thermotolerant enzymes for biotechnology. Homologues of the bacteriophage T4 DNA glycosylase may find application in pharmaceuticals and cosmetics. Five homologues of the bacteriophage T4 pyrimidine dimer glycosylase, the product of the denV gene, were found by searching the compost metagenome proteins with the DELTA-BLAST algorithm. Phylogenetic analysis of the homologue sequences was carried out using the Maximum Likelihood algorithm in the MEGA X software package. The search thus yielded a set of promising proteins homologous to the repair enzyme, the pyrimidine dimer DNA glycosylase of bacteriophage T4. After structural modeling, they can be tested for thermal stability and evaluated as a basis for therapeutic and prophylactic drugs.


Introduction
Many enterprises and farms whose activities generate biowaste use various composting methods to process certain types of organic matter. This is justified both by the environmental safety of the method and by its economic value. Composting is the aerobic decomposition of waste by microorganisms into water, carbon dioxide, heat and the final product, compost [1]. The compost is, as a rule, subsequently used as an organic fertilizer for the soil. In addition, it can serve as a source of microorganisms whose enzymes can be used in various industries. The most promising directions are the selection of biochemical catalysts for the production of biofuels and for the processing of cellulose, plastic and other hard-to-decompose substances, as well as the search for biomolecules that can form the basis of pharmaceuticals.
One method that helps to define the prospects for research in these areas more clearly is the metagenomic analysis of microbial consortia from preselected and sequenced samples of various compost ecosystems. For example, one group of researchers analyzed a library of 16S rRNA pyrotag sequences from the metagenome of microbial consortia in compost enriched with rice straw [2]. They showed that the predominant bacterial groups were Actinobacteria, Proteobacteria, Firmicutes, Chloroflexi and Bacteroidetes. Many previously unknown species of the taxon Actinobacteria were also found in these communities, which led to the conclusion that the ecological diversity of thermophilic actinobacteria is greater than previously assumed [2]. This approach to metagenomes can therefore help to identify the sequences of required enzymes, and their hosts, in a selected community for a wide variety of purposes.
One of the most popular trends in cosmetology and pharmacology is the search for and development of preparations that protect the skin from the effects of UV radiation. They can be used to treat diseases such as xeroderma pigmentosum or to create daily care products. DNA glycosylases of pyrimidine dimers have significant potential in this area and are already establishing themselves as pharmaceutical and cosmetic protective and repair agents. Many studies have been devoted to the product of the denV gene of Escherichia virus T4. It is a multifunctional enzyme that performs excision repair of pyrimidine dimers through its N-glycosylase and AP (apurinic/apyrimidinic) lyase activities [3]. A number of articles are devoted to its activity and effects in eukaryotic and prokaryotic cells [4][5], and there are also several patents associated with it [6][7]. For example, Korean inventors have created a cosmetic composition containing a component that can prevent signs of aging and promote skin recovery from sun stress [6]. Inventors from Oregon Health and Science University received a Canadian patent for pyrimidine dimer-specific glycosylase polypeptides and methods of using them to repair damaged DNA [7]. The patent covers the product of the denV gene and its specially engineered mutants. Moreover, in addition to the forms of the bacteriophage T4 enzyme, its homologue, the pyrimidine dimer-specific glycosylase (CV-PDG) of Paramecium bursaria chlorella virus-1 (PBCV-1), was also used [8].
It follows that the search for DNA glycosylases, in particular homologues of the already proven DenV enzyme of bacteriophage T4, can be of great value for the preparatory stages of the selection and development of UV protectors for medicine, veterinary medicine and cosmetology. As mentioned earlier, compost is a rather specific ecosystem inhabited by a large number of microorganisms. In theory, these microorganisms may possess more thermally stable and valuable homologues of the DNA glycosylase. Thus, the aim of this work was to search the compost metagenomes for new amino acid sequences of pyrimidine dimer DNA glycosylases and to analyze them taxonomically.

Materials and methods
To search for homologues, we took the amino acid sequence of the denV gene product, the DNA glycosylase DenV of Escherichia virus T4:
>NP_049733.1 DenV endonuclease V, N-glycosylase UV repair enzyme [Escherichia virus T4]
MTRINLTLVSELADQHLMAEYRELPRVFGAVRKHVANGKRVRDFKISPTFILGAGH
VTFFYDKLEFLRKRQIELIAECLKRGFNIKDTTVQDISDIPQEFRGDYIPHEASIAISQ
ARLDEKIAQRPTWYKYYGKAIYA
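As a quick sanity check on the query, the record above can be parsed and its length confirmed. The following is a minimal sketch in Python; `parse_fasta` is a hypothetical helper written for this illustration and is not part of the original workflow. Whitespace introduced by line wrapping is removed before the residues are counted.

```python
# Query record as given in the text (NCBI accession NP_049733.1).
QUERY_FASTA = """>NP_049733.1 DenV endonuclease V, N-glycosylase UV repair enzyme [Escherichia virus T4]
MTRINLTLVSELADQHLMAEYRELPRVFGAVRKHVANGKRVRDFKISPTFILGAGH
VTFFYDKLEFLRKRQIELIAECLKRGFNIKDTTVQDISDIPQEFRGDYIPHEASIAISQ
ARLDEKIAQRPTWYKYYGKAIYA
"""


def parse_fasta(text):
    """Return {accession: sequence} for every record in a FASTA string.

    Hypothetical helper for illustration only.
    """
    records = {}
    accession = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The accession is the first whitespace-delimited token of the header.
            accession = line[1:].split()[0]
            records[accession] = []
        elif accession is not None:
            # Drop any internal whitespace left over from line wrapping.
            records[accession].append("".join(line.split()))
    return {acc: "".join(parts) for acc, parts in records.items()}


denv = parse_fasta(QUERY_FASTA)["NP_049733.1"]
print(len(denv))  # number of residues in the query
```

Checking the length and the first and last residues before submitting the query guards against sequences that were truncated or split when copied from a formatted document.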

Search for homologues among the compost metagenomes
To find homologues in the metagenomes, the DELTA-BLAST algorithm was used with the following parameters: matrix BLOSUM62; gap costs: existence 9, extension 1. The second iteration was carried out using the PSI-BLAST algorithm with an inclusion threshold of 0.005. The search was performed in the env_nr database, restricted to taxid 702656.
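The text does not say whether the search was run through the NCBI web interface or locally. For readers who wish to approximate it with a local BLAST+ installation, the parameters above might be assembled as follows. This is a sketch under stated assumptions: the file names are hypothetical, a local copy of the database under the name `env_nr` is assumed, the two-step DELTA-BLAST/PSI-BLAST procedure is approximated by two `deltablast` iterations, and the taxid restriction is not reproduced here.

```python
def build_deltablast_command(query_path, database, out_path,
                             gap_open=9, gap_extend=1,
                             inclusion_threshold=0.005, iterations=2):
    """Assemble an argument list for the BLAST+ `deltablast` program.

    Parameter values mirror those reported in the text; the file paths
    and the local database name are assumptions for illustration.
    """
    return [
        "deltablast",
        "-query", query_path,
        "-db", database,
        "-matrix", "BLOSUM62",
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
        "-inclusion_ethresh", str(inclusion_threshold),
        "-num_iterations", str(iterations),
        "-outfmt", "6",  # tabular output, one hit per line
        "-out", out_path,
    ]


# Hypothetical file names; the list could be passed to subprocess.run().
command = build_deltablast_command("denv_t4.fasta", "env_nr", "denv_hits.tsv")
print(" ".join(command))
```

A local `deltablast` run also requires a conserved domain database, which is not shown here. Building the command as a list rather than a single string avoids shell quoting problems if it is later executed with `subprocess.run`.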

Reverse search for compost DNA glycosylases
To make the bioinformatics analysis more reliable, controls were introduced: homologues of the glycosylases found in the metagenomes. For each glycosylase found, a search was carried out using the PSI-BLAST algorithm in several iterations with an inclusion threshold of 0.001. The parameters for each search are presented in Table 1.

Search for homologues among the taxa Tequatrovirus and Bacteria
The bioinformatics analysis required homologous DNA glycosylase sequences from other members of the genus Tequatrovirus, to which Escherichia virus T4 belongs, as well as homologous proteins from bacteria. The search was carried out using the PSI-BLAST algorithm in several iterations with an inclusion threshold of 0.005. Sequences with an E-value < 3e-29 were selected. For bacteria, an additional criterion was that they belong to one of the following ecological niches: soil, or the human or animal microbiome.
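The E-value selection step can be illustrated with a short filter over tabular BLAST output. This is a minimal sketch, assuming the standard 12-column tabular format in which the subject identifier is the second field and the E-value the eleventh; the function name and the example hits (`hit_A` to `hit_C`) are invented for illustration and are not results from this study.

```python
def filter_hits(tabular_lines, max_evalue=3e-29):
    """Keep (subject_id, evalue) pairs whose E-value is strictly below the cutoff.

    Assumes the standard 12-column tabular BLAST layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue < max_evalue:
            kept.append((subject_id, evalue))
    return kept


# Invented example rows, not real search results.
example = [
    "NP_049733.1\thit_A\t98.5\t138\t2\t0\t1\t138\t1\t138\t1e-95\t280",
    "NP_049733.1\thit_B\t45.0\t130\t60\t3\t2\t131\t5\t134\t3e-29\t110",
    "NP_049733.1\thit_C\t30.1\t120\t70\t5\t4\t123\t9\t128\t2e-10\t55",
]
selected = filter_hits(example)
print(selected)
```

Because the criterion in the text is a strict inequality, a hit with an E-value of exactly 3e-29 is excluded in this sketch.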

Phylogenetic analysis
The amino acid sequences of the DenV T4 homologues in FASTA format were combined into one file and processed in the MEGA X software package [9]. Alignment was performed using the MUSCLE program [10]. The phylogenetic tree was constructed using the Maximum Likelihood algorithm [11]; a bootstrap of 1000 replicates [12] was chosen as the statistical test.
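MEGA X performs the bootstrap internally. To show what one bootstrap replicate consists of, the sketch below resamples the columns of an alignment with replacement; in a full analysis a tree would be inferred from each replicate and the branch support counted across replicates, which is not shown. The function name and the toy alignment are hypothetical and serve only as an illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment invented for illustration, not data from this study.
toy_alignment = {"seq_a": "MTRINL-T", "seq_b": "MTKINLAT", "seq_c": "MSRLNL-T"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates))
```

Each replicate has the same number of sequences and columns as the original alignment, but some columns appear more than once and others not at all, which is what lets the bootstrap estimate how strongly the data support each branch.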

Results
The search for homologues in the compost metagenomes yielded five glycosylase sequences with the GenBank numbers MMZ46843.1, MNW40567.1, MNS97894.1, MNQ25265.1 and MNL43486.1 [13-14]. The metagenomes containing these sequences were obtained from bacterial cultures derived from the compost soil of the Experimental Botanical Garden, Goettingen, Germany, by the research team of Egelkamp, R., Zimmermann, T., Hertel, R. and Daniel. These cultures were additionally enriched with nitriles and their corresponding carboxylic acids; they were subsequently sequenced and assembled into metagenomes. The glycosylases found are therefore unusual, since they come from an artificially created habitat, which adds to their value. The resulting phylogenetic tree (Fig. 1) shows how the glycosylases from the compost metagenome are distributed relative to the DenV phage glycosylase homologues from the genus Tequatrovirus and to the sequences found in soil bacteria and in bacteria that can be opportunistic or pathogenic for humans and animals. The reverse search also found a homologue in the domain Archaea.
Each glycosylase from the compost formed its own branch, which included its bacterial and archaeal homologues found by the reverse search. The exceptions were glycosylases MNL43486.1 and MNQ25265.1. In addition to their own homologues, their branches consisted mostly of proteins from bacteria of the human oral cavity and from pathogens that cause respiratory infections.
The reverse search showed that glycosylases MMZ46843.1 and MNW40567.1 have common homologues from Paenibacillus sp., important producers of antibiotics. The sequences from bacteriophages formed a separate branch of their own, together with a pair of homologues from bacilli and enterobacteria.

Findings
Thus, the glycosylases found in the compost metagenomes are more similar to bacterial sequences than to phage ones. The proteins MNL43486.1 and MNQ25265.1 may be the closest to phage proteins; these sequences also showed the greatest similarity to DenV of bacteriophage T4 in the initial search for homologues. Their close similarity to homologues from bacterial pathogens is also a concern, since the studied enzyme affects resistance to UV radiation.
The fact that these proteins come from a rather peculiar ecosystem may influence their functions and characteristics, so their further study may prove very promising for both industrial and fundamental research. The latter may raise the question of how this protein originated in bacteria and in phages. The authors have already carried out some studies on oceanic homologues of this glycosylase [15], and some of the results were similar: enzymes from Tequatrovirus phages and proteins from enterobacteria and bacilli were closely related.
After structural modeling, the glycosylases found can be tested for thermal stability and evaluated as a basis for therapeutic and prophylactic drugs.
The study was funded by RFBR and NSFC, project number 20-54-53018.