Identification of DNA G–quadruplex Forming Sequence in Shrimp White Spot Syndrome Virus (WSSV)

White spot syndrome virus (WSSV) is considered one of the most infectious and lethal viruses that affect shrimp. Bioinformatic studies revealed several G–quadruplex forming sequences in the open reading frame regions. Moreover, the sequences are widely conserved across all deposited WSSV sequences. Introductory structural studies suggest that two of these sequences, namely WSSV131 and WSSV172, form quadruplexes. While WSSV172 forms a mixture of quadruplex topologies, WSSV131 is suggested to form a parallel topology, as indicated by the NMR spectra and the circular dichroism (CD) ellipticity pattern. CD spectra also suggested that the major parallel species of the WSSV131 sequence is stable above 60 °C. Ultimately, these results may open a new strategy for WSSV treatment by targeting the quadruplex conformation with a quadruplex binding ligand.


Introduction
Shrimp is one of Indonesia's export commodities, supplying market demand from countries such as the USA, Japan, and China [1]. The shrimp farming industry in Indonesia has a production value of approximately IDR 36.22 × 10^12, which is expected to increase to IDR 90.3 × 10^12 by 2024, with production of approximately 517.397 t in 2019. Disease is a major problem in the industry, particularly viral disease, as shrimps lack some of the key components of adaptive and innate immune response mechanisms. White spot syndrome virus (WSSV) disease is one of the most important shrimp diseases in Indonesia and affects commercially cultured tiger shrimp (Penaeus monodon Fabricius, 1798) and vannamei shrimp (Penaeus vannamei Boone, 1931).
The disease caused by WSSV, dubbed white spot syndrome disease (WSSD), was first reported in 1994, had a serious impact on the shrimp farming industry, and has become endemic in Indonesia. In farming ponds, the onset of WSSD occurred during the second month of culture and was followed by increasing mass mortality ranging from 80 % to 100 % within a period of 3 d to 10 d. The causative virus, WSSV, is a large, enveloped DNA virus with a flagellum-like tail and is the only member of the family Nimaviridae, genus Whispovirus. The viral genome contains at least 181 ORFs, most of which encode polypeptides with little to no detectable homology to other known proteins. This virus has a broad host range amongst decapod crustaceans, such as freshwater and marine shrimps, crabs, and lobsters. Susceptibility to infection varies: some hosts may carry a high viral load with no detectable clinical signs, and low-level infections may go undetected. Amplification of viral loads and onset of disease can also be induced by environmental or physiological stress. Death of commercially cultured shrimps (1 mo to 2 mo old) has become common in Indonesia, and thousands of hectares of ponds have become unproductive due to WSSV. In Asia, WSSD has been estimated to have caused a loss of USD (4 to 6) × 10^9 from 1992 to 2002 [2].
Previous studies have shown several methods that could be used to reduce or prevent the infection of commercially cultured shrimps: the use of probiotics [3], herbal or antiviral plants [3,4], preliminary exposure to inactivated WSSV [4,5], vaccination trials and development of protein or nucleic acid-based vaccines [4], the use of physical barriers [6], and development of chemical binders for the WSSV VP28 envelope protein [7]. In this study, the WSSV genome is examined and found to contain putative intramolecular G-quadruplex forming sequences. An intramolecular G-quadruplex can form in a guanine-rich nucleotide sequence that consists of four Gn repeats (runs of consecutive guanines) connected by loop regions of various lengths. The guanines within one repeat stack onto each other to form a column, and the four columns are then joined by Hoogsteen-type hydrogen bonds into planar tetrads. Quadruplex formation and stability are governed by various factors, such as the polarity and torsion angle of the guanines and the length and sequence of the loop regions [8][9][10]. The G-quadruplex structure is commonly involved in molecular processes by negatively regulating the replication process and gene expression [11]. As a regulatory structure, the G-quadruplex has become an interesting topic for the development of viral control [12].
E3S Web of Conferences 374, 00040 (2023), 3rd NRLS. https://doi.org/10.1051/e3sconf/202337400040

Bioinformatic studies
The complete genome sequence of the WSSV strain CN04 (Accession number: KY828783.1) was retrieved from the NCBI database. G-quadruplex structures were predicted using the web server tool QGRS Mapper, with putative G-quadruplexes identified using the motif shown in Figure 2, where x represents the number of guanine tetrads in the G-quadruplex and y represents the length of the loops connecting the guanine tetrads [13]. Promising putative G-quadruplex forming sequences obtained from the web server were further aligned using BLAST with the other 30 strains to determine sequence conservation.
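The motif search described above can be illustrated with a short regular-expression scan. This is only a minimal sketch, assuming the motif has the general form G_x N_y1 G_x N_y2 G_x N_y3 G_x; it does not reproduce the scoring used by QGRS Mapper. The function name `find_putative_g4` and the default loop limits (1 to 7 nt) are illustrative assumptions, not settings reported in this work. The two test sequences are the oligonucleotides WSSV131 and WSSV172 studied here.

```python
import re


def find_putative_g4(sequence, tetrads=3, min_loop=1, max_loop=7):
    """Return (start, match) pairs for G_x N_y1 G_x N_y2 G_x N_y3 G_x motifs.

    start is a 0-based index into sequence. Loops may contain any base.
    The loop limits are hypothetical defaults chosen for this sketch.
    """
    g_tract = "G{%d}" % tetrads
    loop = "[ACGT]{%d,%d}?" % (min_loop, max_loop)
    # The lookahead reports at most one candidate per start position,
    # so overlapping candidates are not skipped.
    pattern = re.compile("(?=(" + g_tract + (loop + g_tract) * 3 + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


WSSV131 = "TCTGGGAGGGAAGGGGAGGGTTA"
WSSV172 = "TAGGGCCTTAGGGAAGGGATGGGA"

for name, seq in (("WSSV131", WSSV131), ("WSSV172", WSSV172)):
    print(name, find_putative_g4(seq))
```

With three-tetrad motifs, each oligonucleotide yields a single candidate that spans all four G-tracts. Scanning a whole genome would also require searching the reverse complement, which this sketch omits.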

Oligonucleotide preparation
WSSV131 (5'-TCTGGGAGGGAAGGGGAGGGTTA-3') and WSSV172 (5'-TAGGGCCTTAGGGAAGGGATGGGA-3') were purchased from TIBMOLBIOL (Berlin, Germany) and further purified by ethanol precipitation. DNA concentration was quantified by measuring the absorbance at 260 nm at 80 °C in water. CD spectroscopy employed 5 µM oligonucleotide dissolved in 20 mM potassium phosphate + 100 mM KCl buffer, pH 7.0. For NMR spectroscopy, samples were dissolved in 90 % H2O / 10 % D2O containing 10 mM potassium phosphate buffer, pH 7.0, to give a final concentration of around 0.3 mM. All samples were heated to 90 °C for 5 min and cooled slowly to room temperature before measurements.
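The absorbance-based quantification above follows the Beer-Lambert law, c = A / (ε · l). The sketch below shows the arithmetic only. The extinction coefficients used in this work are not stated, so the value of `epsilon_M_cm` in the example is a placeholder, and the function name `concentration_uM` is hypothetical.

```python
def concentration_uM(a260, epsilon_M_cm, path_cm=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar.

    epsilon_M_cm is the molar extinction coefficient at 260 nm in M^-1 cm^-1.
    """
    if epsilon_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a260 / (epsilon_M_cm * path_cm) * 1e6


# Placeholder numbers for illustration only; they are not measured values.
example = concentration_uM(a260=0.46, epsilon_M_cm=230_000.0)
print(f"{example:.2f} uM")
```

Measuring at 80 °C presumably keeps the oligonucleotide unfolded, so that an extinction coefficient calculated for the single strand is applicable.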

CD spectroscopy
CD experiments were done with a Jasco J-810 spectropolarimeter equipped with a Peltier thermostat (Jasco, Tokyo, Japan). Spectra were acquired in a 1 cm quartz cuvette from 210 nm to 350 nm with five accumulations, a bandwidth of 1 nm, a scanning speed of 50 nm min^-1, and a 4 s response time. The experiments were done at 20 °C, 40 °C, 60 °C, 80 °C, and 90 °C. Samples were stirred for 10 min at each temperature step to ensure that equilibrium had been reached. All spectra were blank corrected.

NMR spectroscopy
NMR experiments were done on a Bruker Avance 600 MHz spectrometer equipped with an inverse 1H/13C/15N/19F quadruple resonance cryoprobe head and z-field gradients. Topspin 4.0.7 and CcpNmr V2 were used for data processing and assignment. All spectra were recorded at 25 °C. 1D proton spectra were acquired with 32 K data points, 128 transient scans, a sweep width of 12 000 Hz, and 2 s of relaxation delay.
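As a rough consistency check, the acquisition parameters above can be converted into an acquisition time, a spectral width in ppm, and an approximate experiment duration. This sketch rests on several assumptions: 32 K means 32 768 points counted in the Bruker convention (real plus imaginary, so AQ = TD / (2 · SW)), the proton frequency is taken as a nominal 600.0 MHz, and dummy scans and pulse durations are ignored. The function name `acquisition_summary` is hypothetical.

```python
def acquisition_summary(td_points, sweep_width_hz, relaxation_delay_s, scans, field_mhz):
    """Return (acquisition time in s, sweep width in ppm, approximate total time in s)."""
    # Assumed Bruker convention: TD counts real and imaginary points together.
    acquisition_s = td_points / (2.0 * sweep_width_hz)
    sweep_width_ppm = sweep_width_hz / field_mhz
    # Each scan is approximated as relaxation delay plus acquisition time.
    total_s = scans * (acquisition_s + relaxation_delay_s)
    return acquisition_s, sweep_width_ppm, total_s


aq, sw_ppm, total = acquisition_summary(
    td_points=32 * 1024,
    sweep_width_hz=12_000.0,
    relaxation_delay_s=2.0,
    scans=128,
    field_mhz=600.0,
)
print(f"AQ = {aq:.3f} s, SW = {sw_ppm:.1f} ppm, total = {total / 60:.1f} min")
```

Under these assumptions, the sweep width of about 20 ppm comfortably covers the guanine imino proton region used to detect Hoogsteen hydrogen bonding.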

Analysis of DNA G-quadruplex forming sequence in WSSV genome
G-quadruplex structures form in guanine-rich DNA and usually consist of a sequence with four repeats of GGG connected by loop regions of variable length. These structures are commonly involved in molecular processes such as the regulation of gene expression and protein synthesis. With that in mind, we analyzed the complete genome of the White Spot Syndrome Virus obtained from the NCBI database and identified putative G-quadruplex forming sequences using the QGRS Mapper tool. The WSSV genome has a total length of approximately 281 kbp, and bioinformatics analysis in QGRS Mapper showed several promising putative G-quadruplex forming sequences (Table 1). The first sequence, dubbed WSSV131, is located exactly at the 3' end of the WSV131 open reading frame, while the sequence WSSV172 is located within the open reading frame [14,15]. Alignment results revealed that the sequences are highly conserved (Table 1). Only the G10 and G13 of WSSV131 are deleted in one strain (30 or 31) (Accession number: MG702567) (Figure 3B). While WSSV131 lies in a hypothetical coding region, the region containing WSSV172 is known to encode the ribonucleotide reductase large subunit. Targeting these sequences with a G-quadruplex stabilizing ligand could hypothetically reduce the viability of the virus by disrupting the DNA replication or transcription mechanism, as has been shown for the Ebola virus L gene [11,16].
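The conservation reported above can be expressed as a per-position frequency of the most common base across the aligned strains. The following sketch uses a toy alignment built from the text: 30 copies of the WSSV131 reference sequence and one variant with G10 and G13 deleted. It assumes these positions are numbered from 1 along the oligonucleotide and that the deletions appear as gaps in those columns; the real alignment may place them differently. The function name `column_conservation` is hypothetical.

```python
from collections import Counter


def column_conservation(aligned):
    """Return (most common symbol, fraction of sequences) for each alignment column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(aligned)))
    return result


reference = "TCTGGGAGGGAAGGGGAGGGTTA"
# Toy variant: gaps at 1-based positions 10 and 13 (assumed numbering).
variant = reference[:9] + "-" + reference[10:12] + "-" + reference[13:]
alignment = [reference] * 30 + [variant]

for position, (symbol, fraction) in enumerate(column_conservation(alignment), start=1):
    if fraction < 1.0:
        print(position, symbol, round(fraction, 3))
```

In this toy case only columns 10 and 13 fall below full conservation, each with guanine present in 30 of 31 sequences.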

Evidence of G-quadruplex formation in the highly conserved sequences
Introductory structural studies were done on WSSV131 and WSSV172. The addition of 5' and 3' overhangs can reduce the risk of aggregation via tetrad end-to-end stacking, which tends to complicate the interpretation of G-quadruplex topologies [17]. Spectroscopic techniques, namely CD and NMR, were then employed to confirm the formation of G-quadruplexes in WSSV131 and WSSV172. The formation of quadruplexes by WSSV131 and WSSV172 is supported by NMR spectra in the region of the imino protons of guanines participating in Hoogsteen hydrogen bonds. Imino proton spectra of WSSV131 revealed 12 strong resonances, indicating the presence of a three-layered G-tetrad core, as can be deduced from the sequence (Figure 4). However, minor quadruplex species are still present in the sample. Such minor species can arise from the run of four guanines in the third G-tract [18]. CD spectra of WSSV131 displayed the characteristic ellipticity pattern of a parallel quadruplex, with a positive peak at ~265 nm and a negative peak at ~240 nm. At about physiological intracellular potassium ionic strength, the WSSV131 quadruplex is still present at 90 °C. The melting temperature is predicted to be between 60 °C and 80 °C (Figure 4). This may indicate that such quadruplex topologies form in the virus genome at room temperature, or even at the acclimation and ambient temperature of shrimp.
Meanwhile, imino proton spectra of WSSV172 indicate the formation of multiple quadruplex species (Figure 3). CD spectra of WSSV172 show the typical characteristics of (3+1) hybrid topologies, with a positive shoulder at ~295 nm. However, further analysis at elevated temperatures showed that the positive peak at ~260 nm does not follow the reduction of the ellipticity at 295 nm (Figure 4). This result may suggest the presence of multiple species in WSSV172 with different topologies and melting temperatures, further reinforcing the interpretation of the NMR spectra.
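The CD signatures used in this section can be summarised as a simple rule of thumb: a positive band near 265 nm with a negative band near 240 nm suggests a parallel fold, an additional positive band near 295 nm suggests a hybrid or mixed population, and a positive band near 295 nm with a negative band near 265 nm is commonly associated with antiparallel folds. The sketch below encodes these rules. The function names, the `shoulder_fraction` threshold, and the example ellipticities are all hypothetical; the numbers are not measured data from this work, and such a heuristic cannot replace a full structural analysis.

```python
def nearest(spectrum, wavelength_nm):
    """Return the ellipticity recorded closest to the requested wavelength."""
    key = min(spectrum, key=lambda w: abs(w - wavelength_nm))
    return spectrum[key]


def cd_signature(spectrum, shoulder_fraction=0.2):
    """Classify a CD spectrum given as {wavelength in nm: ellipticity}."""
    e240 = nearest(spectrum, 240)
    e265 = nearest(spectrum, 265)
    e295 = nearest(spectrum, 295)
    if e295 > 0 and e265 < 0:
        return "antiparallel-like"
    # A 295 nm band above the (arbitrary) threshold counts as a shoulder.
    if e265 > 0 and e295 > shoulder_fraction * e265:
        return "hybrid-like"
    if e265 > 0 and e240 < 0:
        return "parallel-like"
    return "unassigned"


# Illustrative values only.
print(cd_signature({240: -2.0, 265: 10.0, 295: 0.5}))
print(cd_signature({240: -1.0, 265: 6.0, 295: 4.0}))
```

A label such as "hybrid-like" cannot distinguish a single (3+1) hybrid fold from a mixture of species, which is why the temperature series and the NMR spectra are needed for WSSV172.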

Conclusion
Exploration of the WSSV genome provides two promising putative G-quadruplex forming sequences. Both sequences are found to be conserved in nearly all deposited sequences of WSSV. While WSSV172 forms a mixture of topologies, WSSV131 is suggested to form a parallel topology. Further study with a higher-resolution spectroscopy technique can be used to elucidate the arrangement of the guanines in the major species of WSSV131. Additionally, both sequences are found to be stable above the ambient temperature of the shrimp environment. Therefore, these putative G4 forming sequences can potentially be targeted with G4 binding ligands, which may prevent the spread of WSSV.
This research was an initiative as one of the outcomes from the World Class Professor Funding

Fig. 1. Schematic representation of a parallel G-quadruplex composed of all-anti guanines (grey boxes) connected by three propeller loops.

Fig. 3. (A) Graphical representation of the WSSV strain CN04 double-stranded circular genome and the location of WSSV131 and WSSV172. (B) Schematic representation of the conservation of the WSSV131 and WSSV172 sequences in an alignment of 31 deposited WSSV complete genomes. A smaller font height indicates a less conserved position.

Table 1. List of putative G-quadruplex forming sequences in the WSSV genome.