CRISPR/Cas9 high-throughput screening in cancer research

In recent years, CRISPR/Cas9 technology has developed rapidly. Because it is accurate, fast, and simple to use, and can achieve gene activation, interference, knockout, and knock-in, it has become a powerful genetic screening tool that is widely used in various models, including cell lines, mice, and zebrafish. Using the CRISPR system to construct genomic libraries for high-throughput screening is now a main strategy in disease research, especially in the search for tumor target genes. This article reviews the basic principles and latest developments of CRISPR/Cas9 library screening technology, strategies to reduce its off-target effects, the basic workflow of library screening, and its application in tumor research.


Introduction
The development of DNA technology has led humanity into a new era [1]. Biologists now have the ability to edit the genetic information in our bodies [2]. The CRISPR/Cas9 system is composed of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes, which are found in the genomes of most bacteria (60%) and archaea (90%) and help them resist invasion by phages [3]. In bacteria, this system is regarded as an adaptive immune system [4]. CRISPR/Cas9 technology was first used in mammalian cells in 2013 and was named one of the "top 10 technological breakthroughs of 2015" by Science [5,6].
In the CRISPR/Cas9 system, the targeting sequence of a single-guide RNA (sgRNA) guides the Cas9 nuclease to bind a specific gene sequence and cut it at the binding site [7]. Like zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), the CRISPR/Cas9 system causes double-strand breaks at the targeted site, which can be repaired through homologous recombination or non-homologous end joining [8]. Compared with these earlier technologies, CRISPR/Cas9 is simpler to operate and faster to implement, which has allowed it to be widely used for site-specific genome editing in various organisms, such as rats, mice, fruit flies, zebrafish, yeast, bacteria, and plants [9]. However, as early as 2013, Fu et al. [10] found that sgRNAs designed to target different sites could lead to different degrees of off-target effects, and that the average off-target frequency at potential off-target sites was around 40% [11]. The safety and targeting specificity of the CRISPR/Cas9 system have therefore attracted increasing attention from scientists [12].

MECHANISM OF CRISPR/CAS9 SYSTEM
The CRISPR/Cas9 system is an effective defense system against foreign phages that exists in bacteria and archaea. It is mainly composed of a leader sequence, highly conserved repeats, spacers, and CRISPR-associated protein genes [13]. According to their gene sequences and core elements, CRISPR systems can be divided into three types: type I, type II, and type III. Type I can be further divided into six subtypes according to species; type II can be divided into two subtypes according to whether it contains the csn2 or cas4 gene; and type III can be divided into two subtypes according to whether it contains the csm2 or cmr5 gene [14]. The marker gene for type I is Cas3, for type II it is Cas9, and for type III it is Cas10 [15]. Types I and III are similar: both require Cas endonucleases to cleave the long CRISPR RNA precursor transcript (pre-crRNA), which is then processed into short mature crRNAs (CRISPR-derived RNAs), each consisting of a conserved repeat sequence and a spacer sequence [16]. The crRNA and Cas proteins then assemble into multi-protein complexes that recognize and cut foreign nucleic acid sequences complementary to the crRNA [17]. The type II CRISPR/Cas9 system has a relatively simple structure, in which pre-crRNA is processed by the Cas9 protein alone. It is considered the smallest CRISPR/Cas system, comprising a CRISPR repeat-spacer array and up to four (but usually three) cas genes (cas9, cas1, cas2, and either cas4 or csn2) [18]. In all three types of CRISPR/Cas system, the functions of the different Cas proteins determine the mechanism by which exogenous DNA is degraded [19]. Because of its simple structure, the CRISPR/Cas9 system is the one mainly used at present to target specific genes in cells for deletion, insertion, activation, or inhibition [20]. When pre-crRNA is transcribed, tracrRNA (trans-activating crRNA) is also transcribed [22].
As shown in Figure 1, the Cas9 protein contains two nuclease active sites: the RuvC domain at the amino terminus and the HNH domain, which is responsible for cutting the complementary strand [23]. Together, tracrRNA and the Cas9 protein produce mature crRNA. A complex formed by Cas9, crRNA, and tracrRNA can recognize and bind exogenous DNA that is complementary to the crRNA [24]. The bound double-stranded DNA is then unwound to form an R-loop structure, in which the crRNA hybridizes with the complementary strand while the other strand remains single-stranded [25]. Finally, the two active sites each cut one strand, producing a double-strand DNA break. TracrRNA and crRNA can also be fused into a single guide RNA (sgRNA) [26].
The Cas9 protein, which carries a nuclear localization signal (NLS), binds the sgRNA to assemble an sgRNA-Cas9 complex that recognizes the 17-20 bp sequence upstream of the PAM (protospacer adjacent motif) and causes a DNA double-strand break [6]. The PAM consists of the sequence NGG (where N can be any of the bases A, G, C, or T) [27]. When DNA is broken, cells repair the damage through homology-directed repair (HDR) or non-homologous end joining (NHEJ) [28]. HDR uses the homologous chromosome as a template to repair the gene and normally does not cause mutations [29]. NHEJ, however, is error-prone and causes unexpected base insertions or deletions at the break site, resulting in frameshift mutations and loss of function of the target gene [9]. CRISPR/Cas9 technology currently relies mainly on this feature to knock out specific DNA fragments [30]. Because NGG sequences are abundant in the human genome, the system can target almost any human gene [31].
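The targeting rule described above (a protospacer immediately upstream of an NGG PAM) can be sketched in a few lines of Python. This is only a minimal illustration, not a design tool: the function names, the fixed 20-nt protospacer length, and the example sequence are assumptions introduced here, and the sketch ignores off-target scoring entirely.

```python
import re


def find_cas9_sites(seq, protospacer_len=20):
    """List candidate SpCas9 sites on the given strand.

    Returns (start, protospacer, pam) tuples, where the protospacer is
    immediately followed by an NGG PAM. protospacer_len=20 is an
    illustrative default, not a value prescribed by the article.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def reverse_complement(seq):
    """Reverse complement, so the opposite strand can be scanned as well."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical example sequence: a 20-nt protospacer followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
sites = find_cas9_sites(example)
```

To cover both strands, the same function can be applied to `reverse_complement(example)`; coordinates from that call then refer to the reverse strand.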

CONSTRUCTION OF A CRISPR HIGH-THROUGHPUT SCREENING PLATFORM
Generally, high-throughput screening platforms based on three different CRISPR systems follow the same steps: construction of the screening library, infection of target cells, collection of cells with the screened phenotype, analysis by next-generation sequencing, and identification of the candidate genes corresponding to enriched sgRNAs [32]. For example, a CRISPR/Cas9 high-throughput system allows researchers to screen for liver tumorigenesis in mice. After genes without mouse orthologs and known oncogenes are removed, a set of candidate genes potentially involved in carcinogenesis is chosen. sgRNAs targeting those genes are designed and cloned into an AAV expression vector, and the resulting AAVs are injected into mice. After a defined period, the mice are examined by MRI, histology, and capture sequencing to analyze potential mutations and the related genes [33].
First, adeno-associated virus (AAV) vectors are constructed and different sgRNA sequences are inserted into them to establish the screening library [34]. As shown in Figure 2, the sgRNA library is packaged into AAV and delivered to a Cas9-expressing cell line at a low multiplicity of infection, so that each cell takes up only one virus and the genes corresponding to different sgRNAs can be screened functionally [35]. Depending on whether Cas9 and the sgRNA are carried on the same AAV vector, the vectors fall roughly into two types. One is the single-plasmid system, in which the Cas9 and sgRNA sequences are built into the same AAV so that both enter the target cell at the same time [36]. The other is the dual-plasmid system, in which Cas9 and the sgRNA are carried on two different AAVs to ensure that enough Cas9 protein is available for double-strand cutting [37]. Some researchers delivered Cas9-Blast and sgRNA using a single virus vector with different antibiotic selection markers, which increased the functional virus titer of lentiGuide-Puro by about 100 times compared with the original lentiCRISPRv1 [38]. However, other researchers consider it beneficial for library selection to generate Cas9-expressing cell lines before constructing the sgRNA library [39]. Although the dual-vector system requires a longer preparation time, they found that Cas9 nuclease expression in the single-vector system may affect the mutation efficiency mediated by specific sgRNA sequences on the chromosome [40].
E3S Web of Conferences 185, 03032 (2020) ICEEB 2020 http://doi.org/10.1051/e3sconf/202018503032
AAV packaging is carried out on the constructed library, which is then used to infect the target cells, thus achieving large-scale gene screening [41]. The main goal is to control the multiplicity of infection (MOI) so that only one virus enters each cell, which makes functional screening of the genes corresponding to different sgRNAs possible [42]. Depending on the research purpose, high-throughput CRISPR screening has two modes: positive selection and negative selection [43]. A positive selection screen is one in which only a few cells survive under a given selection pressure, such as a screen for resistance to anti-tumor drugs [44]. In drug resistance studies, genes in the cells are perturbed so that some cells survive the drug; the sgRNAs enriched in the surviving cells are then analyzed to identify genes related to drug resistance [45]. A negative selection screen identifies genes whose loss impairs cell function, for example genes required for tumor cell growth: perturbing such genes in tumor cells causes growth arrest. Unlike a positive selection screen, a negative selection screen identifies depleted sgRNAs by comparing sgRNA abundance at different time points (for example, the start and the end of the screen) and then analyzes the genes corresponding to those depleted sgRNAs [9]. These genes are candidates that may influence tumor cell growth. In addition to CRISPR screening in cultured cells, establishing CRISPR screening platforms in animals will be more effective for finding genes related to tumor cell growth and invasion [7].
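Why a low MOI helps ensure one virus per cell can be illustrated with a short calculation. The sketch below assumes that the number of viral particles entering a cell follows a Poisson distribution with mean equal to the MOI; this is a common modeling assumption, not something stated in the works cited here, and the function names and MOI values are hypothetical.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k viral particles (Poisson model)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_summary(moi):
    """Fractions of uninfected, singly infected, and multiply infected cells."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    infected = 1.0 - p0
    return {
        "uninfected": p0,
        "single": p1,
        "multiple": infected - p1,
        # Share of infected cells carrying exactly one virus (one sgRNA).
        "single_among_infected": p1 / infected,
    }


low = infection_summary(0.3)   # assumed low MOI
high = infection_summary(1.0)  # assumed higher MOI, for comparison
```

Under this model, lowering the MOI raises the share of infected cells that carry a single sgRNA, at the cost of leaving more cells uninfected, which is the trade-off behind the MOI control described above.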
Additionally, using bioinformatics to find important candidate genes and to exclude false-positive and false-negative results is crucial for the downstream analysis of CRISPR high-throughput screens. Algorithms such as RIGER [36] and DESeq2 [46] can analyze and process CRISPR high-throughput screening results, but they were not designed specifically for CRISPR screens. More recently, Li et al. [47] developed the MAGeCK algorithm for CRISPR high-throughput screening. Compared with existing algorithms developed for RNAi screening (RSA [48] or RIGER [49]), MAGeCK controls the false discovery rate and has higher sensitivity. Reanalysis of CRISPR/Cas9 screening results with this algorithm revealed the biological significance of some genes and pathways that had not been discovered in the original analyses, such as the important role of EGFR in vemurafenib resistance in BRAF-mutated A375 cells [50].

CRISPR/CAS9 LIBRARY SCREENING PRINCIPLE
There are two types of high-throughput screening libraries: arrayed libraries and pooled (hybrid) libraries. In an arrayed library, single or several sgRNAs are arranged on a chip or in a multi-well plate, and the gene-editing sequence in each well is known. In a pooled library, the sgRNA library to be screened is designed by computer, positive clones are enriched, and the library is transferred into host cells to introduce various gene mutations; the results are finally obtained by high-throughput sequencing and other methods [7]. Pooled libraries are low in cost, simple to operate, usable for in vivo research, and able to cover the whole genome more comprehensively, and the infected cells can be analyzed after long-term culture [46]. CRISPR libraries therefore mostly use the pooled format for high-throughput screening. The workflow mainly includes design and synthesis of the sgRNA library, construction of a Cas9 cell line, phenotype selection, high-throughput sequencing, bioinformatics analysis, and verification of candidate genes and off-target effects [51].

THE DESIGN AND SYNTHESIS OF SGRNA LIBRARY FOR HIGH-THROUGHPUT SCREENING
The size of the library is a key factor affecting the screening results. A whole-genome library usually contains more than 100,000 sgRNAs. On the one hand, it can identify more genes of interest; on the other hand, it costs a lot of time and money. A focused library, for example one targeting kinases or membrane proteins, makes it easier to achieve high coverage of each sgRNA and improves data quality, so it can be a better choice for high-throughput screening [52]. At present, various online sgRNA design tools are available, such as CRISPR-FOCUS, CHOPCHOP, and CRISPR Library Designer (CLD); researchers can select an appropriate tool to design the sgRNA library according to their needs, but must ensure that the library has a low off-target rate.
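The cost argument for focused libraries can be made concrete with a rough cell-number estimate. The sketch below is illustrative only: the 500-fold coverage, the MOI of 0.3, the 5,000-sgRNA focused library, and the Poisson infection model are all assumptions introduced here rather than values from the cited studies.

```python
import math


def cells_required(n_sgrnas, coverage, moi):
    """Rough cell numbers for a pooled screen.

    n_sgrnas : number of sgRNAs in the library
    coverage : desired transduced cells per sgRNA (assumed value)
    moi      : multiplicity of infection (Poisson infection model assumed)

    Returns (transduced cells needed, cells that must be exposed to virus).
    """
    transduced = n_sgrnas * coverage
    infected_fraction = 1.0 - math.exp(-moi)
    to_infect = math.ceil(transduced / infected_fraction)
    return transduced, to_infect


# Whole-genome library of about 100,000 sgRNAs versus a hypothetical focused one.
genome_wide = cells_required(100_000, 500, 0.3)
focused = cells_required(5_000, 500, 0.3)
```

Because the required cell number scales linearly with the number of sgRNAs, the focused library in this example needs twenty times fewer cells for the same per-sgRNA coverage.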
Bioinformatic analysis that identifies candidate genes and excludes false-positive or false-negative results is also crucial for the later analysis of CRISPR high-throughput screening results. Genomic DNA should be extracted from the selected cell subpopulation, and the amount of DNA template should be sufficient to preserve the representation of the library. After genomic DNA is extracted, the sgRNA target regions are amplified by PCR and sequenced to quantify their relative abundance [53]. By comparing the relative abundance of each sgRNA before and after screening, we can determine whether the sgRNA is enriched or depleted and thus infer the relationship between genotype and phenotype. Bioinformatics tools can determine whether a gene is significantly enriched against the same background by evaluating the enrichment of multiple sgRNAs targeting that gene [54]. After several candidate genes have been selected with CRISPR/Cas9 technology, a series of methods is needed to verify them and identify the functional genes that regulate the specific phenotype [55]. First, the off-target rate must be analyzed, because an off-target site located in an exon may lead to false-positive results. The candidate genes then need further verification: CRISPR/Cas9 is used to knock out a single gene, and a candidate-gene knockout cell line is established by screening monoclonal cells. After genotyping and immunostaining confirm that the gene has been knocked out, the impact of the knockout on the phenotype under study, such as virus replication, is measured. A genetic complementation test can then determine whether the effect is caused by the gene knockout [56].
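The comparison of sgRNA abundance before and after screening can be sketched as follows. This is a deliberately simplified illustration, not the MAGeCK algorithm or any published method: it normalizes read counts to counts per million, computes a log2 fold change per sgRNA, and summarizes each gene by the median of its sgRNAs. The function names, pseudocount, gene names, and read counts are all hypothetical, and no statistical significance is assessed.

```python
import math
from collections import defaultdict
from statistics import median


def normalize(counts):
    """Scale raw sgRNA read counts to counts per million."""
    total = sum(counts.values())
    return {sg: c / total * 1e6 for sg, c in counts.items()}


def log2_fold_changes(before, after, pseudocount=1.0):
    """Per-sgRNA log2 fold change; positive = enriched, negative = depleted."""
    nb, na = normalize(before), normalize(after)
    return {
        sg: math.log2((na.get(sg, 0.0) + pseudocount) / (nb[sg] + pseudocount))
        for sg in nb
    }


def gene_scores(lfc, sgrna_to_gene):
    """Summarize each gene by the median log2 fold change of its sgRNAs."""
    per_gene = defaultdict(list)
    for sg, value in lfc.items():
        per_gene[sgrna_to_gene[sg]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical toy data: two sgRNAs for each of three made-up genes.
sgrna_to_gene = {"A_1": "GENE_A", "A_2": "GENE_A", "B_1": "GENE_B",
                 "B_2": "GENE_B", "C_1": "GENE_C", "C_2": "GENE_C"}
before = {sg: 100 for sg in sgrna_to_gene}
after = {"A_1": 400, "A_2": 300, "B_1": 100, "B_2": 100, "C_1": 10, "C_2": 20}

scores = gene_scores(log2_fold_changes(before, after), sgrna_to_gene)
```

In a positive selection screen, genes with high scores (GENE_A in this toy data) would be the candidates; in a negative selection screen, the strongly depleted genes (GENE_C here) would be of interest. Requiring agreement among several sgRNAs per gene is what helps separate true hits from single-sgRNA artifacts.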

APPLICATION OF CRISPR/CAS9 HIGH THROUGHPUT LIBRARY SCREENING TECHNOLOGY IN TUMOR-RELATED RESEARCH
CRISPR/Cas9 technology provides scientists with a powerful tool for genome engineering. A high-throughput library screening platform can be used to study genes related to tumor development, to screen and develop cancer drugs, and to support tumor immunotherapy. As shown in Figure 3, by constructing sgRNA libraries, CRISPR/Cas9 high-throughput screening can identify genes related to tumorigenesis and to the growth and development of tumor cells, which enables mechanistic research and treatment directed at these target genes. Korkmaz et al. constructed screening libraries in human BJ cells and breast cancer cells using CRISPR/Cas9 knockout technology and found that an enhancer bound by p53 at its effector gene CDKN1A is required for oncogene-induced senescence in immortalized human cells, and that an enhancer bound by ERα at its effector gene CCND1 is essential for the growth and development of breast cancer cells [55]. To find specific targets in TP53 wild-type Ewing sarcoma, some researchers performed a CRISPR/Cas9 GeCKO library screen in Ewing sarcoma cell lines and identified MDM2, MDM4, USP7, and PPM1D as regulatory factors. Genetic and pharmacological methods were used to verify that Ewing sarcoma depends on these targets through a p53-mediated mechanism [57]. Chen et al. revealed the dependence of MYCN-amplified neuroblastoma on EZH2 by whole-genome library screening [58].
Zhu et al. screened long noncoding RNAs (lncRNAs) in human hepatoma cells by constructing a paired-guide RNA (pgRNA) CRISPR library. They established a library targeting 671 lncRNAs and containing 12,472 gRNA pairs. Through screening, they identified 51 lncRNAs that can regulate cancer cell growth positively or negatively and verified the biological functions of 9 of them. By establishing an AT-rich interaction domain 2 (ARID2) knockout model in the SK-Hep1 hepatoma cell line, a laboratory at Chongqing Medical University found that ARID2 inhibits the expression of cyclin D1 and cyclin E1 by targeting the pRb-E2F signaling pathway and thereby inhibits the proliferation of hepatoma cells, which confirmed that ARID2 participates in the occurrence and progression of hepatoma as a tumor suppressor gene. Other researchers found through screening that the NF1, TSC2, NF2, Bim, and plnb1 genes are closely related to the growth of liver cancer [59]. CRISPR/Cas9 library technology also provides a powerful tool for cancer pharmacology. The CRISPR system can be used to establish accurate tumor cell models, reveal cellular mechanisms, and identify new drug targets, with the aim of developing more efficient and safer tumor drugs (Figure 4). For example, Qi Lei's research team screened thousands of gene mutation combinations with CRISPR/Cas9 knockout technology, looking for mutations that can selectively kill cancer cells and thus expose the weaknesses of cancer. The researchers applied this new system to 73 genes in three laboratory cell lines (human cervical cancer cells, lung cancer cells, and embryonic kidney cells), for a total of 150,000 gene combinations [60]. The results showed that this method can reveal more than 120 new synthetic lethal interactions, providing a new direction for future research on cancer drugs.

THE APPLICATION OF CRISPR/CAS9 LIBRARY TECHNOLOGY IN THE RESEARCH AND DEVELOPMENT OF TUMOR DRUGS
Ryan et al. developed an adeno-associated virus (AAV)-mediated direct in vivo CRISPR screen in glioblastoma (GBM). They first constructed a mouse tumor suppressor gene (mTSG) library with 288 sgRNAs. An AAV-CRISPR vector encoding Cre recombinase was then constructed using the mTSG library and the GFAP promoter. The packaged sgRNA library was injected into Rosa26-LSL-Cas9-GFP mice, and the expression of Cas9 and GFP in astrocytes was observed. The results showed that mutations of Zc3h13 or PTEN changed the gene expression profile of RB1 mutants and made them more resistant to temozolomide, which offers good prospects for the in vivo study of glioma suppressor genes [39].
Matthew et al. constructed a library of 9,100 sgRNAs targeting 1,350 genes for screening in RB1-/- small cell lung cancer (SCLC) cell lines and found that RB1-/- SCLC cells were hyperdependent on a variety of proteins related to chromosome segregation, such as Aurora B kinase. On this basis, they found that in SCLC and other RB1-/- cancers, pRB loss predicts increased sensitivity to Aurora B kinase inhibitors, which provides a reference for reducing drug resistance in a variety of cancers. A Dutch team used the CRISPR/Cas9 GeCKO library to show that cancer drug addiction in BRAFV600E melanoma cell lines is relayed by an ERK2-dependent phenotype switch [61].
In cancer chemotherapy, Bax may be the main mediator of the effect of some cytotoxic drugs. Researchers established a library of 87,897 sgRNAs, and the results showed that Vdac2 interacts with Bax to promote Bax-mediated apoptosis. Mutation or silencing of Vdac2 may be a potential driver of drug resistance in chronic lymphoblastic leukemia. In addition, to find new targets for the treatment of acute myeloid leukemia (AML), researchers carried out whole-genome CRISPR/Cas9 library screens in vitro and in vivo, identified DcpS as a target gene for AML treatment, and elucidated the associated pre-mRNA metabolic pathway. RG3039 is a DcpS inhibitor originally developed for the treatment of spinal muscular atrophy. It shows anti-leukemic activity by inducing pre-mRNA mis-splicing, so its application could next be extended to AML treatment [6].

APPLICATION OF CRISPR/CAS9 LIBRARY TECHNOLOGY IN TUMOR IMMUNOTHERAPY
In recent years, tumor immunotherapy has attracted much attention because of its significant efficacy and innovation, and CRISPR/Cas9 technology also shows broad application prospects in tumor immunotherapy research. Changes in the expression of programmed cell death protein 1 (PD-1) affect the activation of T cells and help tumor cells escape from the immune system, which leads to unsatisfactory therapeutic effects of T cells [62]. To address the problem that PD-1 checkpoint blockade immunotherapy is effective in only a minority of tumor patients, Manguso et al. used CRISPR/Cas9 gene editing to screen 2,368 genes expressed by melanoma cells in mice treated with immunotherapy, in order to identify genes that cooperate with checkpoint blockade or cause resistance to it. They found that defects in the IFN-γ signaling pathway led to resistance to immunotherapy. Marian et al. used whole-genome CRISPR/Cas9 screening to show that the CMTM6 protein is a key regulator of PD-L1 in a wide range of cancer cells, which provides new insights into the biology of PD-L1 regulation, identifies a previously unrecognized main regulator of this immune checkpoint, and highlights a new potential therapeutic target for overcoming immune evasion by tumor cells [63]. Because of the genetic heterogeneity of liver tumors, identifying tumor suppressor genes has been difficult. MIT researchers infected p53-deficient mouse embryonic liver progenitor cells expressing Myc with the mGeCKOa library containing 67,000 sgRNAs, transplanted the transduced cells subcutaneously into nude mice, identified Plxnb1, Flrt2, B9d1, and other liver tumor suppressors, and verified the involvement of RAS signaling through MAPK. As shown in Figure 5, the mGeCKOa screen has been widely used to identify enriched sgRNAs for further cancer research [64].
Somatic gene mutations can change the vulnerability of cancer cells to T-cell-based immunotherapy [65]. US researchers used a CRISPR/Cas9 library containing 123,000 sgRNAs to screen for genes in tumor cells whose loss impairs the effector function of CD8+ T cells (EFT) [66]. They linked EFT genes to cytolytic activity in tumors from 11,000 patients in The Cancer Genome Atlas and then identified multiple loss-of-function mutations in APLNR in patients who did not respond well to immunotherapy [67]. The results showed that APLNR interacts with JAK1 to regulate the IFNγ response in tumors, and that its loss of function reduced the efficacy of adoptive cell transfer and checkpoint blockade immunotherapy in mouse models [68]. Overall, they linked the loss of essential EFT genes to cancer resistance or non-responsiveness to immunotherapy. In addition, a research team used genome-wide CRISPR/Cas9 screening to identify mechanisms by which tumor cells resist killing by cytotoxic T cells. They showed that the PBAF complex reduces the chromatin accessibility of interferon-gamma-induced genes in tumor cells, thereby increasing resistance to T cell-mediated cytotoxicity, which provides a mechanistic understanding for clinical immunotherapy [69].
CRISPR technology plays an important role in the systematic and comprehensive search for potential tumor treatments. In the screening of genes related to tumor cell growth, CRISPR/Cas9 technology was first used to design small sgRNA libraries targeting selected gene exons in KBM7 chronic myeloid leukemia cells [70]. Since the survival of KBM7 cells depends on the fusion protein produced by the BCR-ABL1 translocation, only sgRNAs targeting exons of BCR and ABL1 were found to inhibit the growth of these cells. Building on this, an sgRNA library containing 73,151 sgRNAs was used to screen for genes that play an important role in the growth of HL60 and KBM7 cells. Analysis of sgRNA abundance showed that most genes encoding ribosomal proteins had significant effects on the survival of both cell types. Additionally, RPS4Y2, RPS4Y1, and some genes encoding ribosomal-like proteins are tissue-specific and are generally expressed at low levels in KBM7 cells [71]. Further clustering analysis showed that the screened genes could be assigned to DNA replication, transcription, protein degradation, and other important biological processes. Shalem et al. constructed an sgRNA screening library of 64,751 sgRNAs targeting the 5' terminal exons of 18,080 genes in the human genome. Similarly, their analysis of genes associated with the growth of A375 melanoma cells found that genes related to ribosomal composition play an important role in tumor cell growth [72].

SUMMARY AND PROSPECT
As a powerful gene-editing tool with the advantages of simple construction, low cost, and high efficiency, CRISPR/Cas9 technology is widely used in cellular gene editing and gene expression regulation, the construction of gene knockout animal models, therapeutic research in animal models of human disease, and other fields. With the establishment of high-throughput library screening, CRISPR/Cas9 technology provides a better approach to research on the mechanisms of tumor development, the development of anticancer drugs, and tumor treatment. Despite its many advantages, its application is still limited by a high off-target rate, the PAM sequence requirement, and low transduction efficiency. Therefore, to apply CRISPR/Cas9 technology, we should establish corresponding humanized mouse models to simulate the occurrence and development of human diseases, and on this basis explore and develop effective gene therapy methods, so as to provide important treatment strategies and ideas for subsequent drug development and the treatment of related diseases. Before gene-editing tools are applied to the human body, a lower off-target rate and more advanced detection technologies are needed, and the potential risks must be assessed rigorously rather than in pursuit of quick success and instant benefit.
The high-throughput library screening strategy based on CRISPR/Cas9 technology involves many steps, including sgRNA design and synthesis, Cas9 cell line construction, phenotype selection, high-throughput sequencing, bioinformatics analysis, and verification of candidate genes and off-target effects. The library construction process and screening performance can be improved in at least three respects: the design of sgRNAs, the optimization of the number of sgRNAs, and the selection of more suitable monoclonal Cas9-expressing cells. The application of CRISPR/Cas9 high-throughput screening libraries in mammalian cells will undoubtedly provide great help for research on the mechanisms of tumors and other related diseases. We can foresee that, as related basic research deepens, the CRISPR/Cas9 system will be more widely developed and applied in the personalized treatment of tumors.