Leading Edge
Review
Development and Applications of
CRISPR-Cas9 for Genome Engineering
Patrick D. Hsu,1,2,3 Eric S. Lander,1 and Feng Zhang1,2,*
1Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02141, USA
2McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
3Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
*Correspondence:
/10.ll.2014.05.010
Recent advances in genome engineering technologies based on the CRISPR-associated RNA-guided endonuclease Cas9 are enabling the systematic interrogation of mammalian genome function. Analogous to the search function in modern word processors, Cas9 can be guided to specific locations within complex genomes by a short RNA search string. Using this system, DNA sequences within the endogenous genome and their functional outputs are now easily edited or modulated in virtually any organism of choice. Cas9-mediated genetic perturbation is simple and scalable, empowering researchers to elucidate the functional organization of the genome at the systems level and establish causal linkages between genetic variations and biological phenotypes. In this Review, we describe the development and applications of Cas9 for a variety of research or translational applications while highlighting challenges as well as future directions. Derived from a remarkable microbial defense system, Cas9 is driving innovative applications from basic biology to biotechnology and medicine.
Introduction
The development of recombinant DNA technology in the 1970s marked the beginning of a new era for biology. For the first time, molecular biologists gained the ability to manipulate DNA molecules, making it possible to study genes and harness them to develop novel medicine and biotechnology. Recent advances in genome engineering technologies are sparking a new revolution in biological research. Rather than studying DNA taken out of the context of the genome, researchers can now directly edit or modulate the function of DNA sequences in their endogenous context in virtually any organism of choice, enabling them to elucidate the functional organization of the genome at the systems level, as well as identify causal genetic variations.
Broadly speaking, genome engineering refers to the process of making targeted modifications to the genome, its contexts (e.g., epigenetic marks), or its outputs (e.g., transcripts). The ability to do so easily and efficiently in eukaryotic and especially mammalian cells holds immense promise to transform basic science, biotechnology, and medicine (Figure 1).
For life sciences research, technologies that can delete, insert, and modify the DNA sequences of cells or organisms enable dissecting the function of specific genes and regulatory elements. Multiplexed editing could further allow the interrogation of gene or protein networks at a larger scale. Similarly, manipulating transcriptional regulation or chromatin states at particular loci can reveal how genetic material is organized and utilized within a cell, illuminating relationships between the architecture of the genome and its functions. In biotechnology, precise manipulation of genetic building blocks and regulatory machinery also facilitates the reverse engineering or reconstruction of useful biological systems, for example, by enhancing biofuel production pathways in industrially relevant organisms or by creating infection-resistant crops. Additionally, genome engineering is stimulating a new generation of drug development processes and medical therapeutics. Perturbation of multiple genes simultaneously could model the additive effects that underlie complex polygenic disorders, leading to new drug targets, while genome editing could directly correct harmful mutations in the context of human gene therapy (Tebas et al., 2014).
Eukaryotic genomes contain billions of DNA bases and are difficult to manipulate. One of the breakthroughs in genome manipulation has been the development of gene targeting by homologous recombination (HR), which integrates exogenous repair templates that contain sequence homology to the donor site (Figure 2A) (Capecchi, 1989). HR-mediated targeting has facilitated the generation of knockin and knockout animal models via manipulation of germline competent stem cells, dramatically advancing many areas of biological research. However, although HR-mediated gene targeting produces highly precise alterations, the desired recombination events occur extremely infrequently (1 in 10^6–10^9 cells) (Capecchi, 1989), presenting enormous challenges for large-scale applications of gene-targeting experiments.
To overcome these challenges, a series of programmable nuclease-based genome editing technologies have been developed in recent years, enabling targeted and efficient modification of a variety of eukaryotic and particularly mammalian species. Of the current generation of genome editing technologies, the most rapidly developing is the class of RNA-guided endonucleases known as Cas9 from the microbial adaptive immune system CRISPR (clustered regularly interspaced short palindromic repeats), which can be easily targeted to virtually any genomic location of choice by a short RNA guide. Here, we review the development and applications of the CRISPR-associated endonuclease Cas9 as a platform technology for achieving targeted perturbation of endogenous genomic elements and also discuss challenges and future avenues for innovation.
1262 Cell 157, June 5, 2014 ©2014 Elsevier Inc.
Programmable Nucleases as Tools for Efficient and Precise Genome Editing
A series of studies by Haber and Jasin (Rudin et al., 1989; Plessis et al., 1992; Rouet et al., 1994; Choulika et al., 1995; Bibikova et al., 2001; Bibikova et al., 2003) led to the realization that targeted DNA double-strand breaks (DSBs) could greatly stimulate genome editing through HR-mediated recombination events. Subsequently, Carroll and Chandrasegaran demonstrated the potential of designer nucleases based on zinc finger proteins for efficient, locus-specific HR (Bibikova et al., 2001, 2003). Moreover, it was shown that, in the absence of an exogenous homology repair template, localized DSBs can induce insertion or deletion mutations (indels) via the error-prone nonhomologous end-joining (NHEJ) repair pathway (Figure 2A) (Bibikova et al., 2002). These early genome editing studies established DSB-induced HR and NHEJ as powerful pathways for the versatile and precise modification of eukaryotic genomes.
To achieve effective genome editing via introduction of site-specific DNA DSBs, four major classes of customizable DNA-binding proteins have been engineered so far: meganucleases derived from microbial mobile genetic elements (Smith et al., 2006), zinc finger (ZF) nucleases based on eukaryotic transcription factors (Urnov et al., 2005; Miller et al., 2007), transcription activator-like effectors (TALEs) from Xanthomonas bacteria (Christian et al., 2010; Miller et al., 2011; Boch et al., 2009; Moscou and Bogdanove, 2009), and most recently the RNA-guided DNA endonuclease Cas9 from the type II bacterial adaptive immune system CRISPR (Cong et al., 2013; Mali et al., 2013a). Meganuclease, ZF, and TALE proteins all recognize specific DNA sequences through protein-DNA interactions. Although meganucleases integrate their nuclease and DNA-binding domains, ZF and TALE proteins consist of individual modules targeting 3 or 1 nucleotides (nt) of DNA, respectively (Figure 2B). ZFs and TALEs can be assembled in desired combinations and attached to the nuclease domain of FokI to direct nucleolytic activity toward specific genomic loci. Each of these platforms, however, has unique limitations.
Meganucleases have not been widely adopted as a genome engineering platform due to lack of clear correspondence between meganuclease protein residues and their target DNA sequence specificity. ZF domains, on the other hand, exhibit context-dependent binding preference due to crosstalk between adjacent modules when assembled into a larger array (Maeder et al., 2008). Although multiple strategies have been developed to account for these limitations (Gonzalez et al., 2010; Sander et al., 2011), assembly of functional ZFPs with the desired DNA binding specificity remains a major challenge that requires an extensive screening process. Similarly, although TALE DNA-binding monomers are for the most part modular, they can still suffer from context-dependent specificity (Juillerat et al., 2014), and their repetitive sequences render construction of novel TALE arrays labor intensive and costly.
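The modular recognition described above (3 nt per ZF module, 1 nt per TALE monomer) implies a simple relationship between target length and the number of modules that must be assembled. The following is a minimal sketch of that arithmetic only; the function name and the 18-nt example target are hypothetical, and the calculation ignores the context-dependent effects that make real ZF and TALE design harder.

```python
import math

# Nucleotides recognized per module, as stated in the text (Figure 2B).
NT_PER_MODULE = {"ZF": 3, "TALE": 1}


def modules_needed(target_length_nt, platform):
    """Return how many DNA-binding modules are needed to span a target site.

    Hypothetical helper for illustration; it only divides the target length
    by the per-module footprint and rounds up.
    """
    if target_length_nt <= 0:
        raise ValueError("target length must be positive")
    return math.ceil(target_length_nt / NT_PER_MODULE[platform])


# Hypothetical 18-nt target site.
zf_count = modules_needed(18, "ZF")
tale_count = modules_needed(18, "TALE")
```

Because each TALE monomer covers a single nucleotide, a TALE array needs three times as many modules as a ZF array for the same site, which is consistent with the text's remark that repetitive TALE arrays are laborious to construct.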
Figure 1. Applications of Genome Engineering
Genetic and epigenetic control of cells with genome engineering technologies is enabling a broad range of applications from basic biology to biotechnology and medicine. (Clockwise from top) Causal genetic mutations or epigenetic variants associated with altered biological function or disease phenotypes can now be rapidly and efficiently recapitulated in animal or cellular models (Animal models, Genetic variation). Manipulating biological circuits could also facilitate the generation of useful synthetic materials, such as algae-derived, silica-based diatoms for oral drug delivery (Materials). Additionally, precise genetic engineering of important agricultural crops could confer resistance to environmental deprivation or pathogenic infection, improving food security while avoiding the introduction of foreign DNA (Food). Sustainable and cost-effective biofuels are attractive sources for renewable energy, which could be achieved by creating efficient metabolic pathways for ethanol production in algae or corn (Fuel). Direct in vivo correction of genetic or epigenetic defects in somatic tissue would be permanent genetic solutions that address the root cause of genetically encoded disorders (Gene surgery). Finally, engineering cells to optimize high yield generation of drug precursors in bacterial factories could significantly reduce the cost and improve the accessibility of useful therapeutics (Drug development).
Given the challenges associated with engineering of modular DNA-binding proteins, new modes of recognition would significantly simplify the development of custom nucleases. The CRISPR nuclease Cas9 is targeted by a short guide RNA that recognizes the target DNA via Watson-Crick base pairing (Figure 2C). The guide sequence within these CRISPR RNAs typically corresponds to phage sequences, constituting the natural mechanism for CRISPR antiviral defense, but can be easily replaced by a sequence of interest to retarget the Cas9 nuclease. Multiplexed targeting by Cas9 can now be achieved at unprecedented scale by introducing a battery of short guide RNAs rather than a library of large, bulky proteins. The ease of Cas9 targeting, its high efficiency as a site-specific nuclease, and the possibility for highly multiplexed modifications have opened up a broad range of biological applications across basic research to biotechnology and medicine.
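The "search string" view of Cas9 targeting can be illustrated with a small sketch that scans a DNA sequence for exact matches to a guide followed by a protospacer-adjacent motif (PAM). Several things here are assumptions rather than statements from this section: the 5'-NGG PAM (the motif commonly associated with Streptococcus pyogenes Cas9), the 20-nt guide in the example, the requirement of a perfect match, and the function names. The guide is written in the DNA alphabet as the protospacer it matches; real target selection must also consider mismatches and off-target sites.

```python
import re

# Assumption: 5'-NGG PAM immediately downstream (3') of the protospacer.
PAM = "[ACGT]GG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_sites(genome, guide):
    """Return (strand, start) for each exact guide match followed by a PAM.

    Both strands are searched. For '-' strand hits, `start` is an index
    into the reverse-complemented sequence, not the forward sequence.
    """
    genome = genome.upper()
    guide = guide.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(f"(?=({re.escape(guide)}{PAM}))")
    sites = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start()))
    return sites


# Hypothetical 20-nt guide and toy sequences.
guide = "GATTACAGATTACAGATTAC"
with_pam = "TTT" + guide + "TGG" + "AAA"
without_pam = "TTT" + guide + "TAA" + "AAA"
hits_with_pam = find_cas9_sites(with_pam, guide)
hits_without_pam = find_cas9_sites(without_pam, guide)
```

Retargeting in this sketch means changing only the `guide` string, which mirrors the point made above: a new target requires a new short RNA rather than a newly engineered protein.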
The utility of customizable DNA-binding domains extends far beyond genome editing with site-specific endonucleases. Fusing them to modular, sequence-agnostic functional effector domains allows flexible recruitment of desired perturbations, such as transcriptional activation, to a locus of interest (Xu and Bestor, 1997; Beerli et al., 2000a; Konermann et al., 2013; Maeder et al., 2013a; Mendenhall et al., 2013). In fact, any modular enzymatic component can, in principle, be substituted, allowing facile additions to the genome engineering toolbox. Integration of genome- and epigenome-modifying enzymes with inducible protein regulation further allows precise temporal control of dynamic processes (Beerli et al., 2000b; Konermann et al., 2013).
CRISPR-Cas9:From Yogurt to Genome Editing
The recent development of the Cas9 endonuclease for genome editing draws upon more than a decade of basic research into understanding the biological function of the mysterious repetitive elements now known as CRISPR (Figure 3), which are found throughout the bacterial and archaeal diversity. CRISPR loci typically consist of a clustered set of CRISPR-associated (Cas) genes and the signature CRISPR array: a series of repeat sequences (direct repeats) interspaced by variable sequences (spacers) corresponding to sequences within foreign genetic elements (protospacers) (Figure 4). Whereas Cas genes are translated into proteins, most CRISPR arrays are first transcribed as a single RNA before subsequent processing into shorter CRISPR RNAs (crRNAs), which direct the nucleolytic activity of certain Cas enzymes to degrade target nucleic acids.
The CRISPR story began in 1987. While studying the iap enzyme involved in isozyme conversion of alkaline phosphatase in E. coli, Nakata and colleagues reported a curious set of 29 nt repeats downstream of the iap gene (Ishino et al., 1987). Unlike most repetitive elements, which typically take the form of tandem repeats like TALE repeat monomers, these 29 nt repeats were interspaced by five intervening 32 nt nonrepetitive sequences. Over the next 10 years, as more microbial genomes were sequenced, additional repeat elements were reported from genomes of different bacterial and archaeal strains. Mojica and colleagues eventually classified interspaced repeat sequences as a unique family of clustered repeat elements present in >40% of sequenced bacteria and 90% of archaea (Mojica et al., 2000).
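The array layout described above, identical direct repeats interspaced by variable spacers, can be made concrete with a short sketch that recovers the spacers from an array once the repeat sequence is known. This is a toy illustration under stated assumptions: the repeat and spacer sequences are invented, the repeats are assumed to be exact copies, and the function name is hypothetical. Real CRISPR arrays can contain degenerate repeats and are normally identified with dedicated tools.

```python
def extract_spacers(array, repeat):
    """Split a CRISPR array on its direct repeat and return the spacers in order.

    Assumes every repeat copy is identical. The sequence before the first
    repeat and after the last repeat is treated as flanking DNA and dropped.
    """
    parts = array.split(repeat)
    # parts[0] and parts[-1] are the flanks; the pieces between
    # consecutive repeats are the spacers.
    return [piece for piece in parts[1:-1] if piece]


# Invented toy sequences: three repeats enclose two spacers.
toy_repeat = "GTTCCAGTAC"
toy_spacers = ["AAACCCGGGT", "TTTGGGCCCA"]
toy_array = (
    "ACGT"
    + toy_repeat + toy_spacers[0]
    + toy_repeat + toy_spacers[1]
    + toy_repeat
    + "TTAA"
)
recovered = extract_spacers(toy_array, toy_repeat)
```

An array with n repeats holds n - 1 spacers in this layout; under that reading, the five 32 nt intervening sequences reported by Ishino et al. would sit between six of the 29 nt repeats.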
Figure 2. Genome Editing Technologies Exploit Endogenous DNA Repair Machinery
(A) DNA double-strand breaks (DSBs) are typically repaired by nonhomologous end-joining (NHEJ) or homology-directed repair (HDR). In the error-prone NHEJ pathway, Ku heterodimers bind to DSB ends and serve as a molecular scaffold for associated repair proteins. Indels are introduced when the complementary strands undergo end resection and misaligned repair due to microhomology, eventually leading to frameshift mutations and gene knockout. Alternatively, Rad51 proteins may bind DSB ends during the initial phase of HDR, recruiting accessory factors that direct genomic recombination with homology arms on an exogenous repair template. Bypassing the matching sister chromatid facilitates the introduction of precise gene modifications.
(B) Zinc finger (ZF) proteins and transcription activator-like effectors (TALEs) are naturally occurring DNA-binding domains that can be modularly assembled to target specific sequences. ZF and TALE domains each recognize 3 and 1 bp of DNA, respectively. Such DNA-binding proteins can be fused to the FokI endonuclease to generate programmable site-specific nucleases.
(C) The Cas9 nuclease from the microbial CRISPR adaptive immune system is localized to specific DNA sequences via the guide sequence on its guide RNA (red), directly base-pairing with the DNA target. Binding of a protospacer-adjacent motif (PAM, blue) downstream of the target locus helps to direct Cas9-mediated DSBs.
These early findings began to stimulate interest in such microbial repeat elements. By 2002, Jansen and Mojica coined the acronym CRISPR to unify the description of microbial genomic loci consisting of an interspaced repeat array (Jansen et al., 2002; Barrangou and van der Oost, 2013). At the same time, several clusters of signature CRISPR-associated (cas) genes were identified to be well conserved and typically adjacent to the repeat elements (Jansen et al., 2002), serving as a basis for the eventual classification of three different types of CRISPR systems (types I–III) (Haft et al., 2005; Makarova et al., 2011b). Types I and III CRISPR loci contain multiple Cas proteins, now known to form complexes with crRNA (CASCADE complex for type I; Cmr or Csm RAMP complexes for type III) to facilitate the recognition and destruction of target nucleic acids (Brouns et al., 2008; Hale et al., 2009) (Figure 4). In contrast, the type II system has a significantly reduced number of Cas proteins. However, despite increasingly detailed mapping and annotation of CRISPR loci across many microbial species, their biological significance remained elusive.
A key turning point came in 2005, when systematic analysis of the spacer sequences separating the individual direct repeats suggested their extrachromosomal and phage-associated origins (Mojica et al., 2005; Pourcel et al., 2005; Bolotin et al., 2005). This insight was tremendously exciting, especially given previous studies showing that CRISPR loci are transcribed (Tang et al., 2002) and that viruses are unable to infect archaeal cells carrying spacers corresponding to their own genomes (Mojica et al., 2005). Together, these findings led to the speculation that CRISPR arrays serve as an immune memory and defense mechanism, and individual spacers facilitate defense against bacteriophage infection by exploiting Watson-Crick base-pairing between nucleic acids (Mojica et al., 2005; Pourcel et al., 2005). Despite these compelling realizations that CRISPR loci might be involved in microbial immunity, the specific mechanism of how the spacers act to mediate viral defense remained a challenging puzzle. Several hypotheses were raised, including thoughts that CRISPR spacers act as small RNA guides to degrade viral transcripts in an RNAi-like mechanism (Makarova et al., 2006) or that CRISPR spacers direct Cas enzymes to cleave viral DNA at spacer-matching regions (Bolotin et al., 2005).
Figure 3. Key Studies Characterizing and Engineering CRISPR Systems
Cas9 has also been referred to as Cas5, Csx12, and Csn1 in literature prior to 2012. For clarity, we exclusively adopt the Cas9 nomenclature throughout this Review. CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR-associated; crRNA, CRISPR RNA; DSB, double-strand break; tracrRNA, trans-activating CRISPR RNA.
Working with the dairy production bacterial strain Streptococcus thermophilus at the food ingredient company Danisco, Horvath and colleagues uncovered the first experimental evidence for the natural role of a type II CRISPR system as an adaptive immunity system, demonstrating a nucleic-acid-based immune system in which CRISPR spacers dictate target specificity while Cas enzymes control spacer acquisition and phage defense (Barrangou et al., 2007). A rapid series of studies illuminating the mechanisms of CRISPR defense followed shortly and helped to establish the mechanism as well as function of all three types of CRISPR loci in adaptive immunity. By studying the type I CRISPR locus of Escherichia coli, van der Oost and colleagues showed that CRISPR arrays are transcribed and converted into small crRNAs containing individual spacers to guide Cas nuclease activity (Brouns et al., 2008). In the same year, CRISPR-mediated defense by a type III-A CRISPR system from Staphylococcus epidermidis was demonstrated to block plasmid conjugation, establishing the target of Cas enzyme activity as DNA rather than RNA (Marraffini and Sontheimer, 2008), although later investigation of a different type III-B system from Pyrococcus furiosus also revealed crRNA-directed RNA cleavage activity (Hale et al., 2009, 2012).
As the pace of CRISPR research accelerated, researchers quickly unraveled many details of each type of CRISPR system (Figure 4). Building on an earlier speculation that protospacer-adjacent motifs (PAMs) may direct the type II Cas9 nuclease to cleave DNA (Bolotin et al., 2005), Moineau and colleagues highlighted the importance of PAM sequences by demonstrating that PAM mutations in phage genomes circumvented CRISPR interference (Deveau et al., 2008). Additionally, for types I and II, the lack of PAM within the direct repeat sequence within the CRISPR array prevents self-targeting by the CRISPR system. In type III systems, however, mismatches between the 5′ end of the crRNA and the DNA target are required for plasmid interference (Marraffini and Sontheimer, 2010).
By 2010, just 3 years after the first experimental evidence for CRISPR in bacterial immunity, the basic function and mechanisms of CRISPR systems were becoming clear. A variety of groups had begun to harness the natural CRISPR system for various biotechnological applications, including the generation of phage-resistant dairy cultures (Quiberoni et al., 2010) and phylogenetic classification of bacterial strains (Horvath et al., 2008, 2009). However, genome editing applications had not yet been explored.
Around this time, two studies characterizing the functional mechanisms of the native type II CRISPR system elucidated the basic components that proved vital for engineering a simple RNA-programmable DNA endonuclease for genome editing. First, Moineau and colleagues used genetic studies in Streptococcus thermophilus to reveal that Cas9 (formerly called Cas5, Csn1, or Csx12) is the only enzyme within the cas gene cluster that mediates target DNA cleavage (Garneau et al., 2010). Next, Charpentier and colleagues revealed a key component in the biogenesis and processing of crRNA in type II CRISPR systems: a noncoding trans-activating crRNA (tracrRNA) that hybridizes with crRNA to facilitate RNA-guided targeting of Cas9 (Deltcheva et al., 2011). This dual RNA hybrid, together with Cas9 and endogenous RNase III, is required for processing the CRISPR array transcript into mature crRNAs (Deltcheva et al., 2011). These two studies suggested that there are at least three components (Cas9, the mature crRNA, and tracrRNA) that are essential for reconstituting the type II CRISPR nuclease system. Given the increasing importance of programmable site-specific nucleases based on ZFs and TALEs for enhancing eukaryotic genome editing, it was tantalizing to think that perhaps Cas9 could be developed into an RNA-guided genome editing system. From this point, the race to harness Cas9 for genome editing was on.
Figure 4. Natural Mechanisms of Microbial CRISPR Systems in Adaptive Immunity
Following invasion of the cell by foreign genetic elements from bacteriophages or plasmids (step 1: phage infection), certain CRISPR-associated (Cas) enzymes acquire spacers from the exogenous protospacer sequences and install them into the CRISPR locus within the prokaryotic genome (step 2: spacer acquisition). These spacers are segregated between direct repeats that allow the CRISPR system to mediate self and nonself recognition. The CRISPR array is a noncoding RNA transcript that is enzymatically maturated through distinct pathways that are unique to each type of CRISPR system (step 3: crRNA biogenesis and processing).
In types I and III CRISPR, the pre-crRNA transcript is cleaved within the repeats by CRISPR-associated ribonucleases, releasing multiple small crRNAs. Type III crRNA intermediates are further processed at the 3′ end by yet-to-be-identified RNases to produce the fully mature transcript. In type II CRISPR, an associated trans-activating CRISPR RNA (tracrRNA) hybridizes with the direct repeats, forming an RNA duplex that is cleaved and processed by endogenous RNase III and other unknown nucleases. Maturated crRNAs from type I and III CRISPR systems are then loaded onto effector protein complexes for target recognition and degradation. In type II systems, crRNA-tracrRNA hybrids complex with Cas9 to mediate interference.
Both type I and III CRISPR systems use multiprotein interference modules to facilitate target recognition. In type I CRISPR, the Cascade complex is loaded with a crRNA molecule, constituting a catalytically inert surveillance complex that recognizes target DNA. The Cas3 nuclease is then recruited to the Cascade-bound R loop, mediating target degradation. In type III CRISPR, crRNAs associate either with Csm or Cmr complexes that bind and cleave DNA and RNA substrates, respectively. In contrast, the type II system requires only the Cas9 nuclease to degrade DNA matching its dual guide RNA consisting of a crRNA-tracrRNA hybrid.