1. (WO2012040628) RAPID ASSEMBLY OF MULTIPLE ARBITRARY LENGTH DNA FRAGMENTS
Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

RAPID ASSEMBLY OF MULTIPLE ARBITRARY LENGTH DNA FRAGMENTS

FIELD OF THE INVENTION

Aspects herein relate to compositions, and related methods, for isolating and assembling DNA molecules without intermediate cloning steps.

BACKGROUND OF THE INVENTION

A major advance in the field of synthetic biology is the concept of the standardized part, a unit of DNA that is flanked by standard prefix and suffix sequences that enable modular assembly of composite parts from basic parts. Parts may be stored and reused or shared between research labs. The wide variety of different composite parts that may be combinatorially constructed from basic parts and previously assembled composite parts enables a large and growing set of synthetic biology applications.

Although the standardized part is a major advance, the assembly of DNA parts into larger DNA constructs is currently a limiting technology in the engineering of biological systems (DNA assembly for synthetic biology: from parts to pathways and beyond, T. Ellis, et al., Integr. Biol., 2011, 3, 109-118). Laborious recombinant cloning processes that take several days are required to combine just two DNA parts into a new construct. New biological systems composed of many genes may only be realized after weeks or months of cloning.

SUMMARY OF THE INVENTION

Aspects provided herein relate to a DNA assembly method and a related double probing technique for sequence-specific purification. The embodiments described herein address the above limitations by automatically converting a set of standard parts into self-assembling parts, which are an ordered set of molecules with uniquely encoded overhangs that combine in a specific order through annealing to self-assemble a larger DNA molecule. In certain embodiments, this conversion process may use standard restriction enzymes and ligase in combination with reusable paramagnetic beads and biotinylated oligonucleotides (oligos). In certain aspects, self-assembly into new DNA constructs is accomplished by mixing the samples, heating them, and allowing them to cool (annealing). Elimination of cloning, in some aspects, reduces assembly time from days to hours, minimizes the possibility of mutations, and supports intermediate constructs that may be toxic to the cell or otherwise render the cell difficult to grow. In some aspects, the methods described herein may be automated using liquid handling robots or microfluidic devices.

According to one aspect of the invention, methods of assembling a plurality of DNA molecules are provided. The methods include preparing a first double-stranded (ds)DNA molecule containing (i) a first 5' overhang DNA sequence of at least 10 nucleotides on a first strand of the dsDNA molecule and (ii) a second 5' overhang DNA sequence of at least 10 nucleotides on a second strand of the dsDNA molecule, wherein the first overhang and the second overhang do not anneal together, and wherein each overhang anneals to one overhang of a second and a third dsDNA molecule, respectively; preparing a second dsDNA molecule containing (i) a first 5' overhang DNA sequence of at least 10 nucleotides on a first strand of the dsDNA molecule and (ii) a second 5' overhang DNA sequence of at least 10 nucleotides on a second strand of the dsDNA molecule, wherein the first overhang and the second overhang do not anneal together, and wherein each overhang anneals to one overhang of a third and a fourth dsDNA molecule, respectively; and combining the first dsDNA molecule and the second dsDNA molecule under conditions that permit DNA annealing of the 5' overhang sequences, thereby assembling the plurality of dsDNA molecules.

In some embodiments, the first and second dsDNA molecules are combined with at least one additional dsDNA molecule. In some embodiments, the first and second 5' overhangs are at least 20 nucleotides in length, at least 30 nucleotides in length, at least 40 nucleotides in length, or at least 50 nucleotides in length.
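The ordering implied by uniquely encoded overhangs can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the `Part` record, the `assembly_order` function, and the example sequences are hypothetical, and annealing is modeled simply as exact reverse complementarity between a right-hand overhang and the next part's left-hand overhang.

```python
from typing import List, NamedTuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


class Part(NamedTuple):
    name: str
    left: str   # 5' overhang on the first strand, written 5'->3'
    right: str  # 5' overhang on the second strand, written 5'->3'


def assembly_order(parts: List[Part], min_len: int = 10) -> List[str]:
    """Return part names in the order dictated by their overhangs."""
    for p in parts:
        if len(p.left) < min_len or len(p.right) < min_len:
            raise ValueError(f"{p.name}: overhang shorter than {min_len} nt")
        # The two overhangs of one part must not anneal to each other.
        if p.left.upper() == revcomp(p.right):
            raise ValueError(f"{p.name}: its two overhangs anneal together")
    by_left = {p.left.upper(): p for p in parts}
    if len(by_left) != len(parts):
        raise ValueError("left overhangs are not unique")
    # A left overhang is 'taken' if some right overhang anneals to it.
    targets = {revcomp(p.right) for p in parts}
    starts = [p for p in parts if p.left.upper() not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one part with a free left overhang")
    chain = [starts[0]]
    while len(chain) < len(parts):
        nxt = by_left.get(revcomp(chain[-1].right))
        if nxt is None or nxt in chain:
            raise ValueError("parts do not form a single linear chain")
        chain.append(nxt)
    return [p.name for p in chain]


# Hypothetical 10-nt overhangs chosen only for illustration.
a = Part("A", "GATTACAGGT", "CCATGGTTAC")
b = Part("B", revcomp(a.right), "TTGACCGATA")
c = Part("C", revcomp(b.right), "CAGTCAGTAA")
print(assembly_order([c, a, b]))  # ['A', 'B', 'C']
```

Because each junction is encoded by its own overhang pair, the order is recovered regardless of the order in which the parts are mixed.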

In some embodiments, the methods further include preparing a double stranded linear nucleic acid vector containing (i) a first 5' overhang DNA sequence of at least 10 nucleotides on a first strand of the vector and (ii) a second 5' overhang DNA sequence of at least 10 nucleotides on a second strand of the vector, wherein the first overhang and the second overhang do not anneal together, and wherein each overhang anneals to one overhang of two different dsDNA molecules; and combining the assembled dsDNA molecules of any one of claims 1-6 with the vector under conditions that permit DNA annealing, wherein the assembled dsDNA molecules comprise overhangs that anneal to the overhangs of the vector, thereby assembling the vector and the assembled dsDNA molecules.

In some embodiments, the methods further include probing one or more of the overhang sequences of the assembled dsDNA molecule to select correctly assembled dsDNA molecules. In other embodiments, the methods further include probing the overhang sequences of the assembled dsDNA molecule to confirm the structure of the assembled dsDNA molecule.

According to another aspect of the invention, methods of preparing a dsDNA molecule are provided. The methods include providing a circular DNA vector comprising a vector backbone VB and a polynucleotide P-X-S, wherein P is a prefix nucleic acid sequence and S is a suffix nucleic acid sequence, each comprising one or more enzyme recognition sites, and X is a DNA part of interest; linearizing the circular DNA vector with one or more enzymes that cleave one or more enzyme recognition sites in the suffix to produce a linear double-stranded (ds) DNA having a 5' overhang at each end of the dsDNA; ligating to the 5' overhang of the suffix a first oligonucleotide having a length of at least 20 nucleotides; removing the vector backbone from the linear dsDNA with one or more enzymes that cleave one or more enzyme recognition sites in the prefix, producing a 5' overhang; and ligating to the 5' overhang of the prefix a second oligonucleotide having a length of at least 20 nucleotides, thereby preparing a dsDNA molecule.

In some embodiments, the DNA element X is selected from the group consisting of: a terminator sequence, a ribosome binding site sequence, a protein coding sequence, a reporter sequence, a signaling sequence, a primer sequence, and a regulatory sequence.

In some embodiments, the enzymes are selected from the group consisting of EcoRI, XbaI, PstI, and SpeI. In other embodiments, the enzymes are selected from I-AniI, I-CeuI, I-ChuI, I-CpaI, I-CpaII, I-CreI, I-DmoI, H-DreI, I-HmuI, I-HmuII, I-LlaI, I-MsoI, PI-PfuI, PI-PkoII, I-PorI, I-PpoI, PI-PspI, I-ScaI, I-SceI, PI-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-Ssp6803I, I-TevI, I-TevII, I-TevIII, PI-TliI, PI-TliII, I-Tsp061I, and I-Vdi141I.

In some embodiments, the methods further include probing the prefix and/or the suffix of the newly assembled DNA molecule to select correctly assembled dsDNA molecules. In other embodiments, the methods further include probing the prefix, then probing the suffix of the newly assembled DNA molecule to confirm the structure of the assembled dsDNA molecule.

In some embodiments of all of the foregoing methods, the method excludes polymerase chain reaction (PCR) purification.

In some embodiments of all of the foregoing methods, the method excludes antibiotic selection.

In some embodiments of all of the foregoing methods, the method is automated.

According to another aspect of the invention, nucleic acid molecules, such as DNA molecules, are provided. The nucleic acid molecules include a double-stranded DNA; a first 5' overhang of at least 10 nucleotides in length located at one end of the DNA; and a second 5' overhang of at least 10 nucleotides in length located at the other end of the DNA. In some embodiments the double-stranded DNA is a standard DNA part.

According to another aspect of the invention, nucleic acid molecules, such as DNA molecules, are provided. The nucleic acid molecules include a first biotinylated oligonucleotide of at least 10 nucleotides in length, wherein the first oligonucleotide is not phosphorylated and is attached to a bead. In some embodiments, the bead is a streptavidin-coated bead. In some embodiments, the nucleic acid molecules further include a second oligonucleotide of at least 10 nucleotides in length annealed to the first oligonucleotide such that there is a 5' overhang opposite the bead.

According to another aspect of the invention, double-stranded DNA molecules are provided that include a nucleic acid of interest flanked by a prefix sequence and suffix sequence, each of the prefix sequence and the suffix sequence comprising one or more homing enzyme restriction sites. In some embodiments, the DNA molecules further include an origin of replication. In some embodiments, the DNA molecules further include an antibiotic resistance selection marker. In some embodiments, the DNA molecules are attached to a bead or a DNA molecule comprising a bead.

According to another aspect of the invention, collections of two or more of the DNA molecules are provided. The collections include two or more of any of the DNA molecules described herein.

According to another aspect of the invention, kits are provided that include one or more containers containing two or more of any of the DNA molecules described herein.

DETAILED DESCRIPTION OF THE INVENTION

The methods and compositions provided herein, in certain aspects, overcome previous limitations by using magnetically-controlled DNA probes that anneal directly to standard DNA prefix and suffix regions to both isolate and assemble new DNA parts without intermediate cloning steps. In some aspects, the methods provided herein reduce construction time, minimize mutations, and support intermediate constructs that may be toxic to a cell or otherwise difficult to grow.

DNA Construction

Aspects provided herein enable rapid and automated assembly of multiple arbitrary length DNA fragments or parts. A DNA part may be any DNA sequence of defined structure and function. Examples include, but are not limited to, terminator sequences, ribosome binding site (RBS) sequences, protein coding sequences, reporter sequences, signaling sequences, primer sequences, and regulatory sequences or transcriptional control sequences.

A "terminator sequence" refers to a sequence typically located at the end of a gene or operon to stop transcription.

An "RBS sequence" refers to a sequence that, when transcribed, is a segment of the 5' (upstream) part of an mRNA molecule that binds to the ribosome to position the message correctly for the initiation of translation.

A "protein coding sequence" refers to the sequence information needed to create a functional protein (polypeptide).

A "reporter sequence" refers to a sequence encoding a product that is measurable, for example, a fluorescent protein or colorimetric enzyme. Examples include β-galactosidase, chloramphenicol acetyltransferase, β-glucuronidase, and green or red fluorescent protein (GFP or RFP).

A "signaling sequence" refers to a sequence encoding a signal peptide (e.g., a peptide chain about 3-60 amino acids long that directs the transport of a protein).

A "primer sequence" refers to a strand of nucleic acid that serves as a starting point for DNA synthesis.

A "regulatory sequence" or "transcriptional control sequence" refers to a binding region for RNA polymerase, for example, a promoter or enhancer sequence. In some embodiments, a regulatory sequence may be constitutive, while in other embodiments, it may be inducible.

A DNA part may be provided as a linear structure or it may be part of a DNA plasmid vector. A "plasmid vector" is a circular, double-stranded DNA molecule that may contain a few thousand base pairs and that replicates within the cell independently of chromosomal DNA. In some embodiments, a plasmid is a bacterial plasmid (derived or isolated from bacteria). A bacterial plasmid may be derived from an organism selected from Acetobacter spp., Acidithiobacillus spp., Acinetobacter spp., Aeromonas spp., Arcanobacterium spp., Agrobacterium spp., Alcaligenes spp., Aquifex spp., Arthrobacter spp., Azotobacter spp., Bacillus spp., Bifidobacterium spp., Borrelia spp., Chromobacterium spp., Citrobacter spp., Clostridium spp., Comamonas spp., Corynebacterium spp., Deinococcus spp., Enterobacter spp., Erwinia spp., Erysipelothrix spp., Escherichia spp., Flavobacterium spp., Francisella spp., Fusobacterium spp., Geobacter spp., Geobacillus spp., Gluconobacter spp., Halobacterium spp., Helicobacter spp., Lactobacillus spp., Lactococcus spp., Marinococcus spp., Methanobacterium spp., Methanococcus spp., Microcystis spp., Microlunatus spp., Mycobacterium spp., Pseudomonas spp., Pantoea spp., Ralstonia spp., Rhizobium spp., Rhodococcus spp., Ruminococcus spp., Serratia spp., Saccharopolyspora spp., Stenotrophomonas spp., Sinorhizobium spp., Saccharopolyspora spp., Staphylococcus spp., Streptococcus spp., Streptomyces spp., Sulfolobus spp., Synechococcus spp., Synechocystis spp., Thermus spp., Thiobacillus spp., Vibrio spp., Xanthomonas spp., Xylella spp., Yersinia spp., Zygosaccharomyces spp., or Zymomonas spp.

Examples of bacterial plasmids include SUN Plasmid pRAY, pTi-SAKURA, pAP1, ece1, pAC5, pX02, pRBH1, pPL10, pTA1040, pTA1015, pTA1060, p1414, pSTK1, pBMB9741, pTX14-3, pTX14-1, pBMB2062, pCIBb1, RK2, RP1, RP4, R18, R68, lp17, lp25, lp28-1, lp28-2, lp28-3, lp28-4, lp36, lp38, lp54, cp26, cp9, cp32-1, cp32-3, cp32-4, cp32-6, cp32-7, cp32-8, cp32-9, lp21, lp56, lp5, pLeu-Sg, pLeu-Dn, pOM1, pCpA1, pCL1, pSOL1, pSOL12, MCF-1 indigenous plasmid, pPT1, R-plasmid pAG1, pAM330, pSR1, pXZ10145.1, MP1, CP1, R751, pPIGDM1, pSW200, pAP1, CloDF13, ColA, ColE1, ColIb-P9, pECO29, pKL1, NTP16, pO157, pOSAK1, R100, mini Rts1, pFL1, pOM1, pPA52, pHGN1, pNRC100, pHPM180, pHPM186, pHPO100, pHP489, pWS58, pLH1, pLJ1, pAH33, pCIS3, pMRC01, pIL105, pLTK2, pGT232, pPL1, pME2200M, pME2001, pFV1, pFZ1, pURB500, pMA2, small pMA1, pTA144 Dw, pTA144 Up, pJD4, pAYL, pAYS, pUCD5000, pPZG500, pAB2, pIG1, pMD136, pPF1, pPBS1, Pp6427, Pp6859, pRAM4, pGT5, pMBCP, pJTPS1, NGR234, pRM21, pCFC1, pCFC2, pBAW301, pBERT, pSC101, pSR1, pNL1, pSK6, pSK3, pKH6, J3358, pKH7, pC221, pUB110, pC194, pLUG10, pGB354, pSSU1, pER35, pER36, pSCL, pSA1.1, pSNA1, pSN22, pJV1, pRN1, pRN2, pMA4, pTF4.1, pSA19, pXF868, pYVe227, pPCP1, pCD1, pMT1 (100984 bp), pCD1, pMT-1, pYC, pZMO1, pSB2, and pSM1 FLP. In certain embodiments, the plasmid is derived from a eukaryotic organism such as Saccharomyces cerevisiae.

In certain aspects, arbitrary DNA assembly is performed by "direct conversion," followed by "double probing" (each defined below). In some embodiments, the methods described herein facilitate high fidelity assembly with the prepared DNA parts. In certain embodiments, arbitrary DNA assembly may be performed by direct conversion without using double probing. Similarly, in some embodiments, double probing may be used without direct conversion. In particular embodiments, however, the serial combination of the two allows for robust DNA assembly.

Conversion of standard parts to self-assembling parts

Described herein are methods of transforming a standard DNA part (e.g., from a library of standard DNA or genetic circuits) that typically has a universal prefix and suffix into a single self-assembling part via the addition of a designer prefix and suffix segment. A "prefix" as used herein refers to a DNA sequence flanking a nucleic acid (e.g., protein coding sequence) of interest at the 5' end of the nucleic acid. A "suffix" refers to a DNA sequence flanking the nucleic acid at its 3' end. In embodiments, a prefix and/or suffix may have one or more enzyme recognition sites (also referred to as restriction sites), which are cleaved (cut) by restriction enzymes.

A restriction enzyme may be a Type I, Type II, Type III, or Type IV enzyme. Type I enzymes (EC 3.1.21.3) cleave at sites remote from the recognition site, typically require both ATP and S-adenosyl-L-methionine to function, and are multifunctional proteins with both restriction and methylase (EC 2.1.1.72) activities. Type II enzymes (EC 3.1.21.4) cleave within or at short specific distances from the recognition site, most require magnesium, and are single-function (restriction) enzymes independent of methylase. Type III enzymes (EC 3.1.21.5) cleave at sites a short distance from the recognition site and typically require ATP; S-adenosyl-L-methionine can stimulate the reaction but is not required, and the enzymes exist as part of a complex with a modification methylase (EC 2.1.1.72). Type IV enzymes target methylated DNA. In particular embodiments, the restriction enzymes described herein are Type II enzymes. Examples of Type II restriction enzymes include, but are not limited to, EcoRI, EcoRII, BamHI, HindIII, TaqI, NotI, HinfI, Sau3A, PvuII, SmaI, HaeIII, HgaI, AluI, EcoRV, EcoP15I, KpnI, PstI, SacI, SalI, ScaI, SpeI, SphI, StuI, and XbaI.

A restriction enzyme may also be a homing enzyme. A "homing enzyme" (or homing endonuclease) refers to a restriction enzyme encoded by an intron or intein. Examples include I-AniI, I-CeuI, I-ChuI, I-CpaI, I-CpaII, I-CreI, I-DmoI, H-DreI, I-HmuI, I-HmuII, I-LlaI, I-MsoI, PI-PfuI, PI-PkoII, I-PorI, I-PpoI, PI-PspI, I-ScaI, I-SceI, PI-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-Ssp6803I, I-TevI, I-TevII, I-TevIII, PI-TliI, PI-TliII, I-Tsp061I, I-Vdi141I, PI-PspI, and PI-IsceI. Homing endonuclease recognition sequences are typically long enough to occur randomly only with a very low probability (e.g., approximately once every 7x10^10 nucleotides) and, in some cases, are found in only one instance per genome.
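The order of magnitude quoted above can be checked with a short calculation. The sketch below assumes a fixed recognition site with no degenerate positions in a random sequence of equally likely, independent bases; the function name is hypothetical. Under that model, an 18-base-pair site is expected about once every 4^18 nucleotides, which is close to 7x10^10, whereas a 6-base-pair site is expected about once every 4096 nucleotides.

```python
def expected_spacing(site_length: int) -> int:
    """Expected number of nucleotides between chance occurrences of a
    fixed site, assuming uniform, independent bases (an idealization)."""
    if site_length < 1:
        raise ValueError("site_length must be positive")
    return 4 ** site_length


print(expected_spacing(6))             # 4096
print(f"{expected_spacing(18):.1e}")   # 6.9e+10
```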

DNA oligonucleotides - prefix, suffix, and probe design

A single-stranded prefix, suffix, or probe oligonucleotide (e.g., single-stranded stretch of DNA), in certain embodiments, is of sufficient length to have a melting temperature (Tm) similar to its annealing temperature. In certain embodiments, an oligonucleotide's Tm is higher than the temperature at which the DNA assembly reaction is performed, e.g., higher than room temperature. In some embodiments, the Tm of an oligo may be about 45°C, 50°C, 55°C, 60°C, 65°C, 70°C, 75°C, or higher. In some instances, depending on the DNA assembly reaction temperature, the Tm of the oligos may be lower than 45°C.
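As a rough illustration of how a candidate oligo might be screened against a reaction temperature, the following Python sketch estimates Tm from base composition. The formulas are common textbook approximations (the Wallace rule for oligos shorter than 14 nucleotides and a GC-content formula for longer ones) and are not taken from this disclosure; nearest-neighbor thermodynamic models that account for salt and oligo concentration are more accurate. The function names are hypothetical.

```python
def rough_tm(seq: str) -> float:
    """Approximate melting temperature in degrees C from base counts only."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    n = len(seq)
    gc = seq.count("G") + seq.count("C")
    if n < 14:
        # Wallace rule: 2 degrees per A/T, 4 degrees per G/C.
        return 2.0 * (n - gc) + 4.0 * gc
    # Basic GC-content approximation for longer oligos.
    return 64.9 + 41.0 * (gc - 16.4) / n


def usable_at(seq: str, reaction_temp_c: float) -> bool:
    """True if the estimated Tm exceeds the assembly reaction temperature."""
    return rough_tm(seq) > reaction_temp_c


oligo = "ACGT" * 10  # hypothetical 40-nt oligo with 50% GC
print(round(rough_tm(oligo), 2), usable_at(oligo, 25.0))
```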

In certain embodiments, the length of an oligonucleotide (oligo) may be about 10 to about 50, about 10 to about 100, or about 10 to about 500, or more nucleotides in length. For example, the oligo may be at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 200, at least 300, at least 400, or at least 500 nucleotides in length. In certain embodiments, an oligo described herein is about 20, about 30, about 40, or about 50, or about 60 nucleotides long. In specific embodiments, the oligos described herein are about 40 nucleotides long.

In some embodiments, the oligos described herein are biotinylated (i.e., one or more biotin molecule(s) is covalently attached to the oligo). In some embodiments, the oligos described herein may be attached to beads, also referred to as magnetic beads. In certain embodiments, the beads are coated (covered) in streptavidin or avidin.

The direct conversion methods described herein can achieve direct automated assembly of DNA, in many instances, providing "building blocks" for the described assembly methods. Below are methods of converting existing standard DNA parts into self-assembling parts.

Direct Conversion from BioBrick Standard Parts

The methods described herein may be used to convert, for example, the standard BioBricks Foundation RFC 10 part (Knight, T. (2003), Idempotent Vector Design for Standard Assembly of Biobricks, MIT Synthetic Biology Working Group) into a self-assembling part through the ligation of long (e.g., at least 10, at least 20, at least 30, or at least 40 nucleotides) single-stranded overhangs to restriction sites in standard prefix and suffix sequences. Figures 17 and 18 depict one embodiment of a method for preparing a self-assembling DNA part (molecule). Figure 17 shows the steps adding (e.g., ligating) an overhang to a SpeI restriction site in the suffix, while Figure 18 shows the steps adding an overhang to the XbaI restriction site in the prefix.

An optional, preferred component of the methods described herein is the use of double probing. Double probing is a sequence-specific selection or enrichment method that removes molecules that do not anneal to both probes, and keeps molecules that do anneal.

Production of Self-Assembling DNA Units Using Homing Enzymes

Described herein are standard DNA parts, and related methods, produced using one or more homing enzymes ("meganucleases") with long and therefore rare recognition sites in the prefix and suffix. In certain embodiments, use of homing enzymes may provide more flexibility in the design of the DNA part itself and/or its subsequent utilization in downstream processes. In one embodiment, a standard DNA part described herein may comprise a restriction site for I-AniI, I-CeuI, I-ChuI, I-CpaI, I-CpaII, I-CreI, I-DmoI, H-DreI, I-HmuI, I-HmuII, I-LlaI, I-MsoI, PI-PfuI, PI-PkoII, I-PorI, I-PpoI, PI-PspI, I-ScaI, I-SceI, PI-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-Ssp6803I, I-TevI, I-TevII, I-TevIII, PI-TliI, PI-TliII, I-Tsp061I, I-Vdi141I, PI-PspI, or PI-IsceI. In certain embodiments, a meganuclease-based standard DNA part ("MegaBrick") includes a restriction site for I-SceI in the prefix and I-CeuI in the suffix (or I-SceI in the suffix and I-CeuI in the prefix) surrounding the DNA part as follows:

[Figure not reproduced in this text version.]


Some variability in homing enzyme recognition sites is possible. One set of recognition sites and resulting overhangs from the New England BioLabs (NEB) website is:


Unlike Type II enzymes, for example, homing enzymes have non-palindromic overhangs, so, in some embodiments, only two enzymes may be used to construct a self-assembling DNA molecule. In certain embodiments, the 4-nucleotide non-palindromic overhang provides flexibility because (1) the long (e.g., 18-base pair) recognition site reduces the possibility of collision with a sequence in the original DNA, and (2) the non-palindromic overhang permits combined restriction enzyme and overhang ligation steps in a single pot (e.g., tube or well) reaction, optionally followed by the double probe method to enrich for the correct product (self-assembling molecules).
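The distinction between palindromic and non-palindromic overhangs can be made concrete with a small Python sketch. An overhang is palindromic when it equals its own reverse complement, in which case any two ends carrying it can anneal to each other. The 5'-CTAG overhang left by both XbaI and SpeI is palindromic; the "ATAA" overhang below is an arbitrary non-palindromic sequence chosen for illustration, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(overhang: str) -> bool:
    """True if the overhang can anneal to another copy of itself."""
    return overhang.upper() == revcomp(overhang)


print(is_palindromic("CTAG"))  # True: XbaI/SpeI-style overhang
print(is_palindromic("ATAA"))  # False: pairs only with its complement TTAT
```

With a non-palindromic overhang, two identically cut ends cannot rejoin with each other, which is one way to read why cutting and overhang ligation may share a single reaction.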

In some embodiments, the final self-assembled DNA conforms to a "MegaBrick" standard. In some instances, in order for the final assembly to be idempotent, the meganuclease (homing enzyme) restriction sites are reintroduced on separate DNA parts annealed from two homologous oligonucleotides with long (e.g., at least 10 nucleotides) overhangs created. For example:

[Figure not reproduced in this text version.]


Simultaneous Joining and Conversion from Multiple Standard Parts

Also provided herein are methods of assembling (e.g., ligating and/or annealing) multiple (standard) DNA parts into a single DNA molecule (e.g., genes, gene circuits). In certain embodiments, the methods described herein can be used to assemble multiple DNA parts simultaneously to produce longer or compound self-assembling parts. In some instances, the methods provide intermediate assembled parts, allowing for faster and less complex future assemblies (e.g., for gene circuits). The number of parts that can be assembled at once into a new self-assembling molecule is a function of the assembly standard. In some embodiments, 2 to 5, 2 to 10, 2 to 20, 2 to 30, 2 to 40, or 2 to 50 parts can be assembled. In some embodiments, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more parts can be assembled. Examples of multiple DNA part assembly are as follows:

Example (1) - For assembly of multiple BioBrick parts using BioBrick RFC 10, at least two parts can be ligated and converted into a self-assembling molecule and its assembly verified using the double probe method, described herein:

[Figure not reproduced in this text version.]


Example (2) - In some embodiments, multiple parts can be assembled with Type IIS enzymes and unique 4-nucleotide intersections. In certain embodiments, the methods provided herein exclude or omit standard molecular cloning methods such as antibiotic resistance or screening selection methods (e.g., see Weber, E. et al. PLoS ONE, 6, e16765, 2011). In some instances, the double probing methods described herein provide for the ligation and selection of multiple parts into a self-assembling molecule in the following form:


[Figure not reproduced in this text version.]

Example (3) - In some embodiments, two or more self-assembling parts can be annealed and the new construction double probed in order to remove unwanted/unannealed products. In some embodiments, the assembled molecule will be a larger self-assembling part, ready for self-assembly into even larger molecules, and in some instances, at high efficiencies.

Double Probing

Double probing may be defined herein as a method in which two identifiable or recoverable probes are attached (e.g., ligated or annealed) to both ends of a target sequence in order to verify, optimize, and/or enrich a given DNA target sequence. In some embodiments, double probing provides a higher level of sequence and assembly validation as compared to current strand-end hybridization methods. In certain embodiments, double probing is used to construct large gene circuits. In some embodiments, double probing increases yield of each intermediate and end-product DNA molecule.

Probing with two probes

In certain aspects, the double probing method comprises first probing one overhang of an assembled DNA molecule (e.g., the prefix), then probing a different overhang of the assembled DNA molecule (e.g., the suffix) to verify that the assembled molecule is correct (e.g., to confirm the structure of the molecule, including the order of the assembled parts). In some aspects, the double probing method is used to select or capture correctly assembled molecules. For example, when a DNA molecule is assembled using two long (e.g., at least 10 nucleotide) overhangs, double probing selects for correct molecules and rejects incorrect molecules. As another example, when converting two or more standard parts into self-assembling parts, double probing can verify that the long overhangs are present on selected assembled molecules. Double probing can also verify the accuracy and structure of intermediate parts because any parts required to assemble a larger part/molecule will remain attached to the probe when probed from either end of the molecule.
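The selection logic of double probing can be modeled as two successive filters. The Python sketch below is a conceptual model only, not a laboratory protocol: molecules are reduced to their two end overhangs, a probe is taken to capture a molecule when the overhang is the exact reverse complement of the probe, and all names and sequences are hypothetical.

```python
from typing import List, NamedTuple, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


class Molecule(NamedTuple):
    name: str
    prefix_overhang: Optional[str]  # None if this end has no overhang
    suffix_overhang: Optional[str]


def capture(pool: List[Molecule], probe: str, end: str) -> List[Molecule]:
    """Keep only molecules whose chosen end anneals to the probe."""
    target = revcomp(probe)
    return [m for m in pool if getattr(m, end) == target]


def double_probe(pool: List[Molecule], prefix_probe: str,
                 suffix_probe: str) -> List[Molecule]:
    """Probe the prefix end first, then the suffix end of what was kept."""
    kept = capture(pool, prefix_probe, "prefix_overhang")
    return capture(kept, suffix_probe, "suffix_overhang")


P, S = "GATTACAGGT", "TTGACCGATA"  # hypothetical probe sequences
pool = [
    Molecule("full", revcomp(P), revcomp(S)),
    Molecule("missing_suffix", revcomp(P), None),
    Molecule("missing_prefix", None, revcomp(S)),
]
print([m.name for m in double_probe(pool, P, S)])  # ['full']
```

Only a molecule carrying both expected ends survives both rounds, which mirrors the statement that incomplete or misassembled products are rejected.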

Probing with temporary overhangs

In some aspects, provided herein are methods of preparing self-assembling DNA parts having temporary single-stranded DNA overhangs. In some embodiments, the temporary overhangs may be double probed.

In some embodiments, temporary single-stranded DNA overhangs can be created with an exonuclease that removes or "chews" DNA bases from either the 3' or the 5' end. This process may be referred to, and is known in the art, as "chewback." When no dNTPs are present, T4 DNA Polymerase (NEB M0203), for example, acts as an exonuclease and chews back from the 3' end of the DNA. As another example, T5 Exonuclease chews back from the 5' end of the DNA. In one embodiment, a temporary overhang is prepared by permitting chewback by T4 DNA Polymerase or T5 Exonuclease for a defined period of time, then inactivating the chewing enzyme (e.g., with heat inactivation). Similar to the fixed overhangs described herein, in some embodiments, temporary overhangs can be used in any one of the double probing methods provided herein to verify that two DNA parts are correctly joined. In some embodiments, the temporary overhangs described herein may also be used for sequence-specific purification.
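Which single-stranded sequences a chewback exposes follows from strand polarity, as the Python sketch below illustrates. It is an idealized model with hypothetical names: a blunt duplex is given by its top strand, exactly `n` nucleotides are removed from each susceptible end, and real reactions, which depend on time and temperature and are not uniform across molecules, are not modeled.

```python
from typing import Dict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def chewback(top: str, n: int, enzyme: str) -> Dict[str, str]:
    """Overhangs (written 5'->3') left after removing n nt per end.

    'T4' models 3'->5' exonuclease activity (T4 DNA polymerase, no dNTPs);
    'T5' models 5'->3' exonuclease activity (T5 exonuclease).
    """
    top = top.upper()
    if n < 1 or 2 * n >= len(top):
        raise ValueError("n must leave a double-stranded core")
    if enzyme == "T4":
        # 3' ends recede, so the 5' ends remain as single strands.
        return {"kind": "5' overhang",
                "left": top[:n],             # top strand, left end
                "right": revcomp(top[-n:])}  # bottom strand, right end
    if enzyme == "T5":
        # 5' ends recede, so the 3' ends remain as single strands.
        return {"kind": "3' overhang",
                "left": revcomp(top[:n]),    # bottom strand, left end
                "right": top[-n:]}           # top strand, right end
    raise ValueError("enzyme must be 'T4' or 'T5'")


duplex_top = "ATGCCGTAGGCTAACT"  # hypothetical 16-bp blunt duplex
print(chewback(duplex_top, 4, "T4"))
print(chewback(duplex_top, 4, "T5"))
```

A probe for a temporary overhang would then be designed as the reverse complement of the exposed sequence.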

In certain embodiments, an advantage of temporary overhangs is that after probing, removed DNA can be replaced and the integrity of the original double-stranded DNA restored or repaired. In some embodiments, the introduction of extraneous scar sequences into an assembled DNA molecule can be avoided by using temporary overhangs. To eliminate the overhang, it may be necessary to add complementary DNA to an assembly reaction. In some embodiments, T4 DNA polymerase (with dNTPs), Taq polymerase, or an error correcting polymerase such as Pfx (Invitrogen®) may be used to eliminate a temporary overhang. In other embodiments, a temporary overhang may be eliminated by annealing to the overhang a single-stranded oligonucleotide cap complementary to the chewed back region.

Double probing with temporary overhangs, in certain embodiments, enables multistep ligation-based assembly without intermediate in vivo cloning or screening steps (see Results, e.g., Figures 8-12).

As an example, for BioBrick RFC 10 (Draft standard for BioBrick™ biological parts), temporary overhangs can be used to probe both prefix and suffix of the DNA part (Figure 13), as well as either end of the backbone (Figure 16). This method, in some instances, may be used to purify parts from backbones without the use of gel extraction or overhang ligation.

Applications

Simplifying Existing Workflows

The field of synthetic biology covers a broad range of application areas, from food and energy, to the environment, to health and medicine, and even to manufacturing of new materials. In certain embodiments, the methods and compositions provided herein may facilitate biological engineering and may be used by a diverse group of people. In particular embodiments, the methods and compositions described herein may be used to explore the molecular design space in the fields of biotechnology and microbiology, for example, in medicine and agriculture.

Large Scale Assembly of Genes / Gene Circuits

As described herein, double probing enables the robust assembly of large scale DNA molecules. In certain instances, the methods and compositions described herein may be used for the construction of large genes, and even small genomes.

Automated Assembly

One application of the methods and compositions described is the automation of DNA assembly workflows. In some embodiments, the methods described herein are performed in liquid phase, or on soluble magnetic beads, within common operating temperatures (e.g., 10°C to 80°C). Therefore, in some instances, the DNA assembly steps are executable on current commercially available liquid handling robots.

Additional Embodiments

In some aspects, provided herein are beads with standard probes and complementary overhangs for creating customized parts from proprietary or other DNA sequences available to the end user.

In other aspects, provided herein are libraries of self-assembling parts with pre-designed long (e.g., at least 10 nucleotides) overhangs. In some embodiments, these parts can be provided as aliquots of DNA in tubes; in microplates of size 96, 384, 1536, 3456, 9600, or other formats as available; in microfluidic chips; or hybridized to DNA microarrays and photo-released.

In certain other aspects, protocols on a standard liquid handling robot or a specialized device or machine for processing magnetic beads are provided. In some aspects, software is provided for designing new unique overhangs. In yet other aspects, complete machines / devices for designing and automating DNA part assembly are provided. For example, in some embodiments, bead handling may be performed using a Tecan Evo 150 liquid handler.

EXAMPLES

Methods & Materials

Molecular biology

A BioBrick Assembly kit (E0546), including EcoRI, XbaI, SpeI, PstI, and T4 DNA ligase, was purchased from New England BioLabs (NEB). These enzymes may also be purchased separately and are generally available from other vendors. NEB also supplies homing endonucleases I-SceI (R0694) and I-CeuI (R0699). AccuPrime Pfx Supermix from Invitrogen (Life Technologies, Carlsbad CA) was used for PCR verification of long overhang additions. Miniprep of DNA was done using the Qiagen Qiaprep Spin Miniprep Kit. Some minipreps were automated on a Qiagen Qiacube using the same miniprep kit.

Architecture of Anchors

The next section shows sequences for anchors if the prefix and suffix are BioBrick RFC 10 (compatible with XbaI/SpeI 4bp sites). Constraints on anchors:

1. Should not contain forbidden cutsites

2. Ligation of 4bp overhang should not introduce either XbaI or SpeI cutsite (special case for backbone)

3. Should be long enough to easily anneal (e.g. 40bp)

4. Should be unique versus other anchors and complementary to one other prefix/suffix
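Constraint 2 above can be checked mechanically. XbaI (T^CTAGA) and SpeI (A^CTAGT) both leave a CTAG overhang, so whether a ligation re-creates either site depends only on the two bases flanking the CTAG core. The following is a minimal sketch of that check; the function names are hypothetical and are not taken from the described methods.

```python
# Recognition sites written 5'->3'. Both sites are palindromic,
# so checking one strand of the junction is sufficient.
XBAI_SITE = "TCTAGA"
SPEI_SITE = "ACTAGT"


def junction_after_ligation(left_base: str, right_base: str) -> str:
    """Return the 6-bp junction formed when two CTAG sticky ends are ligated.

    left_base is the base immediately 5' of the CTAG core and right_base
    is the base immediately 3' of it (both read on the same strand).
    """
    return (left_base + "CTAG" + right_base).upper()


def regenerates_cutsite(left_base: str, right_base: str) -> bool:
    """True if the ligated junction is an XbaI or SpeI recognition site."""
    return junction_after_ligation(left_base, right_base) in (XBAI_SITE, SPEI_SITE)
```

For example, an XbaI-cut end ligated to a SpeI-cut end gives the mixed junction TCTAGT (or ACTAGA), which neither enzyme cuts, whereas flanking bases T and A restore an XbaI site.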

Details of anchor sequences

In both conversion and probing methods, oligonucleotide anchor sequences capable of annealing to complementary overhangs of around 40bp were used. Overhangs and anchors may be shorter so long as the melting temperature (Tm) is in a similar range to that used for PCR primers (for example, 50-70°C). Longer overhangs are more effective but are also more costly. Similar to primers, secondary structure and hairpin formation should be minimized. While the first and last bases in the overhang must match the cutsites used for ligation to the standard parts, the interior region can be composed of a large range of sequences so long as the following conditions are met:

1. an anchor and the overhang it anneals to should be complementary;

2. overhangs designed to self-assemble should be complementary;

3. all other anchor and overhang combinations should be non-homologous;

4. anchor and overhang sequences should not contain the forbidden restriction enzyme cutsites defined by the standard being converted (for example, for RFC 10 this would be EcoRI, XbaI, SpeI, PstI).
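The four conditions above lend themselves to an automated check. The following is a minimal sketch, assuming sequences are given 5' to 3' over the alphabet A, C, G, T. The longest-shared-run measure is only a crude stand-in for "non-homologous" (an assumption made for this illustration), and all function names are hypothetical.

```python
# Forbidden recognition sites for BioBrick RFC 10. All four are palindromic,
# so scanning one strand also covers the complementary strand.
FORBIDDEN_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def are_complementary(a: str, b: str) -> bool:
    """Conditions 1 and 2: a anneals to b over its full length."""
    return a.upper() == reverse_complement(b)


def forbidden_sites_in(seq: str) -> list:
    """Condition 4: names of forbidden cutsites present in seq."""
    seq = seq.upper()
    return [name for name, site in FORBIDDEN_SITES.items() if site in seq]


def longest_shared_run(a: str, b: str) -> int:
    """Condition 3 (crude proxy): length of the longest stretch of a that
    also occurs in b or in the reverse complement of b."""
    a = a.upper()
    targets = (b.upper(), reverse_complement(b))
    best = 0
    for i in range(len(a)):
        # Only try runs longer than the best found so far.
        for j in range(i + best + 1, len(a) + 1):
            if any(a[i:j] in t for t in targets):
                best = j - i
            else:
                break
    return best
```

A design tool could reject any candidate for which `forbidden_sites_in` is non-empty, or for which `longest_shared_run` against an unrelated anchor or overhang exceeds a chosen threshold. The threshold itself is a design choice that the source does not specify.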

In one embodiment, we choose sequences as follows:

Nine unique sequences that can be used for overhangs based on combinations of 12mer words in Shortreed, et al., A thermodynamic approach to designing structure-free combinatorial DNA word sets. Nucl. Acids Res. 33(15): 4965-4977 doi: 10.1093/nar/gki812.


DNA sequence examples for first two unique overhangs and BioBrick RFC 10 conversion:

Phosphorylated anchor complements:


Note: non-phosphorylated overhangs can only be ligated in vitro with the addition of T4 Polynucleotide kinase (for example NEB M0201S), while either version will be ligated in vivo, after transformation. One or the other version may be preferred depending on oligo and enzyme costs.

In some embodiments, a special type of anchor can be used to preserve XbaI, SpeI cutsites for backbone overhang ligations (a single base is different). Alternatively, re-introduction of cutsites is done on separate self-assemblable parts with long overhangs. The goal in these embodiments is to preserve idempotency, i.e., so the resulting part will also be a BioBrick having proper flanking restriction enzyme cutsites.

Exemplary design of probes and overhangs for I-SceI and I-CeuI flanked standard parts.

Two unique overhangs:


Pairs of anchoring probes and complementary overhangs are J85330/J85331; J85332/J85333; J85334/J85335; J85336/J85337; J85360/J85331; J85332/J85363; J85364/J85335; J85336/J85367; J85370/J85371; J85372/J85373; J85374/J85375; and J85376/J85377.

Overhangs that are complementary are: J85330/J85333; J85334/J85337; J85360/J85363; J85364/J85367; J85370/J85373; and J85374/J85377.

The term "/5Phos/" denotes a phosphorylated oligo and the term "/3BioTEG/" denotes a 3' biotinylated oligo as ordered from IDTDNA.

When long overhangs are designed to ligate to short 5' overhangs, the long overhang must include matching short overhang sequences on either end. When ligating to a recessed 5' end, only the ligating end must be complementary to the short overhang.

It is important not to phosphorylate the 5' end of the biotinylated anchor probe, so that ligation makes no permanent connection between the probe and the DNA, and the DNA can be easily removed from the bead.

Example assemblable parts that can be created are J85370:part:J85377 or J85374:part:J85373. Note that between the overhang and the part is a dsDNA scar roughly equivalent to half the endonuclease cutsite.

Additional anchor sequences for use with temporary overhangs created by 3' and 5' chewback of BioBrick RFC 10 prefix and suffix, part and backbone:

J85324 is used to probe the end of the backbone that connects to the prefix of pSB1A2 if 3' chewed, for example with T4 DNA Polymerase. J85325 is used to probe the end of the backbone that connects to the suffix when 5' chewed, for example with T5 Exonuclease.


An efficient procedure for separating a BioBrick part from its backbone is to cut with EcoRI (PstI), chew with T4 DNA polymerase (T5 Exonuclease), probe with J85324 (J85325), and then cut with SpeI (XbaI) to release part from probe. Before using the chewed end of the part, a repair step is needed (for example, fill-in with a polymerase).

Conversion of standard part to self-assembling part

The following steps, detailed below, were used to convert BioBrick RFC 10 parts to assemblable parts.

1. Start with DNA, using 100ng to several ug of DNA from miniprep elution, and measure concentration.

2. Simultaneously digest part with XbaI and ligate to bead containing prefix overhang 1 or 2. Beaded-overhang is created by annealing anchor probe to complementary overhang and incubating with beads.

3. Wash two times in low salt buffer, last time in T4 DNA ligase buffer.

4. Cut with SpeI and PstI and wash in low salt buffer to remove backbone.

5. Elute at 80°C in T4 buffer.

6. Simultaneously digest with SpeI and ligate to bead containing suffix overhang 2 or 1.

7. Wash.

8. Elute.

9. ExoI digest or probe with anchor complementary to prefix 1 or 2 (J85331 or J85335).

Figures 17, 18, and 19 further illustrate these steps graphically, with boxes representing DNA.

Annealing and self-assembly of assemblable parts.

Annealing is performed by combining parts in equimolar ratios and heating to 70°C or higher and allowing to cool to 23°C slowly on the bench (typically 30-60 minutes).
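Combining parts in equimolar ratios requires converting each measured mass concentration into a molar concentration. The following is a minimal sketch of that arithmetic, assuming double-stranded DNA and the common approximation of about 650 g/mol per base pair. That constant, the function names, and the example numbers are assumptions made for illustration and do not come from the described experiments.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (common approximation; an assumption)


def pmol_per_ul(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration in ng/uL to pmol/uL.

    ng divided by g/mol gives nmol, so the factor 1000 converts to pmol.
    """
    return conc_ng_per_ul * 1000.0 / (length_bp * AVG_BP_MASS)


def equimolar_volumes(parts: dict, target_pmol: float) -> dict:
    """Volume (uL) of each part needed to deliver target_pmol of it.

    parts maps a part name to (concentration in ng/uL, length in bp).
    """
    return {
        name: target_pmol / pmol_per_ul(conc, length)
        for name, (conc, length) in parts.items()
    }


# Hypothetical example: two parts of different length and concentration.
volumes = equimolar_volumes(
    {"part_A": (65.0, 1000), "part_B": (130.0, 2000)},
    target_pmol=0.5,
)
```

In this hypothetical example both parts work out to 0.1 pmol/uL, so 5 uL of each delivers 0.5 pmol.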

Annealing anchor probe and complementary overhang.

Oligos were ordered from IDTDNA and resuspended at 100uM concentration in sterile de-ionized water. Each anchor / overhang complement was combined in equal molar ratios and heated to 95°C for 2 minutes and then allowed to cool to room temperature slowly by turning off the heat block.

Incubation of anchor with streptavidin coated beads.

40ul of streptavidin Magnetic Beads (NEB S1420S) were washed three times in high salt binding buffer and then incubated with 20ul of 1uM anchor/overhang DNA for 10 minutes at room temperature. After incubation, an additional three wash steps were performed. Beads were washed by mixing, applying magnet, and removing supernatant, being careful not to disturb the beads. Other solid phase separation techniques can be used; for example, non-magnetic beads or chemical attachment to glass surfaces.

Low salt binding buffer used: 0.1 M NaCl, 20 mM Tris HCl (pH 7.5), 1 mM EDTA

High salt binding buffer used: 0.5 M NaCl, 20 mM Tris HCl (pH 7.5), 1 mM EDTA

Simultaneous ligation with T4 DNA ligase and digestion with one or more BioBrick enzymes.

BioBrick enzymes (EcoRI, XbaI, SpeI, PstI) are all active in T4 DNA ligase buffer, although at somewhat reduced rates. Simultaneous ligation / digestion reactions were performed either at room temperature in T4 DNA ligase buffer supplemented with 5% PEG or in cycling reactions between low temperature ligation (16°C) and 37°C digestion. A final digestion step at 37°C can be included to favor an outcome in which no unwanted ligations have occurred. Ligation times ranging from 20 minutes to several hours to overnight were used, depending on the amount of DNA to ligate and the timing constraints of the experimental procedures. Some enzymes (for example, SpeI) are not suitable for extended digestion and will not remain active as long as the ligase. In these cases, additional amounts of enzyme were added at 37°C for 30 minutes as a final step.

Wash steps.

Wash steps were performed at room or slightly warmer temperatures, 37°C being most convenient. Warm washes should not approach the melting temperature (Tm) of any annealed regions during probing. Washes may be performed in low or high salt buffer. Final washes were typically performed in T4 DNA ligase buffer before subsequent ligation or digestion reactions.

Elution.

Elution temperature must exceed the melting temperature of the annealed overhang to be eluted. For the sequences J85330-J85370, an elution temperature of 80°C worked well. Elution temperature for other sets of sequences may differ, and can readily be calculated based on the base composition of the sequences, as will be understood by persons of skill in the art. Elution at high temperatures in water can break the biotin-streptavidin bond as well as denature DNA, so generally at least a low salt buffer was used. 1X T4 DNA ligase buffer was determined to be a generally useful elution buffer.
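As an illustration of calculating a melting temperature from base composition, the following minimal sketch uses two widely used textbook approximations: the Wallace rule, 2(A+T) + 4(G+C), for oligos shorter than 14 nucleotides, and 64.9 + 41(G+C - 16.4)/N for longer ones. Neither formula is taken from the source, both ignore salt and oligo concentration, and a nearest-neighbor model would be more accurate. The function names are hypothetical.

```python
def tm_estimate(seq: str) -> float:
    """Rough melting temperature (deg C) from base composition alone."""
    seq = seq.upper()
    n = len(seq)
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if n < 14:
        # Wallace rule for short oligos.
        return 2.0 * at + 4.0 * gc
    # Length-corrected GC-content formula for longer oligos.
    return 64.9 + 41.0 * (gc - 16.4) / n


def elution_exceeds_tm(seq: str, elution_temp_c: float) -> bool:
    """True if the chosen elution temperature is above the estimated Tm."""
    return elution_temp_c > tm_estimate(seq)
```

Under these approximations, a 40-nucleotide overhang with 50% GC content has an estimated Tm of about 68.6°C, which is below the 80°C elution temperature reported above.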

Probing.

Double probing is performed by first probing one overhang and then the other. When adding long overhangs, the overhang addition step can often serve as a probe step, reducing the total number of steps needed. A single probe is performed by capturing the DNA with the overhang on a complementary anchor probe already attached to a bead. The anchor probe should be single stranded. A probe is an annealing reaction that works well starting at a high temperature. To avoid disturbing the streptavidin-biotin bond, probe starting temperatures of 80°C in 5% PEG buffers worked well.

Removing backbone.

When a part is attached to a bead, the backbone can be removed by cutting with XbaI or SpeI for BioBrick RFC10 parts. In some cases, EcoRI or PstI must also be used since the backbone can also ligate to the bead anchors due to the palindromic nature of the BioBrick enzymes. Later probing steps are used to eliminate any construction error this could cause.

Exo I Step.

An Exo I (NEB M0293) digestion in ExoI buffer was used to remove any single stranded DNA (ssDNA) that results from elution of unligated overhangs. This step can either save a probe step that accomplishes the same purpose, or can be used in combination with probe steps to further eliminate unwanted ssDNA. Lambda exonuclease can also be used for the same purpose if overhangs are ligated to 3' instead of 5'.

T4 chewback.

We used a simple T4 DNA polymerase chewback reaction, at 180 ng of DNA per unit-minute, for 20 minutes at 23°C in T4 DNA ligase buffer. After 20 minutes, the enzyme was heat inactivated at 80°C.

T5 chewback.

We used a simple T5 exonuclease chewback reaction of 20 minutes at 37°C with a dilution of between 1:20 and 1:100 depending on DNA concentrations (empirically determined). After 20 minutes, the enzyme was heat inactivated at 80°C.

Fill in.

Fill-in reactions were performed with T4 DNA polymerase and appropriate dNTPs. NEB PreCR repair mix (M0309) also can be used.

Robotic Assembly.

Programs for digestion, ligation, bead-based DNA purification, sequence-specific purification, and annealing reactions for assembly, along with standard screening protocols for transformation, colony picking, and restriction mapping, were developed for a Tecan Evo 150 liquid handler configured with an 8-tip LiHa (liquid handling) arm and a RoMa (robotic manipulator) arm. Devices on the robot deck included a Tecan 37°C incubator, a Torrey Pines ICXT20 (-20°C to 100°C) Peltier heater/chiller, an Avision AVA6 passport scanner for colony picking, an EmbiTec RunOne electrophoresis station with an EmbiTec PI-1000 PrepOne Sapphire blue illuminator, and various microplate positions, hotels and tip holders. In addition, a Tecan Safire2 plate reader was used as a shaking incubator and to measure cell optical density (OD) and DNA concentration as needed by a protocol.

Computer code written in Python (the MIT BioCAD package) was developed and used to compile scripts for robotic assembly and screening of assembly reactions. Input to robotic assembly includes an ordered list of the names of promoters, genes, and other DNA elements to be assembled. The BioCAD package includes a database that maps these names to locations on the deck or in automated storage associated with the robotic system (for example, using 2D-barcoded tubes). Input can come directly from a user or be generated by other software systems and bioinformatics tools such as Geneious or a frontend biocompiler.
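The idea of compiling an ordered list of part names into robot steps can be sketched as follows. This is not the BioCAD package or its API. The inventory, labware names, well positions, default volume, and the textual step format are all hypothetical, and the output is not a real Tecan script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A position on the robot deck (hypothetical naming scheme)."""
    labware: str
    well: str


# Hypothetical inventory mapping part names to deck locations.
INVENTORY = {
    "J85204": Location("tube_rack_1", "A1"),
    "J85206": Location("tube_rack_1", "A2"),
}


def compile_assembly(part_names, destination, volume_ul=5.0):
    """Turn an ordered list of part names into a list of step strings.

    Raises KeyError if any part has no known deck location.
    """
    missing = [name for name in part_names if name not in INVENTORY]
    if missing:
        raise KeyError("no deck location for: " + ", ".join(missing))
    dest = f"{destination.labware}:{destination.well}"
    steps = []
    for name in part_names:
        src = INVENTORY[name]
        steps.append(
            f"TRANSFER {volume_ul:.1f} uL {src.labware}:{src.well} -> {dest}  # {name}"
        )
    # One annealing step once every part is in the destination well.
    steps.append(f"ANNEAL {dest}")
    return steps


script = compile_assembly(["J85204", "J85206"], Location("pcr_plate_1", "B1"))
```

Keeping the name-to-location lookup separate from step generation means the same ordered list could be compiled against a different deck layout or an automated storage system, as the text describes.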

Results

Results are shown in the figures and described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows successful conversion of two RFC 10 BioBricks, J85204 and J85206, to assemblable molecules with long overhangs. In each case, unique overhangs have been ligated to the prefix and suffix of the part. Lane 6 shows the result for J85206 and Lane 11 for J85204. The DNA is the correct length after elution; however, the increase in length due to the small overhang is not clearly visible.

Hence Figure 2 describes a method based on PCR to determine if any overhangs have been successfully added. Primers 1 and 2 are added to the molecule, which now serves as the template (1). Although the primers will not initially bind to the overhang since they are the same as the overhang, after a single round of extension using polymerase (2), subsequent amplification will occur (3). If amplification occurs, then at least some molecules have overhangs ligated to both ends (4).

Figure 3 shows the application of this PCR-based method to demonstrate that J85204 has been correctly converted to overhang2_J85204_overhang1. RT-PCR could additionally be used to determine the percentage of correct molecules. Note that overhang2 as prefix is part J85334 and overhang1 as a suffix is part J85333.

Figure 4 shows that self-assembly is a multiway anneal requiring annealing of long overhangs. Molecules will self assemble if overhangs are complementary.

Figure 5 shows a test case for overhang annealing. Parts J85204 and J85206 can be annealed to form part J85208.

Figure 6 shows the result of annealing parts J85204 and J85206 to form part J85208, with a correct annealed product in lane 2, demonstrating successful self-assembly of converted parts.

Figure 7 shows results demonstrating that Exo I can be used to clean up single stranded DNA without affecting double stranded DNA. I13522 is a standard BioBrick.

Figure 8 shows plasmids being cut open with EcoRI or PstI in preparation for creation of temporary overhangs.

Figure 9 shows one end of a double-stranded DNA molecule after chewback of single stranded DNA.

Figure 10 shows probing an overhang (which may be temporary).

Figure 11 shows the resulting sequence and complementary regions when the prefix of an RFC 10 part is annealed to an anchor probe. Part (1) shows the prefix part with temporary overhang annealed to an anchor probe. AATTCGCGGCCGCTTCTAGANNNNNNNNNNNN is SEQ ID NO:1; CAAAGAAUCTTAAGCGCCGGCGAAGATCT is SEQ ID NO:2. Part (2) shows the part ligated to the anchor. AATTCGCGGCCGCTTCTAGANNNNNNNNNNNN is SEQ ID NO:3; CAAAGAAUCTTAAGCGCCGGCGAAGATCTNNNNNNNNNNNN is SEQ ID NO:4. Part (3) shows the intersection after a fill-in reaction and with the backbone removed via an SpeI cut. AATTCGCGGCCGCTTCTAGANNNNNNNN is SEQ ID NO:5; CAAAGAAUCTTAAGCGCCGGCGAAGATCTNNNNNNNNGATC is SEQ ID NO:6.

Figure 12 illustrates backbone removal by enzyme digestion.

Figure 13 shows the results generated when applying the techniques illustrated in Figures 8-12. A BioBrick backbone was successfully separated from its part without gel extraction. A BioBrick prefix anchor was used to separate a backbone from the part (J10000 in this result) without gel extraction.

Any standard part can be captured and purified in this manner with universal prefix / suffix overhangs. The steps for this exemplary procedure were as follows:

1. Wash fresh beads

2. Incubate anchor on bead

3. Cut part with EcoRI

4. Anneal part to bead, wash

5. Cut off backbone with XbaI, wash

6. Elute purified part at 80°C

7. After elution, other parts can be joined without an intermediate cloning step.

Figure 14 further illustrates how the series of steps in Figure 13 can be generalized to complete multistep assembly reactions without intermediate cloning steps, which can also be executed in a binary fashion (Figure 15) in log N steps. Note that conversion of even a single part to an assemblable part as described herein requires at least a two-step reaction for RFC 10 BioBricks (one step to ligate each overhang).
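The log N behavior of binary assembly can be illustrated by scheduling pairwise joins of adjacent constructs. The following is a minimal sketch in which each round joins neighboring constructs and carries an unpaired last construct forward. It counts joining rounds only, not the per-part overhang conversion steps noted above, and the function name is hypothetical.

```python
def binary_assembly_schedule(parts):
    """Return a list of rounds; each round is a list of (left, right) joins.

    Constructs are tuples of part names kept in their original order,
    so only adjacent constructs are ever joined.
    """
    current = [(p,) for p in parts]
    rounds = []
    while len(current) > 1:
        joins = []
        merged = []
        for i in range(0, len(current) - 1, 2):
            joins.append((current[i], current[i + 1]))
            merged.append(current[i] + current[i + 1])
        if len(current) % 2:
            # An odd construct out waits for the next round.
            merged.append(current[-1])
        rounds.append(joins)
        current = merged
    return rounds


schedule = binary_assembly_schedule(["A", "B", "C", "D", "E", "F", "G", "H"])
```

For eight parts the schedule has three rounds (four joins, then two, then one), whereas adding one part at a time would take seven sequential joins.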

Figure 16 shows the results of using an alternative procedure to Figure 13, in which the temporary overhang is created on the backbone rather than the part with J85325. In this alternative embodiment, the part is removed from the backbone in fewer steps since the part is not directly attached to the bead and hence can be cut and taken out with the supernatant without requiring a high temperature elution step.

Figure 17 schematically shows adding a suffix sequence to a part.

Figure 18 schematically shows adding a prefix sequence to a part and purifying away suffix sequence after removing by digestion.

Figure 19 schematically shows purifying away prefix sequence using a suffix anchor sequence attached to a bead.

Figure 20 shows an alternative embodiment for the method of conversion of standard parts to self-assembling parts.

This invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having," "containing," "involving," and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

Each of the foregoing patents, patent applications and references is hereby incorporated by reference, particularly for the teaching referenced herein.

Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description and drawings are by way of example only.

What is claimed is: