1. WO2008008256 - METHODS FOR ENHANCING PRODUCTION OF ISOPRENOID COMPOUNDS BY HOST CELLS

Note: Text based on automated optical character recognition processes. Only the PDF version has legal value.


METHODS FOR ENHANCING PRODUCTION OF ISOPRENOID COMPOUNDS BY HOST CELLS
CROSS-REFERENCE
[0001] This application claims the benefit of U.S. Provisional Patent Application No. 60/819,706, filed July 7, 2006, which application is incorporated herein by reference in its entirety.

BACKGROUND
[0002] Engineering microorganisms for the production of industrial products has become increasingly attractive in the past decades due to multiple advantages over traditional synthetic methods. Creating new biosynthetic capabilities in microorganisms allows previously limited products, such as therapeutic proteins and complex natural chemicals to be produced and purified at high levels while reducing the use of petroleum-based organic precursors and environmentally destructive chemical processes. In this effort, research has shifted focus from engineering the production of a single recombinant protein to the production of small molecule (e.g., non-protein) products, both natural and synthetic.
[0003] Isoprenoids are a highly diverse class of natural products from which numerous commercial flavors, fragrances, chemicals, and medicines are derived. Isoprenoids constitute an extremely large and diverse group of natural products that have a common biosynthetic origin, i.e., a single metabolic precursor, isopentenyl diphosphate (IPP). At least 20,000 isoprenoids have been described. The number of C-atoms present in the isoprenoids is typically divisible by five (C5, C10, C15, C20, C25, C30 and C40), although irregular isoprenoids and polyterpenes have been reported. Isoprenoid compounds are also referred to as "terpenes" or "terpenoids." Important members of the isoprenoids include the
carotenoids, monoterpenoids, sesquiterpenoids, diterpenoids, and hemiterpenes. Carotenoids include, e.g., lycopene, β-carotene, and the like, many of which function as antioxidants. Monoterpenoids include, e.g., menthol and camphor, which are flavor and fragrance agents. Sesquiterpenoids include, e.g., artemisinin, a compound having anti-malarial activity. Diterpenoids include, e.g., taxol, a cancer chemotherapeutic agent.
[0004] These valuable compounds are commonly isolated from plants, microbes, and marine
organisms where they are naturally produced in small quantities. As such, purification from native sources suffers from low yields, impurities, and excessive consumption of natural resources.
Furthermore, most of these compounds are chemically complex, resulting in chemical synthesis routes that are difficult, expensive, and suffer from low yields. For these reasons, the engineering of metabolic pathways to produce large quantities of complex isoprenoids in a tractable biological host presents an attractive alternative to extractions from environmental sources or chemical syntheses. Production consistency, scalability, and efficiency of substrate-to-product conversion of microbial fermentation are of particular importance to producing isoprenoid products on the scale and cost of commodity
chemicals.

[0005] There is a need in the art for methods of making various products of medical and commercial interest, where the products, or precursors of same, are synthesized in genetically modified host cells.

Literature
[0006] U.S. Patent No. 7,172,886; U.S. Patent No. 7,192,751; Martin et al. (2003) Nat. Biotech. 21(7):796-802; U.S. Patent No. 7,183,089; Smolke et al. (2000) Appl Environ Microbiol 66:5399-5405; Smolke and Keasling (2002) Biotechnol Bioeng 80:762-776.

SUMMARY OF THE INVENTION
[0007] The present invention provides methods of producing an isoprenoid or an isoprenoid precursor in a host cell that comprises a biosynthetic pathway that converts a substrate to isopentenyl
pyrophosphate, where the biosynthetic pathway is modified to include a synthetic intergenic region
(IGR) between at least two coding regions encoding enzymes in the biosynthetic pathway. The present invention further provides recombinant nucleic acid constructs comprising a synthetic IGR, and
genetically modified host cells comprising a synthetic IGR.

BRIEF DESCRIPTION OF THE DRAWINGS
[0008] Figures 1A-E depict Tunable InterGenic Regions (TIGR) assembly and reporter operon.
[0009] Figures 2A-C depict expression from TIGR RG library.
[0010] Figures 3A-I depict TIGR effects on expression.
[0011] Figures 4A-F depict mevalonate pathway optimization using the TIGR method.
[0012] Figures 5A-C, 6A-C, 7A-C, and 8A-C depict TIGRs from various constructs.
[0013] Figure 9 is a schematic representation of the mevalonate (MEV) pathway for the production of IPP.
[0014] Figure 10 is a schematic representation of the DXP pathway for the production of IPP and dimethylallyl pyrophosphate (DMAPP).
[0015] Figure 11 is a schematic representation of isoprenoid metabolic pathways that result in the production of the isoprenoid biosynthetic pathway intermediates polyprenyl diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), from
isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).

DEFINITIONS
[0016] The terms "isoprenoid," "isoprenoid compound," "terpene," "terpene compound," "terpenoid," and "terpenoid compound" are used interchangeably herein, and refer to any compound that is capable of being derived from IPP. The number of C-atoms present in the isoprenoids is typically evenly divisible by five (e.g., C5, C10, C15, C20, C25, C30 and C40). Irregular isoprenoids and polyterpenes have been reported, and are also included in the definition of "isoprenoid." Isoprenoid compounds include, but are not limited to, monoterpenes, diterpenes, triterpenes, sesquiterpenes, and polyterpenes.
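The divisible-by-five rule in the definition above can be illustrated with a minimal Python sketch. The function name and the lookup table are hypothetical and not part of the specification; the class names assigned to each carbon count follow conventional terpene nomenclature, and some of them (for example "sesterterpene" for C25 and "tetraterpene" for C40) are not named in the text.

```python
# Conventional class names by carbon count. The carbon counts come from the
# text; the names are standard nomenclature added here for illustration.
TERPENE_CLASSES = {
    5: "hemiterpene",
    10: "monoterpene",
    15: "sesquiterpene",
    20: "diterpene",
    25: "sesterterpene",
    30: "triterpene",
    40: "tetraterpene",
}


def classify_by_carbon_count(n_carbons: int) -> str:
    """Return a conventional class label for an isoprenoid with n_carbons carbons."""
    if n_carbons <= 0:
        raise ValueError("carbon count must be positive")
    # Counts that are not multiples of five fall outside the regular series.
    if n_carbons % 5 != 0:
        return "irregular isoprenoid"
    if n_carbons in TERPENE_CLASSES:
        return TERPENE_CLASSES[n_carbons]
    # Assumption: anything longer than C40 is simply labeled a polyterpene.
    if n_carbons > 40:
        return "polyterpene"
    return f"C{n_carbons} isoprenoid"
```

Such a lookup only labels compounds by size; whether a given compound is "capable of being derived from IPP" is a biochemical question the sketch does not address.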

[0017] As used herein, the term "prenyl diphosphate" is used interchangeably with "prenyl
pyrophosphate," and includes monoprenyl diphosphates having a single prenyl group (e.g., IPP and
DMAPP), as well as polyprenyl diphosphates that include 2 or more prenyl groups. Monoprenyl
diphosphates include isopentenyl pyrophosphate (IPP) and its isomer dimethylallyl pyrophosphate
(DMAPP).
[0018] As used herein, the term "terpene synthase" (also referred to as a "terpene cyclase") refers to any enzyme that enzymatically modifies IPP, DMAPP, or a polyprenyl pyrophosphate, such that a terpenoid precursor compound is produced. The term "terpene synthase" includes enzymes that catalyze the conversion of a prenyl diphosphate into an isoprenoid or isoprenoid precursor.
[0019] The word "pyrophosphate" is used interchangeably herein with "diphosphate." Thus, e.g., the terms "prenyl diphosphate" and "prenyl pyrophosphate" are interchangeable; the terms "isopentenyl pyrophosphate" and "isopentenyl diphosphate" are interchangeable; the terms "farnesyl diphosphate" and "farnesyl pyrophosphate" are interchangeable; etc.
[0020] The term "mevalonate pathway" or "MEV pathway" is used herein to refer to the biosynthetic pathway that converts acetyl-CoA to IPP. The mevalonate pathway comprises enzymes that catalyze the following steps: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA (e.g., by action of acetoacetyl-CoA thiolase); (b) condensing acetoacetyl-CoA with acetyl-CoA to form hydroxymethylglutaryl-Coenzyme A (HMG-CoA) (e.g., by action of HMG-CoA synthase (HMGS)); (c) converting HMG-CoA to mevalonate (e.g., by action of HMG-CoA reductase (HMGR)); (d) phosphorylating mevalonate to mevalonate 5-phosphate (e.g., by action of mevalonate kinase (MK)); (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate (e.g., by action of phosphomevalonate kinase (PMK)); and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate (e.g., by action of mevalonate pyrophosphate decarboxylase (MPD)). The mevalonate pathway is illustrated schematically in Figure 9. The "top half" of the mevalonate pathway refers to the enzymes responsible for the conversion of acetyl-CoA to mevalonate.
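Steps (a) through (f) of the mevalonate pathway can be written down as an ordered data structure, which makes the "top half" of the pathway easy to pick out. The following Python sketch is illustrative only; the `Step` type and the helper names are hypothetical, and the enzyme abbreviations are those given in the definition.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str


# Steps (a)-(f) in pathway order, as listed in the definition.
MEV_PATHWAY: List[Step] = [
    Step("acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    Step("acetoacetyl-CoA", "HMG-CoA", "HMGS"),
    Step("HMG-CoA", "mevalonate", "HMGR"),
    Step("mevalonate", "mevalonate 5-phosphate", "MK"),
    Step("mevalonate 5-phosphate", "mevalonate 5-pyrophosphate", "PMK"),
    Step("mevalonate 5-pyrophosphate", "IPP", "MPD"),
]


def is_linear(pathway: List[Step]) -> bool:
    """Check that each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def top_half(pathway: List[Step]) -> List[Step]:
    """Return the steps up to and including the one that yields mevalonate."""
    idx = next(i for i, step in enumerate(pathway) if step.product == "mevalonate")
    return pathway[: idx + 1]
```

Calling `top_half(MEV_PATHWAY)` returns the thiolase, HMGS, and HMGR steps, which matches the definition of the top half as the conversion of acetyl-CoA to mevalonate.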
[0021] The term "1-deoxy-D-xylulose 5-diphosphate pathway" or "DXP pathway" is used herein to refer to the pathway that converts glyceraldehyde-3-phosphate and pyruvate to IPP and DMAPP through a DXP pathway intermediate, where the DXP pathway comprises enzymes that catalyze the reactions depicted schematically in Figure 10.
[0022] As used herein, the term "prenyl transferase" is used interchangeably with the terms "isoprenyl diphosphate synthase" and "polyprenyl synthase" (e.g., "GPP synthase," "FPP synthase," "GGPP synthase," etc.) to refer to an enzyme that catalyzes the consecutive 1'-4 condensation of isopentenyl diphosphate with allylic primer substrates, resulting in the formation of prenyl diphosphates of various chain lengths.

[0023] The terms "polynucleotide" and "nucleic acid," used interchangeably herein, refer to a
polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
[0024] The terms "peptide," "polypeptide," and "protein" are used interchangeably herein, and refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.
[0025] The term "naturally-occurring" as used herein as applied to a nucleic acid, a cell, or an
organism, refers to a nucleic acid, cell, or organism that is found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is naturally occurring.
[0026] As used herein the term "isolated" is meant to describe a polynucleotide, a polypeptide, or a cell that is in an environment different from that in which the polynucleotide, the polypeptide, or the cell naturally occurs. An isolated genetically modified host cell may be present in a mixed population of genetically modified host cells.
[0027] As used herein, the term "exogenous nucleic acid" refers to a nucleic acid that is not normally or naturally found in and/or produced by a given bacterium, organism, or cell in nature. As used herein, the term "endogenous nucleic acid" refers to a nucleic acid that is normally found in and/or produced by a given bacterium, organism, or cell in nature. An "endogenous nucleic acid" is also referred to as a "native nucleic acid" or a nucleic acid that is "native" to a given bacterium, organism, or cell. For example, the nucleic acids encoding HMGS, mevalonate kinase, and phosphomevalonate kinase represent exogenous nucleic acids to E. coli. These mevalonate pathway nucleic acids were cloned from Saccharomyces cerevisiae. In S. cerevisiae, the gene sequences encoding HMGS, MK, and PMK on the chromosome would be "endogenous" nucleic acids.
[0028] The term "heterologous nucleic acid," as used herein, refers to a nucleic acid wherein at least one of the following is true: (a) the nucleic acid is foreign ("exogenous") to (i.e., not naturally found in) a given host microorganism or host cell; (b) the nucleic acid comprises a nucleotide sequence that is naturally found in (e.g., is "endogenous to") a given host microorganism or host cell (e.g., the nucleic acid comprises a nucleotide sequence that is endogenous to the host microorganism or host cell) but is either produced in an unnatural (e.g., greater than expected or greater than naturally found) amount in the cell, or differs in sequence from the endogenous nucleotide sequence such that the same encoded protein (having the same or substantially the same amino acid sequence) as found endogenously is produced in an unnatural (e.g., greater than expected or greater than naturally found) amount in the cell; (c) the nucleic acid comprises two or more nucleotide sequences or segments that are not found in the same relationship to each other in nature, e.g., the nucleic acid is recombinant.
[0029] The term "heterologous polypeptide," as used herein, refers to a polypeptide that is not
naturally associated with a given polypeptide. For example, an isoprenoid precursor-modifying enzyme that comprises a "heterologous transmembrane domain" refers to an isoprenoid precursor-modifying enzyme that comprises a transmembrane domain that is not normally associated with (e.g., not normally contiguous with; not normally found in the same polypeptide chain with) the isoprenoid precursor-modifying enzyme in nature.
[0030] "Recombinant," as used herein, means that a particular nucleic acid (DNA or RNA) is the
product of various combinations of cloning, restriction, and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. Generally, DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic
oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a
recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Such sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Genomic DNA
comprising the relevant sequences can also be used in the formation of a recombinant gene or
transcriptional unit. Sequences of non-translated DNA may be present 5' or 3' from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see "DNA regulatory sequences", below).
[0031] Thus, e.g., the term "recombinant" polynucleotide or "recombinant" nucleic acid refers to one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
[0032] Similarly, the term "recombinant" polypeptide refers to a polypeptide which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of amino acid sequence through human intervention. Thus, e.g., a polypeptide that comprises a heterologous amino acid sequence is recombinant.

[0033] By "construct" or "vector" is meant a recombinant nucleic acid, generally recombinant DNA, which has been generated for the purpose of the expression and/or propagation of a specific nucleotide sequence(s), or is to be used in the construction of other recombinant nucleotide sequences.
[0034] As used herein, the terms "operon" and "single transcription unit" are used interchangeably to refer to two or more contiguous coding regions (nucleotide sequences that encode a gene product such as an RNA or a protein) that are coordinately regulated by one or more controlling elements (e.g., a promoter). As used herein, the term "gene product" refers to RNA encoded by DNA (or vice versa) or protein that is encoded by an RNA or DNA, where a gene will typically comprise one or more
nucleotide sequences that encode a protein, and may also include introns and other non-coding
nucleotide sequences.
[0035] The terms "DNA regulatory sequences," "control elements," and "regulatory elements," used interchangeably herein, refer to transcriptional and translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, and the like, that provide for and/or regulate expression of a coding sequence and/or production of an encoded polypeptide in a host cell.
[0036] The term "transformation" is used interchangeably herein with "genetic modification" and refers to a permanent or transient genetic change induced in a cell following introduction of new
nucleic acid (i.e., DNA exogenous to the cell). Genetic change ("modification") can be accomplished either by incorporation of the new DNA into the genome of the host cell, or by transient or stable
maintenance of the new DNA as an episomal element. Where the cell is a eukaryotic cell, a permanent genetic change is generally achieved by introduction of the DNA into the genome of the cell. In
prokaryotic cells, permanent changes can be introduced into the chromosome or via extrachromosomal elements such as plasmids and expression vectors, which may contain one or more selectable markers to aid in their maintenance in the recombinant host cell. Suitable methods of genetic modification
include viral infection, transfection, conjugation, protoplast fusion, electroporation, particle gun
technology, calcium phosphate precipitation, direct microinjection, and the like. The choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e., in vitro, ex vivo, or in vivo). A general discussion of these methods can be found in Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.
[0037] "Operably linked" refers to a juxtaposition wherein the components so described are in a
relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression. As used herein, the terms "heterologous promoter" and "heterologous control regions" refer to promoters and other control regions that are not normally associated with a particular nucleic acid in nature. For example, a
"transcriptional control region heterologous to a coding region" is a transcriptional control region that is not normally associated with the coding region in nature.

[0038] A "host cell," as used herein, denotes an in vivo or in vitro eukaryotic cell, a prokaryotic cell, or a cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, which eukaryotic or prokaryotic cells can be, or have been, used as recipients for a nucleic acid (e.g., an expression vector that comprises a nucleotide sequence encoding one or more biosynthetic pathway gene products such as mevalonate pathway gene products), and include the progeny of the original cell which has been
genetically modified by the nucleic acid. It is understood that the progeny of a single cell may not necessarily be completely identical in morphology or in genomic or total DNA complement as the original parent, due to natural, accidental, or deliberate mutation. A "recombinant host cell" (also
referred to as a "genetically modified host cell") is a host cell into which has been introduced a
heterologous nucleic acid, e.g., an expression vector. For example, a subject genetically modified prokaryotic host cell (e.g., a bacterium) is a prokaryotic host cell that is genetically modified by virtue of introduction into a suitable prokaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to (not normally found in nature in) the prokaryotic host cell, or a recombinant nucleic acid that is not normally found in the prokaryotic host cell; and a subject genetically modified eukaryotic host cell is a eukaryotic host cell that is genetically modified by virtue of introduction into a suitable eukaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to the eukaryotic host cell, or a recombinant nucleic acid that is not normally found in the eukaryotic host cell.
[0039] The term "conservative amino acid substitution" refers to the interchangeability in proteins of amino acid residues having similar side chains. For example, a group of amino acids having aliphatic side chains consists of glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains consists of serine and threonine; a group of amino acids having amide-containing side chains consists of asparagine and glutamine; a group of amino acids having aromatic side chains consists of phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains consists of lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains consists of cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
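The side-chain groups listed above translate directly into a lookup table. The Python sketch below is a hypothetical illustration, not part of the specification: it encodes each group with standard one-letter amino acid codes (an assumption, since the text uses full names) and treats a substitution as conservative when both residues share at least one group.

```python
# Side-chain groups from the definition, written with standard one-letter codes
# (G, A, V, L, I = glycine, alanine, valine, leucine, isoleucine, and so on).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}


def shared_groups(a: str, b: str) -> list:
    """Return the names of every side-chain group containing both residues."""
    a, b = a.upper(), b.upper()
    return [
        name
        for name, members in SIDE_CHAIN_GROUPS.items()
        if a in members and b in members
    ]


def is_conservative(a: str, b: str) -> bool:
    """True if a -> b is a substitution between two distinct residues of one group."""
    return a.upper() != b.upper() and bool(shared_groups(a, b))
```

Under this reading, valine to leucine and lysine to arginine are conservative, while lysine to glutamate is not. The narrower "exemplary" groups in the text (for example phenylalanine-tyrosine without tryptophan) could be encoded the same way with a second table.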
[0040] "Synthetic nucleic acids" can be assembled from oligonucleotide building blocks that are
chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments which are then enzymatically assembled to construct the entire gene. "Chemically synthesized," as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. The nucleotide sequence of the nucleic acids can be modified for optimal expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
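The survey of host genes mentioned above, used to determine preferred codons, amounts to counting codon occurrences across known coding sequences. The following Python sketch shows one minimal way to do this; the function names and the toy sequences in the usage note are hypothetical, and a real survey would draw on annotated host genes.

```python
from collections import Counter
from typing import Iterable, Sequence


def codon_counts(coding_sequences: Iterable[str]) -> Counter:
    """Count codons across in-frame coding sequences (DNA or RNA alphabet)."""
    counts: Counter = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        # Ignore any trailing bases that do not complete a codon.
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def preferred_codon(counts: Counter, synonymous_codons: Sequence[str]) -> str:
    """Pick the most frequently observed codon among a set of synonyms."""
    return max(synonymous_codons, key=lambda codon: counts.get(codon, 0))
```

For example, with the toy sequences `["ATGCTGCTGCTCTAA", "ATGCTGTTATAA"]` and the six leucine codons as the synonym set, `preferred_codon` selects `CTG`, the codon observed most often in that sample.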
[0041] A polynucleotide or polypeptide has a certain percent "sequence identity" to another
polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same, and in the same relative position, when comparing the two sequences. Sequence similarity can be determined in a number of different manners. To determine sequence identity, sequences can be aligned using the methods and computer programs, including BLAST, available over the world wide web at ncbi.nlm.nih.gov/BLAST. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10. Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package, from Madison, Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other techniques for alignment are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Of particular interest are alignment programs that permit gaps in the sequence. The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70:173-187 (1997). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. See J. Mol. Biol. 48:443-453 (1970).
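Once two sequences have been aligned by one of the programs named above, percent identity is a simple column count. The Python sketch below assumes the alignment has already been produced and is supplied as two equal-length strings with "-" marking gaps. The choice of denominator (all alignment columns except those that are gaps in both sequences) is an assumption; alignment tools differ on this convention, and the text does not specify one.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns that are gaps in both sequences; they carry no information.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column matches only if the residues are equal and neither is a gap.
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

For the aligned pair `ACGT-A` and `ACCTGA`, four of the six columns match, so the function reports roughly 66.7 percent identity.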
[0042] A nucleic acid is "hybridizable" to another nucleic acid, such as a cDNA, genomic DNA, or
RNA, when a single stranded form of the nucleic acid can anneal to the other nucleic acid under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing
conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor (1989), particularly Chapter 11 and Table 11.1 therein; and Sambrook, J. and Russell, W.,
Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Hybridization conditions and post-hybridization washes are useful to obtain the desired stringency conditions of the hybridization. One set of illustrative post-hybridization washes is a series of washes starting with 6 x SSC (where SSC is 0.15 M NaCl and 15 mM citrate buffer), 0.5% SDS at room temperature for 15 minutes, then repeated with 2 x SSC, 0.5% SDS at 45°C for 30 minutes, and then repeated twice with 0.2 x SSC, 0.5% SDS at 50°C for 30 minutes. Other stringent conditions are obtained by using higher temperatures in which the washes are identical to those above except for the temperature of the final two 30 minute washes in 0.2 x SSC, 0.5% SDS, which is increased to 60°C. Another set of highly stringent conditions uses two final washes in 0.1 x SSC, 0.1% SDS at 65°C. Another example of stringent hybridization conditions is hybridization at 50°C or higher and 0.1 x SSC (15 mM sodium chloride/1.5 mM sodium citrate). Another example of stringent hybridization conditions is overnight incubation at 42°C in a solution: 50% formamide, 5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1 x SSC at about 65°C. Stringent hybridization conditions and post-hybridization wash conditions are hybridization conditions and post-hybridization wash conditions that are at least as stringent as the above representative conditions.
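The wash buffers above are all expressed as multiples of SSC, so the salt concentration at each step follows by simple scaling from the stated 1 x composition (0.15 M NaCl and 15 mM citrate). The Python sketch below works through that arithmetic; the names are hypothetical, and the wash list restates only the first illustrative series from the text.

```python
# Composition of 1 x SSC in millimolar, as defined in the text.
SSC_1X_MM = {"NaCl": 150.0, "citrate": 15.0}


def ssc_composition(fold: float) -> dict:
    """Scale the 1 x SSC composition to the given fold concentration (mM)."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    return {solute: fold * mm for solute, mm in SSC_1X_MM.items()}


# First illustrative wash series: (fold SSC, percent SDS, temperature, minutes).
ILLUSTRATIVE_WASHES = [
    (6.0, 0.5, "room temperature", 15),
    (2.0, 0.5, "45 C", 30),
    (0.2, 0.5, "50 C", 30),
    (0.2, 0.5, "50 C", 30),
]


def salt_never_increases(washes) -> bool:
    """Check that successive washes use equal or lower SSC concentration."""
    folds = [wash[0] for wash in washes]
    return all(a >= b for a, b in zip(folds, folds[1:]))
```

Scaling to 0.1 x gives about 15 mM NaCl and 1.5 mM citrate, which agrees with the parenthetical given for 0.1 x SSC in the text. The series also shows the general pattern that stringency is raised by lowering salt and raising temperature in later washes.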
[0043] Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The
appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of the melting temperature (Tm) for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). Typically, the length for a hybridizable nucleic acid is at least about 10 nucleotides. Illustrative minimum lengths for a hybridizable nucleic acid are: at least about 15 nucleotides; at least about 20 nucleotides; and at least about 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be
adjusted as necessary according to factors such as length of the probe.

[0044] Before the present invention is further described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the
appended claims.
[0045] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
[0046] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the
practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the
methods and/or materials in connection with which the publications are cited.
[0047] It must be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a nucleic acid" includes a plurality of such nucleic acids and reference to "the IPP
biosynthetic pathway enzyme" includes reference to one or more IPP biosynthetic pathway enzymes and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as
antecedent basis for use of such exclusive terminology as "solely," "only" and the like in connection with the recitation of claim elements, or use of a "negative" limitation.
[0048] The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently
confirmed.

DETAILED DESCRIPTION
[0049] The present invention provides methods of producing an isoprenoid or an isoprenoid precursor in a host cell that comprises a biosynthetic pathway that converts a substrate to isopentenyl
pyrophosphate (IPP), where the biosynthetic pathway is modified to include a synthetic intergenic region (IGR) between at least two coding regions encoding enzymes in the biosynthetic pathway. The present invention further provides recombinant nucleic acid constructs comprising a synthetic IGR, and genetically modified host cells comprising a synthetic IGR.
[0050] One method of making an isoprenoid or an isoprenoid precursor is to culture a host cell, where the host cell is capable of making the isoprenoid or isoprenoid precursor. Because the biosynthetic pathway for making an isoprenoid or an isoprenoid precursor involves multiple enzymes, the flux through the pathway may not be optimum or properly balanced. One method of correcting such imbalance is to modulate the number or stability of either the RNA transcript or the resulting enzyme. This can be achieved by the presence of a synthetic IGR, as described herein.
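The placement of a synthetic IGR "between at least two coding regions" can be pictured as inserting one element into an ordered list of operon parts. The Python sketch below is purely schematic: the function name, the part labels, and the label "synthetic-IGR" are hypothetical placeholders, and no actual sequence or design rule from the specification is represented.

```python
from typing import List


def insert_igr(coding_regions: List[str], igr: str, position: int) -> List[str]:
    """Place igr between coding_regions[position] and coding_regions[position + 1]."""
    if len(coding_regions) < 2:
        raise ValueError("an IGR needs at least two coding regions to sit between")
    if not 0 <= position < len(coding_regions) - 1:
        raise ValueError("position must index a junction between two coding regions")
    parts = list(coding_regions)  # copy so the input list is left unchanged
    parts.insert(position + 1, igr)
    return parts
```

For instance, `insert_igr(["MK", "PMK", "MPD"], "synthetic-IGR", 0)` yields the part order MK, synthetic-IGR, PMK, MPD, with the IGR sitting at the first gene junction.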
[0051] Isoprenoid compounds are synthesized from a universal five carbon precursor, isopentenyl pyrophosphate (IPP). There are two major pathways for converting a substrate to IPP: 1) the
"mevalonate pathway," which converts acetyl-CoA to IPP; and 2) the "1-deoxy-D-xylulose 5-diphosphate pathway" (also referred to as the "DXP pathway"), which converts D-glyceraldehyde-3-phosphate and pyruvate to IPP and DMAPP.

[0052] Mevalonate pathway enzymes are depicted in Figure 9. The mevalonate pathway comprises the following enzymatic reactions: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to
mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to
isopentenyl pyrophosphate. Enzymes that carry out these reactions include acetoacetyl-CoA thiolase, hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), mevalonate kinase (MK), phosphomevalonate kinase (PMK), and mevalonate pyrophosphate
decarboxylase (MPD).
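As a rough illustration of where a synthetic IGR could be placed, the sketch below lists the six mevalonate pathway enzymes named above in reaction order (a) through (f) and enumerates the junctions between consecutive coding regions. The operon-like ordering and the names MEV_ENZYMES and igr_junctions are assumptions made for illustration only; the arrangement of coding regions in an actual construct may differ.

```python
# Enzymes listed in the order of reactions (a)-(f). For illustration only,
# they are assumed to be encoded by consecutive coding regions.
MEV_ENZYMES = [
    "acetoacetyl-CoA thiolase",
    "HMGS",
    "HMGR",
    "MK",
    "PMK",
    "MPD",
]


def igr_junctions(enzymes):
    """Return (upstream, downstream) pairs of adjacent coding regions."""
    return list(zip(enzymes, enzymes[1:]))


if __name__ == "__main__":
    for upstream, downstream in igr_junctions(MEV_ENZYMES):
        print(f"{upstream} | IGR | {downstream}")
```

Six coding regions give five junctions, which is the largest number of positions at which a synthetic IGR could separate adjacent coding regions under this assumed ordering.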
[0053] Figure 10 depicts schematically the DXP pathway, in which pyruvate and D-glyceraldehyde-3-phosphate are converted via a series of reactions to IPP and DMAPP. The pathway involves action of the following enzymes: 1-deoxy-D-xylulose-5-phosphate synthase (Dxs), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE), 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF), 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG), and isopentenyl diphosphate isomerase (IspH).
[0054] Eukaryotic cells other than plant cells use the mevalonate pathway exclusively to convert acetyl-CoA to IPP, which is subsequently isomerized to DMAPP. Plants use both the mevalonate and the DXP pathways for isoprenoid synthesis. Prokaryotes, with some exceptions, use the DXP pathway to produce IPP and DMAPP separately through a branch point.
[0055] The IPP produced by the mevalonate pathway can be isomerized to produce DMAPP. The IPP and/or the DMAPP can be acted on by prenyltransferases to produce polyprenyl pyrophosphates. For example, as shown in Figure 11, IPP or DMAPP can be modified by prenyl transferases to generate the polyprenyl diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP). GPP and FPP are further modified by terpene synthases to generate monoterpenes and sesquiterpenes, respectively; and GGPP is further modified by terpene synthases to generate
diterpenes and carotenoids. IPP and DMAPP are generated by one of two pathways: the mevalonate
(MEV) pathway and the 1-deoxy-D-xylulose-5-phosphate (DXP) pathway.
METHODS OF ENHANCING PRODUCTION OF ISOPRENOIDS AND ISOPRENOID PRECURSORS
[0056] The present invention provides methods of producing an isoprenoid or an isoprenoid precursor in a host cell that comprises a biosynthetic pathway that converts a substrate to IPP (an "IPP
biosynthetic pathway"), where the IPP biosynthetic pathway is modified to include a synthetic IGR between at least two coding regions encoding enzymes in the biosynthetic pathway. An IPP
biosynthetic pathway that is modified to include a synthetic IGR between at least two coding regions encoding enzymes in the IPP biosynthetic pathway is also referred to herein as an "IGR-modified IPP biosynthetic pathway." A subject method generally involves culturing a host cell in vitro in a suitable medium, wherein the host cell comprises a biosynthetic pathway that converts a substrate to IPP. The IPP biosynthetic pathway is modified to include at least one synthetic IGR, where at least one synthetic IGR is disposed between a set of two coding regions encoding two enzymes in the biosynthetic
pathway. Thus, the host cell is "genetically modified," as it includes a biosynthetic pathway modified to include a synthetic IGR.
[0057] Illustrative examples of an IPP biosynthetic pathway include: 1) an IPP biosynthetic pathway that is endogenous to a cell (e.g., an endogenous mevalonate pathway present in a eukaryotic cell that normally produces IPP via a mevalonate pathway, where such cells include, e.g., a yeast cell, a fungal cell, etc.; and an endogenous DXP pathway present in a prokaryotic cell that normally produces IPP via a DXP pathway); 2) an IPP biosynthetic pathway that is exogenous to a cell (e.g., where the cell has been genetically modified with one or more exogenous nucleic acids comprising nucleotide sequences encoding one or more IPP biosynthetic pathway enzymes heterologous to the cell, e.g., an exogenous mevalonate pathway in a prokaryotic cell that does not normally produce IPP via a mevalonate
pathway; e.g., an exogenous DXP pathway in a eukaryotic cell that does not normally produce IPP via a DXP pathway); and 3) a modified endogenous IPP biosynthetic pathway, e.g., where an endogenous
DXP or mevalonate pathway is genetically modified.
[0058] In some embodiments, the biosynthetic pathway that converts a substrate to IPP includes a single synthetic IGR disposed between a first coding region and a second coding region, where the first coding region comprises a nucleotide sequence encoding a first enzyme in the biosynthetic pathway and the second coding region comprises a nucleotide sequence encoding a second enzyme in the
biosynthetic pathway. The second enzyme is one that acts on a product of the first enzyme.
[0059] In other embodiments, the biosynthetic pathway that converts a substrate to IPP includes two or more synthetic IGRs, each disposed between two coding regions encoding enzymes in the pathway, where the coding regions encode at least three different enzymes in the pathway. For example, in some embodiments, the biosynthetic pathway that converts a substrate to IPP includes a first synthetic IGR disposed between a first coding region and a second coding region; and a second synthetic IGR
disposed between the second coding region and a third coding region, where each of the first, second, and third coding regions comprises nucleotide sequences encoding different enzymes in the
biosynthetic pathway. The second enzyme is one that acts on a product of the first enzyme; and the third enzyme acts on a product of the second enzyme.
[0060] In other embodiments, the biosynthetic pathway that converts a substrate to IPP includes two or more synthetic IGRs, each disposed between two coding regions encoding enzymes in the pathway, where the coding regions encode at least four different enzymes in the pathway. For example, in some embodiments, the biosynthetic pathway that converts a substrate to IPP includes a first synthetic IGR disposed between a first coding region and a second coding region; and a second synthetic IGR
disposed between a third coding region and a fourth coding region, where each of the first, second, third, and fourth coding regions comprises nucleotide sequences encoding different enzymes in the biosynthetic pathway.

[0061] As noted above, a synthetic IGR is disposed between a first coding region encoding a gene product A (where gene product A is an mRNA encoding an enzyme in the IPP biosynthetic pathway) and a second coding region encoding a gene product B (where gene product B is an mRNA encoding another enzyme in the IPP biosynthetic pathway). In some embodiments, the presence of a synthetic
IGR between a first coding region and a second coding region reduces the level of gene product A
relative to gene product B, and thus provides for a reduced activity level of enzyme A relative to
enzyme B. For example, the presence of a synthetic IGR reduces the activity level of enzyme A by from about 5% to about 10%, from about 10% to about 15%, from about 15% to about 20%, from about 20% to about 25%, from about 25% to about 30%, from about 30% to about 40%, from about 40% to about 50%, or more than 50%, compared to the activity level of enzyme B.
[0062] In other embodiments, the presence of a synthetic IGR between a first coding region and a second coding region reduces the level of gene product B relative to gene product A, and thus provides for a reduced activity level of enzyme B relative to enzyme A. For example, the presence of a synthetic IGR reduces the activity level of enzyme B by from about 5% to about 10%, from about 10% to about 15%, from about 15% to about 20%, from about 20% to about 25%, from about 25% to about 30%, from about 30% to about 40%, from about 40% to about 50%, or more than 50%, compared to the activity level of enzyme A.
[0063] In other embodiments, the presence of a synthetic IGR between a first coding region and a second coding region reduces the stability of gene product B relative to gene product A, or reduces the stability of gene product A relative to gene product B, such that in either case, the activity levels of enzyme A and enzyme B are substantially the same, e.g., the activity levels of the first and second
enzymes differ from one another by less than about 20%, less than about 15%, less than about 10%, less than about 5%, less than about 2%, or less than about 1%.
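The percentage comparisons in the preceding paragraphs can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the activity values are assumed to be measured in the same units, and the convention of expressing the difference relative to the larger of the two activities is an assumption, since the text does not specify one.

```python
def percent_reduction(activity_a, activity_b):
    """Percent by which the activity of enzyme A is below that of enzyme B."""
    if activity_b <= 0:
        raise ValueError("activity_b must be positive")
    return 100.0 * (activity_b - activity_a) / activity_b


def substantially_same(activity_a, activity_b, tolerance_pct=20.0):
    """True if the two activities differ by less than tolerance_pct.

    The difference is taken relative to the larger activity, which is an
    assumed convention chosen for this sketch.
    """
    larger = max(activity_a, activity_b)
    if larger <= 0:
        raise ValueError("activities must be positive")
    return 100.0 * abs(activity_a - activity_b) / larger < tolerance_pct


if __name__ == "__main__":
    print(percent_reduction(75.0, 100.0))    # A is 25% below B
    print(substantially_same(90.0, 100.0))   # within the 20% tolerance
```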
[0064] In other embodiments, the presence of a synthetic IGR between a first coding region and a second coding region reduces the stability of gene product A relative to gene product B, and thus provides for a reduced activity level of enzyme A relative to enzyme B. For example, the presence of a synthetic IGR reduces the activity level of enzyme A by from about 5% to about 10%, from about 10% to about 15%, from about 15% to about 20%, from about 20% to about 25%, from about 25% to about 30%, from about 30% to about 40%, from about 40% to about 50%, or more than 50%, compared to the activity level of enzyme B.
[0065] In other embodiments, the presence of a synthetic IGR between a first coding region and a second coding region reduces the stability of gene product B relative to gene product A, and thus
provides for a reduced activity level of enzyme B relative to enzyme A. For example, the presence of a synthetic IGR reduces the activity level of enzyme B by from about 5% to about 10%, from about 10% to about 15%, from about 15% to about 20%, from about 20% to about 25%, from about 25% to about 30%, from about 30% to about 40%, from about 40% to about 50%, or more than 50%, compared to the activity level of enzyme A.

[0066] In other embodiments, the presence of a synthetic IGR between a first coding region and a second coding region reduces the stability of gene product B relative to gene product A, or
reduces the stability of gene product A relative to gene product B, such that in either case, the activity levels of enzyme A and enzyme B are substantially the same, e.g., the activity levels of the first and second enzymes differ from one another by less than about 20%, less than about 15%, less than about
10%, less than about 5%, less than about 2%, or less than about 1%.
[0067] In some embodiments, the presence of the at least one synthetic IGR provides for production of one or more intermediates in the biosynthetic pathway at a level that is non-toxic to the host cell, e.g., at a level that does not substantially reduce growth of the cell. For example, the presence of the at least one synthetic IGR provides for production of one or more intermediates in the biosynthetic pathway at a level that inhibits growth of the cell by less than about 30%, less than about 25%, less than about
20%, less than about 15%, less than about 10%, less than about 5% or less than about 2%, compared to inhibition of cell growth in the absence of the synthetic IGR(s).
[0068] A synthetic IGR can have a length in a range of from about 15 nucleotides (nt) to about 500 nt, e.g., from about 15 nt to about 20 nt, from about 20 nt to about 25 nt, from about 25 nt to about 50 nt, from about 50 nt to about 75 nt, from about 75 nt to about 100 nt, from about 100 nt to about 125 nt, from about 125 nt to about 150 nt, from about 150 nt to about 175 nt, from about 175 nt to about 200 nt, from about 200 nt to about 225 nt, from about 225 nt to about 250 nt, from about 250 nt to about 275 nt, from about 275 nt to about 300 nt, from about 300 nt to about 350 nt, from about 350 nt to about 400 nt, from about 400 nt to about 450 nt, or from about 450 nt to about 500 nt. In some embodiments, a
synthetic IGR can have a length in a range of from about 75 nt to about 275 nt. In other embodiments, a synthetic IGR can have a length in a range of from about 25 nt to about 135 nt.
[0069] In some embodiments, a synthetic IGR is a nucleotide sequence having a length of from about 15 nt to about 500 nt and comprising a nucleotide sequence that forms a hairpin. In some embodiments, a synthetic IGR is a nucleotide sequence having a length of from about 15 nt to about 500 nt and
comprising a nucleotide sequence that is a riboendonuclease recognition site. In some embodiments, a synthetic IGR is a nucleotide sequence having a length of from about 15 nt to about 500 nt, wherein the synthetic IGR comprises both a nucleotide sequence that forms a hairpin and a nucleotide sequence that is a riboendonuclease recognition site.
[0070] As used herein, the term "hairpin" refers to a three-dimensional structure formed by a first complementary nucleotide sequence and a second complementary nucleotide sequence in the same nucleic acid sequence, where the nucleic acid comprising the hairpin-forming nucleotide sequence folds back on itself, such that the first and second complementary nucleotide sequences form hydrogen bonds with one another. The first and second complementary nucleotide sequences can be immediately
adjacent to one another, or can be separated by from one to 200 nucleotides (e.g., from two to about 100, from six to about 50, from about 10 to about 20, etc.). Thus, e.g., the hairpin structure may contain a loop portion positioned between the two sequences that form the duplex. The loop can vary in length. In some embodiments the loop is from about 5 nt to about 10 nt, from about 10 nt to about 100 nt, from about 15 nt to about 50 nt, or from about 20 nt to about 30 nt in length, and the like. The first and second complementary sequences may be 100% complementary, or may include one or more
mismatches. The first and second complementary sequences form at least one base pair, and can form from one to 20 or more base pairs, where the base pairs can be contiguous or separated by one or more nucleotides.
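The hairpin definition above can be sketched as a simple check on an RNA string. This is a minimal illustration under stated assumptions: the sequence is taken to consist of a first stem sequence, a loop, and a second stem sequence with nothing else; only Watson-Crick pairs (A-U, G-C) are counted, so G-U wobble pairs are ignored; and the function names and example sequence are hypothetical, not taken from the sequence listing. It does not predict folding energetics.

```python
# Watson-Crick base pairs for RNA; G-U wobble pairs are deliberately ignored.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_stem_pairs(first, second):
    """Count base pairs formed when `first` pairs antiparallel with `second`.

    Both sequences are written 5'->3', so position i of `first` is compared
    with position i of the reversed `second`.
    """
    return sum((a, b) in PAIRS for a, b in zip(first, reversed(second)))


def is_simple_hairpin(rna, stem_len, loop_len, max_mismatches=0):
    """Check whether rna = stem + loop + complementary stem can fold back."""
    if len(rna) != 2 * stem_len + loop_len:
        return False
    first = rna[:stem_len]
    second = rna[stem_len + loop_len:]
    mismatches = stem_len - count_stem_pairs(first, second)
    return mismatches <= max_mismatches


if __name__ == "__main__":
    example = "GGGAC" + "UUCGA" + "GUCCC"  # hypothetical 5-bp stem, 5-nt loop
    print(is_simple_hairpin(example, stem_len=5, loop_len=5))
```

The max_mismatches parameter reflects the statement that the complementary sequences may be fully complementary or may include one or more mismatches.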
[0071] A hairpin can be a short hairpin or a long hairpin. A hairpin can be included in a secondary structure such as a stem-loop structure, an internal loop, a bulge loop, a branched structure, or a
pseudoknot, multiple stem loop structures, cloverleaf type structures or any three dimensional structure that includes a hairpin. A synthetic IGR can include a single hairpin, or more than one hairpin, e.g., two, three, four, or more hairpins.
[0072] A riboendonuclease recognition site includes an RNAse III recognition site, an RNAseE
recognition site, and the like. RNAse recognition sequences are known in the art. In some
embodiments, the riboendonuclease recognition sequence is a recognition sequence for RNAse III. In other embodiments, the riboendonuclease recognition sequence is a recognition sequence for RNAseE.

[0073] In some embodiments, the synthetic IGR comprises, in order from 5'-3', a first hairpin-forming nucleotide sequence; an RNAse recognition sequence; and a second hairpin-forming nucleotide
sequence.
[0074] Suitable synthetic IGR nucleotide sequences include, but are not limited to, a nucleotide
sequence having at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% nucleotide sequence identity to a
nucleotide sequence set forth in one of SEQ ID NOs:7, 8, 11, 12, 15, 16, 19, 20, and 62-77. In some embodiments, a synthetic IGR comprises a nucleotide sequence having at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% nucleotide sequence identity to a nucleotide sequence set forth in any one of SEQ ID
NOs:5, 6, 9, 10, 13, 14, 17, or 18. In some embodiments, a synthetic IGR comprises a nucleotide
sequence having at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% nucleotide sequence identity to a
nucleotide sequence set forth in any one of SEQ ID NOs:7, 8, 11, 12, 15, 16, 19, or 20. In some
embodiments, a synthetic IGR comprises a nucleotide sequence having at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% nucleotide sequence identity to a nucleotide sequence set forth in any one of SEQ ID
NOs:62-77.
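For orientation, percent nucleotide sequence identity can be sketched as the fraction of matching positions between two aligned sequences. The sketch below assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and divides by the alignment length; both the function name and that denominator convention are assumptions, and alignment tools may compute identity differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base.

    Inputs must be pre-aligned strings of equal length; '-' denotes a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 10-nt sequences differing at one position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```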
[0075] In some embodiments, the host cell produces IPP and/or mevalonate via the mevalonate
pathway. In some embodiments, the host cell comprises a mevalonate pathway that comprises at least one synthetic IGR, where the at least one synthetic IGR is disposed between two coding regions
encoding two enzymes in the mevalonate pathway.

[0076] In some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises a single synthetic IGR, where the single synthetic IGR is between a coding region comprising a nucleotide sequence encoding acetoacetyl-CoA thiolase and a coding region comprising a nucleotide sequence encoding HMGS. In other embodiments, the host cell comprises a mevalonate pathway that comprises a single synthetic IGR, where the single synthetic IGR is between a coding region comprising a
nucleotide sequence encoding HMGS, and a coding region comprising a nucleotide sequence encoding HMGR. In other embodiments, the single synthetic IGR is between a coding region comprising a
nucleotide sequence encoding HMGR and a coding region comprising a nucleotide sequence encoding MK; between a coding region comprising a nucleotide sequence encoding MK and a coding region comprising a nucleotide sequence encoding PMK; or between a coding region comprising a nucleotide sequence encoding PMK and a coding region comprising a nucleotide sequence encoding MPD.
[0077] In other embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises two or more synthetic IGRs, where the two or more synthetic IGRs are each disposed between two coding regions encoding enzymes in the pathway. For example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region comprising a nucleotide sequence encoding acetoacetyl CoA thiolase and a coding region comprising a nucleotide sequence encoding HMGS; and a second synthetic IGR between a coding region comprising a nucleotide
sequence encoding HMGS and a coding region comprising a nucleotide sequence encoding HMGR, where the first and second IGR can be the same or different
(e.g., have the same nucleotide sequence or two different nucleotide sequences). As another example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region comprising a nucleotide sequence encoding HMGS and a coding region
comprising a nucleotide sequence encoding HMGR; and a second synthetic IGR between a coding region comprising a nucleotide sequence encoding HMGR and a coding region comprising a nucleotide sequence encoding MK. As another example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region comprising a nucleotide sequence encoding acetoacetyl-CoA thiolase and a coding region comprising a nucleotide sequence encoding HMGS; and a second synthetic IGR between a coding region comprising a nucleotide
sequence encoding HMGR and a coding region comprising a nucleotide sequence encoding MK.
[0078] In some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a mevalonate pathway, where the one or more nucleic acids comprises three, four, or five synthetic IGRs, where the three, four, or five synthetic IGRs are each disposed between two coding regions encoding enzymes in the pathway.

[0079] In some embodiments, a host cell that comprises one or more nucleic acids comprising
nucleotide sequences encoding a mevalonate pathway is a host cell that normally produces IPP via a mevalonate pathway. In other embodiments, a host cell that comprises one or more nucleic acids
comprising nucleotide sequences encoding a mevalonate pathway is a host cell that does not normally produces IPP via a mevalonate pathway.
[0080] In some embodiments, the host cell produces IPP via a DXP pathway. In some embodiments, the host cell comprises a DXP pathway that comprises at least one synthetic IGR, where the at least one synthetic IGR is disposed between two coding regions encoding two enzymes in the DXP pathway.
[0081] In some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises a single synthetic IGR, where the single synthetic IGR is between a coding region comprising a nucleotide sequence encoding Dxs and a coding region comprising a nucleotide sequence encoding Dxr. In other
embodiments, the host cell comprises a DXP pathway that comprises a single synthetic IGR, where the single synthetic IGR is between a coding region comprising a nucleotide sequence encoding Dxr, and a coding region comprising a nucleotide sequence encoding IspD. In other embodiments, the single synthetic IGR is between a coding region comprising a nucleotide sequence encoding IspD and a
coding region comprising a nucleotide sequence encoding IspF; between a coding region comprising a nucleotide sequence encoding IspF and a coding region comprising a nucleotide sequence encoding
IspG; or between a coding region comprising a nucleotide sequence encoding IspG and a coding region comprising a nucleotide sequence encoding IspH.
[0082] In other embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises two or more synthetic IGRs, where the two or more synthetic IGRs are each disposed between two coding regions encoding enzymes in the pathway. For example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region comprising a nucleotide
sequence encoding Dxs and a coding region comprising a nucleotide sequence encoding Dxr; and a second synthetic IGR between a coding region comprising a nucleotide sequence encoding Dxr and
a coding region comprising a nucleotide sequence encoding IspD, where the first and second IGR can be the same or different (e.g., have the same nucleotide
sequence or two different nucleotide sequences). As another example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region
comprising a nucleotide sequence encoding Dxr and a coding region comprising a nucleotide sequence encoding IspD; and a second synthetic IGR between a coding region comprising a nucleotide sequence encoding IspD and a coding region comprising a nucleotide sequence encoding IspE. As another
example, in some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises a first synthetic IGR between a coding region comprising a nucleotide sequence encoding Dxs and a coding region comprising a nucleotide sequence encoding Dxr; and a second synthetic IGR between a coding region comprising a nucleotide sequence encoding IspD and a coding region comprising a nucleotide sequence encoding IspE.
[0083] In some embodiments, the host cell comprises one or more nucleic acids comprising nucleotide sequences encoding a DXP pathway, where the one or more nucleic acids comprises three, four, or five synthetic IGRs, where the three, four, or five synthetic IGRs are each disposed between two coding regions encoding enzymes in the pathway.
[0084] In some embodiments, a host cell that comprises one or more nucleic acids comprising
nucleotide sequences encoding a DXP pathway is a host cell that normally produces IPP via a DXP pathway. In other embodiments, a host cell that comprises one or more nucleic acids comprising
nucleotide sequences encoding a DXP pathway is a host cell that does not normally produce IPP via a DXP pathway.
[0085] As noted above, the host cell that produces an isoprenoid or isoprenoid precursor is genetically modified such that it produces the isoprenoid or isoprenoid precursor via an IPP biosynthetic pathway that has been modified to include one or more synthetic IGRs. The genetically modified host cell is a genetically modified version of a parent host cell.
[0086] The IPP biosynthetic pathway is in some embodiments substantially the same as an endogenous pathway but for the inclusion of the one or more synthetic IGRs, e.g., the IPP biosynthetic pathway comprises nucleotide sequences encoding enzymes that are endogenous to the host cell. For example, in some embodiments, the host cell is a prokaryotic cell that normally produces IPP via an endogenous
DXP pathway, and the IGR-modified IPP biosynthetic pathway comprises the endogenous DXP
pathway, modified to include one or more synthetic IGRs. As another example, the host cell is a
eukaryotic cell (e.g., a yeast cell) that normally produces IPP via an endogenous mevalonate pathway, and the IGR-modified IPP biosynthetic pathway comprises the endogenous mevalonate pathway,
modified to include one or more synthetic IGRs.
[0087] In other embodiments, the IGR-modified IPP biosynthetic pathway comprises both synthetic
IGR(s) and nucleotide sequences encoding enzymes that are heterologous to the cell. For example, in some embodiments, the host cell is a prokaryotic cell that does not normally synthesize IPP via a
mevalonate pathway; and the host cell is genetically modified with an exogenous mevalonate pathway that includes one or more synthetic IGRs and nucleotide sequences encoding mevalonate pathway
enzymes that are heterologous to the cell. As another example, the host cell is a eukaryotic cell (e.g., a yeast cell) that does not normally synthesize IPP via a DXP pathway; and the host cell is genetically modified with an exogenous DXP pathway that includes one or more synthetic IGRs and nucleotide sequences encoding DXP pathway enzymes that are heterologous to the cell.

[0088] Production of an isoprenoid or an isoprenoid precursor is increased in the genetically modified host cell, compared to a control, parent cell. Thus, e.g., production of an isoprenoid or isoprenoid precursor is increased by at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, in the genetically modified host cell, compared to the control host cell.
[0089] For example, in some embodiments, the level of mevalonate produced in a subject genetically modified host cell is at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, greater than the level of mevalonate produced in a control cell. For example, in some embodiments, the level of mevalonate produced in a subject genetically modified host cell is greater than about 275 mM, e.g., from about 280 mM to about 290 mM, from about 290 mM to about 300 mM, from about 300 mM to about 350 mM, from about 350 mM to about 400 mM, from about 400 mM to about 450 mM, from about 450 mM to about 500 mM, from about 500 mM to about 550 mM, or from about 550 mM to about 600 mM, or greater, at 24 hours in culture, where the concentrations are normalized to OD (e.g., OD600).
[0090] As another example, in some embodiments, the level of IPP produced in a subject genetically modified host cell is at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, greater than the level of IPP produced in a control cell.
[0091] As another example, in some embodiments, the level of an isoprenoid compound produced in a subject genetically modified host cell is at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, greater than the level of the isoprenoid compound produced in a control cell.
[0092] In some embodiments, the growth rate of a subject genetically modified host cell is greater than the growth rate of a control cell. For example, in some embodiments, a subject genetically modified host cell grows at a rate that is at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100% or 2-fold, at least about 5-fold, at least about 10-fold, or more, higher than the growth rate of a control cell.

Cell growth is readily determined using well-known methods, e.g., optical density (OD) measurement at about 600 nm (OD600) of liquid cultures of bacteria; colony size; growth rate; and the like.
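One common way to turn OD600 readings into a growth rate is the specific growth rate, mu = ln(OD_end / OD_start) / elapsed time. The sketch below assumes that both readings fall within exponential growth and that OD600 is proportional to cell density over the measured range; the function names and the example readings are hypothetical and are not data from this application.

```python
import math


def specific_growth_rate(od_start, od_end, hours):
    """Specific growth rate (per hour) from two OD600 readings.

    Assumes exponential growth between the two time points.
    """
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("OD readings and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def percent_faster(rate_modified, rate_control):
    """Percent by which the modified-cell growth rate exceeds the control."""
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (rate_modified - rate_control) / rate_control


if __name__ == "__main__":
    # Hypothetical readings: OD600 rises from 0.1 to 0.8 over 3 hours.
    mu = specific_growth_rate(0.1, 0.8, 3.0)
    print(f"specific growth rate: {mu:.3f} per hour")
```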
[0093] As noted above, in some embodiments, the presence of a synthetic IGR between a first coding region and a second coding region provides for altered levels of a gene product encoded by the first coding region relative to a gene product encoded by the second coding region. For example, in some embodiments, the level of HMGS mRNA in a genetically modified host cell is less than the level of a mevalonate pathway mRNA other than HMGS mRNA in the genetically modified host cell. For
example, in some embodiments, the level of HMGS mRNA is less than the level of acetoacetyl-CoA thiolase mRNA in the genetically modified host cell. In some embodiments, the level of HMGS protein in a genetically modified host cell is less than the level of a mevalonate pathway protein other than
HMGS protein in the genetically modified host cell. For example, in some embodiments, the level of HMGS protein in a genetically modified host cell is less than the level of acetoacetyl-CoA thiolase protein in the genetically modified host cell.
[0094] In some embodiments, the level of HMGS mRNA in a subject genetically modified host cell is from about 10% less to about 15% less, from about 15% less to about 20% less, from about 20% less to about 30% less, from about 30% less to about 40% less, from about 40% less to about 50% less, or from about 50% less to about 60% less, than the level of acetoacetyl-CoA thiolase mRNA in the
genetically modified host cell. In some embodiments, the level of HMGS protein in a subject
genetically modified host cell is from about 10% less to about 15% less, from about 15% less to about 20% less, from about 20% less to about 30% less, from about 30% less to about 40% less, from about 40% less to about 50% less, from about 50% less to about 60% less, or from about 60% less to about
70% less, than the level of acetoacetyl-CoA thiolase protein in the genetically modified host cell.
[0095] In some embodiments, the level of both HMGS mRNA and HMGR mRNA in a genetically modified host cell is less than the level of a mevalonate pathway mRNA other than HMGS mRNA and HMGR mRNA in the genetically modified host cell. For example, in some embodiments, the level of both HMGS mRNA and HMGR mRNA is less than the level of acetoacetyl-CoA thiolase mRNA in the genetically modified host cell. In some embodiments, the level of both HMGS protein and HMGR protein in a genetically modified host cell is less than the level of a mevalonate pathway protein other than HMGS and HMGR proteins in the genetically modified host cell. For example, in some
embodiments, the level of both HMGS protein and HMGR protein in a genetically modified host cell is less than the level of acetoacetyl-CoA thiolase protein in the genetically modified host cell.
[0096] In some embodiments, the level of both HMGS mRNA and HMGR mRNA in a subject
genetically modified host cell is from about 10% less to about 15% less, from about 15% less to about 20% less, from about 20% less to about 30% less, from about 30% less to about 40% less, from about 40% less to about 50% less, or from about 50% less to about 60% less, than the level of acetoacetyl-CoA thiolase mRNA in the genetically modified host cell. In some embodiments, the level of both
HMGS protein and HMGR protein in a subject genetically modified host cell is from about 10% less to about 15% less, from about 15% less to about 20% less, from about 20% less to about 30% less, from about 30% less to about 40% less, from about 40% less to about 50% less, from about 50% less to
about 60% less, or from about 60% less to about 70% less, than the level of acetoacetyl-CoA thiolase protein in the genetically modified host cell.
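As a rough illustration of how a measured level might be compared against the ranges recited above, the following sketch computes how much lower one level is than a reference level and looks up the matching range. The helper names and the measurement values are hypothetical, and treating the "about" boundaries as exact and inclusive is a simplifying assumption.

```python
# Ranges (percent less than acetoacetyl-CoA thiolase) as recited for mRNA;
# the protein ranges add one further interval.
MRNA_RANGES = [(10, 15), (15, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
PROTEIN_RANGES = MRNA_RANGES + [(60, 70)]


def percent_less(level, reference_level):
    """Percent by which `level` is lower than `reference_level`."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * (reference_level - level) / reference_level


def matching_range(pct, ranges):
    """Return the first (low, high) range containing pct, or None.
    Boundaries are treated as inclusive (an assumption)."""
    for low, high in ranges:
        if low <= pct <= high:
            return (low, high)
    return None


# Hypothetical relative levels in arbitrary units (thiolase set to 100).
hmgs_mrna_reduction = percent_less(65, 100)
hmgs_protein_reduction = percent_less(35, 100)
```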
[0097] In some embodiments, a subject method of increasing production of an isoprenoid compound, or an isoprenoid precursor compound, in a host cell comprises decreasing the level of HMGS activity in the cell and/or decreasing the level of HMGR activity in the cell, compared to the levels in a control host cell. Decreasing the level of HMGS activity in a cell includes decreasing the total amount of
HMGS polypeptide within the cell; and decreasing the specific activity of HMGS polypeptide within the cell. Thus, in some embodiments, the level of HMGS activity in a cell is decreased by decreasing the total amount of HMGS in the cell. In other embodiments, the level of HMGS activity in a cell is decreased by decreasing the specific activity of HMGS in the cell. Similarly, decreasing the level of
HMGR activity in a cell includes decreasing the total amount of HMGR polypeptide within the cell;
and decreasing the specific activity of HMGR polypeptide within the cell. Thus, in some embodiments, the level of HMGR activity in a cell is decreased by decreasing the total amount of HMGR in the cell. In other embodiments, the level of HMGR activity in a cell is decreased by decreasing the specific activity of HMGR in the cell.
[0098] Isoprenoids that can be produced using the method of the invention include, but are not limited to, monoterpenes, including but not limited to, limonene, citronellol, geraniol, menthol, perillyl alcohol, linalool, thujone; sesquiterpenes, including but not limited to, periplanone B, ginkgolide B,
amorphadiene, artemisinin, artemisinic acid, valencene, nootkatone, epi-cedrol, epi-aristolochene, farnesol, gossypol, santonin, periplanone, and forskolin; diterpenes, including but not limited to,
casbene, eleutherobin, paclitaxel, prostratin, and pseudopterosin; triterpenes, including but not limited to, arbruside E, bruceantin, testosterone, progesterone, cortisone, digitoxin. Isoprenoids also include, but are not limited to, carotenoids such as lycopene, α- and β-carotene, α- and β-cryptoxanthin, bixin, zeaxanthin, astaxanthin, and lutein. Isoprenoids also include, but are not limited to, triterpenes, steroid compounds, and compounds that are composed of isoprenoids modified by other chemical groups, such as mixed terpene-alkaloids, menaquinones (e.g., vitamin K-2), and coenzyme Q-10.
HETEROLOGOUS NUCLEIC ACIDS
[0099] The present invention provides nucleic acids that are useful in generating a genetically
modified host cell, for use in producing an isoprenoid or isoprenoid precursor, e.g., in a subject method. A subject nucleic acid comprises nucleotide sequences encoding one or more synthetic IGRs. In some embodiments, a subject nucleic acid comprises nucleotide sequences encoding one or more synthetic
IGRs; and nucleotide sequences encoding one or more enzymes in an IPP biosynthetic pathway (e.g., a mevalonate pathway; or a DXP pathway), where a synthetic IGR is located 5' to at least one nucleotide sequence encoding an enzyme in the IPP biosynthetic pathway. In some embodiments, a subject nucleic acid is a synthetic nucleic acid. In some embodiments, a subject nucleic acid is a recombinant nucleic acid. In some embodiments, a subject nucleic acid is an expression construct (an "expression vector").

[00100] In some embodiments, a subject nucleic acid comprises nucleotide sequences encoding one or more synthetic IGRs, where synthetic IGRs are as described above; and nucleotide sequences encoding one or more enzymes in an IPP biosynthetic pathway (e.g., a mevalonate pathway; or a DXP pathway), where a synthetic IGR is located 5' to at least one nucleotide sequence encoding an enzyme in the IPP biosynthetic pathway.
[00101] In some embodiments, a subject nucleic acid comprises nucleotide sequences encoding two or more enzymes in an IPP biosynthetic pathway, and a single synthetic IGR disposed between two coding regions comprising nucleotide sequences encoding two of the two or more enzymes in the IPP
biosynthetic pathway, or a single synthetic IGR at the 5' end of the pathway. For example, in some embodiments, a subject nucleic acid comprises a nucleotide sequence encoding a single synthetic IGR, and nucleotide sequences comprising a first coding region and a second coding region, where the single IGR is disposed between a first coding region and a second coding region, where the first coding region comprises a nucleotide sequence encoding a first enzyme in the biosynthetic pathway and the second coding region comprises a nucleotide sequence encoding a second enzyme in the biosynthetic pathway. The second enzyme is one that acts on a product of the first enzyme.
[00102] In other embodiments, a subject nucleic acid comprises nucleotide sequences encoding three or more enzymes in an IPP biosynthetic pathway; and nucleotide sequences encoding two or more
synthetic IGRs, where each of the two or more synthetic IGRs is disposed between two coding regions encoding two of the three or more enzymes in the pathway. For example, in some embodiments, a subject nucleic acid comprises a nucleotide sequence encoding an IGR-modified IPP biosynthetic
pathway, and includes a first synthetic IGR disposed between a first coding region and a second coding region; and a second synthetic IGR disposed between the second coding region and a third coding region, where each of the first, second, and third coding regions comprises nucleotide sequences
encoding different enzymes in the IPP biosynthetic pathway. The second enzyme is one that acts on a product of the first enzyme; and the third enzyme acts on a product of the second enzyme.
[00103] In other embodiments, a subject nucleic acid comprises nucleotide sequences encoding an IGR-modified IPP biosynthetic pathway that includes two or more synthetic IGRs, each disposed between two coding regions encoding enzymes in the pathway, where the coding regions encode at least four different enzymes in the pathway. For example, in some embodiments, a subject nucleic acid comprises nucleotide sequences encoding an IGR-modified IPP biosynthetic pathway that includes a first synthetic IGR disposed between a first coding region and a second coding region; and a second synthetic IGR disposed between a third coding region and a fourth coding region, where each of the first, second, third, and fourth coding regions comprises nucleotide sequences encoding different enzymes in the biosynthetic pathway.
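The arrangements described in the preceding paragraphs can be pictured as an ordered list of elements read 5' to 3'. The following is a minimal sketch, not a description of any actual construct: the Element type, the "CDS"/"IGR" labels, and the example layout are hypothetical. It checks that every synthetic IGR is immediately 5' of a coding region and is either at the 5' end or preceded by another coding region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str  # "CDS" (coding region) or "IGR" (synthetic intergenic region)
    name: str


def igr_positions_valid(elements):
    """True if each IGR is directly followed by a coding region and is
    either the first element (5' end) or directly preceded by a coding region."""
    for i, element in enumerate(elements):
        if element.kind != "IGR":
            continue
        has_downstream_cds = i + 1 < len(elements) and elements[i + 1].kind == "CDS"
        upstream_ok = i == 0 or elements[i - 1].kind == "CDS"
        if not (has_downstream_cds and upstream_ok):
            return False
    return True


# Hypothetical three-enzyme layout with two synthetic IGRs, read 5' to 3'.
example_construct = [
    Element("CDS", "acetoacetyl-CoA thiolase"),
    Element("IGR", "IGR-1"),
    Element("CDS", "HMGS"),
    Element("IGR", "IGR-2"),
    Element("CDS", "HMGR"),
]
```

A layout that ends with an IGR, or that places two IGRs next to each other, is rejected by this check.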

Mevalonate pathway enzymes
[00104] The mevalonate pathway comprises: (a) condensing two molecules of acetyl-CoA to
acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate. The mevalonate pathway enzymes required for
production of IPP vary, depending on the culture conditions.
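Steps (a) through (f) can be written down as an ordered table, which makes it easy to see which enzymes are needed from a given starting metabolite. This is an illustrative sketch only: the pairing of each step with its enzyme follows standard biochemistry, the table tracks only the main carbon intermediate (the second acetyl-CoA consumed in step (b) is omitted), and the idea that the starting metabolite depends on what the culture supplies is an assumption used for the example.

```python
# (enzyme, main substrate, product) for steps (a)-(f), in pathway order.
MEVALONATE_PATHWAY = [
    ("acetoacetyl-CoA thiolase", "acetyl-CoA", "acetoacetyl-CoA"),
    ("HMGS", "acetoacetyl-CoA", "HMG-CoA"),
    ("HMGR", "HMG-CoA", "mevalonate"),
    ("MK", "mevalonate", "mevalonate 5-phosphate"),
    ("PMK", "mevalonate 5-phosphate", "mevalonate 5-pyrophosphate"),
    ("MPD", "mevalonate 5-pyrophosphate", "isopentenyl pyrophosphate"),
]


def enzymes_needed(start_metabolite):
    """Enzymes required to reach isopentenyl pyrophosphate from the given
    starting metabolite, in pathway order."""
    for index, (_, substrate, _) in enumerate(MEVALONATE_PATHWAY):
        if substrate == start_metabolite:
            return [enzyme for enzyme, _, _ in MEVALONATE_PATHWAY[index:]]
    raise ValueError(f"{start_metabolite!r} is not a substrate in the table")


# Hypothetical case: mevalonate is supplied, so only the last three enzymes apply.
from_mevalonate = enzymes_needed("mevalonate")
```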
[00105] In some embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, and
HMGR, where the one or more synthetic IGRs are disposed between two coding regions encoding the enzymes.
[00106] In other embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR,
MK, PMK, and MPD (and optionally also IPP isomerase), where each of the one or more synthetic
IGRs are disposed between two coding regions encoding the enzymes. In other embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide
sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, IPP isomerase, and a prenyl transferase, where each of the one or more synthetic IGRs are disposed between two coding regions encoding the enzymes. In other embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, IPP isomerase, a prenyl transferase, and a terpene
synthase, where each of the one or more synthetic IGRs are disposed between two coding regions
encoding the enzymes.
Nucleotide sequences encoding mevalonate pathway enzymes
[00107] Nucleotide sequences encoding mevalonate (MEV) pathway gene products are known in the art, and any known MEV pathway gene product-encoding nucleotide sequence can be used to generate a subject genetically modified host cell. For example, nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, and IDI are known in the art. The following are non-limiting examples of known nucleotide sequences encoding MEV pathway gene products, with GenBank Accession numbers and organism following each MEV pathway enzyme, in parentheses: acetoacetyl-CoA thiolase: (NC_000913 REGION: 2324131..2325315; E. coli), (D49362; Paracoccus denitrificans), and (L20428; Saccharomyces cerevisiae); HMGS: (NC_001145. complement 19061..20536; Saccharomyces cerevisiae), (X96617; Saccharomyces cerevisiae), (X83882; Arabidopsis thaliana), (AB037907; Kitasatospora griseola), and (BT007302; Homo sapiens); HMGR: (NM_206548; Drosophila melanogaster), (NM_204485; Gallus gallus), (AB015627; Streptomyces sp. KO-3988), (AF542543; Nicotiana attenuata), (AB037907; Kitasatospora griseola), (AX128213, providing the sequence encoding a truncated HMGR; Saccharomyces cerevisiae), and (NC_001145: complement (115734..118898; Saccharomyces cerevisiae)); MK: (L77688; Arabidopsis thaliana), and (X55875; Saccharomyces cerevisiae); PMK: (AF429385; Hevea brasiliensis), (NM_006556; Homo sapiens), (NC_001145. complement 712315..713670; Saccharomyces cerevisiae); MPD: (X97557; Saccharomyces cerevisiae), (AF290095; Enterococcus faecium), and (U49260; Homo sapiens); and IDI: (NC_000913, 3031087..3031635; E. coli), and (AF082326; Haematococcus pluvialis).
[00108] A non-limiting example of nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, and HMGR is set forth in Figures 13A-C (SEQ ID NO:1) of U.S. Patent No. 7,183,089. A non-limiting example of nucleotide sequences encoding MK, PMK, MPD, and isopentenyl diphosphate isomerase (IDI) is set forth in Figures 16A-D of U.S. Patent No. 7,183,089.
[00109] In some embodiments, the HMGR coding region is set forth in SEQ ID NO:13 of U.S. Patent No. 7,183,089 (see also Figures 20A-C of U.S. Patent No. 7,183,089), which encodes a truncated form of HMGR ("tHMGR") that lacks the transmembrane domain of wild-type HMGR. The transmembrane domain of HMGR contains the regulatory portions of the enzyme and has no catalytic activity.
[00110] The coding sequence of any known MEV pathway enzyme may be altered in various ways known in the art to generate targeted changes in the amino acid sequence of the encoded enzyme. The amino acid sequence of a variant MEV pathway enzyme will usually be substantially similar to the amino acid sequence of any known MEV pathway enzyme, i.e., will differ by at least one amino acid, and may differ by at least two, at least 5, at least 10, or at least 20 amino acids, but typically not more than about fifty amino acids. The sequence changes may be substitutions, insertions or deletions. For example, as described below, the nucleotide sequence can be altered for the codon bias of a particular host cell. In addition, one or more nucleotide sequence differences can be introduced that result in conservative amino acid changes in the encoded protein.
DXP pathway enzymes
[00111] The DXP pathway comprises: 1-deoxy-D-xylulose-5-phosphate synthase (Dxs), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE), 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF), and 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase
(IspG).
[00112] In some embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide sequences encoding two, three, four, five, six, or seven of the DXP pathway enzymes, where each of the one or more synthetic IGRs are disposed between two coding regions encoding the enzymes. In other embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic IGRs; and nucleotide sequences encoding two, three, four, five, six, or seven of the DXP pathway enzymes, and a prenyl transferase, where each of the one or more synthetic IGRs are disposed between two coding regions encoding the enzymes. In other
embodiments, a subject nucleic acid comprises nucleotide sequence encoding one or more synthetic
IGRs; and nucleotide sequences encoding two, three, four, five, six, or seven of the DXP pathway enzymes, a prenyl transferase, and a terpene synthase, where each of the one or more synthetic IGRs are disposed between two coding regions encoding the enzymes.
Nucleotide sequences encoding DXP pathway enzymes
[00113] Nucleotide sequences encoding DXP pathway enzymes are known in the art, and can be used in a subject method. Variants of any known nucleotide sequence encoding a DXP pathway enzyme can be used, where the encoded enzyme retains enzymatic activity. Variants of any known nucleotide
sequence encoding a DXP pathway enzyme selected from 1-deoxy-D-xylulose-5-phosphate synthase
(dxs); 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC; dxr), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD; YbgP), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE; YchB), 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF; YbgB), 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG), and isopentenyl diphosphate isomerase can be used, where a variant differs in nucleotide sequence by one or more nucleotides from a reference sequence (e.g., a known sequence); and where a variant nucleotide sequence includes one or more nucleotide
substitutions, insertions, truncations, or deletions, compared to a reference sequence, e.g., compared to a known sequence.
[00114] The coding sequence of any known DXP pathway enzyme may be altered in various ways known in the art to generate targeted changes in the amino acid sequence of the encoded enzyme. The amino acid sequence of a variant DXP pathway enzyme will in some embodiments be substantially similar to the amino acid sequence of any known DXP pathway enzyme, i.e., will differ by at least one amino acid, and may differ by at least two, at least 5, at least 10, or at least 20 amino acids, but typically not more than about fifty amino acids. The sequence changes may be substitutions, insertions or deletions. For example, as described below, the nucleotide sequence can be altered for the codon bias of a particular host cell. In addition, one or more nucleotide sequence differences can be introduced that result in conservative amino acid changes in the encoded protein.
[00115] Nucleotide sequences encoding 1-deoxy-D-xylulose-5-phosphate synthase (dxs) are known in the art. See, e.g., GenBank Accession No. DQ768815 (Yersinia pestis dxs); GenBank Accession No. AF143812 (Lycopersicon esculentum dxs); GenBank Accession No. Y18874 (Synechococcus PCC6301 dxs); GenBank Accession No. AF035440 (E. coli dxs); GenBank Accession No. AF282878 (Pseudomonas aeruginosa dxs); GenBank Accession No. NM_121176 (Arabidopsis thaliana dxs); GenBank Accession No. AB026631 (Streptomyces sp. CL190 dxs); and Swissprot accession No. O78328 (Capsicum annum). See also Figure 5 of U.S. Patent Publication No. 2003/0219798 for nucleotide sequences encoding dxs.
[00116] Nucleotide sequences encoding 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC; dxr) are known in the art. See, e.g., GenBank Accession No. AF282879 (Pseudomonas aeruginosa dxr); GenBank Accession No. AY081453 (Arabidopsis thaliana dxr); and GenBank Accession No. AJ297566 (Zea mays dxr). See also Figure 31 of U.S. Patent Publication No. 2003/0219798 for nucleotide sequences encoding dxr.

[00117] Nucleotide sequences encoding 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD; YbgP) are known in the art. See, e.g., GenBank Accession No. AF230737 (Arabidopsis thaliana);
GenBank Accession No. CP000034.1 (nucleotides 2725605-2724895; Shigella dysenteriae); and
GenBank Accession No. CP000036.1 (nucleotides 2780789 to 2781448; Shigella boydii). See also
SEQ ID NO:5 of U.S. Patent No. 6,660,507 (Methylomonas IspD).
[00118] Nucleotide sequences encoding 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE; YchB) are known in the art. See, e.g., GenBank Accession No. CP000036.1 (nucleotides 1839782-1840633; Shigella boydii); GenBank Accession No. AF288615 (Arabidopsis thaliana); and GenBank Accession No. CP000266.1 (nucleotides 1272480-1271629; Shigella flexneri). See also SEQ ID NO:7 of U.S. Patent No. 6,660,507 (Methylomonas 16a IspE).
[00119] Nucleotide sequences encoding 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF; YbgB) are known in the art. See, e.g., GenBank Accession No. AE017220.1 (nucleotides 3025667-3025216; Salmonella enterica IspF); GenBank Accession No. NM_105070 (Arabidopsis thaliana); and GenBank Accession No. AE014073.1 (nucleotides 2838621-283841; Shigella flexneri).
[00120] Nucleotide sequences encoding 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG; GcpE) are known in the art. See, e.g., GenBank Accession No. CP000034.1 (nucleotides 2505082 to 2503964; Shigella dysenteriae IspG); GenBank Accession No. NM_180902 (Arabidopsis thaliana); GenBank Accession No. AE008814.1 (nucleotides 15609-14491; Salmonella typhimurium IspG); GenBank Accession No. AE014613.1 (nucleotides 383225-384343; Salmonella enterica GcpE); GenBank Accession No. AE017220.1 (nucleotides 2678054-2676936; Salmonella enterica GcpE); and GenBank Accession No. BX95085.1 (nucleotides 3604460-3603539; Erwinia carotovora GcpE).
[00121] IspH genes are known in the art. See, e.g., GenBank Accession No. AY168881 (Arabidopsis thaliana).
[00122] Nucleotide sequences encoding IPP isomerase are known in the art. See, e.g., (J05090;
Saccharomyces cerevisiae); Wang and Ohnuma (2000) Biochim. Biophys. Acta 1529:33-48; GenBank Accession No. NM_121649 (Arabidopsis thaliana); U.S. Patent No. 6,645,747; SEQ ID NO:1 of WO
02/095011; and SEQ ID NO:50 of WO 02/083720.
[00123] Nucleotide sequences having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or higher, nucleotide sequence identity to a known nucleotide sequence encoding a DXP pathway enzyme are also suitable for use, where the nucleotide sequence encodes a functional DXP pathway enzyme.
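The identity thresholds above presuppose some way of scoring two sequences against each other. The following minimal sketch computes percent identity for two sequences that have already been aligned to equal length, with "-" marking gaps. Counting a gap opposite a base as a mismatch is only one of several conventions and is an assumption here, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.
    Columns where both sequences have a gap are ignored; a gap opposite
    a base counts as a mismatch (an assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy pre-aligned sequences differing at one of ten positions.
identity = percent_identity("ATGGCTAAGC", "ATGGCAAAGC")
```

Whether a sequence that passes such a threshold also encodes a functional enzyme is a separate question that the calculation does not address.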
Prenyl transferases
[00124] In some embodiments, a subject genetically modified host cell is genetically modified to
include one or more nucleic acids comprising a nucleotide sequence(s) encoding one or more
mevalonate pathway enzymes, as described above; and a nucleic acid comprising a nucleotide sequence that encodes a prenyl transferase.

[00125] Prenyltransferases constitute a broad group of enzymes catalyzing the consecutive
condensation of IPP resulting in the formation of prenyl diphosphates of various chain lengths.
Suitable prenyltransferases include enzymes that catalyze the condensation of IPP with allylic primer substrates to form isoprenoid compounds with from about 2 isoprene units to about 6000 isoprene units or more, e.g., 2 isoprene units (Geranyl Pyrophosphate synthase), 3 isoprene units (Farnesyl
pyrophosphate synthase), 4 isoprene units (geranylgeranyl pyrophosphate synthase), 5 isoprene units, 6 isoprene units (hexadecylpyrophosphate synthase), 7 isoprene units, 8 isoprene units (phytoene
synthase, octaprenyl pyrophosphate synthase), 9 isoprene units (nonaprenyl pyrophosphate synthase), 10 isoprene units (decaprenyl pyrophosphate synthase), from about 10 isoprene units to about 15 isoprene units, from about 15 isoprene units to about 20 isoprene units, from about 20 isoprene units to about 25 isoprene units, from about 25 isoprene units to about 30 isoprene units, from about 30 isoprene units to about 40 isoprene units, from about 40 isoprene units to about 50 isoprene units, from about 50 isoprene units to about 100 isoprene units, from about 100 isoprene units to about 250 isoprene units, from about 250 isoprene units to about 500 isoprene units, from about 500 isoprene units to about 1000 isoprene units, from about 1000 isoprene units to about 2000 isoprene units, from about 2000 isoprene units to about 3000 isoprene units, from about 3000 isoprene units to about 4000 isoprene units, from about
4000 isoprene units to about 5000 isoprene units, or from about 5000 isoprene units to about 6000 isoprene units or more.
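The relationship between isoprene units and chain length can be sketched with simple arithmetic: each isoprene unit contributes five carbon atoms, and a chain of n units built on a single C5 allylic primer requires n - 1 condensations with IPP. The sketch below assumes the primer is a C5 unit such as dimethylallyl diphosphate, which the paragraph above does not specify; the names are hypothetical.

```python
CARBONS_PER_ISOPRENE_UNIT = 5

# Products named in the text for the shortest chain lengths.
NAMED_PRODUCTS = {
    2: "geranyl pyrophosphate",
    3: "farnesyl pyrophosphate",
    4: "geranylgeranyl pyrophosphate",
}


def chain_carbons(isoprene_units):
    """Number of carbon atoms in a prenyl chain of the given length."""
    if isoprene_units < 1:
        raise ValueError("a prenyl chain has at least one isoprene unit")
    return CARBONS_PER_ISOPRENE_UNIT * isoprene_units


def ipp_condensations(isoprene_units):
    """IPP condensations needed to reach the chain length, assuming a
    single C5 allylic primer (an assumption, see lead-in)."""
    if isoprene_units < 2:
        raise ValueError("at least two isoprene units are needed for a condensation")
    return isoprene_units - 1


# Chain length in carbons for each named product, e.g. 3 units -> C15.
carbons_by_product = {name: chain_carbons(n) for n, name in NAMED_PRODUCTS.items()}
```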
[00126] Suitable prenyltransferases include, but are not limited to, an E-isoprenyl diphosphate synthase, including, but not limited to, geranyl diphosphate (GPP) synthase, farnesyl diphosphate (FPP) synthase, geranylgeranyl diphosphate (GGPP) synthase, hexaprenyl diphosphate (HexPP) synthase, heptaprenyl diphosphate (HepPP) synthase, octaprenyl (OPP) diphosphate synthase, solanesyl diphosphate (SPP) synthase, decaprenyl diphosphate (DPP) synthase, chicle synthase, and gutta-percha synthase; and a Z-isoprenyl diphosphate synthase, including, but not limited to, nonaprenyl diphosphate (NPP) synthase, undecaprenyl diphosphate (UPP) synthase, dehydrodolichyl diphosphate synthase, eicosaprenyl
diphosphate synthase, natural rubber synthase, and other Z-isoprenyl diphosphate synthases.
[00127] The nucleotide sequences of numerous prenyl transferases from a variety of species are known, and can be used or modified for use in generating a subject genetically modified host cell. Nucleotide sequences encoding prenyl transferases are known in the art. See, e.g., Human farnesyl pyrophosphate synthetase mRNA (GenBank Accession No. J05262; Homo sapiens); farnesyl diphosphate synthetase (FPP) gene (GenBank Accession No. J05091; Saccharomyces cerevisiae); isopentenyl diphosphate:dimethylallyl diphosphate isomerase gene (J05090; Saccharomyces cerevisiae); Wang and Ohnuma (2000) Biochim. Biophys. Acta 1529:33-48; U.S. Patent No. 6,645,747; Arabidopsis thaliana farnesyl pyrophosphate synthetase 2 (FPS2) / FPP synthetase 2 / farnesyl diphosphate synthase 2 (At4g17190) mRNA (GenBank Accession No. NM_202836); Ginkgo biloba geranylgeranyl diphosphate synthase (ggpps) mRNA (GenBank Accession No. AY371321); Arabidopsis thaliana geranylgeranyl pyrophosphate synthase (GGPS1) / GGPP synthetase / farnesyltranstransferase (At4g36810) mRNA (GenBank Accession No. NM_119845); Synechococcus elongatus gene for farnesyl, geranylgeranyl, geranylfarnesyl, hexaprenyl, heptaprenyl diphosphate synthase (SelF-HepPS) (GenBank Accession No. AB016095); etc.
Terpene synthases
[00128] A nucleic acid comprising a nucleotide sequence encoding any known terpene synthase can be used. Suitable terpene synthases include, but are not limited to, amorpha-4,11-diene synthase (ADS), beta-caryophyllene synthase, germacrene A synthase, 8-epicedrol synthase, valencene synthase, (+)-delta-cadinene synthase, germacrene C synthase, (E)-beta-farnesene synthase, Casbene synthase, vetispiradiene synthase, 5-epi-aristolochene synthase, Aristolochene synthase, beta-caryophyllene, alpha-humulene, (E,E)-alpha-farnesene synthase, (-)-beta-pinene synthase, Gamma-terpinene synthase, limonene cyclase, Linalool synthase, 1,8-cineole synthase, (+)-sabinene synthase, E-alpha-bisabolene synthase, (+)-bornyl diphosphate synthase, levopimaradiene synthase, Abietadiene synthase, isopimaradiene synthase, (E)-gamma-bisabolene synthase, taxadiene synthase, copalyl pyrophosphate synthase, kaurene synthase, longifolene synthase, gamma-humulene synthase, Delta-selinene synthase, beta-phellandrene synthase, limonene synthase, myrcene synthase, terpinolene synthase, (-)-camphene synthase, (+)-3-carene synthase, syn-copalyl diphosphate synthase, alpha-terpineol synthase, syn-pimara-7,15-diene synthase, ent-sandaracopimaradiene synthase, stemar-13-ene synthase, E-beta-ocimene, S-linalool synthase, geraniol synthase, gamma-terpinene synthase, linalool synthase, E-beta-ocimene synthase, epi-cedrol synthase, alpha-zingiberene synthase, guaiadiene synthase, cascarilladiene synthase, cis-muuroladiene synthase, aphidicolan-16b-ol synthase, elizabethatriene synthase, sandalol synthase, patchoulol synthase, Zinzanol synthase, cedrol synthase, sclareol synthase, copalol synthase, manool synthase, and the like.
[00129] Nucleotide sequences encoding terpene synthases are known in the art, and any known terpene synthase-encoding nucleotide sequence can be used to genetically modify a host cell. For example, the following terpene synthase-encoding nucleotide sequences, followed by their GenBank accession numbers and the organisms in which they were identified, are known and can be used: (-)-germacrene D synthase mRNA (AY438099; Populus balsamifera subsp. trichocarpa x Populus deltoides); E,E-alpha-farnesene synthase mRNA (AY640154; Cucumis sativus); 1,8-cineole synthase mRNA (AY691947; Arabidopsis thaliana); terpene synthase 5 (TPS5) mRNA (AY518314; Zea mays); terpene synthase 4 (TPS4) mRNA (AY518312; Zea mays); myrcene/ocimene synthase (TPS10) (At2g24210) mRNA (NM_127982; Arabidopsis thaliana); geraniol synthase (GES) mRNA (AY362553; Ocimum basilicum); pinene synthase mRNA (AY237645; Picea sitchensis); myrcene synthase 1e20 mRNA (AY195609; Antirrhinum majus); (E)-β-ocimene synthase (0e23) mRNA (AY195607; Antirrhinum majus); E-β-ocimene synthase mRNA (AY151086; Antirrhinum majus); terpene synthase mRNA (AF497492; Arabidopsis thaliana); (-)-camphene synthase (AG6.5) mRNA (U87910; Abies grandis); (-)-4S-limonene synthase gene (e.g., genomic sequence) (AF326518; Abies grandis); delta-selinene synthase gene (AF326513; Abies grandis); amorpha-4,11-diene synthase mRNA (AJ251751; Artemisia annua); E-α-bisabolene synthase mRNA (AF006195; Abies grandis); gamma-humulene synthase mRNA (U92267; Abies grandis); δ-selinene synthase mRNA (U92266; Abies grandis); pinene synthase (AG3.18) mRNA (U87909; Abies grandis); myrcene synthase (AG2.2) mRNA (U87908; Abies grandis); etc.
[00130] Amino acid sequences of the following terpene synthases are found under the GenBank
Accession numbers shown in parentheses, along with the organism in which each was identified,
following each terpene synthase: (-)-germacrene D synthase (AAR99061; Populus balsamifera subsp. trichocarpa x Populus deltoides); D-cadinene synthase (P93665; Gossypium hirsutum); 5-epi-aristolochene synthase (Q40577; Nicotiana tabacum); E,E-alpha-farnesene synthase (AAU05951; Cucumis sativus); 1,8-cineole synthase (AAU01970; Arabidopsis thaliana); (R)-limonene synthase 1 (Q8L5K3; Citrus limon); syn-copalyl diphosphate synthase (AAS98158; Oryza sativa); a taxadiene synthase (Q9FT37; Taxus chinensis; Q93YA3; Taxus bacca; Q41594; Taxus brevifolia); a D-cadinene synthase (Q43714; Gossypium arboretum); terpene synthase 5 (AAS88575; Zea mays); terpene synthase 4 (AAS88573; Zea mays); terpenoid synthase (AAS79352; Vitis vinifera); geraniol synthase (AAR11765; Ocimum basilicum); myrcene synthase 1e20 (AAO41727; Antirrhinum majus); 5-epi-aristolochene synthase 37 (AAP05762; Nicotiana attenuata); (+)-3-carene synthase (AAO73863; Picea abies); (-)-camphene synthase (AAB70707; Abies grandis); abietadiene synthase (AAK83563; Abies grandis); amorpha-4,11-diene synthase (CAB94691; Artemisia annua); trichodiene synthase (AAC49957; Myrothecium roridum); gamma-humulene synthase (AAC05728; Abies grandis); δ-selinene synthase (AAC05727; Abies grandis); etc.
Expression constructs
[00131] In some embodiments, a subject nucleic acid comprises an expression construct, e.g., in some embodiments, a subject nucleic acid comprises a nucleotide sequence encoding at least one synthetic
IGR, and nucleotide sequences encoding one or more enzymes in an IPP biosynthetic pathway, where the at least one synthetic IGR is 5' of a coding region for one of the enzymes, and can be disposed between two coding regions encoding two or more enzymes in the IPP biosynthetic pathway.
[00132] Suitable expression vectors include, but are not limited to, baculovirus vectors, bacteriophage vectors, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral vectors (e.g.
viral vectors based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, and the like), P1-based artificial chromosomes, yeast plasmids, yeast artificial
chromosomes, and any other vectors specific for specific hosts of interest (such as E. coli and yeast).
Suitable vectors include chromosomal, nonchromosomal and synthetic DNA sequences.
[00133] Numerous suitable expression vectors are known to those of skill in the art, and many are
commercially available. The following vectors are provided by way of example; for bacterial host cells: pQE vectors (Qiagen), pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); pTrc99a, pKK223-3, pDR540, and pRIT2T (Pharmacia); for eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be used so long as it is compatible with the host cell.
[00134] Depending on the host/vector system utilized, any of a number of suitable transcription and translation control elements, including constitutive and inducible promoters, transcription enhancer elements, transcription terminators, etc. may be used in the expression vector (see e.g., Bitter et al.
(1987) Methods in Enzymology, 153:516-544).
[00135] Suitable promoters for use in prokaryotic host cells include, but are not limited to, a
bacteriophage T7 RNA polymerase promoter; a trp promoter; a lac operon promoter; a hybrid promoter, e.g., a lac/tac hybrid promoter, a tac/trc hybrid promoter, a trp/lac promoter, a T7/lac promoter; a trc promoter; a tac promoter, and the like; an araBAD promoter; in vivo regulated promoters, such as an ssaG promoter or a related promoter (see, e.g., U.S. Patent Publication No. 20040131637), a pagC
promoter (Pulkkinen and Miller, J. Bacteriol., 1991: 173(1): 86-93; Alpuche-Aranda et al., PNAS,
1992; 89(21): 10079-83), a nirB promoter (Harborne et al. (1992) Mol. Micro. 6:2805-2813), and the like (see, e.g., Dunstan et al. (1999) Infect. Immun. 67:5133-5141; McKelvie et al. (2004) Vaccine
22:3243-3255; and Chatfield et al. (1992) Biotechnol. 10:888-892); a sigma70 promoter, e.g., a
consensus sigma70 promoter (see, e.g., GenBank Accession Nos. AX798980, AX798961, and
AX798183); a stationary phase promoter, e.g., a dps promoter, an spv promoter, and the like; a
promoter derived from the pathogenicity island SPI-2 (see, e.g., WO96/17951); an actA promoter (see, e.g., Shetron-Rama et al. (2002) Infect. Immun. 70:1087-1096); an rpsM promoter (see, e.g., Valdivia and Falkow (1996) Mol. Microbiol. 22:367-378); a tet promoter (see, e.g., Hillen, W. and
Wissmann, A. (1989) In Saenger, W. and Heinemann, U. (eds), Topics in Molecular and Structural
Biology, Protein-Nucleic Acid Interaction. Macmillan, London, UK, Vol. 10, pp. 143-162); an SP6 promoter (see, e.g., Melton et al. (1984) Nucl. Acids Res. 12:7035-7056); and the like.
[00136] Non-limiting examples of suitable eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Suitable promoters for expression in yeast include, but are not limited to, CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, and TP1; and, e.g., AOX1 (e.g., for use in
Pichia). Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art. The expression vector may also contain a ribosome binding site for translation initiation and a
transcription terminator. The expression vector may also include appropriate sequences for amplifying expression.
[00137] In addition, the expression vectors include one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in prokaryotic host cells such as E. coli.
[00138] Generally, an expression vector will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli, the S. cerevisiae TRP1 gene, etc.; and a promoter derived from a highly-expressed gene to direct transcription of the coding sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others.

[00139] In some embodiments, a nucleotide sequence encoding an IPP biosynthetic pathway enzyme is operably linked to an inducible promoter. Inducible promoters are well known in the art. Suitable inducible promoters include, but are not limited to, the pL of bacteriophage λ; Plac; Ptrp; Ptac (Ptrp-lac hybrid promoter); an isopropyl-beta-D-thiogalactopyranoside (IPTG)-inducible promoter, e.g., a lacZ promoter; a tetracycline-inducible promoter; an arabinose inducible promoter, e.g., PBAD (see, e.g.,
Guzman et al. (1995) J. Bacteriol. 177:4121-4130); a xylose-inducible promoter, e.g., Pxyl (see, e.g., Kim et al. (1996) Gene 181:71-76); a GAL1 promoter; a tryptophan promoter; a lac promoter; an
alcohol-inducible promoter, e.g., a methanol-inducible promoter, an ethanol-inducible promoter; a raffinose-inducible promoter; a heat-inducible promoter, e.g., heat inducible lambda PL promoter, a promoter controlled by a heat-sensitive repressor (e.g., CI857-repressed lambda-based expression
vectors; see, e.g., Hoffmann et al. (1999) FEMS Microbiol Lett. 177(2):327-34); and the like.
[00140] In some embodiments, a nucleotide sequence encoding an IPP biosynthetic pathway enzyme is operably linked to a constitutive promoter. Suitable constitutive promoters for use in prokaryotic cells are known in the art and include, but are not limited to, a sigma70 promoter, e.g., a consensus sigma70 promoter.
[00141] In yeast, a number of vectors containing constitutive or inducible promoters may be used. For a review see, Current Protocols in Molecular Biology, Vol. 2, 1988, Ed. Ausubel, et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13; Grant, et al., 1987, Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman, 1987, Acad. Press, N.Y., Vol. 153, pp. 516-544;
Glover, 1986, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3; and Bitter, 1987, Heterologous
Gene Expression in Yeast, Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684; and The Molecular Biology of the Yeast Saccharomyces, 1982, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II. A constitutive yeast promoter such as ADH or LEU2 or an inducible promoter such as GAL may be used (Cloning in Yeast, Ch. 3, R. Rothstein In: DNA Cloning Vol. 11, A Practical Approach, Ed. DM Glover, 1986, IRL Press, Wash., D.C.). Alternatively, vectors may be used which promote integration of foreign DNA sequences into the yeast chromosome.
[00142] In some embodiments, a subject nucleic acid comprises nucleotide sequences encoding two or more IPP biosynthetic pathway enzymes, where the nucleotide sequences encoding the two or more enzymes will in some embodiments each be contained on separate expression vectors. In other
embodiments, nucleotide sequences encoding one or more IPP biosynthetic pathway enzymes are
contained in a single expression vector. Where nucleotide sequences encoding one or more IPP
biosynthetic pathway enzymes are contained in a single expression vector, in some embodiments, the nucleotide sequences will be operably linked to a common control element (e.g., a promoter), e.g., the common control element controls expression of all of the IPP biosynthetic pathway enzyme-encoding nucleotide sequences on the single expression vector.
[00143] Where nucleotide sequences encoding the IPP biosynthetic pathway enzyme(s) are contained in a single expression vector, in some embodiments, the nucleotide sequences will be operably linked to different control elements (e.g., promoters), e.g., the different control elements control expression of each of the IPP biosynthetic pathway enzyme-encoding nucleotide sequences separately on a single expression vector.
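The two arrangements described above (one common control element versus a separate control element per coding region) can be sketched as a small data model. This is only an illustrative sketch: the class name TranscriptionUnit, the placeholder labels "IGR-1" and "IGR-2", and the promoter labels are assumptions introduced here, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptionUnit:
    """Hypothetical model: one promoter driving one or more coding regions."""
    promoter: str
    coding_regions: List[str]
    igrs: List[str] = field(default_factory=list)  # one IGR between each adjacent pair

    def layout(self) -> List[str]:
        """Return the 5'-to-3' order of elements in the unit."""
        if len(self.igrs) != len(self.coding_regions) - 1:
            raise ValueError("need exactly one IGR between each pair of coding regions")
        parts = [self.promoter]
        for i, gene in enumerate(self.coding_regions):
            parts.append(gene)
            if i < len(self.igrs):
                parts.append(self.igrs[i])
        return parts


# Common control element: a single promoter, IGRs between the coding regions.
operon = TranscriptionUnit("PBAD", ["atoB", "HMGS", "tHMGR"], ["IGR-1", "IGR-2"])

# Different control elements: each coding region has its own promoter.
separate_units = [
    TranscriptionUnit("Plac", ["atoB"]),
    TranscriptionUnit("Ptrc", ["HMGS"]),
    TranscriptionUnit("PBAD", ["tHMGR"]),
]
```

Calling operon.layout() lists the promoter followed by coding regions alternating with IGRs; in the second arrangement each unit contains no IGR at all.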
GENETICALLY MODIFIED HOST CELLS
[00144] The present invention provides genetically modified host cells; and compositions comprising the genetically modified host cells. The genetically modified host cells are useful for producing an isoprenoid compound or an isoprenoid precursor compound, as discussed above. As discussed above, a subject method for producing an isoprenoid or isoprenoid precursor generally involves culturing a genetically modified host cell in a suitable medium. In some embodiments, the genetically modified host cell is one that has been genetically modified with one or more heterologous nucleic acids
comprising nucleotide sequence(s) encoding one or more synthetic IGRs, where the one or more
synthetic IGRs are each disposed between two coding regions comprising nucleotide sequences
encoding IPP biosynthetic enzymes. In other embodiments, the genetically modified host cell is one that has been genetically modified with one or more heterologous nucleic acids comprising nucleotide sequence(s) encoding one or more synthetic IGRs, and two or more coding regions comprising
nucleotide sequences encoding enzymes in an IPP biosynthetic pathway, where the one or more
synthetic IGRs are each disposed between two of the two or more coding regions. In some
embodiments, a subject genetically modified host cell comprises a subject nucleic acid, e.g., is
genetically modified with a subject nucleic acid.
[00145] As noted above, a host cell that produces an isoprenoid or isoprenoid precursor is genetically modified such that it produces the isoprenoid or isoprenoid precursor via an IGR-modified IPP
biosynthetic pathway. The genetically modified host cell is a genetically modified version of a parent host cell.
[00146] The IGR-modified IPP biosynthetic pathway is in some embodiments substantially the same as an endogenous pathway but for the inclusion of the one or more synthetic IGRs, e.g., the IGR-modified IPP biosynthetic pathway comprises nucleotide sequences encoding enzymes that are endogenous to the host cell. For example, in some embodiments, the host cell is a prokaryotic cell that normally produces IPP via an endogenous DXP pathway, and the IGR-modified IPP biosynthetic pathway comprises the endogenous DXP pathway, modified to include one or more synthetic IGRs. As another example, the host cell is a eukaryotic cell (e.g., a yeast cell) that normally produces IPP via an endogenous
mevalonate pathway, and the IGR-modified IPP biosynthetic pathway comprises the endogenous
mevalonate pathway, modified to include one or more synthetic IGRs.

[00147] In other embodiments, the IGR-modified IPP biosynthetic pathway comprises both synthetic
IGR(s) and nucleotide sequences encoding enzymes that are heterologous to the cell. For example, in some embodiments, the host cell is a prokaryotic cell that does not normally synthesize IPP via a
mevalonate pathway; and the host cell is genetically modified with an IGR-modified mevalonate
pathway that includes one or more synthetic IGRs and nucleotide sequences encoding mevalonate pathway enzymes heterologous to the host cell. As another example, the host cell is a eukaryotic cell
(e.g., a yeast cell) that does not normally synthesize IPP via a DXP pathway; and the host cell is
genetically modified with an IGR-modified DXP pathway that includes one or more synthetic IGRs and nucleotide sequences encoding DXP pathway enzymes heterologous to the host cell.
[00148] To generate a subject genetically modified host cell, a subject nucleic acid is introduced stably or transiently into a parent host cell, using established techniques, including, but not limited to,
electroporation, calcium phosphate precipitation, DEAE-dextran mediated transfection, liposome-mediated transfection, and the like. For stable transformation, a nucleic acid will generally further include a selectable marker, e.g., any of several well-known selectable markers such as neomycin
resistance, ampicillin resistance, tetracycline resistance, chloramphenicol resistance, kanamycin
resistance, and the like.
[00149] Host cells (including parent host cells and genetically modified host cells) can be unicellular organisms, or are grown in culture as single cells. In some embodiments, the host cell is a eukaryotic cell. Suitable eukaryotic host cells include, but are not limited to, yeast cells, insect cells, plant cells, fungal cells, and algal cells. Suitable eukaryotic host cells include, but are not limited to, Pichia
pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis,
Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger,
Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium
gramineum, Fusarium venenatum, Neurospora crassa, Chlamydomonas reinhardtii, and the like. In some embodiments, the host cell is a eukaryotic cell other than a plant cell.
[00150] In other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include, but are not limited to, any of a variety of laboratory strains of Escherichia coli, Lactobacillus sp.,
Salmonella sp., Shigella sp., and the like. See, e.g., Carrier et al. (1992) J. Immunol. 148:1176-1181;
U.S. Patent No. 6,447,784; and Sizemore et al. (1995) Science 270:299-302. Examples of Salmonella strains which can be employed in the present invention include, but are not limited to, Salmonella typhi and S. typhimurium. Suitable Shigella strains include, but are not limited to, Shigella flexneri, Shigella sonnei, and Shigella dysenteriae. Typically, the laboratory strain is one that is non-pathogenic. Non-limiting examples of other suitable bacteria include Bacillus subtilis,
Pseudomonas putida, Pseudomonas aeruginosa, Pseudomonas mevalonii, Rhodobacter sphaeroides, Rhodobacter capsulatus, Rhodospirillum rubrum, Rhodococcus sp., and the like. In some
embodiments, the host cell is Escherichia coli.
PRODUCING AN ISOPRENOID OR ISOPRENOID PRECURSOR COMPOUND
[00151] The present invention provides a method of producing an isoprenoid compound or an
isoprenoid precursor compound. In some embodiments, the methods generally involve culturing a genetically modified host cell in a suitable medium. The genetically modified host cell comprises an
IGR-modified IPP biosynthetic pathway comprising at least one synthetic intergenic region (IGR) that comprises a nucleotide sequence that forms a hairpin structure; and the at least one synthetic IGR is disposed between a set of two coding regions encoding two enzymes in the biosynthetic pathway. In some embodiments, the methods involve culturing a subject genetically modified host cell in a suitable medium. The genetically modified host cell produces the isoprenoid or isoprenoid precursor in a
recoverable amount. In some embodiments, the methods further involve recovering the isoprenoid or isoprenoid precursor from the genetically modified host cell, from the culture medium, or from both the genetically modified host cell and the culture medium.
[00152] In some embodiments, a subject genetically modified host cell is cultured in a suitable medium (e.g., Luria-Bertani broth, optionally supplemented with one or more additional agents, such as an inducer (e.g., where one or more IPP biosynthetic pathway enzyme-encoding nucleotide sequences are under the control of an inducible promoter), etc.); and the culture medium is overlaid with an organic solvent, e.g., dodecane, forming an organic layer. The isoprenoid compound produced by the
genetically modified host cell partitions into the organic layer, from which it can be purified. In some embodiments, where the isoprenoid-modifying enzyme-encoding nucleotide sequence is operably linked to an inducible promoter, an inducer is added to the culture medium; and, after a suitable time, the isoprenoid compound is isolated from the organic layer overlaid on the culture medium.
[00153] In some embodiments, the isoprenoid compound will be separated from other products which may be present in the organic layer. Separation of the isoprenoid compound from other products that may be present in the organic layer is readily achieved using, e.g., standard chromatographic
techniques.
[00154] In some embodiments, an isoprenoid compound synthesized by a subject method is further chemically modified in a cell-free reaction. For example, in some embodiments, artemisinic acid is isolated from culture medium and/or a cell lysate, and the artemisinic acid is further chemically
modified in a cell-free reaction to generate artemisinin.
[00155] In some embodiments, the isoprenoid compound is pure, e.g., at least about 40% pure, at least about 50% pure, at least about 60% pure, at least about 70% pure, at least about 80% pure, at least about 90% pure, at least about 95% pure, at least about 98% pure, or more than 98% pure, where "pure" in the context of an isoprenoid compound refers to an isoprenoid compound that is free from other
isoprenoid compounds, macromolecules, contaminants, etc.

EXAMPLES
[00156] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight,
molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, e.g., bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly);
and the like.
Example 1: Combinatorial engineering of intergenic regions of operons
MATERIALS AND METHODS
[00157] General. Media and chemicals were purchased from Sigma (St. Louis, MO) and Fisher
Scientific (Pittsburgh, PA). Enzymes were purchased from Roche (Indianapolis, IN), Promega
(Madison, WI), and New England Biolabs (Beverly, MA). Oligonucleotides (see Table 1 of U.S.
Provisional Patent Application No. 60/819,706) were purchased from Operon Biotechnologies
(Alameda, CA). C-medium (ref. 17) was supplemented with 3.4% glycerol, 1% Casamino acids (Difco) and micronutrients. Carbenicillin and chloramphenicol were used at concentrations of 50 μg/mL and 34 μg/mL, respectively, and cells were grown at 37°C, unless otherwise indicated. Oligonucleotides are depicted in Table 1 of U.S. Provisional Patent Application No. 60/819,706; and features of the
oligonucleotides are depicted in Table 1, below. The strains and plasmids used in this study are listed in Table 2, below.
Table 1

Name | Design Feature | # of seq | Sequence
A1 | Hairpin 4 (ref. 10) | | GCCTAGCAAGATCTCCTGATCCCGGTGCACCCGGACATCTGCATAGTCTG (SEQ ID NO:21)
A2 | Shorter version of Hairpin 4 | | GCCTAGCAAGATCTCCTGATCCACCCGGACATCTGCATAGTCTG (SEQ ID NO:22)
A3 | Longer version of Hairpin 4 | | GCCTAGCAAGATCTCCTGATCCCGGTGCGCGACCACCCGGACATCTGCATAGTCTG (SEQ ID NO:23)
A4 | Y-shaped Structure | | GCCTAGCAAGATCTCCTGATCCACATAAGGCCGGACATCTGCATAGTCTG (SEQ ID NO:24)
A5 | Longer Hairpin with bulges | | GCCTAGCAAGATCTCCTGATCCACCTTTGATGGCTAGAAAAATTAAGCTGCGGACATCTGCATAGTCTG (SEQ ID NO:25)
A6 | Hairpin 4 with degeneracies | 1,024 | GCCTAGCAAGATCTCCTGATCCNNNNNCACCCGGACATCTGCATAGTCTG (SEQ ID NO:26)
A7 | RNase III site | 1 | GCCTAGCAAGATCTCCTGATCCACCTTGAGAGTACTTAATGTAAGCCCTCTCTCAGACATCTGCATAGTCTG (SEQ ID NO:27)
A8 | RNase III site containing 4/5 N Bulge (ref. 11) | 256 | GCCTAGCAAGATCTCCTGATCAGAGGGACAANNNNAAGGTCATTCAGACATCTGCATAGTCTG (SEQ ID NO:28)
A9 | RNase III site containing 5/4 N Bulge (ref. 11) | 1,024 | GCCTAGCAAGATCTCCTGATCAGAGGGACAANNNNNAAGGTCATTCAGACATCTGCATAGTCTG (SEQ ID NO:29)
A10 | Extended RNase III site containing 4/5 N Bulge (ref. 11) | 256 | GCCTAGCAAGATCTCCTGATCAGAGGGACAANNNNAAGGTCATTGCAGCTCAGACATCTGCATAGTCTG (SEQ ID NO:30)
A11 | Extended RNase III site containing 5/4 N Bulge (ref. 11) | 1,024 | GCCTAGCAAGATCTCCTGATCAGAGGGACAANNNNNAAGGTCATTGCAGCTCAGACATCTGCATAGTCTG (SEQ ID NO:31)
5' Constant Region | | | GCCTAGCAAGATCTCCTGATC (SEQ ID NO:78)
A-B Overlap Region | | | ACATCTGCATAGTCTG (SEQ ID NO:79)
B1 | Hairpin 4 | 1 | AAATACTGTAAATTCAAGGCAAGTGTACCTGATCCCGGTGCACCCAGACTATGCAGATGT (SEQ ID NO:32)
B2 | Shorter version of Hairpin 4 | | AAATACTGTAAATTCAAGGCAAGTGTAAGTGTACCTGATCCACCCAGACTATGCAGATGT (SEQ ID NO:33)
B3 | Longer version of Hairpin 4 | 1 | AAATACTGTAAATTCAAGGCACCTGATCCCGGTGCGCGACCACCCAGACTATGCAGATGT (SEQ ID NO:34)
[...] | [...] | [...] | AAATACTGTAAATTCAAGGCCCTGATCCGCCAGTCCT [...]
B7 | [...] containing 4/5 N Bulge (ref. 11) | 1,024 | [...] GGCCACTCAGACTATGCAGATGT (SEQ ID NO:38)
B8 | RNase III site containing 5/4 N Bulge (ref. 11) | 256 | AAATACTGTAAATTCAAGGCAGAAGGTCAANNNNAAGGCCACTCAGACTATGCAGATGT (SEQ ID NO:39)
B9 | Extended RNase III site containing 4/5 N Bulge (ref. 11) | 1,024 | AAATACTGTAAATTCAAGGCAGAAGGTCAANNNNNAAGGCCACTGCAGCTCAGACTATGCAGATGT (SEQ ID NO:40)
B10 | Extended RNase III site containing 5/4 N Bulge (ref. 11) | 256 | AAATACTGTAAATTCAAGGCAGAAGGTCAANNNNAAGGCCACTGCAGCTCAGACTATGCAGATGT (SEQ ID NO:41)
B-C Overlap Region | | | GCCTTGAATTTACAGTATTT (SEQ ID NO:80)
Pap RNase E site (ref. 3) | | | TTTGTATTGATC (SEQ ID NO:81)
Consensus sequence (ref. 12) | | | RMNGDWYNMM (SEQ ID NO:82), i.e., (GA)(CA)NG(GUA)-(AU)(CU)N(CA)(CA)
C1 | RNaseE site from Pap Operon | | GCCTTGAATTTACAGTATTTATTTGTATTGATCTCCTTATCCGCTCAAGA (SEQ ID NO:42)
C2 | RNaseE site from Pap Operon | 1 | GCCTTGAATTTACAGTATTTGATCTCAATGCTCTATCAATGTAGGAAGATACCTTATCCGCTCAAGA (SEQ ID NO:43)
C3 | RNaseE site from Pap Operon | 1 | GCCTTGAATTTACAGTATTTCAGTTACCGCTCTATCCTTATCCTTATCCGCTCAAGA (SEQ ID NO:44)
C4 | RNaseE site from Pap Operon | 1 | GCCTTGAATTTACAGTATTTAATTTACCTTTGATTTCCGGATCCTTATCCGCTCAAGA (SEQ ID NO:45)
C5 | RNaseE site from Pap Operon | 1 | GCCTTGAATTTACAGTATTTACAAGTTTTGATCGAGGGACAGTAGTCCTTATCCGCTCAAGA (SEQ ID NO:46)
C6 | RNaseE site from Pap Operon with HP directly downstream | | GCCTTGAATTTACAGTATTTAGCGTTCCGAGTGCATGCCTTATCCGCTCAAGA (SEQ ID NO:47)
C7 | RNaseE site from Pap Operon with Longer HP directly downstream | | GCCTTGAATTTACAGTATTTAGCGTTCCGAGTGCATGCTCGCCCTTATCCGCTCAAGA (SEQ ID NO:48)
C8 | RNaseE site from Pap Operon with Shorter HP directly downstream | | GCCTTGAATTTACAGTATTTACCGAGTGCATGCCTTATCCGCTCAAGA (SEQ ID NO:49)
C9 | Two RNaseE sites | 1 | GCCTTGAATTTACAGTATTTAATGAACTAGCGTTCCGAGTGCATGCCTTATCCGCTCAAGA (SEQ ID NO:50)
C10 | RNase III site containing 5/4 N Bulge (ref. 11) | 1,024 | GCCTTGAATTTACAGTATTTTAGTGGCCTTNNNNNATACTATTCGGTCACCTTATCCGCTCAAGA (SEQ ID NO:51)
C-D Overlap Region | | | CCTTATCCGCTCAAGA (SEQ ID NO:83)
D1 | | | GGATACAGTATCTGCGGTACCGATCTCCAATATCCGCTCTATCTTGAGCGGATAAGG (SEQ ID NO:52)
D2 | | | GGATACAGTATCTGCGGTACCGATCTCCAAGATCCGATTATATTCTTGAGCGGATAAGG (SEQ ID NO:53)
D3 | | | GGATACAGTATCTGCGGTACCACCTTTTCTTGATTTCCGAATCCTATCCACTCAATTTCTTGAGCGGATAAGG (SEQ ID NO:54)
D4 | | | GGATACAGTATCTGCGGTACCGATATCCTAGCGGATCCTATTAACTCTCCGCTCTTGAGCGGATAAGG (SEQ ID NO:55)
D5 | | | GGATACAGTATCTGCGGTACCGATCGGGCCGGTATCCGGTTATCTTGAGCGGATAAGG (SEQ ID NO:56)
D6 | Finish Hairpin to match C6 | | GGATACAGTATCTGCGGTACCCTAGATGCGTTCCGAGTGCATGTCTTGAGCGGATAAGG (SEQ ID NO:57)
D7 | Longer Hairpin to match C7 | 1 | GGATACAGTATCTGCGGTACCCTAGATGCGTTCCGAGTGCATGCTCGCTCTTGAGCGGATAAGG (SEQ ID NO:58)
D8 | Shorter Hairpin to match C8 | | GGATACAGTATCTGCGGTACCCTAGATCCGAGTGCATGTCTTGAGCGGATAAGG (SEQ ID NO:59)
D9 | RNase III site containing 5/4 N Bulge (ref. 11) | 256 | GGATACAGTATCTGCGGTACCCTAGATTTAGTGGCCTTNNNNATACTACTCGGTCTCTTGAGCGGATAAGG (SEQ ID NO:60)
3' Constant Region | | | GGATACAGTATCTGCGGTACC (SEQ ID NO:61)
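
The "# of seq" column in Table 1 matches the number of distinct sequences encoded by the degenerate positions (N) in each oligonucleotide, e.g., five N positions give 4^5 = 1,024 sequences and four give 4^4 = 256. The following minimal sketch, which assumes the standard IUPAC nucleotide codes and uses hypothetical helper names (library_size, to_regex), reproduces that count and expands the consensus sequence RMNGDWYNMM into an explicit pattern. It is an illustration only and not part of the described methods.

```python
from math import prod

# Standard IUPAC nucleotide codes (U is treated as T for DNA oligonucleotides).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(oligo: str) -> int:
    """Number of distinct sequences encoded by a (possibly degenerate) oligo."""
    return prod(len(IUPAC[base]) for base in oligo.upper())


def to_regex(oligo: str) -> str:
    """Expand IUPAC codes into a regular-expression character-class pattern."""
    return "".join(
        IUPAC[b] if len(IUPAC[b]) == 1 else f"[{IUPAC[b]}]" for b in oligo.upper()
    )


# Oligonucleotides A6 and A8 as listed in Table 1.
A6 = "GCCTAGCAAGATCTCCTGATCCNNNNNCACCCGGACATCTGCATAGTCTG"
A8 = "GCCTAGCAAGATCTCCTGATCAGAGGGACAANNNNAAGGTCATTCAGACATCTGCATAGTCTG"

print(library_size(A6), library_size(A8))
print(to_regex("RMNGDWYNMM"))
```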

Table 2

Strain/Plasmid | Genotype/Description | Source/Reference
DH10B | F-, mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, recA1, endA1, araD139, Δ(ara, leu)7697, galU, galK, λ-, rpsL (StrR), nupG | Invitrogen
DP10 | F-, mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, recA1, endA1, araD139, Δ(ara, leu)7697, galU, galK, λ-, rpsL (StrR), nupG, ΔaraFGH, PCpg-AraE | Doug Pitera (unpublished)
DPS | DH10B, ispA::PLAC-(MK, PMK, MPD, idi, ispA); ispA, ΔispC; E. coli strain auxotrophic for mevalonate | This work
p70lg | pBad24 based reporter plasmid (ampr, pBR origin) carrying lacZ and gfpuv. | (Smolke and Keasling 2002)
pBadRFPEC | pBad24 based reporter plasmid (ampr, pBR origin) carrying E. coli optimized DsRed. | This work
p70rg | p70lg with rfpEC replacing lacZ. | This work
p70rg 1-15 | p70rg with various IGR sequences | This work
p70gr | p70rg with rfpEC and gfpuv in reverse order | This work
pGEM-4Z | In vitro transcript cloning vector carrying SP6 and T7 promoter. | Promega
pCSOl | pGEM-4Z with gfp | (Smolke, Carrier et al. 2000)
pBP2 | pGEM-4Z with rfpEC | This work
pPAtoB | pGEM-4Z with 744 bp fragment from atoB | This work
pPHMGS | pGEM-4Z with 744 bp fragment from HMGS | This work
pPHMGR | pGEM-4Z with 744 bp fragment from tHMGR | This work
pBad33 | Cloning vector containing Cmr, pACYC origin, araC, and PBAD promoter | (Guzman, Belin et al. 1995)
pBad33MevT | pBad33 with atoB, HMGS, tHMGR | This work
pBad33MevT-L | pBad33MevT-L with atoB, rfpEC, tHMGR used for cloning libraries | This work
pBad33MevT Sample A-D | pBad33MevT with various IGR-HMGS-IGR fragments inserted between atoB and tHMGR. | This work
pKLN59 | GFP biosensor plasmid containing Ampr, Kanr, oriC, gfp | (Newman, Almeida et al. 2003)

[00158] Construction of plasmids pMevT, pMBIS, pADS, and pBad33MevT is described in, e.g., U.S. Patent No. 7,183,089; U.S. Patent No. 7,172,886; U.S. Patent No. 7,192,751; U.S. Patent Publication Nos. 2006/007946, 2003/0148479, and 2004/0005678; and Martin et al. (2003) Nat. Biotech. 21:796-802. Nucleotide sequences of the plasmids are provided in U.S. Patent No. 7,183,089: pBAD24MevT (SEQ ID NO:1); pBAD33MevT (SEQ ID NO:2); pMevT (SEQ ID NO:3); pMBIS (SEQ ID NO:4); pADS (SEQ ID NO:5); pAtoB (SEQ ID NO:6); pHMGS (SEQ ID NO:7); pHMGR (SEQ ID NO:8); pBAD18HMGR (SEQ ID NO:9); pHMGSR (SEQ ID NO:10); pMevT(C159A), also referred to as pBAD33MevT(C159A) (SEQ ID NO:11); pHMGS(C159A) (SEQ ID NO:12); and tHMGR (SEQ ID NO:13).
[00159] Assembly of TIGR libraries. Cloning of the initial reporter vectors and mevalonate pathway constructs is described in the supplementary material. TIGRs were synthesized using PCR to
assemble oligonucleotides into chimeric DNA sequences. Four hundred (400) picomoles of an
equimolar oligonucleotide mixture were added to a mixture containing 2.5 units of AmpliTaq Gold® polymerase (Applied Biosystems, Foster City, CA). The assembly was conducted over 35 rounds of 15 sec at 95°C, 30 sec at 72°C, and 20 + 5 seconds/cycle at 72°C. The resulting assembly products were purified with a nucleotide removal column (Qiagen) and amplified using end-specific primers
containing BglII and Asp718 restriction sites. The amplified libraries were subcloned into p70rg, the ligation products were electroporated into E. coli DH10B, and the resulting transformants were plated on LB agar with carbenicillin.
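The PCR assembly relies on adjacent oligonucleotides sharing complementary ends: in Table 1, each A oligonucleotide ends with the A-B overlap region, and each B oligonucleotide ends with its reverse complement. A minimal sketch of that check follows; the function names (reverse_complement, share_3prime_overlap) are illustrative assumptions, and the sequences are A1, B1, and the overlap regions as listed in Table 1.

```python
# Complement table for DNA, leaving degenerate N positions as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def share_3prime_overlap(sense: str, antisense: str, overlap: str) -> bool:
    """True if the two oligos can anneal through `overlap` at their 3' ends."""
    return sense.endswith(overlap) and antisense.endswith(reverse_complement(overlap))


AB_OVERLAP = "ACATCTGCATAGTCTG"        # SEQ ID NO:79
BC_OVERLAP = "GCCTTGAATTTACAGTATTT"    # SEQ ID NO:80
A1 = "GCCTAGCAAGATCTCCTGATCCCGGTGCACCCGGACATCTGCATAGTCTG"
B1 = "AAATACTGTAAATTCAAGGCAAGTGTACCTGATCCCGGTGCACCCAGACTATGCAGATGT"

print(share_3prime_overlap(A1, B1, AB_OVERLAP))
# The 5' end of B1 is the reverse complement of the B-C overlap region.
print(B1.startswith(reverse_complement(BC_OVERLAP)))
```

The same check can be applied to any A/B or C/D pair from Table 1 to confirm that a chosen combination of oligonucleotides can be assembled into one chimeric IGR.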
[00160] Reporter library screening. The p70rg library transformants were collected from agar plates in 5 mL of phosphate buffered saline. The cell suspensions were diluted to 10^7 cells/mL prior to screening through Fluorescence Activated Cell Sorting (FACS) on a Beckman-Coulter EPICS Elite Sorter. The extreme 30% representing highly red, highly green, and highly green and red cells were separated from the remaining cell population. Ten million events were collected and resorted to remove any undesired cells that were carried over in the initial sort. The sorted populations were plated on LB agar with carbenicillin and individual colonies were subsequently grown overnight in C-medium. Cultures were back-diluted 1:100 into fresh C-medium with carbenicillin and 0.2% arabinose. After growing for 24 h, the OD600 and GFP and DsRed fluorescence were measured using a Tecan Safire (Maennedorf,
Switzerland) plate reader. GFP and DsRed were measured at excitation / emission wavelengths of
400 nm / 510 nm and 558 nm / 583 nm, respectively. Each fluorescence value was normalized to the number of cells by dividing by the OD600.
[00161] Fifteen selected members of the IGR library were grown overnight in LB medium with
carbenicillin and inoculated into C medium with carbenicillin to an OD600 of 0.016. At an OD600 of
0.05, the cultures were induced with 0.2% arabinose. At an OD600 of 0.4, fluorescence and mRNA
levels were determined. These values were normalized to the values generated by the p70rg control operon and presented above (Fig. 2b and c).
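The two normalizations described above (fluorescence divided by OD600, then expressed relative to the control operon) amount to the following arithmetic. This is a minimal sketch; the function names and the numeric readings are hypothetical and are not data from the experiments.

```python
def normalize_fluorescence(raw_fluorescence: float, od600: float) -> float:
    """Fluorescence per unit of cell density (raw reading divided by OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return raw_fluorescence / od600


def relative_to_control(sample_value: float, control_value: float) -> float:
    """Express a normalized value as a fold change over the control operon."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return sample_value / control_value


# Hypothetical plate-reader readings (arbitrary units), for illustration only.
sample_gfp = normalize_fluorescence(1200.0, 0.5)
control_gfp = normalize_fluorescence(800.0, 0.5)
print(relative_to_control(sample_gfp, control_gfp))
```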
[00162] RNA methods. Messenger RNA analysis was performed by dot-blot hybridization, Northern blot hybridization, and real-time PCR. Details for these methods are found in the supplementary
materials. Briefly, total RNA was isolated using a RiboPure™-Bacteria kit (Ambion, Austin, TX) and quantified on a Bioanalyzer Total RNA Nanochip (Agilent Technologies, Palo Alto, CA). Dot and
Northern blots were generated according to standard protocols (ref. 29). The construction of probe templates is described in the supplementary materials. Probes were synthesized by in vitro transcription from these gel extracted templates with SP6 RNA polymerase (Promega) in the presence of [32P]-labeled α-CTP (PerkinElmer, Wellesley, MA) and unlabeled nucleotide triphosphates (Promega) according to the manufacturer's instructions. All probes were specific for their own genes and did not generate any cross-reactivity to the other genes.
[00163] Megaprimer construction of TIGR-HMGS-TIGR libraries. Construction of IGR libraries between three genes was performed using a megaprimer reaction (ref. 27). Previously amplified libraries were reamplified with one of two primer sets: MevT-A, MevT-B or MevT-C, MevT-D in a standard PCR.
These primers contain either a restriction site for subsequent cloning or a 5' tail complementary to
HMGS. The products of these reactions were purified and used as megaprimers to amplify HMGS in a subsequent PCR. The resulting fragments had the form XhoI-IGR-HMGS-IGR-XbaI. These
fragments were cloned into pBad33MevT-L, a vector containing the remaining genes of the MevT
pathway, AtoB and tHMGR. The ligations were transformed into competent DH10B (Invitrogen) and plated onto LB agar plates containing chloramphenicol and 0.1% glucose. Transformants were pooled as above and their plasmids isolated. Functional operons were selected by transforming the resulting plasmid pool into an E. coli mevalonate auxotroph (ref. 28) and plating on the appropriate media. The
plasmids were then isolated from pooled cultures of the surviving colonies and transformed into
DH10B for further screening.
[00164] Biosensor screening of mevalonate producing libraries. Colonies containing functional
operons were transferred into 96-well plates and grown overnight in C-medium with chloramphenicol and 0.1% glucose. Cultures were back-diluted 1:100 into fresh C-medium with chloramphenicol and 0.2% arabinose. After 24 h, the cells were pelleted and the spent media collected. A culture of the biosensor cells was grown overnight in C-medium with 50 μg/mL kanamycin and 1 mM mevalonate and back-diluted to an OD600 of 0.02. One hundred ninety (190) μL of this culture was combined with either 10 μL of the spent media or 10 μL of a 1:10 dilution of the spent media in separate screening plates. Two wells per plate were run in triplicate as internal controls. Mevalonate controls between
100 μM and 2 mM and the highest mevalonate producer were run on each plate. The biosensor plates were grown for 48 h, during which the GFP fluorescence was periodically measured using a Typhoon laser scanner. Samples were clustered into groups based on their relative fluorescence compared to the best mevalonate producer and average standard deviation from the triplicate wells. The mevalonate producing cells corresponding to the highest fluorescent biosensor wells were subjected to further analysis.
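The volumes given above imply the following dilution of the spent medium in each biosensor well. The sketch below only restates that arithmetic; the function name final_dilution is a hypothetical helper introduced for illustration.

```python
def final_dilution(sample_ul: float, biosensor_ul: float, predilution: float = 1.0) -> float:
    """Fold dilution of spent medium after mixing with the biosensor culture.

    predilution is the fold dilution applied to the spent medium beforehand
    (1.0 for undiluted medium, 10.0 for a 1:10 dilution).
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return predilution * (sample_ul + biosensor_ul) / sample_ul


# 10 uL of spent medium added to 190 uL of biosensor culture.
print(final_dilution(10, 190))        # undiluted spent medium
print(final_dilution(10, 190, 10.0))  # spent medium pre-diluted 1:10
```

Screening each sample at two dilutions extends the range over which the biosensor response can be compared against the mevalonate controls run on the same plate.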
[00165] MevT expression analysis. [...] mevalonate producing library members were assayed for cell growth, mRNA levels, enzyme activity, intracellular acyl-coenzyme A levels, and mevalonate
concentrations. Cultures were inoculated to an OD600 of 0.016 from glucose-repressed overnight
cultures, grown to an OD600 of 0.05, and induced with 0.2% arabinose. Dot blots and Northern blots were prepared in triplicate from total RNA isolated at an OD600 of 0.4. Blots were probed with
appropriate radiolabeled probes generated as described in the supplementary material. tHMGR protein levels were assayed enzymatically by monitoring the disappearance of NADPH (tHMGR cofactor) by measuring the absorbance at 340 nm (ref. 30). Mevalonate levels were determined by GC-MS as described below. Acyl-CoA levels were determined by LC-MS analysis of cell extracts described below.
[00166] Construction of TIGR libraries. The reporter vector, p70rg, used for TIGR screening was constructed by replacing the lacZ gene in p70lg1 with the previously described rfpEC gene2. Transcription of this reporter operon was controlled using the arabinose-inducible promoter, PBAD. In addition, the operon incorporated a 126-nucleotide (nt) IGR containing a protective hairpin 3' of rfpEC and the endoribonuclease (RNase E) site from the Pap operon3 of E. coli, as well as identical RBSs (ACGAGG) upstream of each coding region. The rfpEC gene was amplified using primers RG-Fwd and RG-Rev in a mixture containing Pfu Turbo DNA polymerase (Stratagene, La Jolla, CA). The PCR product and p70lg were digested with NheI and BglII and subsequently ligated together. Ligation mixtures were transformed by electroporation into E. coli DH10B (Invitrogen) and the resulting transformants were plated on LB agar with carbenicillin. The reporter construct p70gr, with the genes in the reverse order, was similarly constructed from p70gl1 and a PCR product derived from primers GR_Fwd and GR_Rev.
[00167] RNA probe template synthesis. The gfp probe template was constructed by gel extracting the 883 basepair (bp) fragment of a PvuII-PstI digest of pCSOl ' . The rfpEC probe template was constructed by inserting the NcoI-EcoRI fragment of pDsRed-Express into pGEM-4Z (Promega). The resulting plasmid, pBP32, was digested with BglII and the 874 bp fragment was isolated by gel extraction. The probes for atoB, HMGS, and tHMGR were constructed by PCR using the following primer pairs: atoB: Apro_Fwd, Apro_Rev; HMGS: Spro_Fwd, Spro_Rev; tHMGR: Rpro_Fwd, Rpro_Rev. The amplified gene fragments were cloned into pGEM-4Z with either Asp718 and PstI or XmaI and PstI. The resulting plasmids, pPAtoB, pPHMGS, and pPHMGR, were digested with PvuII and PstI. The approximately 750 bp fragments were purified by gel extraction and used for in vitro transcription as described in RNA methods.
[00168] Transcript levels. Total RNA was isolated using a RiboPure™-Bacteria kit (Ambion, Austin, TX) and quantified on a Bioanalyzer Total RNA Nanochip (Agilent Technologies, Palo Alto, CA). 2.5 μg of total RNA was transferred to a Nytran SuPerCharge membrane (Schleicher & Schuell, Keene, NH) using a S&S Minifold I multi-well filtration manifold (Schleicher & Schuell) according to the manufacturer's instructions. RNA was denatured by heating at 68°C for 15 min in a solution of 50% formamide, 7% formaldehyde, and 1× SSC. RNA was cooled quickly on ice and diluted to 400 μL total volume with 6× SSC. Samples were transferred to the membrane and RNA was immobilized by baking at 80°C for 1 h. Specific transcript levels were measured by hybridizing appropriate radiolabeled probes. The probe intensity was quantified using a Typhoon phosphorimager (Amersham Biosciences, Piscataway, NJ).
[00169] Real-time PCR quantification. Total RNA was isolated as above. 10 μg of total RNA was reverse transcribed with SuperScript RT (Invitrogen) according to the manufacturer's instructions using gene-specific primers (Table 1 of U.S. Provisional Patent Application No. 60/819,706). Real-time PCR was performed in an iCycler (BioRad, Hercules, CA), with double-stranded DNA monitored continuously using SYBR®-Green (Invitrogen). Dilutions of cDNA and linearized plasmid (standard) were amplified in a PCR with AmpliTaq Gold® by gene-specific primer sets (Table 1 of U.S. Provisional Patent Application No. 60/819,706). The Ct values were determined for each reaction and the quantity of mRNA was determined from a standard curve prepared from linearized plasmid. Replicate reactions of three dilutions of cDNA were run for each sample.
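The standard-curve step can be illustrated with a minimal Python sketch. It assumes the usual log-linear relationship between Ct and starting template quantity; the function names, the ten-fold dilution series, and the Ct values are hypothetical and are not data from this application.

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical standards: ten-fold dilutions of linearized plasmid (arbitrary units).
standards = [1e2, 1e3, 1e4, 1e5]
standard_cts = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, standard_cts)
sample_quantity = quantity_from_ct(25.05, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={sample_quantity:.0f}")
```

Each cDNA dilution of a sample would be passed through `quantity_from_ct` and corrected for its dilution factor before the replicates are averaged.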
[00170] Northern blots. Cultures were grown as described in the methods section. For the Sample 1 Northern blot in Figure 3f and g, rifampicin (250 μg/mL) was added at an OD600 of 0.4 to stop transcription. Five samples were taken between 3.5 and 23.5 minutes after rifampicin addition. For all samples total RNA was isolated and quantified as described above. Northern blots were run according to standard protocols4. Briefly, 2.5 μg of total RNA was denatured at 65°C in the presence of 15.5 μL of denaturing agent (3.5 μL formaldehyde, 2 μL 5× MOPS running buffer, 10 μL formamide) for 30 minutes. The denatured RNA samples were separated on a 1% agarose gel containing 17.5% (vol) formaldehyde. RNA was transferred to a Nytran SuPerCharge membrane using a turboblotter transfer system (Schleicher & Schuell) with 20× SSC as the transfer buffer. The RNA was subsequently crosslinked to the membrane in a UV crosslinker and transcripts were identified and quantified as described above.
[00171] Enzyme free cloning of pBad33MevT-L. The initial vector used in assembling the library of mevalonate operon constructs was built using enzyme free cloning5. Primer sets were designed to amplify pBad33MevT and rfpEC. The pBad33MevT PCR products started at the 3' end of atoB and proceeded upstream around the plasmid to the 5' end of tHMGR. Two restriction sites, XhoI and NotI, were incorporated into the 5' ends of these PCR products to enable TIGR-HMGS-TIGR fragments to be cloned into the vector. The rfpEC gene was cloned initially as junk DNA to simplify the gel extraction of cut vector prior to library cloning. The rfpEC gene also served as a negative control for ligations of libraries, similar to traditional blue-white screening.
[00172] GC-MS Analysis of Mevalonate. Mevalonate (mevalonic acid) concentration in cultures of engineered E. coli was determined by GC-MS analysis. 560 μL of E. coli culture was mixed with 140 μL of 500 mM HCl in a glass GC vial to convert mevalonate from mevalonic acid to mevalonic acid lactone. 700 μL of ethyl acetate, spiked with 50 μg/mL (-)-trans-caryophyllene as an internal standard, was added to each vial and the samples were then shaken at maximum speed on a Fisher Vortex Genie 2 mixer (Fisher Scientific) for three minutes. The ethyl acetate extract of acidified culture was diluted 1:10 with fresh ethyl acetate in a clean GC vial before analysis. Diluted ethyl acetate extracts were analyzed using an Agilent Technologies 6890 gas chromatograph with an Agilent Technologies model 5973 mass selective detector (GC-MS) operating in electron impact mode. The GC column used was an Agilent Technologies DB-5ms (30 m x 250 μm x 0.25 μm). Helium was used as the carrier gas at a constant flow of 1 mL/min and 1 μL injections split 1:5 were performed. The injection port was maintained at 250°C, the MS source temperature was maintained at 230°C, and the MS quad temperature was held constant at 150°C. The oven cycle for each sample and the ions monitored were modified from the method of B. H. Woollen et al.7. The column temperature profile was 70°C for 2 minutes; 15°C/min to 185°C; 30°C/min to 300°C; and held at 300°C for 3 minutes. The selected ions monitored were m/z 71 and 58 for mevalonic acid lactone, and m/z 189 and 204 for (-)-trans-caryophyllene. Retention time, mass spectrum and concentration of extracted mevalonic acid lactone were confirmed using commercial DL-mevalonic acid lactone (Sigma). Total mevalonate production was defined as the concentration of mevalonate in a culture sample. Specific mevalonate production was defined as the total mevalonate concentration divided by the cell density (OD600).
[00174] LC-MS Analysis of Acyl-coenzyme A intermediates. The concentrations of free coenzyme A, acetyl-CoA, acetoacetyl-CoA, and 3-hydroxy-3-methyl-glutaryl-CoA (HMG-CoA) were determined by LC-MS analysis of trichloroacetic acid (TCA) culture extracts taken during the exponential phase of growth. To simultaneously and rapidly quench cellular metabolism, isolate E. coli cells from growth media and extract metabolites, cells were centrifuged through a layer of silicone oil into a denser solution of TCA by a method similar to that of M. Shimazu et al.8. Using 15 mL Falcon tubes (Fisher Scientific), 2 mL of silicone oil (AR200 from Fluka) was layered over 0.5 mL of 10% trichloroacetic acid (Fluka) in deuterium oxide (Sigma). Tubes were stored on ice until time of sampling. To each tube, 10 mL of cell culture was carefully added above the silicone oil layer. Tubes were then centrifuged at 4°C for 3 min at 10,000 g. The spent medium was carefully removed by aspiration and the TCA extract layer was transferred to a 2 mL centrifuge tube using a small gauge needle and syringe. To neutralize the TCA, 1 mL of ice cold 0.5 M tri-n-octylamine in 1,1,2-trichloro-1,2,2-trifluoroethane (both Aldrich) was added, and the tubes were vortexed and then centrifuged at maximum speed for 1 min to separate the layers. The aqueous layer was removed for analysis by LC-MS.
[00175] The neutralized TCA extract was analyzed using a Hewlett-Packard 1100 series LC-MS using electrospray ionization. A 50 μL sample was separated on a C-18 reversed phase HPLC column (250 x 2.1 mm Inertsil 3 μm ODS-3 by Varian) using a two solvent gradient system adapted from J.J. Dalluge et al.9. Solvent A was 100 mM ammonium acetate buffer at pH 6 and Solvent B was 70% Solvent A and 30% acetonitrile. The HPLC column was equilibrated each run with 8% Solvent B (92% Solvent A) for 10 minutes. Using a 0.25 mL/min flow rate and linear gradients as indicated, the eluent program was the following: 8% Solvent B at 0 min to 50% Solvent B at 5 min, gradient increase to 100% Solvent B at 13 min, isocratic at 100% Solvent B until 19 min, gradient returning to 8% Solvent B at 24 min. The resolved metabolite samples were analyzed by electrospray ionization mass selective detector (ESI-MS) operated in positive mode. The following ESI-MS parameters were used: drying gas, 12 L/min; nebulizer pressure, 60 psig; drying gas temperature, 300°C; capillary voltage, 2500 V. Selected ions corresponding to the protonated molecular ion of each metabolite were monitored: coenzyme A, m/z 768; acetyl-CoA, m/z 810; acetoacetyl-CoA, m/z 852; 3-hydroxy-3-methyl-glutaryl-CoA (HMG-CoA), m/z 912. Retention times, mass spectra and concentrations of extracted metabolites were confirmed using commercial standards (Sigma).
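The eluent program above is a piecewise-linear gradient, which can be written out as a short Python sketch. The breakpoints are taken from the text; the function and constant names are hypothetical, and the acetonitrile calculation simply assumes that Solvent B is 30% acetonitrile as stated.

```python
# Gradient breakpoints (time in min, % Solvent B) as listed in the eluent program.
GRADIENT = [(0.0, 8.0), (5.0, 50.0), (13.0, 100.0), (19.0, 100.0), (24.0, 8.0)]
ACETONITRILE_FRACTION_IN_B = 0.30  # Solvent B is 70% Solvent A and 30% acetonitrile

def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate % Solvent B at time t_min within the program."""
    if t_min < gradient[0][0] or t_min > gradient[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def percent_acetonitrile(t_min):
    """Overall acetonitrile content of the mobile phase at time t_min."""
    return ACETONITRILE_FRACTION_IN_B * percent_b(t_min)

for t in (0, 2.5, 5, 13, 16, 24):
    print(f"{t:>5} min: {percent_b(t):5.1f}% B, {percent_acetonitrile(t):4.1f}% acetonitrile")
```

Under these assumptions the mobile phase never exceeds 30% acetonitrile, which is reached during the isocratic hold at 100% Solvent B.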
RESULTS
[00176] Described herein is a method for tuning the expression of multiple genes within operons by generating libraries of Tunable InterGenic Regions (TIGRs) through recombination of various post-transcriptional control elements and screening for the desired relative expression levels. TIGRs were employed to vary the relative expression of two reporter genes over a 100-fold range and to balance expression of three genes in an operon encoding a heterologous mevalonate biosynthetic pathway, resulting in a seven-fold increase in production. This technology is generally useful for tuning expression of multiple genes in synthetic operons, both in prokaryotes and eukaryotes.
[00177] The synthesis of natural or unnatural products in microorganisms usually involves the introduction of several genes encoding the enzymes of a metabolic pathway1,2. In order to produce these molecules at commercially relevant levels, the genes need to be expressed in a coordinated fashion at appropriately balanced levels to avoid bottlenecks in the biosynthetic pathway that result in suboptimal yields and can lead to the accumulation of potentially toxic intermediates. Similarly, the introduction into a cell or manipulation of multi-subunit proteins (e.g., F1F0-ATPase, proteasomes, ion channels, etc.) usually involves coordinated expression of several genes to produce the subunits at the appropriate stoichiometries3.
[00178] Grouping multiple, related genes into operons, as is done naturally in prokaryotes5, is a convenient means for regulating several genes simultaneously without the need for multiple promoters. Internal ribosomal entry sequences (IRESs) from eukaryotic viruses and host stress response pathways perform a similar function and have been harnessed to create operons for heterologous expression of genes in eukaryotes6-9. With a single promoter controlling the transcription of several genes, relative expression of each open reading frame in the operon is controlled by altering post-transcriptional processes (transcription termination10,11, mRNA stability12,13, and translation initiation14-16). Previous work has demonstrated that sequences inserted into the intergenic region (IGR) of bacterial operons can direct the processing and segmental stability of a transcript containing multiple coding regions17,18. This type of directed mRNA processing resulted in differential production of the proteins encoded in the operon depending on the nature of the IGR between the coding regions.
[00179] One of the major obstacles to implementing this type of control is the difficulty in designing these control elements because of the many interrelated variables involved in transcription termination, mRNA stability, and translation initiation19. While our previous work17,18 demonstrated that it is possible to differentially control the protein levels encoded by two or more genes in an operon using IGR sequences, it is difficult or impossible with the current state of knowledge to choose a priori the sequences that will balance expression of the genes in an operon. Here, we demonstrate a method for simultaneously tuning the expression of several genes within operons by generating and screening through large libraries of Tunable InterGenic Regions (TIGRs) containing various control elements (mRNA secondary structures, RNase cleavage sites, RBS sequestering sequences, etc.). An operon reporter system (Fig. 1a) containing the genes encoding the red fluorescent protein DsRed (rfpEC)20,21 and the green fluorescent protein GFP (gfpuv) was designed to facilitate high-throughput measurement of relative gene expression resulting from the libraries of TIGRs.
[00180] A large library of TIGR sequences (>10^4) was assembled combinatorially from four sets of oligonucleotides (Table 1 of U.S. Provisional Patent Application No. 60/819,706) using polymerase chain reaction (PCR). Each oligonucleotide contained two 15-nt sequences that hybridized to a corresponding sequence in the neighboring oligonucleotide, such that a series of chimeric DNA molecules containing oligonucleotides from each of the four sets was created after several rounds of PCR (Fig. 1b). Between the hybridization sequences at either end of each oligonucleotide was a variable sequence that provided the diversity of features designed into the library. PCR amplification of this DNA pool with end-specific oligonucleotides enriched the population with full-length TIGRs containing a member from each set of oligonucleotides (Fig. 1b). Specific restriction sites incorporated into the amplification primers were used to clone the TIGR library between the two reporter genes.
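The combinatorial assembly can be pictured with a toy Python model. This is only a sketch under simplifying assumptions: hybridization is treated as an exact match between the last 15 nt of one oligonucleotide and the first 15 nt of the next, strand complementarity and PCR extension are ignored, and all linker and variable sequences below are invented placeholders rather than the oligonucleotides of Table 1.

```python
from itertools import product

OVERLAP = 15  # nt shared between neighboring oligonucleotides, per the description

def join(upstream, downstream, overlap=OVERLAP):
    """Merge two sequences that share `overlap` nt at their junction."""
    if upstream[-overlap:] != downstream[:overlap]:
        raise ValueError("oligonucleotides do not share a hybridization sequence")
    return upstream + downstream[overlap:]

def assemble(sets, overlap=OVERLAP):
    """Return every full-length chimera containing one member from each set."""
    library = []
    for combo in product(*sets):
        seq = combo[0]
        for oligo in combo[1:]:
            seq = join(seq, oligo, overlap)
        library.append(seq)
    return library

# Hypothetical 15-nt hybridization sequences linking regions A-B, B-C and C-D.
LINK_AB = "GATTACAGATTACAG"
LINK_BC = "CCTTGGAACCTTGGA"
LINK_CD = "TTGACCATGGTCAAG"

# Hypothetical variable regions flanked by the shared linkers.
set_a = [v + LINK_AB for v in ("AAAA", "CCCCCC")]
set_b = [LINK_AB + v + LINK_BC for v in ("GG", "TTTT", "ACGT")]
set_c = [LINK_BC + v + LINK_CD for v in ("A", "C")]
set_d = [LINK_CD + v for v in ("GGGG", "TTTT")]

library = assemble([set_a, set_b, set_c, set_d])
print(len(library), "chimeras; first:", library[0])
```

Because each set contributes exactly one member to every chimera, the number of distinct full-length products in this model is the product of the set sizes (here 2 x 3 x 2 x 2).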
[00181] Figures 1A-E. TIGR assembly and reporter operon. (a) The reporter plasmid p70rg consists of the reporter genes rfpEC and gfpuv downstream of the PBAD promoter. The black scissors indicate the location of the cloning site used to insert the library of TIGRs. (b) Layout of the TIGR assembly reaction. Members of each region (A-D) anneal to members of neighboring regions and are extended by PCR. Eventually full-length TIGRs containing members of each region are assembled and then amplified using end-specific primers containing the restriction sites for cloning. (c) The oligos used for TIGR construction were designed to make three separate regions. A-B form 5' hairpins, B-C form the single-stranded region with RNase E sites, and C-D form a 3' hairpin. (d) When transcribed these sequences form a variety of structures incorporating many elements that affect gene expression. (e) The TIGR sequence is designed to be processed at a cleavage site between two secondary structures. Cleavage results in two independent secondary transcripts whose stability is determined by the remaining TIGR sequence.
Table 3, below, provides Sequences of TIGRs from library samples. Below are the sequences and predicted secondary structures of the characterized samples from the fluorescent reporter library and the mevalonate library. They are written from the stop codon of the upstream (5') gene through the start codon of the downstream (3') gene.

Table 3
Fluorescent Reporter Library
Plasmid | 5' Gene | IGR | 3' Gene | Length | SEQ ID NO
p70rg | rfpEC | UAGAUCUGAUCGUUGCGGGCGGGGCGAGAGU CUCGCCCCGCCCGCGACCGCGGUGAAAAUAC GAGAAUAUUAUUUGUAUUGAUCUCCUAGGCG GGGUACCGUAUUUUGGAUGAUAACGAGGCGC AAAAAAUG | gfpuv | 133 | 62
p70rg-1 | rfpEC | UAGAUCUCCUGAUCAGAGGGACAAGGAGGAA GGUCAUUGCAGCUCAGACAUCUGCAUAGUCU GAGCUGCAGUGGCCUUCCCUUUGACCUUCUG CCUUGAAUUUACAGUAUUUAUUUGUACUGAU CUCCUUAUCCGCUCAAGACAUGCACUCGGAA CGCAUCUAGGGUACCGAUCUGGAUACAGUAU CUGCGGUACCGUAUUUUGGAUGAUAACGAGG CGCAAAAAAUG | gfpuv | 229 | 63
p70rg-2 | rfpEC | UAGAUCUUGCUAGGCAAGGCGGAUACAGUAU CUGCGGUACCGCAGAUACUGUAUCCACCCGG ACAUCUGCCUAGCAAGAUCUCCUGAUCCUAG CGCACCCGGACAUCUGCAUAGUCUGAGAGAG GGCUUACCUUAAGUACUCUCGAGGUGCCUUG AAUUUACAGUAUUUAGCGUCCCGAGUGCAUG CUCGCCCUUAUCCGCUCAAGAAUAUAAUCGG AUCUUGGAGAUCGGUACCGUAUUUUGGAUGA UAACGAGGCGCAAAAAAUG | gfpuv | 268 | 64
p70rg-3 | rfpEC | UAGAUCUCCUGAUCCACCCGGACAUCUGCAU AGUCCGGGUGCACCGGGAUCAGGUACACUUG CCGUGAAUUUACAGUAUUUAUUUGUAUUGAU CUCCUUAUCCGCUCAAGAUAACCGGAUACCG GCCCGAUCGGUACCGUAUUUUGGAUGAUAAC GAGGCGCAAAAAAUG | gfpuv | 171 | 65
p70rg-4 | rfpEC | UAGAUCUCCUGAUCCACCCGGACAUCUGCAU AGUCUGGGUGCACCGGGAUCAGGUACACUUG CCUUGAAUUACAGUAUUUAAUGAACUAGCGU UCCGAGUGCAUGCCUUAUCCGCUCAAGAAUA UAAUCGGAUCUUGGAGAUCGGUACCGCAGAU ACUGUAUCCACAUAAGGUGCCUAGCAAGAUC UUGCUAGGCGCCUUAUGUGGAUACAGUACCU GCGGUACCGUAUUUUGGAUGAUAACGAGGCG CAAAAAAUG | gfpuv | 258 | 66
p70rg-5 | rfpEC | UAGAUCUCCUGAUCCACCCGGACAUGCCUAG CAAGAUCACCUGAUCCACCCGGACAUGCCUA GCAAGAUCUCCUGAUCCACAUAAGGCCGGAC AUCUGCAUAGUCUGGGUGCACCGGGAUCAGG UACACUUGCCGUGAAUUUACAGUACUUACAA GUUUUGAUCGAGGGACAGUAGUCCUUAUCCG CUCAAGAUAGAGCGGAUAUUGGAGAUCGGUA CCGUAUUUUGGAUGAUAACGAGGCGCAAAAA AUG | gfpuv | 252 | 67
p70rg-6 | rfpEC | UAGAUCUAGCUAGGCAGUGCUCCUGGAUACA GUAUCUGCGGUACCGUAUUUUGGAUGAUAAC GAGGCGCAAAAAAUG | gfpuv | 78 | 68
p70rg-7 | rfpEC | AGAUCUUGCUAGGCGCUAUGCAGAUGUCCGG GUGGAUACAGUAUCUGCGGUACCGUAUUUUG GAUGAUAACGAGGCGCAAAAAAUG | gfpuv | 86 | 69
p70rg-8 | rfpEC | UAGAUCUCCUGAUCCACUUGCCUAGCAAGAU CUUGCUAGGCGAGGUGGAUACAGUAUCUGCG GUACCGUAUUUUGGAUGAUAACGAGGCGCAA AAAAUG | gfpuv | 100 | 70
p70rg-9 | rfpEC | AGAUCUCCUGAUCCGAGCACACCCGGACAUC UGCAUAGUCUGGGUGGUCGCGCACCGGGAUC AGGUGCCGUGAAUUUACAGUAUUUACAAGUU CUGAUCGAGGGACAGUAGUCCUUAUCCGCUC AAGAAUAUAAUCGGAUAUUGGAGAUCGGUAC CGUAUUUUGGAUGAUAACGAGGCGCAAAAAA UG | gfpuv | 189 | 71
p70rg-10 | rfpEC | UAGAUCUCCUGAUCCACCCGGACAUCUGCCU AGUCUGGGUGGUCGCGCACCGGGAUCAGGUG CCGUGAAUUUACAGUAUUUAAUUUACCUUUG AUUUCCGGAUCCUUAUCCGCUCAAGAAUAUA AUUGGAUCUUGGAGAUCGGUACCGUAUUUUG GAUGAUAACGAGGCGCAAAAAAUG | gfpuv | 179 | 72
p70rg-11 | rfpEC | AGAUCUUGCUAGACAUGGGGAUACAGUAUCU GCGGUACCGAUCAGGUGGAUACAGUAUCUGC GGUACCGUAUUUUGGAUGAUAACGAGGCGCA AAAAAUG | gfpuv | 100 | 73
p70rg-12 | rfpEC | AGAUCUCCUGAUCCACCCGGACAUCUGCAUA GUCUGGGCCAAUCUGAGGACUGGCGUAUCAG GGCCGUGAAUUUACAGUAUUUACAAGUUUUG AUCGAGGGACAGUAGUCCUUAUCCGCUCAAG AUAAUAAUCGGAUCUUGGAGAUCGGUACCGU AUUUUGGAUGAUAACGAGGCGCAAAAAAUG | gfpuv | 185 | 74
p70rg-13 | rfpEC | UAGAUCUCCUGAUCCACAUAAGGCCGGACAU CUGCAUAGUCUGGGUGCACCGGGAUCAGGUA CACUUGCCGUGGAUUUACAGU,AUUUACAAG UUUUGAUCGAGGGACAGUAGUCCUUAUCCGC UCAAGAGCGGAGAGUUAAUAGGAUCCGCUAG GAUAUCGGUACCGUAUUUUGGAUGAUAACGA GGCGCAAAAAAUG | gfpuv | 199 | 75
p70rg-14 | rfpEC | UAGAUCUCCUGAUCCACCCGGACAUCUGCAU AGUCUGGGCCAAUCUGAGGACUGGCGUAUCA GGGCCGUGAAUUUACAGUAUUUACAAGUUUU GAUCGAGGGACAGUAGUCCUUAUCCGCUCAA GAUAAUAAUCGGAUCUUGGAGAUCGGUACCG UAUUUUGGAUGAUAACGAGGCGCAAAAAAUG | gfpuv | 187 | 76
p70rg-15 | rfpEC | UAGAUCUCCUGAUCCACACCCGGACAUCUCC AUAGUCUGGGCCAGUCUGAGGACUGGUGGAU CAGGGCCGUGAAUUUACAGUAUUUCAGUUAC CGCUCUAUCCUUAUCCUUAUCCGCUCAAGAG CAGAGAGUUAAUAGGAUCCGCUAGGAUAUCG GUACCGUAUUUUGGAUGAUAACGAGGCGCAA AAAAUG | gfpuv | 192 | 77

[00183] The TIGR pool that resulted from the assembly of the oligonucleotides was designed to contain three regions, two variable hairpin sequences flanking a single-stranded region incorporating various RNase E sites22,23 (Fig. 1c and 1d). When transcribed, those TIGR sequences that contained a strong endonuclease site would be cleaved, generating two secondary transcripts whose stability could be individually modulated by the secondary structures flanking the RNase site17 (Fig. 1e). The TIGR sequences also incorporated mRNA secondary structures of various lengths, GC contents, asymmetries, and mismatched bulges. More than 10^4 possible TIGRs were generated from the nine to eleven oligonucleotides in each of the four sets. Inclusion of degeneracies in some of the oligonucleotides and the use of error-prone PCR conditions increased the number of possible sequence combinations, but the actual size of the library was likely determined by the number of clones attained by electroporation.
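As a rough check on the library size, the number of designed combinations is the product of the four set sizes. The Python sketch below brackets that number using the stated range of nine to eleven oligonucleotides per set; the exact set sizes are not given in this section, so the mixed example is hypothetical, and degeneracies and error-prone PCR are not modeled.

```python
from math import prod

def designed_combinations(set_sizes):
    """Number of chimeras with one member drawn from each oligonucleotide set."""
    return prod(set_sizes)

lower_bound = designed_combinations([9, 9, 9, 9])       # every set at its minimum size
upper_bound = designed_combinations([11, 11, 11, 11])   # every set at its maximum size
mixed_example = designed_combinations([9, 10, 11, 11])  # hypothetical mix of set sizes

print(lower_bound, mixed_example, upper_bound)
```

The bounds (6561 and 14641) bracket 10^4, so whether the designed combinations alone exceed 10^4 depends on the actual set sizes, with degenerate positions and PCR errors adding further variants.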
[00184] Cells harboring the reporter libraries produced a wide range of fluorescence phenotypes. To further characterize the library, cells were collected and sorted via FACS to isolate highly fluorescent cells. Cultures of the sorted cells were grown in 96-well plates to determine the fluorescence of each construct after 24 hours. The relative fluorescence ratio, red to green, varied by two orders of magnitude, from 45:1 DsRed:GFP to 3:1 GFP:DsRed (Fig. 2a), depending on the IGR. As expected, the distributions of relative mRNA levels and fluorescence ratios were not identical, indicating that TIGRs may also affect translation (Fig. 2b and c). In addition, the TIGR sequences had a stronger influence on the expression of the gene 3' to the TIGR than on the gene 5' to the TIGR.
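The "two orders of magnitude" figure follows directly from the two extreme ratios once both are expressed in the same direction (DsRed over GFP). A short Python check, with a hypothetical helper name:

```python
import math

def fold_range(max_ratio, min_ratio):
    """Fold difference between the two extremes of a ratio distribution."""
    return max_ratio / min_ratio

red_over_green_max = 45 / 1  # 45:1 DsRed:GFP
red_over_green_min = 1 / 3   # 3:1 GFP:DsRed, rewritten as DsRed:GFP

span = fold_range(red_over_green_max, red_over_green_min)
print(f"{span:.0f}-fold range, {math.log10(span):.2f} orders of magnitude")
```

The extremes therefore span roughly 135-fold, or a little over two orders of magnitude.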
[00185] Figures 2A-C. Expression from TIGR RG library. Colonies of fluorescent strains were imaged using a laser scanner detecting fluorescence at 526 nm and 580 nm. The image was an overlay of the two signals. (a) DsRed:GFP fluorescence ratios of the library after 24 hours of growth. (b) Fifteen clones were assayed for fluorescence during exponential growth. Shown are DsRed:GFP fluorescence ratios normalized to the fluorescence of p70rg. (c) The ratios of rfpEC and gfp mRNA from exponentially growing cells were determined by real-time PCR and dot blot hybridization. Shown are the rfpEC:gfp mRNA ratios determined by real-time PCR. Note the similarity in the distributions of the two samples, indicating that the TIGR is altering mRNA levels which in turn alter fluorescence ratios. RG refers to the original operon vector p70RG described in the text.
[00186] The operons from fifteen plasmids were sequenced to characterize mechanisms responsible for the resulting expression ratios. Many of the sequences were predicted to form secondary structures (as predicted by the Mfold web server24) with variably-sized hairpins and single-stranded regions. The secondary structures and the size distribution of transcripts (viewed on Northern blots, Fig. 3) suggest that at least three different mechanisms are responsible for the observed differences in gene expression in the library: differential mRNA processing (Fig. 3a-c), transcription termination (Fig. 3d-g), and RBS sequestration (Fig. 3h-i).
[00187] Intergenic cleavage and differential processing of secondary mRNAs. Incorporation of RNase sites between the two coding regions of a transcript (Fig. 3a) has been shown to decouple the stability of the two coding regions and thus the production of the corresponding proteins17. Of the TIGR library samples studied, most had two distinct transcripts with lengths of approximately 1800 and 800 bases, corresponding to the predicted primary (gfp and rfpEC coding regions on one transcript) and secondary (gfp or rfpEC coding region alone) transcripts (Fig. 3a-c), indicating processing of the primary transcripts by endonucleases at cleavage sites placed in the TIGR between the coding regions. Cleavage in the TIGR would result in two independent secondary transcripts whose stability, and ultimately the amount of protein produced from them, would be dictated by the remaining TIGR sequences at the 3'- and 5'-ends of the separated transcripts. Differences in the intensities of the primary and secondary transcripts support differential transcript stability (Fig. 3b and 3c).
[00188] Transcription termination. Large ratios in the expression of the first gene relative to the second gene could be generated by increasing the frequency of termination prior to transcription of the 3' gene. In the TIGR library, the overall distribution of fluorescence ratios was skewed, with nearly 70% demonstrating more red fluorescence (first gene) than green fluorescence (second gene) (Fig. 2a). Premature transcription termination due to large intergenic secondary structures between the coding regions, best demonstrated by sample 1 (Fig. 3d), is the most likely explanation for the skew in expression in favor of the first gene. The predicted structure of the sample 1 TIGR consists of two very large hairpins with over 20 bp in each stem (Fig. 3e). Northern blots revealed a large quantity of stable rfpEC transcript corresponding to the size of a single gene (Fig. 3f) and very little gfp transcript, either full-length or containing gfp alone (Fig. 3g). The strong hairpins in this TIGR can serve two functions, premature transcription termination and protection from exoribonucleases25, resulting in an increased ratio of the first gene product to the second gene product (red:green).
[00189] Ribosome binding site sequestration. RBS sequestration resulted in a number of samples showing a significant difference between the relative mRNA and protein levels (Fig. 2b and c). In many of these samples, the gfp RBS was part of a secondary structure, such that cis basepairing of the RBS may have prevented the ribosome from loading onto the transcript. Secondary structures incorporating the RBS have been previously shown to reduce the rate of translation initiation14-16,26. The structure of sample 3, which best illustrated this mechanism (Fig. 3h), consisted of four hairpins, the last of which incorporated the RBS at the base of its stem (Fig. 3i). Despite the significant levels of gfp mRNA present (both full-length and secondary transcripts containing gfp only, Fig. 3c) there was very little green fluorescence (Fig. 2c), suggesting that the ribosome was not able to load onto the RBS 5' of the second coding region.
[00190] Figures 3A-I. TIGR effects on expression. (a) Differential RNase protection mechanism of TIGR samples. Cleavage of the TIGR generates two secondary transcripts. (b) Northern blot of total RNA from exponentially growing cultures of numbered clones probed for rfpEC. (c) Northern blots of the same RNA in (b) probed for gfp. (d) Premature transcription termination mechanism of sample 1. The large IGR structure prevents transcription of the gfp gene. (e) The predicted secondary structure of sample 1's TIGR contains two large hairpins. (f) rfpEC probed Northern blot of total RNA harvested at various timepoints (bottom) after transcription was stopped by adding rifampicin. (g) Northern blots of the same RNA in (f) probed for gfp. The primary (1°) and secondary (2°) transcripts are labeled accordingly. (h) RBS sequestration mechanism of sample 3. (i) Predicted secondary structure of the last hairpin of sample 3's TIGR. Note the location of the RBS at the base of the hairpin (dashed box). Despite the gfp mRNA present in (c, sample 3) there was less green fluorescence relative to p70RG than expected (Fig. 2b and 2c).
[00191] Following successful demonstration of the ability to control the expression of reporter genes in an operon, the TIGR approach was applied to optimize flux through a metabolic pathway. As described above, balancing the expression of genes in multi-step metabolic pathways is challenging and time consuming, yet important in order to increase the efficiency of resource utilization, decrease the metabolic load of producing pathway enzymes, and decrease the accumulation of potentially toxic intermediates. One pathway that would benefit immensely from balanced gene expression is the mevalonate pathway that was introduced into E. coli to produce amorpha-4,11-diene, a precursor to the anti-malarial drug artemisinin2. Expression of these genes created a heterologous mevalonate pathway that transforms acetyl-CoA into the isoprenoid precursors, isopentenyl pyrophosphate and dimethylallyl pyrophosphate (Fig. 4a and 4b). Although an operon consisting of the first three genes in the mevalonate pathway (atoB, HMGS, tHMGR; referred to as MevT) appears to limit flux through the pathway, overexpression of the operon led to reduced growth and reduced product formation, potentially due to the toxicity of unbalanced gene expression. Directed approaches to remedy this imbalance would involve the difficult and time-consuming construction and analysis of many clones utilizing various means of expression control regulating gene copy number, transcription initiation, and translation initiation.
[00192] Instead, application of the combinatorial approach described above reduced pathway optimization time and increased the number and variety of mechanisms explored to balance the MevT pathway. Specifically, the TIGR libraries were used to generate a series of operons that were subsequently screened for increased mevalonate production. A megaprimer PCR approach27 was used to simultaneously place TIGR libraries between the first and second genes and between the second and third genes of the MevT operon (Fig. 4c). Functional operons from the libraries were selected by transforming the plasmid library into an E. coli strain engineered to be auxotrophic for mevalonate28. The plasmids isolated from this enriched pool were recovered and subsequently transformed into a production strain (DH10B) and screened for the highest-producing version using a mevalonate auxotroph transformed with a plasmid harboring a constitutively-expressed gfp as a biosensor for mevalonate28. This fluorescence-based screen incorporates a GFP-producing sensor strain whose growth is dependent on the level of mevalonate secreted into the surrounding medium by the producer strain. Of the more than 600 colonies screened in this manner, 18% were significantly (more than three standard deviations greater, calculated from two sets of triplicate controls on each plate) more fluorescent than the original operon (control without TIGRs).
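The three-standard-deviation criterion can be sketched in Python as follows. This is one plausible reading, not the procedure as actually coded by the authors: it assumes the cutoff is the mean fluorescence of the original-operon control wells plus three times the standard deviation averaged over the two triplicate control sets on the plate. All names and fluorescence values are hypothetical.

```python
from statistics import mean, stdev

def average_sd(triplicate_sets):
    """Average the sample standard deviations of the triplicate control sets."""
    return mean([stdev(wells) for wells in triplicate_sets])

def call_hits(samples, reference_wells, triplicate_sets, n_sd=3.0):
    """Return sample names whose fluorescence exceeds reference mean + n_sd * SD."""
    cutoff = mean(reference_wells) + n_sd * average_sd(triplicate_sets)
    return [name for name, value in samples.items() if value > cutoff]

# Hypothetical plate: original operon control in triplicate, plus a second triplicate set.
original_operon = [100, 110, 90]
second_control = [200, 220, 180]
samples = {"A1": 150, "A2": 140, "A3": 300}

hits = call_hits(samples, original_operon, [original_operon, second_control])
print(hits)
```

With these illustrative numbers the cutoff is 145 fluorescence units, so wells A1 and A3 would be carried forward while A2 would not.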

[00193] The four best producers were examined to determine the mechanism responsible for the increased mevalonate production. The most obvious difference between the library samples and the pBad33MevT control was the improved growth of the library strains. Cultures of uninduced DH10B harboring pBad33 (empty vector) or pBad33MevT (original pathway, no TIGRs) grew similarly. However, once induced, cultures harboring pBad33MevT grew much slower than those harboring pBad33. While the final culture densities were similar, the empty vector reached stationary phase nearly 12 hours earlier (Fig. 4d). Uninduced cultures of DH10B harboring each of the library plasmids grew similarly to the other uninduced strains, but once induced grew faster than pBad33MevT and slower than pBad33. Mevalonate production from the library strains was seven-fold greater than that from pBad33MevT, averaging 6 mM after 24 hours (Fig. 4e). The specific production (Table 4) of the library strains was twice that of the pBad33MevT control, indicating that both the improved growth and increased flux through the pathway contributed to the increased production.

[00194] Figures 4A-F. Mevalonate pathway optimization using the TIGR method. (a) Biosynthetic pathway of heterologous artemisinin production. Pink box highlights the yeast mevalonate pathway converting Acetyl-CoA (AcCoA) to the isoprenoid precursors isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) through mevalonate (MEV). Dashed box, blown up in (b), represents the first three genes in the mevalonate pathway, atoB (acetoacetyl-CoA thiolase), HMGS
(HMG-CoA synthase), and tHMGR (truncated HMG-CoA reductase). These three genes have been clustered into one operon to make the vector pBad33MevT (c). (d) Growth of the top four producers and the sampling points for induction (green), RNA (red), acyl-CoAs, and mevalonate assays (orange). The data for each of the sampling points are shown in Table 4. The library samples all grow faster than the original vector (pBad33MevT) but slower than the empty vector (pBad33). (e) Total mevalonate production (mM) of the top clones determined by GC-MS: striped bars (production at 11 hours) and solid bars (production at 23 hours). Specific mevalonate production (normalized to OD600) at these time points is reported in Table 4. (f) Structure of the TIGR of library sample A. Note the position of the
RBS upstream of HMGS in a hairpin. This type of basepairing can lead to reduced translation initiation. Abbreviations: HMG-CoA - hydroxymethylglutaryl-CoA, FPP - farnesyl pyrophosphate, DMAPP - dimethylallyl pyrophosphate, CIT - citrate, MAL - malate, OAA - oxaloacetate.
Table 4

                pBad33      pBad33MevT  Sample A    Sample B    Sample C    Sample D
mRNA levels (ng)
atoB            1.5         6.0         5.7         6.8         6.1         8.6
hmgs            0.2         4.3         2.3         2.1         2.2         3.5
thmgr           0.3         6.3         3.1         3.9         3.0         4.0
Enzyme Activity (μmol/min/mg protein)
tHMGR           < 0         3,000       1,100       N/D         N/D         N/D
Metabolite Concentrations (nM Normalized to OD)
Time (hr): O (as printed in the source; value uncertain)
CoA             292         326         310         310         284         315
Acetyl CoA      826         < 0         126         123         184         129
AA-CoA          24.4, 14.6, 19.2, 19.8, with 18.2 and 17.6 displaced in the source (column assignment uncertain)
HMG-CoA         7.6         19.9        37.2        23.4        37.8        25.7
Mevalonate      < 0         214,000     172,000     N/D         151,000     177,000
Time (hr): 11
CoA             176         104         283         354         208         288
Acetyl CoA      627         47          774         678         587         670
AA-CoA          0.8         7.4         2.0         2.6         2.1         2.1
HMG-CoA         0.4         20.7        21.9        19.3        12.8        9.7
Mevalonate      < 0         283,000     279,000     243,000     222,000     240,000
Time (hr): 24
Mevalonate      < 0         276,000     565,000     524,000     596,000     534,000
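
As a quick arithmetic check on the statement that specific production in the library strains was about twice that of the control, the OD-normalized mevalonate values in the 24-hour row of Table 4 can be compared directly. The sketch below assumes that the period-separated figures in that row (for example "276.000") are thousands separators, so that the control value is 276,000 nM per OD unit; the variable names are illustrative.

```python
# OD-normalized mevalonate at 24 hours (nM per OD unit), read from Table 4.
# Assumption: "276.000" and similar entries are read as 276,000.
control_24h = 276_000  # pBad33MevT
library_24h = {"A": 565_000, "B": 524_000, "C": 596_000, "D": 534_000}

# Fold change of each library sample relative to the original operon.
fold_change = {sample: value / control_24h for sample, value in library_24h.items()}

# Average fold change across the four library samples.
mean_fold = sum(fold_change.values()) / len(fold_change)
```

Under that reading, each library sample lands between roughly 1.9 and 2.2 times the control, and the mean is close to 2, consistent with the two-fold specific production stated in the text.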

[00195] The concentrations of intracellular acyl-CoA species were measured to determine if the TIGR changed the relative carbon flux through any of the three steps in the pathway. The biggest difference between the strains was the concentration of acetyl-CoA, a central metabolic intermediate (Table 4).
The strain containing the original mevalonate pathway (pBad33MevT) had much lower acetyl-CoA levels than the control strain and the strains containing the evolved pathway. The fact that the strains producing more mevalonate had more acetyl-CoA was unexpected but correlates with the improved growth seen in the library samples. The other acyl-CoAs in the pathway (acetoacetyl-CoA, HMG-CoA) were not significantly changed among the strains expressing mevalonate pathway genes. The data suggest that the improved production was due primarily to increased growth correlated with the availability of acetyl-CoA.
[00196] Transcripts from each strain were examined using dot blots to determine whether changes in
TIGR sequence altered the relative expression levels of the three genes. The HMGS and tHMGR
transcripts were noticeably reduced in the library strains in comparison to the transcript expressed from pBad33MevT (Table 4). Corroborating the mRNA data, the tHMGR enzyme activity was significantly reduced in library sample A compared to pBad33MevT (Table 4). The reduction in expression was not expected, but suggests that reduced expression of HMGS and tHMGR was responsible for the improved growth and increased mevalonate production.
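
The size of the transcript reduction can be read off the mRNA rows of Table 4. The sketch below averages the four library samples for each gene and divides by the pBad33MevT value; averaging across samples is an illustrative choice, not an analysis described in the text.

```python
# mRNA levels (ng) from Table 4: original operon and library samples A-D.
mrna_ng = {
    "atoB":  {"pBad33MevT": 6.0, "A": 5.7, "B": 6.8, "C": 6.1, "D": 8.6},
    "hmgs":  {"pBad33MevT": 4.3, "A": 2.3, "B": 2.1, "C": 2.2, "D": 3.5},
    "thmgr": {"pBad33MevT": 6.3, "A": 3.1, "B": 3.9, "C": 3.0, "D": 4.0},
}


def library_to_control_ratio(levels):
    """Mean transcript level of the library samples divided by the pBad33MevT level."""
    library = [value for name, value in levels.items() if name != "pBad33MevT"]
    return (sum(library) / len(library)) / levels["pBad33MevT"]


ratios = {gene: round(library_to_control_ratio(levels), 2) for gene, levels in mrna_ng.items()}
```

By this measure hmgs and thmgr transcripts in the library strains are a little under 0.6 of the pBad33MevT level, while atoB is slightly above 1, which matches the pattern described above.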
[00197] The sequences of the TIGRs (Table 5; and Figures 5-8, where Samples 1-4 are Samples A-D, respectively) from each of the top four producers were predicted to form a small set of similar structures. The TIGR between HMGS and tHMGR was predicted to form either a large hairpin or a short single-stranded structure, while the TIGR between atoB and HMGS contained two small hairpins separated by a few single-stranded bases (Fig. 4f).

Table 5


The second of these hairpins appears to sequester the RBS, thereby down-regulating production of HMGS. Reduced translation rates can in turn affect mRNA stability19, and therefore message levels, by decreasing transcript coverage by ribosomes and leaving the transcript open to RNase degradation. Therefore, reduced HMGS translation rates could lead to the reduction in HMGS and tHMGR message, which would correlate with the improved growth and increased mevalonate production.
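
One simple way to express RBS sequestration numerically is the fraction of RBS nucleotides that are base-paired in a predicted secondary structure. The sketch below works on a dot-bracket string of the kind produced by RNA folding tools. The sequence, the structure, and the RBS motif are all hypothetical and hand-written for illustration; they are not the TIGR sequences of Table 5 and were not produced by a folding program.

```python
def paired_fraction(dot_bracket, start, end):
    """Fraction of positions in [start, end) that are base-paired ('(' or ')')."""
    region = dot_bracket[start:end]
    if not region:
        raise ValueError("empty region")
    return sum(ch in "()" for ch in region) / len(region)


# Hypothetical fragment: a 6-bp stem whose 3' arm is the RBS motif.
sequence = "AA" + "CCUCCU" + "GAAA" + "AGGAGG" + "AUACAUAUG"
structure = ".." + "((((((" + "...." + "))))))" + "........."
assert len(sequence) == len(structure)

rbs_motif = "AGGAGG"  # assumed Shine-Dalgarno-like motif, for illustration only
rbs_start = sequence.find(rbs_motif)
rbs_paired = paired_fraction(structure, rbs_start, rbs_start + len(rbs_motif))
```

In this invented example every RBS base sits in the stem, so `rbs_paired` is 1.0; a value near 0 would indicate an exposed RBS.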

[00199] The described TIGR method utilized combinatorial assembly of oligonucleotides possessing various regulatory sequences to generate libraries of operons whose relative expression varied over a
100-fold range. The sequences used in this study incorporated three mechanisms to control protein production: differential mRNA processing, premature transcription termination, and translation
initiation inhibition. These mechanisms are interrelated, and the effects of one cannot be easily
decoupled from the others19, making it nearly impossible to design de novo operons using these and other post-transcriptional regulatory mechanisms to coordinately regulate the expression of multiple genes. The ability to access multiple regulatory mechanisms simultaneously without specific design is a major strength of the TIGR approach. By accessing multiple regulatory mechanisms the TIGR
method was able to assemble an improved version of pBad33MevT that when expressed generated seven-fold more mevalonate. More importantly, the mechanisms behind the improved mevalonate production were counterintuitive. Designed constructs may not have focused on reducing the
expression of HMGS and tHMGR. Nonetheless, this solution resulted in significant improvements in mevalonate production. Most importantly, incorporation of this optimized operon into the strain
engineered to produce artemisinin or its precursor (amorphadiene) could ultimately reduce the cost of the anti-malarial drug produced using this pathway. Similarly, by using the same methods described above, it should be possible to control eukaryotic protein synthesis from operons by constructing TIGRs containing IRES elements and RNase sites, the combination of which will be very useful for the design of operons for pathway engineering, multi-subunit protein production, and gene therapy.

References
1. Khosla, C. & Keasling, J.D. Metabolic engineering for drug discovery and development. Nat Rev Drug Discov 2, 1019-1025 (2003).
2. Martin, V.J., Pitera, D.J., Withers, S.T., Newman, J.D. & Keasling, J.D. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat Biotechnol 21, 796-802 (2003).
3. White, M.M. Pretty subunits all in a row: using concatenated subunit constructs to force the expression of receptors with defined stoichiometry and spatial arrangement. Molecular Pharmacology 69, 407-410 (2006).
4. Endy, D. Foundations for engineering biology. Nature 438, 449-453 (2005).
5. Baga, M., Goransson, M., Normark, S. & Uhlin, B.E. Processed mRNA with differential stability in the regulation of E. coli pilin gene expression. Cell 52, 197-206 (1988).
6. Komar, A.A. & Hatzoglou, M. Internal ribosome entry sites in cellular mRNAs: mystery of their existence. J. Biol. Chem. 280, 23425-23428 (2005).
7. Martin, P., Albagli, O., Poggi, M.C., Boulukos, K.E. & Pognonec, P. Development of a new bicistronic retroviral vector with strong IRES activity. BMC Biotechnology 6, 4 (2006).
8. Chappell, S.A. & Mauro, V.P. The internal ribosome entry site (IRES) contained within the RNA-binding motif protein 3 (Rbm3) mRNA is composed of functionally distinct elements. J. Biol. Chem. 278, 33793-33800 (2003).
9. Fernandez-Miragall, O., Ramos, R., Ramajo, J. & Martinez-Salas, E. Evidence of reciprocal tertiary interactions between conserved motifs involved in organizing RNA structure essential for internal initiation of translation. RNA 12, 223-234 (2006).
10. Nudler, E. & Gottesman, M.E. Transcription termination and anti-termination in E. coli. Genes Cells 7, 755-768 (2002).

11. Mandal, M. & Breaker, R.R. Gene regulation by riboswitches. Nat Rev Mol Cell Biol 5, 451-463 (2004).
12. Arraiano, C.M. & Maquat, L.E. Post-transcriptional control of gene expression: effectors of mRNA decay. Mol Microbiol 49, 267-276 (2003).
13. Kushner, S.R. mRNA decay in prokaryotes and eukaryotes: different approaches to a similar problem. IUBMB Life 56, 585-594 (2004).
14. de Smit, M.H. & van Duin, J. Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J Mol Biol 244, 144-150 (1994).
15. Isaacs, F.J. et al. Engineered riboregulators enable post-transcriptional control of gene expression. Nat Biotechnol 22, 841-847 (2004).
16. Majdalani, N., Vanderpool, C.K. & Gottesman, S. Bacterial small RNA regulators. Crit Rev Biochem Mol Biol 40, 93-113 (2005).
17. Smolke, C.D., Carrier, T.A. & Keasling, J.D. Coordinated, differential expression of two genes through directed mRNA cleavage and stabilization by secondary structures. Appl Environ Microbiol 66, 5399-5405 (2000).
18. Smolke, C.D. & Keasling, J.D. Effect of gene location, mRNA secondary structures, and RNase sites on expression of two genes in an engineered operon. Biotechnol Bioeng 80, 762-776 (2002).

19. Deana, A. & Belasco, J.G. Lost in translation: the influence of ribosomes on bacterial mRNA decay. Genes Dev 19, 2526-2533 (2005).
20. Matz, M.V. et al. Fluorescent proteins from nonbioluminescent Anthozoa species. Nat Biotechnol 17, 969-973 (1999).
21. Pfleger, B.F., Fawzi, N.J. & Keasling, J.D. Optimization of DsRed production in Escherichia coli: effect of ribosome binding site sequestration on translation efficiency. Biotechnol Bioeng 92, 553-558 (2005).
22. Naureckiene, S. & Uhlin, B.E. In vitro analysis of mRNA processing by RNase E in the pap operon of Escherichia coli. Mol Microbiol 21, 55-68 (1996).
23. Kaberdin, V.R. Probing the substrate specificity of Escherichia coli RNase E using a novel oligonucleotide-based assay. Nucleic Acids Res 31, 4710-4716 (2003).
24. Zuker, M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31, 3406-3415 (2003).
25. Spickler, C. & Mackie, G.A. Action of RNase II and polynucleotide phosphorylase against RNAs containing stem-loops of defined structure. J Bacteriol 182, 2422-2427 (2000).
26. Gottesman, S. The small RNA regulators of Escherichia coli: roles and mechanisms. Annu Rev Microbiol 58, 303-328 (2004).
27. Burke, E. & Barik, S. Megaprimer PCR: application in mutagenesis and gene fusion. Methods Mol Biol 226, 525-532 (2003).
28. Pfleger, B.F., Pitera, D.J., Newman, J.D., Martin, V.J. & Keasling, J.D. Microbial Sensors for Small Molecules: Development of a Mevalonate Biosensor. Appl Environ Microbiol (Submitted).

29. Sambrook, J., Fritsch, E.F. & Maniatis, T. Molecular Cloning. A Laboratory Manual. (Cold Spring Harbor Press, Cold Spring Harbor, NY; 1989).
30. Rodwell, V.W. et al. 3-Hydroxy-3-methylglutaryl-CoA reductase. Methods Enzymol 324, 259-280 (2000).

[00200] While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and
equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of
matter, process, process step or steps, to the objective, spirit and scope of the present invention. All
such modifications are intended to be within the scope of the claims appended hereto.