Note: Text based on automatic optical character recognition processes. Only the PDF version has legal value.

TITLE OF INVENTION
Reverse Methods of Use

CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority to Application filed June which is hereby by

STATEMENT REGARDING FEDERALLY DEVELOPMENT
This was made with under and awarded by National Institutes of The government has certain rights in the

BACKGROUND OF INVENTION
It is becoming increasingly important to monitor complete sequences of long RNA such as viral regulatory noncoding mixtures of alternatively spliced messages in healthy and diseased In these it is to sequence an entire to link and of multiple splice site other diversifications that influence accurate Sequencing of long transcript is compromised by a paucity of highly accurate transcriptase enzymes to produce complementary DNA transcripts for As a RNA sequences are typically compiled that are joined to yield a average RNA which confounds the to monitor the linkage between multiple structural and changes that occur within single Continued advances in genomics research depend on the to solve this and a need the of new technologies for improving as is a specific area of interest a major unmet

T derived retroviral such as Superscript series from RT A second family of commercial RTs was developed thermophilic group enzymes While enzymes were extensively optimized to achieve longer they have not been to effectively very long structured templates 4000 nucleotides case has processivity or fidelity of these enzymes quantitatively particularly t a need in the art for an improved reverse The present invention addresses this

SUMMARY OF THE INVENTION
the present provides a composition a reverse transcriptase comprising a of E ri rect the reverse transcriptase comprises one or more mutations relative to the amino sequence set forth SEQ ID I one the reverse transcriptase an acid sequence having greater than about homology the amino sequence forth in SEQ further comprising one or more mutations relative to SEQ ID one the reverse transcriptase comprises at least one
mutation from the group consisting and R353X relative to SEQ wherein X denotes any one the reverse transcriptase comprises least one mutation selected from the group consisting of and R353A relative to SEQ NO In one the reverse transcriptase amino acid sequence selected from the group consisting of SEQ I SEQ SEQ ID 17 and SEQ ID one the reverse transcriptase comprises at one selected the group consisting mutation of the A binding mutation of the a mutation to produce increased pairs withi rigid sections of tertiar additi on of an domain to enhance mutation of the of th catalytic and a substitution mutation wherein one or residues or domain maturase replaced or more residues or a domain derived from a enzyme of organism other than the of the binding domain comprises at least one selected rom the consisting i X denotes any term denotes deletion residues corresponding to position to pos tion 427 of SEQ In one X is selected from the group consisting Alanine and Serine one the of the is selected the group consisting mutations the portion of the an of the with an from another maturase reverse one the mutation to produce pairs hin rigid sections of the tertiary structure comprises at least one selected from the group consisting and S In one embodiment X is acid i one of the thumb domain comprises a least one selected from the group consisting and denotes any amino one the mutation of the thumb domain least one from the group consisting S and n one mutatio of the catalytic site comprises at least one selected from the group consisting R i and wherein X is any amino In one the substitution mutation wherein a domain mamrase is replaced wi a domain derived from a maturase of an other than r from the group consisting replacement of the finger of maturase with a finger domain of reverse and of the of Ex maturase with a palm of another maturase reverse one the substitution mutation wherein one or more residues of B matiirase is replaced with one or residues derived a enzyme of 
as than comprises at least selected the group consisting E and where X denotes any amino In one the substitution least one selected the group of the composition further comprises an agent that reduces binding of to surface of the one the agent a EN A In one the agent comprises acid molecule derived from a group In one the agent comprise D4A or variant In one reverse transcriptase has one or more properties selected from the group consisting of enhanced reduced error reduced improved In one the invention provides an isolated nucleic acid molecule the reverse transcriptase one present invention prov a method of reverse comprising contacting an NA molecule with composition comprising a reverse transcriptase comprising acterium or a variant of Ex one the maturase comprises an amino having than homology to the amino acid set forth in one embodiment the maturase variant of maturase is used an optimized reaction wherein the optimized reaction buffer comprises a concentration of about to about i KCl at concentration of about to about at a concentration of about to about DTT at a concentration of about to about and wherein the reaction buffer has a i of about 8 to one the d reaction buffer further comprises one or more protein agents In one the maturase or a variant of maturase is contacted with agent that reduces binding of primers to the maturase or variant of one the agent comprises a one the agent comprises a nucleic acid molecule derived from a group II In the comprises P4A or owe the present invention provides a kit comprising a polypeptide comprising maturase variant of o one the maturase comprises an amino sequence having greater than about homology t the amino acid sequence set forth in SEQ ID Irs one the kit comprises agent that reduces specific of primers to the or variant of E the agent a RNA I one the comprises a nucleic acid molecule derived from a group II In the agent D4A a thereof one the kit further comprise an reaction wherein optimized buffer Tris at a 
bout to 1 at concentration of to about at a concentration about about DTT concentration of 1 to about and the optimized reaction buffer a of about 8 to RT reactions using RepA Figure comprising Figure Figure depicts from example demonstrating that the is a proeessivity factor grou 0 for The structure of the RT domain and was determined by crystallography and the of thumb as a threading model by J et Nat based on the thumb subdomain of iirA Green arrow the entry site for YADD motif that coordinates the active site is sho n i Sequenc th and rounding regions fern all rnaturase sequences in th M A et Nucleic shown the structural The figure created by web GE et Gel showin the products produced by T mutant of maturase at time is mm conformation in the cr structure of m complex In the in forms a and n an conformation is b its interaction with group D Figure Figure Figure depicts example demonstrating positively affects efficiency on RepA model showin positively charged A surface in the RT domain of The electrostatic surface potential of the RT domain was calculated by ARBS HA et Natl Acad S and PDB2PQR TJ et Nucleic Acids and is presented as a Residues that are mutated in and constructs were shown as Gel showing the RT products produced by and different coiistmcts of The RT reactions used Rep A as and were performed under increase of primer rate in RT reactions catalyzed by different enzymes to WT Primer incorporation efficiency is the ratio of all extension products relative to the total of primer the to products plus unincorporated Figure comprising Figure A Figure depicts results from exam ple determining the error rate of various reverse transcriptases including maturase and sequencing The schematic diagram of primers used for 2nd strand synthesis is shown The principle underlying sequencing is errors are consistent in all sequencing reads share the same product barcode are as RT errors Errors inconsistent among reads share the same product originated from 
amplification or the sequencing Overall for maturase and Substitutional mutation spectrum for B and There are 65 60 C and 69 the sequence used this Tiie shewn her is highl conservative relative to ohr because it was conducted on Figure comprising Figure 9 A Figure depicts results a for maturase model Sot structure of the was by crystallography and the structure of thumb ain was created as a threading by ASSER J et Mat based o the thumb snbdorrsain of LtrA YADD motif that coordinates th active site ions is shown in The is shown in that includes the i finger subdomain and the first the The is in yellow contains in finger primer grip in palm and a highly region the second in the thumb Green arrow indicates the for RNA RT products by maturase and under salt RepA was used as RT Salt ia to RT buffers the top of each Comparison of thumb in maturase and of RT RT has a more extensive surface that could interact with RNA template compared group 33 maturase Figure 10 depicts results example demonstrating the chemical and conformational homogeneity of maturase The elation profile from Superdex suggests the almost all purified maturase exists as stained by suggests purified matarase has high chemical Figure depicts the from example experiments investigating reaction for The reactions were carried on RepA D3 RNA using The buffer compositions provided Table and the numbering of the buffers corresponds to gel lanes in The prime for the the yields of product are 41 figure Figure through depicts the lts of investigating the ability of D4A to improve The secondary structure of reverse by niaturase in the presence of D4A D3 was used as the templat buffer used to reactions Figure 13 depicts a of for thermophilic The residues the thermophilic are indicated by Figure comprising and Figure depicts the of experiments S to protein production of maturase and maturase The purified protein by Lane cleaved protein by The three Lane 1 protein by Lane cleaved protein by SUMO purified by cleaved 
protein by SUMO protein precipitate after SUMO protease Lane by cleaved M337T protein by The positions of proteins the are indicated by the results of experiments enzymatic assays for three At the primer incorporation efficiencies by and M337T are 6 and the yields of product are and At the efficiencies are reduced to 45 the yields of product are reduced to 1 ϊ and

DETAILED DESCRIPTION
play important roles in and the are direct reporters of expression the structure RNAs is limited by the low processivity of reverse transcriptases that decode the RNA This limitation can be demonstrated by five

Low RT processivity makes it obtain useful sequence information structured or heavily In gene expression low been shown to bias coverage transcript and is more severe experiments et

In structural methods such as SHAPE et et RT processivity results in background A some the background signal can be so strong that it obscures actual

Low RT processivity limits possibility of sequencing for RNA molecules nanopore sequencing et Genome or sequencing Nat RNA sequencing is tremendously helpful for characterizing heterogeneous RNAs as splicing variants and RNAs different modification sites or mutation

Low RT limits development of direct RNA sequencing using in contrast to a for 0 Α sequencing has already popularity To direct RNA sequencing has only been conducted using short reads al or modification sites et J or using nanopore technology has poor error rate et Detect

The present invention provides compositions and for reverse The present invention relates to the discovery as engineered are reverse transcriptases that display As described and the engineered variants are highly processive reverse transcriptases that be used in variety of clinical biology which utilize reverse The present invention to compositions comprising protein or variants compositions comprising nucleic molecules protein or variants methods for making the and methods for using compositions in a reverse transcription

one the invention
provides composition comprising reverse transcriptase or a encoding reverse In efti the reverse transcriptase is derived f om reverse transcriptase is modified relati to In certai the reverse transcriptases of the present invention are reverse thereby allowing for amplification of templates in a single In certain the reverse transcriptases of the present invention are functional at physiologic thereby allowing for efficient reverse transcription under conditions that reduce the degradation of the RN A In certain the reverse transcriptases of the present invention efficiently copy long RNAs single thereby the presently described reverse transcriptases to be used at lower reverse concentrations and in single molecule sequencing the present a a agent that improves RT activit or variant thereof For some the comprises an agent that reduces binding of primers to the positively charged surface maturase or variants In the agent that reduces binding of primers to positively surface of maturase or variants thereof comprises a nucleic acid small or other compound that or reduce specific I one agent comprises a nucleic acid such single stranded or double stranded A or For in one the agent comprises an RNA snch as a double stranded RNA or single stranded RNA hairpin or In some the comprises a nucleic acid molecule derived a group as the group in the agent comprises D4A a nucleic acid molecule derived group la the agent comprises a variant deri but not limited a of a D4A or acid having substantial homology to In one the present invention provides reaction buffer that enhances activity of or variants In some the reaction buffer one or at of to about at a of about to about at concentration of about to about and at a of to about one the optimized buffer a of about 8 to the optimised buffer further comprises a protein stabilizing Exemplary protein stabilizing agents but are not stabilizers such as sorbitols and polyethylene amino ac ids deri vati ves thereof such ionic stabilizers 
such as and quaternary proteins such as bovine serum albumin

one the present invention relates a method of reverse transcription using a reverse transcriptase comprising or a variant thereof certain the method provides reverse at physiologic or at lower temperatures relative to that required when using reverse transcriptases In certain lower temperature of the reverse transcription reaction provides a decreased rate of degradation of molecule during the relative to the rate of degradation of an molecule a reverse transcription reaction that uses a reverse another the molecule to reverse transcribed long complex RNA In a the reverse transcription reaction efficiently creates DNA In another the reverse transcription reaction requires less protein relative to the amount of reverse transcriptase in a reverse transcription reaction uses reverse

one the method comprises amplification of in a made by the thermocycling ability of the reverse transcriptases described

Definitions
Unless defined technical and scientific used have the same as commonly by of ordinary in the art to which this invention It is also that the used herein is for the purpose of describing particular embodiments and is not intended to

As used of the following terms the meaning associated with d in this

The articles and are used herein to refer to one or to more than one to at least of the object of the article By way of example means one element or more than one

element as used herein when referring to a value as temporal and the is meant to encompass variations of or from specified as such variations are appropriate to disclosed

A used refers to biological derived the individual into whom the material later be

As used refers to a biological material derived from a genetically different individual of the species as the individual into the will be

The and of used interchangeably generally refer to a of more than one The population may be a pure population comprising one cell the population may comprise than one
cell In present there is no limit the number of cell a cell population may

refers to the inherent of specific sequences of nucleotides in a as a a or an to serve as synthesis of other in biological processes having either a defined sequence of nucleotides and or sequence of amino acids and the biological properties resulting therefrom gene encodes a protein transcription and of corresponding to gene produces protein a or other biological Both the coding the nucleotide sequence of which to the sequence and is usually provided in sequence the used as the template transcription of a gene or can referred to as encoding the protein or other product of that or

refers to vector comprising recombinant polynucleotide comprising expression control sequences linked to a nucleotide sequence to be An expression vector comprises sufficient for other elements for expression can be supplied by the host or in an in vitro expression vectors include those known the such as naked or contained viruses and incorporate the

refers to the sequence similarity or sequence identity between two polypeptides or two nucleic acid position in both of two sequences is occupied by the or amino acid if a position in each of two A is occupied by then the molecules are homologous at that The percent of homology between two sequences is a function of the of matching or shared by the two sequences divided by the number of positions compared For if 6 of 10 of positions in two sequences are matched or homologous then the two sequences are 60% homologous By way of the sequences and TATGGC share comparison is made sequences aligned to give

altered or removed the natural For a nucleic acid or a peptide a living organism is not but same nucleic or peptide partially completely separated from the coexisting of its state is isolated acid or exist substantially purified or exist in a such for a host

context of the present following abbreviations for the commonly occurring acid are to refers to refers to refers to and refers to
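As a rough illustration of the percent homology definition given earlier in this section (matching positions divided by positions compared), the following Python sketch works through the "6 of 10 positions" case. It is only a sketch under stated assumptions: the function name `percent_homology` and the two ten-residue sequences are hypothetical and do not come from the specification, and the sequences are assumed to be already aligned to equal length, so no alignment step is performed.

```python
def percent_homology(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two sequences carry the same residue.

    Assumption: both sequences are already aligned and have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions occupied by the same residue in both sequences.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    # Matching positions divided by positions compared, as a percentage.
    return 100.0 * matches / len(seq_a)


# Hypothetical sequences: the first six positions match, the last four do not.
print(percent_homology("ATGCATGCAT", "ATGCATTTTA"))  # 60.0
```

In practice the alignment itself determines which positions are compared, so a value computed this way is only meaningful for the specific alignment supplied.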
Unless otherwise a sequence encoding an acid includes all nucleotide sequences are degenerate versions of each other and that encode same acid The phrase nucleotide sequence that encodes a protein or an may include introns to the extent that the nucleotide sequence encoding the protein may in some contain an

The as used herein is defined chain nucleotides nucleic acids are of nucleotides nucleic acids and polynucleotides as used herein One skilled the art has the general knowledge that nucleic acids are which can be the The nucleotides can be hydrolyzed into As used herein polynucleotides but not limited all nucleic acid sequences which are obtained by available in the without recombinant the cloning of nucleic sequences from recombinant library or a cell ordinary cloning technology and the and by synthetic

As the terms and are used refer to a compound comprised of acid residues covalently linked by peptide A protein or peptide must contain at two amino and no limitation is on the maximum number of amino acids can a or peptide Polypeptides include any peptide or protein comprising two or amino acids joined to each other by peptide As used the term refer to short commonly are referred to in art oligopeptides and for and to longer which generally are referred to in art as of there are biologically active substantially homologous

The used herein defined a DNA recognized by the synthetic of the or required to initiate the specific transcription of a polynucleotide

used the a nucleic acid which is required for expression of gene product operably linked to some this sequence may be core promoter sequence and in other this sequence may also include a enhancer sequence and other regulatory elements which are required for of the gene sequence for be one which expresses the gene product

A promoter is a nucleotide sequence when operably linked with a polynucleotide which encodes or specifies a gene causes the gene product to be produced in a under most or all physiological conditions the

An
promoter is a nucleotide sequence when operably with polynucleotide which encodes or a gene causes the gene product to produced in a substantially only an inducer which to the is present the cell

A is a composition of which comprises an isolated nucleic and which can be used to deliver the isolated nucleic acid to the interior of a Numerous vectors are known in the art but not limited polynucleotides with ionic or amphiphilic and the includes an autonomously replicating or a term should also be construed to include and viral compounds which of nucleic acid into such and the Examples of viral vectors are adenoviral virus retroviral and the

throughout various aspects of r can presented range It should be that the description in range format is merely for convenience brevity and not be construed an inflexible limitation on scope of the the description of a range should be considered to have specifically disclosed all the possible subranges as as numerical values within that For description of a range as from 1 to 6 should be considered to have specifically disclosed as 1 to 1 to 1 to 2 to 2 to 3 to as well individual numbers within for and This applies regardless of the breadth of

Description
some the present invention relates to reverse transcriptase comprising a variant for use in reverse transcription The reverse transcriptases of the present invention are described to one or more including but not limited enhanced reduced error reduced and improved thermo The presently described reverse transcriptases thus have enhanced functionality allow to be utilized in a wide variety of applications but not next generation nanopore library splice site viral single cell structure the

In one the present provides a composition comprising a reverse or a nucleic acid molecule encoding a In one the reverse transcriptase is derived from in certain the reverse transcriptase comprises maturase variant is modified relative to In certain the maturase variant comprises one or the finger
DMA binding in positively charged protei the provides a of variants of In some the variants have at least one enhanced relative to some the variants are engineered by mutating to be improved relative to unmodified maturase with regard the or In some the variants by the solution conditions relative to solution conditions create a improved composition comprising roatutase or a variant with regard to the error or other In one present invention provides a metho reverse For in one the method comprises contacting an molecule with one or more reverse transcriptase molecules described As described using the presently described transcriptases allows for the revers transcription reaction to occur at lower temperatures and at lower reverse transcriptase concentrations the use of the presently described reverse transcriptases allows for production of longer Further the ability of the presentl described reverse transcriptases amplification using a single In one th invention a composition comprising a In one the reverse transcriptase is derived from certain the transcriptase comprises or a variant one maturase is relative to unmodified For certain the variant comprises or point insettion or deletion relative to wildtype I the comprises a protein or one the composition comprises The acid of wiidtype is provided below and is denoted SEQ IB EME E SEQ 14 comprises UNA binding and DMA binding that influence efficiency of reverse transcription of an A In oae the reverse transcriptase an matorase variant where one or more secondary RNA binding sites on the surface of the protein are mutated to reduce nonspecific binding of the reverse transcription protein to the promoting binding at the polymerase cleft facilitating enzyme one a variant of matnrase comprises at least one point selected from the group and X denotes any ami no In another variant of r rnaturase comprises at least one point selected from the group R one the reverse transcriptase comprise rnaturase variant to and denoted 
as SEQ comprising the point and to wiidtype the transcriptase comprises an variant to herein as rnaturase as ID comprising ttoe relative wildtype 5 reverse transcriptase comprises an variant to as maturase and denoted SEQ comprising point mutations relative to wildtype In one the reverse transcriptase comprises an maturase variant to herein as denoted as SEQ ID comprising the point mutations and relative n one the reverse transcriptase comprises an maturase variant comprising or more mutations in the binding domain of I 5 In one embodiment variant of raaiiirase comprises at least one point mutation selected group wherein X denotes amino In another such a variant of maturase comprises at least one point mutation selected the group In another such variant of maturase comprises at one point mutation selected from the group and another such the residues deleted relative to wildtype wherein Δ variant the sequence 25 deleted maturase has a the of which is 10 the bold and fra ment highl conserved maturase reverse t the reverse transcriptase of the present invention comprises maturase 30 comprising one or mutations in of one the maturase varian t comprises one or more mutations in region of the In one at least point mutation is created ive to the unmodified sequence ID of I one the mutation is at least one selected the DI wherein X amino one such the at least one point mutation selected from the polar amino acid electrostatic amino acid and another such the is engineered to be more ilexible b substituting positions in the region wit one or mor in another such the is engineered to be more stiff by substituting positions in the with or more In one the mutation a of at least one residue of the one the reverse transcriptase of present invention comprises an variant which residues are substituted with two glycine residues SEQ maturase can perform reverse at lower temperatures relative other reverse the engineering of more thermostabl maturase would enable of templates in single 
without using amplification Analysis of thermophilic structure and suggests that tend to have larger numbers of hydroge bonds and within rigid sections of tertiary in one e reverse transcriptase of the present comprises maturase engineered to have pairs at positions that are proximal in according to the structure of the enzyme C et at Nature structural molecular one such variant comprises at one point mutation cted the group E can form a salt bridge with L21E can form a salt bridge with and 3E can form a bridge one the reverse transcriptase of the an maturase engineered comprise a domain to fidelity the proofreading domain comprises an exonuclease another such the proofreading domain to the rtfiiintts of the maturase variant another proofreading domain is to of tire variant through tinker molecule o sequence for Maturase reverse are among but some have beneficial in one the reverse transcriptase of the present invention comprises m wherein at least one fragment or domain of is replaced with a fragment or domain a reverse transcriptase from a species other than E m For one the RT domain and of maturase reverse transcriptase is replaced with the RT domain from a thermophilic maturase reverse transcriptase to In the of is a longer from another maturase reverse transcriptase to enhance In one one or amino acids are substituted with hydrophobic amino acids or charged amino acids i order to improve I one the reverse transcriptase of the present invention comprises an maturase wherein one or more residues are substituted with one or more residues derived from a enzyme from a organism other than E For in some the maturase variant can one or more point mutations on residues in In one the variant comprises least one selected from the A29 and where X denotes any amino one the is at least one selected from the 1 and where X denotes an amino one the variant comprises least one from the ΕΪ I I and In one variant comprises a triple point of these ons the thermostability of the In 
one the reverse of the comprises an comprising one or more mutations in the thumb domain relative to type one the variant comprises at point selec ted the consisting S and wherei X any amino In another the variant comprises at least one point rn selected from the group of arid one such one or more mutations ate incorporated the surface of the thumb optimizing its ability to clasp the In the variant comprises at least one point from the group consisting of E and X denotes any amino ac anothe such the comprises least one mutation selected group of In one composition comprises art isolated polypeptide comprising a reverse In one the reverse transcriptase is derived For in one the or a variant Exemplary amino acid sequences of the derived reverse transcriptases of the present i nvention but are not limited SEQ ID 14 SEQ ID SEQ SEQ ID maturase SEQ ID IS and SEQ ID the present no limited to these Rathe the invention encompasses any reverse transcriptase derived from matorase or a variant one polypeptide a of or vari ant thereof that mi the of to perform the polypeptide comprises a derivative of th maturase or variant In certain polypeptide comprises an acid selected a fragment or derivative of SEQ ID 1 a or of SEQ of SEQ ID or deri vative of SEQ a or derivative of SEQ ID a fragment or derivative of S EQ ID In one of comprises or mo mutations the catalytic to reduce the fidelity of which enhance its for RNA structure mapping since lesions that are used probe RNA structure are flagged by that the error rate of enzyme can RNA in some the polypeptide comprises at least one selected the X denotes amino mutations at A225 as or mutations at as R imitations at as at mutations SO as mutations as f mutations at as 3 A or E143 mutations at as at as or may be in In one the composition of die present invention comprises a polypeptide comprising or variant or fragment thereof such the maturase comprises one or mutations corresponding to one or herein Reverse transcriptases of the 
present invention may produce more product length at particular temperatures compared to reverse of length product synthesis are made at different temperatures one temperature being such as between and one temperature being as between C and while keeping all other reaction conditions similar or the The amount of full length product may be determined using techniques well known in the for by conducting a reverse transcription reaction at a first temperature and determining the amount of full length transcript second reverse reaction at a temperature higher than the first temperature and of full length product comparing the amounts produced at the A convenient of comparison to determine the of the of full at the first temperature that is produced at the second The lot two salt buffer divalent metal nucleoside triphosphate template reverse transcriptase primer of time the reaction is be same for Suitable reaction conditions may be determined by those skilled the using routine techniques and examples of such conditions are provided herein

The reverse transcriptases of the invention may produce at least about at least at least at least at least at least at least i or at least product or full length product compared to corresponding control reverse transcriptase under the same reaction conditions and The reverse transcriptases of the invention produce to about about to about from to about from to about or about to about more product or length product compared to a control reverse transcriptase the same conditions incubation The reverse transcriptases of the invention may produce at least 2 least 3 at least at least 5 at least 6 at least at least 9 at least at least 25 at least 50 at least at least at least at least 200 at 300 at least 400 at least 500 at least at least at least times more product or full length product compared to a control reverse transcriptase the same reaction conditions

Reverse transcriptases of the present invention may have an at temperatures as compared to
corresponding reverse transcriptases. They may show increased thermostability in the presence or absence of an RNA template. In some embodiments, reverse transcriptases of the invention may show an increased thermostability in both the presence and absence of an RNA template. Those skilled in the art will appreciate that reverse transcriptase enzymes are typically more thermostable in the presence of an RNA template. An increase in thermostability may be measured by comparing suitable parameters of the modified or mutated reverse transcriptase of the invention to those of a corresponding unmodified or unmutated reverse transcriptase. Suitable parameters to compare include, but are not limited to, the amount of product or full length product synthesized by the reverse transcriptase of the invention at an elevated temperature compared to the amount of product or full length product synthesized by a control reverse transcriptase at the same temperature, and the of reverse transcriptase activity of a reverse transcriptase of the invention at an elevated temperature compared to that of a control reverse transcriptase. A reverse transcriptase of the invention may have an increase in thermostability at a particular temperature of at least about fold from about fold to about from about fold to about 50 to 25 from about fold to about compared to the control reverse transcriptase. A reverse transcriptase of the invention may have an increase in thermostability at a particular temperature of at least about 10 fold from about fold to about 100 from to about from about fold to about 25 from about fold to about compared to the control reverse transcriptase. A reverse transcriptase of the invention may have an increase in thermostability at a particular temperature of at least about 25 fold from about 25 fold to about from about 25 to about 75 from about 23 fold to about 50 or from 25 fold to about 35 compared to the control reverse transcriptase. The polypeptide of the present invention may be made using chemical methods. For example, polypeptides can be synthesized by solid phase techniques (J Y et al Science), cleaved from the resin, and purified by preparative high performance liquid chromatography. Automated synthesis may be achieved, for example, using the ABI 431 A Peptide Synthesizer in accordance with the instructions provided by the manufacturer. The polypeptide may alternatively be made by recombinant means or by cleavage from a longer polypeptide. The composition of the polypeptide may be confirmed by amino acid analysis or sequencing. The invention should also be construed to
include any form of a polypeptide having substantial homology to a reverse transcriptase disclosed herein. For example, a polypeptide is about about about about about or about homologous to an amino acid sequence of a. In some embodiments, the composition comprises a reverse transcriptase comprising an amino acid sequence that is about about about about about about about about or about homologous to or maturase described herein. In some embodiments, the composition comprises a reverse transcriptase comprising an amino acid sequence that is about about about about about about about about about about about or homologous to the amino acid sequence set forth in SEQ ID ID SEQ ID SEQ ID or SEQ. In some embodiments, the composition comprises a reverse transcriptase comprising an amino acid sequence that is about about about about about about about about about about about about about or homologous to the amino acid sequence set forth in SEQ SEQ SEQ ID SEQ or SEQ ID, wherein the reverse transcriptase comprises one or more of the mutations described herein. In one embodiment, the present invention provides a composition comprising an agent that improves RT activity of maturase or variants thereof. For example, in some embodiments, the composition comprises an agent that reduces binding of primers to the positively charged surface of maturase or variants thereof. In some embodiments, the agent that reduces binding of primers to the positively charged surface of maturase or variants thereof comprises a peptide or but not. The variants of the polypeptides according to the present invention may be one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue, where such substituted amino acid residue may or may not be one encoded by the genetic code; one in which there are one or more modified amino acid residues, such as residues that are modified by the attachment of; one in which the polypeptide is an alternative splice variant of the polypeptide of the present invention; fragments of the polypeptides; or one in which the polypeptide is fused with another polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification or for epitope. The fragments include polypeptides generated via cleavage of an original sequence. Variants may be post-translationally or chemically modified. Such variants are deemed to be within the scope of those skilled in the art from the teaching herein. As known in the art, the similarity between two polypeptides is determined by comparing the amino
acid sequence and its conserved amino acid substitutes of one polypeptide to a sequence of a second polypeptide. Variants are defined to include polypeptide sequences different from the original sequence, for example different from the original sequence in less than of residues per segment of interest, different from the original sequence in less than of residues per segment of interest, different by less than of residues per segment of interest, or different from the original protein sequence in just a few residues per segment of interest, and at the same time sufficiently homologous to the original sequence to preserve the functionality of the original sequence and the ability to perform reverse transcription. The present invention includes amino acid sequences that are at least or 95% similar or identical to the original amino acid sequence. The degree of identity between two peptides is determined using computer algorithms and methods that are widely known to the persons skilled in the art. The identity between two amino acid sequences may be determined by using the algorithm (et al., Biol). The polypeptides of the invention can be post-translationally modified. For example, post-translational modifications that fall within the scope of the present invention include signal peptide cleavage, protein folding and proteolytic processing. Some modifications or processing events require introduction of additional biological machinery. For example, processing events, such as signal peptide cleavage and, are examined by adding canine microsomal membranes or egg extracts to a standard translation reaction. The polypeptides of the invention may include unnatural amino acids formed by post-translational modification or by introducing unnatural amino acids during translation. A variety of approaches are available for introducing unnatural amino acids during protein translation. A polypeptide or protein of the invention may be conjugated with other molecules, such as proteins, to prepare fusion proteins. This may be accomplished, for example, by the synthesis of N-terminal or C-terminal fusion proteins, provided that the resulting fusion protein retains the functionality of a reverse transcriptase. A peptide or protein of the invention may be phosphorylated using conventional methods, such as the method described in (EMBO Journal). Cyclic derivatives of the polypeptides of the invention are also part of the present invention. Cyclization may allow the polypeptide to assume a more favorable conformation for association with other molecules. Cyclization may be achieved using techniques known in the art. For example, disulfide bonds may be formed between two appropriately spaced components having free sulfhydryl groups, or an amide bond may be formed between an amino group of one component and
a carboxyl group of another component. Cyclization may also be achieved using an amino acid as described by et al. The components that form the bonds may be side chains of amino acid components or a combination of the two. In an embodiment of the invention, cyclic peptides may comprise the right position may be introduced into peptides of the invention by adding the amino acids at the right position. It may be desirable to produce a cyclic polypeptide which is more flexible than the cyclic polypeptides having peptide bond linkages as described above. A more flexible polypeptide may be prepared by introducing cysteines at the right and left position of the polypeptide and forming a disulfide bridge between the two cysteines. The cysteines are arranged so as not to. The polypeptide is more flexible as a result of the length of the disulfide linkage and the number of hydrogen bonds in the. The relative flexibility of a cyclic polypeptide can be determined by molecular dynamics simulations. The invention also relates to polypeptides comprising a reverse transcriptase fused to, or integrated into, a target protein, and/or a targeting domain capable of directing the chimeric protein to a desired. The chimeric proteins may also comprise additional amino acid sequences or domains. The chimeric proteins are recombinant in the sense that the various components are from different sources, and as such are not found together in nature. In one embodiment, the targeting domain can be a membrane spanning domain, a membrane binding domain, or a sequence directing the protein to associate with, for example, vesicles or the nucleus. In one embodiment, the targeting domain can target a peptide to a particular cell type or tissue. For example, the targeting domain can be a cell surface ligand or an antibody against cell surface antigens of a target tissue. A targeting domain may target the polypeptide of the invention to a cellular component. A polypeptide of the invention may be synthesized by conventional techniques. For example, the polypeptides or chimeric proteins may be synthesized by chemical synthesis using solid phase peptide synthesis. These methods employ either solid or solution phase synthesis methods; see, for example, and, Solid Phase Peptide Synthesis, 2nd, Pierce Chemical, and Barany and, Analysis Biology, Gross and Meienhofer, Academic, New, for solid phase synthesis; and M, Principles of Peptide, Berlin, and Gross and, The
for solution synthesis. By way of example, a polypeptide of the invention may be synthesized using solid phase chemistry with direct incorporation of phosphothreonine as the derivative. N-terminal or C-terminal fusion proteins comprising a polypeptide or chimeric protein of the invention conjugated with other molecules may be prepared by fusing, through recombinant techniques, the N-terminal or C-terminal of the polypeptide or chimeric protein and the sequence of a selected protein or selectable marker with a desired biological function. The resultant fusion proteins comprise a reverse transcriptase fused to the selected protein or marker protein as described herein. Examples of proteins which may be used to prepare fusion proteins include truncated. Polypeptides of the invention may be developed using a biological expression system. The use of these systems allows the production of large libraries of random peptide sequences and the screening of these libraries for peptide sequences that bind to particular proteins. Libraries may be produced by cloning synthetic DNA that encodes random peptide sequences into appropriate expression vectors (Christian et al; Devlin, 1990, Science; Cwirla et al, Natl). Libraries may also be constructed by synthesis of overlapping peptides. The polypeptides and chimeric proteins of the invention may be converted into pharmaceutical salts by reacting with inorganic acids such as hydrochloric acid, sulfuric acid, or phosphoric acid, or organic acids such as formic, pyruvic, oxalic, malic, trie, benzoic, salicylic, bene and. In one embodiment, the present invention provides an isolated nucleic acid encoding a reverse transcriptase. In certain embodiments, the composition comprises a nucleic acid encoding a reverse transcriptase derived from. In certain embodiments, the composition comprises a nucleic acid encoding a reverse transcriptase, wherein the reverse transcriptase comprises maturase or a variant thereof. In certain embodiments, the nucleic acid is or. In one embodiment, the nucleic acid encodes a reverse transcriptase comprising, wherein the amino acid sequence of wildtype is set forth in SEQ ID. In certain embodiments, the nucleic acid encodes a variant comprising at least one point mutation selected from the group and relative to wildtype, wherein X denotes any amino acid. In some embodiments, the nucleic acid encodes a maturase comprising at least one mutation selected from the group and R353A relative to wildtype. In one embodiment, the nucleic acid encodes a variant referred to herein as maturase and denoted as SEQ, comprising the point mutations K 1 and 63 relative to
wildtype. In one embodiment, the nucleic acid encodes a maturase variant referred to herein as maturase and denoted as, comprising the point mutations and relative to wildtype. In one embodiment, the nucleic acid encodes a maturase variant referred to herein as maturase and denoted as SEQ, comprising the point mutations R58 relative to wildtype. In one embodiment, the nucleic acid encodes a variant referred to herein as maturase and denoted as SEQ, comprising the point mutations and E353A relative to wildtype. In one embodiment, the nucleic acid encodes a maturase variant comprising one or more mutations in the binding thumb. In one embodiment, the nucleic acid encodes a maturase variant engineered to have pairs at positions that are proximal in. In one embodiment, the nucleic acid encodes a maturase having one or more domains of maturase replaced by one or more domains of a reverse transcriptase from a species other. In one embodiment, the composition increases expression of a biologically functional fragment of. For example, in one embodiment, the composition comprises an isolated nucleic acid sequence encoding a biologically functional fragment of. As would be understood in the art, a biologically functional fragment is a portion or portions of a full length sequence that retain the biological function of the full length sequence. Thus, a biologically functional fragment of maturase comprises a peptide that retains the function of full length. Further, the invention encompasses an isolated nucleic acid encoding a peptide having substantial homology to a reverse transcriptase disclosed herein. In certain embodiments, the isolated nucleic acid sequence encodes a reverse transcriptase having at least or homology with an amino acid sequence selected from SEQ ID SEQ ID SEQ ID SEQ and SEQ ID. The isolated nucleic acid sequence encoding a reverse transcriptase can be obtained using any of the methods known in the art, for example by screening libraries from cells expressing the gene, by deriving the gene from a vector known to include the same, or by isolating directly from cells and tissues containing the same, using standard techniques. Alternatively, the gene of interest can be produced synthetically, rather than cloned. The isolated nucleic acid may comprise any type of nucleic acid, including, but not limited to, DNA and RNA. For example, in one embodiment, the composition comprises an isolated DNA molecule, including for example an isolated molecule, encoding a reverse transcriptase. In one embodiment, the composition comprises an RNA molecule encoding a reverse transcriptase. In one embodiment, the present invention provides a composition comprising an agent that
improves RT activity of maturase or variants thereof. In some embodiments, the composition comprises an agent that reduces binding of primers to the positively charged surface of maturase or variants thereof. In some embodiments, the agent that reduces binding of primers to the positively charged surface of maturase or variants thereof comprises a nucleic acid molecule, such as a single stranded or double stranded DNA or RNA molecule. For example, in some embodiments, the agent comprises such as a double or a single hairpin or. In some embodiments, the agent comprises a nucleic acid molecule derived from such as the Ex group I. In one embodiment, the agent comprises D4A, a nucleic acid molecule derived from group. In some embodiments, the agent comprises a derivative of, including but not limited to a fragment of, a D4A, or a nucleic acid molecule having homology to D. In one embodiment, the agent comprises a fragment of that is able to bind to the surface of maturase or variant thereof. For example, in one embodiment, the agent comprises a fragment of D4A comprising the loop of. In one embodiment, the agent comprises a fragment of D4A comprising the apical loop of and one or more nucleotides of the stem adjacent to the apical loop of. For example, in one embodiment, the agent comprises a fragment of comprising the nucleotide sequence of SEQ ID. In one embodiment, the agent comprises a fragment of comprising the nucleotide sequence of SEQ ID. In one embodiment, the agent comprises a fragment of D comprising the nucleotide sequence. In one embodiment, the agent comprises a mutant, including a mutant D4A having one or more mutations to improve binding to the surface of Ex maturase or variant thereof. In some embodiments, the agent comprises a mutant D that retains the ability to bind to maturase or variant thereof. In one embodiment, the nucleic acid comprises, which can be used along with or variants thereof to reduce binding of primers to the surface of or variants thereof. For example, in one embodiment, the isolated nucleic acid comprises, which comprises a nucleotide sequence provided by SEQ. In some embodiments, the nucleic acid molecule comprises a nucleotide sequence having at least sequence homology with the nucleotide sequence provided by SEQ ID. Examples of modifications are containing at least one nucleobase instead of a naturally occurring. Bases may be to the activity of adenosine. Exemplary modified include, but are not limited to, uridine cytidine modified at the adenosine and or
modified at the 8 and adenosine. The above modifications may be. In some embodiments, the nucleic acid molecule comprises at least of 2 or of one more. In certain embodiments, a nucleic acid molecule of the invention can have enhanced resistance to nucleases. For increased nuclease resistance, a nucleic acid molecule can include, for example, units and or phosphorothioate. For example, the group can be modified or replaced with a of or. For increased nuclease resistance the nucleic acid molecules of the metho. Inclusion of locked nucleic acids, nucleic acids, bridged nucleic acids, and certain nucleobase modifications such as can also increase affinity to. In one embodiment, the nucleic acid molecule includes a 2 a or ylacetamido N. In one embodiment, the nucleic acid molecule includes at least one and, in some embodiments, all of the nucleotides of the nucleic acid molecule include. In certain embodiments, the nucleic acid molecule of the invention may have one or more of the following. Nucleic acids discussed herein include otherwise unmodified RNA and DNA as well as RNA and DNA that have been modified, and polymers of nucleoside surrogates. Unmodified RNA refers to a molecule in which the components of the nucleic acid, namely sugars, bases, and phosphate moieties, are the same or essentially the same as those that occur in nature. The art has referred to rare or unusual, but naturally occurring, RNAs as modified RNAs (et al., Nucleic Acids). Such rare or unusual RNAs, often termed modified RNAs, are typically the result of a post-transcriptional modification and are within the term unmodified RNA as used herein. Modified RNA, as used herein, refers to a molecule in which one or more of the components of the nucleic acid, namely sugars, bases, and phosphate moieties, are different from those that occur in nature. While they are referred to as modified RNAs, they will of course, because of the modification, include molecules that are not, strictly speaking, RNAs. Nucleoside surrogates are molecules in which the ribophosphate backbone is replaced with a construct that allows the bases to be presented in the correct spatial relationship such that hybridization is substantially similar to what is seen with a ribophosphate backbone, for example mimics of the ribophosphate backbone. Modifications of the nucleic acid of the invention may be present at one or more of a phosphate a or. Expression systems The invention also includes a vector in which the nucleic acid of the invention is inserted. The art is replete with suitable vectors that are useful in the present invention. In brief summary, the expression of natural or synthetic nucleic acids encoding a reverse transcriptase described herein is typically achieved by operably linking a nucleic acid
encoding a reverse transcriptase to a promoter, and incorporating the construct into an expression vector. The vectors to be used are suitable for replication and, optionally, integration in host cells. Typical vectors contain sequences and promoters useful for regulation of the expression of the desired nucleic acid sequence. The isolated nucleic acid of the invention can be cloned into many types of vectors. For example, the nucleic acid can be cloned into a vector including, but not limited to, a phage derivative and an animal virus. Vectors of particular interest include expression vectors, replication vectors, and probe generation vectors. Further, the vector may be provided to a cell in the form of a viral vector. Viral vector technology is well known in the art and is described, for example, in et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York) and in other virology and molecular biology manuals. Viruses which are useful as vectors include, but are not limited to, herpes viruses and. In general, a suitable vector contains an origin of replication functional in at least one organism, a convenient restriction endonuclease site, and one or more selectable markers (WO WO and). A number of viral based systems have been developed for gene transfer. For example, retroviruses provide a convenient platform for gene delivery systems. A selected gene can be inserted into a vector and packaged in particles using techniques known in the art. The recombinant virus can then be isolated and delivered to. A number of systems are known in the art. In some embodiments, adenovirus vectors are used. A number of adenovirus vectors are known in the art. In one embodiment, lentivirus vectors are used. For example, vectors derived from the lentivirus are suitable tools to achieve gene transfer since they allow stable integration of a transgene and its propagation in daughter cells. Lentiviral vectors have the added advantage over vectors derived from such as in that they can transduce such as. They also have the added advantage of low immunogenicity. In one embodiment, the composition includes a vector derived from an adeno-associated virus (AAV). AAV viral vectors have become delivery tools for the of various. AAV vectors possess a number of features that render them ideally suited for, including lack of, minimal, and the ability to transduce cells in a stable and manner. Expression of a particular gene contained within an AAV vector can be specifically targeted to one or more types of cells by choosing the appropriate combination of AAV and. In certain embodiments, the vector also includes conventional control elements which are operably linked to the transgene in a manner which permits its translation and expression in a cell with the vector or with the virus produced by. As used herein, operably linked sequences include
both expression control sequences that are contiguous with the gene of interest and expression control sequences that act in trans or at a distance to control the gene of interest. Expression control sequences include appropriate transcription, promoter and enhancer sequences; efficient processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic; sequences that enhance efficiency, such as consensus sequences; sequences that enhance protein; and, when desired, sequences that enhance secretion of the encoded product. A great number of expression control sequences, including promoters which are inducible, are known in the art and may be utilized. Additional promoter elements regulate the frequency of transcriptional initiation. Typically, these are located in the region bp upstream of the start site, although a number of promoters have recently been shown to contain elements downstream of the start site as well. The spacing between promoter elements frequently is flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase promoter, the spacing between promoter elements can be increased to 50 bp apart before activity begins to decline. Depending on the promoter, it appears that individual elements can function either cooperatively or independently to activate transcription. One example of a suitable promoter is the immediate early cytomegalovirus promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving expression of any polynucleotide sequence operatively linked thereto. Another example of a suitable promoter is Elongation Growth Factor. However, other constitutive promoter sequences may also be used, including, but not limited to, the simian virus 40 early promoter, mouse tumor virus, long terminal repeat, an avian leukemia virus promoter, a virus immediate early promoter, a Rous virus promoter, as well as human gene promoters such as, but not limited to, the myosin promoter, the hemoglobin promoter, and the kinase promoter. Further, the invention should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of the invention. The use of an inducible promoter provides a molecular switch capable of turning on expression of the polynucleotide sequence to which it is operatively linked when such expression is desired, or turning off the expression when expression is not desired. Examples of inducible promoters include, but are not limited to, a glucocorticoid promoter, a progesterone promoter, and a tetracycline promoter. Enhancer sequences found on a vector also regulate expression of the gene
contained therein. Typically, enhancers are bound with protein factors to enhance the transcription of a gene. An enhancer may be located upstream or downstream of the gene it regulates. Enhancers may also be to enhance transcription in a specific cell or tissue type. In one embodiment, the vector of the present invention comprises one or more enhancers to boost transcription of the gene present within the vector. In order to assess the expression of or a derived, the expression vector to be introduced into a cell can also comprise either a selectable marker gene or a reporter gene or both to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors. In other aspects, the selectable marker may be carried on a separate piece of DNA and used in a procedure. Both selectable markers and reporter genes may be flanked with appropriate regulatory sequences to enable expression in the host cells. Useful selectable markers include, for example, resistance genes, such as neo and the like. Reporter genes are used for identifying potentially transfected cells and for evaluating the functionality of regulatory sequences. In general, a reporter gene is a gene that is not present in or expressed by the recipient organism or tissue and that encodes a polypeptide whose expression is manifested by some easily detectable activity. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding chloramphenicol acetyl transferase, secreted alkaline phosphatase, or the green fluorescent protein gene (FEBS Letters). Suitable expression systems are well known and may be prepared using known techniques or obtained commercially. In general, the construct with the minimal flanking region showing the highest level of expression of reporter gene is identified as the promoter. Such promoter regions may be linked to a reporter gene and used to evaluate agents for the ability to modulate promoter-driven transcription. Methods of introducing and expressing genes into a cell are known in the art. In the context of an expression vector, the vector can be readily introduced into a host cell, such as an insect cell, by any method in the art. For example, the expression vector can be transferred into a host cell by physical, chemical, or biological means. Physical methods for introducing a polynucleotide into a host cell include calcium phosphate precipitation, particle bombardment, and the like. Methods for producing cells comprising vectors and/or exogenous nucleic acids are
well known in the art. See, for example, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York. A preferred method for the introduction of a polynucleotide into a host cell is calcium phosphate transfection. Biological methods for introducing a polynucleotide of interest into a host cell include the use of DNA and RNA vectors. Viral vectors, and especially retroviral vectors, have become the most widely used method for inserting genes. Other viral vectors can be derived from herpes simplex virus, adenoviruses, and the like (see, for example, Pat and). Chemical means for introducing a polynucleotide into a host cell include colloidal dispersion systems, such as mixed and. An exemplary colloidal system for use as a delivery vehicle in vitro and in vivo is a liposome (an artificial membrane vesicle). In the case where a delivery system is utilized, an exemplary delivery vehicle is a liposome. The use of lipid formulations is contemplated for the introduction of the nucleic acids into a host cell (in vitro, ex vivo or in vivo). In another aspect, the nucleic acid may be associated with a lipid. The nucleic acid associated with a lipid may be encapsulated in the aqueous interior of a liposome, interspersed within the lipid bilayer of a liposome, attached to a liposome via a linking molecule that is associated with both the liposome and the nucleic acid, entrapped in a liposome, complexed with a liposome, dispersed in a solution comprising a lipid, mixed with a lipid, combined with a lipid, contained as a suspension in a lipid, contained or complexed with a micelle, or otherwise associated with a lipid. Lipid or vector associated compositions are not limited to any particular structure in solution. For example, they may be present in a bilayer structure, as micelles, or with a collapsed structure. They may also simply be interspersed in a solution, possibly forming aggregates that are not uniform in size or shape. Lipids are fatty substances which may be naturally occurring or synthetic lipids. For example, lipids include the fatty droplets that naturally occur in the cytoplasm as well as the class of compounds which contain long-chain aliphatic hydrocarbons and their derivatives, such as fatty acids and amino alcohols. Lipids suitable for use can be obtained from commercial sources. For example, phosphatidylcholine, dicetyl phosphate (K Laboratories), cholesterol, and dimyristyl and other lipids (Polar) may be obtained from commercial suppliers. Stock solutions of lipids in chloroform or chloroform/methanol can be stored at about. Chloroform is used as the only solvent since it is more readily evaporated than methanol. Liposome is a generic term encompassing a variety of single and multilamellar lipid vehicles formed by the generation of enclosed lipid bilayers or aggregates. Liposomes can be characterized as
having vesicular structures with a bilayer membrane and an inner aqueous medium. Multilamellar liposomes have multiple lipid layers separated by aqueous medium. They form spontaneously when phospholipids are suspended in an excess of aqueous solution. The lipid components undergo rearrangement before the formation of closed structures and entrap dissolved solutes between the lipid bilayers (et al., Glycob). However, compositions that have different structures in solution than the normal vesicular structure are also encompassed. For example, the lipids may assume a micellar structure or merely exist as aggregates of lipid molecules. Also contemplated are nucleic acid complexes. Regardless of the method used to introduce exogenous nucleic acids into a host cell, in order to confirm the presence of the recombinant DNA sequence in the host cell, a variety of assays may be performed. Such assays include, for example, molecular biological assays well known to those of skill in the art, such as Southern and Northern blotting, and biochemical assays, such as detecting the presence or absence of a particular peptide, for example by immunological means (and Western blots) or by assays described herein to identify agents falling within the scope of the invention. In one embodiment, the present invention provides a delivery vehicle comprising a reverse transcriptase, or a nucleic acid molecule encoding a reverse transcriptase. Exemplary delivery vehicles include, but are not limited to, and. For example, in certain embodiments, the delivery vehicle is loaded with a reverse transcriptase, or a nucleic acid molecule encoding a reverse transcriptase. In certain embodiments, the delivery vehicle provides for controlled, delayed, or continual release of its loaded cargo. In certain embodiments, the delivery vehicle comprises a targeting moiety that targets the delivery vehicle to a particular. the a A derived from a full produced by a reverse transcriptase described herein. In one embodiment, the A has secondary or tertiary long or equal bases. It is described that peptides described are highly processive reverse transcriptases. In one embodiment, the reverse into DNA is at least about at at least about at least least about at least about at least about least about at least about at least about at least about at least about at least at least about at least about about at about at least about o about 000 bases in. In one embodiment, the A so reverse transcribed is at least about at about at least about at least about least about at least about least about at least about
at least at least about at least about at least about at least at least about at least about at least about at least about at least or at least about bases in. Formulations The present invention also provides formulations comprising one or more of the compositions described herein. Formulations may be employed in admixtures with conventional excipients, such as acceptable organic or inorganic carrier substances suitable for storage and use of a reverse transcriptase. The formulations may be sterilized and, if desired, mixed with auxiliary agents, such as wetting agents, salts for influencing osmotic pressure, aromatic substances and the like. They may also be combined where desired with other active agents, other components of the reverse transcription reaction, or other components suitable for storage of the or variants thereof. In one embodiment, the formulation is optimized to modify the error rate or other. In another embodiment, the protein itself is optimized to modify the error rate or other. Assays for measuring properties of the compositions of the invention are described elsewhere herein. In one embodiment, the composition or formulation is optimized to improve the stability of maturase or a variant thereof. In one embodiment, the type of, the overall ionic strength of the, water crowding, the buffering molecule types, buffering, the, the identity and of or other carriers or stabilizing agents are optimized to improve the thermal stability of or a variant thereof. In one embodiment, the enzyme can, wherein the reverse transcription reaction may be repeated using the same or a. As used herein, include, but are not limited to, one or more of the following: surface active agents; crowding agents; dispersing agents; inert diluents; granulating and disintegrating agents; binding agents; physiologically degradable compositions such as; aqueous vehicles and solvents; oily vehicles and solvents; suspending agents; dispersing or wetting agents; emulsifying agents; thickening agents; stabilizing agents; and or hydrophobic. Other that may be included in the of the invention are known in the art and described, for example, in Remington Pharmaceutical, Mack Publishing, which is incorporated herein by reference. The composition of the invention may comprise a preservative from about to by total weight of the composition. The preservative is used to prevent spoilage in the case of exposure to contaminants in the environment. Examples of preservatives useful in accordance with the invention include, but are not limited to, those selected from the group consisting of sorbic acid, imidurea, and combinations thereof. One is a combination of
about to to. In one embodiment, the composition includes an antioxidant and a chelating agent that inhibits the degradation of one or more components of the composition. Exemplary antioxidants are and ascorbic acid in the range of about to or in the range of by weight by total weight of the composition. In one embodiment, the chelating agent is present in an amount of to by weight by total weight of the composition. Chelating agents include edetate and citric acid in the weight range of about to or in the range of to by weight by total weight of the composition. The chelating agent is useful for chelating metal ions in the composition that may be detrimental to the shelf life of the formulation. While BHT and disodium edetate are an exemplary antioxidant and chelating agent, respectively, other suitable and equivalent antioxidants and chelating agents may be substituted as would be known to those skilled in the art. Liquid suspensions may be prepared using conventional methods to achieve suspension of the composition of the invention in an aqueous or oily vehicle. Aqueous vehicles include, for example, and isotonic saline. Oily vehicles include, for example, almond oil, oily, ethyl, vegetable oils such as or coconut oil, fractionated vegetable oils, and mineral oils such as liquid paraffin. Liquid suspensions may further comprise one or more additional ingredients including, but not limited to, suspending agents, dispersing or wetting agents, coloring agents, and sweetening agents. Oily suspensions may further comprise a thickening agent. Known suspending agents include, but are not limited to, sorbitol, hydrogenated edible, gum, gum, and cellulose derivatives such as sodium. Known dispersing or wetting agents include, but are not limited to, naturally occurring phosphatides such as, condensation products of an oxide with a fatty acid, with a long chain aliphatic, with a partial ester derived from a fatty acid and a, or with a partial ester derived from a fatty acid and a anhydride ethylene sorbitol and. Known emulsifying agents include, but are not limited to, and. about 5 DTT and has a pH of about. In reaction buffer protein stabilizing. Exemplary stabilizing agents include, but are not limited to, such as and polyethylene glycol, amino acids and derivatives thereof such as, ionic such as and quaternary, and proteins such as bovine serum albumin. In one embodiment, the optimized buffer comprises trehalose at a concentration of M to. In one embodiment, the optimized buffer comprises betaine at a concentration of about to about 10. In one embodiment, the optimized buffer comprises BSA at
a concentration of about to about 2mg. In one embodiment, the optimized buffer comprises glycerol at a concentration of about to about. The concentration of the buffering agent in reaction solutions of the invention will vary with the particular buffering agent used. Typically, the working concentration (the concentration in the reaction mixture) of the buffering agent will be from about 5 to about 500 mM 10 about 15 about 20 about 25 about 30 about 35 mM 40 about 45 50 about 55 about 60 about 70 about 75 about 80 to 85 about 90 about about from about 5 mM to about 500 about 10 mM to about 500 from about 20 mM to 500 from about to about 500 or about 30 mM to about 50 from about 40 to about 500 from about 50 to about 50 from about 75 mM to about 500 from about 100 mM to about 500 about 25 mM to about 50 from about 25 to about about 25 to from about 25 to about about 25 mM to about 300. When Tris is used, the Tris working concentration will typically be from about 5 to 5 mM to about 75 from about 1 about 7 from about 10 mM to about 60 from about mM to about 50 mM from about 25 mM to about 50. The pH of solutions of the invention will generally be set and maintained by buffering agents present in reaction solutions of the invention. The pH of reaction solutions of the invention, and hence reaction mixtures of the invention, will vary with the particular use and the buffering agent present but will often be about pH to pH about about about about pH about pH about pH about about pH about about about pH about pH about pH about pH about about pH about pH about pH about pH about pH about pH about about pH from pH to about pH from about pH to about pH about pH to pH about pH to about pH from about pH to about pH about pH to about pH from pH about pH about to about pH from about pH to about about pH to about pH from about to about pH from about pH to about pH from about to about from about pH to pH from about pH to about pH from about pH to about from about pH to about pH from about pH to about pH from about pH to about pH from about pH to about pH about pH to about pH about pH to about. One or more salts may be included in reaction solutions of the invention. In many instances, salts used in solutions of the invention will dissociate in
solution to generate at least one monovalent species. When included in reaction solutions of the invention, salts will often be present either individually or in a combined concentration of from about mM to about 500 about 1 about 2 about 3 about 5 about 10 about 12 about about about about 22 about 23 about 24 about 25 about 27 about 30 about 35 about 40 about 45 about 50 about 55 about 60 about 64 about 65 about 70 about 75 about 80 about about 90 about about 0 50 about 200 about 250 about 275 about 300 about 325 about 350 about 375 about 400 from about 1 mM to about from about 5 to about about mM to about from about 20 mM to about 500 from 30 to 500 from about 40 to about 500 about 50 to about 500 about 60 about 500 about mM to about 500 from about 75 mM to about 500 85 mM to 500 from to 500 from about mM to about 500 from about mM to from about to about 500 from about 200 mM to 500 from about 10 mM to about from about mM to about 75 from about 10 to about 50 from about 20 mM to about 200 about mM to about from about 20 mM to about from about 20 mM to about from about 20 to 80 from 20 to about 75 about 20 mM to about 60 from about 20 mM to about 50 from about 30 mM to about 500 from about 30 mM to about from about 30 mM to 70 from 30 mM to about 50. One or more divalent cationic salts may be included in reaction solutions of the invention. Salts used in reaction solutions of the invention will dissociate in solution to generate at least one species which is monovalent M. When included in reaction solutions of the invention, salts will often be present either individually or in a combined concentration of about to about about 1 about about 3 about about 5 about 6 about about 8 about 9 about about about 15 about 17 about 20 about 22 about 23 about 24 about 25 about 27 about about 35 about 40 about 45 about 50 about 55 about about 64 about 65 about 70 75 about 80 mM about 85 about 90 about 95 about 100 about 120 about about about about about about 250 about 275 about about about 50 about 375 about from about 1 mM to about 500 from about 5 mM to about 500 from about 10 to about
500 about 20 raM to about 500 from about 30 mM to about from about 40 m about 500 mM from mM about 500 mM about 60 to about 500 fro about 65 to about 500 χη about 75 to about 500 from 85 to about 500 from about 90 mM to about 500 from about 100 mM to about 500 from about 125 mM to about 500 from about mM to about 500 about 200 about 500 from about mM to about from to about 75 from about to about 50 fro about 20 MM to about 20 m about 20 to about from about 2 to about 125 fro about 20 to about 100 fr m about 20 to about 80 from 20 to about 75 from about 20 mM to about 60 from about 20 mM to about 50 from about 30 about from about 30 mM to about 1 0 from about 30 mM about 70 about 30 mM to about 50 included in reaction solutions of reducing agents will often be present either individually or in a concentration of about mM to about mM about abou about about about about about 2 abou 3 about 4 about 5 about 6 about about about about 20 about 22 about 23 about 24 about 25 about 27 about about 35 about 40 about 45 about 50 m from about mM to about 50 from about to about 50 from about 1 mM to about 50 about 2 to about 50 from about mM about 50 about to about 20 from about mM t abou 10 from about m to about 5 from about mM to about from 1 to about 20 about I to about about 1 mM to about 5 from about mM to about from about mM to about from about 1 to about from about i mM to from about 2 to about from about mM to about from about 1 mM to from about to about about 2 mM to about about to about from about about 2 from about to about from about mM to about about mM to about about mM to about from about about 20 from mM to about from about mM to about 20 solutions of the invention also contain one or e or detergent TRITON etc included In reaction solutions of the detergents often be present either individually or i combined of from about to about about about about about about about about about about about about about about about about about from about to about about about from 
about to about from about to about from about to to about from about to about from about to about from to about from about to from about to about from to about about to about from about to about from about to about For reaction solutions of the invention may contain TRITON at a concentration of about to about about to about from about to from about to about to about about to about Reaction solutions of the also contain or more stabilizing agents some when included in reaction solutions of stabilizing axe present either or M to about M about about about about about 2 about 3 about 4 about 5 about 6 about 10 about about about i 7 about about 22 about 23 about 24 about 25 about 27 30 35 about about 45 about 50 from about to about 1 from about M to about 5 from about to about 2 about M to about 3 from about M to about 4 from about M to about 5 from about M to about about to about from about M to about 1 from about M to about 10 front about M to about In some when included in reaction solutions of the sueb stabilizing agents present either individually ia a to about about about about about about about about about about about ί about about about about about about about about about about about about about about mg about about abou about about to about about to about from about to about some solutions of tie stabilizing agents are be present either individually or in a combined concentration of about to about about about about about about about about about about about about about about about about abou about about about about about about about about abou about about about about about to about from to about from to about from about to about about to about Reaction solutions of the also contain one or more DIM A polymerase inhibitor included reaction solutions the such often present either individually or m a combined concentration of from about to about about about about about about about about about about about about about about about about about μ about about about 10 
about about 20 about 25 about 30 about 35 about 40 about 50 60 about 70 about 80 about 90 about 0 about μ to about 30 about μ about 30 about to about 30 from about to about 30 about to about 3 from about to about 30 from about 30 about to about about to about 30 from about 30 about to about about to about 10 from about to about about to about 2 about to about 1 m about 1 to about 1 μ to about 5 from about 1 to from about 1 to about from about 10 about from about 20 to about 100 from about 40 to about about 30 to about from about 30 about 70 from about 40 to about 60 from about 40 to about 7 from about 40 to about 8. Reaction solutions may also contain one or more agents that improve RT activity. Such agents improve primer efficiency and improve product. In one embodiment, the solution comprises an agent that reduces binding of primers to the surface of the maturase. As described elsewhere herein, the agent may comprise any nucleic acid or small molecule that prevents or reduces such binding. In certain embodiments, the agent comprises D4A or a variant thereof. Variants of D4A may comprise a D4A or molecule having substantial homology to D4A, as described elsewhere herein. When included in reaction solutions of the invention, D4A or variant thereof may be present at a ratio of D4A or variant concentration to concentration from about to about. For example, in some embodiments, D4A or variant thereof may be present at a ratio of concentration to concentration of about or. In many instances, nucleotides will be present in reaction mixtures of the invention. Individual nucleotides will be present in concentrations of about to about 50 about about about about about about about t about 2 I about 3 about 4 5 about about about I S about about mM about 22 about 23 about 24 about about about 35 about 40 about 45 50 to about 50 mM to about 50 from about 1 about 50 from about 2 about 50 from about 3 mM to about 50 mM to about 20 about mM to about about to about 5 about mM to about from about mM to about 20 from about 1 mM to about HI from about mM to 5 from mM to about about mM to from about 1 mM to about from mM to about from about 2 mM to about from about mM to from about 1 mM to about about mM to about from about 2 mM to about from
about mM to about from m to about 2 from about mM to about 1 about mM to about to about from about to about from about M to about 20 from about mM to about from about mM to about 20. The combined nucleotide concentration, when more than one nucleotide is present, can be determined by adding the concentrations of the individual nucleotides. When more than one nucleotide is present in reaction solutions of the invention, the individual nucleotides may in equimolar a reaction solution may for 1 1 mM mM 1 mM. RNA will typically be present in reaction solutions of the invention. In most instances, RNA will be added to the reaction shortly prior to reverse transcription. Thus, reaction solutions may be provided without RNA. This will typically be the case when solutions are provided in. RNA, when present in reaction solutions, will often be present in a concentration of 1 to 20 mixture about about 20 50 about 100 20 200 20 about 10 about 500 about 20 about about about 25 about ds 20 about 20 about 20 about 400 raffls 20 about 500 about 750 about about g 2 about about 20 about 30 about 40 about 50 about 70 85 20 about from about 10 to about 100 fig 20 about 10 20 to about 100 about to about 100 from about to about 100 100 20 to about from about 10 to from about 10 to about 5 about to about 1 μ to about 10 from about I 20 to about 5 from about μl to about 1 about to about 5. As one skilled in the art would recognize, different reverse transcription reactions may be performed in volumes other than 20 μl. In such instances, the total amount of RNA present will vary with the volume used. Thus, the above amounts are provided as examples of the amount of RNA per 20 μl of reaction solution. Reverse transcriptases, including reverse transcriptases of the invention, may also be present in reaction solutions. When present, reverse transcriptases are often present in a concentration which results in about to units of reverse transcriptase about about about about about unit about 1 about about about about about about 20 about 25 about 50 about 0 about 1 SO about 200 about 250 unit about about 500 about 750 about from about 1 to about unit μl to about 1 about to about from about to from about 10 to about about 20 to about about 50 to about about 100 to about 1 about 200 to about to about about 500 μl about 1 from
about 1 to about 300 unit from about

Methods

In various embodiments, the invention includes methods of engineering variants of the maturase. In some embodiments, the variants have at least one enhanced property relative to unmodified maturase. In some embodiments, variants are engineered by one or more mutations in the maturase such that the engineered variant is improved relative to unmodified maturase with regard to the error rate or other property. In some embodiments, the method comprises modifying reaction solution conditions relative to unmodified solution to create an improved solution comprising maturase or a variant thereof with regard to the error rate or other property. In various embodiments, the invention includes methods of using a reverse transcriptase for a reverse transcription reaction. In one embodiment, the method comprises the use of a maturase or a variant thereof, or a nucleic acid encoding the maturase or a variant thereof, in a reverse transcription reaction. For example, in one embodiment, the method comprises contacting a reverse transcriptase comprising a maturase or variant thereof to an RNA template under suitable conditions to produce a transcribed DNA molecule from the RNA template. In various embodiments, the invention includes methods of performing a reverse transcription using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with an agent that reduces binding of primers to the surface of maturase or variant thereof. For example, in some embodiments, the method comprises using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with any nucleic acid molecule or small molecule that reduces such binding. In specific embodiments, the method comprises using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with a nucleic acid such as a double stranded or single stranded DNA or RNA molecule that reduces such binding. In some embodiments, the method comprises using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with a hairpin or molecule that reduces such binding. In some embodiments, the method comprises using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with a nucleic acid molecule derived from a group II intron that reduces such binding. In some embodiments, the method comprises using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in combination with D4A or a variant thereof, or a nucleic acid molecule encoding D4A or a variant thereof, in a reverse transcription reaction. For example, as described herein, D4A can be used with maturase or a variant thereof to improve RT activity by reducing binding of primers to the
maturase. For example, in one embodiment, the method comprises mixing the agent with a reverse transcriptase comprising maturase or variant thereof under suitable conditions, and contacting the reverse transcriptase to an RNA template to produce a transcribed DNA molecule from the RNA template. In one embodiment, the present invention includes methods of using maturase or a variant thereof, or a nucleic acid encoding maturase or a variant thereof, in an optimized reaction buffer in a reverse transcription reaction. For example, in one embodiment, the method comprises adding a reverse transcriptase comprising an E.r. maturase or variant thereof to an optimized reaction buffer, and contacting the reverse transcriptase to an RNA template to produce a transcribed DNA molecule from the RNA template. In one embodiment, the optimized reaction buffer comprises at a concentration of about to about, KCl at a concentration of to about, at a concentration of about to, DTT at a concentration of about to about, wherein the reaction buffer has a pH of about. In one embodiment, the reaction buffer comprises about 50 about 200 mM about 2 M about 5 and a pH of. In one embodiment, the optimized reaction buffer comprises a protein stabilizing agent. Exemplary stabilizing agents include, but are not limited to, osmolytic stabilizers such as mannisdomannitof and polyethylene, amino acids and derivatives thereof such as sax, ionic and quaternary, such as bovine serum albumin. In one embodiment, the optimized reaction buffer comprises trehalose at a concentration of about to about 1. In one embodiment, the optimized reaction buffer ar of to about th reaction comprises at a concentration of about to about. In one embodiment, the optimized reaction buffer comprises glycerol at a concentration of about to

Using maturase or variant thereof

Any technology that employs reverse transcription as a step can utilize the maturase and variants of the invention. In various embodiments, the improved maturase is used to perform reverse transcription as part of an assay. In various embodiments, the assay may be at least one selected from the group consisting of qR, capillary electrophoresis for mapping, as or D sequencing, nanopore, cDNA library, and a combination thereof. In certain embodiments, the method provides for reverse transcription at physiologic or at relative that required ved reverse. In certain embodiments, the lower temperature of the reverse transcription reaction provides a decreased rate of degradation of the RNA molecule during the reaction, relative to the rate of degradation of an RNA in a
reverse transcription reaction that reverse. In one embodiment, the method comprises reverse transcription of a long complex RNA. In certain embodiments, the reverse transcriptases have thereby allowing the synthesis of longer reads and ON it is herein that the reverse transcriptases of the present invention to reverse transcribe RNA templates complex. In one embodiment, the method comprises a solution comprising a low concentration of reverse transcriptase relative to the concentration required for a reaction using a different reverse transcriptase. In one embodiment, the method a single reaction amplification of possible by true ability of the reverse transcriptases described herein. The ability of the reverse transcriptases described herein allows for the of RNA without the need for DNA. In one embodiment, the maturase is utilized in a quantitative. In the of products is monitored in each cycle of the. The amplification is measured in which have additional devices for fluorescence signals during the amplification for and. In one embodiment, the procedure is carried thermostable improved without a. In one embodiment, the improved maturase enzyme is utilized in a capillary electrophoresis mapping. The application of electrophoresis to RNA probing is a step in increasing the of RNA. Although RNA probing in can readily i for short RNA, probing of long RNA can be challenging without the. Gel electrophoresis typically resolves about a hundred bases of RNA at a time, hence probing an RNA of several kilobases long might require running tens to hundreds of gels. Capillary electrophoresis allows the resolution of bases from a structure probing experiment, and multiple lanes can be run at the same time to increase the throughput of RNA structure. The readout of the probing experiment is typically the reverse transcription of a fluorescently labeled DNA primer that anneals specifically to the RNA of interest. When the RNA is several kilobases long, primers are to along the length of the transcript. Modification or cleavage of the RNA template results in premature stop in the primer extension, leading to of the product which are resolved. Software tools such as CAFA can automate the data acquisition capillary electrophoresis further improve speed and accuracy for et Nat
Rev. In one embodiment, the in RNA sequencing by recent developments generation become a powerful tool for analyzing expression, detecting transcript variants, and understanding the function of regulatory RNA. A library is generated sequencing adapters to. There are two main classes of methods to prepare specific libraries. The first method comprises ligating different adapters to the and ends of the RNA v2 from Life. The more widely used method comprises incorporating in addition to dNTPs in the second strand DNA. Following adapter ligation, the second strand can be specifically so that only the library strand containing the first strand cDNA will be sequenced, and information on the direction of the transcripts can therefore be obtained Sultan et Biochemical and Research 422 also see PCT Patent Application Number. The invention is also directed to methods for making one or more nucleic acid molecules, including labeled nucleic acid molecules, comprising mixing one or more nucleic acid templates (for example, one or more RNA templates or messenger RNA templates) with one or more polypeptides of the invention having reverse transcriptase activity and incubating the mixture under conditions sufficient to synthesize one or more first nucleic acid molecules complementary to all or a portion of the one or more nucleic acid templates, wherein at least one of the synthesized molecules are optionally labeled and/or comprise one or more labeled nucleotides, or wherein the synthesized molecules may optionally be modified to contain one or more labels. In one embodiment, the one or more nucleic acid molecules are. Nucleic acid templates suitable for reverse transcription according to this aspect of the invention include any nucleic acid molecule or population of nucleic acid molecules, particularly those derived from a cell or tissue. In one embodiment, a population of molecules from a number of cells or is used to make a labeled in accordance with the invention. Exemplary sources of templates include virally fungal plant cells and animal cells. The invention also concerns methods for may optionally be comprise one or more nucleic acid templates or or a population of mRNA with one or more polypeptides of the invention having reverse transcriptase activity, incubating the mixture under conditions sufficient to make one or more first nucleic acid molecules complementary to all or a portion of the one or more templates, and incubating
the one or more first nucleic acid molecules under conditions sufficient to make one or more second nucleic acid molecules complementary to all or a portion of the one or more first nucleic acid molecules, thereby forming one or more nucleic acid molecules comprising the first and second nucleic acid molecules. In accordance with the invention, the nucleic acid molecules may be labeled, may comprise one or more of the same or different labeled nucleotides, or may be modified to contain one or more of the same or different labels; labeled nucleotides may be used at one or both synthesis steps. Such methods may include the use of one or more DNA polymerases as part of the process of making the one or more nucleic acid molecules. The invention also concerns compositions useful for making such nucleic acid molecules. Such compositions comprise one or more reverse transcriptases of the invention and optionally one or more DNA polymerases, a suitable buffer, and one or more nucleotides including labeled nucleotides. The invention is also directed to nucleic acid molecules, including labeled nucleic acid molecules, produced according to the methods, and to kits comprising these nucleic acid molecules. Such molecules or kits may be used to detect molecules, for example by or for diagnostic purposes.

Producing improved maturase

In various embodiments, the improved maturase is produced by methods described herein or generally available in the art of cell and molecular biology. For example, the maturase may be produced by a host cell or by synthetic means. In one embodiment, the improved maturase is encoded by a polynucleotide to a which is inserted into an expression vector for expression. The vector is then inserted into the host cell, and a selection step may be performed to enrich the culture for host cells into which the vector has been inserted. After selection, cultures may be inoculated with host cells carrying the vector, and expression of the improved E.r. maturase may be carried out either during exponential growth or at another stage of growth of the culture of host cells. After expression of the improved maturase, standard or innovative biochemical purification steps may be performed to purify the protein from cellular components (see, for example, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York). The invention is also directed to nucleic acid molecules comprising a gene or nucleic acid molecules encoding the mutant or reverse transcriptases
of the present invention, or fragments thereof including fragments having activity, and to host cells comprising such genes or nucleic acid molecules. Any number of hosts may be used to express the gene or nucleic acid molecule of interest, including prokaryotic and eukaryotic cells. In some embodiments, cells are used to express the reverse transcriptases of the invention. One example of a prokaryotic host suitable for use with the present invention is Escherichia coli. Examples of eukaryotic hosts suitable for use with the present invention include cells S s cerev animal cells Sf and COS. Polypeptides of the invention may be isolated from a cell or organism expressing them, which may be a wild type cell or organism or a cell or organism. In some embodiments, such polypeptides may be substantially isolated from the cell or organism in which they are expressed. The invention also relates to a method of producing reverse transcriptases of the invention, said method comprising culturing a host cell comprising a gene or other nucleic acid molecule encoding a reverse transcriptase of the invention, wherein such reverse transcriptase gene or other nucleic acid molecule is contained by a vector within the host cell, expressing the gene or nucleic acid molecule, and isolating or purifying the reverse transcriptase. The invention is also directed to kits for use in the production of the maturase. In various embodiments, the present invention provides a kit to produce maturase or a variant thereof. In one embodiment, the kit comprises an expression system that comprises a polynucleotide encoding the maturase polypeptide or variant thereof. In one embodiment, the kit comprises an expression system that comprises a polynucleotide comprising or encoding a nucleic acid that reduces. In one embodiment, the kit comprises an expression system that comprises a polynucleotide encoding a protein that reduces. In one embodiment, the kit comprises instructional material that describes the use of the kit to produce maturase, wherein the instructional material creates an increased functional relationship between the kit components and the individual using the kit. In one embodiment, the kit is utilized by one person or entity. In another embodiment, the kit is utilized by more than one person or entity. In one embodiment, the kit is used without any additional compositions or methods. In another embodiment, the kit is used with at least one additional composition or method. The invention is also directed to kits for use in the reverse transcription methods of the invention. Such kits can be used for making nucleic acid molecules, including labeled nucleic acid molecules, or. Kits of the invention may comprise a carrier having in close confinement one or more containers such as bottles and the like. In kits of the invention, a first container may contain one or more of the reverse transcriptase enzymes of the invention, or one of
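the compositions of the invention. Kits of the invention may also comprise, in the same or different containers, at least one component selected from one or more DNA polymerases (for example, thermostable DNA polymerases), a suitable buffer for nucleic acid synthesis, and one or more nucleotides. In one embodiment, kits of the invention may also comprise, in the same or different containers, an agent that reduces binding of primers to the surface of E.r. maturase or variant thereof. In one embodiment, kits of the invention may also comprise, in the same or different containers, a reaction buffer as described elsewhere herein, or components used to produce the optimized reaction buffer. Alternatively, the components of the kit may be divided into separate containers. The invention is also directed to kits for use in methods of. Such kits can be used for making or sequencing nucleic acid molecules or at the particular temperatures described herein. Kits of the invention may comprise a carrier, such as a box or carton, having in close confinement one or more containers, such as bottles and the like. In kits of the invention, a first container contains one or more of the reverse transcriptase enzymes of the present invention. Kits of the invention may also comprise, in the same or different containers, one or more DNA polymerases (for example, thermostable DNA polymerases), one or more suitable buffers for nucleic acid synthesis, one or more nucleotides and one or more oligonucleotide primers. In one embodiment, kits of the invention may also comprise, in the same or different containers, an agent that reduces binding of primers to the surface of maturase or variant thereof, as described elsewhere herein. In one embodiment, kits of the invention may also comprise, in the same or different containers, an optimized reaction buffer as described elsewhere herein, or components used to produce the optimized reaction buffer. Alternatively, the components of the kit may be divided into separate containers for each of the components. Kits also may comprise instructions or protocols for carrying out the methods of the invention. In one embodiment, the invention provides a kit to use maturase or a variant thereof in reverse transcription. In one embodiment, the kit comprises the maturase polypeptide or a variant thereof. In one embodiment, the kit includes instructional material that describes the use of the kit to use maturase or a variant thereof in a reverse transcription. In one embodiment, the instructional material creates an increased functional relationship between the kit components and the individual using the kit. In one embodiment, the kit is utilized by one person or entity. In another embodiment, the kit is utilized by more than one person or entity. In one embodiment, the kit is used without any additional compositions or methods. In another embodiment, the kit is used with at least one additional composition or method.

EXPERIMENTAL EXAMPLES

The invention is further described in detail by reference to the following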
experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather should be construed to encompass any and all variations which become evident as a result of the teaching provided herein. Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, utilize the present invention and practice the claimed methods. The following working examples therefore are not to be construed as limiting in any way the remainder of the disclosure.

Example of Eubacterium

Most group II introns encode proteins as reverse transcriptases S et M Genes. These reverse transcriptases belong to the same family as the reverse transcriptases from and are characterized by an extension and insertions between 7 blocks that are conserved across all FJ et on the RT reg plus the insertions between these comprise the RT domain of group II FJ et. The RT domain consists of the finger and of which contains the catalytic center and is responsible for polymerase fidelity d et 200 Nucleic Acids Blocker FJ et. In addition to RT domains, the X domain is analogous to a polymerase and it functions in processivity S et 2001 Acids Blocker FJ et. Recent structural information on group II intron revealed the arrangement of RT and X domains in a canonical polymerase et Nat Struct Mol Zhao C Nat Struct Mol. In group II intron, additional domains including a domain and an EN domain could be found after the X domain 4A and Figure FJ et 11 Agrawal et RNA et al Nat Struct Mol 23. These domains play critical auxiliary roles in group II NN et 20 1 FJ et et Microbiol, but whether they have any contribution to reverse transcriptase activity is largely unknown. In the cell, the protein forms a complex with its intron RNA through a positively charged surface on the RT domain 1 et Nat Struct Mol Biol Zhao C et Nat Struct, and its reverse transcriptase activity is exerted in the context of this. A hallmark of reverse transcription reactions is high processivity. Highly processive reverse transcription is important for group II introns, as it is required for successful of group II introns and their healthy relationship with the hosts S et 2 1 AM et Microbiol. Although some studies have reported the high processivity of S et and related RTs A et Cost et
EMBO Piskarev at, which could be tremendously useful for RT enzymes in applications such as library construction et, splice characterization TW et, and RNA structure probing by mutational profiling and M et Nat Methods, understanding of the mechanism of the high processivity of group II intron RTs is very limited. Such a lack of mechanistic understanding is partly due to the lack of structural information of group II intron for the. In 20, both and structures of group II intron RT domains C Nat Struct and a microscopy structure of a group II intron maturase in complex with its host intron RNA G et al 20 Nat Struct were reported. These represent the beginning of a new era of analysis for group II intron RTs. To meet the growing need for processive reverse transcription of large RNA molecules, an unusually powerful new RT is derived from a group II intron, the E.r. C et Nature. The structure of this enzyme was solved to exceptionally high resolution, providing structural and molecular insight. Even before the optimization described herein, this enzyme promotes synthesis of long RNA templates, thereby laying the foundation for a versatile set of genomic tools. The RT enzyme is further developed and optimized, demonstrating its broad utility for diverse applications. The enzyme is capable of addressing at least two distinct problems: within V genomes are enabling the determination of coupled mutations that lead to drug resistance in patients over time, and providing a powerful tool for studies of splice site choice amenable to study in including the extraordinarily. This goal was previously impossible because of the inability to obtain. The invention enables investigators to finally track populations of alternatively spliced gene products, providing new insights into and gene. The scientific premise of the present invention is that a powerful of is used to accurately perform sequencing of long RNA, and this new technology is applied to address unmet needs in biotechnology and. The utility of the RT is exemplified by its application in studies of diversification in HIV. HIV is an RNA virus that evolves as a quasispecies of millions of individual viruses that rapidly evolve to generate extensive genetic diversity within a single patient VD et Curr Drug Targets. HIV diversification plays a major role in disease resistance to and vaccine responses
WF et PLoS 1. HIV is present in blood throughout infection and can be sampled over time. Models of the genetic changes that result from drug treatments have been attempted, but efforts to build them have been severely hampered by the inability of commercial RTs to faithfully copy the kb HIV RNA from individual viruses in the swarm for sequence analysis. This has resulted in viral sequencing strategies that rely on short reads, which limit genetic linkage analysis WF et A et. This prevents identification of distal genetic changes that coevolve in viruses, contributing to drug or immunological resistance or a gain in fitness WF et PLoS Comput Ronth A et. To overcome the limitations of conventional RTs, the RT is evaluated for its ability to generate cDNA. The Primer ID method has been developed, which was used for tagging a and was limited by CB et Acad Sci U S. Primer ID methodology used with the longer RTs allows for aligned sequence analysis of the viruses that compose the evolving patient swarm before and during treatment. By combining RTs with sequencing, it is possible to fundamentally improve sequencing, resulting in meaningful sequence and genetic analysis of individual viruses in patients, some after drug treatment. A second example of utility is in the study of alternative splicing, which is an essential mechanism for regulating gene expression and increasing protein diversity. The majority of genes within genomes are alternatively spliced MB et. For example, over of genes vele BR et and over of ET et Pan et Nat 1 encode RNAs that undergo alternative splicing. While many of these genes encode only two or three isoforms, some encode hundreds and even tens of thousands of isoforms. In genes encode over isoforms and together account for of all expressed transcripts Brown et. The most extreme example of this is which contains 1 15 exons, of which are alternatively spliced, and which has the capability to express distinct and isoforms D. Although sequencing has revolutionized the study of transcriptomes and the study of alternative splicing, many technical issues limit the ability to fully characterize complete transcripts. The biggest issue is that in many genes, alternatively spliced regions exist at multiple sites within individual RNA transcripts, and these regions are spaced apart by more than the read lengths of most high throughput sequencing platforms. As a result, much effort has been devoted
to developing transcript assembly software tools et Nat Grabherr MG et Nat C Nat. Although these computational approaches may correctly assemble many transcripts from short reads, they rarely assemble transcripts of genes that express multiple isoforms. In fact, one is likely unable to use any software to successfully assemble transcripts of complex alternatively spliced genes, as available software tools have difficulty with transcripts that have many isoforms. Thus, for genes with distantly located alternatively spliced regions, they can only infer, not directly determine, which isoforms were present in the original sample M et Nat. The availability of a robust and processive RT for preparing sequencing libraries provides the ability to thoroughly and accurately interrogate i. Innovation in the of sequencing enzymes is as important as innovation in sequencing hardware and software. Current RNA sequencing technologies are platforms that utilize RT enzymes with undesirable attributes such as poor processivity and high error rates. An accurate RT that efficiently carries out sequencing of long RNA is inherently innovative and is leading to innovations in the study of alternative splicing products and viral RNA, all within complex mixed populations. An innovative RT platform facilitates improvement in hardware and sequencing protocols because practitioners no longer need to rely on less accurate enzymes. With the RT, sequencing of HIV from patient populations can maintain genetic linkage in individual viruses. This allows researchers and clinicians an opportunity to interrogate coupled changes in viral populations by following individual viruses during development of resistance to cART and vaccine responses. In addition, the methodologies and protocols provided herein extend to analysis of coupled transcriptional changes in any microorganism or during. Using a highly processive RT enzyme as disclosed herein, sequencing of spliced pools enables determination of distributions of spliced isoforms for the first time, thereby making it possible to.

Optimizing properties of the RT as a new reagent for accurate reads

It is clear that a highly accurate RT is needed, and it is important that new technologies are characterized and optimized before they are widely adopted. A varied set of conditions have been explored for obtaining products from structured RNA molecules nts in length in. To that end, E.r. RT sequence
and stability have been characterized, and the resultant parameters are used to optimize conditions and the enzyme i. The ability to highly has been. Reaction conditions are improved so they are robust and can be readily adopted by other laboratories. The results are validated by conducting comparative studies with other known RTs, none of which have been quantitatively evaluated on templates longer than. The art provides quantitative studies of translocating helicase enzymes, which can serve to inform the present analysis et J Mol S et J et Nat Struct Mol Myong S et Pang PS et EMBO Serebrov V et J Biol i Serebrov et Wagner ID et EMBO 1.

Establish quantitative metrics for RT function to facilitate comparison of

Speed: To rigorously measure RT processivity and speed, and to determine reaction conditions for optimal enzyme activity, it is important to determine the velocity (nucleotides incorporated per unit time) of the enzyme during the individual initiation and elongation phases. Using RT templates that undergo only the initiation step of primer extension C et molecular, previous results indicate that addition of the first nucleotides represents a distinct initiation phase of. To measure the velocity of this phase, RT is incubated with template, initiating the reaction with a labeled primer, and time points are taken with a quench-flow reactor et Biol before products are separated by electrophoresis and the evolution of short products is plotted. Elongation velocity is obtained by monitoring the evolution of extended cDNA products on a long template.

RT dissociation constant: One important determinant of polymerase processivity is the tendency of the enzyme to fall off the template. Indeed, the formal definition of processivity for a directional enzyme is how fast it moves forward before it falls off TM 1 Rev. To obtain this value, experiments are performed in which RT is initiated with a radiolabeled primer. After allowing sufficient time for partial extension, a high concentration of cold trap is added to capture dissociating enzyme, and the rate constant for complete extension is measured in the presence and absence of trap, extracting the difference in these values.

Processivity: To evaluate variants of the RT and reaction conditions, quantitative values for processivity are needed. While this can be determined from values of velocity and dissociation, processivity can be defined more simply as the frequency of RT dissociation per
initiation S et on a template. This is measured by initiating the RT reaction in the presence of a trap RNA that prevents rebinding of free enzyme, separating the products on a sequencing gel, and comparing the ratio of radiolabeled full-length products with abortive products. Results show that the RT is highly processive on a template and that it significantly outperforms Superscript or. Metrics of performance on different templates of varying length, as well as see Figure, are measured in the presence and absence of trap, and fractional values for elongation are determined for mutants and variants and as a function of reaction condition.

Stability: The thermal stability of wild type RT is measured under diverse ionic conditions, temperature and detergents to identify conditions for optimal stability of the WT enzyme and to establish benchmarks for subsequent variants. Intrinsic thermal stability of the RT is measured using a thermal shift assay in which protein unfolding is measured in the presence of a dye that binds to folded protein, such as SYPRO orange et J Am, and studied in a plate in a PCR instrument. Decrease in emission or intensity is monitored as a function of temperature et Protein to determine the free energy of unfolding. Functional stability is measured by monitoring primer extension as a function of the conditions described above. Solubility is monitored by dynamic light scattering et AAPS.

Error rate and relative incorporation accuracy: The error frequency and spectrum of the RT are measured and compared with other common RTs using methods et Natl Acad Sci. An assay has been developed in which a template is reverse transcribed using a primer containing a and barcode, second strand synthesis is performed, adapters are added, and the resulting library is amplified by PCR, followed by sequencing. Grouping of sequence reads is used to collapse the U barcodes, creating a consensus sequence for each molecule, thereby removing any mutations that arose from PCR or sequencing errors and retaining those that occurred during reverse transcription. The consensus reads are aligned to the reference and the frequency of RT-induced mutations is determined. In an experiment using a set of synthetic RNAs from the External RNA Controls Consortium, it was found that the error rate profile of the RT is comparable to, if not slightly better than, that of. This error determination assay is robust and is used to characterize the RT variants that are generated and to compare them with commercially available RT enzymes. To obtain values for discrimination in the form of selectivity constants
fraction a classic kinetic analysis of single nucleotide incorporation into radiolabeled that encode a base for a single type of nucleotide is Reaction is initiated by the for the complementary or for a mismatch and time points are taken a Products are resolved on a denaturing gel and are to obtain relative incorporation rate for matched and nucleotides AM et Biol B et Biol It has been previously shown that template switching tends to be problematic during PCR rather than the steps of RNA sequencing protocols MX et Genome For using a of six distinct and template switching was observed with frequencies of and in libraries generated using and 30 rounds of PCR using Superscript II it is important to evaluate the frequency of template switching with RT and variants To evaluate this RT was used to extend a radiolabeled annealed to a long template in presence of a second RNA molecule at concentrations ranging from 100 aM to 2 it shares of No template detection of The extent of template switching on all RT variants as well as commercial RTs is monitored in two First the biochemical assay just described is performed using two RNAs of that share of are performed of highly by nanopore a pool of transcripts contains 96 different s that differ from one another at least two of the three variable are The pools prepared vitro transcription an of collection of individual clones of The are nd mixed together in either ratio or a dilution series where different transcripts differ by up to two orders of These pools are used to prepare libraries the RT variants and reaction conditions above and then sequenced a ON to a of reads per Which each read corresponds to is and the Of is based on the number of reads corresponding to input isoforms and switched isoforms that were not present in the input these approaches are providing valuable about the frequency of switching of RT which has important implications in interpreting read sequence With the above parameters in benchmarks for establishing a set of
optimized reaction conditions been In the ability of the RT to copy highly structured RNA templates is and the protein structure and sequence are optimized to further improve its thereby expanding its utility To optimize reaction E RT performance is tested special emphasis on as a function of concentrations of monovalent salts limited to and organic detergents and stabilizers such as and other buffer additives are important to be particularly attentive to improvements i as this can m optimization to ing Evaluate behavior of on stable A robust sequencing than processivity it must have force to disrupt copy stable RNA structures that within the RNA are extensively in coding and R substructures can present as obstacles that block weak it important to evaluate and optimize the ability of RT to open and copy a diversity of RNA substructures hopping over and reinitiating at downstream portions of the intron such as that of are thought to be strong polymerases because they have evolved to copy highly structured group intron during Presently disclosed data demonstrate that the RT can successfully copy the structured of indicating that has high degree processivity structured analysis of template structure for RT would provide valuable comparative information for optimization and interpretation of any abortive products that are set of stable RNA substructures are inserted into RNA templates for the varying secondary and tertiary stability templates are made the described below into the span of a sequence as the interior of the see Figure transcribing these on large scale RNA Stable test and ability of an RT to unwind and copy stable RNA stems shown Figure a containing a stable inverted repeat sequence that is located nucleotides the primer binding site is This enables one to test the power of the translocating RT during the elongation phase it is The inverted repeat forms a stem of ten alternating terminated by a loop sequence of A series constructs i is elongated sequentially by 20
known thermodynamic stability duplex strength is readily calculated energies is generated DM Nucleic Acids The 5 stability are and then the speed of the RT as copies these templates are RNA tertiary structures A series of stable RNA are inserted at the terminus the m the construct described the from HIV et J the A I intron PL et the group II intron Marcia M et al Cell et stable as the one et other struc motifs of known c stability SE et are tested Whether RT copies these and how they and other parameters described herein is M RNA structures become sharply the of Mg2 is important to context of highly structured RNA the RT structure to optimize RT the is before is widely distributed as a tool For that are relevant to RT function are thereby enhancing solubility and and enhancing t Streamlining the RT Like all group II intron the RT has additional protein tribute to RNA splicing and transposition but do not play role in RT RNA binding site DNA binding domain that can influence and efficiency To address these mutagenesis is used to delete C-terminal DNA domain and the secondary RNA binding sites on the of the protein C et molecular Qa et the conserved and Arg regions with polar groups such as wishing to be bound by any particular believed these changes nonspecific binding of the R to the forcing binding exclusively at the polymerase binding domain is altered using the parameters described above to test whether the exhibit enhanced or reduced Some alterations facilitate because they have little or no affinity for the E The toral and analysis of the RT and related RTs has demonstrated in to the the RT enzymes have a unique feature that appears to co to their unusually high Adjacent to the primer grip Figure there is a structural element called the Figure which is positioned to clasp the template and maintain ve addition by the Deletion of the does not prevent but it inhibits processive elongation the the the loop sequence is ID and the of sequence is almost invariant RTs and The
region of this sequence is and whether alterations processivity or s a alanine a polar residue scan th and an electrostatic scan are Production testing of mutants given that a reasonable number of loop positions is the is substituted with multiple which tend to make loops more and with which rigidity X et Adv Finally mutations incorporated on of the thumb optimizing its ability to clasp the Any variants with improved properties examined on structured RNA templates to identify any enhancements or diminutions Enhance Proteins identified In es the RT he properties more typical of protein et Environ Sterner Struct such as reactivity and even would be advance RT technology since it would of RNA templates a single While the enzyme was initially identified in a it is highly soluble and does not It is therefore advantageous to optimize the more characterized RT C et Nature molecular Analysis of thermophilic protein structure and function suggests that they tend to larger numbers of hydrogen bonds within rigid sections of the tertiary structure S et Protein Guided by the structure of the pairs are engineered at positions that are proximal in space Interpretation of the optimization results is straightforward because all of the parameters described above have been well established in of related one may find that m one parameter as results in deterioration of another as in addition to optimizing parameters as and error individually matrix screens of enzymatic are set which all parameters are varied in large in random combinations et Journal of Applied often resulting in combinations of optimal would never otherwise When testing stable to the strong stops by the RT may be once substructures block the polymerase are the reads through these is important sequence the products carefully the present on th to the RT is over is that the mutations will completely fail to alter the RT as the deletion mutants of the strongly To other group RT enzymes are guided by pipeline for protein
discovery C structural molecular domains are between For all biophysical data obtained triplicate subsequent fitting is that coefficients of determination are A ve and accurate RT is utilized to define population dynamics in the blood before and after patient combination treatment and The WT or RT is used to quantify individual barcode coverage with unique HIV sequences to track dsDNA processing errors and determine extent of the error introduction due to resampling and template This strategy the of conditions to reduce processing errors also provides a baseline to allow comparisons of improved RT before analyzing patient samples for mutational Monitoring genetic changes in individual within a patient requires sequencing methodology to detect mutations in the range S et Liang RH et Nucleic while providing viral sequence reads to retain genetic linkage of distal NGS are ideal for detecting genetic differences of viruses within a given low sequencing error rates S et J Liang RH et Nucleic Acids due to of the linkage between different mutations a viral is single molecule sequencing technologies generate long sequence a concern is the higher error S Nat Rev Quick J et 2 minor viral variants in the population might not To overcome these Primer ID methodology CB et Proc Set S S J which was originally developed for the sequencing of short EN to obtain Synthetic reads et 1 Stapleton JA et PLoS PLoS Use of unique barcodes allows for computational reassembly of the viral sequence to determine identity enrichment of species within a and for correction of amplification and sequencing errors S et J et Nucleic Acids The use of to provide HIV cDNAs will allow viral which Optimize barcode de Various proviral genomes of differing lengths beers given of current commercial to generate and quality viral cDNAs for The read coverage of the Kb proviral WM et J with 2 additional of the 9 to increase genomic was The HIV proviral genome was by overlap extension to incorporate and resulting
product was then used for After raw reads were Read alignment relied on previously reported tic took et PLoS 1 average reads per position yielded a coverage per In addition to determination of read depth tagged al s of errors amplification CB et Proc Natl Acad Genome Reads were sorted barcode each an individual genome set of reads mapping to different regions within A total of clusters were in close agreement to genomes calculated by qPCR at of the information allows to strategies for n of sequencing of required a MiSeq amount input genomes were that HIV genomes are and other proviruses are to conditions sequencing of patient HIV proviruses are sequence verified and are as reference for sequence RT utilized for generating cDNA from transiently lines to provide baseline for evaluating versions of To HIV RNA and copies of input or 30 PCR cycles are evaluated at step 2 during template is examined whether an identical run with genomes with 25 reads would generate fold depth the genomes To test an MiSeq providing is used to increase sequence depth per All analyses utilize the starting as the sequence Results provide information on barcode sequence coverage and of as sequencing Define the level of template switching during PCR amplification of the Mutations can arise during PCR these events are not as frequent given the fidelity of the enzymes The concern is template switching during an of amplification steps in the which would generate chimeric This potential problem is tested by mixing 2 HIV clones that differ in Triple gag and and BAL in and BAL genomic copies are barcoded by overlap extension then An entire providing is used to provide sufficient depth and After reads are sorted barcode each cluster ng an individual genome with a set number of reads to different regions within that g for The number of sequences obtained that are not unique to either or BAL is to of rate template chimeric occur th a rise an PCR cycle cycle number is adjusted so that chimeric sequences
decrease to of the error chimeric sequences remain when cycles are is important to focus on at the library preparation step and adjust PCR cycle validated protocols allow one to quickly evaluate modified improved E cART resistant HIV virion RNAs for In vitro of an HIV swarm for RT Studies described involve optimizing for HIV RNA length The sensitivity of RT is determined for detecting in a mixture of wild type HIVs from patients by simulating a swarm for RNA isolation and To accomplish HIV mixtures are generated from the following after cell type and with the following mutations S et 3 mutations 3 mutations in triple mutations protease and type HIV is mixed with and mutant the following ratios based on p24 protease and Wild and copies on are is used for and products run through the and sequenced using a and the is t the Barcode read and sequence coverage eeoavolnte and alignments are based on viral sequences to coverage and error sequencing output on generally resolves viral mutations The use of 2 containing differing of wild type to mutant HIVs enables the resolving power of protocol identifying and depth of mutant sequences to each t is or of total viral which allows the sensitivity and the sequence of mutant as well as error rates to be These outcomes the decision as to whether increased sampling depth and the use of the is Given that product amounts are quantified at each step in the conversion rate viral RNA to is of sequence output ability to identify viral mutants in mixtures methodologies of products for tracking and viral species provides important for depth of and error rates the combination treatment c successes The covariation of HIV mutations in protease and gag patient samples after 1 or 2 cART failures has been reported the studies did not address the presence of preexisting HIV variants patients before treatment that may have selected or during cART to give rise cART given the absence of RTs capable of RNA the sequencing studies relied on tints removing
any genetic between protease and individual well as other genes necessary for drug resistance and With the development of it is possible to obtain from V and follow individual genetic changes within the viral swarm before and during Sequential samples from 30 patients collected before and after cART are for viral 30 samples available before reads by the NGS method WF et PLoS 1 l linkage of distal viral mutations is allowing assessment of gene covariation by measuring linkage A et Mapper to test for LD A et it has been shown that NGS data can be searched for evidence of covariation by measuring LD within the viral mutational RNA is sequenced in considerable depth collected over and allowing one to define the linkage of distal and viral genes contributing to resistance which without precedent at the current ate compared published studies where NGS was used to computationally define of protease and gag mutations reads were supporting or reducing resistance WF et i is possible to whether viruses present before treatment contribute to c S is expected that RT reads through the HIV RNA the strategy be primers to obtain HIV RNA by dividing the virus into two overlapping and thereby utilizing RT obtain cDNAs than could be obtained current MLV primers for sequencing have been employed in the past WF et PLoS l For all molecular and when study designs are employed obtain data triplicate and with coefficient of the use of the RT to generate and cDNA libraries One of the greatest challenges of sequencing is that reverse transcriptases have limited therefore tremendous difficulty in traversing the of Nanopore sequencing was previously used to f cJ MT et These have demonstrated the feasibility of this approach the the for developing reverse The of s pioneered o Oxford this indicates region of 3 to which contains 95 of the exons that can be spliced in 1 different patterns these Superscript was to reverse transcribe either a in RNA pool or total RNA isolated from Drosophila The
processivity of SSII was overcome by Dscam1 using in the constitutive exons 3 and exon The amplified Dscam1 cDNAs were then and ligated to The were sequenced on R7 for 9 with as as 2 Using LAST S et Genome reads could be uniquely aligned to one variant in each corresponding to distinct The two direction reads where strands were aligned with an average of identity across the of the Using a set in vitro transcribe observed at a of less resolving a that plagued previous approaches to sequence were developed et Sun W et libraries were prepared sequenced on a The RNA samples used for these are RNA Variant Control Mixes are pools of 69 artificial transcript variants which 7 human model each of which contains multiple The SIRV span splicing GC contents and there are three different pools of SIRV RNA in which the various transcripts are in different either or one or two orders of These synthetic therefore provide the opportunity to assess the quality of library and in efficiency of reverse There is desperate need a ve reverse transcriptase to be used for the preparation of cDNA sequencing The RT variants conditions are utilized and applied to preparation of cDNA libraries that are assessed by nanopore The initial RNA may be the synthetic RNA pools from The yet complex of these synthetic RNA pools the assessment of the extent of cDNA After preparing and sequencing the using the standard Oxford Nanopore reads are aligned to the reference sequencing using LAST SM et Genome to assign each read to a specific For the of synthesis calculated dividing the number of reads that span the entire of the transcript by total number of reads that map specifically to that Since the transcripts have different GC and secondary calculating these values for each allows the of how each of these characteristics impacts the ability of the RT to faithfully copy these cDNA libraries are prepared using several RT and reaction conditions in and then using adapters the libraries with a molecular This enables
each library to be sequenced or the of depending the number of reads needed per library the throughput of the nanopore For initial a SIRV in which all are present in e Given that contain only the am of this pool allows to obtain coverage of each transcript by obtaining reads l Using the version of the Mi it is possible to reads in a hour sequencing which allows the multiplexing of tip libraries in a the current version of the uses a which at least higher throughput allowing for substantial The using the are an excellent way to monitor both processivity and accuracy of the These results are complemented by assays outlined and the error rate determination assays outlined to characterize the performance of Use method to ptorne sequencing of previously as part of the revealed that testis and ovaries express the greatest diversity of of all tissues nanopore sequencing of As synthesized by the RT other commercial front from testis and ovaries is the data from these libraries is compared to more traditional sequence the libraries is generated using same RNA samples previously used for project which billions of short reads were generated using the library preparation This allows the vast amount of data previously generated from these to be used to directly compare to the based libraries that is prepared and as described In short read libraries these same RNA samples using RT other commercial RTs instead of Superscript are generated sequenced on the In this way both short data the and other commercial The optimal RT and reaction identified herein are used to generate A libraries from testis and ovary RNA samples and these Example from group II is a highly processive and accurate reverse transcriptase Group II introns encode maturase proteins that as reverse transcriptases These reverse transcriptases are highly processive and as such properties are required for survival of group II inside their a critical understanding of the structural elements that determine of group Is their
structural has been Described is characterization of RT processivity of group II intron maturase from which has available structural information for its It was found that maturase has a superior intrinsic RT processivity compared to commercial Superscript IV This high processivity allows to substantially on a kb high processivity of is in on a loop finger that act as a The charged binding surface on the domain has contribution to RT reducing its positive charge increases the fraction on a difficult potentially reducing depletion through RNA sequencing estimated that the error rate of maturase is comparable to error rate These results not only provide a structural mechanism for high processivity of group intron also demonstrate that maturase has created a powerful tool RT the presented RT processivity of from r was maturase had higher processivity than commercial IV and it produced more products a HCV Such processivity may be at least partially attributed to a loop the that is unique to II intron and Deletion of this leads to complete loss of processivity and the maturase from a processive to a distributive engineering mutations of positive charges the surface interacts with group intron the does not affect R in reducing those positive charges increased the primer incorporation rate on a difficult RT potentially by increasing the active enzyme fraction that be otherwise through specific RNA error rate estimated by sequencing showed that maturase is at as accurate as The results presented in this example provide insights that reveal the structural of the superior RT processivity group II intron and these results the foundation for additional engineering of into more highly and accurate tool reverse Further detail the experiments presented here be found Zhao which incorporated by reference herein in its The materials methods to these experiments are Construct protein expression and purification The group intron was obtained from group intron database MA et Nucleic Acids and the
cDNA was by Invitrogen mutation constructs were generated by Q5 mutagenesis kit Construct has 4 point mutations including and Construct has 2 point mutations and Construct has 6 point mutations is a combination of and Construct is a triple that consists of and Construct has replaced residues with two and purification were performed according to protocol published previously C et Nat Struct was expressed with 6 in and was initially purified by affinity column fusion protein was then resin by containing 300 the fusion tag was cleaved by yeast I at for The precipitated protein after cleavage was spun down the supernatant directly loaded onto a 5 Hitrap SP column equilibrated with a buffer containing 300 at salt Under this Ulp1 does not bind the SP Hitrap SP used instead of the Hitrap as described in the previous because the SP column gives for some and the bound proteins were initially directly eluted a buffer containing 2 at The 5 fraction was to low salt and was then loaded onto Hitrap SP equilibrated with a mixture of low salt buffer and high salt The bound protein was with a linear gradient that buffer after 50 For loading the supernatant clarifying the SUMO the protein was eluted with a linear gradient that reaches high buffer after 50 elution For all the proteins after SP were finally purified by a Superdex column and the peak fraction was concentrated to and under y RT was 5 end labeled by by PNK and primer by polyacrylamide In this assay the RNA template first diluted to 40 in RNA storage buffer containing 10 and mM The RNA was with 40 primer at 1 1 volume and the mixture was heated at 95 for minute and was then snapped cool on ice for 10 Then the annealed incubated with RT in RT r buffer to the following For Ex maturase 2 mixture was with l RT reaction buffer mM M 20 100 and was then Ex maturase at 50 For SSIV and 2 Τ mixture DTT reaction buffer and was then mixed 1 at 50 The incubation was performed at room temperature 20 after which the RT reaction initiated by adding a
3 solution of 50 1 of μΜ and 1 of 5 mM The RT was performed for minutes at 42 for Ex maturase 55 for SSIV and 60 for RT was stopped by heating up the samples at 95 for 1 minute to denature the The enzymes were then digested by adding 1 protease at into the 10 RT and incubated 37 for Then the RNA was y adding 3 into the reaction mixture followed by incubating at 95 °C for 5 The RNA sample was then directly mixed with Urea loading dye the cDNA products resolved a polyacrylamide sequencing For control similar procedure was followed except that trap of 50 annealed to of 100 μΜ was included in the step for annealed tem and RT The intensity profiles for the gel extracted by software TL Pixel positions were converted to DNA length by interpolating the linear regression of the logarithm of bands in ladder against pixel position The median of every reaction plots were produced by software Prism version RT assay RT RepA D1 RepA F et Nat and genome were used as The for RepA to position primer for RepA D3 annealed to position 1630 and primers for genome to positions an I The primer was en by T4 was purified by In the the final RNA template concentration was nM and the final enzyme concentration was 500 The RT reactions were set up the buffer conditions and temperatures for each as been in processivity no were added the The reactions were allowed to proceed for for RepA D1 and D3 and for hour for HCV enzymes then digested by protease and the templates were by as described The products synthesized RepA templates were resolved by a polyacrylamide sequencing gel along ssDNA ladder A synthesized from HCV genome were resolved by a agarose gel according to the protocol published previously J in LE agarose was first dissolved in by After solution cooled down to alkaline gel running buffer mM and 1 EDTA was added to agarose before casting the The gel was run 1 alkaline gel running buffer at temperature for 5 hours 2 Went The was then transferred onto nylon membrane that was placed top of
2 layers of Whatman after the gel was covered by Saran To avoid gel the gel was at 80 for 1 hour under and was then allowed to slowly cool to room temperature under the for 1 The ladder used in alkaline agarose was the double stranded DNA ladder which was denatured under alkaline Error determination RepA D3 was used as the template for error rate and RT primer anneals to position to the annealing the RT has nucleotides sequence molecular which was followed by a condition barcode and region complementary to universal primer that is at the very end 8A and Table Table Primer sequences used for error rate N indicates primer used for synthesis similar which contains a region complementary to Index primer at the very followed a condition barcode region that is complementary to very end of A 8A and Table I the condition barcode was designed to sort different and partially resolves library problem by having condition barcode with in this the same barcode was used for different were by The RT reaction was up in a 20 µL volume with RNA template 2 annealed to pmol RT which is much less the that can he fey combined primers 430 The RT reactions were performed in conditions in that the reaction was 1 The reaction was stopped by heating at 95 for 3 and the reaction mixture was cooled down slowly to allow efficient annealing of to the A The RNA template was then digested 1 directly into the reaction mixture followed incubation at for 30 Then 20 µL RT reactions 2nd strand synthesis primer the 2nd strand was by Q5 in a 50 µL reaction volume a thermal cycler for a single cycle at 98 for 20 anneal at 50 for 30 seconds and extend 72 for 20 Then the µL products were by µL AMPure XP beads according to protocol The were eluted in µL and their was using LightCycler Green I Master kit using 0 A as The were then adjusted to the same concentration in different and 1 µL were first by PCR amplification primers for 3 cycles in 25 PC The PCR products were then by 45 µL AMPure XP beads and eluted i 1 of
PCR products were further amplified in PCR reactions for 10 PCR cycles using universal primer index For all PCR amplification the PCR is denaturing at 98 for 5 then amplifying using protocol with desired cycle at 98 for 20 anneal at 64 for seconds extend at for 30 and finally extend at 72 for 5 specificity of PCR was by an agarose gel stained by PicoGreen the products were samples were sequenced on an MiSeq sequencer mode for cycles with PhiX The data were processed by scripts published DF et Acids el in brief binding region and residues at both ends residues i and SO residues were sequencing reads that have residue with a lower than 20 were The sequencing reads were based on the at both and reads that share the same were counted a unique Reads were to reference by MUSCLE R BMC Edgar Nucleic Acids and errors were only when substitutional or in that to the unique only products that appears less than were used in substitutional results of the experiments now RT which traps and rebinding of any disassociated To the processivity with the processivity of other the domain of RepA was chosen as RT template et Nat as it allows efficient RT reaction for a variety of RT A trap concentration was then identified that is sufficient to prevent enzyme turnover given a certain template concentration reaction Under this no RT reaction is expected to occur when was Using this a condition was that is similar to what has been reported for RT A et Eng maturase has superior compared to As shown 5 and the intensity distribution in each lane maturase has only one minor RT stop at about 40 nt whereas to stop locations throughout the another group intron produced no product under optimal concentrate suggesting that the so inefficient or that of product synthesized below the detection limit The high intrinsic of makes highly efficient on long and structured RNA such as the kb HCV et As in gel and profile every gel lane 2A and had much fewer RT stops and much more cDNAs than for primer that annealed to
nt on HCV genome 2A and Figure Quantification shows that for all the products produced by the is product the of whereas this number is for SSIV and for Structural of RT Structural that contribute to the processivity of This study was by recently et Nat Struct Zhao et Nat Struct which of the finger and palm subdomains of and homology modeling of its thumb From a kinetics point of polymerase processivity should be considered at each nucleotide a ET and it results the competing forces that either drive the translocating forward on the RNA template and catalyzing addition of extra or lead to backward translocation or polymerase from the template Methods As backward translocation is generally observed a normal polymerase et G et the likelihood of polymerase disassociation major determines the processivity of In this the of high RT processivity should be the features that interact with RNA to prevent For reverse the in the finger and enclose RT active site prevent RNA template from falling off 4B and For HIV extending the by acids RT Y J addition to conventional processivity novel loop structure that is unique to group intron is in the finger of This loop is located right next to the and encloses the active site Deletion of this resulted complete loss of even under the length products for the Δloop increased is sharp the type that established a stable of products within minutes This behavior of the mutant consistent with a distributive which falls off the RNA template very frequently or after every nucleotide addition this is a unique processivity factor and based on sequence alignment this loop is also very likely present and potentially plays a role in group II intron maturases and the closely t was then asked whether all regions in maturase are important for its RT processivity RT activity in This is because is a protein that could also recognize and stabilize its host group II RNA and promote intron splicing et Genes et G Nat Struct Zhao et Nat Struct Mol Structural elements that
responsible for these functions might crosstalk introduce unwanted effects the reverse transcription Understanding th arit of these structural el promote the understanding of regulations i ancient proteins such as group II can also the engineering a highly efficient RT effects on RT efficiency have been potentially caused by the recognition of II intron For RepA domain was used as the could only a small portion of primer and the situation even worse for 78 and Figure This utilization problem is not severe RT reactions that as template This template dependency rules out the possibility that maturase has an low in which case maturase should have performed equally poorly on all RNA this problem could be explained depletion of both RNA and active maturase through interactions a positively charged surface on and intron Without wishing to be bound by particular because different RNA templates have different sequences and RNA the template interaction has different which leads to different degrees of incorporation in reactions for different further test mutants that have reduced charges on the RNA binding surface in maturase RT and domain engineered and their primer incorporation rate for RepA template In the crystal of maturase RT domain C Struct and the structure of group II complex G et Nat Struct the highly positively and binding surfaces lie on the opposite side of the RT active side 7 and therefore may be unlikely to play a in reverse set of mutations was designed focused on binding including of mutations was 6A and that potentially interacts these sets of mutants were combined to comprise 6 point mutations in total set of mutations on the maturase t domain including was designed that predicted to interact exon for facilitating group RT assay using the template t construct has increase in primer rate compared to the type construct has almost change whereas construct has increase compared to the type Figure Without wishing to be by particular this increase in primer
incorporation rate by decreasing the positive charge on binding surface suggests that template depletion is likely to play a role in the incorporation without wishing to be bound by any particular theory, this improvement of t construct compared to and alone suggests that the template binding is Even with 6 alanine mutations on the positively charged the construct is still only able to utilize of RepA Without wishing to be bound by any particular theory, this suggests as the positively surface on RT is positively residues need to be neutralized in order to achieve a higher RT efficiency on some RNA has no change in the under indicating that this positively charged surface does affect RT has a decrease compared to wild type and Figure suggesting that the positively charged interact during group play a role in recruiting RNA template during RT reaction maturase is an accurate reverse transcriptase error rate was to determine its accuracy compared to commercial RT and other group II intron Many methods have been employed to estimate polymerase error rate the For the mutation selection assay was the most widely used I f 85 J Biol as not RT mutations will result in functional this probably error The development Table Error rate for different reverse The total of is the raw of sequencing reads either forward or reverse direction each The unique product is a set of reads that share the same molecular and only products that have less than 3 reads shows the number of The substitutional these unique products were around for and for SB and Table insertion and deletion events were also observed at the sequencing depth used in this Table These results suggested that as accurate as other reverse transcriptases as a frequency at about 1 is the best number achievable for a polymerase without a domain et Nucleic Acids similarity error for the suggests that the positive charge RNA recognition surface does not change polymerase The error rate for group TGIRT reported in the previous study S et times lower than the rate for maturase
determined in this discrepancy is likely due to differences in the methods used to determine the error In the previous the authors overlapping region of the forward reverse reads in a paired-end sequencing experiment at a level S et therefore the sequencing depth for each nucleotide might not have sufficient to accurately estimate the error These previous estimates of the error rates and TGIRT may from those shown of the following in experimental While only a single template was employed for the Mohr utilized an entire This has two if the rate has sequence the intrinsic error rates will be different for a different RNA the sequence alignment algorithms for the two approaches is therefore necessarily more noise associated with In Mohr et the calculation was performed on data which the sequences were read twice from both ends and only if the overlapping between these two reads was perfectly aligned and longer than 20 This results in a small amount of data that can be included the subsequent error rate Outlier in Mohr et al the authors errors that are common to This causes a significant of characterization this the processivity of maturase has been characterized the first and has compared to the popular commercial SSIV that is derived from The results demonstrate that the maturase has high intrinsic that allows it to long 9 transcripts with much fewer RT stops than The comparison of maturase with suggests high processivity a highlight of reactions by group intron structure was identified here is for the of on the crystal structure of maturase RT domain C et al. Nat this loop encloses RT site likely able to prevent template s crystal structure was obtained the absence of RNA this context the forms a short at the tip stabilized in closed conformation that appears to obstruct the template entry pathway C et Struct Mol 6A Figure the structure of G et Nat Struct the same region of this forms a stabilized in an open through interactions with domain 4 in this is likely to be flexible able
to swing and out to the association of RNA Sequence of group II intron shows the presence of this loop is highly indicating that the of other group intron also be at least partially b this the amino acid within the is poorly especially the suggesting that this functions primarily through steric effects The presence of in RTs such as LI indicates that this RT processivity highly l and likely to also play a role in LTR The present of structures of group II G et Struct Zhao C Nat Struct suggest this is part of a could potentially sec RNA templates thereby enabling the high processivity in group intron The composed of and of the thumb grasps the duplex close to the extension termini The outer which is also present in other is composed of a finger and a thumb and could help to further stabilise the product duplex. Conserved positive charges are identified at the tip of this in are especially enriched in the thumb and 1 in While not wishing to be bound by any particular theory, these positive charges the tolerance to salt group intron maturases compared to has also reported to enhance polymerase amino acids longer improved the processivity of T Y et J It is likely these and outer clamps since of can to a complete loss of processivity By employing strong electrostatic forces on the thumb and an extra steric gate in the finger the se overcomes its size limitation accomplishes even higher processivity than HIV which has a much extensive the duplex this it is that is an accurate RT that has a comparable to other as The substitution frequencies for these RTs are about 1 x Although this number is over an order of magnitude larger than DNA polymerases such and it is comparable to the error rate of Klenow fragment which also lacks a and it is even comparable to Taq polymerase that has activity et Nucleic Acids the error of is about the best that a polymerase can achieve exonuclease As mentioned lower substitution frequencies and Superscript reported an earlier study et likely a result from of the
seq reads used for measuring the error As recognized for long time error rate of polymerase is beyond what thermodynamics of alone could For it has been reported that the ΔG for complementary and in solution is only which translates to a mismatch ten to a few hundred based on Boltzmann distribution J Biol was that high specificity beyond the thermodynamics limitation be by a kinetic Acad by an state after the step of initial can discriminate correct versus incorrect the error rate of a polymerase could be raised to the second therefore 1 in becomes 1 in 10000 Proc Natl Acad 71 was that kinetic proofreading could be attributed to of upon dNTPs which is energetically unfavorable in presence of base pairs Y et Acad Sci S on the substitution error of it is very likely that and other group II also undergo this to closed Engineered site that have improved fidelity of this transition are considered the Because of its high is a good candidate to be utilized as a tool reverse transcriptase Although there is already thermostable group the has its special potential for the following high resolution structural information for its domain and is available C et Nat Struct In the design of mutant construct that is more efficient primer incorporation on a difficult template without affecting processivity and fidelity 66 and Figure is demonstrated behaves poorly on RepA Dl template and can only utilize the was originally identified the group database for its stability et Struct and protocol has been developed to obtain highly pure is only stable as a fusion construct with an tag Mohr S et and the presence of this MBP tag might future engineering of and introduce unwanted effects in and the mutants described have great potential to as highly ve accurate reverse the reaction condition for buffer composition for reverse transcription by was systematically H I including buffer salts and their and the concentration of Subsequently the effects of different additives were also explored
the optimised The additives include and triton Sixteen different buffers were tested here Buffer 1 developed by was used as point The results are shown in Figure terms of and buffer the data demonstrates that performs best pH in Tris Primer efficiencies are similar at pH and S3 HEPES buffer I and which are but yield is higher at pH which is compared to the yield of at pH At pH in HEPES buffer no product Tris TAPS buffers were further tested at 4 and and it was found that the yield of is further improved Tris buffer although primer The primer incorporation in TAPS pH reduced to In the Tris buffer pH the concentration of chloride was then increased from 0 mM to 200 and showed that the primer incorporation is increased to Sodium chloride and ammonium chloride at 200 give lower primer respectively 7 and optimised buffer contains Tris pH and 200 potassium the concentration of magnesium chloride increased to was observed that the activity of Ex maturase was almost abolished optimal buffer maturase was identified as a that contains 50 mM Tris pH mM 2 mM and 5 using the optimal several additives that are enzymatic including and triton were Betaine is a zwitterion and believed to destabilize the base pairing DNA or RNA double and thus may reduce the structure of for the presence of 1 the primer increased to but the yield of product slightly reduced to from Trehalose is endogenously 5 DTT and M r A maturase investigated and from 1 to respectively was that the primer incorporation efficiencies are increased by addition of and respectively from lane to the yields of are and respectively lane 1 to binding site D4A helix is located at loop and adjacent structure et Dai Besides the apical loop of Ex Group II the adjacent may be for maturase the apical loop and adjacent stem region of shown in the box in may represent a binding and be used as a functional fragment to reduce binding of primers to the maturase of maturase variants for thermostability Designing the mutations maturase
referred to as a E cterium r it quickly loses herein are experiments to improve its by introducing The are designed based the conserved residues in et The residues are conserved in thermophilic different in maturase suggest roles in To compare the acid sequence of maturase with thermophilic a multiple sequence including maturase and 4 maturases thermophilic bacteria was performed Ten residues that are only thermophilic were These positions include i 171 and 337 based on the numbering of the tertiary structure of En A29 and V82 are located in the same hydrophobic core that are all the aligned It is very likely the two residues interact each other in a e instead of two is more is located at the of an mutation improve the stability of the a triple created for analysis enzyme activity assay is close to and introduce stacking with 0 stabilize the located at a loop the RT domain and and M337T may linker between two two single and were created for Expressing proteins Expression of the three proteins triple and induced by for After purified by the proteins were treated by SUMO protease to remove The analysis showed that both and mutant proteins were as mixtures and proteins For the more expressed truncated form that in and this situation is more severe for and M337T mutants Lane the triple are mainly expressed a length Lane and Evaluating the mutant After activities of the three mutant enzymes were measured different RepA served as the A and the optimized that contains 50 200 2 5 mM DTT and M trehalose The reverse transcription reactions were carried out at 55 and respectively to evaluate their performance and and the type enzyme served as the At the triple has a better performance than the giving a higher primer incorporation efficiency and length product yield A2 is less thermostable than the At higher the more active than the The thermostability of mutant is almost the same as the at different as s Figure mutation severely impairs the and the of Since these enzymes ve at 55
their activities were not quantified under these At the primer efficiencies by and M337I are and respectively and the yields of product are and At the primer incorporation efficiencies are reduced to and and the yields of product are reduced to and

The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific embodiments, it is apparent that other embodiments and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such embodiments and equivalent variations.