1. (WO2011161063) LEAA FROM TRICHODERMA REESEI
Note: Text based on automatic optical character recognition processes. Only the PDF version has legal value.

LEAA FROM TRICHODERMA REESEI

The present invention relates to a transcription promoting protein methyltransferase of Trichoderma reesei.

It is known in the art that filamentous fungi such as Trichoderma reesei can be used to produce valuable compounds. Due to their glycosylation and secretion capacities, filamentous fungi are preferred hosts for producing secreted proteins.

Most of the industrial production of enzymes for plant biomass hydrolysis is performed with mutants of the fungus Trichoderma reesei (the anamorph of the tropical ascomycete Hypocrea jecorina). It is known that the genome contains a lower number of cellulase and hemicellulase genes than the genomes of other fungi. Although the number of said genes is relatively low, Trichoderma reesei shows a relatively high expression of cellulases and hemicellulases, which indicates that the expression of these enzymes is somehow up-regulated.

It is an object of the present invention to provide means to improve the expression of homologous and heterologous proteins, in particular enzymes, in fungi, in particular in Trichoderma reesei. This means should be able to up- as well as down-regulate protein expression in fungi. Furthermore this means may also be used in recombinant protein expression.

Therefore the present invention relates to an isolated polypeptide with protein methyltransferase activity having at least 80% identity with amino acid sequence SEQ ID No. 1.

It surprisingly turned out that the polypeptide of the present invention, which may be encoded by nucleotide sequence SEQ ID No. 2 from Trichoderma reesei, is able to regulate the expression rate of a specific group of proteins which are all related to biomass degradation (carbohydrate-active enzymes, CAZymes). It could be shown herein that if the concentration of the polypeptide of the present invention is reduced within the cell the amount of the aforementioned proteins is also reduced. On the other hand an increase of the concentration of the polypeptide of the present invention within the cell leads to an increased production of CAZymes. These data clearly demonstrate that the regulation of the expression of the protein methyltransferase having at least 80% identity with amino acid sequence SEQ ID No. 1 allows the regulation of the expression rate of specific proteins within a cell.

The above described effect is a result of the fact that biomass degrading proteins are usually clustered within the genome of fungi like the enzymes involved in the biosynthesis of secondary metabolites. The latter are known to occur in clusters, frequently near the telomere end of the chromosomes. In another fungal genus, the Aspergilli, such clusters of secondary metabolite genes have been demonstrated to be epigenetically regulated at an upper hierarchic level by the protein methyltransferase LaeA, by reversing the repressing heterochromatin structure resulting from methylation of K9 on histone 3A and binding of the heterochromatin protein HepA to histone 3A. Because of the clustered co-occurrence of cellulase and secondary metabolite synthesis genes in the Trichoderma reesei genome, cellulase formation is regulated by a LaeA orthologue which exhibits at least 80% identity with amino acid sequence SEQ ID No. 1.

As indicated in Table 1 of the examples section the regulation of the expression of the polypeptide having amino acid sequence SEQ ID No. 1 results in the regulation of the protein expression of the following proteins (classification in accordance with Henrissat B and Bairoch A, Biochem J. 316 (1996): 695-696; www.cazy.org/Glycoside-Hydrolases.html): GH5 endo-β-1,4-glucanase Cel5A, GH5 endo-β-1,4-glucanase CEL5B, GH6 cellobiohydrolase 2 CEL6A, GH7 endo-β-1,4-glucanase EGL1, GH7 cellobiohydrolase 1 CEL7A, GH12 endo-β-1,4-glucanase 12a, GH45 endo-β-1,4-glucanase EG5, GH61 endo-β-1,4-glucanase CEL61A, GH61 endo-β-1,4-glucanase CEL61B, GH1 β-glucosidase CEL1B, GH1 β-glucosidase CEL1A, GH3 β-glycosidase of uncertain specificity, GH3 β-glucosidase CEL3D, GH3 β-glucosidase CEL3C, CIP2, CIP1, CBM13 protein, swollenin, swollenin-like, 84 % ID to 123992, GH10 xylanase XYN3, GH11 xylanase XYN1, GH11 xylanase XYN2, GH30 xylanase XYN4, GH3 β-xylosidase BXL1, GH43 β-xylosidase/α-arabinofuranosidase, GH74 xyloglucanase CEL74a, hemicellulose side chain cleaving enzymes, CE5 acetyl xylan esterase AXE1, GH67 α-glucuronidase AGU1, GH62 α-L-arabinofuranosidase ABF2, GH54 L-α-arabinofuranosidase ABF1, GH95 α-fucosidase, GH95 α-fucosidase, GH92 α-1,2-mannosidase, GH92 α-1,2-mannosidase, GH47 α-1,2-mannosidase, GH2 β-mannosidase, GH27 α-galactosidase AGL1, GH27 α-galactosidase AGL3 and GH28 polygalacturonase.

The findings of the present invention can be used to provide host cells, in particular genetically modified fungi like Trichoderma reesei, which show an increased or reduced CAZyme expression activity. For instance, it is possible to provide Trichoderma reesei cells which do not express biomass degrading enzymes (by inactivating, e.g. by gene deletion or disruption, the gene encoding the protein methyltransferase of the present invention). However, it is also possible to provide cells which show a high activity of said enzymes. The latter effect can be achieved by increasing the expression rate of the protein methyltransferase according to the present invention. This can be achieved by introducing further copies of a nucleic acid molecule harboring a nucleic acid stretch encoding the protein methyltransferase of the present invention. Alternatively the promoter region of the native gene may be modified to comprise a homologous or heterologous promoter which is much stronger than the native promoter. In order to produce heterologous proteins in a fungus like Trichoderma reesei under the control of the protein methyltransferase of the present invention, the gene of one of the above identified enzymes, which are regulated by said protein methyltransferase, may be exchanged by genetic manipulation for a heterologous nucleic acid molecule encoding a product of interest. Since the genome sequence of Trichoderma reesei, for instance, is known in the art such manipulations can be easily performed.

Methods for the production of such genetically modified fungi, in particular Trichoderma reesei, are well known in the art. For instance, in WO 2006/060126 methods for transforming and cultivating Trichoderma reesei cells are disclosed.

It is emphasized that also a functional fragment of the polypeptide showing protein methyltransferase activity and having at least 80% identity with amino acid sequence SEQ ID No. 1 is subject of the present invention. "Functional fragments" of the protein methyltransferase of the present invention refer to possible fragments which still retain the activity of the full length polypeptide.

According to the present invention the polypeptide showing transcription promoting activity may be at least 80%, preferably at least 90%, more preferably at least 95%, even more preferably at least 98%, in particular 100%, identical with amino acid sequence SEQ ID No. 1.
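The identity thresholds above can be illustrated with a minimal Python sketch. The text does not state which alignment method or which denominator is used to compute identity, so this sketch makes two assumptions: the two sequences are already aligned (gaps written as "-"), and identity is the number of identical columns divided by the number of aligned columns. The function names are hypothetical and serve only as illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: identity = identical columns / aligned columns,
    ignoring columns in which both sequences carry a gap ("-").
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First ten residues of SEQ ID No. 1 versus a hypothetical variant
# carrying one substitution (C -> A at position 9).
reference = "MSRNAPNGCV"
variant = "MSRNAPNGAV"
print(percent_identity(reference, variant))       # 9 of 10 columns identical
print(meets_threshold(reference, variant, 80.0))
```

In practice the alignment itself would come from a dedicated tool; the sketch only covers the counting step once an alignment is available.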

SEQ ID No. 1:

MSRNAPNGCVPPSQATAPPSPATSLRLTVGEPVSEPATESGERVLQDGFWEHGRFYGSWKPGKY LFPIDKEELNRLDVFHKYFLVARDEKVTSTPLRKDGRPKIMDLGTGTGIWAYNVVEEYAKDAEI MAVDLNQIQPALI PRGV KQFDIEEPSWDPLLRDCELIHMRLLYGS IRDDKWPHVYRKAFEHL APGIGYIEQLEIDWMPRWENEDLPRHSALQEWAQLFQRAMHRYHRSVTVSGEATRRRMEAAGFT DFSETTIRCYVNPWSPDRHQRECARWFNLAFSLGLEAMSMMPMIDKLGMTKDDIVDLCSRAKKE MCILRYRAYCTL

A further aspect of the present invention relates to an isolated nucleic acid molecule encoding an isolated polypeptide having at least 80% identity with amino acid sequence SEQ ID No. 1.

The nucleic acid molecule according to the present invention exhibits preferably at least 80% identity with nucleic acid sequence SEQ ID No. 2.

SEQ ID No. 2:

ATGTCTCGAAACGCTCCCAACGGGTGTGTTCCACCCTCCCAAGCTACTGCTCCGCCTTCGCCAG CCACAAGTCTGCGACTAACAGTTGGGGAACCGGTCAGCGAGCCGGCCACTGAATCCGGGGAGAG AGTTCTCCAGGATGGGTTCTGGGAGCACGGTCGCTTTTATGGTTCTTGGAAGCCTGGGAAATAC CTTTTCCCCATAGACAAGGAGGAGCTCAATAGGTTAGATGTCTTTCACAAGTATTTCCTCGTTG CAAGAGACGAGAAAGTCACTTCAACTCCCCTGAGGAAAGATGGACGGCCGAAAATCATGGATCT CGGCACAGGCACGGGCATCTGGGCGTATAATGTTGTGGAAGAGTATGCCAAGGATGCCGAAATC ATGGCCGTGGATCTCAATCAAATTCAACCAGCTCTGCACTTGGCCCCTGGCATTGGCTATATCG AGCAACTGGAGATTGACTGGATGCCGCGATGGGAGAATGAGGATCTCCCCAGACATTCGGCTCT TCAAGAATGGGCTCAGCTATTCCAACGTGCCATGCATCGCTACCACCGCAGCGTCACGGTATCA GGCGAGGCTACCAGACGCAGAATGGAAGCGGCTGGCTTTACAGATTTCTCCGAAACAACGATCC GGTGCTACGTAAACCCGTGGTCTCCCGATCGCCATCAGCGGGAGTGTGCCCGTTGGTTCAACCT CGCCTTCAGCCTCGGCCTTGAGGCCATGAGCATGATGCCAATGATTGACAAACTCGGCATGACC AAGGACGATATC

Nucleic acid sequence SEQ ID No. 2 is directly derived from Trichoderma reesei. However, according to the present invention this sequence may of course vary provided that the encoded protein still exhibits the transcription promoting protein methyltransferase activity of the polypeptide disclosed herein.
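The relationship between the two sequences, and the remark that the nucleotide sequence may vary while the encoded protein stays the same, can be sketched in Python. The sketch assumes the standard genetic code and uses only the first 30 nucleotides of SEQ ID No. 2; the function name and the silent variant (TCT changed to TCC in the second codon) are hypothetical illustrations and are not taken from the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of SEQ ID No. 2.
seq2_start = "ATGTCTCGAAACGCTCCCAACGGGTGTGTT"
# Hypothetical silent variant: second codon TCT -> TCC (both encode serine).
silent_variant = "ATGTCCCGAAACGCTCCCAACGGGTGTGTT"

print(translate(seq2_start))      # MSRNAPNGCV, the start of SEQ ID No. 1
print(translate(silent_variant))  # same peptide despite the nucleotide change
```

The two inputs differ at one nucleotide yet translate to the same ten residues, which is the kind of variation the paragraph above allows for.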

According to the present invention the nucleic acid molecule exhibits at least 80%, preferably at least 90%, more preferably at least 95%, even more preferably at least 98%, in particular 100%, identity with nucleic acid sequence SEQ ID No. 2.

A further aspect of the present invention relates to a vector comprising an isolated nucleic acid molecule according to the present invention.

In order to transfer a nucleic acid molecule encoding the polypeptide of the present invention to a host cell the nucleic acid molecule of the present invention is provided in a vector. The vector may be an expression vector capable of expressing the polypeptide of the present invention. However, the vector of the present invention may also be a recombination vector which allows the transfer of a nucleic acid molecule encoding the polypeptide of the present invention into the genome of a host cell. The vector of the present invention may comprise further elements such as additional coding sequences within the same transcription unit, controlling elements such as promoters, ribosome binding sites, transcription terminators, polyadenylation sites, additional transcription units under control of the same or different promoters, sequences that permit cloning, expression, homologous recombination, and transformation of a host cell.

The vector of the present invention preferably comprises further at least one promoter operably linked to said nucleic acid molecule.

In order to control the transcription of the nucleic acid molecule of the present invention within the cell at least one promoter is provided within the vector. "Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A promoter "operably linked" to a coding sequence is present in the cell in such a way that expression of the coding sequence can be directly influenced by the promoter.

According to a preferred embodiment of the present invention the at least one promoter is selected from the group consisting of tef1 promoter (translation elongation factor 1), gpg1 promoter, pki1 promoter, eno1 promoter and pgk1 promoter.

Of course it is also possible to use any other promoter which can be used to regulate gene expression in host organisms such as Trichoderma reesei. Particularly preferred promoters are those regulating the protein expression of the following proteins in Trichoderma reesei: GH5 glycoside hydrolase (protein ID 81087), Bradyrhizobium bleomycin resistance (protein ID 103009), unknown hypothetical protein (protein ID 106270), HHE domain protein, conserved (protein ID 70608), hypothetical conserved protein (protein ID 109925), PTH11-type GPCRs (protein ID 109146), MSF permease (protein ID 78585), glutathione-S-transferase (protein ID 112022), Catalase (protein ID 58472), mannose-6-phosphate isomerase (protein ID 60445), unknown protein (protein ID 105287), translation initiation regulator Gnc20 (protein ID 22839), imidazole propionase-related amidohydrolase (protein ID 110757), MSF transporter (protein ID 3405), short chain dehydrogenase/reductase (protein ID 106164), MSF permease (protein ID 79202), unknown protein (protein ID 60370), hypothetical secreted protein (protein ID 122889), Flavonol reductase/cinnamoyl-CoA reductase (protein ID 111716), Zinc-binding oxidoreductase (protein ID 23292), hypothetical protein (protein ID 124198), unknown protein (protein ID 109523), Kynurenine aminotransferase, glutamine transaminase K (protein ID 122820), GPR1/FUN34/yaaH-like protein (protein ID 60810), hypothetical protein (protein ID 54352), Predicted Zn-dependent hydrolase (beta-lactamase superfamily) (protein ID 70197), sulfite transporter ssu1 (protein ID 2076), unknown protein (protein ID 4851), unknown secreted protein (protein ID 110830), unknown esterase/lipase (protein ID 70491) and MSF permease (protein ID 105260). These genes are regularly upregulated under cellulase inducing conditions (e.g. cultivation of Trichoderma reesei on lactose).

The vector according to the present invention, which can be transformed into a Trichoderma reesei cell, comprises in the 5' region of the nucleic acid molecule exhibiting at least 80% identity with nucleic acid sequence SEQ ID No. 2 or of the nucleic acid molecule encoding a polypeptide having at least 80% identity with amino acid sequence SEQ ID No. 1 a promoter which allows control of the expression rate of the polypeptide of the present invention within a cell. Preferred promoters are selected from the group consisting of cbh1, cbh2, xyn1, xyn2, xyn3, egl1, gna3, env1, cDNA1, bxl1, pki1, gpdA, gpd1 or hex1 promoters. Preferred promoters are also promoters disclosed, for instance, in Nakari-Setala et al. (Appl. Env. Microbiol. 61 (1995), 3650-3655; cDNA1 promoter), Rahman et al. (Biosci. Biotechnol. Biochem. 73 (2009), 1083-1089; egl3 and xyn3 promoter), WO 98/23764 A1 (short cbh1 promoter), EP 0 952 223 A1 (cbh1 promoter of Trichoderma viridae), US 7,393,664, US 7,517,685 (hex1 promoter) and Mach et al. (Curr. Genetics 25 (1994): 567-570; pki1). In a preferred embodiment of the present invention the promoters may be of heterologous or homologous origin.

Suitable promoters can also be provided by using the method disclosed in US 5,989,870.

Another aspect of the present invention relates to a recombinant host cell comprising a nucleic acid molecule or a vector according to the present invention.

In order to increase the amount of the polypeptide of the present invention within the host cell, more than one copy of the nucleic acid molecule or vector according to the present invention is preferably provided in said cell.

According to a preferred embodiment of the present invention the host cell is a fungus, preferably a fungus of the class of Sordariomycetes, more preferably a fungus of the family of Hypocreaceae, even more preferably a fungus of the genus of Trichoderma, in particular Trichoderma reesei.

A further aspect of the present invention relates to a genetically modified Trichoderma reesei cell overexpressing a polypeptide with transcription promoting protein methyltransferase activity having at least 80% identity with amino acid sequence SEQ ID No. 1 compared to the genetically unmodified wild-type Trichoderma reesei cell.

The term "overexpressing a polypeptide", as used herein, refers to the property of the genetically modified Trichoderma reesei cell to express (i.e. to produce or to synthesize) said polypeptide to a higher extent than the genetically unmodified wild-type Trichoderma reesei cell from which the genetically modified Trichoderma reesei cell is derived and which is used to obtain said genetically modified Trichoderma reesei cell. According to a preferred embodiment of the present invention the genetically modified Trichoderma reesei cell expresses at least 20 %, preferably at least 30 %, more preferably at least 40 %, particularly at least 50 %, more polypeptide of the present invention than the genetically unmodified wild-type Trichoderma reesei cell. This amount can be determined by methods known in the art such as ELISA or other methods involving antibodies specifically binding to the polypeptide of the present invention. The methods for obtaining a genetically modified Trichoderma reesei cell according to the present invention are well known in the art.

The Trichoderma reesei cell according to the present invention comprises preferably a vector as defined above.

According to a preferred embodiment of the present invention said cell comprises at least one mutation within the 5'-region of the genome location comprising nucleic acid sequence SEQ ID No. 2 or a variant thereof having at least 80% identity to SEQ ID No. 2.

The Trichoderma reesei genome location comprising the 5' and 3' region of nucleic acid SEQ ID No. 2 comprises the following nucleotide sequence (SEQ ID No. 3; italic: intron, underlined: coding region) :

CCTTTTACCAACTTGGCAGCCCTTGCCTCTTCGTTGCTGGCTAGTAGGGGAGGCAGGCCATTGATCCCGG GCTCGCGTCAATCCACCAAGCCCCAAAGAGCCCTAGAAGCTCGCGACACTTGTCATTGAACCAACACGAC TCTCAACCGCCGTCTGTCGATTCTCACTTCGGCATTCGTCGACCTCCTCCCTCCAGCCGCTGGTCCACTC CGGACCCGAGCTTCGCGCCAGTCTTAAAGGGCTAGCCGTCCTCGCCCCCCCTTCTCCAGTCCGCCGAACG ACAGTCTCAACCTCAACCCTGGGAGTCGTGATAATTTTCTATACCGCCCTTCTCCGCTCTCTCAGCCGTC ACTCGCTTGTGTCTTCCAACCGGAGCTGTCCACGCTGCCGCCTGGGACGATTACGCCCTTGCCTGTCGCT GCATTCGTGCTGCAAATTGGTTGCTCCGGTGGCTCAGGTGACCCTGGTCGAGGACTTCTACCTCTTTCAA GGAGCCTCGTGAGAGTTGATACAGCAGCTGCCCTCGACAGGTCACACAAACACATGGGATCGTCGCTTAT TCTGACCTACGGCAGCTGCGCGGCAACCCACCCGTACCACAGGCATCGGTATCTGTGTCACCTCCTTGAC ATCTCGGCCAGTTGAATTTCTGATGTAAGTTGCCATCTGCCTTGTTGTTGTTCGTCTGGCTGCCCTACAC CATCTTGCCCGCCTGCTCGGCTCCCATCTACTCCTTGATTTATCTTTCCATCTTGTCTTACTGCCATGAT GCGCGTTGAAAGGCGCCATGCATGCCATCGTGTCCGGCCAGGATATGAATGCTGTGCTCAGTCAGCCGTC TTCCTTTTTGGAAGCCAATCAGCAAGGTTGAACTGTCTTTCTCTTTCCTTGCCATCATCAACCCTTGCGG CAGATGTTTATGTCTCTCGCCCCTTGCCTTGAAGTGGGCACCTCCCCGCGATTTCAAGCTGTTACACTCT TCCCCCCCTTCGGCCTCTTCTGTCGCCTTCTACCGCCCTTGGACTTCTATTTCGGTAGAAGTGCCACTTG TTCCTTGCCCAAGACATCCTCACGCGTTATACTGTACTTAACAAGCAGGGTCTCGCTCCACTCCTTGTTA TAACACCATCACATGCGCTTCTTCTTTTCAAAAAAACGGCTTGGGATTAACAAACTTGTTGTAATTAGCT ATCTAACATCTCCTTGACCGGCTCAAAATCACCGGCTATAACTACTGACCCTTCCTCTCCCCTTCCCCCT TCCTGGACCCCTTGGCACTGGACTCTGGAACATCCGCCTGGAGCCCGCCACCTGCATTCCCAAGGATTGA CCCGCCCCTCCCGTTGGCCCTCATAACCTTCGCCATTACTCACTATAAACGCCATGTCTCGAAACGCTCC CAACGGGTGTGTTCCACCCTCCCAAGCTACTGCTCCGCCTTCGCCAGCCACAAGTCTGCGACTAACAGTT GGGGAACCGGTCAGCGAGCCGGCCACTGAATCCGGGGAGAGAGTTCTCCAGGATGGGTTCTGGGAGCACG GTCGCTTTTATGGTTCTTGGAAGCCTGGGAAATACCTTTTCCCCATAGACAAGGrrrGrcrcrrrGrAGC GTCAATACCCTCCGCGCCTGTTCGTACAACTAACAACATCACCAGGAGGAGCTCRATAGGTTAGATGTCT TTCACAAGTATTTCCTCGTTGCAAGAGACGAGAAAGTCACTTCAACTCCCCTGAGGAAAGATGGACGGCC GAAAATCATGGATCTCGGCACAGGCACGGGCATCTGGGCGTATAATGTTGTGGAAGAGTAAGTTATTATA

GAGGTAGTTTCACACTACGCTGTGCCAGTTGCTCACATTTTTCAGGTATGCCAAGGATGCCGAAATCATG GCCGTGGATCTCAATCAAATTCAACCAGCTCTGTAAGTTGTGAGCTTTCAATCGCTTGACCTTTTTTTTT TTTTTTTTTTTCAAACGCTAATGCATTTGCTGAATAGCATTCCTCGAGGTGTAACAACCAAGCAGTTTGA CATTGAAGAGCCCTCGTGGGATCCACTGCTTCGGGACTGCGAATTGATCCATATGCGATTGCTATACGGC AGCATAAGAGATGACAAGTGGCCCCATGTCTACCGCAAGGCCTTTGAGTGCGTAACTCGTGTAACCAACA ACCrGCrCCGrrrcrGACGrrCGrrrACAGGCACTTGGCCCCTGGCATTGGCTATATCGAGCAACTGGAG ATTGACTGGATGCCGCGATGGGAGAATGAGGATCTCCCCAGACATTCGGCTCTTCAAGAATGGGCTCAGC TATTCCAACGTGCCATGCATCGCTACCACCGCAGCGTCACGGTATCAGGCGAGGCTACCAGACGCAGAAT GGAAGCGGCTGGCTTTACAGATTTCTCCGAAACAACGATCCGGTGCTACGTAAACCCGTGGTCTCCCGAT CGCCATCAGCGGGAGTGTGCCCGTTGGTTCAACCTCGCCTTCAGCCTCGGCCTTGAGGCCATGAGCATGA TGCCAATGATTGACAAACTCGGCATGACCAAGGACGATATCGTCGACCTCTGTAGCAGAGCCAAGAAGGA GATGTGCATTCTGCGGTACCGCGCCTATTGCACTCTGTAAGCCGCCCGCCCCCCTGACAAACACAATTTG CCGGAGCCACAACTAATCACGCTTGCAGACACATTTGGACAGCCAGAAAACCGAACGAAGATGAGTCTCA AACTTTCAAAGAAAGAGACTCCGATACGCAGCCATCTAGAAGAGAGGAATCCTCTGCTTAAGACGCCATG GTCGCCAATAGATCGGATAGAGAAGAGAAACAGCATCCTTGCGTCCACATAATACAAACGGCGACCGGCA GGGATGCGAGAGCCAGCGTTCTGCGGCTTGTCCTGTTTTCGAAGCTACAGAAGCCCACCGTGCTATGTGA GCTACCGCTTCATATATGATCGTCGCCGACGCTGAACAATGTCATGCAACCATCACCAAGATTCTCATAC AGGTATAGACGAAGCCTATTCAAGAACGAGAAGTATGCGAGGTGTGGATACTTTGGCTCTGTCTCATTCG AATGAAGCACTGCTAGTGGGTTCATGGCCGGATATATGATCTCTCGTCTTCGTTCTTTACGCTCTTGACG ATATGATGCCCGACAAAGGCCAGCCCGTTTGGTTACAATGAAAGATGACCTTGGGCTTAGACCTACTCCT CTTCTCTCCCTCTTCCTTCATCCTTTCTCTTCCCCTTGATCCCGAGGACTTATCACATGACGACGAAGAA CGGGGAAACTCTGATGGCTAGAAGCATTGTAGGGACTTAGACGAAGGTGGGGAACAGGTGTACAGAAACA TGTCGGCCAGAGTGTTATGGGACTCGGCCGTTGCAAACACGAAGAGATGGCGATAAGAGCCTGAATGGGG TTGGCGTTACAGGGGCATTGTGTATGGCTTTACTCTCTGCTGTCGATTGGATAGATCGTGTCTTTCGAAC TTGAGGATGTTGACATCACTCGTGCTGGTTTTTCTGATTATGTTGTTGGTTAATCGCTTCTGCTAGCAGG GGCATCTCGGCAAGGGGTGGGCATGACCAAGAGATGCCGGAATCACCCCATTACGAAACACTACCCAAGC TGCTAAAACATCCCCATGTGGCCAGATGCAAAGGGAACGAAAAAAAGAAGAAGAGTAAAGAAAAAGCAAA CAACACTTTGAAGATATACTATTAGGGCCCTTTGTATGATACACTTGAGACTGCCTCCTGCATGGTTTCA 
TCTGGGACGCTGATGCATGGATACCAGCGCCATGTAACCCGGGACAAGGTCCCCTTAGGTTTGGGTAGTC TAGGTGGTAACCTAGGCCAGTAGACAGGGGAAGGGTATGGGGGCAGACCGGGCAAATCATTTCAGGGACG GGGCAGCAAACTACGAGTGAAAGATTGAGAGGCCGAGAGGAAACTTGATACGGGTGGAAAGAGTTTGCTT CTGTTCAAAGGGGGATGTTGTTGGAGAATGGAAAGCGTGAGTCTTTTGGTGAGAATGATGTTGTGATGTT G

The 5' region of SEQ ID No. 2 as evidenced above (SEQ ID No. 3) may be mutated by incorporating additional or alternative regulatory sequences. These regulatory sequences include promoters, such as those listed above in connection with the vectors of the present invention.

According to a preferred embodiment of the present invention the at least one mutation is a deletion, an insertion or a point mutation.

According to a further preferred embodiment of the present invention a promoter, a transcription factor binding site or a functional fragment thereof is inserted within the 5'-region of the genome location comprising nucleic acid sequence SEQ ID No. 2 of said Trichoderma reesei cell, resulting in an increased expression of the polypeptide encoded by the nucleic acid sequence SEQ ID No. 2 or a variant thereof having at least 80% identity to SEQ ID No. 2 compared to the wild-type Trichoderma reesei.

The Trichoderma reesei cell of the present invention preferably comprises further a recombinant nucleic acid coding region operatively linked to a Trichoderma reesei promoter sequence and/or a vector comprising a nucleic acid coding region operatively linked to a Trichoderma reesei promoter sequence, wherein the nucleic acid coding region is under the transcriptional control of said promoter sequence.

The nucleic acid coding region encodes preferably for a peptide, a polypeptide, a protein or a functional DNA or RNA.

According to a preferred embodiment of the present invention a nucleic acid molecule encoding a protein, polypeptide, peptide or functional DNA of interest may be introduced into the genome of Trichoderma reesei by gene replacement at the genome location comprising one or more of the genes encoding the proteins listed in table 1 of the example section. Methods for performing a gene replacement are well known in the art (see e.g. Guangtao Z et al., J. Biotechnol. 139 (2009): 146-151). The genomic sequence of Trichoderma reesei is known in the art (Martinez D et al., Nat. Biotechnol. 26 (2008): 553-560), therefore the gene replacement can easily be performed using the methods known in the art. The following genes are examples that are abundantly expressed on lactose, and present in LAE1-regulated genomic clusters: alcohol oxidase AOX1 (Trire2:80659), a hexose transporter (Trire2:105260), the lactate/pyruvate transporter (Trire2:121441), a major facilitator superfamily protein (Trire2:70972), and GPR1 (Trire2:60810).

Alternatively, if the purpose is heterologous overexpression of proteins, the following cellulase genes could be replaced: Cel6A (Trire2:72567), Cel7A (Trire2:123989), Cel61B (Trire2:120961), and Cel5A (Trire2:120312).

Yet another aspect of the present invention relates to a method for the recombinant production of a peptide, a polypeptide, a protein or a functional DNA or RNA comprising the step of cultivating a genetically modified Trichoderma reesei cell according to the present invention.

As described above and shown in the examples the expression of specific proteins in Trichoderma reesei is controlled by the protein methyltransferase of the present invention. This control mechanism allows the generation of recombinant Trichoderma reesei cells which harbor heterologous nucleic acid molecules within the genomic loci of the nucleic acid molecules naturally regulated by the polypeptide of the present invention within the cell. Of course, it is also possible to exploit this effect to overexpress biomass degrading enzymes naturally occurring in Trichoderma reesei cells and to isolate them. To achieve this object genetically modified Trichoderma reesei cells as described above have to be used which overexpress the polypeptide of the present invention in comparison to wild-type Trichoderma reesei cells.

The present invention is further illustrated by the following figures and examples, however, without being restricted thereto.

Fig. 1 shows the effect of loss-of-function of lae1 on biomass formation and cellulase/hemicellulase enzyme formation by T. reesei. Growth of T. reesei QM 9414 and the corresponding Δlae1 strain on 1 % (w/v) cellulose (a) and 1 % (w/v) glycerol (b). Biomass on cellulose is quantified as the fungal protein that can be extracted from the cellulose-fungus debris by 0.1 M NaOH (1 h, 30 °C) and referred to 1 L of culture, whereas that on glycerol is given by the biomass dry weight per L. Cellulase (c) and hemicellulase (d) formation by T. reesei QM 9414 and the corresponding Δlae1 strain on 1 % (w/v) lactose and 1 % (w/v) xylan, respectively. Values are means of 3-5 biological replicates.

Fig. 2 shows the expression of the two cellulase genes cbh1, encoding CEL7A (a) and cbh2, encoding CEL6A (b) in T. reesei QM 9414 and the Δlae1 mutant during growth or incubation, respectively, on glycerol, lactose and sophorose. Expression in QM 9414 is given with full bars and set to 1.0 for every condition. The respective expression levels in relation to the wild-type are shown with open bars. Data are means of triplicate determinations from two biological replicates.

Fig. 3 shows the biomass formation (A), cellulase production (B) and extracellular protein (C) during growth of T. reesei QM 9414 (QM) and several mutant strains bearing an additional copy of the lae1 gene (DO, Dl, D2, D3, D7) on lactose. The three bars represent (from left to right) values for 48, 72 and 96 hrs of cultivation.

Fig. 4 shows the biomass formation (A), cellulase production (B) and extracellular protein (C) during growth of T. reesei QM 9414 (QM) and several mutant strains bearing an additional copy of the tef1:lae1 gene construct (Wl, P8, 01, Ml-2, M2-3, El, Nl) on lactose. The three bars represent (from left to right) values for 48, 72 and 96 hrs of cultivation.

EXAMPLE :

Materials and Methods

Strains

T. reesei QM9414 (ATCC 26921), an early cellulase producing mutant, and H. jecorina KU70, a derivative of the QM 9414 uridine auxotrophic pyr4 negative strain TU-6 (ATCC MYA-256) which bears a deletion in the ku70 gene and is thus deficient in nonhomologous end joining, were used in this example. Escherichia coli JM109 (Promega, USA) was used for plasmid construction and amplification.

For cellulase or xylanase production, T. reesei was grown in Mandels-Andreotti medium [26], using Avicel cellulose, lactose, oat spelts xylan or glycerol as a carbon source (1 %, w/v) as stated at the respective results. Induction of cellulases by sophorose (0.5 mM) in pregrown, washed mycelia was performed as described (Sternberg et al., J Bacteriol 139 (1979): 761-769).

Construction of a Δlae1 strain of T. reesei

To delete the lae1 gene of T. reesei, a 1.2 kb lae1 coding region was replaced by the T. reesei pyr4 (orotidine 5'-phosphate decarboxylase-encoding) gene. This was performed by amplifying around 1 kb of the up- and downstream non-coding region of lae1 from genomic DNA of T. reesei QM9414 using the primer pairs given in the following table:

Table A: Oligonucleotide primers used for construction of vectors for lae1 deletion and amplification


*Respective restriction sites are underlined

Resulting PCR fragments were ligated by T/A cloning into pGEM-T Easy (Promega, USA). The upstream non-coding region was excised by digestion with XhoI/HindIII and the downstream region by XhoI/ApaI from the pGEM-T Easy backbone, and then both fragments were ligated into an ApaI/HindIII restricted vector pBluescript SK(+) (Stratagene, USA). The resulting plasmid was cleaved with XhoI, dephosphorylated and the 2.7 kb SalI fragment of T. reesei pyr4 inserted, resulting in pΔlae1.

lae1 gene amplification in T. reesei

To introduce a second copy of lae1 into the genome of T. reesei QM 9414, 900 bp of the upstream and 500 bp of the downstream non-coding region of lae1 were amplified from genomic DNA of T. reesei QM9414 using the primer pairs given in Table A. As a selection marker, 2 kb of the A. oryzae ptrA (pyrithiamine resistance conferring) gene was amplified from plasmid pME2892 (Kubodera et al. 2000) using the primer pair given in Table A. PCR fragments were cloned into pGEM-T Easy; lae1 was then excised with SpeI/PstI and ptrA with PstI/HindIII. lae1 was subsequently ligated into pBluescript SK(+), previously cut with SpeI/PstI, followed by the cloning of ptrA into the resulting plasmid to give plae1ptrA.

mRNA extraction and Real Time PCR

DNase treated (DNase I, RNase free; Fermentas) RNA (5 μg) was reverse transcribed with the RevertAid™ First Strand cDNA Kit (Fermentas) according to the manufacturer's protocol with a combination of the provided oligo-dT and random hexamer primers. All real-time RT-PCR experiments were performed on a Bio-Rad (USA) iCycler IQ. For the reaction the IQ SYBR Green Supermix (Bio-Rad, USA) was prepared for 25 μl assays with standard MgCl2 concentration (3 mM) and a final primer concentration of 100 nM each. All assays were carried out in 96-well plates which were covered with optical tape. The amplification protocol consisted of an initial denaturation step (3 min at 95°C) followed by 40 cycles of denaturation (15 sec at 95°C), annealing (20 sec at 57°C) and elongation (10 sec at 72°C). Determination of the PCR efficiency was performed using triplicate reactions from a dilution series of cDNA (1.00E-00, 1.00E-01, 1.00E-02 and 1.00E-03). Amplification efficiency was then calculated from the given slopes in the IQ5 Optical system Software v2.0. Expression ratios were calculated using REST© Software (Pfaffl M.W. et al., Nucleic Acid Research 30(2009): e36). All samples were analyzed in two independent experiments with three replicates in each run.

Transcriptome analysis of lae1 loss-of-function

Mycelia from both stages were freeze-dried and ground in liquid nitrogen using a mortar and pestle. For each of the two experimental conditions, five independent replicates of mycelium were mixed. Total RNAs were extracted using TRIzol® reagent (Invitrogen Life Technologies, USA), according to the manufacturer's instructions, and then purified using the RNeasy MinElute Cleanup Kit (Qiagen, Germany). The RNA quality and quantity were determined using a Nanodrop spectrophotometer. High quality purified RNAs were submitted to Roche-NimbleGen (40 μg per 3-microarray set) where cDNAs were synthesized, amplified and labeled and then used for subsequent hybridization.

A T. reesei high density oligonucleotide (HDO) microarray (Roche-NimbleGen, Inc., USA) was constructed, using 60-mer probes representing the 9,130 genes of T. reesei. Microarray scanning, data acquisition and identification of probe sets showing a significant difference (p=0.05) in expression level between the two culture conditions considered were performed by Roche-NimbleGen. Transcripts showing significantly up-regulated expression (2-fold and 5-fold changes) were annotated using the eukaryotic orthologous groups (KOG) classification. The microarray data and the related protocols are available at the GEO web site (www.ncbi.nlm.nih.gov/geo/) under accession number GSE20516.

Construction of T. reesei strains with altered lae1 alleles

To study the function of LAE1, we constructed T. reesei strains in which lae1 was deleted, and strains which expressed lae1 under the strong constitutive expression signals of the tef1 (translation elongation factor 1-alpha encoding) promoter region.

To delete the lae1 gene of T. reesei, the 1.2 kb lae1 coding region was replaced by the T. reesei pyr4 (orotidine 5'-phosphate decarboxylase-encoding) gene. This was performed by amplifying around 1 kb of the up- and downstream non-coding region of lae1 from genomic DNA of T. reesei QM9414 using the primer pairs given in the following table:

Oligonucleotide primers used for construction of vectors for lae1 deletion and overexpression

Name              Sequence (5'-3')*

lae1 gene deletion
5Trlae1Hind       TAAGCTTCACTCGCTTGTGTCTTC
5Trlae1Xho        TCTCGAGCGTTTATAGTGAGTAATGGC
3Trlae1Xho        TCTCGAGCTATTGCACTCTGTAAGCC
3Trlae1Apa        TGGGCCCTGGGTAGTGTTTCGTAATG

tef1-lae1 construction
tef1Xhofw         GCCTCGAGGGACAGAATGTAC
ClaSalrv          AGTCGACATCGATGACGGTTTGTGTGATGTAGCGTG
TrLae1ATGCla      GCTATCGATGTCTCGAAACGCTCCCAAC
TrLae1TermHind    CGAAGCTTGCCCAAGGTCATCTTTCATTG

*Respective restriction sites are underlined
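Since the underlining that marked the restriction sites is not preserved in this text version, the sites can be located from the primer sequences themselves. The following Python sketch does this for the primers listed above. The recognition sequences are the standard ones for the enzymes named in this section (HindIII, XhoI, ApaI, SalI, ClaI); the function and variable names are illustrative assumptions and not part of the original protocol.

```python
# Standard recognition sequences of the restriction enzymes named in this section.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "ApaI": "GGGCCC",
    "SalI": "GTCGAC",
    "ClaI": "ATCGAT",
}

# Primer sequences as listed in the table above (5' to 3').
PRIMERS = {
    "5Trlae1Hind": "TAAGCTTCACTCGCTTGTGTCTTC",
    "5Trlae1Xho": "TCTCGAGCGTTTATAGTGAGTAATGGC",
    "3Trlae1Xho": "TCTCGAGCTATTGCACTCTGTAAGCC",
    "3Trlae1Apa": "TGGGCCCTGGGTAGTGTTTCGTAATG",
    "tef1Xhofw": "GCCTCGAGGGACAGAATGTAC",
    "ClaSalrv": "AGTCGACATCGATGACGGTTTGTGTGATGTAGCGTG",
    "TrLae1ATGCla": "GCTATCGATGTCTCGAAACGCTCCCAAC",
    "TrLae1TermHind": "CGAAGCTTGCCCAAGGTCATCTTTCATTG",
}


def find_sites(sequence):
    """Return {enzyme: 0-based start of the first site} for each site in the primer."""
    seq = sequence.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Each primer carries its engineered site close to the 5' end, which is consistent with the enzyme abbreviation in the primer name.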

The two resulting PCR fragments were digested with HindIII/XhoI (upstream region) and ApaI/XhoI (downstream region) and ligated into an ApaI/HindIII restricted vector pBluescript SK(+) (Stratagene, La Jolla, California), followed by the insertion of the 2.7 kb SalI fragment of T. reesei pyr4 in the XhoI site, resulting in pRKBS1.

For expression of lae1 under a strong constitutive promoter, a 1,820-bp lae1 PCR fragment including the coding and terminator region was amplified with the oligonucleotides TrLae1ATGCla and TrLae1TermHind, and the fragment was inserted downstream of the tef1 promoter region (Genbank accession number Z23012.1) into the ClaI/HindIII sites of pLH1hphtef1, resulting in vector pRKBS3, which contains the E. coli hygromycin B phosphotransferase (hph) under T. reesei expression signals as selection marker (Akel et al., Eukaryot Cell 8 (2009): 1837-1844).

Fungal transformation

All vectors constructed were verified by sequencing. The strains were purified twice for mitotic stability, and integration of the expression cassettes was verified by PCR analysis. Gene copy numbers of the integrated constructs were determined by Southern analysis, using chromosomal DNA cleaved with BamHI. Protoplast preparation and DNA-mediated transformation were carried out as described (Guangtao et al., J Biotechnol 139 (2009): 146-151).

Biochemical assays

Cellulase enzyme activities were determined using carboxymethylcellulose (1 %, w/v) as described (Vaheri et al., Biotechnol Letts 1 (1979): 41-46). Protein in the culture supernatant was determined by the method of Bradford.

Transcriptome analysis of lae1 loss-of-function and lae1 overexpression

Mycelia were ground in liquid nitrogen using a mortar and pestle. Total RNAs were extracted using TRIzol® reagent (Invitrogen Life Technologies, USA), according to the manufacturer's instructions, and then purified using the RNeasy MinElute Cleanup Kit (Qiagen, Germany). The RNA quality and quantity were determined using a Nanodrop spectrophotometer. High quality purified RNAs were submitted to Roche-NimbleGen (40 μg per 3-microarray set) where cDNAs were synthesized, amplified and labelled and then used for subsequent hybridization.

A T. reesei high density oligonucleotide (HDO) microarray (Roche-NimbleGen, Inc., USA) was constructed, using 60-mer probes (7 probes per gene, 10 transcripts with less than 7 probes; a total of 63836 probes) representing the 9,143 genes of T. reesei.

Microarray scanning, data acquisition and identification of probe sets showing a significant difference (p < 0.05) in expression level between the different strains were performed by Roche-NimbleGen (www.nimblegen.com). Transcripts showing significantly down-regulated expression in the lae1 strain (at least 2-fold changes) were annotated manually. The dataset was also manually screened for genes encoding carbohydrate-active enzymes that were down-regulated at least 2-fold. The microarray data and the related protocols are available at the GEO web site (www.ncbi.nlm.nih.gov/geo/) under accession number GSE22687 (platform GPL10642).
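The selection criterion described above (p < 0.05 and at least 2-fold down-regulation) can be sketched in Python as follows. The record layout and the function name are assumptions made for illustration; the first two records use values reported in Table 1 of this document, whereas the last two are made-up entries that fail one of the two cut-offs.

```python
def select_downregulated(records, min_fold=2.0, alpha=0.05):
    """Keep transcripts that are down-regulated at least `min_fold`-fold with p < alpha."""
    return [
        r for r in records
        if r["fold_decrease"] >= min_fold and r["p_value"] < alpha
    ]


records = [
    # Values taken from Table 1 of this document.
    {"protein_id": 123989, "name": "CEL7A", "fold_decrease": 6.841, "p_value": 0.000994},
    {"protein_id": 74223, "name": "XYN1", "fold_decrease": 2.038, "p_value": 0.00213},
    # Hypothetical entries, included only to show what the filter rejects.
    {"protein_id": -1, "name": "hypothetical_A", "fold_decrease": 1.4, "p_value": 0.01},
    {"protein_id": -2, "name": "hypothetical_B", "fold_decrease": 3.0, "p_value": 0.20},
]

if __name__ == "__main__":
    for r in select_downregulated(records):
        print(r["name"], r["fold_decrease"], r["p_value"])
```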

Analysis of genomic clustering of transcripts

T. reesei genes have not yet been mapped to chromosomes, but their appearance on genomic scaffolds is known. In order to identify whether the significantly regulated transcripts would be clustered to particular areas on these scaffolds, we aligned them onto an ordered list of genes on the individual scaffolds. Distances (= numbers of genes) between positive hits were recorded. Clustering of transcripts was considered to appear if the distance between them was at least 3-fold smaller than the average distribution of the 769 significantly regulated transcripts among all genes (9143), i.e. a third of 11.9, = 3.9.
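The clustering criterion can be illustrated with a short Python sketch. It assumes that each regulated transcript is represented by the index of its gene in the ordered gene list of one scaffold, and that consecutive hits belong to the same cluster when their distance does not exceed one third of the average spacing. The function names, the `min_hits` parameter and the example positions are hypothetical and only serve to show the principle.

```python
def clustering_threshold(total_genes, regulated_genes, fold=3.0):
    """Average spacing of regulated genes divided by `fold` (9143/769/3 in the text)."""
    return (total_genes / regulated_genes) / fold


def find_clusters(hit_positions, threshold, min_hits=2):
    """Group hits on one scaffold whose neighbour distance is <= threshold.

    `hit_positions` are gene indices on the scaffold; `min_hits` is an
    assumed lower bound on the number of hits that make up a cluster.
    """
    clusters, current = [], []
    for pos in sorted(hit_positions):
        if current and pos - current[-1] > threshold:
            if len(current) >= min_hits:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_hits:
        clusters.append(current)
    return clusters


if __name__ == "__main__":
    threshold = clustering_threshold(9143, 769)
    # Hypothetical hit positions on a single scaffold.
    print(round(threshold, 2), find_clusters([4, 6, 9, 30, 50, 52], threshold))
```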

Real Time PCR

DNase treated (DNase I, RNase free; Fermentas) RNA (5 μg) was reverse transcribed with the RevertAid™ First Strand cDNA Kit (Fermentas) according to the manufacturer's protocol with a combination of oligo-dT and random hexamer primers. The primers used for Real Time PCR are given in the following table:

Primers for cellulase transcript quantification by Real Time PCR

Gene            Forward Primer (5' to 3')    Reverse Primer (5' to 3')
tef1*           CCACATTGCCTGCAAGTTCGC        GTCGGTGAAAGCCTCAACGCAC
cel7a (cbh1)    Ccgagcttggtagttactctg        Ggtagccttcttgaactgagt
cel6a (cbh2)    ACTACAACGGGTGGAACATTAC       CGTGGATGTACAGCTTCTCG
lae1            ACTGGAGATTGACTGGATGC         TTCTGCGTCTGGTAGCCTC

* tef1 was used as a reference gene

All real-time RT-PCR experiments were performed on a Bio-Rad iCycler IQ. For the reaction the IQ SYBR Green Supermix (Bio-Rad) was prepared for 25 μl assays with standard MgCl2 concentration (3 mM) and a final primer concentration of 100 nM each. All assays were carried out in 96-well plates. The amplification protocol consisted of an initial denaturation step (3 min at 95°C) followed by 40 cycles of denaturation (15 sec at 95°C), annealing (20 sec at 57°C) and elongation (10 sec at 72°C). Determination of the PCR efficiency was performed using triplicate reactions from a dilution series of cDNA (1; 0.1; 0.01; 0.001). Amplification efficiency was then calculated from the given slopes in the IQ5 Optical System Software v2.0. Expression ratios were calculated using REST© Software (Pfaffl et al., Nucleic Acids Res. 30 (2002): e36). All samples were analyzed in at least two independent experiments with three replicates in each run.
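The calculation behind these steps can be sketched in Python. The sketch assumes the commonly used relation E = 10^(-1/slope) between the slope of the standard curve (Ct against log10 of the dilution) and the amplification efficiency, and an efficiency-corrected expression ratio of the kind REST is based on; it does not reproduce the statistical testing performed by REST. The Ct values are invented for illustration, and all function names are hypothetical.

```python
def standard_curve_slope(log10_dilutions, ct_values):
    """Least-squares slope of Ct versus log10(dilution)."""
    n = len(log10_dilutions)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dilutions, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    return sxy / sxx


def amplification_efficiency(slope):
    """Efficiency E from the standard-curve slope; E = 2 means perfect doubling."""
    return 10 ** (-1.0 / slope)


def expression_ratio(e_target, delta_ct_target, e_ref, delta_ct_ref):
    """Efficiency-corrected ratio; each delta Ct is Ct(control) - Ct(sample)."""
    return (e_target ** delta_ct_target) / (e_ref ** delta_ct_ref)


if __name__ == "__main__":
    # Dilution series 1; 0.1; 0.01; 0.001 as in the text; Ct values are made up.
    log10_dilutions = [0.0, -1.0, -2.0, -3.0]
    ct_values = [20.00, 23.32, 26.64, 29.96]
    slope = standard_curve_slope(log10_dilutions, ct_values)
    efficiency = amplification_efficiency(slope)
    print(round(slope, 2), round(efficiency, 2))
    # Hypothetical case: target appears 3 cycles earlier, reference (tef1) unchanged.
    print(expression_ratio(efficiency, 3.0, efficiency, 0.0))
```

With the reference gene unchanged, a three-cycle shift at an efficiency close to 2 corresponds to a roughly eightfold change in transcript level.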

Statistical Analysis

Basic statistical methods such as multiple regression analysis and analysis of variance (ANOVA) as well as multivariate exploratory techniques (cluster and factor analyses) were performed using Statistica 6.1 (StatSoft, Inc., USA) data analysis software system.

Results

To identify lae1, 92 S-adenosylmethionine-dependent methyltransferases present in the T. reesei genome database (http://genome.jgi-psf.org/Trire2/Trire2.home.html) were screened. When any of the functionally verified Aspergillus LaeA proteins was used as a query in BLASTP, several hits with E-values below e-30 were obtained, but using these as a query in BLASTP of the respective Aspergillus genome databases (http://www.broadinstitute.org/annotation/genome/aspergillus_group/MultiHome.html) always resulted in several hits of similar E-value and thus identified none of them clearly as a LaeA orthologue. Since this approach was therefore prone to lead to false positives, an iterative phylogenetic strategy was used for its identification: briefly, BLASTP was used to detect LaeA orthologues in species more closely related to the Aspergilli (such as Coccidioides immitis), the identified protein was then used to look for LaeA orthologues in Dothideomycetes, and the latter was used to search the Sordariomycetes and finally the Hypocreaceae. By this means 27 putative LaeA orthologues from Eurotiomycetes, Dothideomycetes and Sordariomycetes were identified (Table below). A phylogenetic analysis of these protein sequences produced a tree whose branching was consistent with the established phylogenetic relationship within these fungi, thus proving the orthology of the identified protein sequences. The T. reesei protein Trire2:41617 was thus identified as the putative LaeA orthologue, LAE1.

An examination of genomic and cDNA sequences revealed that the total coding region of lae1 comprises 1061 nucleotides (SEQ ID No. 3) and is interrupted by three introns (62, 58 and 278 nts long, from 5' to 3'), giving rise to a 221 aa mature protein (SEQ ID No. 1) with a molecular weight of 25832 Da and an isoelectric point of 5.83 (data calculated by ProtParam; Gasteiger et al. (2005), in John M. Walker (ed.): The Proteomics Protocols Handbook, Humana Press, pp. 571-607). It exhibited an overall amino acid identity of 28 and 24 % to the known proteins from A. nidulans and A. fumigatus, respectively. The protein contained the expected SAM domain, and four S and three T residues were detected which fulfill the consensus of phosphorylation by respective protein kinases (analyzed by NetPhos v 2.0; Blom et al., J. Mol. Biol. 294: 1351-1362 (1999)). Consistent with data from Aspergillus LaeA (Bok and Keller, Eukaryot. Cell 3, 527 (2004)), a conventional nuclear localization signal was not found.
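The figures given for the gene structure can be checked against each other with a few lines of Python. This is only an arithmetic consistency check on the numbers stated above; the variable names are illustrative.

```python
TOTAL_LENGTH_NT = 1061              # coding region including introns (SEQ ID No. 3)
INTRON_LENGTHS_NT = (62, 58, 278)   # the three introns, from 5' to 3'

# Nucleotides remaining after removing the introns, and the number of codons.
exon_nt = TOTAL_LENGTH_NT - sum(INTRON_LENGTHS_NT)
codons, remainder = divmod(exon_nt, 3)

if __name__ == "__main__":
    print(exon_nt, codons, remainder)  # 663 221 0
```

The 663 exonic nucleotides correspond to 221 complete codons, which agrees with the 221 aa length stated for the protein.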

lae1 null mutants (Δlae1) were created by replacing the lae1 coding region with the orotidine-5-decarboxylase gene pyr4 in T. reesei KU70 (a ku70 delta strain). Growth of the Δlae1 strain on simple carbon sources such as glycerol was similar to that of the parent strain, but growth on cellulose was severely impaired (Figure 1 a), indicating that the loss of lae1 function indeed leads to a defect in growth on cellulose. To test whether this is due to a loss of cellulase formation, two deletion mutants and the parent strain were cultivated on lactose, a carbon source which induces cellulase formation, but whose utilization is independent of cellulase formation. As shown in Figure 1 b, growth of the parent strains and the mutants is indeed similar. However, significantly reduced cellulase activity was found in the mutant cultures (Figure 1 c).

Consistent findings were obtained with xylan as a carbon source, on which the Δlae1 mutants exhibited a somewhat impaired growth, but xylanase activity was even more impaired (Figure 1 d).

The above data provided a first hint towards an effect of LAE1 on cellulase and hemicellulase formation in T. reesei, but the individual effects on the various cellulolytic and hemicellulolytic enzymes encoded in its genome cannot be deduced from them. In order to test whether the lae1 deletion indeed acts on the expression of its biomass degrading enzymes, a genome-wide approach was used: in total, 126 of the approximately 320 carbohydrate-active enzyme (CAZyme) genes of T. reesei are found in 25 discrete regions of the chromosome ranging from 14 kb to 275 kb in length. Thus, microarrays representing all 9130 unique alleles in the genome of T. reesei were used to examine their transcript levels when grown on lactose as a carbon source. 765 genes exhibited an at least twofold decrease in their hybridization intensity in the Δlae1 strain compared to QM 9414. Among these, 65 carbohydrate-active enzyme encoding genes were detected, the majority of which were glycosyl hydrolases (GHs) involved in cellulose and hemicellulose degradation (Table 1): they included all 10 cellulases (CEL5A, CEL5B, CEL6A, CEL7A, CEL7B, CEL12, CEL45, CEL61A, CEL61B and CEL74), both known swollenins (SWO1, SWO2; proteins that carry an expansin-like domain and disrupt the crystalline cellulose structure) and CIPs (CIP1, CIP2; proteins that contain a signal peptide and a cellulose-binding domain), 5 of the 7 known β-glucosidases (CEL1A, CEL1B, CEL3C and CEL3D), and all 4 xylanases (XYN1-XYN4). The majority of the other affected GHs (21 of 28) comprised glycosidases active against various side chains in hemicelluloses.

Table 1. Changes in carbohydrate-active enzyme gene expression in T. reesei by knocking out the function of lae1* (sequences and protein ID numbers are obtainable at genome.jgi-psf.org/Trire2/Trire2.home.html; Martinez D et al., Nat. Biotechnol 26 (2008): 553-560)

Enzyme                                          Protein ID   Downregulated (-fold)   p-value

Cellulases
GH5 endo-β-1,4-glucanase CEL5A                  120312       17.338                  0.000896
GH5 endo-β-1,4-glucanase CEL5B                  82616        5.179                   0.00219
GH6 cellobiohydrolase 2 CEL6A                   72567        15.565                  0.00106
GH7 endo-β-1,4-glucanase EGL1                   122081       3.019                   0.00126
GH7 cellobiohydrolase 1 CEL7A                   123989       6.841                   0.000994
GH12 endo-β-1,4-glucanase 12a                   123232       15.789                  0.00132
GH45 endo-β-1,4-glucanase EG5                   49976        14.409                  0.000892
GH61 endo-β-1,4-glucanase CEL61A                73643        25.659                  0.000851
GH61 endo-β-1,4-glucanase CEL61B                120961       40.524                  0.000836
GH1 β-glucosidase CEL1B                         22197        3.454                   0.000918
GH1 β-glucosidase CEL1A                         120749       2.428                   0.00107
GH3 β-glycosidase of uncertain specificity      108671       2.833                   0.00142
GH3 β-glucosidase CEL3D                         46816        2.115                   0.00525
GH3 β-glucosidase CEL3C                         82227        3.555                   0.00153

Nonenzymatic cellulose attacking enzymes
CIP2                                            123940       3.232                   0.00118
CIP1                                            73638        16.794                  0.000855
CBM13 protein                                   111094       8.529                   0.00407
swollenin                                       123992       3.714                   0.000847
swollenin-like, 84 % ID to 123992               111874       6.17                    0.00153

Xylanases
GH10 xylanase XYN3                              120229       4.392                   0.00202
GH11 xylanase XYN1                              74223        2.038                   0.00213
GH11 xylanase XYN2                              123818       23.487                  0.000931
GH30 xylanase XYN4                              111849       2.121                   0.000885
GH3 β-xylosidase BXL1                           121127       17.003                  0.000896
GH43 β-xylosidase/α-arabinofuranosidase         3739         2.752                   0.000911
GH74 xyloglucanase CEL74a                       49081        3.051                   0.0009

Hemicellulose side chain cleaving enzymes
CE5 acetyl xylan esterase AXE1                  73632        6.821                   0.000951
GH67 α-glucuronidase AGU1                       72526        13.365                  0.00101
GH62 α-L-arabinofuranosidase ABF2               76210        14.836                  0.000893
GH54 L-α-arabinofuranosidase ABF1               55319        2.201                   0.00158
GH95 α-fucosidase                               58802        9.946                   0.00544
GH95 α-fucosidase                               5807         3.606                   0.0018
GH92 α-1,2-mannosidase                          74198        6.097                   0.000712
GH92 α-1,2-mannosidase                          60635        2.154                   0.00372
GH47 α-1,2-mannosidase                          45717        4.496                   0.00133
GH2 β-mannosidase                               69245        5.937                   0.00181
GH27 α-galactosidase AGL1                       72632        5.777                   0.00134
GH27 α-galactosidase AGL3                       27259        2.065                   0.00641

Pectinases
GH28 polygalacturonase                          103049       2.382                   0.00831

* values are given as means of two biological replicates; "down-regulation" is given as -fold decrease compared to the parent strain.

The 25 carbohydrate-active enzyme clusters in the T. reesei genome contain an average five-fold increase in carbohydrate-active enzyme gene density compared to the expected density for randomly distributed genes. 765 of the total 9130 genes in the T. reesei genome were identified as being at least 2-fold downregulated in the Δlae1 strain, implying that, at a random distribution, an affected gene should be found at every twelfth gene. If the genes were, however, clustered as calculated above, the average gene density of LAE1-affected genes should be around 2.5. To investigate this, the 765 identified genes were mapped onto the T. reesei scaffolds and searched for potential clusters. Indeed, 28 areas on 21 scaffolds were found that exhibited a fourfold increase of gene density over the random distribution.

Clusters of expressed genes affected by lae1 loss of function in T. reesei

Scaffold   Cluster       Found**    Genes in   Expressed   Gene         CAZys
           predicted*               cluster    genes       density***
1          yes           618-690    72         23          3.13         5
           yes           112-135    23         9           2.55         3
2          yes           410-417    7          2           3.5          2
3          no            4-27       23         6           3.83         2
           yes           530-547    17         6           2.83         2
4          no            285-293    8          5           1.6          1
5          no            3-18       15         5           3            1
           yes           209-233    24         8           3            1
6          no            134-138    4          4           1            1
7          no            226-252    26         8           3.25         2
           yes           394-419    25         9           2.77         2
8          yes           167-192    25         7           3.51         3
10         yes           183-201    18         10          1.8          2
13         yes           16-34      18         5           3.6          1
           no            43-61      18         6           3            1
14         no            187-194    7          3           2.33         1
16         no            120-134    14         4           3.5          2
19         yes           62-97      35         10          3.5          1
           yes           157-189    32         12          2.67         1
22         no            48-98      50         17          2.94         1
27         yes           40-59      19         5           3.8          1
28         no            68-80      22         7           3.15         1
           no            111-117    6          4           1.5          3
29         yes           91-102     11         7           1.57         2
30         no            4-19       15         4           3.75         1
31         yes           43-53      10         4           2.5          1
33         yes           29-42      13         6           2.16         1
44         no            4-15       11         5           1.2          1

* specifies that the expressed genes were found within the area of clusters proposed by Martinez et al. (Nature Biotechnol. 26, 553 (2008)).

** "found" means that a clustering of expressed genes was found, using the method outlined

*** gene density is defined as the number of genes present in a proposed cluster, divided by the number of genes whose expression was affected at least 2-fold by the lae1 deletion
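The gene density defined in the footnote, and its relation to the spacing expected for a random distribution (9130 genes, 765 of them affected), can be sketched in Python as follows. The three example rows are taken from the cluster table above; the function names are illustrative assumptions.

```python
def gene_density(genes_in_cluster, affected_genes):
    """Genes in the cluster divided by the genes affected at least 2-fold."""
    return genes_in_cluster / affected_genes


def random_spacing(total_genes=9130, affected_total=765):
    """Average number of genes per affected gene at a random distribution."""
    return total_genes / affected_total


# (scaffold, genes in cluster, expressed genes) as listed in the table above.
EXAMPLE_ROWS = [("1", 72, 23), ("2", 7, 2), ("6", 4, 4)]

if __name__ == "__main__":
    baseline = random_spacing()
    for scaffold, n_genes, n_affected in EXAMPLE_ROWS:
        density = gene_density(n_genes, n_affected)
        # A smaller density value means the affected genes lie closer together.
        print(scaffold, round(density, 2), round(baseline / density, 1))
```

A lower value therefore indicates tighter clustering: a cluster with a gene density of about 3 carries affected genes roughly four times more densely than the random expectation of about one in twelve.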

32 of the 67 downregulated CAZyme genes were located within these clusters. Interestingly, 13 of these 28 clusters were found in areas not previously predicted.

In order to confirm the results from microarray analysis, cel7A and cel6A were used as cellulase model genes and their expression in the parent strain and in the Δlae1 strain was specifically tested by Real Time PCR. The strains were cultivated on lactose, sophorose (a disaccharide conferring high cellulase induction in resting cells) and on the non-inducing carbon source glycerol (Figure 2). The data confirmed the finding of the microarray experiments as gene expression was absent in the Δlae1 strains on all three carbon sources. The results also demonstrate that the nature of the inducer does not influence the epigenetic regulation of cellulase formation.

Having identified LAE1 as a regulator of cellulase and hemicellulase biosynthesis in T. reesei, it was investigated whether an enhanced activity of LAE1 would even stimulate cellulase formation. To test this, two approaches were used: in one, a second copy of T. reesei lae1 was introduced into its genome.

In the other, lae1 was fused downstream of the regulatory signals of the tef1 (elongation factor 1α-encoding) gene, which allow high constitutive expression. To this end, 740 bp of the promoter region of H. jecorina tef1 (Genbank accession number Z23012.1) were amplified by PCR using the oligonucleotides tef1Xhofw and tef1ClaSalrv (Table A, above); the XhoI/SalI-restricted tef1 fragment was then cloned in the corresponding sites of pLH1hph, which contains an E. coli hygromycin B phosphotransferase (hph) expression cassette as fungal selection marker. A 1,820-bp lae1 PCR fragment including the coding and terminator region was amplified with the oligonucleotides TrLae1ATGCla and TrLae1TermHind (Table A, above) and inserted downstream of the tef1 promoter region at a ClaI/HindIII cleavage site, resulting in vector Ptef1lae1hph.

Five and three T. reesei strains that contained a single lae1 copy and the tef1:lae1 construct, respectively, integrated ectopically in their genome, were examined for their ability to form cellulases on lactose (Figure 4). The strains carrying the tef1:lae1 construct in particular exhibited an up to 10-fold increased cellulase formation and 10-40 fold enhanced cel7A and cel6A gene expression, and this increased expression correlated with increased expression of lae1 in these strains.

The data presented clearly show that epigenetic manipulation significantly influences cellulase gene transcription in T. reesei, and thus represents a so far overlooked area for strain improvement by recombinant techniques. Since LAE1 acts by inactivating the heterochromatin protein HepA or the H3K9 methyltransferase Clr4, these findings further show that a loss-of-function in these genes also leads to an increase in cellulase formation. Orthologues for these two genes have been detected in T. reesei.