1. (WO1998051790) A NOVEL NUCLEIC ACID MOLECULE
Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

A NOVEL NUCLEIC ACID MOLECULE

FIELD OF THE INVENTION

The present invention is directed generally to an isolated nucleic acid molecule encompassing a neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof and its use inter alia in developing a range of eukaryotic artificial chromosomes including mammalian (e.g. human) and non-mammalian artificial chromosomes. Such artificial chromosomes are useful in a range of genetic therapies.

BACKGROUND OF THE INVENTION

Bibliographic details of the publications referred to by author in this specification are collected at the end of the description.

The rapidly increasing sophistication of recombinant DNA technology is greatly facilitating research and development in the medical and allied health fields. A particularly important area is mammalian, including human, genetics and the molecular mechanisms behind some genetic abnormalities. Progress in research in this area has been hampered by the lack of a cloned nucleic acid molecule encompassing a human centromere. The identification and cloning of a human centromere will promote the development of techniques for introducing genes into eukaryotic cells, in particular mammalian, including human, cells, and will be an important asset to gene therapy and the development of a range of genetic diagnostic tests.

The centromere is an essential structure for sister chromatid cohesion and proper chromosomal segregation during mitotic and meiotic cell divisions. The centromere of the budding yeast Saccharomyces cerevisiae has been extensively studied and shown to be contained within a relatively short DNA segment of 125 bp that is organized into an 8-bp (CDEI) and 26-bp (CDEIII) domain, separated by a 78- to 87-bp, highly AT-rich, middle (CDEII) domain (Clarke and Carbon, 1985). The centromere of the fission yeast Schizosaccharomyces pombe is considerably larger, ranging from 40 to 100 kb, and consists of a central core DNA element of 4 to 7 kb flanked on both sides by inverted repeat units (Steiner et al., 1993). Recently, the functional DNA components of a higher eukaryotic centromere have been characterized in a minichromosome from Drosophila melanogaster and shown to consist of a 220-kb essential core DNA flanked by 200 kb of highly repeated sequences on one side (Murphy and Karpen, 1995).

The mammalian centromere, like the centromeres of all higher eukaryotes studied to date, contains a great abundance of highly repetitive, heterochromatic DNA. For example, a typical human centromere contains 2 to 4 Mb of the 171-bp α-satellite repeat (Wevrick and Willard 1989, 1991; Trowell et al., 1993), plus a smaller and more variable quantity of a 5-bp satellite III DNA (Grady et al., 1992; Trowell et al., 1993). The role of these satellite sequences is presently unclear. Transfection of a cloned 17-kb uninterrupted α-satellite array into cultured simian cells (Haaf et al., 1992) or a 120-kb α-satellite-containing YAC into human and hamster cells (Larin et al., 1994) appears to confer centromere function at the sites of integration. Other workers have analyzed rearranged Y chromosomes (Tyler-Smith et al., 1993), or dissected the centromere of the human Y chromosome with cloned telomeric DNA (Brown et al., 1994) and suggested that 150 to 200 kb of α-satellite DNA plus ~300 kb of adjacent sequences are associated with human centromere function. In addition, a human X-derived minichromosome that retained 2.5 Mb of α-satellite array has been produced by telomere-associated chromosome fragmentation (Farr et al., 1995). In all these studies, it is not known whether non-α-satellite DNA sequences are embedded within the centromeric site and operate independently of, or in concert with, the α-satellite DNA.

In mammals, four constitutive centromere-binding proteins, CENP-A, CENP-B, CENP-C, and CENP-D, have been characterized to varying extents and implicated as having possible direct roles in centromere function. CENP-A, a protein localized to the outer kinetochore domain, is a centromere-specific core histone that shows sequence homology to the histone H3 protein and may serve to differentiate the centromere from the rest of the chromosome at the most fundamental level of chromatin structure - the nucleosome (Sullivan et al., 1994). CENP-B, a protein which associates with the centromeric heterochromatin through its binding to the CENP-B box motif found in primate α-satellite and mouse minor satellite DNA, probably has a role in packaging centromeric heterochromatic DNA - a role which, however, may not be indispensable since the protein is undetectable on the Y chromosome (Pluta et al., 1990) and is found on the inactive centromeres of dicentric chromosomes (Earnshaw et al., 1989). CENP-C has been shown to be located at the inner kinetochore plate and is postulated to have an essential although yet undetermined centromere function, as seen, for example, from inhibition of mitotic progression following microinjection of anti-CENP-C antibodies into cells (Bernat et al., 1990; Tomkiel et al., 1994) and from its association with the active but not the inactive centromeres of dicentric chromosomes (Earnshaw et al., 1989; Page et al., 1995; Sullivan and Schwartz, 1995). Finally, CENP-D (or RCC1) is a guanine exchange factor that appears to have a general cellular role that is neither specific nor clear for the centromere (Kingwell and Rattner 1987; Bischoff et al., 1990; Dasso, 1993). More recently, a new role for the mammalian centromere as a "marshalling station" for a host of "passenger proteins" (such as INCENPs, MCAK, CENP-E, CENP-F, 3F3/2 antigens, and cytoplasmic dynein) has been recognized (reviewed by Earnshaw and Mackay, 1994, and Pluta et al., 1995).
These passenger proteins, whose appearance at the centromere is transient and tightly regulated by the cell cycle, provide vital functions that include motor movement of chromosomes, modulation of spindle dynamics, nuclear organization, intercellular bridge structure and function, sister chromatid cohesion and release, and cytokinesis. At present, except for CENP-B, none of the constitutive or passenger proteins have been demonstrated to bind mammalian centromere DNA directly.

In work leading up to the present invention, the inventors identified in a patient (hereinafter referred to as "BE") an unusual human marker chromosome, mardel (10), which is 100% stable in mitotic division both in patient BE and in established fibroblast and transformed lymphoblast cultures. In accordance with the present invention, a region of the mardel (10) chromosome has been cloned together with the corresponding region from a normal human subject. The nucleic acid molecules cloned contain no substantial α-satellite repeats yet are mitotically stable. The nucleic acid molecules therefore encompass a new form of centromere referred to herein as a "neocentromere". The identification and cloning of a eukaryotic neocentromere without substantial α-satellite DNA repeat sequences now provides the means of generating a range of eukaryotic artificial chromosomes, such as mammalian, including human, artificial chromosomes, with uses in genetic therapy, transgenic plant and animal production and recombinant protein production. A range of diagnostic reagents is now also obtainable using the cloned neocentromere.

SUMMARY OF THE INVENTION

Sequence Identity Numbers (SEQ ID NOs.) for the nucleotide sequences referred to in the specification are defined following the bibliography.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

A fibroblast cell line 920158 carrying the mardel (10) marker chromosome was deposited at the European Collection of Cell Cultures (ECACC), Centre for Applied Microbiology Research, Salisbury, Wiltshire, SP4 0JG, UK on 1 May 1997 under Accession No. 97051716. Bacterial artificial chromosomes (BACs) carrying portions of the mardel (10) chromosome have also been deposited at ECACC as follows:

BAC/E8-1: deposited on 5 May 1998 under Accession Number 980505016;
BAC/F2-14: deposited on 5 May 1998 under Accession Number 980505017.

A number of human fibrosarcoma cell lines carrying various neocentromeric constructs were deposited at ECACC as listed below by Accession Number, with the date of deposit in parentheses.

HT-38 98050704 (7 May 1998)
HT-47 98050705 (7 May 1998)
HT-54 98050706 (7 May 1998)
HT-190 98050707 (7 May 1998)
HT-191 98050708 (7 May 1998).

One aspect of the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides derived from a eukaryotic chromosome and encompassing a neocentromere or a functional derivative, synthetic or hybrid form thereof, which nucleic acid molecule or its derivatives, synthetic forms or hybrid forms, when introduced into a compatible cell, is capable of replicating, acting as an extra-chromosomal element and segregating with cell division.

Another aspect of the present invention contemplates a nucleic acid molecule or its chemical equivalent having a tertiary structure which defines a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue.

Yet a further aspect of the present invention is directed to an isolated nucleic acid molecule comprising a sequence of nucleotides encompassing a neocentromere derived from a eukaryotic chromosome, which nucleic acid molecule when introduced into a compatible cell is a replicating, extra-chromosomal element which segregates with cell division.

Still another aspect of the present invention is directed to an isolated nucleic acid molecule having a sequence of nucleotides or their chemical equivalents which directs a conformation defining a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or a mammalian or non-mammalian homologue thereof wherein the neocentromere associates with centromere binding proteins (CENP)-A and CENP-C or antibodies thereto and does not contain substantial α-satellite DNA repeat sequences.

A further aspect of the present invention is directed to an isolated nucleic acid molecule comprising a nucleotide sequence encompassing a neocentromere or a functional derivative, synthetic or hybrid form thereof which, when said nucleic acid molecule is in linear form and co-introduced into a cell together with a telomeric sequence, is capable of replicating, remaining as an extra-chromosomal element and segregating with cell division.

Another aspect of the present invention provides an isolated nucleic acid molecule or a derivative, synthetic or hybrid form thereof comprising a sequence of nucleotides:
(i) which directs conformation defining a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue wherein said neocentromere is capable of associating with CENP-A and CENP-C;
(ii) which contains no substantial α-satellite DNA sequence repeat; and
(iii) which is capable, when introduced into compatible cells, of replication, remaining extra-chromosomal and segregating with cell division.

Even yet another aspect of the present invention is directed to a genetic construct comprising an origin of replication for a eukaryotic cell and a nucleic acid molecule encompassing a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue flanked by telomeric nucleotide sequences functional in the cell in which the genetic construct is to replicate and wherein said genetic construct when introduced into a cell is a replicating, extra-chromosomal element which segregates with cell division.

Another aspect of the present invention is directed to a genetic construct in the form of a eukaryotic artificial chromosome such as a mammalian artificial chromosome (MAC), a human artificial chromosome (HAC) or comprising an origin of replication and a sequence of nucleotides which:
(i) directs a conformation defining a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof wherein said neocentromere is capable of associating with CENP-A and CENP-C or antibodies thereto; and
(ii) contains no substantial α-satellite DNA repeat sequences;
said sequence of nucleotides flanked by eukaryotic (e.g. mammalian) telomeric sequences which are in turn flanked by yeast telomeric sequences wherein a unique enzyme site is located between the human and yeast telomeric nucleotide sequences such that upon contact with said enzyme, the yeast telomeric sequences are removed and the eukaryotic (e.g. mammalian) telomeric sequences are exposed.

Still another aspect of the present invention provides a genetic construct comprising an origin of replication and a first nucleic acid molecule defining a human neocentromere or a functional derivative thereof or latent, synthetic or hybrid form thereof, a second nucleic acid molecule encoding a peptide, polypeptide or protein, wherein said first and second nucleic acid molecules are flanked by a first set of eukaryotic (e.g. mammalian, such as human) telomeric sequences which are in turn flanked by a second set of eukaryotic (e.g. yeast) telomeric sequences wherein there are unique enzyme sites between the first and second telomeric sequences such that upon contact with a required enzyme, the second telomeric sequences are cleaved off to expose the first telomeric sequences.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a schematic representation showing identification of a YAC contig spanning the marker centromere region. (A) Comparison of GTL banding patterns of mardel (10) and normal chromosome 10. The pair of open arrows indicates the two breakpoints on a normal chromosome 10 in generating the marker chromosome (Voullaire et al., 1993). The long and short arms of the marker chromosome are designated q' and p', respectively, to distinguish them from the q and p arms of the normal chromosome 10. Asterisk denotes the position of a cosmid 10pC38 that was used to "tag" the q'-arm of stretched marker chromosomes in the ANTI-CEN/FISH experiments. (B) A 4-megabase YAC contig (#082) from the 10q25.2 region that spans the marker centromere. The tiling path of YACs #0 to #23 and their corresponding CEPH library addresses are shown. (C) FISH mapping of selected YAC clones from contig #082 using normal fluorescence microscopy and standard metaphase chromosomes prepared from transformed lymphoblast cells of patient BE. The distribution of FISH signals (vertical axis) is shown as a percentage of the signals on one arm of the marker chromosome that is in excess of those found on the opposite arm of the chromosome. The total number of fluorescence signals scored for each of the YAC clones is indicated in brackets.

Figure 2 is a photographic representation showing ANTI-CEN/FISH analysis of the marker centromere. (A) Detection of α-satellite DNA using a mixture of α-satellite DNA probes (red signals) under low stringency conditions. Centromeres were counter-labelled with CREST#6 autoimmune antibody (pale blue dots; or white when superimposed on a red background). Chromosomes were prepared from transformed lymphoblast cells of patient BE. The right-hand panel represents green pseudo-coloring of DAPI images of chromosomes to provide a better definition of chromosome outline. Only the signal for the antibody, but not that for α-satellite, was seen on the marker centromere (arrowed). (B) Simultaneous labelling of stretched human metaphase chromosomes with CREST#6 (red) and anti-CENP-C antibody, Am-Cl (pale blue), with the white color indicating full coincidence of the two antibody signals. (C) Detection of CENP-C on the marker chromosome. Simultaneous labelling of the marker chromosome (arrowhead) with (a) Am-Cl (pale blue) and (b) CREST#6 (red). (c) Combined images of a and b, showing complete coincidence of Am-Cl and CREST#6 signals. (d) FISH analysis of the same cell as a-c using the 10pC38 cosmid probe (pale blue dots and green arrows) to identify the marker chromosome. Some loss of ANTI-CEN signal, especially for the Am-Cl antibody, was seen following FISH. (e) Green pseudo-coloring of DAPI images. A colour photograph corresponding to this figure is available upon request.

Figure 3 is a photographic representation showing ANTI-CEN/FISH analysis of cosmid clones on stretched (A, a-f) and superstretched (B) metaphase chromosomes. (a-c) Examples of cosmid signals (white arrows) localized to the q'-region of the marker centromere. (d-f) Examples of cosmid signals (white arrows) localized to the p'-region of the marker centromere. Green arrows indicate positions of the 10pC38 cosmid DNA tag used to mark the q'-end of the marker chromosome. (B) Mapping of Y6C21 onto a superstretched metaphase chromosome. Not included is the 10pC38 q'-tag signal located further to the left of the chromosomal segment shown. ANTI-CEN signals are in red, FISH signals are in pale blue, and overlapping ANTI-CEN and FISH signals are in white. Each of the pictures is accompanied by DAPI images of chromosomes pseudo-coloured in green. A colour photograph corresponding to this figure is available upon request.

Figure 4 is a representation showing localization of the anti-centromere antibody-binding domain. a, Relative positions of different cosmid and PAC clones within the YAC #082 contig, using YAC-3 as a reference. Cosmids are designated as YnCm, where 'n' denotes the YAC of origin and 'm' denotes the cosmid number. PACs 1-5 are five different PAC clones isolated from a human PAC library (Genome Systems Inc). "HC-contig" represents a group of overlapping cosmids that map tightly around the marker centromere in ANTI-CEN/FISH experiments. A genomic map corresponding to the depicted YAC region was derived from the DNA of patient BE and is shown above the YAC map. S, SalI; K, KspI; N, NotI; Sf, SfiI. b, Cumulative scoring of FISH signals in ANTI-CEN/FISH experiments for cosmids Y3C64, Y6C8, Y3C94, Y7C14, Y4C45, Y6C10, Y6C21, Y3C3, PAC5, Y13C1, Y13C8, and Y17C6. The distribution of FISH signals (vertical axis) is shown as a percentage of the signals on one arm of the marker chromosome that is in excess of those found on the opposite arm of the chromosome. The total number of fluorescence signals scored for each of the cosmid clones is indicated in brackets. c, Restriction mapping of the 80-kb region covered by the eight overlapping cosmids of the HC-contig. These eight cosmids were derived from four different YACs (YAC-3, YAC-4, YAC-6, and YAC-7) and provided independent confirmation of the map. Furthermore, the map agreed fully with the restriction map of a 120-kb-insert PAC clone (PAC4) that spanned the entire HC-contig region. E, EcoRI; R, EcoRV; N, NotI.

Figure 5 is a representation showing restriction analysis of genomic DNA of patient BE and that of his normal parents using Y6C10 as probe. DNA was resolved on a PFGE (A) or standard agarose gel (B and C). Samples 1, 2 and 3 were fibroblast cultures of mother of BE, father of BE, and patient BE, respectively. Sample 4 was a somatic hybrid cell line BE2C1-18-5F containing the marker chromosome. Fragment sizes are in kilobases.

Figure 6 is a representation of the full nucleotide sequence of the HC-contig DNA derived from the normal human chromosome 10q25.2 region.

Figure 7 is a diagrammatic representation of the method used to retrofit YAC3 and YAC5.

Figures 8A to J are diagrammatic representations of the different vectors used for cloning DNA as YACs by the conventional restriction/ligation methods.

Figure 9 is a diagrammatic representation of circular TAR summarising the recombination process.

Figure 10 is a diagrammatic representation showing modification of TAR vector.

Figure 11 is a diagrammatic representation of the cloning of 10q25 human neocentromere DNA from mardel (10) chromosome. This DNA is designated NC-contig DNA to distinguish it from the HC-contig derived from the corresponding region of the normal chromosome 10. (A) Structural map of the NC-contig region and flanking DNA. Arrows indicate the relative positions and directions of primers used in PCR analyses (Table 3). The restriction sites EcoRI, EcoRV, SrfI and SfiI are indicated by E, R, Sr and Sf, respectively. The position of the TAR "hook" C3-F2 is represented by the solid box. The hatched bar represents HC- or NC-contig. p' and q' refer to the short and long arms of mardel (10), respectively. (B) Circular TAR strategy using the vectors pVC39-Alu/C3-F2(+) and pVC39-Alu/C3-F2(-) for the direct cloning of the neocentromere DNA from mardel (10). The position of the Alu consensus sequence hook is represented by the white box. Crosses denote the sites of recombination between the TAR vector and the genomic DNA at the Alu and C3-F2 hooks during cloning. (C) Structural maps of the resulting circular YACs 5f-52-E8 and 5f-38-F2 containing the neocentromere DNA of the mardel (10) chromosome. The DNA flanking the NC-contig is represented by stippled bars. (D) Structural maps of BAC/E8-1 and BAC/F2-14. Nt represents NotI and URA-BAC-neo represents the retrofitting vector BRV1 (Larionov et al., 1997).

Figure 12 is a diagrammatic representation showing specific TAR of HC-region from mardel 10.

The method was as follows: (1) Co-transform into YPH857; (2) Select HIS+ colonies; (3) Screen for HC-region by PCR; (4) Prepare high-MW DNA; (5) Digest with I-SceI to expose hTELS; (6) Transfect HT1080 cells; (7) Select for G418R; and (8) Analyse by PFGE and FISH.

Figure 13 is a diagrammatic representation showing cloning in yeast as YAC/HAC.

Figure 14 is a diagrammatic representation outlining TACT procedure.

Figure 15 is a diagrammatic representation of TACT constructs.

Figure 16A is a representation of the full nucleotide sequence of the NC-contig DNA derived from mardel (10) and corresponds to the HC-contig DNA region of the normal chromosome 10.

Figure 16B is a representation of the partial nucleotide sequence of the BAC/F2-14 clone that is derived from a region immediately p' of the NC-contig DNA (see Fig. 11D).
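
The vertical-axis quantity described for Figures 1(C) and 4(b), namely the excess of FISH signals on one arm of the marker chromosome over the opposite arm expressed as a percentage, can be illustrated with a short sketch. This is an illustration only and not the scoring procedure actually used: the function name `arm_excess_percent` is hypothetical, and normalising the excess by the total number of signals scored on both arms is an assumption, since the description does not state the denominator.

```python
def arm_excess_percent(q_signals: int, p_signals: int) -> float:
    """Signed excess of FISH signals on the q' arm over the p' arm.

    Assumption (not stated in the specification): the excess is
    expressed as a percentage of all signals scored on both arms.
    A positive value means more signals on q', a negative value
    means more signals on p'.
    """
    if q_signals < 0 or p_signals < 0:
        raise ValueError("signal counts must be non-negative")
    total = q_signals + p_signals
    if total == 0:
        raise ValueError("no signals were scored")
    return 100.0 * (q_signals - p_signals) / total


# Hypothetical counts, for illustration only.
print(arm_excess_percent(75, 25))  # excess toward q'
print(arm_excess_percent(10, 10))  # evenly distributed
```

Under this assumed convention a clone lying at the centromere would be expected to score close to zero, while a clone lying to one side would score toward that side.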

SUMMARY OF SEQ ID NOs.

SEQ ID NO. DESCRIPTION
1 DNA primer
2 DNA primer
3 Nucleotide sequence of HC-contig
4 Nucleotide sequence of NC-contig
5 BAC-F2 contig 1
6 BAC-F2 contig 2
7 BAC-F2 contig 3
8 BAC-F2 contig 4
9 BAC-F2 contig 5
10 BAC-F2 contig 6
11 BAC-F2 contig 7
12 BAC-F2 contig 8
13 BAC-F2 contig 9
14 BAC-F2 contig 15
15 BAC-F2 contig 33
16 BAC-F2 contig 39
17 BAC-F2 contig 41
18 BAC-F2 contig 42
19 BAC-F2 contig 44
20 BAC-F2 contig 47
21 BAC-F2 contig 47 fragment 1
22 BAC-F2 contig 47 fragment 2
23 BAC-F2 contig 47 fragment 3
24 BAC-F2 contig 47 fragment 4
25 BAC-F2 contig 47 fragment 5
26 BAC-F2 contig 47 fragment 6
27 BAC-F2 contig 47 fragment 7
28 BAC-F2 contig 47 fragment 8
29 BAC-F2 contig 47 fragment 9

ABBREVIATIONS USED IN THE SUBJECT SPECIFICATION

mardel (10): Marker chromosome from patient BE; comprises a rearrangement of chromosome 10
HAC: Human artificial chromosome
YAC: Yeast artificial chromosome
BAC: Bacterial artificial chromosome
MAC: Mammalian artificial chromosome
PLAC: Plant artificial chromosome
neocentromere: A centromere containing no substantial α-satellite DNA
CENP: Centromere binding protein
HC-contig: Region of normal chromosome 10 comprising neocentromere
E8: q' end/region of mardel (10) neocentromere
F2: p' end/region of mardel (10) neocentromere
BE: Patient in whom mardel (10) was identified
TAR: Transformation-associated recombination
PCR: Polymerase chain reaction
Marker neocentromere: Neocentromere on mardel (10)
NC-contig: Region of mardel (10) chromosome comprising neocentromere

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention is predicated in part on the identification and isolation of nucleic acid molecules exhibiting neocentromeric properties. In accordance with the present invention, a neocentromere is considered a centromere which does not contain substantial α-satellite DNA repeat sequences and, when activated, is capable of functioning as a centromere. The term "substantial" in this context means that the nucleic acid molecule does not contain α-satellite DNA detectable by FISH analysis under medium stringency conditions. The neocentromere may contain a small number of highly diverged α-satellite DNA sequences. In primates, α-satellite DNA is considered to be 171 bp in length. A nucleic acid molecule containing an activated neocentromere or a neocentromere otherwise functioning as a centromere facilitates, in accordance with the present invention, the nucleic acid molecule replicating, remaining extra-chromosomal and segregating with cell division. Reference herein to "neocentromere" is taken to mean a centromere substantially devoid of α-satellite DNA repeat sequences.

Accordingly, one aspect of the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides which defines a eukaryotic neocentromere.

More particularly the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides derived from a eukaryotic chromosome and encompassing a neocentromere which nucleic acid molecule when introduced into a compatible cell is capable of replicating, acting as an extra-chromosomal element and segregating with cell division.

The present invention is exemplified herein by the identification and cloning of a human neocentromere. This is done, however, with the understanding that the present invention extends to all eukaryotic neocentromeres such as from mammalian, plant, avian, insect, fungal, yeast and reptilian chromosomes. The most preferred neocentromere, however, is from human chromosomes and their mammalian homologues.

The present invention is predicated in part on the identification of an unusual chromosomal marker in a patient designated "BE". The chromosomal marker is referred to as "mardel (10)" and results from a rearrangement of human chromosome 10. The mardel (10) marker is mitotically stable and, in accordance with the present invention, contains a functional neocentromere at a location regarded as non-centromeric. The neocentromere at mardel (10) is located between q24 and q26 on chromosome 10 and more particularly around q25. Even more particularly, the neocentromere maps to q25.2 on chromosome 10. The present invention is exemplified by DNA cloned from the q24-q26 region of the mardel (10) chromosome as well as the corresponding region on normal human chromosome 10. These DNA molecules contain a functional neocentromere. The present invention extends, however, to any neocentromere on any chromosome in mammalian and non-mammalian animals as well as plants, yeasts and fungi.

For convenience, the DNA clones from the mardel (10) chromosome as well as from normal human chromosome 10 are summarised in Figure 11. The neocentromere located at or around 10q25 is located on a clone designated the "HC-contig". DNA clones from mardel (10) are referred to as "E8" or the "NC-contig" which extends from the long arm (q') of mardel (10) towards the short arm (p'). Clone F2 extends further p' from E8 (see Figure 11). It is emphasised, however, that the present invention extends to any neocentromere on any human chromosome as well as neocentromeres on other mammalian and non-mammalian chromosomes including chromosomes from plants, insects, reptiles, yeast and fungi.

The present invention further contemplates a nucleic acid molecule or its chemical equivalent having a tertiary structure which defines a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue.

Even more particularly, the present invention is directed to an isolated nucleic acid molecule having a sequence of nucleotides or their chemical equivalents which directs a conformation defining a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue wherein the centromere associates with centromere binding proteins (CENP)-A and CENP-C or antibodies thereto.

Reference herein to "latent" in relation to a centromere includes reference to a centromere not normally functional but nevertheless activatable under certain conditions. A latent centromere may also be considered as a neocentromere provided it has no substantial α-satellite DNA repeat sequences.

The size of the neocentromere in accordance with the present invention may range from about 50 bp to about 1500 kbp, from about 70 bp to about 1000 kbp, from about 75 bp to about 800 kbp, from about 80 bp to about 500 kbp, from about 85 bp to about 200 kbp, from about 90 bp to about 100 kbp, from about 100 bp to about 1 kbp, from about 120 bp to about 500 bp, or from about 180 bp to about 300 bp. In one particular embodiment, the centromere is approximately 60-100 kbp. In another embodiment, the centromere is about 80 kbp.

The nucleic acid molecule encompassing the HC-contig for human chromosome 10 of the present invention is set forth in Figure 6 (SEQ ID NO: 3). The nucleic acid molecule encompassing the NC-contig (part of E8) from mardel (10) is set forth in Figure 16A (SEQ ID NO: 4). The nucleic acid molecule encompassing F2 of mardel (10) is set forth in Figure 16B as separate contigs (SEQ ID NOs: 5-29). The nucleic acid molecules have a tertiary structure and the neocentromere is a conformation of nucleotides within this tertiary structure. Accordingly, the neocentromere is not defined by a linear sequence of nucleotides although this linear sequence directs the conformation which in turn defines the neocentromere. Although this aspect of the present invention is exemplified using the nucleotide sequences set forth in Figures 6, 16A and 16B, the subject invention extends to any sequence directing a conformation defining a centromere and hybridising to the sequence set forth in one or more of Figures 6, 16A and/or 16B under low stringency conditions at 42°C and/or which comprises a nucleotide sequence having at least about 40% nucleotide similarity to one or more sequences set forth in Figures 6, 16A and/or 16B. Preferably, the percentage similarity is at least about 50%, more preferably at least about 60%, still more preferably at least about 70%, even more preferably at least about 80-90% or above such as 95%, 97%, 98% and 99%.
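
By way of illustration only, the graded similarity levels recited above can be sketched as a simple calculation. The sketch makes several assumptions that are not part of the specification: the two sequences are taken to be already aligned and of equal length, with "-" marking a gap; similarity is approximated by plain nucleotide identity, which is narrower than "similarity" as that term is used herein; and "at least about" is treated as an exact cut-off. The names `percent_identity`, `highest_level_met` and `SIMILARITY_LEVELS` are hypothetical.

```python
# Similarity levels recited in the description, in ascending order.
SIMILARITY_LEVELS = (40, 50, 60, 70, 80, 90, 95, 97, 98, 99)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding the same nucleotide.

    Assumes the inputs are pre-aligned and of equal length, with '-'
    marking a gap. A column containing a gap never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def highest_level_met(percent: float):
    """Return the highest recited level reached, or None if below 40%."""
    met = [level for level in SIMILARITY_LEVELS if percent >= level]
    return max(met) if met else None


# Hypothetical 10-nucleotide fragments differing at the last two positions.
pct = percent_identity("ACGTACGTAC", "ACGTACGTTT")
print(pct, highest_level_met(pct))
```

In practice the comparison would be made against the sequences of Figures 6, 16A and/or 16B using an alignment tool; the sketch shows only how a computed percentage maps onto the recited levels.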

Another embodiment of the present invention is directed to YAC 3 and YAC 5 encompassing the HC-contig and flanking sequence as well as nucleotide sequences related to YAC 3 and/or YAC 5 at the homology, similarity or hybridization levels.

Reference herein to a low stringency at 42°C includes and encompasses from at least about 1% v/v to at least about 15% v/v formamide and from at least about 1M to at least about 2M salt for hybridisation, and at least about 1M to at least about 2M salt for washing conditions. Alternative stringency conditions may be applied where necessary, such as medium stringency, which includes and encompasses from at least about 16% v/v to at least about 30% v/v formamide and from at least about 0.5M to at least about 0.9M salt for hybridisation, and at least about 0.5M to at least about 0.9M salt for washing conditions, or high stringency, which includes and encompasses from at least about 31% v/v to at least about 50% v/v formamide and from at least about 0.01M to at least about 0.15M salt for hybridisation, and at least about 0.01M to at least about 0.15M salt for washing conditions. These stringency conditions may be altered dependent on the source of DNA and other factors.
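
The three stringency classes defined above may be summarised, purely as an illustrative sketch, in the form of a lookup. The sketch reads the low stringency salt range as 1 M to 2 M, treats each "at least about" range as a closed interval (a simplification), and considers only the hybridisation conditions. The names `STRINGENCY` and `classify_stringency` are hypothetical and do not form part of the specification.

```python
# Formamide (% v/v) and salt (M) ranges for hybridisation, as recited
# in the description. Bounds are treated as closed intervals, which is
# a simplification of the "at least about" wording.
STRINGENCY = {
    "low": {"formamide": (1.0, 15.0), "salt": (1.0, 2.0)},
    "medium": {"formamide": (16.0, 30.0), "salt": (0.5, 0.9)},
    "high": {"formamide": (31.0, 50.0), "salt": (0.01, 0.15)},
}


def classify_stringency(formamide_pct: float, salt_molar: float):
    """Return 'low', 'medium' or 'high', or None if no class matches.

    A condition is assigned to a class only when BOTH the formamide
    concentration and the salt concentration fall within that class.
    """
    for name, ranges in STRINGENCY.items():
        f_lo, f_hi = ranges["formamide"]
        s_lo, s_hi = ranges["salt"]
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return name
    return None


# Hypothetical hybridisation conditions, for illustration only.
print(classify_stringency(10, 1.5))
print(classify_stringency(40, 0.1))
```

Conditions combining, for example, medium-range formamide with low-range salt fall outside all three classes in this sketch and return None, consistent with the statement that the conditions may be altered depending on the source of DNA and other factors.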

The term "similarity" as used herein includes exact identity between compared sequences at the nucleotide level. Where there is non-identity at the nucleotide level, "similarity" includes differences between sequences which nevertheless result in a conformation defining a functional neocentromere.
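As a rough illustration of the nucleotide-level part of this definition, the Python sketch below computes percentage identity between two sequences and reports the highest of the stated preference thresholds (40%, 50%, 60%, 70%, 80%) that is met. It rests on assumptions not made in the text: the sequences are taken to be already aligned to equal length, gap columns count as mismatches, and the function names are hypothetical. It does not attempt to capture the conformational aspect of "similarity", which cannot be derived from a column-by-column comparison.

```python
from typing import Optional

# Preference thresholds (percent) taken from the similarity tiers in the text.
THRESHOLDS = (40.0, 50.0, 60.0, 70.0, 80.0)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns carrying the same nucleotide.

    Assumes the two sequences are pre-aligned to the same length;
    columns containing a gap character '-' are counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def highest_threshold_met(identity_pct: float) -> Optional[float]:
    """Return the largest stated threshold not exceeding identity_pct, or None."""
    met = [t for t in THRESHOLDS if identity_pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTTT")
    print(pct, highest_threshold_met(pct))
```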

The nucleic acid molecule of the present invention may comprise a naturally occurring nucleotide sequence from a healthy human subject or may comprise the nucleotide sequence from a human subject exhibiting one or more chromosomal-dependent conditions such as a subject carrying the mardel 10 chromosome or a chromosome conferring an equivalent or similar condition or may carry one or more nucleotide substitutions, deletions and/or additions relative to the naturally or non-naturally occurring sequence. Such modifications are referred to herein as "derivatives" and include mutants, fragments, parts, homologues and analogues of the naturally occurring nucleotide sequence. Preferably, the derivatives of the present invention still define a functional neocentromere.

Reference herein to a "neocentromere" includes reference to a functional neocentromere or a functional derivative thereof meaning that it is capable of facilitating sister chromatid cohesion and chromosomal segregation during mitotic cell divisions and/or is capable of associating with CENP-A and/or CENP-C and/or is capable of interacting with anti-CENP-A antibodies or anti-CENP-C antibodies. Generally, and preferably, the neocentromere is incapable of interacting with CENP-B or anti-CENP-B antibodies. Alternatively, the neocentromere may be a latent centromere capable of activation by epigenetic mechanisms. The neocentromere may also be a hybrid of other human, mammalian, plant or yeast neocentromeres. Synthetic neocentromeres provided by, for example, polymeric techniques to arrive at the correct conformation are also contemplated by the present invention. All such forms and definitions of neocentromere are encompassed by use of this term.

Another aspect of the present invention provides an isolated nucleic acid molecule or chemical equivalent having the following characteristics:
(i) comprises a nucleotide sequence or chemical equivalent directing a conformation which defines a neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or;
(ii) comprises a nucleotide sequence or chemical equivalent substantially as set forth in one or more of Figures 6, 16A and/or 16B or having at least about 40% similarity thereto or capable of hybridising thereto under low stringency conditions at 42°C; and
(iii) comprises a neocentromere capable of associating with CENP-A or CENP-C or antibodies thereto.

Preferably, the neocentromere is incapable of interacting with CENP-B or antibodies thereto.

In a particularly preferred embodiment, the centromere corresponds to a human genomic region which maps between q24 and q26 on chromosome 10, and in particular q25 on chromosome 10.

The nucleic acid molecule or its chemical equivalent of the present invention defining a conformational neocentromere or functional derivative thereof or latent, synthetic or hybrid form thereof is useful inter alia for the generation of artificial chromosomes such as human artificial chromosomes (HACs), mammalian artificial chromosomes (MACs), yeast artificial chromosomes (YACs) and plant artificial chromosomes (PLACs). HACs are particularly useful since they are capable of accommodating large amounts of DNA and are capable of propagation in human cells. The HACs are non-viral in origin and, hence, are more suitable for gene therapy by, for example, introducing therapeutic genes. Furthermore, the HACs remain extra-chromosomal and, hence, have no insertional/substitutional mutagenic potential. The essence of a HAC is the presence of a neocentromere or latent, synthetic or hybrid form thereof which enables stable segregation during cell division. The HAC also remains extra-chromosomal and, hence, is more suitable for gene therapy. Reference to "extra-chromosomal" means that it does not integrate into the main chromosome and, in effect, is episomal.

Accordingly, the present invention provides a genetic construct comprising an origin of replication for a eukaryotic cell and a nucleic acid molecule encompassing a eukaryotic neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue flanked by telomeric nucleotide sequences functional in the cell in which the genetic construct is to replicate and wherein said genetic construct when introduced into a cell is a replicating, extra-chromosomal element which segregates with cell division.

More particularly, the present invention further contemplates a genetic construct in the form of an artificial chromosome comprising an origin of replication for a mammalian, human, plant or yeast cell and a nucleic acid molecule encompassing a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian or non-mammalian homologue flanked by telomeric nucleotide sequences functional in the cell in which the artificial chromosome is to replicate.

Another embodiment provides a genetic construct in the form of an artificial chromosome comprising an origin of replication for a mammalian, human, plant or yeast cell and a nucleic acid molecule having a tertiary structure which defines a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or its mammalian homologue flanked by telomeric sequences functional in the cell in which the artificial chromosome is to replicate.

Yet another embodiment is directed to a genetic construct in the form of an artificial chromosome comprising an origin of replication for a mammalian, human, plant or yeast cell and a nucleic acid molecule having a sequence of nucleotides which directs a conformation defining a human neocentromere wherein the centromere associates with CENP-A and/or CENP-C or antibodies thereto and does not contain substantial α-satellite DNA repeat sequences, said nucleic acid molecule flanked by telomeric nucleotide sequences functional in the cell in which the artificial chromosome is to replicate.

Still yet another aspect of the present invention relates to a genetic construct in the form of an artificial chromosome comprising an origin of replication for a mammalian, human, plant or yeast cell and a nucleic acid molecule comprising a sequence of nucleotides which:
(i) directs a conformation which defines a neocentromere or a functional form thereof or a latent, synthetic or hybrid form thereof;
(ii) comprises a nucleotide sequence substantially as set forth in one or more of Figures 6, 16A and/or 16B or having at least about 40% similarity to the nucleotide sequences set forth in Figures 6, 16A and/or 16B or is capable of hybridising to one or more of these sequences under low stringency conditions at 42 °C;
wherein the neocentromere is capable of associating with CENP-A and/or CENP-C or antibodies thereto and wherein said nucleic acid molecule is flanked by telomeric nucleotide sequences functional in the cell in which the artificial chromosome replicates.

In a preferred embodiment, the genetic construct is a HAC and comprises human telomeric sequences. In a particularly preferred embodiment, the HAC further comprises yeast artificial chromosome (YAC) arms and, hence, becomes a HAC/YAC shuttle vector capable of propagation in human and yeast cells. Preferably, the HAC/YAC contains a unique enzyme site between yeast telomeric sequences and human telomeric sequences such that upon contact with the particular enzyme, the yeast telomeric sequences are removed leaving the human telomeric sequences. Preferably, the unique enzyme site is a yeast specific enzyme site such as I-SceI.

According to this embodiment, there is provided a genetic construct defining a HAC/YAC comprising an origin of replication and a nucleic acid molecule encompassing a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or a mammalian or non-mammalian homologue thereof, said nucleic acid molecule flanked by human telomeric sequences which are in turn flanked by yeast telomeric sequences wherein a unique enzyme site is located between the human and yeast telomeric nucleotide sequences such that upon contact with the enzyme, the yeast telomeric sequences are removed and the human telomeric sequences are exposed.

More particularly, the present invention is directed to a genetic construct defining a HAC/YAC comprising an origin of replication and a nucleic acid molecule encompassing a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or a mammalian or non-mammalian homologue thereof wherein the neocentromere associates with CENP-A and/or CENP-C or antibodies thereto and does not contain substantial α-satellite DNA sequences wherein said nucleic acid molecule is flanked by human telomeric sequences which are in turn flanked by yeast telomeric sequences wherein a unique enzyme site is located between the human and yeast telomeric nucleotide sequences such that upon contact with said enzyme, the yeast telomeric sequences are removed and the human telomeric sequences are exposed.

Even more particularly, the present invention is directed to a genetic construct in the form of a HAC/YAC comprising an origin of replication and a sequence of nucleotides which directs a conformation defining a human neocentromere or a functional derivative thereof or a latent, synthetic or hybrid form thereof or a mammalian or non-mammalian homologue thereof wherein said neocentromere is capable of associating with CENP-A and/or CENP-C or antibodies thereto, said sequence of nucleotides flanked by human telomeric sequences which are in turn flanked by yeast telomeric sequences wherein a unique enzyme site is located between the human and yeast telomeric nucleotide sequences such that upon contact with said enzyme, the yeast telomeric sequences are removed and the human telomeric sequences are exposed.

Preferably, the length of the nucleotide sequence is between about 30 kbp and 1500 kbp, and more preferably between 60 kbp and 1000 kbp.

In a particularly preferred embodiment, the unique enzyme site is a yeast specific enzyme site such as I-SceI.

The present invention extends to yeast cells and human cells carrying the genetic constructs of the present invention and to proteins produced therefrom.

The genetic constructs may also comprise marker genes and other unique restriction sites to facilitate insertion of adventitious DNA. Accordingly, the genetic constructs of the present invention may further comprise adventitious or heterologous DNA encoding a product of interest. Preferred products of interest include pharmaceutically useful genes such as genes encoding cytokines, receptors, growth regulators and the like. Endogenous genes may also be replaced by wild-type genes or modified genes.

The adventitious or heterologous DNA may also encode a molecule not synthesised in a sufficient amount in a particular subject and hence the increased copy number permits greater amounts of the molecule being synthesised.

Accordingly, the present invention contemplates a genetic construct comprising an origin of replication and a first nucleic acid molecule defining a human neocentromere or a functional derivative thereof or latent, synthetic or hybrid form thereof or a mammalian or non-mammalian homologue, a second nucleic acid molecule encoding a peptide, polypeptide or protein, wherein said first and second nucleic acid molecules are flanked by a first set of human telomeric sequences which are in turn flanked by a second set of yeast telomeric sequences wherein there are unique enzyme sites between the human and yeast telomeric sequences such that upon contact with said enzyme, the yeast telomeric sequences are cleaved off to expose the human telomeric sequences.

Reference herein to "segregate" preferably means mitotically stable segregation. Conveniently, stable segregation may be determined as the presence of an artificial chromosome in 40-60% of daughter cells after 4-6 months of continuous passage.
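The criterion above can be expressed as a simple check. The Python sketch below is illustrative and makes several assumptions that are not in the text: the criterion is read as a minimum (at least 40% of cells retaining the artificial chromosome after at least 4 months of continuous passage), the function names are hypothetical, and the second helper uses a constant per-division loss model, retained = (1 - loss)^n, purely to show how a retention figure might be converted into a loss rate. The number of cell divisions n is a user-supplied value, since no doubling time is stated here.

```python
def meets_stable_segregation(retained_fraction: float, months: float) -> bool:
    """Assumed reading of the criterion: the artificial chromosome is present
    in at least 40% of cells after at least 4 months of continuous passage."""
    if not 0.0 <= retained_fraction <= 1.0:
        raise ValueError("retained_fraction must lie between 0 and 1")
    return months >= 4.0 and retained_fraction >= 0.40


def per_division_loss_rate(retained_fraction: float, divisions: int) -> float:
    """Loss rate per division under an assumed constant-loss model,
    where retained_fraction = (1 - loss) ** divisions."""
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    if divisions <= 0:
        raise ValueError("divisions must be positive")
    return 1.0 - retained_fraction ** (1.0 / divisions)


if __name__ == "__main__":
    # Hypothetical culture: 50% retention after 5 months and 150 divisions.
    print(meets_stable_segregation(0.50, 5.0))
    print(per_division_loss_rate(0.50, 150))
```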

The present invention extends to other artificial chromosomes analogous to the HACs and HAC/YACs described above such as MACs and PLACs.

Another aspect of the present invention relates to peptides, polypeptides and proteins which bind, interact or otherwise associate with the human neocentromere of the present invention or its mammalian and non-mammalian homologue. Preferably, the molecules are proteins, referred to as primary (1°) proteins. The 1° proteins bind to the neocentromere and secondary (2°) proteins bind to the 1° proteins before or after association with the neocentromere. The identification of the human neocentromere in accordance with the present invention provides a mechanism for assaying 1° proteins and 2° proteins which may be important for screening chromosomes in, for example, genetic disorders. This is particularly of use in Down's Syndrome which results from defective chromosome segregation.

The 1° proteins are readily detected by, for example, a gel shift assay. The nucleic acid molecule of the present invention defining the human neocentromere is digested, labelled and contacted with nuclear extract putatively containing the 1° proteins and resolved on a gel. When a 1° protein binds to a fragment carrying a binding portion of the neocentromere, the DNA fragment migrates in the gel at a slower rate due to the bound protein.

The present invention extends to purified 1° proteins capable of association with the subject centromere and to genetic sequences encoding same and to antibodies thereto.

The neocentromeres of the present invention are readily identified and characterised using, for example, human fibrosarcoma cell lines. For example, DNA suspected of carrying a neocentromere is introduced into fibrosarcoma cells in a linear form, generally together with a telomeric sequence. The cells are then screened for the presence of replicating, extra-chromosomal and segregating elements, referred to as mini chromosomes.

The present invention further encompasses eukaryotic cells carrying replicating, extrachromosomal and segregating nucleic acid molecules. Preferably the eukaryotic cells are mammalian cells and most preferably human cells. The nucleic acid molecules according to this aspect of the present invention are preferably as herein described. Particularly preferred cells are HT-38, HT-47, HT-54, HT-190, HT-191, BAC/E8-1, and BAC/F2-14.

The present invention is further described by the following non-limiting Figures and Examples.

EXAMPLE 1
YAC and Cosmid Probes for FISH

YACs carrying specific STSs were identified (Moir et al., 1994) by PCR-based screening of YAC libraries prepared in pYAC4 vector at the Center for Genetics in Medicine at Washington University (Brownstein et al., 1989) and at the CEPH (Albertsen et al., 1990). Cosmid DNA inserts (35-40 kb) were ligated to SuperCos I vector (Stratagene) and packaged with Gigapack III Gold extract (Stratagene) according to the manufacturer's instructions. YAC probes were prepared by Alu-PCR of total yeast genomic DNA using primers 5'-GGATTACAGG(C/T)(A/G)TGAGCCA-3' [SEQ ID NO: 1] and 5'-(A/G)CCA(C/T)TGCACTGCAGCCTG-3' [SEQ ID NO: 2] according to a published method (Archidiacono et al., 1994). For probe labelling, 1 μg of the YAC PCR products or whole cosmid DNA isolated by CsCl centrifugation or Qiagen column was used. The DNA was labelled with Biotin-16-dUTP (Boehringer Mannheim) using a nick translation kit (Boehringer Mannheim). A probe mix of 6-10 μg/ml of biotinylated probe DNA, 300 μg/ml of COT-1 DNA (Boehringer Mannheim), 500 μg/ml of carrier salmon sperm DNA and, where indicated, 10 μg/ml of biotinylated 10pC38 tag DNA was ethanol precipitated, resuspended in a hybridization mix of 50% v/v formamide in 2 x SSC and 10% w/v dextran sulphate, denatured at 95°C for 5 min, and preannealed for 30-60 min at 37°C to suppress repetitive sequences before being added to slides. FISH of α-satellite and satellite III probes was performed under low stringency as previously described (Voullaire et al, 1993).
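Each Alu-PCR primer above contains two degenerate positions, so each corresponds to a small pool of concrete oligonucleotides. The Python sketch below enumerates that pool. It assumes that the degenerate positions are written in the form (X/Y), reading them as (C/T) and (A/G) in both primers; the function and constant names are hypothetical, and the sequences are copied as they appear in this text rather than checked against the cited method.

```python
import itertools
import re
from typing import List

# Primer sequences as given in the text, with degenerate positions
# assumed to read (C/T) and (A/G).
PRIMER_1 = "GGATTACAGG(C/T)(A/G)TGAGCCA"
PRIMER_2 = "(A/G)CCA(C/T)TGCACTGCAGCCTG"


def expand_degenerate(primer: str) -> List[str]:
    """Expand every '(X/Y)' position into all concrete sequences."""
    # Each token is either a parenthesised group or a run of plain bases.
    tokens = re.findall(r"\(([^)]+)\)|([ACGT]+)", primer)
    choices = []
    for group, literal in tokens:
        if group:
            choices.append(group.split("/"))   # alternatives at this position
        else:
            choices.append([literal])          # fixed stretch of sequence
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
        variants = expand_degenerate(primer)
        print(name, len(variants), "variants")
        for variant in variants:
            print("  5'-" + variant + "-3'")
```

With two two-fold degenerate positions, each primer expands to four 19-mers.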

EXAMPLE 2
Somatic Cell Hybrids and Other Cell Lines

Skin fibroblasts and transformed lymphoblast cell lines were established from patient BE (Voullaire et al., 1993) and from his normal parents. The presence of the mardel 10 chromosome in the patient cell lines was confirmed by FISH. In addition to these cell lines, two somatic cell hybrids were produced by fusing cultured fibroblast cells derived from patient BE with the Chinese hamster ovary cell line CHO-K1 using polyethylene glycol. Hybrid cells were selected in a proline-free medium for the glutamic oxaloacetic transaminase-1 (GOT-1) gene located in the 10q24-q25 region. One of the hybrid cell lines, designated BE2C1-18-1f, was shown to contain the normal chromosome 10 but not the marker chromosome, while another hybrid cell line, designated BE2C1-18-5f, contained the marker chromosome but not the normal chromosome 10 of patient BE. The presence or absence of these chromosomes was established by karyotyping and ANTI-CEN/FISH probing. In addition, PCR analysis of an STS (sequence tagged site) marker, AFM259xg5, which resided on YAC-3, confirmed the status of these chromosomes in the hybrids and excluded the presence of submicroscopic fragments of the marker centromere region within the genome of BE2C1-18-1f, or the presence of the corresponding region of normal chromosome 10 within the genome of BE2C1-18-5f. Use of this STS marker also demonstrated that the mardel 10 chromosome originated from the patient's father.

EXAMPLE 3
Antisera

Antiserum CREST #6 was from a patient with calcinosis, Raynaud's phenomenon, esophageal dysmotility, sclerodactyly and telangiectasia (a constellation of symptoms commonly referred to as "CREST"; Moroi et al., 1981; Fritzler and Kinsella, 1980; Brenner et al., 1981). Western blot analysis of this antiserum indicated that the primary antigens detected were human CENP-A and CENP-B. A specific anti-CENP-C polyclonal antibody, designated Am-C1, was produced by the inventors by expressing a partial mouse CENP-C polypeptide (amino acid #41 to 345) as a GST-fusion product in E. coli, followed by gel purification of the product and its use as an antigen for antibody production in rabbit.

EXAMPLE 4
Preparation of Standard Metaphase Chromosomes for FISH analysis

Actively replicating transformed lymphoblasts were incubated at 37°C for 17 h in the presence of 0.1 M final concentration of thymidine before they were centrifuged at 2000 rpm for 10 min, washed with pre-warmed RPMI, and incubated for a further 5-6 h. 15 min before harvesting, colcemid (10 μg/ml) was added. Cells were harvested according to standard cytogenetic techniques using 0.075M KCl hypotonic solution for 15 min at 37°C, followed by three fixative washes in ice cold methanol/acetic acid 3:1, dropped onto clean glass slides, and stored desiccated at -20°C until required.

EXAMPLE 5
Preparation of Mechanically Stretched Chromosomes for ANTI-CEN/FISH Mapping
METHOD - I

This is an adaptation of the method described by Page et al. (1995). Colcemid (10 μg/ml) was added to actively dividing transformed lymphoblasts for 2-3 h, before the cells were centrifuged at 1500 rpm for 10 min, washed in PBS, and resuspended in 0.075M KCl hypotonic solution for 10 min at RT at a concentration of approximately 5 x 10⁴ cells/ml; the use of fewer cells here gave better stretching of the chromosomes. 200-300 μl of this suspension were then cytocentrifuged onto clean microscope slides using a Cytospin 2 (Shandon) at 1000 rpm for 5 min at high acceleration. The slides were immediately removed, placed flat in a shallow dish and very gently flooded with KCM (Potassium Chromosome Medium: 120 mM KCl, 20 mM NaCl, 10 mM Tris-HCl, 0.5 mM Na2EDTA, 0.1% v/v Triton X-100) (Jeppesen et al, 1992). After 10 min at RT, immunofluorescence was performed without fixation (Earnshaw and Migeon, 1985; Earnshaw et al, 1989; Jeppesen et al, 1992; Jeppesen and Turner, 1993). KCM buffer was gently aspirated and 50 μl of CREST#6 serum [diluted 1:50 in 1 x TEEN (1 mM Triethanolamine HCl, 0.2 mM Na2EDTA, 25 mM NaCl), 0.1% v/v Triton X-100, 0.1% w/v BSA] was added to the cell area of the slide and covered with a parafilm coverslip. The slides were incubated for 30 min at 37°C, then washed very gently by flooding in 1 x KB' [10 mM Tris-HCl (pH 7.7), 0.15M NaCl, 0.1% w/v BSA], three rinses of 3 min each at RT. The primary antibody was detected with Texas Red-conjugated Affini-pure Rabbit anti-Human IgG (H&L) (Jackson Laboratories) diluted 1:50 in 1 x KB'. 50 μl was added to each slide, covered with a parafilm coverslip, and incubated for 30 min at 37°C. The slides were again gently washed by flooding in 1 x KB' for 2 min at RT, before they were fixed by flooding in 10% v/v formalin in KCM for 10 min at RT, followed by three rinses of 3 min each in distilled water. If FISH was not performed the slides were rinsed in PBS and mounted in DAPI (0.25 μg/ml) in DABCO antifade mountant.
[In experiments where CREST#6 and Am-C1 antisera were simultaneously used to label the centromere (Figs. 2B and C), the above procedure was followed except for the addition of Am-C1 diluted 1:100 together with CREST#6, and the Am-C1 antibody was detected using 1:100 diluted Donkey anti-Rabbit DTAF (Jackson Laboratories)].

If FISH was to be performed on the slides, they were then given a second fix in 3:1 methanol/acetic acid for 15 min at RT. The slides were air dried for at least 5 min and either processed for FISH or stored at -20°C for up to several days before continuing. For FISH, the slides were dehydrated at RT in 70%, 90%, 100% v/v ethanol (2 min each) and air dried. Chromosomal DNA was denatured in deionised 70% v/v formamide/2 x SSC, pH 7.0 at 82°C for 8 min followed by immediate dehydration in 70%, 90% and 100% v/v ethanol at -20°C for 2 min each, then air dried for at least 10 min. (This high temperature of denaturation was critical to obtain maximum FISH signals). An amount of 15 μl of the prepared probe was added to each slide, covered with a 22 mm² coverslip, and sealed with rubber cement. Slides were hybridized overnight in a humid chamber at 37°C, then rinsed in 2 x SSC at RT, followed by 3 washes of 0.1 x SSC at 60°C for 5 min each, rinsed again in 2 x SSC, and immersed in a blocking agent of 5% non-fat milk in 4 x SSC for 10 min at RT. Probe hybridization was detected by incubation with FITC-conjugated avidin at 37°C for 30 min, followed by three washes of 5 min each at RT in wash buffer (4 x SSC, 0.05% v/v Tween-20). Signals were amplified by incubating with goat anti-avidin D antibodies for 30 min at 37°C, followed by three washes of 5 min each at RT in wash buffer, then with another layer of avidin-FITC for 30 min at 37°C, before the slides were washed in wash buffer, rinsed in PBS, and counter-stained with DAPI (0.25 μg/ml) in DABCO mountant.

METHOD - II

The following method was modified from that of Haaf and Ward (1994). Actively dividing lymphoblast cells were treated with 10 μg/ml colcemid for 2-3 h, washed in PBS and resuspended in a hypotonic solution consisting of 10 mM Hepes (pH 7.3), 30 mM glycerol, 1.0 mM CaCl2 and 0.8 mM MgCl2, at a cell density of approx. 2.5 x 10²/ml. After 10 min of hypotonic treatment at RT, 300 μl were cytocentrifuged (Shandon - Cytospin 2) onto glass slides at 800 rpm for 4 min. The slides were immediately removed from the centrifuge, dried for 15 sec, fixed in methanol at -20°C for 20-30 min, rinsed in acetone at -20°C for a few sec, then washed in 3 rinses of PBS at RT. Immunofluorescence staining was done using CREST#6 at a dilution of 1:50 in PBS. After incubation at 37°C for 30 min, the slides were washed three times in PBS for 2 min each. This primary antibody was then detected by a further incubation for 30 min at 37°C with Texas Red-conjugated Rabbit anti-Human IgG diluted at 1:50 in PBS. The slides were fixed in 10% v/v formalin in KCM for 10 min at RT, then washed in 3 rinses of distilled water and drained. Before FISH was performed, slides were fixed in methanol/acetic acid 3:1 for 15 min at RT and air dried. Chromosomal DNA was denatured in 70% v/v deionised formamide (pH 7.0) in 2 x SSC at 82°C for 4-6 min. After dehydration in an ice cold ethanol series the slides were air dried, and used for FISH as described for Method I. Slides could be stored covered in foil at RT after methanol/acetic acid fix for up to several weeks before FISH.

Both methods I and II were used to obtain the results shown in Figs. 2B, 2C, 3 and 4B.

EXAMPLE 6
Image Analysis

Hybridization signals for YAC mapping on standard metaphase preparations were visualized using a normal fluorescence microscope. Images for the ANTI-CEN/FISH experiments were analyzed on a Zeiss Axiolab fluorescence microscope equipped with a 100x objective and a cooled CCD camera (Photometrics Image Point) controlled by a Power Mac computer. Gray scale images were captured separately using a LUDL filter wheel and controller for Texas Red, FITC and DAPI. These images were pseudocoloured and merged using IPLab Spectrum software from Signal Analytics Corporation. A number of difficulties were commonly associated with the ANTI-CEN/FISH technique: (a) the deliberate "stretching" of the chromosomes, whilst increasing the resolution of mapping, sometimes caused serious distortion to the chromosomes, often making them quite dysmorphic; (b) FISH treatment following the ANTI-CEN-labelling often significantly reduced the ANTI-CEN signals; (c) more highly stretched chromosomes (which would potentially give better mapping resolution) generally gave weaker ANTI-CEN signals; and (d) the ANTI-CEN signal on the mardel 10 centromere was usually weaker than those of the other human chromosomes. Thus, a cell would only be considered informative and used for scoring if both the p'- and q'-arms of the mardel 10 chromosome were discernible and separated by a discrete ANTI-CEN signal. In addition, FISH signals for both the test probe and the 10pC38 cosmid tag (used to identify the q'-arm of, and thus orientate, the marker chromosome) must be clearly present. Using these criteria, the overall frequency of informative cells was found to be approximately 1 in every 20-30 metaphases analyzed.
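The scoring criteria above amount to a conjunction of five observations per metaphase. The Python sketch below shows one way such a tally might be recorded; the class, field and function names are hypothetical and the example records are invented for illustration, not data from these experiments.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MetaphaseObservation:
    """One scored metaphase; each field mirrors a criterion stated in the text."""
    p_arm_discernible: bool         # p'-arm of the mardel 10 chromosome visible
    q_arm_discernible: bool         # q'-arm of the mardel 10 chromosome visible
    discrete_anti_cen_signal: bool  # arms separated by a discrete ANTI-CEN signal
    test_probe_signal: bool         # FISH signal for the test probe clearly present
    tag_signal: bool                # FISH signal for the 10pC38 cosmid tag present


def is_informative(obs: MetaphaseObservation) -> bool:
    """A cell is informative only if every criterion is satisfied."""
    return all((
        obs.p_arm_discernible,
        obs.q_arm_discernible,
        obs.discrete_anti_cen_signal,
        obs.test_probe_signal,
        obs.tag_signal,
    ))


def informative_frequency(observations: Iterable[MetaphaseObservation]) -> float:
    """Fraction of analysed metaphases that are informative."""
    scored = list(observations)
    if not scored:
        raise ValueError("no metaphases were scored")
    return sum(is_informative(o) for o in scored) / len(scored)


if __name__ == "__main__":
    # Invented example records, for illustration only.
    cells = [
        MetaphaseObservation(True, True, True, True, True),
        MetaphaseObservation(True, True, False, True, True),
        MetaphaseObservation(True, False, True, True, True),
        MetaphaseObservation(True, True, True, False, True),
    ]
    print(informative_frequency(cells))
```

A frequency of about 1 in 20-30, as reported above, would correspond to a value of roughly 0.03 to 0.05 from `informative_frequency`.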

EXAMPLE 7
Restriction Analysis of Patient DNA
High-molecular weight genomic DNA was extracted from cultured fibroblast cell lines of patient BE and those of his parents and digested with different enzymes to generate restriction fragments ranging from <1 kb up to ~1 Mb. The digested DNA was resolved either on a standard agarose gel or by pulsed-field gel electrophoresis (PFGE) using a Bio-Rad CHEF-XA Mapper. For filter hybridization, 50-100 ng of whole cosmid or PAC DNA was labelled by random priming. The labelled probe was then added to 2 ml of hybridization buffer (0.5M Na2HPO4, 7% w/v SDS, 1% w/v BSA, 1 mM EDTA, pH 7.0) containing 500 μg of human placental DNA (Sigma). The mixture was boiled for 5 min, then placed in a 65°C water bath for preannealing of repetitive DNA for 90 min. The preannealed probe mix was then added to prehybridizing filters and hybridized overnight at 65°C. Post-hybridization washes were at a final stringency of 0.1 x SSC, 0.1% w/v SDS at 68°C.

EXAMPLE 8
Identification of a YAC region spanning the marker centromere
The initial search for DNA sequences spanning the centromere of the mardel 10 chromosome was based on fluorescence in situ hybridization (FISH) of existing cosmid and YAC clones (Moir et al, 1994; Zheng et al, 1994) that have been mapped to the q24 - q26 region of the normal human chromosome 10 where the new marker centromere was formed (Voullaire et al, 1993) (Fig. 1A). This search led to the identification of a 4 megabase YAC contig (designated #082) that spanned the marker centromere region (Fig. 1B). Fig. 1C graphically presents the FISH mapping results with selected YACs from this contig. As can be seen, two of the YACs (YAC-1 and YAC-2) mapped to the q'-side of the marker centromere, whereas the remaining YACs mapped to the p'-side of the centromere. The low signal level observed for YAC-3 was due to a large proportion of this probe hybridising directly on the centromere itself. These results, therefore, provided evidence that YAC contig #082 spanned the marker centromere, and that the centromere region was likely to be within YAC-3, where the "cross-over" between the q' and p' signals occurred.

EXAMPLE 9
Development of Improved ANTI-CEN/FISH Methods for the Simultaneous
Detection of Marker Centromere and Single-copy Cosmid DNA Probes

Although normal fluorescence microscopy and FISH analysis of standard metaphase chromosomes were adequate for the initial identification of the YAC contig spanning the marker centromere, methods with significantly higher sensitivity and resolution were needed to allow further walking into the marker centromere DNA. Three requirements have to be satisfied by these methods: (a) the metaphase chromosomes have to be extended to offer much greater mapping resolution, (b) the centromeres have to be more precisely defined than that offered by a cytogenetic constriction, and (c) the methods should allow simultaneous visualization of both the centromere antibody and FISH signal. Two published methods were explored (designated here as ANTI-CEN/FISH methods) based on extending metaphase chromosomes by mechanical stretching and labelling of the neocentromere by autoimmune antibodies (Haaf and Ward, 1994; Page et al, 1995). Since these methods were originally established for the labelling of normal centromeres and for FISH analysis of highly repeated DNA, they were modified (see Example 5) to allow detection of the generally reduced ANTI-CEN signal of the subject marker neocentromere and the lower FISH signals resulting from the use of single-copy cosmid DNA probes.

With the improved detection methods, the status of α-satellite and satellite III DNA on the marker neocentromere was reassessed, since this was previously determined using standard microscopy and FISH (Voullaire et al, 1993). Fig. 2A shows the result of antibody labelling using CREST#6 and FISH using α-satellite DNA, and indicates the absence of detectable signal on the marker centromere. The same result was obtained when the experiments were repeated without ANTI-CEN-labelling, ruling out the possibility that the anti-centromere antibody might have obscured any weak FISH signals. Similar results were obtained with satellite III DNA. In separate reconstruction experiments, it was possible to demonstrate the sensitivity of the procedure in detecting a single-copy DNA probe of less than 1.5 kb. Making the reasonable assumption that the low-stringency hybridization conditions used for the α-satellite and satellite III DNA would have allowed the detection of any related sequences (by virtue of the use of a > 100-fold excess of probes and the strong hybridisation of these probes to all the other centromeres), it can be concluded that these satellites are absent.

EXAMPLE 10
Co-localization of CENP-C and CENP-A on the Marker neocentromere

To test if CENP-C is present on the marker centromere, a specific rabbit polyclonal antibody was prepared against a recombinant product of mouse CENP-C. This antibody, designated Am-C1, reacted strongly with the centromeres of rodent and human chromosomes. Fig. 2B shows results for the labelling of stretched human metaphase chromosomes using this antibody simultaneously with the CREST#6 autoimmune antibody. As can be seen, irrespective of the degree of chromosome stretching, the signals for the two antibodies coincided fully on all the centromeres. The localization of these two antibodies on the marker chromosome was further determined by employing the 10pC38 cosmid tag in an ANTI-CEN/FISH experiment to identify the marker chromosome. The results indicated that both the antibody signals were clearly present and again coincided completely on the marker centromere (Fig. 2C, a-e). Although CREST #6 was known to bind CENP-A and CENP-B, indirect evidence suggests that binding to the marker centromere presumably occurred via CENP-A since the marker centromere was previously demonstrated not to bind CENP-B (Voullaire et al, 1993). The above results, therefore, established the localization of CENP-C, and probably CENP-A, on the marker centromere.

EXAMPLE 11
Localization of the anti-centromere antibody-binding domain

For further walking into the marker centromere region, cosmid libraries were prepared from total yeast genomic DNA containing YACs-2, -3, -4, -6, -7, -13, and -17. Cosmid clones containing human DNA inserts were isolated by hybridization with human COT-1 DNA at low stringency. All resulting cosmids were screened by standard FISH to confirm their localization to the expected marker centromere and normal chromosome 10 regions, and to eliminate clones that might have originated from other genomic sites due to chimeric YACs. Positive clones were then analyzed further with the ANTI-CEN/FISH method, using CREST#6 to label the centromere. Fig. 3a (I and II) shows examples of cosmid signals that mapped to the q'- and p'-side, respectively, of the marker centromere in the ANTI-CEN/FISH experiments. The cosmid tag (clone 10pC38) was used in these experiments to define the q' arm of the marker chromosome. For cosmid walking, we concentrated on clones derived from YAC-3 since FISH mapping of YAC contig #082 indicated that the marker centromere region was likely to be within this YAC. Fig. 4a shows a restriction map of the region covered by this and surrounding YACs and compares this map with a genomic map derived from patient BE. The relative positions of a series of cosmid clones (including five independent PACs) were also determined and placed on the YAC map. Fig. 4b presents the ANTI-CEN/FISH results obtained with a number of the cosmid clones and one of the PAC clones. Clones Y3C64, Y6C8, and Y3C94 localized preferentially to the q'-side, while Y13C1+C8 and Y17C6 localized preferentially to the p'-side of the marker centromere, suggesting that the nucleus of the antibody-binding domain is situated between these two cosmid clusters. Within this central region, a group of cosmid clones comprising the HC-contig (Fig. 4a) was found to map closely around the ANTI-CEN signal. Fig. 4c shows a restriction map for eight different overlapping clones from this HC-contig.
The chromosomal positions of five of these overlapping clones were analyzed in detail using ANTI-CEN/FISH. Fig. 4b shows the cumulative results for more than 60 informative chromosomes for each of these five probes. The results indicated that Y7C14 mapped preferentially to the q'-side of the antibody-binding domain, while the remaining four clones (Y4C45, Y6C10, Y6C21 and Y3C3) mapped preferentially to the p'-side. In addition, the results for PAC5 (a 75 kb-insert PAC clone that overlapped with the p'-end of PAC4 by approximately 5 kb; see Fig. 4a) provided further evidence for the emergence of the HC-contig region onto the p'-arm. Based on these results, we conclude that the eight contiguous cosmid clones within the HC-contig shown in Fig. 4c, which together constitute ~80 kbp of DNA, have defined the nucleus of the antibody-binding domain of the marker centromere.
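
By way of illustration only, the side-preference scoring of a probe across many informative chromosomes can be sketched as a tally followed by an exact sign test. The specification reports only that more than 60 informative chromosomes were scored per probe; the counts, probe labels, significance threshold and function names below are hypothetical and are not taken from the described experiments.

```python
from math import comb

def sign_test_p(n_p_side: int, n_q_side: int) -> float:
    """Two-sided exact binomial (sign) test against a 50:50 split."""
    n = n_p_side + n_q_side
    k = max(n_p_side, n_q_side)
    # Probability of a split at least this lopsided when both sides are equally likely.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def preferred_side(n_p_side: int, n_q_side: int, alpha: float = 0.05) -> str:
    """Return the preferred side (p' or q'), or 'undetermined' if the split is not significant."""
    if sign_test_p(n_p_side, n_q_side) >= alpha:
        return "undetermined"
    return "p'" if n_p_side > n_q_side else "q'"

# Hypothetical tallies: (signals on the p' side, signals on the q' side) of the CREST signal.
hypothetical_counts = {"probe_A": (48, 14), "probe_B": (12, 50), "probe_C": (33, 29)}
for probe, (n_p, n_q) in hypothetical_counts.items():
    print(probe, preferred_side(n_p, n_q), round(sign_test_p(n_p, n_q), 4))
```

A probe lying directly over the antibody-binding domain would be expected to give a near-even split, which this sketch reports as undetermined.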

From the above ANTI-CEN/FISH results, it was difficult to determine if the sequences of the HC-contig and its surrounding DNA, both originally derived from a normal individual, were part of the marker centromere DNA, or whether these sequences simply flanked a transposed centromere DNA with an unrelated nucleotide composition. However, supporting evidence from the ANTI-CEN/FISH experiments suggested that the DNA of the HC-contig region was a part of the marker centromere. This came from the mapping of Y6C10 and Y6C21 onto superstretched chromosomes that were occasionally detected in the slide preparations. An example of such mapping is shown in Fig. 3b using Y6C21. As can be seen, whilst a significant portion of Y6C21 hybridized to the p'-side of the CREST signal on the highly extended chromosome, a substantial portion of the cosmid DNA also overlapped directly with the CREST signal. This suggests that at least part of the HC-contig region actually comprises the same DNA sequence as the marker centromere. This possibility was further investigated by detailed genomic mapping.

EXAMPLE 12
The Marker Centromere DNA has a Similar or Identical Sequence
Organization as the HC-Contig

The genomic organization of the HC-contig region was compared with that of the corresponding DNA region of the mardel (10) chromosome. Three overlapping cosmids (Y7C14, Y6C10, and Y4C7, the latter being essentially the same as Y6C21; Fig. 4C) from the HC-contig were used as probes to analyze the restriction patterns of genomic DNA prepared from patient BE and from his karyotypically normal parents. Fig. 5 shows examples of the band patterns obtained with Y6C10, while Table 1 summarizes the results for all the enzymes tested with Y7C14, Y6C10 and Y4C7. The detection of a single band on PFGE gels with a number of the enzymes indicated that the cosmid DNA sequences were unique within the human genome (SfiI, SalI, KspI, KpnI and BclI in Fig. 5A; Table 1). The detection of multiple bands on PFGE gels with some of the enzymes (ClaI in Fig. 5A; Table 1) could be explained by differential methylation of different restriction sites found in this region (Nelson and McClelland, 1991); the reproducibility of these multiple band patterns ruled out incomplete digestion as a possible cause. The multiple bands detected with the more frequently cutting enzymes on a standard gel (Fig. 5B and Table 1) resulted from cleavage sites present within the probe DNA, since similarly digested cosmid DNA electrophoresed next to the genomic DNA yielded identical patterns for all the bands not containing cosmid vector sequences. In all, 37 enzymes were used to generate more than 160 different fragments for the three cosmid probes (Table 1). The results indicated that, except for a polymorphic fragment found in one of the parents, an identical banding pattern was present in the genomic DNA of patient BE and in that of his parents.
Furthermore, when the restriction patterns obtained for the genomic DNA of patient BE were compared with those of the somatic hybrid cell line BE2C1-18-5F, which contained the marker chromosome but not the normal chromosome 10, no detectable difference was seen between the two DNA preparations within the HC-contig region (Fig. 5C).

In addition to Y7C14, Y6C10 and Y4C7, a host of other probes from within or surrounding the HC-contig have been tested, each with an average of 12 different informative enzymes. These probes included PAC4 (which spanned the entire HC-contig region shown in Fig. 4C), Y3C64, Y3C109, Y6C6, Y6C8, Y3C94, PAC1, Y3C90, Y4C4, Y4C8, Y4C13, and Y3C33. The results again indicated identical restriction enzyme patterns between patient BE and normal DNA. Thus, through the analysis of a relatively large number of probes covering about 500 kb of YAC-3 around the HC-contig region, and the use of a high density of restriction enzymes that generated a range of fragments from <1 kb to ~1 Mb, it was evident that the marker centromere DNA and a substantial stretch of its adjoining regions showed no detectable difference against the corresponding genomic region of the normal chromosome 10.

Since a potential limitation of the above Southern blot analyses was that highly repeated sequences were not detected because of the preannealing step used in the hybridisation procedure, a different approach was employed to compare the DNA of the marker chromosome and that of the normal chromosome 10. In this approach, oligonucleotide primers from different regions of the HC-contig were used to prepare a series of PCR fragments from the BE2C1-18-5F and BE2C1-18-1F hybrid cell lines. Electrophoretic comparison of such fragments, which randomly covered approximately 40 kb of the HC-contig, indicated no detectable difference between the two chromosomes and provided independent support for the results obtained in the Southern blot analyses. Thus, it can be concluded that the sequence organization of the marker centromere region is similar, if not identical, to that found in the HC-contig region of the normal chromosome 10.

EXAMPLE 13
Implications for Centromere Study and Mammalian
Artificial Chromosome Construction

The mammalian centromere has been difficult to study due to the massive amount of repetitive DNA normally associated with it. By avoiding such repetitive DNA and analyzing the unusual centromere found in the present marker chromosome, the inventors have created a much more tractable system for centromere studies. The present analysis has already shed some light on the important question of DNA sequence versus conformational requirement of a centromere, and on the intriguing concepts of latent centromeres and epigenetic mechanisms. One urgent application of this DNA is to use it to identify the primary protein(s) which bind to the centromeric DNA. Another important application of the marker centromere DNA is in the construction of mammalian artificial chromosomes. Such artificial chromosomes offer a potentially powerful vehicle for the structural and functional analysis of chromosomes, for the genetic manipulation of plants and animals, and for the stable transmission of therapeutic genes in human gene therapy. The artificial chromosomes require a functional mammalian centromere, and the marker centromere DNA element of the present invention now provides a suitable centromere, especially because of its relatively small size in the absence of α-satellite DNA and its cloning stability, as indicated by the cosmid, YAC and BAC clones of the HC-contig and NC-contig.

EXAMPLE 14
Sequence analysis

Figures 6, 16A and 16B show partial nucleotide sequences for the HC-contig [SEQ ID NO: 3], NC-contig [SEQ ID NO: 4] and F2 (BAC/F2-14) [SEQ ID NOs: 5-29] regions, respectively.

EXAMPLE 15
Human Artificial Chromosome (HAC)

The following are examples of the different approaches being used in the inventors' laboratory for the production of a HAC:

Retrofitting of HC-contig DNA from normal chromosome 10

This procedure aims to produce HACs of 100 kb to >1 Mb using the region of the normal chromosome 10 containing and surrounding the HC-contig DNA. The generation of a HAC by this approach will provide crucial proof that this normal DNA region can be reactivated to form a functional centromere.

A retrofitting procedure suitable for introducing human telomeres to both ends of any YAC prepared in the pYAC4 vector in the yeast host strain AB1380 has been previously described (Larin et al, 1994; Taylor et al, 1994, 1996). YACs (in particular YAC-3 and YAC-5) spanning the normal HC-contig region are used for retrofitting by plasmid constructs designed to recombine with their pYAC4 vector arms (Figure 7). The construct pLGTEL 1 is used to target the left arms of the YACs. This serves to add a LYS2 yeast selectable marker, a gpt element for ultimate selection in mammalian and avian cell culture, and a human telomere. The right arms of the YACs are targeted by homologous recombination with pRANT 11 to produce a final construct in which additional markers are introduced along with a second human telomere to cap the construct. Specifically, an ADE2 yeast marker is added and the URA3 gene of the YAC is disrupted, serving a useful role in negative selection of the construct. A neomycin (neo) resistance gene shown to function in mammalian and avian cells is also introduced. The finished constructs are transfected into different cultured cell lines, including HT1080 (of human sarcoma origin) (Larin et al, 1994; Rasheed et al, 1974), DT40 (a recombination-proficient chicken cell line) (Dieken et al, 1996), and BE2C1-18-5f (a human/hamster somatic hybrid cell line containing the mardel (10) chromosome but not the normal chromosome 10).

In vitro cloning of HC-region into YAC/HAC vectors

The different vectors used for the cloning of the normal and mardel (10) centromeric DNA in the preparation of HACs are summarised in Table 2.

A number of different YAC cloning strategies are employed:

Conventional YAC cloning approach. Figures 8A-D show the different vectors used for cloning DNA as YACs by the conventional restriction/ligation methods. These YACs can then be shuttled into mammalian cells and tested for HAC function.

ALU-ALU circular TAR cloning approach. Transformation-associated recombination (TAR) in the yeast S. cerevisiae is a method for constructing linear and circular YACs from mammalian DNA (Larionov et al, 1996a, 1996b). The recombination process is shown in Figure 9. Briefly, the technique uses a vector (pVC39-AAH2, Fig. 8E) that lacks an autonomous replicating sequence (ARS) but contains a functional yeast centromere (e.g. CEN6), a selectable marker (e.g. HIS3), and two ALU DNA hooks that trap mammalian DNA by recombination at ALU sequences. Linearized vector and high molecular weight DNA are co-transformed into yeast spheroplasts, followed by selection on medium lacking histidine. The key to the process is that the mammalian DNA provides an ARS (an 11-bp sequence found frequently in mammalian DNA), which allows the HIS3/CEN vector to replicate as a circular YAC. These YACs are very stable and range in size from 100 kb to greater than 600 kb (Larionov et al, 1996b).
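
By way of illustration only, the requirement that the captured mammalian DNA supply an ARS-like element can be sketched as a scan of an insert sequence for 11-bp matches on either strand. The specification does not give the 11-bp sequence; the sketch assumes it is the S. cerevisiae ARS consensus sequence, 5'-WTTTAYRTTTW-3'. The toy sequence and function names are hypothetical.

```python
import re

# Assumption (not stated in the specification): the 11-bp element is the
# S. cerevisiae ARS consensus sequence 5'-WTTTAYRTTTW-3',
# where W = A/T, Y = C/T and R = A/G.
ACS_PATTERN = "[AT]TTTA[CT][AG]TTT[AT]"
ACS_LENGTH = 11

def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_acs_matches(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start on the forward strand, strand) for every exact match."""
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are also reported.
    lookahead = re.compile(f"(?=({ACS_PATTERN}))")
    hits = [(m.start(), "+") for m in lookahead.finditer(seq)]
    for m in lookahead.finditer(reverse_complement(seq)):
        # Convert the reverse-strand position back to forward-strand coordinates.
        hits.append((len(seq) - m.start() - ACS_LENGTH, "-"))
    return sorted(hits)

# Toy insert with one match on each strand (hypothetical sequence).
toy_insert = "GCGC" + "ATTTATGTTTA" + "GGCC" + reverse_complement("TTTTACATTTT") + "CG"
print(find_acs_matches(toy_insert))
```

An exact-match scan of this kind only indicates candidate elements; whether a given match supports replication in yeast is an experimental question.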

The pVC39-AAH2 vector is used to clone DNA from hybrid BE2C1-18-5f to make YACs with an average insert of 250 kb. This TAR vector is further modified to create pAAH-TCNa (Fig. 8G) so that it has the ability to shuttle between yeast and mammalian cells (as outlined in Figure 10), including the potential to expose human telomeres (TEL) at each end of a cloned fragment using a unique I-SceI restriction site.

Semi-specific and specific circular TAR. A modified circular TAR method utilising two specific 5'C and 3'C DNA hooks (300-700 bp in size) may be used to clone a specific human DNA at a frequency of 3/1000 HIS+ transformants. The inventors prepared the vectors pVC39-ALU/C3-F2(+/-) and pTCN-TCS (Table 2) to perform semi-specific and specific TAR cloning, respectively.

The Semi-specific TAR methodology is a modification of a specific circular TAR strategy which permits the site directed isolation of target chromosomal DNA. Furthermore, in accordance with the present invention, the methodology described herein enables the site-specific cloning of target chromosomal DNA from total genomic DNA as a circular YAC at relatively high frequencies and without the need for the construction and extensive screening of complex libraries made from genomic DNA.

In a preferred embodiment of the present invention, the methodology employs a single specific DNA hook which flanks the mardel (10) chromosome and a less specific Alu-hook to trap the other side of the target DNA.

In initial experiments, a unique repeat DNA-free, 1.4-kb EcoRI fragment (designated C3-F2) was identified from the p' side of the 80-kb HC-contig (Fig. 11A) (du Sart et al, 1997). This fragment was subcloned into the centromere-based yeast circular TAR vector, pVC39-AAH2, by replacing the existing BLUR13 Alu (Larionov et al, 1996b) to create the pVC39-ALU/C3-F2 constructs. As the specific orientation of the C3-F2 sequence on the chromosome was not known, the fragment was cloned in two different orientations, for which the (+) orientation (Fig. 11B) was expected to trap the genomic region to the left of C3-F2, while the (-) orientation was expected to trap the region to the right. Both constructs were used in yeast transformation.

As a source of genomic DNA containing the neocentromere, a somatic hybrid cell line, BE2C1-18-5f (du Sart et al, 1997), containing the mardel (10) chromosome but not the normal human chromosome 10 was used. 5 μg of high-molecular-weight DNA from this cell line and 1 μg of pVC39-ALU/C3-F2(+) or pVC39-ALU/C3-F2(-) (linearized with SmaI to expose the 0.21-kb Alu and 1.4-kb C3-F2 hooks) were co-transformed into 10^9 (previously prepared and stored frozen) spheroplasts of S. cerevisiae YPH857, which carries a HIS3 gene deletion (Sikorski and Hieter, 1989), and grown on SD medium without HIS (Larionov et al, 1996a, 1996b) to yield between 10 and 100 HIS+ colonies. Control experiments in which YPH857 was transformed with vector alone did not produce any colonies, indicating that the C3-F2 fragment lacked ARS-like sequences. Twenty TAR experiments were performed and HIS+ colonies were picked into 96-well trays containing YPD medium (supplemented with 50 μg/ml ampicillin and 15 μg/ml tetracycline), grown at 30°C with aeration for 24 h and stored in 20% (v/v) glycerol at -70°C. Total yeast DNA was prepared in pools of 48 (Kwiatkowski Jr et al, 1990) and screened by PCR with the primers norm 5 and norm 7 (Table 3), which are located 30 kb q' of C3-F2 (Fig. 11A). Two desired positive clones were identified: 5f-52-E8, which contained the neocentromere DNA derived from mardel (10), and 5f-38-F2, which contained the mardel (10) DNA immediately p' of the neocentromeric DNA. For subsequent studies, these clones were grown on SD medium without HIS and single colonies were re-isolated for characterization.
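
By way of illustration only, the pooled PCR screen (DNA prepared in pools of 48 colonies from 96-well trays) can be sketched as a two-stage procedure: pools are tested first, and only the members of positive pools are retested individually. The tray and well naming scheme, the choice of positive pool and the function names are hypothetical.

```python
from itertools import islice

def make_pools(clone_ids, pool_size=48):
    """Split an ordered list of clone IDs into consecutive pools of pool_size."""
    it = iter(clone_ids)
    pools = []
    while True:
        pool = list(islice(it, pool_size))
        if not pool:
            return pools
        pools.append(pool)

def candidates_to_retest(pools, positive_pool_indices):
    """Clones that must be PCR-tested individually after the pooled screen."""
    return [clone for i in sorted(positive_pool_indices) for clone in pools[i]]

# Hypothetical naming: tray number plus well (rows A-H, columns 1-12).
wells = [f"{row}{col}" for row in "ABCDEFGH" for col in range(1, 13)]
clones = [f"tray{tray}-{well}" for tray in (1, 2) for well in wells]
pools = make_pools(clones)

# Suppose only pool index 2 gave a PCR product (hypothetical result).
retest = candidates_to_retest(pools, {2})
print(len(pools), "pools;", len(retest), "clones to retest individually")
```

With one positive pool among four, the sketch reduces the number of individual PCR reactions from 192 to 4 pooled plus 48 individual reactions.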

Initially, the sequence nature and sizes of the 5f-52-E8 and 5f-38-F2 insert DNA were determined. High-molecular-weight DNA was prepared in agarose blocks and digested with an enzyme (SrfI) that linearized the YAC (Fig. 11A). The linearized DNA, as well as uncut intact DNA, was resolved by pulsed-field gel electrophoresis (PFGE), transferred onto a nylon membrane and probed with radiolabelled PAC4, a P1-derived artificial chromosome clone containing a 120-kb insert that spans the entire HC-contig from normal chromosome 10 (du Sart et al, 1997), following preannealing with human placental DNA to suppress repetitive DNA.

The intact 5f-52-E8 and 5f-38-F2 remained trapped in the electrophoretic wells, whereas the linearized DNA migrated into the gel with sizes of approximately 110 kbp and 80 kbp, suggesting insert sizes of about 105 kbp and 75 kbp, respectively (given that the vector size is 5.9 kb).
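
By way of illustration only, the insert-size estimate is a subtraction of the vector size from the size of the linearized YAC. The sketch below uses the sizes given in the text; the function name is hypothetical, and the results are approximate because PFGE sizing is itself approximate.

```python
VECTOR_KB = 5.9  # vector size stated in the text

def insert_size_kb(linearized_kb: float, vector_kb: float = VECTOR_KB) -> float:
    """Estimated insert size: linearized YAC size minus vector size, in kb."""
    return round(linearized_kb - vector_kb, 1)

# Approximate linearized sizes read from the PFGE gel (from the text).
linearized_sizes_kb = {"5f-52-E8": 110, "5f-38-F2": 80}
for clone, size_kb in linearized_sizes_kb.items():
    print(clone, insert_size_kb(size_kb), "kb")
```

The computed values of about 104 kb and 74 kb are consistent with the rounded insert sizes of about 105 kbp and 75 kbp quoted above.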

Despite the use of a genomic DNA source previously shown by sequence-tagged-site (STS) analysis to be free from normal chromosome 10 material, it was desirable to independently confirm the mardel (10) origin of the 5f-52-E8 YAC clone. This was achieved using a set of primers (norm 17 and 18; Fig. 11A) that detected a variable-number-tandem-repeat (VNTR) region within the HC-contig/neocentromere region. The results clearly indicated the presence of a 1.4-kb PCR product that was specific for the mardel (10) chromosome (Table 3).

PCR was used to further compare the 5f-52-E8 DNA with the previously cloned HC-contig sequence derived from normal chromosome 10. PCR products with sizes ranging between 0.2 and 15.9 kb were generated by standard PCR or with the Expand Long Template PCR system (Boehringer-Mannheim). Products greater than 1 kb were digested with the frequent cutting enzymes RsaI and BsiXI, and their fingerprints were compared by agarose gel electrophoresis. The results, shown in Table 3, indicated the absence of any detectable difference between the 5f-52-E8 DNA and those of the corresponding regions of the normal chromosome 10 (in somatic cell hybrid BE2C1-18-1f) and the neocentromere region of mardel (10) (in somatic cell hybrid BE2C1-18-5f). These results also demonstrated that the YAC 5f-52-E8 spanned at least 75 kb of the HC-contig region (Fig. 11C), consistent with the size determined by PFGE. Furthermore, the ability of all the internal primers to amplify DNA from 5f-52-E8 strongly suggested that the YAC was not chimeric. This result was confirmed by isolating DNA from four single-colony isolates of 5f-52-E8, digesting these with EcoRI and EcoRV, and probing with radiolabelled PAC4. The hybridization patterns obtained with these enzymes were consistent with those established in the previous study (du Sart et al, 1997). Thus, this analysis, based on cloned DNA derived directly from mardel (10), has provided confirmation that the neocentromere DNA region is structurally identical to that of the corresponding HC-contig region of the normal chromosome 10 (du Sart et al, 1997).
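
By way of illustration only, the comparison of restriction fingerprints between templates can be sketched as a band-by-band comparison of fragment sizes within a measurement tolerance. The band sizes, the 5% tolerance and the function name are hypothetical; the specification reports only that no differences were detected.

```python
def same_fingerprint(bands_a, bands_b, rel_tol=0.05):
    """True if two lists of fragment sizes (kb) match band for band.

    Bands are paired after sorting by size. rel_tol is an assumed allowance
    for gel measurement error, not a value taken from the specification.
    """
    if len(bands_a) != len(bands_b):
        return False
    pairs = zip(sorted(bands_a, reverse=True), sorted(bands_b, reverse=True))
    return all(abs(a - b) <= rel_tol * max(a, b) for a, b in pairs)

# Hypothetical fingerprints (kb) of one digested PCR product from three templates.
fingerprints = {
    "5f-52-E8 YAC": [2.1, 1.4, 0.9, 0.35],
    "normal chromosome 10 hybrid": [2.1, 1.4, 0.9, 0.35],
    "mardel (10) hybrid": [2.15, 1.38, 0.9, 0.35],
}
reference = fingerprints["5f-52-E8 YAC"]
for template, bands in fingerprints.items():
    print(template, same_fingerprint(reference, bands))
```

A comparison of this kind detects gained, lost or shifted bands but, like the gel analysis it mimics, cannot detect sequence changes that leave fragment sizes unaltered.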

The circular YACs 5f-52-E8 and 5f-38-F2 were further retrofitted with the yeast-bacterial-mammalian cell shuttle vector BRV1 as previously described (Larionov et al, 1997). The resulting BAC clones were designated BAC/E8-1 and BAC/F2-14, respectively (Fig. 11D).

The specific TAR strategy is outlined in Figure 12 and uses unique fragments from the HC-contig region, such as the ends of PAC4 (a 120 kb-insert PAC clone containing the HC-region), to create the YAC/HAC shuttle vector pTCN-TCS. An example of a YAC/HAC construct containing the HC-contig region of normal chromosome 10 is shown in Fig. 13.

Completed constructs are transfected into different cultured mammalian or chicken cells (see above) by lipofection using Transfectam or DOSPER.

In vivo "cloning" of HC-region into HAC vectors

This strategy employs a technique known as Telomere Associated Chromosomal Truncation (TACT) (Fig. 14). The technique is based on the principle that cloned mammalian telomeric DNA, when reintroduced into a mammalian cell, can seed the formation of a new telomere at an intrachromosomal location. If the introduced telomeric DNA is targeted to a known site through homologous recombination, integration at that location and subsequent truncation of distal sequences on the original chromosomal arm can result (Brown et al, 1994; Farr et al, 1995). This technique is employed in our own study to truncate the mardel 10 chromosome on either side of the HC-contig/core centromeric DNA element to produce in vivo a stable HAC of minimal size.

Figure 15A shows an example of a TACT-construct used in our study. Key features of this construct are: (a) Cloning of the pericentric human genomic DNA in both orientations (+/-). This is necessary since we do not know the chromosomal orientation of this DNA. This DNA is used to target the human telomeric sequences to locations on either side of the HC-contig region on mardel 10. Genomic DNA is derived from several different sources including Y2C24, Y3C64, Y3C109, Y3C94, Y13C12, Y13C15, Y17C6, Y17C8. The resulting truncation derivatives produced using these genomic DNAs will vary in size accordingly. (b) The termini contain 2.4 kilobases of tandem repeat human telomeric DNA (htel). This DNA has been shown previously to act as a substrate for mammalian telomerase to allow seeding of a complete telomere tens of kilobases in length. (c) The hygromycin (Hyg) resistance gene allows for positive selection of mammalian cell lines containing construct sequences integrated into the genome. This is the initial screening procedure. In addition, some constructs contain the neomycin phosphotransferase gene (Neo) rather than Hyg. (d) The Herpes simplex thymidine kinase (TK) gene is used for negative selection against non-homologous integration events into the genome. Those cell lines containing the TK gene can be selected against by adding the nucleoside analogue ganciclovir.

Figure 15B shows another example of a TACT-construct used in our study. In addition to the features of the linearised construct shown in Fig. 15A, specific additional features are: (a) The incorporation of tandem telomeric blocks (htel.htel) since others have shown these to have the highest seeding efficiency of new telomeres in mammalian cells. (b) The incorporation of a yeast selectable marker (e.g. URA3), DNA origin of replication (e.g. ARS), and centromere (e.g. CEN6), to allow transfer and maintenance of the resulting truncation derivatives in yeast. This should facilitate further characterisation and manipulation, such as the introduction of therapeutic genes for gene therapy purposes. (c) The relocation of the TK gene adjacent to the genomic DNA to increase the effectiveness of the negative selection system. (d) The human growth hormone (GH) gene has been included to allow proof of principle that human genes can be introduced into a HAC and expressed under the control of endogenous regulatory elements. This is essential for gene therapy applications of the resulting HAC. (e) A CMV promoter upstream of a P1 phage loxP site (CMV/loxP) has been included to allow introduction of large human genes into a HAC in vivo. A plasmid containing a gene of interest, a second loxP site and a promoterless selectable marker gene is introduced into a mammalian cell line containing the HAC. Transient expression of CRE recombinase results in recombination between the two loxP sites within the cell, thereby integrating the introduced plasmid into the HAC and placing the selectable marker gene next to the CMV promoter to allow for marker selection.

For chromosomal truncation, the above TACT-constructs are transfected into a somatic cell hybrid line BE2C1-18-5f containing the mardel (10) chromosome. Positive selection is applied for hygromycin or Geneticin resistance, whereas negative selection is applied against the thymidine kinase gene. Resulting colonies are further screened with distal p' and q' DNA fragments to ascertain the presence or absence of the two mardel 10 chromosome arms. In addition to the BE2C1-18-5f cell line, a human/chicken somatic cell hybrid line (derived from the recombination-proficient DT40 chicken cell line; Dieken et al, 1996) containing the mardel (10) chromosome will also be generated and used.

EXAMPLE 16
Analysis of HAC

Irrespective of which of the approaches described above is used, the presence of a new product in a mammalian cell line as an extrachromosomal artificial chromosome will be assessed by fluorescence in situ hybridisation (FISH) analysis, and tested independently by extracting high molecular weight DNA and resolving it on a pulsed-field gel to determine whether it exists as a separate chromosomal entity. The stability of the construct through successive cell divisions, both in the presence and absence of drug-resistance selection, will be determined. The presence of the construct in all, or a high percentage, of the cells derived from the original transfectants indicates stability. Demonstration of this stability indicates the successful creation of a HAC.

EXAMPLE 17
Production of HAC

This example describes the use of the neocentromere as a source of centromeric DNA in the "bottom-up" approach to produce HACs in human cell culture. Bacterial artificial chromosomes (BACs) containing cloned neocentromeric DNA and a selectable marker were co-transfected with human telomeric DNA into human HT1080 cells to yield independent HACs that were single-copy and stable in the absence of selection. The properties of these HACs, and their potential utility as a new, improved vector system for gene therapy are described.

EXPERIMENTAL PROTOCOL

Preparation of DNA. Highly-purified BAC DNA was prepared using Qiagen columns according to the manufacturer's instructions. Prior to transfection, BACs were linearized with SgrAI in the presence of 2.5 mM spermidine and examined by pulsed-field gel electrophoresis. Human telomeric DNA was gel-purified as a 1.6-kb BamHI/BglII fragment from pSXneo270T2AG3 (Bianchi et al, 1997). High-molecular-weight genomic DNA was prepared from cultured cell lines using standard methods (du Sart et al, 1997).

Transfection of HT1080 cells. Transfection of human fibrosarcoma cell line HT1080 (Rasheed et al, 1974) was performed using the DOSPER liposomal transfection reagent (Boehringer-Mannheim). The day before transfection, 6-well trays (each well is 962 mm^2) were seeded with 3 x 10^5 HT1080 cells per well and grown at 37°C, 5% CO2. Different combinations containing 1-2 μg of each BAC, 50 ng of telomeric DNA, 100 ng each of PAC-1, -4 and -5 (du Sart et al, 1997) and 50 ng of human genomic DNA were prepared in 50 μl of HBS (20 mM HEPES, 150 mM NaCl) supplemented with 0.075 mM spermidine and 0.030 mM spermine. These DNA cocktails were mixed with 50 μl of 0.4 μg/μl DOSPER (diluted in HBS) and left at room temperature for 15 to 20 min. The HT1080 cells were washed with PBS (phosphate buffered saline) and 1 ml of serum-free DMEM (Dulbecco's modified Eagle's medium) was placed in each well. The DNA-DOSPER mixture was then added dropwise with swirling and the cells were incubated for 6 h. 1 ml of DMEM with 20% v/v fetal calf serum (FCS) was then added and the cells left for 24 h at 37°C, 5% v/v CO2. The cells were harvested and seeded into 48-well cluster trays (each well is 100 mm^2) containing DMEM-10% v/v FCS supplemented with Geneticin (G418, Gibco-BRL) at 250 μg/ml. The medium was changed every 3 to 4 days. G418-resistant colonies normally appeared 10 to 14 days after transfection. These colonies were expanded into duplicate 6-well trays, where the cells of one tray were stored frozen in liquid N2, and the remaining cells were analysed by fluorescence in situ hybridization (FISH).
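
By way of illustration only, the quantities implied by the transfection protocol can be worked through as simple arithmetic. The well area, cell number, volumes and DOSPER concentration are taken from the text; the example DNA combination and the function names are hypothetical.

```python
WELL_AREA_MM2 = 962       # area of one well of a 6-well tray (from the text)
CELLS_PER_WELL = 3e5      # cells seeded per well (from the text)
DOSPER_UL = 50            # volume of diluted DOSPER mixed with the DNA (from the text)
DOSPER_UG_PER_UL = 0.4    # DOSPER concentration (from the text)

def seeding_density(cells: float, area_mm2: float) -> float:
    """Cells per square millimetre at the time of seeding."""
    return cells / area_mm2

def dosper_mass_ug(volume_ul: float, conc_ug_per_ul: float) -> float:
    """Mass of DOSPER (in micrograms) added to one DNA cocktail."""
    return volume_ul * conc_ug_per_ul

def reagent_to_dna_ratio(dosper_ug: float, dna_amounts_ug) -> float:
    """Mass ratio of DOSPER to total DNA in one cocktail."""
    return dosper_ug / sum(dna_amounts_ug)

# Hypothetical cocktail: 2 ug of one BAC plus 50 ng (0.05 ug) of telomeric DNA.
example_dna_ug = [2.0, 0.05]
dosper_ug = dosper_mass_ug(DOSPER_UL, DOSPER_UG_PER_UL)
print(round(seeding_density(CELLS_PER_WELL, WELL_AREA_MM2), 1), "cells per mm^2")
print(round(dosper_ug, 1), "ug DOSPER per well")
print(round(reagent_to_dna_ratio(dosper_ug, example_dna_ug), 1), "ug DOSPER per ug DNA")
```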

Cell culture and mitotic stability. HT1080 cells were grown in DMEM supplemented with 10% v/v FCS, penicillin/streptomycin, and glutamine. The mitotic stability of HAC-containing clones was determined by growth in 25 cm^2 flasks in the presence (200-250 μg/ml) or absence of G418 selection; cultures were grown to confluency (3-4 days) and split 1/5 and 1/10, respectively. Aliquots of each culture were harvested fortnightly and analysed by FISH (20-50 metaphases) with BAC/E8 and/or BAC/F2 probes.
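
By way of illustration only, fortnightly FISH counts of this kind can be summarised as a per-division loss rate. The specification does not state how stability is quantified; the sketch assumes a constant loss rate r per cell division, so that the fraction of HAC-positive metaphases after n divisions is F_n = F_0 (1 - r)^n. The metaphase counts and the number of divisions are hypothetical.

```python
def retention_fraction(hac_positive: int, metaphases_scored: int) -> float:
    """Fraction of scored metaphases that carry the HAC."""
    if metaphases_scored <= 0:
        raise ValueError("at least one metaphase must be scored")
    return hac_positive / metaphases_scored

def loss_rate_per_division(frac_start: float, frac_end: float, divisions: float) -> float:
    """Per-division loss rate r, assuming F_n = F_0 * (1 - r) ** n."""
    if not (0 < frac_start <= 1 and 0 < frac_end <= 1) or divisions <= 0:
        raise ValueError("fractions must lie in (0, 1] and divisions must be positive")
    return 1 - (frac_end / frac_start) ** (1 / divisions)

# Hypothetical data: 48 of 50 metaphases positive at the start and
# 43 of 50 positive after an estimated 40 cell divisions without selection.
f_start = retention_fraction(48, 50)
f_end = retention_fraction(43, 50)
print(round(loss_rate_per_division(f_start, f_end, 40), 4))
```

Because only 20-50 metaphases are scored per time point, an estimate obtained this way carries considerable sampling error and is best read as an order-of-magnitude figure.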

FISH, ANTI-CEN/FISH and PRINS FISH. Fluorescence in situ hybridization (FISH) analysis of HT1080 clones was performed with BAC/E8, BAC/F2, and/or α-satellite DNA probes. Hybridization using the BAC probes was performed under high stringency whereas the α-satellite DNA probes were used under low stringency conditions (du Sart et al, 1997). ANTI-CEN/FISH analyses involved an initial immunofluorescence staining step using a CREST antibody or specific antibodies against CENP-B, CENP-C, or CENP-E, followed by FISH using the probes described above, essentially as previously described (du Sart et al, 1997).

Results

HAC construction strategy. The basic strategy involved the co-transfection of the 10q25.2 neocentromere DNA with human telomeric DNA into human cells. The neocentromere region is cloned as two circular YACs in Saccharomyces cerevisiae. To facilitate handling and purification of the cloned DNA in large quantities, these YACs are retrofitted into BACs and maintained episomally in E. coli as circular molecules. One of the BAC clones, BAC/E8, is 120 kb in size and has an insert of 105 kb that encompasses 70 kb of the 80-kb core NC-DNA region (Fig. 16). The second BAC clone, BAC/F2, has an insert size of 75 kb that overlaps BAC/E8 by 1.4 kb, and contains ~10 kb of the core NC-DNA while extending ~65 kb into the p'-side of the mardel (10) chromosome (Fig. 16). The BAC vector backbone further contains the neomycin-resistance (NeoR) gene to allow selection in mammalian cells. BAC/E8 and BAC/F2, either on their own, in combination with each other, or with additional DNA, are used in the following transfection experiments.

Transfection of HT1080 cells. The human cell line HT1080 (Rasheed et al, 1974) is chosen for the transfection experiments because of its near-diploid karyotype, its high level of telomerase activity (Holt et al, 1997), and its demonstrated ability to form microchromosomes containing de novo centromeres from transfected arrays of α-satellite DNA and human telomeric DNA (Harrington et al, 1997; Ikeno et al, 1998). The resulting G418-resistant clones are analyzed by FISH and classified into different categories of events.

Transfected cell lines are designated HT-38, HT-47, HT-54, HT-190, and HT-191.

Those skilled in the art will appreciate that the invention described herein is susceptible to variation and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more said steps or features.

TABLE 1
Restriction analysis of the genomic DNA of patient BE and those of his parents using three overlapping cosmids that span the marker centromere.

Enzyme | Y7C14 | Y6C10 | Y4C7

NotI | n.a. | 910 | 910
BssHII | n.a. | 815, 340 | n.a.
BsiWI | n.a. | 740 | 740
SalI | 410 | 410 | 410, 540
ClaI | 315, 145, 110, 80 | 315, 145, 110, 80 | 315, 145, 110, 80
SnaBI | n.a. | 250, 148 | n.a.
NaeI | 240, 210, 155, 120 | 240, 210, 155, 120 | 240, 210, 155, 120
NarI | 222, 108, 70 | 222, 108 | 222, 200, 108, 70
EclXI | 180 | 180 | 180
SfiI | 170 | 170 | 170
KspI | 168 | 168 | 168
AatII | 165, 146 | 165, 146 | 165, 146
NheI | 38 | 38 | 38
BstBI | n.a. | 35 | 35
SmaI | n.a. | 90, 40, 22 | 90, 40, 22
BglI | 25 | 25, 7.2, 6.2 | 25
PacI | n.a. | 25 | n.a.
BamHI | 24, 19, 15 | 24, 22* | 24, 22*
KpnI | 23 | 23 | 23, 19
BclI | 21 | 21 | 21
PstI | 9.4, 5.9, 5.1, 4.2, 3.8, 3.3, 2.9, 2.4 | 9.4, 3.8, 2.9, 2.7, 2.4, 1.5, 1.1 | 9.4, 7.1, 4.2, 3.3, 2.9, 2.7, 1.9, 1.5, 1.1
XbaI | 14 | 14, 10 | 10
EaeI | n.a. | 15, 12, 8, 6 | n.a.
SphI | 16, 7.5 | 16 | 16, 9
PvuII | 14, 7.5 | 7.5, 6 | 7.5, 6
HindII | 8.6, 6.9, 6.2, 2.7, 1.8, 1.2 | 6.9, 6.2, 5.6, 5.2, 5, 2.7, 1.9, 1.8, 1.7, 1.2, 0.6 | 6.2, 5.6, 5.2, 4.3, 2.9, 1.7, 1.2
ApaI | 15, 8.5 | 15 | 15
EcoRI | 11, 4.3, 3.9, 1.9, 1.5 | 11, 4, 3, 2, 1.9, 1.7, 1.5 | 10.2, 7.6, 3, 2, 1.9, 1.7, 1.5
HpaII | 5.5, 4.3, 3.6, 1.6 | 6.9, 3.6, 2.8, 1.6, 1.2 | 3.6, 2.8, 2.5, 1.6, 1.2
MspI | 3.9, 3.0, 2.8, 2.5, 2, 1.6, 1.2 | 3.9, 3.6, 2.8, 2.5, 2.2, 1.6, 1.5, 1.3, 1.2, 0.9 | 3.6, 3.2, 2.8, 2.5, 2.2, 1.6, 1.5, 1.2, 1
SspI | n.a. | 10 | n.a.
XhoII | 7.5 | n.a. | n.a.
DraI | 7.5 | 7.5 | 7.5
BglII | 8.5, 6, 5, 4.7, 3.5, 2.5 | 6, 5, 4.7, 2.5, 1.6, 1.5, 1 | 7, 6, 5, 4.7, 2.5, 1.6, 1.5, 1.1, 1
AvaII | 7.4, 3.7, 3.4, 2.8, 2.6, 1.8, 1.7, 1.4, 1.2, 1.1 | 3.7, 2.8, 2.6, 1.8, 1.7, 1.4, 1.2, 1.1, 0.9, 0.8, 0.5 | 4.3, 3.7, 2.8, 2.6, 1.8, 1.7, 1.4, 1.2
StuI | 12.5, 8, 7.5 | 12.5, 9, 8.5 | 9, 8.5
HindIII | 6.6, 5.4, 4.7, 4.4, 2.9, 2.5 | 5, 4.7, 4.4, 4.1, 2.9, 2.5, 0.7 | 5, 4.7, 4.1, 3.1, 2.5, 2.3, 1.9

n.a. = data not available. The values represent restriction fragment lengths in kilobases. Multiple values for an enzyme denote different bands detected by a cosmid probe in a gel lane. Since there were no detectable differences between the DNA of patient BE and that of his parents in any of the fragments (except for a BamHI polymorphic band found in one of the parents, indicated by an asterisk), only one set of values is shown for all three genomic DNAs.
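Because the three cosmids overlap, some enzymes yield bands of the same size with more than one probe. The following Python sketch shows one way such shared bands could be listed from a few rows of Table 1. It is illustrative only: the data structure and function name are hypothetical, and it assumes that bands of equal size detected by two probes correspond to the same fragment, which the table alone does not prove.

```python
# A few rows of Table 1: enzyme -> fragment sizes (kb) per cosmid probe.
# None stands for "n.a." (data not available).
TABLE_1 = {
    "NotI": {"Y7C14": None, "Y6C10": [910], "Y4C7": [910]},
    "SalI": {"Y7C14": [410], "Y6C10": [410], "Y4C7": [410, 540]},
    "KpnI": {"Y7C14": [23], "Y6C10": [23], "Y4C7": [23, 19]},
    "XbaI": {"Y7C14": [14], "Y6C10": [14, 10], "Y4C7": [10]},
}


def shared_fragments(row, probe_a, probe_b):
    """Return band sizes seen with both probes, largest first.

    Returns None when data for either probe are not available.
    Equal sizes are treated as the same fragment (an assumption).
    """
    a, b = row[probe_a], row[probe_b]
    if a is None or b is None:
        return None
    return sorted(set(a) & set(b), reverse=True)


for enzyme, row in TABLE_1.items():
    for pair in (("Y7C14", "Y6C10"), ("Y6C10", "Y4C7"), ("Y7C14", "Y4C7")):
        print(enzyme, pair, shared_fragments(row, *pair))
```

For XbaI, for example, this lists the 14-kb band as shared by Y7C14 and Y6C10 and the 10-kb band as shared by Y6C10 and Y4C7, with no band common to Y7C14 and Y4C7.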

TABLE 2
Vectors for cloning centromeric regions from normal chromosome 10 or mardel (10) DNA into yeast artificial chromosomes (YACs). These YACs can be shuttled into mammalian cells to test for function as HACs.

Vector | Key feature(s) | Map

pJS97ARTi | hTEL/I-SceI/yTEL, DHFR | Fig. 8A
pJS98ANTi | hTEL/I-SceI/yTEL, neo | Fig. 8B
Fragmentation 1 | hTEL/I-SceI/yTEL, hyg | Fig. 8C
Fragmentation 2 (-/+ hGH) | hTEL/I-SceI/yTEL, neo, hGH | Fig. 8D
pVC39-AAH2 | ALU-ALU TAR vector | Fig. 8E
pTEL/CAT/TEL | hTEL/I-SceI hTEL/neo | Fig. 8F
pAAH TCNa | TAR vector with hTEL/I-SceI/hTEL/neo | Fig. 8G
pVC39-ALU/C3-F2(+/-) | ALU-specific TAR vectors | Fig. 8H
pTCS | ends of PAC4 in pBS | Fig. 8I
pTCN-TCS | specific TAR vector hTEL/I-SceI/hTEL/neo | Fig. 8J

TABLE 3
PCR analysis of YAC 5f-52-E8 clone and comparison with the HC-contig/neo-centromere region from normal chromosome 10 and mardel (10)

Genomic DNA used in PCR (product size in kb)
Primer pairs (a) | BE2Cl-18-1f (b) | BE2Cl-18-5f (b) | YAC 5f-52-E8

norm 141 + 55 | 1.80 | 1.80 | not present
norm 32 + 30 | 0.90 | 0.90 | 0.90
norm 28 + 29 | 1.00 | 1.00 | 1.00
norm 1 + 3 | 2.90 | 2.90 | 2.90
norm 39 + 52 | 1.20 | 1.20 | 1.20
norm 5 + 7 | 0.23 | 0.23 | 0.23
norm 16 + 5 | 3.50 | 3.50 | 3.50
norm 9 + 14 | 0.90 | 0.90 | 0.90
norm 36 + 37 | 2.00 | 2.00 | 2.00
norm 168 + 71 | 4.00 | 4.00 | 4.00
norm 27 + 10 | 15.90 | 15.90 | 15.90
norm 18 + 17 (VNTR) (c) | 1.20 | 1.40 | 1.40
norm 68 + 17 | 8.00 | 8.00 | 8.00
norm 34 + 47 | 3.00 | 3.00 | 3.00
PAC4t7 a + b | 0.30 | 0.30 | not present
AFM259xg5: ca + gt (c) | 0.21 | 0.19 | not present

(a) Refer to Fig. 1a for the relative positions of each primer pair.
(b) BE2Cl-18-1f and BE2Cl-18-5f are somatic hybrid cell lines containing the normal human chromosome 10 and mardel (10), respectively (2).
(c) The 'norm 18 + 17' and 'AFM259xg5: ca and gt' primer sets allow distinction between the normal human chromosome 10 and mardel (10) by detecting a VNTR and a microsatellite, respectively.
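The comparison made in Table 3 can be expressed as a small classification rule: a primer pair is informative only when the two hybrid cell lines give different product sizes, and the YAC product is then matched against each. The Python sketch below illustrates this for a subset of rows. The names are hypothetical, product sizes are read as kilobases with two decimal places, and exact equality of the tabulated sizes is assumed to mean the same allele.

```python
# Subset of Table 3. Each tuple holds product sizes in kb for:
# (normal chromosome 10 hybrid, mardel (10) hybrid, YAC 5f-52-E8).
# None stands for "not present".
PCR_RESULTS = {
    "norm 141 + 55": (1.80, 1.80, None),
    "norm 32 + 30": (0.90, 0.90, 0.90),
    "norm 18 + 17 (VNTR)": (1.20, 1.40, 1.40),
    "AFM259xg5: ca + gt": (0.21, 0.19, None),
}


def classify(normal, mardel, yac):
    """Describe how the YAC product relates to the two hybrid cell lines."""
    if yac is None:
        return "absent from YAC"
    if normal == mardel:
        return "present, not informative"
    if yac == mardel:
        return "matches mardel (10)"
    if yac == normal:
        return "matches normal chromosome 10"
    return "matches neither"


for primers, sizes in PCR_RESULTS.items():
    print(f"{primers}: {classify(*sizes)}")
```

With the tabulated values, the VNTR primer pair is the only informative marker present in the YAC in this subset, and its 1.40-kb product matches the mardel (10) hybrid.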

BIBLIOGRAPHY:

1. Albertsen, H., Abderrahim, H., Cann, H., Dausset, J., Le Paslier, D., and Cohen, D. (1990). Construction and characterization of a yeast artificial chromosome library containing seven haploid human genome equivalents. Proc. Natl. Acad. Sci. USA 87, 4256-4260.

2. Archidiacono, N., Antonacci, R., Forabosco, A., and Rocchi, M. (1994). Preparation of human chromosomal painting probes from somatic cell hybrids. In In Situ Hybridization Protocols. K. H. A. Choo, ed. (Totowa, New Jersey: Humana Press), pp. 1-14.
3. Bernat, R. L., Borisy, G. G., Rothfield, N. F., and Earnshaw, W. C. (1990). Injection of anticentromere antibodies in interphase disrupts events required for chromosome movement in mitosis. J. Cell Biol. 111, 1519-1533.
4. Bischoff, F., Maier, G., Tilz, G., and Ponstingl, H. (1990). A 47-kDa human nuclear protein recognized by antikinetochore autoimmune sera is homologous with the protein encoded by RCC1, a gene implicated in onset of chromosome condensation. Proc. Natl. Acad. Sci. 87, 8617-8621.
5. Brenner, S., Pepper, D., Berns, M. W., Tan, E., and Brinkley, B. R. (1981). Kinetochore structure, duplication and distribution in mammalian cells: analysis by human autoantibodies from scleroderma patients. J. Cell Biol. 91, 95-102.
6. Brown, K. E., Barnett, M. A., Burgtorf, C., Shaw, P., Buckle, V. J., and Brown, W. R. A. (1994). Dissecting the centromere of the human Y chromosome with cloned telomeric DNA. Hum. Mol. Genet. 3, 1227-1237.
7. Brownstein, B., Silverman, G., Little, R., Burke, D., Korsmeyer, S., Schlessinger, D., and Olson, M. (1989). Isolation of single-copy human genes from a library of yeast artificial chromosome clones. Science 244, 1348-1351.
8. Clarke, L., and Carbon, J. (1985). The structure and function of yeast centromeres. Annu. Rev. Genet. 19, 29-56.
9. Dasso, M. (1993). RCC1 in the cell cycle: the regulator of chromosome condensation takes on new roles. Trends Biochem. Sci. 18, 96-101.
10. Dieken et al (1996) Nature Genetics 12: 174-182.
11. du Sart, D., Cancilla, M.R., Earle, E., Mao, J., Saffery, R., Tainton, K.M., Kalitsis, P., Martyn, J., Barry, A.E., and Choo, K.H.A. (1997). A functional neo-centromere formed through activation of a latent human centromere and consisting of non-alpha-satellite DNA. Nature Genet. 16, 144-153.

12. du Sart, D., Cancilla, M.R., Earle, E., Mao, J., Saffery, R., Tainton, K.M., Kalitsis, P., Martyn, J., Barry, A.E., and Choo, K.H.A. 1997. A functional neo-centromere formed through activation of a latent human centromere and consisting of non-alpha-satellite DNA. Nature Genetics 16: 144-153.
13. Harrington, J.J., Van Bokkelen, G., Mays, R.W., Gustashaw, K., and Willard, H.F. 1997. Formation of de novo centromeres and construction of first-generation human artificial microchromosomes. Nature Genetics 15:345-355.
14. Holt, S.E., Aisner, D.L., Shay, J.W., and Wright, W.E. 1997. Lack of cell cycle regulation of telomerase activity in human cells. Proc. Natl. Acad. Sci. USA 94: 10687-10692.
15. Ikeno, M., Grimes, B., Okazaki, T., Nakano, M., Saitoh, K., Hoshino, H., McGill, N.I., Cooke, H., and Masumoto, H. 1998. Construction of YAC-based mammalian artificial chromosomes. Nature Biotechnology 16:(in press).
16. Earnshaw, W., and MacKay, A. (1994). Role of nonhistone proteins in the chromosomal events of mitosis. FASEB J. 8, 947-956.
17. Earnshaw, W. C., and Migeon, B. R. (1985). Three related centromere proteins are absent from the inactive centromere of a stable isodicentric chromosome. Chromosoma 92, 290-296.
18. Earnshaw, W. C., Ratrie, H., and Stetten, G. (1989). Visualization of centromere proteins CENP-B and CENP-C on a stable dicentric chromosome in cytological spreads. Chromosoma 98, 1-12.
19. Farr, C., Bayne, R., Kipling, D., Mills, W., Critcher, R., and Cooke, H. (1995). Generation of a human X-derived minichromosome using telomere-associated chromosome fragmentation. EMBO Journal 14, 5444-5454.
20. Fritzler, M. J., and Kinsella, T. D. (1980). The CREST syndrome: a distinct serologic entity with anticentromere antibodies. Am. J. Med. 69, 520-526.
21. Grady, D., Ratliff, R., Robinson, D., McCanlies, E., Meyne, J., and Moyzis, R. (1992). Highly conserved repetitive DNA sequences are present at human centromeres. Proc. Natl. Acad. Sci. USA 89, 1695-9.
22. Haaf, T., and Ward, D. C. (1994). Structural analysis of α-satellite DNA and centromere proteins using extended chromatin and chromosomes. Hum. Mol. Genet. 3, 697-709.

23. Haaf, T., Warburton, P. E., and Willard, H. F. (1992). Integration of human α-satellite DNA into simian chromosomes: centromere protein binding and disruption of normal chromosome segregation. Cell 70, 681-696.
24. Jeppesen, P., Mitchell, A., Turner, B., and Perry, P. (1992). Antibodies to defined histone epitopes reveal variations in chromatin conformation and underacetylation of centric heterochromatin in human metaphase chromosomes. Chromosoma 101, 322-332.

25. Jeppesen, P., and Turner, B. M. (1993). The inactive X chromosome in female mammals is distinguished by a lack of histone H4 acetylation, a cytogenetic marker for gene expression. Cell 74, 281-289.
26. Kingwell, B., and Rattner, J. (1987). Mammalian kinetochore/centromere composition: A 50 kDa antigen is present in the mammalian kinetochore/centromere. Chromosoma 95, 403-407.
27. Larin, Z., Fricker, M. D., and Tyler-Smith, C. (1994). De novo formation of several features of a centromere following introduction of a Y alphoid YAC into mammalian cells. Hum. Mol. Genet. 3, 689-695.
28. Larionov, V. et al. (1997) Proc. Natl Acad. Sci. USA 94: 7384-7387.
29. Larionov, V., Kouprina, N., Graves, J., Chen, X. N., Korenberg, J. R., and Resnick, M. A. (1996a). Specific cloning of human DNA as yeast artificial chromosomes by transformation-associated recombination. Proc. Nat. Acad. Sci. USA 93, 491-496.

30. Larionov, V., Kouprina, N., Graves, J., and Resnick, M. A. (1996b). Highly selective isolation of human DNAs from rodent-human hybrid cells as circular yeast artificial chromosomes by transformation-associated recombination cloning. Proc. Nat. Acad. Sci. USA 93, 13925-13930.
31. Moir, D. T., Dorman, T. E., Day, J. C., Ma, N. S., Wang, M., and Mao, J. (1994). Toward a physical map of human chromosome 10: isolation of 183 YACs representing 80 loci and regional assignment of 94 YACs by fluorescence in situ hybridization. Genomics 22, 1-12.
32. Moroi, Y., Hartman, A. L., Nakane, P. K., and Tan, E. M. (1981). Distribution of kinetochore (centromere) antigen in mammalian cell nuclei. J. Cell Biol. 90, 254-259.

33. Moschonas, N. K., Spurr, N. K., and Mao, J. (1996). Report of the first international workshop on human chromosome 10 mapping 1995. Cytogenet. Cell Genet. 72: 99-112.

34. Murphy, T. D., and Karpen, G. H. (1995). Localization of centromere function in a Drosophila minichromosome. Cell 82, 599-609.
35. Nelson, M., and McClelland, M. (1991). Site-specific methylation: effect on DNA modification methyltransferases and restriction endonucleases. Nucl. Acids Res. 19: 2045-2071.
36. Page, S. L., Earnshaw, W. C., Choo, K. H. A., and Shaffer, L. G. (1995). Further evidence that CENP-C is a necessary component of active centromeres: studies of a dic(X;15) with simultaneous immunofluorescence and FISH. Hum. Mol. Genet. 4, 289-294.
37. Pluta, A. F., Cooke, C. A., and Earnshaw, W. C. (1990). Structure of the human centromere at metaphase. Trends Biochem. Sci. 15, 181-185.
38. Pluta, A. F., Mackay, A. M., Ainsztein, A. M., Goldberg, I. G., and Earnshaw, W. C. (1995). The centromere: hub of chromosomal activities. Science 270, 1591-1594.

39. Rasheed, S., Nelson-Rees, W.A., Toth, E.M., Arnstein, P., and Gardner, M.B. (1974). Characterisation of a newly derived human sarcoma line (HT1080). Cancer 33, 1027-1033.
40. Sikorski, R.S. and Hieter, P. (1989). A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics 122, 19-27.
41. Steiner, N., Hahnenberger, K., and Clarke, L. (1993). Centromeres of the fission yeast Schizosaccharomyces pombe are highly variable genetic loci. Mol. Cell Biol. 13, 4578-4587.
42. Sullivan, B.A. and Schwartz, S. (1995). Identification of centromeric antigens in dicentric Robertsonian translocations: CENP-C and CENP-E are necessary components of functional centromeres. Hum. Mol. Genet. 4, 2189-2197.
43. Sullivan, K. F., Hechenberger, M., and Masri, K. (1994). Human CENP-A contains a histone H3 related histone fold domain that is required for targeting to the centromere. J. Cell Biol. 127, 581-592.
44. Taylor, S.S., Larin, Z., and Tyler-Smith, C. (1994) Addition of functional human telomeres to YACs. Human Mol Genet 3, 1383-1386.
45. Taylor, S.S., Larin, Z., and Tyler-Smith, C. (1996) Analysis of extrachromosomal structures containing human centromeric alphoid satellite DNA sequences in mouse cells. Chromosoma 105, 70-81.
46. Tomkiel, J., Cooke, C. A., Saitoh, H., Bernat, R. L., and Earnshaw, W. C. (1994). CENP-C is required for maintaining proper kinetochore size and for a timely transition to anaphase. J. Cell Biol. 125, 531-545.
47. Trowell, H. E., Nagy, A., Vissel, B., and Choo, K. H. A. (1993). Long-range analyses of the centromeric regions of human chromosomes 13, 14 and 21: identification of a narrow domain containing two key centromeric DNA elements. Hum. Mol. Genet. 2, 1639-1649.
48. Tyler-Smith, C., Oakey, R. J., Larin, Z., Fisher, R. B., Crocker, M., Affara, N. A., Ferguson-Smith, M.A., Muenke, M., Orsetta, Z., and Jobling, M. A. (1993). Localization of DNA sequences required for human centromere function through an analysis of rearranged Y chromosomes. Nature Genet. 5, 368-375.
49. Voullaire, L. E., Slater, H. R., Petrovic, V., and Choo, K. H. A. (1993). A functional marker centromere with no detectable alpha-satellite, satellite III, or CENP-B protein: activation of a latent centromere. Am. J. Hum. Genet. 52, 1153-1163.
50. Wevrick, R., and Willard, H. F. (1989). Long-range organization of tandem arrays of alpha-satellite DNA at the centromeres of human chromosomes: high-frequency array-length polymorphism and meiotic stability. Proc. Natl. Acad. Sci. USA 86, 9394-9398.

51. Wevrick, R., and Willard, H. F. (1991). Physical map of the centromeric region of human chromosome 7: relationship between two distinct alpha satellite arrays. Nucl. Acids Res. 19, 2295-2301.
52. Zheng, C., Ma, N. S., Dorman, T. E., Wang, M., Braunschweiger, K., Soares, L., Schuster, M. K., Rothschild, C. B., Bowden, D. W., Torrey, D., Keith, T. P., Moir, D. T., and Mao, J. (1994). Development of 124 sequence-tagged sites and cytogenetic localization of 217 cosmids for human chromosome 10. Genomics 22, 55-67.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: (US ONLY) CHOO, Kong-Hong Andy, DU SART, Desiree and
CANCILLA, Michael Robert
(OTHER THAN US) AMRAD OPERATIONS PTY LTD

(ii) TITLE OF INVENTION: A NOVEL NUCLEIC ACID MOLECULE

(iii) NUMBER OF SEQUENCES: 29

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: DAVIES COLLISON CAVE
(B) STREET: 1 LITTLE COLLINS STREET
(C) CITY: MELBOURNE
(D) STATE: VICTORIA
(E) COUNTRY: AUSTRALIA
(F) ZIP: 3000

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 13-MAY-1998

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: PO6784
(B) FILING DATE: 13-MAY-1998

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: PO8791
(B) FILING DATE: 26-AUG-1997

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: HUGHES DR, E JOHN L
(C) REFERENCE/DOCKET NUMBER: EJH AF

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: +61 3 9254 2777
(B) TELEFAX: +61 3 9254 2770

(2) INFORMATION FOR SEQ ID NO : 1 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 :

GGATTACAGG (C/T) (A/G)TGAGCCA 19

(2) INFORMATION FOR SEQ ID NO : 2 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 :

(A/G)CCA(C/T)TGCAC TGCAGCCTG 19
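SEQ ID NO: 1 and SEQ ID NO: 2 each contain two positions written as alternatives, such as (C/T). The Python sketch below shows one way to read that notation: it converts each listing to a single string using the standard IUPAC ambiguity codes (Y for C/T, R for A/G) and enumerates the individual oligonucleotides the listing represents. The helper names are hypothetical, and the parser assumes only the two-base "(X/Y)" form used in these two entries.

```python
import itertools
import re

# IUPAC ambiguity codes for the two base pairs used in the listings.
IUPAC = {frozenset("CT"): "Y", frozenset("AG"): "R"}

SEQ_ID_1 = "GGATTACAGG (C/T) (A/G)TGAGCCA"
SEQ_ID_2 = "(A/G)CCA(C/T)TGCAC TGCAGCCTG"


def tokenize(listing):
    """Split a listing into one tuple of allowed bases per position."""
    compact = listing.replace(" ", "")
    tokens = re.findall(r"\(([ACGT])/([ACGT])\)|([ACGT])", compact)
    return [(a, b) if a else (single,) for a, b, single in tokens]


def to_iupac(listing):
    """Write the listing as one string with IUPAC ambiguity codes."""
    return "".join(
        choices[0] if len(choices) == 1 else IUPAC[frozenset(choices)]
        for choices in tokenize(listing)
    )


def expand(listing):
    """Enumerate every unambiguous sequence covered by the listing."""
    return ["".join(bases) for bases in itertools.product(*tokenize(listing))]


for name, listing in (("SEQ ID NO: 1", SEQ_ID_1), ("SEQ ID NO: 2", SEQ_ID_2)):
    print(name, to_iupac(listing), len(to_iupac(listing)), expand(listing))
```

Each listing resolves to 19 positions, in agreement with the stated LENGTH of 19 base pairs, and to four individual sequences.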

(2) INFORMATION FOR SEQ ID NO :3a:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40917 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO :3a:

GAATTCTCCT GCCTCAGCCT CCCAAGTAGC TGAGGTTACA GGTGCCAGCC ACCACGTCCA 60

GCTAATTTTT GTATTTTAGT AGAGACGGGG TTTCACCGTG TTTGCCAGGC TGGTATCAAA 120

CTCCTGACCT CAAGTGATCT GCCTGCCTCA GCCTCCCAAA ATGCTAGGAT TACAGGTGTG 180

AGTCACCGCA CCCAGCCCTT CTTTCAGTTC TATCACCTCT TTTTGCTATA TTTGTATGAG 240

AGCTTTATTA TTAGGGGCAC ATACATTTAA AATTGTTATG TCTTATTGAT AGATTGATCT 300

GTCATTATGA ATGTCTGTAT TCATTCCCTG ATAGTATTTC TTTTTCTAAA TATTTTTCTG 360

AATGTGTCTG CTATTAACAT AGCCACTCTG GCTTTTTAAA ATTAGTATTT TTATGGTATA 420

TATTTTTCCT TTTTTTTTTT TTTAAGTTTT AGATGTTATG TTTCCTTATA CTTAAAGTGG 480

GTGTCTTATA GGCAGCATAT ATCTGGGTCT TGATGTATTA TTTAATCTGA TAATCTCAAC 540

CTTTTTGTTG GAGTGTTTAG GCCATTTACA TTTAGTGTAA TTATAGACAT GGTTTGATTT 600

GCTATACCAT CTTTTCATTT GTTTTATATG TGAGCCATCT TTTCATTGTT CTTTTTTCAT 660

CTTTGACCAT TTTCTTTAGT ACTGAATACT TTTTTTGTAT TTCATTATAT CTATTGGCTT 720

TTTAGTTATA CCTCTTAAAA TTTTTTTTTC TGTTTTATGT AGGATTTATA ATATACATCT 780

TTAACTTATC ACAGATTACC TTCAAATAGT ATTTTACCAG CTCAAGTGTA ATGTAGAAAC 840

CTTACAAGAG TATATTTTCA TTTCTGTCTC CTAATTTTTA TGCTATGTCT ATAATACATT 900

AGGTTTGTTG TTGTTTGTTT TTACCTTATT GCTGTTGGCT GGGGTCAGCA AACATTTTCT 960

GTAAAGGGCT AGATAGTACA GGCATACCTT GGAGATACTG TGGGTTTGGT TCCATACCAC 1020

CACAATAATA CAAATATGCA AGAAGTGGAT ATCACAATAA AGTGAGTCAC ACAAGTCTTT 1080

TGGCTTCCCA GTGCATATAA AAGTTTTGCT TATACTACAC TGTAGTCTGT TAAGTGTGCA 1140

ATAGTGTTAT GTCTAAAAAA ACACATACCT TAATTTTAAA ATGCTTTATT ACTAAAAAAT 1200

GCTAACAATC ATTTGAGCAT TCAGTGAGTT GTAATCTTTT TGCTGGTGGA AGGTCTTTTC 1260

TTATTGATGA CTGATCGGGG GTCAGGTGCT GAAGCTTAGG GTGGCTGTGG CAGTTTCTTA 1320

AAACAACAGT GAAGATTGCA ATATCAGTTG ACTCTTCCTT TCATGAAAGA TTTCTCTCTA 1380

GTGTGTGATG CTTTTTGATA GCATTTTATG CACAGTAGAA CTTCTTTGAA AATTGGATCA 1440

ATCCTCTCAA ACCCTGCTCT GCTTTAACAA CCTAAGTTAA TATAATATTC TGAATCCATT 1500

GTTGTCATTT CAACAATTTT CACAGTGTCT TCACCAGGAG TAGATTCCAT CTCATTTCCT 1560

GAGATGGAAT CTTTGCTCAT CCATAAGAAG AAATTCCTCA TCTGTTCAAG TTTTATCATG 1620

AGATTGCAGC AATACAGTCA TGTCTTCAGG CCTCACTTCA CTTTTAATTC CAGTTCTCTT 1680

GCTGTTTCTA CCACATCTGT GGTTCCTTCC TCCATTGAAG TCTTGAACCT CTCCAAGTCA 1740

TCCATGAGGG TTGGAATCGA CTTCTTCCAA ATTCCTGTTA ATATTTATAT TTTGACCTCC 1800

CATGAATCAT GAATGTTCTT AATGGCACCT GGAATGGTGA ATCCTTTCCA AAAGGTTTTC 1860

AATTTACTTA GTCCAGATCC ATCCATCCAG AGGATCCACT TTCAATGCCA GTTATAGCCT 1920

TATGGAATGT ATTTCTTCAA TAATAAGGCT TGAAAGTTGA AATTACTCCT TGATCCATTT 1980

TCTGCAAAAT AGATGTTGTG TTAGCAGGCA TGAAAGCAAC ATTAATCTTT TTGTACATGT 2040

CCATCAGAGC TCTTGGGTGA CCAGGTATAT TGCCAGTGAG CAGTAATACT TTGAAAGGAA 2100

TTATTTTTCT TAGCAGTAGG TCTCAACAAT GGGCTTAAAA TATTTGGTCC ACCATTCTGT 2160

AAACTGATGT GCTGTCATCT AAACTTTGTA GTTTCATTTA TAGAGCACAG GCAGAGTAGA 2220

TGTAGCATAA TTCTTAAGGG AC TAGGATT TTCAGAATGG TAAATGAACA TTGGCATCAA 2280

TTTAAATCAC TAGCTGTATT AGCCCCCAAC AAGAGAGTCA GCCTATTTTT TGAAGCTTTG 2340

AAGCCAAGCG TCGACTTCTC CTCCCTGGTT ACAAAAGTCC TAAATGGCAT CTTCTTCCAA 2400

TATAAGGCTG TTTTATCTAC ATTGAAAATC TGTTGTTTAG TGTAGCCACC TTCATCAATG 2460

ATACTATCTA GATCTCTTGG ATAACTTGTG CAGCTTCTAC ATCAGCATTT GCTACTTCAC 2520

CTTGTACTCT TATGTAATGG AGTGGCATCT TTCCTCGTAC CTCATGAACC AACCTCTGCT 2580

AGCTTCCAAC TTTTCTTCTG TAGTTTCCTC GCCTCTCTCA GCCTTCATAG ACTTGAGGAT 2640

AGTTAGAGAC TTGCTTTGGA TTAGATTTTG GCTTCAGGAA ATGTTGTGGC TGGTTTGATC 2700

TTCTATCCAG ACCACTAAAA CTTTATCCAT ATCAGCAATA AGGCTGTTTT GCTTTCTTAT 2760

TATTTGTGTG TTCACTGGAG TAGCACTTTT AATTTGCTTC AAGATATATT TCTTTGCATT 2820

CACAACTTGG CTGACTGGTG CAAGAGGCCT AGCTTTCAGA CTATCTTGGC TTTTGACATG 2880

CCTTCCTCAC TAAGCTTAAT CATTTCTAGC TTTTGATTTA AAATGAGAGA TGTAGGCCAG 2940

GCACAGTGGC AGGCACAGTG GCATATGCCT GTAATTCCAA CACATTAAGA GGCCAAGGTG 3000

GGAGGATTGC TTGAACCCAG GAGGTGGAGG TTGTAGAGAT CACACCACTG CATTCCGTCC 3060

TGGATGACAG AGCAAGACCT TTCTCAAAAT AAAATGAGAG GTGTGCTTCT TCTTTTTGTT 3120

TGAGCCCATA GAAGCCATAG TATGATTTTT AATTGGCCTA ATTTCAATAC TGTTGTGTCT 3180

CAGAGAATAG GGAGGTCTGA AGAGAGGGAG AGAGGTGGGG GAATGGCTGG TCAGTGGAGC 3240

AGTCAGAACA CACATAACAC TAATAAATTG TTTGCTGTCT TATATGGATG TGGTTTGTGA 3300

TGCCCCCAAA CAATTACAAT AGTTACAGCA AATATCACTG ATCACAGATC ACCATAACAG 3360

ATATAAGAAT CATGGCAAAG TTTGAAATAT TCTTGAGAAT TAGCAAAGTG TGACACAGAG 3420

AAACAAAGTG AGCACATGCC GTTGGAAAAA ATTGGTGTTG ATAGACTTGC TCCATCGCAA 3480

GTTTGCCATA CGCCTTCAAT TTATAAAAAA CACAATATCT AGGAAGTTCA ATAAAGTGAA 3540

GTGCAATAAG ATGAAGTATG CCTGTAAATA TTTCAGGCTT TCCAGACCAT AGGGTTTCTG 3600

TTGCAACTGC TCACCTCTGC CATTATAGCA TGAAAGCAGC TATAGAAAAT ATACATAAAT 3660

GAGGCCTGTA ATCCCAACAC TTTGGGAGCC CAAGGTGGAT GGATCACTTG AGGTCAGGAA 3720

TTCGAGACCA GCTTGGCCAA CATGGCAAAA CCCCGTCTCT ACTAAAAATA CAAAAATGAG 3780

CCAGGACTAC GCATGCCTGT AGTCCCAGCT ACTTGGGAGG CTGAGGCAGG AGAATCTCTT 3840

GAACCCGGGA AGGGGAGGTT ACAGTGAGCC AAGATTGTGC CACTGCACTC CAGCCTGGGC 3900

AACAGAGTGA GACTGTCTCA CAAAAAAAAA AAAAGGAAAA GAAAATACAC ATAAATGAAT 3960

GTATGTGGCT GTGTACCAGT ATATCCTCAT GCTCTAGCTT GCCAACCCTT GCTTTACACT 4020

GTCAGTTACC TTCTAAAGAG ATTAAAAATC ATAACAATAT CTATTACGTT TATTCACATC 4080

CTAGTGTCAT TTCTTCCTTA TGTAGAATCA AATTTCATTC TGGTATCATA TTTCTTCTTT 4140

CTAAATAATT TCCTTTAATA TTTTTTATAG CACAGGTCTA ATAGCAATGC ATTATGCAAT 4200

TCATTGCTAT TAGACCTGTG CTATAAAATA GCAATGAATT ATGTCAGTTT TTATTTGTCT 4260

GAAAAAGTTT TTTGTTTTTG AAATATACTT TTGCTGGGTA TATAAATCCA TGTTGCATAA 4320

CTTCTCTTTT CTTCAGCACT TTAATGAAGT CACTCAGTTA TCTTCTGGCT TGTATAGTTT 4380

CTCTGGCTGC CTTCAAGATT TTTTCATTGT CTTTAATTTT TAGCAGTTTG ATGTGTCTAG 4440

GAGTGATTTT CTTTGTATTT ATCCTTTTGG GGGCCTCTTA ATTTCTTTGA TCCTTTTTTT 4500

CTTTTTTTTT TTTTTTTAAT CAGTTTTGGT CTGTCTCCTC AAGTGGGCTG AAAAAAAAAG 4560

AAAAATAAAA TCATAGTTTA AAAAACTAAT TTTGGAAAAT TTTCAGCTAT CATTTCTTCA 4620

AATATTTATC CTACTCTATG CTCCCCTCCT CCCCTTTCCT TCTGTGACTC AAATTACAGG 4680

TATATTTAAC CATTTTATTT GTTCACGGCA CTTGGATGCT CTGCTTTCTT ATTTTTTGTC 4740

TTTCATTTTG GATAATTTCT ACTGACCTAT CTTCAAGTTC ACTGATTCTT TTCTCAGTCA 4800

TATCTAGTGT GCTCAACGCC TGTTGAAGAA ATCCTTTGTC TTTAATATCA TGTTTTTTAT 4860

TTCTAGCATT TTCATGTAAC TCTTTGTTCT GGTTTCCATC TCTCTACTCA CTTTTTTTTT 4920

TTTTTTTTTT TTTTTTTGAG ACAGAGTCTC GCTCTGTCAC CCAGGCTGGA GTGTAGTGGC 4980

GCGATCTCGG CTCACTGCAA CTTCCGTCCC CTGGGTTCAA GTGATTCTCC TGCCTCATCC 5040

TCCCGAGTAG TTGGAATTAC AGGTGCCCAC CACCGTGGCT GGCTAATTTT TGTATTTTTT 5100

TAGTGGAAAC AGGGTTTCAC CATGTTGGCC AGGCTGGTCT TGAATTCCTG ACCTCAGGTG 5160

ATCCACCTGC CTCAGCCTCC CCAATTGCTG AAATTACTGG CATGAGGCAC TGCACCCAGC 5220

TCTGCTGACA TTTTTTATCT TTTGCTGCAT TTTGTCTACC TTTTCCATGA AATCCTTTAA 5280

CATAGTAGTC ATAGTTACTT TCAATTCCTT GTCTGACAGT TCTGACATTC AAGTCTAGGT 5340

CTGTTAATAG CTTTGTGAGT CTGTTAACAG CTTTTTTTCA TTCTTGTCTG TGTGTTTTGT 5400

ATTTCTTGAT TGTATGCCAA ATATTGCCTG TAAAATAAAC TTAGATAAGT CATACTTCTA 5460

TCCAGAAATA GGCACATTTT TTGTGTCCAG TCAT AGTGT GGAGGGAGGT TGGGGCAGTC 5520

TAGTCAGTGG CTGAACTAGG TTTGGATTTG TTGATGCTAT ACTTAGAATG CACCAGACTT 5580

CCATTCACTG CAAGAGTGGG CTGCTGCGCT TTGTGATTCA TGTGAGGCCT GAATTGTGGG 5640

TTTTTCCTTA GTGTGTCCCT CCATGCTCAG ATTTCAGCAA GTCTTCATAT CTGTGCCACA 5700

GAAGGAATCT GACCCATGCT CTTTTTGACC TCCCCAAGTG ATCAACTGTT GCTTGTTATA 5760

GCTTGTCATG GAGTAAGAGG GTGTTTTTTT AGTTTTCATC CTCCAGCCTT GGTCTTGGGC 5820

CCTGAGCTCC TAGACTCCAG GAGTGGATGG AATCCAGTGA TTTCTCAGTA ATTCAGCCCC 5880

TTCTCCAGTA GTGGCAGATC TCTGCTTTGT ATCAGTGCAA GATCCTGGGC TGAGCTCATT 5940

TTCTGCCCTT CCTCGAGTGG CAGACAGCTC TTGCTTTCAC CCTTCTACCA AAGGCAGTGC 6000

ATCTTTTCTT GGGCCTCTCC CCATTGAACT TATGACTTTC ACATAAGAGA AGGGCTCATG 6060

TATCAGAGAA TTCTGTGACT TTGTGCCACA TACAGAGTCT CTCAGTTCTC TTGCCCTGCC 6120

CCAGTCTTTT TTGTGAGCAC CTAGTAGAGA CCCTTGGAGA AGAGCAAGGA AGCGAGTATG 6180

GACTTCTTTT GTGTCTGTCG ATTGCTTTGT TTCTCAACTG CTACTCTTGG ACTTTAAGAA 6240

TTCATTAAAA TTTCAGCTGT TTTCTTTTAT TCTTTTTGTT TTTCTTTTTT TTTTTTTTTT 6300

TTTTTAGATG GAGTCTTGCT CTGTTGCCCA GGCTGGAGTG CAGTGGTGTG ATCTTGGCTT 6360

GCTGCAACCT CCGCCTCCCG GGTTCAAGCG ATTCTCCTGC CTCAGCCTCC CAAGTAGTTG 6420

GGATTACAGG TGCCCACCAC CACACCTGGC TAATTTTTGT ATTTTTAGTA GACACAGGGT 6480

TTCACCATTT TGGTCAGGCT TGTCTCAAAC TCCTGACCTC ATGATCTGCC CGCCTCAGCC 6540

TCCCAAAGTG CTGGGATTAC AGGCATGAGC CACCGCGCCA GGCCTCAGCT GTTCTCTTTT 6600

TACCTGCTGG GATGGCTAGT TTTCTGTGTC AACTTGACTG GGCCATGGGA TGTCCAGATA 6660

TGTAATTAAA CAGTATTTCT GGGTGTTTCT GTGAGGGTGT CTTCAGAAGA GATTTGCATT 6720

TGAATTGGTG AACTAAGTAA AGCAGAGGGC CCTGTCTAGT AGGGGTAGGC ATCATCCAGT 6780

CTGTTGAGGA CTTGAATAGA ACAAAAGGCA GGGGAAGGTT GGAATTGCCC CCTCTCTGCT 6840

TGAGCTGAGA CATCTATCCT GCCCTTGGCA CTCCTGGTTC TCAGGGGTTC AGACCTGGAT 6900

TCCTGGTCTC CACCTTGCCC ATGGCAGACT GTGGGACTTC TCAGCCTCCT ATCTAATTAA 6960

TAAATCTCTT CATACACACA CACACACACA CACACACACA CACACACACA CACACACACA 7020

CCCTATGTAT CCTTCTGTTT CTCTGCAGAA CCATATCTAA TACACCTGCT TTTATGACGA 7080

TTACCTATCG ATTCTGTATT CTGCCAAAAC TGAAAACAGT TCATTTTTCC ATCTCTTCTC 7140

AGAGAGGCTT GTCAGCCATT AGTTCTCTGA TGGGCTCAAG AAGTTATGCA GTTTTTTTTT 7200

TCTCACTGTT AGGATGGAAT TGATATTCTG TTGAAACTTT CTATACCTAA GTGGAAACTT 7260

GTTTTGAGGT TATTTTCTCT ACTTACTTTT GCTGGAAATG GAACACTCTG TATCTAGTTA 7320

AGACACATAA ACTGACTTGT GATACCATAA TGTTGTGTTG AATTTTATAT TCTTAGAAAA 7380

TCATCTGTCA AGGTGTTAAC TAATGGCAAA GCATTTAATA AATCAGCATT CATGTATTCA 7440

GGTGCTCTGA ATTATCTGAC TTTTAAATTC TTACTTTATA AATGAGAAAA TTGGGGCATG 7500

GAAAAGTTAA CTCTCCTAAC CCCGAATTAT TACATTATTA AGGACAGGAC TTAGAGGCCA 7560

GATATCTTAA GTCATTAATA TTCTTTGGCT CACAGAATTG GCAGTATAAC CTAAAGGTAA 7620

TAACTAGGTG ATTTTCTTTT ATATCAATTA AATATGTCAG TTTTCAAATA TTCATAAGTA 7680

CCTACTGTGC AGGGAAAGAA CATGCCATAC AAAAGATGTA GTCCAGGCCT TTAAGAAACT 7740

TTCATTTAAT GGGAACTCAA GAAGTGTACA TATAAGGAGG GAAGTAGCAG TATGGTACAA 7800

GATAATACAT ACATATCAGT GAATGATATT GCCAAAAAGT GCTATTGATA GAGAAATAAT 7860

TCATTTCTGC AAACAGCTGC TGATCTCCTA CTGAAAACAG AGGAGGGAGA ACAGGACGCC 7920

TCGTGGTCAG GATAGAAGAG AAAGACCTTG AGTTGAGCCT TGAACAGTAT TTAATATTCA 7980

AAAGGTTAAG AGAGGAGAGC AATTGAGGAG GGGAGAATAG TTCCAGCACA AATGATGGTG 8040

TACAAGATGA ACACAGTCAG TAAAGAGCAG ACTGGTCTGG ATGGAGAGGA GGATTTGCAT 8100

CATTTGGGAT TACGTCATTT AGACCCTTGA AAGCCAGGAT TGAGTAAAGC CACAGTGAAG 8160

CGACTGGCTC GTATGGAAGC TTTATTTTAA GAAGATTAAT CTGGTAGTGA CATGTGCCAA 8220

AAACTGAATA GGTAGAAATG AGATGCAGAG AGCCCAGTTA GAACTAAGTC TGGTGCAGTA 8280

ATGCAGGATT GAGGCAATAA ACACCAAACT ACAGTATCAC CAGATAATGG ATGTTTGAAC 8340

GGACGGTTTA AAGGAAAATT GATGGTATTT GGTAATTTAT TAGATAATCC AGGGCCATGG 8400

AATGAGAGGG GAAAATGACT AACCATAGTC ATCAAATGGT TTTTCTTAAT GAATCTGAAT 8460

TTTGGTGTAA GAGCAACATT TTCTTAGGCC TTGCCTAGTT GGTACAGCTG ACTATGATAA 8520

TGACTGCTAC CATGCTTGTT CCTCTTTTAG CAGCTGTGAG TCCCCCACCA GCCAAACAAT 8580

GAGCCTCTTG AAAAGGACGA TGCCTTTTCA CTTCTCTCCA AGTGCTTGGC AAATAGGAGG 8640

CCTTTTGAAG TTACTTTATA GTTAGGGGTT CCCAGTGAGT ATTTGAAATA TTAAGTCATG 8700

CCCGTGGTTG ACAGCATGGC CCTACTGCTC ATCATCAGCT ATTAACCTTA GGCAAGTTAA 8760

TGAACTTTTC TAAGCCCCAG TCTACTCATT TATAAAGTGG GATTATTAAT AATGTCTACT 8820

TCATAAAATT ATGAAGCCTG AGTTAGGTCA TTCAGATAGT GTTTAGTCTG ATTCTTCGAA 8880

CCTAGTAAAC AGTCAGTAAA CAGAAGCAAA TGCCACATGC CTGATTTATA TCCAAGGGGA 8940

GAAAGGTAAA AGTGAAATTT TCATGATTTA TGGATTCAAA TTATACATTT CAAAGATGCT 9000

TTATAAGCTA TTGTTTTGGT AAGAAGAATT GAGCTGAAAC AGAATTTTCT GACAGCAGTG 9060

ATTATTAAAT GGTGAAATAG GCTATTGATG TCTTTAGAGG ATATAGATGT TCACCTTTTG 9120

CATATAAGTG CACAAAAATT CACTAAGTAG ATATGTCTGT CTACACAGAG AGAGAGAGCG 9180

TGAGAGCATT AAAGTTAGTA AACATCCCCC TCGCTTTTTT TTTTTTGAGA CAGGGTCTTA 9240

CTCTGTTGCC TAGGCTGGAG TGCAGTGGTG CAATCGTGGC TCACTGCAGT CTCAACATCC 9300

TGGGCTCAAG CGATCCTCTC GCTCAGCCTC CTGAGTAGCT GAGGTGTGCA CCACCACACC 9360

CGGCTAATTT TTAAATTTTT TTATTGTAAA GGTGAGGTTT CACCATGTTG CCCAGGTCTC 9420

AAACTCCTGA GCTCAAGCAA TCTGCTCACT TCAGCCTCCA AAAATGCTGG GATTACAGGC 9480

GTGAGCCACC ACGCCTGGCC AGTAAACCCC ATTCATTTAC ATCATCTTAC TTGTCCCTCC 9540

AAAATCCTGC AAAGTAGGTA GGTTCTGTCT TTATTTGTTA TTTAGGTGAA GAACTTGAAG 9600

TGGTGTTGAG GAATAGGTGT TTTGCCAAGA GTCACGCAGC TGGAGTGGCA GAGCTGTATA 9660

CTCTTCTGAT TCCACCAACG CTGTTTACAT CACATCTGGA GAAAAGTGCT CTGAGGCACA 9720

GATGTTTAGT GGGAGGGATG AGACACAGGC TGCAATGCCT AAAGATAATC GGGAATAAAA 9780

GCAGAAAACA AGACGTTTGT TTCTGTTAAA ATGAGACAGA AAATAAGGCG TTTGTTGTTT 9840

GGGATTGAGC ACTTGGAGAA GTGGGGAGCG ATTTGATTTG GGTGAGACTG CTCCTGGAAT 9900

GCTGCATCTG GTTCTGGACT ACTCATTACT AGGCTTATAG AAACTAGCTG GAGGAGGTTC 9960

AAAGAAAAGC TCCAAAATGA TTAGCGGGCT GACGGGATTG ATTTATAAGA AATATTAAAA 10020

GAATTAAATG TGTATAGCTC AGCTAAGCAA AGATGAAAGA GACCAGCTAA ATGTATACAA 10080

ATATCTGAAA CGTGCAAACT TTAAAAAGAG AGATTAATTA TTTAACATGA TACACGGGGG 10140

CACAATATGC AGTCACAGGA TGAAAATTTC AGCTGAGTAT CTAGAAGAAT TCCCCGATAG 10200

TGAATCTGTT AAGGCTGTCT GTAGTGTGGC CTTTCCCTGG AGAGGCAATA GAAATTTCAA 10260

GTCTTACGAT TTTAAAAGTT TCTTGGGAAC TAGGTATTAG ATGATGTTAG AGAATTATTA 10320

TTAATTTGGT CAGGTATGAT AATGGTATTG TAGTTCTATA AGAAAAATTG TATTTTTTAG 10380

AGTTACATAC CCTGAAATAT AAGCATAGAA TATGATGTAG GAGATTTGCT TTAAAATACC 10440

ACAGTAAGGA AAGAAAGGAA GGAGGAAGAA AAGAAAGGAA GGGGAAGAAA GGGAAAAAGA 10500

GGCAAAGAAG GAAGAGAAGG TAAGAGAAAG AAAAAGAATG AAGGAAGAAG GCTGGGCACT 10560

GTGGCTCATG CCTATAATCC CAGCATTTAG GAGGCCAAGT TGGGAGGATC ACTTAATTAA 10620

GCCCAGGAGT TCAAGGCTGC AGTGAGCTGT GATTGCGCCA CTGCACTCCA GCCTGGGTGG 10680

CAGAGTGAAG CCCTGTCTCT AAAAAAAAAA AATAAGTTAA AAAGAAAGAA AAGGATAGAT 10740

GAAGTATGGC AAGATGTTGG TAATGTTGAA CCTGAAGGAA GTTAATATGT GAGTTCACTT 10800

TCCTCTTCAG TCTTCTTTAT GTATGTTTGC CAACTTTCAT AATAAACAAT TTAAATTATA 10860

TTTTCCTGAT CAAAACTTAG TAGCAGTATT AATCCCTGGG CTTCCTGACT AGAACAGCCT 10920

CATTACCACA TGGGCAGAGT TCTGGCCGAC CAGGGACCAC GTAGTGGTTC ACCATCTTGC 10980

TCTGGTAATG TGGTCTGGGC TGAAGGGCCC TTTCTAAGGT TGTAGATAGA AATCCAGGAA 11040

ACTTGTTAGA ACTGCAGACC TATCAGGGTA CCTGCAGGAG GTGAGTCTAC TAAGGTGAAA 11100

AAGCAGAGGG CAGAGGTCGT GATTAGCAGC TGACCGCCCC CTGCTTTTCT GTCCCTCATT 11160

CGTGGAAAAT TGAGTGGAGC TCAATTTTGA GTGGAGCTCT AAGTAGCTCC ACTTGTAGAC 11220

ATTGAGTGGA GCTCTAAGTG TCTTCAGAAT AGCAAAACAC TAGTTTTCTT TTTCTTTTCT 11280

TTTTTTTTTT TTTTGGAGAC AGAGTCTTGG TCTGTCGCCC AGGCTGGAGT GCAATGGCAC 11340

GATCTCCGCT CACTGAACTC TGCCTCCCGG GTTCAAGCGA CTCTCCTGCC TCAGCCTCCC 11400

GAGTAGCTGG GATTACAGGT GCCCACCACC ACGCCCAGCT AATTTTCCTA TTTTTAGTAG 11460

AGATGAGGTT TCACCGTGTT GGCCAGGCTG GTCTCAAACT CCTGGCCTCA AGTGATCCGC 11520

CTGCCTTGGC CTCCCAAAGT CCTGGGATTA CAGGTGTGAG CCACCACACC CAGCTGCAAA 11580

ACCCTATTTT TCTTGAATGG AGAAACACTT TCCCCTTATT TATTGAGTTT GGGAAGCAAG 11640

AAGAGGGGTA ATTCATTAAG TGAAAATTTC CAAAATCCAG AAAACATCGA TAAAGCAGCA 11700

GCTTAATTTT TTTAAGGAAG AATTTTTTAA ACTATCTTCT TTTGAGCCTC TTTAGGAAGA 11760

CCTCACGTCC TTGCCTTGAA TGTTGAGAGT GGGAAATCCA GGGAGTTTTG GAATGCATGC 11820

CTTATGTCTG CTTTTTTGTT TGTTAGAGAA ATATAAATAT TTTATCTAGG TTTTGCTGAT 11880

GGCAGTCAAG CATGAACACA ACCCACTGTT TGAGAAGCTG TAATTTCTGA ATTTCTGCAG 11940

AGTGCACATC TAGGCCAGCA AATGGCAGTA AGAGTGAGGT GGATTTAGCT CAGTGTAAGG 12000

ATGAACTCCA GAACCATCGG CTCTGACTGA AAGTGAAGCG GCAGCCGCGT TGTGGGAAAG 12060

CTGGCTGGAG TCTCTCTCAT AAGCAGGCAT TCTTTTTCTC CAGCCCGTCA CTGTGTTGGT 12120

TTGGGCCCAC GGTAAGCCTC CTGGCCTCTA GGCTGTAACC CCCACCATCC TCCTCTGCCT 12180

CGCCTCCAGA GTGATTGTTC TGAAGCACAA CTGGATGTCA TTCCCCTTCC TGAACTCCTA 12240

GCACCTACAG GGACTCCATC CCTTGTGCCC CACATACCTC ACACGTAGAC ATTCCTAATG 12300

AAGATTTGAT TGAATTATTG TAAACTCAGT GCCTCCCACT CTTCTAGTTG CCTCTCTGCC 12360

TGCCTTTGTA CATTTATTTA TTTATTTATT TATTTATTTA TTTATGAGAC AGAGTCTTAC 12420

TGTATCACCC AGGCTGGAGT TTAGTGGCAC CATCTCAGCT CACTGCAACT TACCTCCCAG 12480

ATCAAGCAAT CCTCCCACCT CAGCCTCCCG AGGAGCTGGG ACCATAGGCA CGTGCCATAT 12540

GCCCGGTTAA TTTATTGTAA TTTTTGTAGA GATGGGGTTT CATCGTGTTG CCCAGGCTAG 12600

TCTTGAACTC CTGGACTCAG GCGATTCGCC CGTCTCAGTC TCCCAAAGTG CTGGGATTAT 12660

AGGCGTGAGC CACCATGCCC AGCCGCTAGC ACTCATCTTA ATCGTATATT TACTTATCTG 12720

GCTTTCCCAC CAGACTGCGG GCTCTTCAAG AGTAAATGCC ATGTTTTCAC CTTTATTTCC 12780

CCAGTTTGTG GCACATTCTA GGCACTCGCC ATCATGAAAT AAACCTCTGG AGCTGTGATA 12840

TTACAAACGT GGAAAGATGA CGAGCACTCA GCAACTTTCA GTGAGTAAAC AAAGGCTTTC 12900

ATTCAGCATG ATTTATTGAC TGCCCAAATC TGGGCTGCTT CCTGTCTGTG GTTCAAGGAG 12960

AGCATAGTCT ACAGAACCAG AGACCTGGCT ACTCTGGAAG TTAGACTTAA GCCCACCCCG 13020

GTCCTTGAAT GGGGAAATAT TTCCCTTCAT TCCTGTGTTT TAGGGACAGA AAGATGAGTA 13080

ATGCAGTGAT ACATGCTGGA AATGTTTATT CCACTACCCG AAGCTGCCTC TCAACTTAAC 13140

AATCCATGAA AGAAACAAGA TGGTATATAA CTTTTTCTAA TTTGTGATGC CTTTGTTTAT 13200

TTGTTTCCGG TTAAAAGAGG AGGTGGCATT GAATTGTTTG TTTGGTTTGG TTTCTTCTTC 13260

AATAAGAAGC ATCTTAATAT AACTAGACTG GACATCTGTC CCATTTTCAA AAATTACAAG 13320

TTTCGATCAT TGCTAAATTG TACAGATCCC AATCTGTCTG CTCTGCATAC ATTTGCATTT 13380

ATAAAAGCAG AAGCAGACTA GCAGTCTTTC TAATGCAATC CCCCAAATGC ATGAAGTATT 13440

AGATTGCTTC TCCCTATTGG TTCATGCATT GCTAAAGGCT TAAAAGGATC ATTGATTTTA 13500

ATTATTTAAT GTGTACAGCA GGCTGAGCTT CCTTTCTTTT TTAAGGGAAG AACCTTCAGG 13560

GGCATTGCTT TAGTTTTTTA ATGTTAAATC TCATTTTTCT TTGAAAATAA GAAGTTAAAG 13620

CTGTATTCAC ACAAGCTCTC AAAGTGCCAG ATTTTCATTG TGTTTTTAAA CCATCTAGGA 13680

AATGTTTGAT TCTAATGAAA CATTACTGCT GAAAATTGGG CTGAAATTGC TGGGCTGGAA 13740

ATATTGTTAT AACTTCACAT GATTCCAGTG TTGTATTATT ATTTTTTCTT TTTCTTTTTT 13800

TGACCCGATA TAGATGAAGC GAAGAGACAA GGAGCAATCC CATGTGTAAT AGAAAAAGGC 13860

AGCCTGAATT GTTGTTGCTG TTTTTGAAAT TTAAGCTGGT TTTCGATTAA ATTCAGTAAA 13920

TGGTCCAGGA CTATAAATGT TGAACATTTT TTACCGTGTG ATTTAAATTT TAGTCTTATT 13980

GTTTTTTTTT TTTTTGATGG TTTACATTTT CCCCATGGGA AGCAGCTATG TCATGTCGGC 14040

ATGATTCATC ATGGTAACAT CTCGGGTTAT TTTGGTTTGT GTTATGTTCA GAAAGCGGAA 14100

TGCCAAAAAT AAAGAGTGGT TTGTGATGTC TAGTGTGTCT TCCTTTAACA AATCAAAGGC 14160

TTTTATTTAA TCCACTTAAT GGGACACTGC AGAAATTTAA AAAATGGAAG TCCCATCCAC 14220

AGAAGGCAGG TACTATGATG TAAAAAGTTT AGGTGGGGGA TTAATAGAGT GATCA ATAA 14280

TTTATGAGCT AAACCGGAGG CACTTTTTTT TTTGAGATCG AGTCTCACTG TTGCCTAGGC 14340

TGGAGTGCAG TGACGTGATC ACAGCTCACT GCAACCTCCG CCTCCCGGGT TCAAGCGATT 14400

CTCATGCCTC AGCCTCCTGA GTAGCTGGGA CTATAGGCGC CCACCACCAT GCCCAGCTAA 14460

TTTTTGTGTT TTTTGTAGAG ATGGGGTTTC ACCATGTTGG CCAGGCTTGT CTCAAACTCC 14520

TGACCTCAGG TGATCCGCCC ACCTCGACCT CCTAAACTGC TGGGATTACA GGCGTAAGCC 14580

ACCATGCCTG GCCCAGAGAC ACTTTTGAGA GTGAAGAGGA AGCTGAGAAT AATTCACTGA 14640

TCTACAACTG GGACCATCCA GGGCAAGCCA GATGCCATTA CCACTAGCTA GAAAGCTTGC 14700

CAAGGTCTCA TTTACCTTGG TATATAGCAA ATTCTTCTTT TGAATTCTGG AAATTCTGGT 14760

AAGTCATTGA GGTAGCTCTG TGCCAAGGAG CAATATGGTA GAATTCTAAT ATTTCAGGCA 14820

GACAACACTT TCCTGCATTT GTAGCAGGTA AAGGGAGGTC AGGGCAGAAG ACAAAACCAC 14880

TGGGACTCGA CAAAGGGCAT AAACGTCTAA TGCACCTGAT GTAGCTGATG GTAAATTGTT 14940

ATCAGCTAAA GATCTTTCAT AATAAATAAA CTTATCATTT GTAGGAGGGC ACAGAAATCG 15000

TGGAAAGCTG GGATTCAGGT TGCCTGTGGC TTTAATTCTG GAATCAGAAA TATTAGTCAA 15060

GGATATCAGT CTATGAAGTA AGTTTTCAAT GTTATATGCC ACAAGATGCA GCTGTCCTAT 15120

TTTCACTTCC AGTAATTCCT TCTGAATTAA TACACCTTAA AAATAGCTGC AGCTTCTCAA 15180

ATCTGTGAGA ATCGTATGTG CTGCTTGCTA CACTTTCTTT TTCCTGAAGG CTCTTTGAGG 15240

TCTTTCAAGA ACTCAATTCA ATTCAGCAAC AATTAGGGGG TCTAAGGTAT ACAGACGCTG 15300

TGCAAGATGC TCCTGAGACA CAAAGAGGAG GTCAAGCCCC TGCCTTCAGG CACCTCTCTA 15360

TAATATAGGA GGAGAAAGAG AAGAAACACT AATACACATA GGTAGGTGCC ATTAAAAGGG 15420

TACATACATT AAAGCCAGGT GGTAGGTGTA AGAAGATTTG TAACATGAGA ATTTTCTGCA 15480

TGTTTGAAAT ATCTTATAAT TTTTAAAAAT TAAAATGGGA GATACATATA TATGTATTTA 15540

TGTATGTATA TATGTATGTA CATATACACA CATATATACA TAAATATATA CATAAATATG 15600

TATATATGTG TATATAGACA TAAATATGTA TATATGTGTA TATATACATA AATATGTATA 15660

TATGTGTATA TAGACATAAA TATGTATATA TGTGTA ATA GACA AAATA TGTATATATG 15720

TGTATATAGA CATAAATATG TATATATGTG TATATAGACA TAAATATGTA TATATGTGTA 15780

TATAGACATA AATATGTATA TATGTGTATA TAGACATAAA TATGTATATA TGTGTATATA 15840

GACATAAATA TGTATATATG TGTATATAGA CATAAATATG TATATATGTG TATATAGACA 15900

TAAATATGTA TATATGTGTA TATAGACATA AATATGTATA TATGTGTATA TAGACATAAA 15960

TATGTATATA TGTGTATATA GACATAAATA TGTATATATG TGTATATAGA CATAAATATG 16020

TATATATGTT GTATATAGAC ATAAATATGT ATA ATGTGT GTATA AATA ATGTGTGTCA 16080

TATACACACA TATATACATA CATAAACATT CTGCATTATA CCATTCACTT TGTAACCCAT 16140

CTTCCCTAAA AACTGTCTCA TAAAGAGTCT TCTTTTCCCT GTACCTATGC AATGGTAAGT 16200

AGCAAAACAC ACATTCTTTT GGGTCCCCAT AACATTCCCT GTAGTTTGCC CTTAACAGTC 16260

TTTGATGTGA AATTTACTGT TTCTGTCTTA ACCTTGCCTG TCTCGCGTAC ATGGAGTTTT 16320

GGCTCCTGGC TCCTAGTCTG CATCTTCACC CCATCCCTTG CCCAAAGAAT CTGGTTATGT 16380

GACCACTGCT CATCTTTTCT GCTGCCACAA CTCCAGTCCA AGCCACAAAC CTCTCTCTCC 16440

TGGACTCCTG CGGGGAGTTC CTTTCTCTCC CTGCATGAGT CTATTCTCCG CACAACTGGC 16500

ATAGGTAAGT GAGACTGCGG AAGAGGCAAG TTTGCAAGTC CAGAGGAAAT GAAGACTCTG 16560

CTTGTGCACA TGCTGGGTTT GACGGGTGCT GGATATCCGA TGGATGGCCC TTAAGGTGAG 16620

CTCAAGGCTT AAGGGAGAGA TAGGGGCTGA TGATCTGAGA TTCATCAGTG TGTGGCTGAT 16680

GTTTAAACCC AGGGGACAGG ATAAGAAGGT TATTCCAGGG AGAGCGTAGA TAAAGAAGCT 16740

AAATGGCTTC TGGGTCCTTA GTCATTCAAA ATCGGACCTC TGAGGCAGGA GGAAAGCCCA 16800

GAAAGAGTAG ATTCCTGGGA CTCACGGGAT AAAGACTTTC AAAAAGTGGG GGCTGGCCAG 16860

TGCTGCTGAA GGAAGTAGCA GGACCGGAAC AGAAGGGTAA TCGTTGGACC TGGAGAACTT 16920

GAATTTGAAT TTTAAGGTTG GTAACCTTAA AAAAGAGCAA TTTTAGATAC CTTTTGAAAT 16980

TATTTGCAAG ATTTGTTTGG TATATGTGTT ATTCCAGGCA AAGGGACCAG AAAAGTAAAA 17040

AATACTTACT GAACAGTTAC TGCATGCCTG GCACTGTAAC ACCCTGTTTA ATTCTCACGG 17100

CAACCCTATA GAGTAGGTGT CATCATCCCC ATCTTACAGA TGAGGATATG AGGTGCAGCT 17160

AGATTAAGCA GTTTGCCTCA GGTTACACCA ACTGGTTAAC GTAGAGCTAG GATTTGAACC 17220

CGGATGGGCT GATCCCAGAG CTCATGCTTT AAATCGCTAG ACTGGTGCTC ACAGAAGACT 17280

GGGACCGAAA AAAATTAATA AAAAAAATAA GGAGCCCCCT GGGCTAGCAA ATTAGGAGTT 17340

GTTCAGACAG ATGTGAAAAG GAAAGCAAGG CAGAGGGAAA GTCACTGTAC AGAAGAGAGA 17400

GACCCATGAC AGCAGAGACA GTGAGCTGGT AAAGTGGCTG GCGATCTAGC CCCTGAAAAT 17460

ACCTCCAGAG AGGCAGGCTC ACGCCTGTAA TCCCAGCACT TTGGGAGGCC GAGGTGGGCA 17520

GATCACCTGA GGTCAGGAGT TTGAGACCAG CCTGGCCAAT GGCGAAATCC CGTCTCTACT 17580

AAAAATACAA AAATTAGCCG AGCATGGTGA CAGGCACCTG TAATCCCAGC TGTTCAGTTG 17640

GCTGAGTCAG GAGAATAGCC TGGATCCGGG AAGTGGAGGT TGTAGTAAGC CAAGATTGCG 17700

CCACTGCATG CCAGCCTGGG CGACAGAGCA AGACTTTTCT TAAAACAAAC AAACAAAAAA 17760

GAAAAAAGAA AAGGAAAGAA GAAAGAGACA AAGAAAGAAA GAGAGAAGGA AAGAAAGGAA 17820

GGAAGGAAGA GAAGGAAGGA AGGAAAGAAA GAAAAGGAAA GAAAGAAAAA GAAAGAAGAA 17880

AGAAAGGAAA GAAAAGAAAG AAAAAGAAAG AAAGAAAATA CCTCCAGAGA GCCAGGTCTC 17940

TTAGGCCTTC TGAGAAACTC ACATCCCTTT TGATGAACAC AAATGCTTCA CACTCTCAAT 18000

GTTATTGGTA ATCCAAGTTA TCAATATACC TAAATCACTT AGTACTGAAT CTGGCATATA 18060

GTAATCACCT AATGAAGAGA TAAGAGTCAT GGAGTATTCT GAAGCAATTA GAATCAATAG 18120

ACTCAATATA CACATGGCAA CAAAGTTGGA TCTTAAAAAC CGACCTGAGT GAAAAAGGAA 18180

AGGGAAAGAT ACATAACACG GTACCATTAT GTAAATTGAT AATATATGCT TACACAATTT 18240

GTAAGAACAC ATACAAATAG ATACATGTAT ATTAAATATA CTCGAACGGT TACCTATGGG 18300

GTGGTGGCTG GAGTGGGGGT AAGTCCGTAA GCTGTAATGG AACCTAAACA AATACATGAA 18360

ACGAGTAGGA ATCAGAAGGA GTAACAATAA AAATGTGCCA TGAACTGAGG AGTGTAAATT 18420

AATCAACTCA CTGCATCTGA GGTTAAAAAT AGAAAGATGA TAATTGTTAT TCTTATTACT 18480

CGTAGGTCTT CCACTTGCAC TCAGCTTTAC AATGTTGGAC TATCCTTCAG ATGGCACCCT 18540

CCTTGCACTT GCTCAGGCAG GAGAGCTTTT TCCTCCAGCT TTCTAGGTGA TTTAATATAT 18600

CAGGGAATAA GTATAAAAAA AGGCACGGTG CTCCCTGGGT AGCCTTTCTG GACTTCAGAG 18660

CTAAATTGCA AAGTCAGTTT TACACATGTG ATTTCATCTA TGAAATTAGG GCAAGGTAGA 18720

AAACTGGCAC AGAAAAAATG TGATTTATTA TGGTGTTACT ATCCCTTACA AGCGGAGTGT 18780

CAGCTGCCTC TTTTTGTCCA CTGATTTAAG GCAAGATGAA CTGAAAGTGG CTATGATCAC 18840

GTCTTCAAAA GCACACTCTG GCCCCTCGGC TGCAGGCGCC CTGCACATTC CCCAGCTGCG 18900

TGTCCGGTGG TGACACAGTG CATAATTGTG GCGCCTTCCT GGTGCAAACT GTCTCACTTA 18960

GCTCCGTCTT GCTGGCACAG CAGAAAGGAA GAAATCGAAA ATGTTTGGAT TTCAAAGGTA 19020

ACAAGAAGCT GGAAAACAAC TACTGGCCGA GTCTGAGAGT TTCAGCGGAG ACTGGTGCAG 19080

CCTTGTGTTT TTCCACTGAC AGCTGAAAAT GAGCCCAGCT TCAGTGAAGC TTGTTTCCTT 19140

CCCTCCTCAA GGTTACCCAC AATTCTCAGT TCTCTCAGGA AAGCCAAAAA ATGAATTTGA 19200

GGGTTTAGGA TTGTGGTTCT TTTATCTATT ACAGGATTGA TAATATGTTC CTCCACCAGA 19260

TGTTCTGCTT GTAACAATAC TCACTTCCTG ACACTACTGC ATATGCAGGA GTGTTACTAC 19320

CAAGGTAAAC ACAGAATTGG CTGCCCAATT CCAAATCCCT GAACTGAGTG AGAGAAATCA 19380

GAATTATAAT AGGGGATTCA ACAGAGCTGG CTACGGATGT GCCAGTGGTC AGATACTTTG 19440

CTCATCATAC GCAGGTGCTG CTGCTCTAGC AACTGCTCAC TGCTTCATTT CCTGCCTTGG 19500

TCTTTAAATA CTGCTTTTCT CAGCTCAATT GGCTTTCTTC CCTCTGGCAG TCACGTTTCT 19560

TTGGGTCAAA CAGCAAATGA TTCTTTAGAA TCACCTGGTA CTCAAAGGAG CTACAAGACA 19620

TTGGGCATCC ACTTCCACTC TCTTGGAAAA ACAATTTTAT GGAAGCCAAG GTTGCCATAG 19680

TGCCTCTTGA GGTTGTTTGC TCAGCCAAGG CCCAAGCTTT GTGCTTCAAA CATGAAATTA 19740

GAGAGCTTCA GAACAAGATC CACATTTTCA ATGGCCTCAC CCAACTGGAT AAAAGAACAA 19800

TTGCCATATC TCAATGACCA CCTTTTTCAG GTGGGATGGT AGATGCTGGA ATGGGTCACA 19860

GCATTGCCCA ACCAAACTTT GCAAAAAAGG CTGGAAGCTC TGACTGGGGA CCCTAAATAT 19920

GCAAAAGTTG ATAGGCTCTT CATGCAGAAT ATGAACCCCG TGTATGGATA TAGCTAAAGG 19980

GTTGGCCTTT ATGTTTCTAT TCCTTCACAA ACCTGGTAGA ATAGATATGC TTGTTTCCCT 20040

TTAAAAAATG TCAACAATTG CATTTATGAT GCTGTGTATA GTAACTCACA GATCATGCTC 20100

CATGAAAATG CTTCAGAACC CAATATAAGG AGATTTTTTA GCCATGTGTG ACAAAAGAGA 20160

GGCCATTTCA GTGTTGAAAT TGTTCAGAGA AGTATTTGAT TATGTTTTCT CAGATCTTTT 20220

TATTTTTATT TTTTTTGAAA CAGAGTCTCA CTTTGTCACC CAGGCTGGAG TACAGTGGCT 20280

GTGGTCTCGG CTCACTGCAA CCTCTGCCTC CCAGGTTCAA GCGATTCTCC TGTCAGCTTC 20340

CCGAATAGCT GGGATTACAG GCGCATGCAC CACCATGCCT AATTTTTGTA TTTTTAGTAG 20400

AGACAGAGTT TCGCCATGTT GACCAGGCTT GCCTTGAACT CCTGACTTCA GGTGATCCAC 20460

CCACCTCAGC CTCCCAAAGC ACTGGGATTA CAGGCATGAG CCACCGTGCC CAGCCTGTTT 20520

TCTCAGATCC TGTATTTGTT TCTGAAGCCT TCATTTCTAT CTTCTTATTC ATTTTGGAAG 20580

TAGTACACCT AAGTAAGGTT TTTAACAATC AAATATCTTT GGAAAATTCC CTGGTTCCTT 20640

TCTTATTCCT ACAAAAATAT GTTCAGTATA GCTGATGTTA TGTTTCTTTC AAATTATTCA 20700

TTTCTCTATC TCAGAATTTA TCTCATGCCT AATTGTTATT GAATAGTCTT CACTTCTTGT 20760

CATCCAGTTT CTGGTCTCTT ATTTCACTCT AAGTCTAAGT GGCTATTAGA ATAAAGAGCT 20820

TGTAACAGAT TCTTTCTCCA ATATGTCTTA TCTTTTGACT GCATGCCAGT GACAAACTGT 20880

TAACTGTTTT GATTCTTCAT AACATTCCAC AGAACATGCT GACTCCTCTC TTCCTGAAAG 20940

CAATGCCCAA GCACAGCATT GTTAGATAGT ATGTACGCAA CAGGGACATG GGTGCATAGC 21000

AAAAACTAGA AGGAAGGAGG ACCTTCCTTA GCAATGGGTG ATATGGTCCC TGGACTTAGA 21060

CTCCAAAGGG TCGTGAGGTG AAACACACAT CGTCCATACC CAGGAAGCAC ACAGGTGGGA 21120

TGGAAGAGCT GTGCCTAATG AAACTTCATC CACGTGGAGG TGGAGGAGGC TGCAGCTGCA 21180

AGAACTCAGA GCTGCCTTAC CCAGACCAGG GACCAGGGAG GGCTTTCTGG AGGAAACAGC 21240

CTCTGAACTG CCAGCTGATA GAGGAGCTCT ACCTCAACTC TTCTGGTTCC CCAGGGCTGC 21300

TTTTCCACGT CCATTTATTG GCACTGAAGT TTGAATACCT TCAGGGGCCC GAAAGCCTGC 21360

CAGGTCCTCT TCTCTGCAGA GCAATCACAC CAACCTGCAA AGGGCTAGGA AAGGGCTGTC 21420

ATCATCTCCT ACTCAGAAAC TGGTTCACTG GAAGGACTCA GGGGCCACTG AATACATCCT 21480

GGCAGCTTTC ACAAGAAGGG CTTCTGACTC AAGGATGTTT CCATCTTTGC CAGGTCGCCT 21540

TTTCTCCTTC TCTTAGAGTT TGGAGGACGC AAATGTGCTG AGAAGTCAAC CTTTCCTGCA 21600

AGGTGAGACA CAAGGGCCTT TCCCAGCAGA AAGAAGAGAG CAAATGGAAG GTCCTTCTTC 21660

CTCCAGTAGA GGATGGACTC TGTCTGGCAG CCACCCAACA GGAAAAGCAC AATGCATGCC 21720

TGCCTGCTTC CCTCCCTCCC TCCGTTTCTC CCTCCCTCCC TCCTTCCTCC CTTCCATTCT 21780

CTTCCCTTCC CCTCCCTTCC CTTCCCCTCC CTTCCCTTCC CCTCCCCTCC CCTTCCCTTC 21840

TCCCTCTCCT TCCCTTCCTC TTCCCTTCCT TCCTCTTCCC TTCCTTTCCC CTCCCCTTCC 21900

TTTCCCTTCC TCCCTCCCTT CCTCCCTTCT TTCCTTCCCT TCTTTCCTTC CTCATTTCCT 21960

CCCTTCCTTC CTTCCTTCCT TCCTTTCTTC CTACTTTCCT ACCTTTAGGG CTCTGTGTCT 22020

TTGGAGTCCA TTCTGATTAT GCTGTAATGT CTGCCCCTTC CTCTTCTCTG TCAAAAAATG 22080

AAAGACATGG AAGCCACTTG CCTTTTACTG AATTAAAAAT TAGTAAAAGA GCTAAAAATT 22140

AATGGTTAAA AATGTACGCA TAAATTATGC AGTATACTAA CCAATGAAAA GATACACTTC 22200

TCTTAATTAA AAGCTGACAG GGAGGGAAAC AAGAAAAGAG AAACACAAAA CAATAATCTA 22260

AATGACCTAT TAGTTGGAAG AACAACATCA GAGAAAATAG ATACTGTGTA TAGTCATGTG 22320

TATGTCTATG GAATAACATT TGTAGAGAAA TCTGGACTGA TCCTTTCTGA GTAAAGAGAG 22380

CTGTGGGTAC AATTAAGGGG AGATTGAAAG GAATCCAAAA GCATAGCAGA TGCTGTGCCT 22440

CACTGGAATG GTTGCCGATC TCCTCCAAAC TATGAAGTGT TTGAGGCTCA ACTTTAATAT 22500

AATTAAGATA CAAAGACAGA ATGAGAGAAA GAGAGAAGGG AGCTCACTGG AAGAACACTC 22560

AAGATTCCTT ACTACTCATT CTCTAAAATT ACAATTGTTC TAGATGGAAA AGAAAAAAAG 22620

CTTCTCTGTT AAAAAAGGAG CTTGTGCTAT AGGAGGTTTA AAATATACTT CTGACCCATC 22680

TCCAACATTC TAAATCCTTC CCAGAAAAGT ATGCCAATCC CAAGAAATAT TCAATCAAAT 22740

TGCTGGAAAG AAAAATACAA AATATTAAAA TGTATTAGGA AGCGACAGTA ATTAAATCAG 22800

AACTGGAGCA GGAATAGACC AGCAGATCAA TGAGACAGAC ATCAAGTCCC GGAATGTGGA 22860

CTTGCAAATG CATTAAGTAA TATGATATGC AATAAAGGTG GCACAGTGAA CCAATGGGAA 22920

AAAAATTAAT CTTATAATAA TTGATATTGC AATAATTGTC TAGTAATTGG GGGAAGAAAT 22980

AAGCTTATTC CTTATCTCAT TTCTTTTTTT CTTTTTGAGA CAGAGTCTCA CTCTGGTAGC 23040

CCAGGCTGGA GTGCAGCGAT GCGATCTCTG CCCACTGCAA CCTTGCTCTC CCGGGCTCAG 23100

GCGATTCTCC CACCTCAGCC TCCCGAGCAG CTGAACTACA GGCGTGTGCC ACCACTCCCG 23160

GCAATTTTTT TTTCCATTTT TAGTAAAAAT GGGGTTTCAC CATGTTGCCT GGGCTGGTCT 23220

TGAACTCCTG GGCTCAGGCA ATCCACCCGC CTTGGCCTCC CAAAGTGCTA GCATTACAGG 23280

CATGAGCCAC CGCGCCTGGC AGCTCATTTC TTAGACTAAA TAAATTGGAG ATGGCTAAAA 23340

GATTTCTATG TAGGCCAACT ATGTTTTTAA AAAGTTTTTT TTTTAAGGAT ATCTGCTGGA 23400

ACCAATCATG CCACCAACCA AAGATGCAAG ACTATAAAAC ATACCCAGTT TTTCAAAGCA 23460

TTTAAAAATT ATTCTAAAAA TATTTTTTCT CCAGAAATTT TGCATTGATT CCCTGAAGAA 23520

GCATTAATAT GGGACCTGAC TTATAAAATG ATGAACTCAA TCTCCCCACT CAAGGTAGGA 23580

GTCTCTCAGA TTTAAAAAAT AAGCATCCTA GTCCTCTTGT CCCTGTAAAA GTTAACCCTT 23640

ACACCTGAAA CACCAGGAGA CTGGCGGTTG TTTGCATAGG GGTTACAATT AAAGTTGAGC 23700

TACCTCTGAC ATCTATTAAC ACCAAAATTA GTAAACTATG CATGTATGGA GACTTTTATG 23760

ATTGAACTTG TTTATTGAGT CAAGAGATAT AGTTTACAAT GAAAATTTGG GGCATATCAA 23820

AATGACCTTG GCTTAGCTTA GCATTTGCTG ATGTTAACTA TTTTCTTCAT TGGGCTGATT 23880

TTAGTTGCTT AGGAAAAATA CAAACACACA CACTTTAAAA TTATATTAAA ATCCCGTCCT 23940

AAACCTCAGA GTCCAGAACC GCATCCTAAC ACTGGTCATG CATAATATGT TTAAATTTTT 24000

GTGCTTTAAA AACTACAAAT AAGGAATGTA TTAATAGTTC CACAATCAAT GGTCAGTTAG 24060

CCGAGGGAAG ATTAGCATAG TTAAAGACTT AAAATGGCTT TACAACATAT ATCAAAAGGA 24120

CAAAATAAGG GGAACAGAGT CTAGAAATGA GGAAACTGGG ACACAGGCAA AAAAAAAAAA 24180

TGAGAACTGG GACATGAATA ACGCAAGGGA TAAGACTAAT ACACAAAACA CCCCAAATAA 24240

ATAGCCAGCA TTTGCTGAGC TCTTACTGTG AGCCTGTTCT AAGCACTTTA CATATATTAA 24300

CTCATTTCAT CCTCAAGGAA CCATCTGAGG CAGGCACTGT TATCATCTCC ATTTTACAGA 24360

TAAGGAATAG ACCCAGAGAG GCTGAGCAAC TGGGCCTATT CCACAGCTAC TATGGTGGAG 24420

ATGAGATTTA AATCTAATCA TTGGCTCCAG AGCCCATGCA CCCAATGGCT GCACTAAGTG 24480

AATGCATGCG CTATCAACGT TGCCAAAAGT GGGCCACAGC TCGGATCTGC GTTTTCCAGT 24540

AGCCAAAGCA GAGAGTGTGA TCAGACCTCA CTTTAATAAG CAAGTCTCAA GCCAGAGAGA 24600

GGTGGTATCA GGCAGCAAAC AGGCTGCTAG TCGAAATCCC ACTTCTTCTC TGAGTGGTCC 24660

ATACAGTTTT ACTCTACTTG CTTACAGAAT GAAAATAGCT GGAGTTCAGG TGCGCTTTCA 24720

ATGCCCTGTT GTCAGGATTG GGCTTTTCAA GTTTATTTTT TGTTGTTGTT TTTAATAGAC 24780

TGTACTTTTT AGAAAATTTT TAGATTTACA GAAAGATTGA GAGGATAGTA CAGAGAGTTC 24840

CCGTATACCT CACACCCAGT TTCTGCAATT ATTAACCTCT TACATTCATG CGGTACATTT 24900

GTTACAATTA ATGAGCCAGG GCCGGCCGGG CACAGTGGTT CAGGCCCCTA ATCCCAGCAC 24960

TTTGGGAGGC AGAGGCAAGC GAATCACTTG AGGTCAGGAG TTCGAGACTA GCCTGACCAA 25020

CATGGTAAAC CCTTTCTGTA CTAAAAATAC AAAAAATTAG CCAGGCATGG TGCTGGTTGC 25080

CTGTATTCCC AGATACTCAG GAGGCTGAGG CACAAGAATT GCTTGAACCA GGGAGGCGGA 25140

GGTTGCAGTA AGCCGAGATC GTGCCACTGC ACTCCAGCCT GGGCAACAGA GCGAGACTCC 25200

ATCAAAAAAA AAAAAAAAAA AAAAAGAAGG AAGGAAGGAA GGAAAATTAA TGAGCCAATA 25260

TTGAGACATT ATTATTACTA AAGTCCATGC TTTATGCAGA TTTTCTTAGT TTTTACCTGC 25320

TGTCATTTTT CAGTTCCAGG AATGCATTCA GGATGCCATA CCACATTTAG TTCTCATATC 25380

TGCTTAGGCT CCTCTTGGCT AGACTGAGTT TTAATCTACT TTCTGCAGAG CCTGAGAACT 25440

TTAGCATAAT TTCCTTGGAA ATTACAGCTC AATATTTTCA AGCACTTATA CAAACAGCCT 25500

AATGTTACGT TGGCCCATAA CAGTGTTTCA AGGTAATAAA CTTCTTTGTT TTCTGTGCCG 25560

ATTGAAAGAA CTGCTGCTTA GCCTCCTGCC AGATGATGAA CTGGGTACAC ACGAGCATTT 25620

TTCCAGGTAA AGCATATTTC GTGCGACTTC TTAAGCTGCA GCCTTATATG CAATAATTGT 25680

CCATTTACAA GACTTATGTT CGAATTTCAG GCACTCTGTT TTCACTAACC ATATCCTTCA 25740

ACTTTGATAA GTACTGCTTT AATCAACTCA GAAAATTTAA CTTGACTAAT TTTTTTTCAC 25800

CATCAGTTTT TTTTCTGTTG ACTCTTTCTC CTTTTTCTGT TTGCCCAGAA ACATGCTCAG 25860

GATTCTCTCA GGCTTTAAAA AATGAAAAAA TGTTTCCTGC AATCTAGTTA CTCCTTGATT 25920

CTCTTGTTCT GTTTATCGCT GGAATTCTTG AAAGCTTGGT GTATTAGTCT TTTTTCATGC 25980

TGCTGATAAA GATATACCTG AGACTGGATA ATTTATAAAG AAAAAGAGGT TTAATGGACT 26040

CACAGTTCCA CGTGGCTGAG GAAGCCTCAC AATCATGGTG GAAGGCAAAA GGCATGTCTT 26100

ACATGGCAGC AGACAAGAGA GAATGAGAAC CAAGGGATTT CCCCTTATAA AACCATCAGA 26160

TCTTGTGAGA CTTATTCACT ACCACAAGAA CAATATGGGG TAAACCGCCC CCATGATTCA 26220

ATTATCTCCC ACCGGGGCCC TCCCACAACA CGTGGGAATT ATGGGAGCTA CAATTCAAGA 26280

TGACATTTGG GTGGGGACAT GGCCAAACCA TATCACCTGG CCTATAGCAT TATTTCCATT 26340

TCTTCCCCAT CCTTTTATTC CTCAAACCGG TACAACCAGA CCTCTTTTTT TTTTTTTCTA 26400

CCTGAAACTG CTCTTTTGAG GGTAGCTGAT AAGTCCAAAA TACTGTCACC TTTTCTCAAT 26460

TCCGTTCCTT CTTATGCCTT TGGAGCAATT GACTGTGTTG GTTGCCCCCT CCTTTAAAGT 26520

GTCTCTCACT TGGTTTTATG ACTAATGATG ATTTTCTTTT TCCTCTCTAA ACATTCCGCT 26580

ATCTTTTTAG CTTCCCTTCC CCCTCCCATC CCCTAAATGT CCTTGTTTCC CAGAATCTGC 26640

CTCACCTCTT TGACTTCTCT ATGCCCTGTC ATTCACTCAT GGGTCTTTAT TACATTATTG 26700

CATCTGTGTC AATAACTCTG GTCTTTCTGT TAAGTTCCAG TCTCCCATTT TCAAATGTCC 26760

CCAGACATTT CCAATTGAGT ATCTCTCCAA TGTATTTAAC CTGCTAAATA TCTAACACAT 26820

AATCTTTCCC ATCAAATCGT TTCCTCTTAA GCTTTTCGTT ATTTCCTATT AGACTCCTGC 26880

ACTTCTCCCA GGAGCCCAGA CTTAAAACCT TGAATTTCTC ACCATAACCT CTCTTTTGTC 26940

TCCCATAATC AATTAGTAGC AAGTGTTATC AATGATTACT TGACAATATC TTTTTCTATT 27000

TCCCTCCCTG CTATGATCAT TCATCTAGCA AGAAGAGTTG GCCCTTTGTA TCTGTGGTTT 27060

CTGCATCCCT GGATTCAACC AACTGTAGAT GGAAAATATT TGAAGAAAAA AGCGTCTATA 27120

CTGAGTATGA AAAAATTTTA TTTCTTGTCA TTATTCCCTA AACAATACAG TATAACAACT 27180

ACAGCATTTA CACTGTAGCG TATAGATCTT ATAATCTAGA AATGATTTCA AGTACACCAT 27240

TATATATAAG GGACTTGAGC ATCTGTGAAG TTTGGTATTT GTGGGGCATA CTGGGACCAA 27300

TTCCCCCATG GATACAGAGG GACAACTATA TTTACTCAGT GCTTACTAAA TACCAGTTGG 27360

CCAATGTGTT TTTCTTTTTC TGTTTTCCTG TCTTTAGTTT GCCCCTTGCC AATTAATTCA 27420

ATAGTGCTGC CAATGCCAGG TGTACCTTCA GAATATTCTA TTCTAATTTT GTCATCTCCA 27480

AGCTTAAAAA TATTTAATGG GCCAGGCGCA GTGGCTCACA CTTGTAATCC CAGCATTTTG 27540

GGAGGCCAAG GGGGGGTGTA TCACTTGAGG TCAGGAGTTC CAGACCAGCC TGGCCAACAT 27600

GGCGAAACCC TGTCTCTACA AAAAAGTATA AAAGTTAACC AGGTGCTGGA GCATTTGCCT 27660

GTGGTCCCAG CTACTCAGGA GGCTGAGGCA GGAAAATCAC TTTAATCTGG GAGGTGGAGT 27720

TTGCAGTGAG CCAAGATCTC TCCACTGCAC TCCAGCCTGG GTGACACAGC AAGACTCTAT 27780

CTCAAAACAA CAATAACAAC AACAACGAAA AACATTTAAT GGCTGCACCT TGCCTGTGAA 27840

AAATGCATTT CTTGGCCAGA TGTGGTGGCT CAAACCTGTA ATCCCAACAC TTTGGGAAGC 27900

TAAGGCCAGG AGTTCGAGAC GAGCTGGGAT ATATAGGAAG ACACAATCTC TACAAAAAAA 27960

AATCCACAAA ATTAGTCAGG CTTATTGTTC ATGCCTGTAG TCCCAGGTAC TCAGGAGGCT 28020

GAGGCAGGAT TCCTCAAGCC CAGGAGTTCA AGGCTTCCGT GAGCTATGAT GGCACAACTG 28080

CACTCCATCT TGGGTGACAG AGCAAGGTCC TATCTCTGGA GAAAAAAAAA AAAAGAAGGC 28140

ATTTCTTAGG AGAGTTCTTC TCTGTAGAGT CCTAAGGGTT CCATGGAACT CCTTAAAAGC 28200

ATCAGAGTAT GTGAGTGCAA TGGGAGGAAG CATTTAGCCA GAGCAGTTGT GCTCCCATTG 28260

CATATTAATT TTTAAAAAAC AAAGCTATAA AAAAAAGTTG AAAACTACTA CGTTAGCATC 28320

AGCCTGACAT TTAATGGCCT CGTAAATCAA ACCTTAATTG ACTTTTTAGC CAGTTATGCT 28380

ACTAGCCAAC TACAGACAAC ACACTTTTTA ACCAAATTAG ACTAATAGTT GTCATCAGTG 28440

GAAATCAAGT TTGCCATTCT TCCATGCCTT TGCTCACACC ATTACCTTTT CTGGAATGTC 28500

CTGTACTCAT CTTCCTGTGT TGAACTCTAT ACCCAACTTT AAAAACCTAG CTCAAAGTTC 28560

AACACTTCCA TTCCATTTCA AAAAGAGCTT TCCTCTTCCT TAAAGTTTAA GAACTCATTT 28620

TCATGAATCT TTTTGGCATT TATTGCACAC ATGCTTGCTT TGTGTTATTT GTGTTCAGCC 28680

TCATATGCCC CCAAGGTGTT TTAGACTCCT TAACGGCAAA AATGATGCTC TAAACACCTT 28740

TCTATCTTTC ATAGTGTCTT AGTCTGTTTG TGTTGCTATA AAGGAATACC TGAGGCTGGG 28800

GAATTTATTT AAAAAAGAGG TTTATTTGGC TCACAGTTCT GCAGCTATAT AAGAAGCATA 28860

GTGTCAGCAT CTGCTTCAGG TGAGGGCTTC AGGAAGTTTC CACCCATGGT AGAAGGCAAA 28920

GGGGAGCAGG CATCACATAT CAAGAGAGGA GGAAAAAAAG GAAGGAAGAA AGGAGGGTGC 28980

CATTCTCTTT CAACAATCAG TTCTTGTGGG AACTAATGGG ACAAGAGGCT GGGCACGGTG 29040

GCTCATGCCT GTAATCCCAG CCCTTTGGGA GACCAAGGTG GGTGGATCAC CTGAAGTCAG 29100

AAGCCTGAGA CCAGCCTGGC CAATGTGGTG AAACTCCGTC TCTACTAAAA ATACAAAAAT 29160

TAGCTGGGCC TGGTGGCGTG TACCTGTAGT CCCAGATACT CAGGAGGCTG AGGTAGGATA 29220

ATCACTTGAA CCCGGAAGAC AGAGGTTGCA GTGAGCTTGT GCCACTGCAC TCCAGCCGGG 29280

GCAACAGAGT GAGACGGTCT CAAAAAATTT TAAAAACTTT AAAAATAATA GAGCAAGAAA 29340

GCACCAAGTT ATTCAGGAGG GATCCACCCC CAATGACTCA AATACCTCCC ACCAGGCCTC 29400

ACTTCCAACA CTGGGGATCA ATTTCCGTAT GAGATTTGGA GGAGACAAAT ATCCAAACTA 29460

TATCACATAG TAATGAACAT AGTACCTTAT CTATAGAAAG CAATGGCTAG ACAACTGTTG 29520

AATGGCTAAC CAAATCTGCT TTCCTATGGT CTCGCTCTAG AGGGGGTCAG TATGAGTTTC 29580

TGTCAAAAGG AGAAAAAAAA ATGTATAGTC AGTTTTGTGT GTGTGTGTGT TCATGTAAAA 29640

GAGATCAAGA GAAAAGAACA AGAGAAATCA TGAAAAGGAG GGGGAATATA AGAATAATAC 29700

ATAGAAAAAA GCAAATTATC TTGTTTATCA GTAATACCCA AGGGGGTAGA AATGGTAAGT 29760

AATAATCCTT CTTCACTTTG TCTGTAGTTC ACTTTTTTGC ACCTTTATTT TGATGAATTC 29820

ACATCGAAGA CATTAACTCA TTAAGGCTTC CAATATTTTT GGAGATAAGA AGGGCTGCTA 29880

TGCTCTTTAT AGATGGAAAA CTTGGGTCAT TAATAACTCA AACAAGGACA TAACAAAGAA 29940

ATGGAGCATA AACTGCCAGG TCCTGACTGT AGATTTGGAT TCCCAGTTGG TGTCTTGTCA 30000

CCCTTTGTTA CTCTTCCTAA AGTTATGATC TTTTCTTGTG CATAGGAAAT TCATAGTGAT 30060

TTCCCATCAC CCTTGGGATT ATCATAGCTC CTTTAAGGTC CCCTCTATGC ACTCAATAAC 30120

ATCAACAGTA AGTGTTCTTC GAGCACTTAC TGAGTGTATA TCATTGTGTT CTCACGCAGC 30180

ACCCACAGAT CTCACCAAGA ACCTAGCTGA AGCCTGTAGA ATGAATAGGT AAGTACTGCC 30240

ATGCCAATCT GGAGTACTCA AGCGATGCAA ATGATTCCTT TAATTGTACT TTTGCAGGCT 30300

TGTCAGTTTT GCTCATGGAG AAGTGGCTAC TGCATCCATG TTATATCTAT GTAATGTTGG 30360

ACTGCGAAGC ATCACTTGAC TTTTTCCAAG CAGAAATTAC AGCTGATGAC AAGCTGCTGC 30420

TGAGAAAATG GATATTTTTC TGAATTCAGT TCTACGTGGA AACAGCTGAC TAGTTTCCAT 30480

TGCTGTAAGA ATGGCTCTTT TGCTCTTGGT TGATTTTGAG TAATGGCTTT ACTTCTGTAG 30540

AAAGGAGATT TCATTTGAAG TCCACTCAGG GATTTGGTTC AACAAACTGG AGTACAGGTT 30600

TCAGAAAATA TCTCTTTAAT CCTCCAATAA TAAATTTTCT CATCTATAAT TCCTGGAACA 30660

CTTCATCCTT TGCAGCCGAG CATATAGATA GATTTGTTGC TCACTGTGTT CTGATTGCCA 30720

CTTTGACCTG CTTTTTCAAC TTAGGTTACA AATAGAACAG AATCTCTCTG ATTTTTCTCA 30780

TTAATTGTTT GAATTCCCAC TTTTCCTCAT TAGCAAGAAG TCCAGTATCT TCCTGAGAAC 30840

TTCCTTTTCT CAATCTAGGA ACTTACTTGG TCCATAAGGT AACAGTCTTA TTTCTGACTA 30900

TCAAGGAGAG AAATAACAGG AGCCATTATC ATCTTCATGG TGTCACTTTT GAAAACTGGT 30960

CCTCTGTAGA TCTTCAGATT CTTGCGTTAG TCCATTCAGC TGCTATAACA AAATTGCATA 31020

GACAGCATGG CTTATAAATA ACAGAAATGT ATTTCTGACA GTTCTGAAGG CTAGAAAGTC 31080

AAAGATTAAG ACACTGGCTG ATTTGGTGTC TGGCGAAGGC CCATTTGCTC ATAGATGGAC 31140

GATGACCTTT CACTCTGTCT GCACATGGCA GAAGGGCAAG AGAGCTCTCT GGGTCTTTTT 31200

TATAAGGGCA CTAATCTCAT TTTTGAGGAC CCTGCCCCCA TGACTTAATC ACCTCCCAAA 31260

GGCACTGTCT CCCAATACCA TCACCTTGAG GGTTAGGATT TCAACATATG ATTTTGGGGG 31320

GACAGAAACA CGCAGTCCAT CTCGCTTGTC CACTCCATGG TGGTATTCTT GCTGGATCAG 31380

TTTCCTCCTT GGGGTGCATT TGTGTTCCAT GTCTAACTTG CAAGTTATAG CAGGCCCGAT 31440

AGCAAAGTAT TCCAATGTTG GTATGCAGAG GCATTGAATA ATCAGAATGA ACCCACGCCA 31500

TAAACAACTG GTAGAGCTGC AGAGAGTACC AGCTGATTAT GAGCCCTGGG TAACAGTGGT 31560

TTTTAGTTCC TATGTCCGTC AGCCCTTTTC TCCCATAGTA GCCCCACTGT GTTGAAGTGG 31620

CTGAATCGAC AGAAGCTTCC AGCTTGGGCC ACATGCTCAT GGAACCAATT CTCCTTATGA 31680

GCCGTACAAG AGCTGGGTTG CCATTCTGGA TACCCTCTTT TTTCAAGAGA TTTTATTTCA 31740

AGGATATTTT TTCTTTTATC AACTACAGGG ATTATTTAGA ATCTTAGGGC AGTGGTGCCC 31800

AACCTTTTTG GCCCCAGGGA CAGGTTTTGT GGGAGACAGT TTTTCCATGG ACCAGTGTCA 31860

GGGGGCTGGG AGGCATGGTT TTGGGATGAG TCAAGTACAT TACGTTTGTT GTATACTTTA 31920

TTTCTATTAT TATTATATTG TAATATATAA TGAAATAATT ACACAACTCA CCATAATGTA 31980

GGAATCAGTG GGGAGCCCTA AGTTTGTTTT CCTGCAACTA GACAGTCCCA TCTGGGGGCA 32040

ATGGGAGATA GTGACAGATC ATCAAGCATT AGATTCTCAT AAGGAGTGCT CAGCCTAGAT 32100

CCCCGGCATG TGCAGTTCAC AATAGGATTT GCTCACCTAT GAGAATCTAA TGCCACTGCT 32160

GATCTGACAG GAGGTGGAGC TCGGGCAGTA ATGCGAGGGT TGGGGAGCAG CTGTCAATAT 32220

AGATGAAGCT TTGCTCGCTC GCCTGCCACT CACCTCCTGC TGTGTGGTCC ACTTCCTAAC 32280

AGGTCACAGA CTGGTACTGG TCCATGGCCA GGGAGTTGGG GACCCTGTCT TAGGGAGTAG 32340

GGGTGGAGTT CCCTTCACTT CTAGAAGGCC CTGGATTAGT ATCCCAGAGC TGTCATTACA 32400

GAGTATCACA AACCAGGTGG CTAAAAACAG ACATGAATTC TCTCTTATTT TTGATGGCTT 32460

GGAAGTCCAA AGTCAAGGTG CTGCCAGGGC CATGCTCCCT CTGAAATGTG TAGGGGAGAA 32520

TCCTTCCTTC CTCTTTCTAG CTTCTGGTGG TTTGCTGGCA ATCACTGGCA TCGCTTGGCT 32580

TGCAGCACTT CAACATCTGC CTTTACTGTC TCATAGTGTT CTCCCCTCAT GTCTCCAGGT 32640

CTCTCTGTCT CTCTTCTTTG TATAAGGAAA CTAGTCATAT TGGATTAAGG GCCAACCCTA 32700

CTCTAGTATG ACCTCATCTT AAGGTCACAT GCAATGACTA TTCCAGATAA GGTCACATTC 32760

TGAAGAACTG GGAGTTAGGA CTTCATATCT TTTGAAGGAA CACAGTTCAA CCAATAACAG 32820

CCCCTGTACT GTTTTACAAA TAGGTATTCC TCTCCTTCCC AAAGTTCTTC ATAGCAGAGA 32880

CAACTTGTAC CAAAAGGCAA AATACCTTAT TATGTAACCT TAACCTAGGA TCATAGATCC 32940

CTACTGTCTG GTGCTTTATA AGCACAGAAC CACCGGGAAA TCATTATTAA GACAAGGAAA 33000

GGCCAAGTGC AGTGGCTCAT GCCTGTAATC CCAGCACTTT GGGAAATTGA GGCGAGTGGA 33060

TCAACCTGAA GTCAAGAGTT TGAGACCAAA CTGACCAGCA TGACAGAACC CCATCTCTAC 33120

TAAAAATACA AAAATTAGTT GGGCATGGTG GCATGTGCCT GTAATCCCAG CTACTCAAAA 33180

GACTGAGGCA GGAAAATCAC TTGAACCGAG GATGCCAAGA TAGCAGTGAG CCAATATCGT 33240

GCCACTGCAC TCCAGTCTGG ATGATAGAGC AAGATCCTGT CTCAAAAAAT TAATAAATAA 33300

ATAAAAAGAC AAGGAAAGCC TTTTCCAAGG AGACCCTTCT GCTTTGCTAG TTCAGAGAAC 33360

TTCTCTTTTG GAGAAAACAA ACACCCAGTC CATTAGCAGC AACGTCAGGG ATTGAATTCT 33420

TAGGGCAGCA GGCTGGGCAC AGTGGCTCAT GCCTGTAATC CCAGTACTTT GGGAGGCTGA 33480

GATGGGTGGA TCACTTGACA TCAGGTGTTC GAGACCAGCC TGGCCAACAT GGTGAAAACT 33540

CATCTCTACA AAAAATATGA AAAAAAAAAA AAAAAAAAAA GCTGGGTGTG TTGGCTTATG 33600

CCTGTAGTCT CAGCTACCTG GGAGGCTGAA GCAGGAGAAT CACTTGAACC CGGGAGTTGG 33660

AGGTTGCAGT GAGCTGAGAT TGCCCTACTG TACTCCAACC TGGGTGACAG AGAGAGACTC 33720

CATCTCAAAA AAATAAAGAA TTCTTCGGGC AGCAGTCTTT CCTCCACCTC ATAGACCATG 33780

GAGGTGAGCC AGCTCTGACA AACCATGAGA ACAATGGCAG AGACATACCT GTAACGTAAC 33840

TGACTGGGGC AAAGACAAAG GTGAGGAAAA TGACAAGTTT GAGGAACTAT GAGACCAGGC 33900

AGTGGGGAAC ACCACTAGCA GAAATGATGG AAGTTCTCAA GAATAACAAC AGAGAAATAG 33960

ACCATGGCCA GAGTCTAGAA CCCTCCAGGG AAAGGAGATG GGCTCCAGAG GCAGAAGAGG 34020

ACGTTGAAGG GAATGGGGAG TGGGTGAAAT ATATAGACGA TGGGGACCAC CCAAGAGCAG 34080

TCGCTATTGC AAAACTGAGG AGAAGGAGAG TCTGGAGGGG GTGGTGGGAA GCTGGGTCTC 34140

CTAAGGAGGT TTTGACAAAA GCAGTCATGG AGCGGGCTTA GAAATCACAG TTGGGGACAG 34200

GGTAAAGTTC CTCGGGATAT AGAGGATGAG ATTAGAAGAG GTTCCAACTA GGGTAGTGTG 34260

GAGAAAAGCA CTATTGACCC AAAAAGGAAG GAGAATGTGG GTGGAAGTGG CAGAGAAAGA 34320

GGGGTTTGAG CAGAGAGTGG TGATTTTTCT AATGCAGAGT TGTGGGAGGT GGAGTGCAGG 34380

GAGCCAGGCT GGGTGGCTGT GCTGATGTGA TTAAGCACTT ACTGACTGCC AGGCAATGGG 34440

CTAAGTACCT GAGATGCTTT GTCTGTTATC CCTCCCGAAA CCCCTCTGAG CAGGTGCAGT 34500

TATTATTCTC ACTTCACAGA TAAGGAAATT GAGGCACAGA GAATTGAGTA ACTTACCCAA 34560

GGTGACATAG CTCATATATG GTAAAGCAGG CTTTGAACTC AGTCTAGCTC CCGAACCTAA 34620

GCTTGTAACT ACTATGCTTT TCCCAAAAAA AGGGGGCTGG CACAAAAAGA GCTGAGGGGG 34680

CTGGGCATGG TGGCTCATGC CTGTAATCCC AGCACTTCGG GAGACTGAGG CAGGTGGTTC 34740

ACCAGAGTTC AGGAGTTCGA GACCAGCCTG GTCAACATGG TGAAGCCCTG TCTCTACTAA 34800

AAATACAAAA ATTAGCTGGG TGTGGTGGTG TGCACCTGTA GTCCCAGCTA CTTTGGGAGG 34860

CTGAGGCAGG AGAATCGCTT GAACCCCAGA GGCGGATGTT GTAGTGAGCC AAGATCATGC 34920

CACTGGACTC CAGCCTGGGT GACAGAGTGA GACTCCATCC AAAAAAAAGA AGAGCTGAGG 34980

TGATGGCCAC CATCAGCATC AGCCTGGAAG TTATAGCAGG ATGCTAAGTT TCTCTAAAGC 35040

TGTCTTTCTT AGGACTTGAA AAAGATAACT TGGGTTTGTA TCCCATCTCT GCCATTAGTA 35100

GTTTACTGGC TTTGGATAAA TTACTTAGCC TTACTGAACC AACTTTGGAT TTTTATAGAG 35160

ATACTGTAAT GAAAGGAATA AGGTATCAGT CTTAGCAGAG CATCCAGAGT GTTCCTATTA 35220

AAACCTAAAT CATATCCTGT CATTGCTGTG CCCCAAACCA TTCAATGGCT TCCCAACTCA 35280

AAGTTAAAAA CTCATCTTTC CAGTGGCCTG CAAGAGCCTA TGCTATCCGG TGTCTGACCT 35340

CATCTGTTGT TCCTTTCTCC CTCCCTTTCT TGGCTCCAGA CGCACTCTGG TCTCCTTGCT 35400

GTTCCTTGAA TACACCAGGC ACACTCTCTT CGCCTGAAAC ACTTTACCCC AGATATCTTA 35460

GCTTACTCTC TGCCTCCCTC AATTCATTGA TGAAATGTCT CAGTGAAGTC TTCTCTCTCT 35520

CCTCTGTAAA AGTATACTCT CTGTTCCCCT TCTTTACTGT TCTAGCTACT ATTGCTGTGT 35580

AACAAATCAC TCCCCAAATT TAATGAGTGA AAACATCAGC CATCATCTTA TTTCTCACGG 35640

TTTCTGAGGG TCAGGAATTC TGGAAGGGCT CAGCTGGGAG GTTCTGGCTC TATAATCTCT 35700

TATGCAGTGA GAGTCAGATG CTGGCTAAAA CTGAAACAAA GCAGGGTTCT AGTAGCTGAG 35760

GGCTGGCTGG GTCTCTCAGA TATAGTTCAG ATCTCCTCCA GGGGGTCTCT CCACGTGGGC 35820

TAGTCTGAAC TTCCTCACAG CATGGTGGCC TCAGGGCAGT GGACTCTGCA TAGTGGCTGA 35880

AGGCTTCGCA GCTGAGTATT CCAGCAAGCA AAGTGGGAGC TGTATTGCCT CATATGACCC 35940

AACCTTGGAA TCCACACAGC ATCACTTCCG TGTATTCTAC GGGTTGAAAA GTCACAAAAA 36000

CCAACCAGTT TCAAGGAGAA GGAACAGAGA TCACATTTCT CAATTGGAGA AGGGTCAAAG 36060

TCACATTGTA ATCAGAGCCT ATGGGATACG AAGTATTGCG GTCAGGTATG AAAAATTTGA 36120

TTTGCTGCAT CTGCTTTACT TTCTCCACAG CGTTCATGAT CTGCTTCTCA CATGATATTG 36180

ACTTACGTCA TTTCTGCGTT TCCTGTCTTC CACACTAAAA TGTCAGCCTG TTTTGTTCAC 36240

TGCTGTATCC CCAGAGCCTA GCACGGAGCC CAGCATGTAG TGGTATCCAA TAAATACTTG 36300

TTGCATGAAT GAATTCTGTC TTTTAATCCT AGCTATAGGT TTCTAAGTTA AATATTACTA 36360

TAATCATCTT ACAGACGAGG GAAATGAGGC TCAAGAAGAT TTGGTAACTT ATGCGGGATC 36420

ACTCAGCCAC ATAATGGAAG AGACAGCATT GAAGTACACA TGCTTGCTCT GTCTGCTCTT 36480

CCAAGCTGCT CATCACACAG CTGCACCTCT GAGGACTTCC CTCCCCAGTC CACCTCCACC 36540

CTTACCCAGA GACACACATG GCCACAATCC ACTAGCAGAC CAAAATTCAA TTTTTCCCCA 36600

GTTGGTTGCA CTCAAGCTGA GAGCAAAGCA ATTGCACTTT AAATCCCCTT ACAGCAGATA 36660

TTTCAGAGCA TGTTCGGAAG AACCCATCAC ACTTGGCTTT TAGATCTTAT TTCTGGTTTG 36720

TTACAAAAAC ACAATTAAAT GAAAGGTTAG GTAGCTTTTG AATGGCCAGC TCAAAGTTTT 36780

GGCTTATTTT TGCCTTGCTG TCTTTATAGG CATTTTACCA ATATTTATCA CTATTTCCCT 36840

TAGGGAACCC TTAGATCTGT GATATTTGAA ATAATAAAGC CTCTCCATTG GCCCTTTAAA 36900

AGGTTTGTGG TAAAACCACA CCATTAACAT TCACAGTTCC TTATTTATGA GGCCTGATTG 36960

CACTTATTTC CATATTTCTC ACTGTTTCTC CGATGAGGAT TTCACATAAT AGTGTTTGAA 37020

GGCTAAAGAC TTCAAAGCAG ATTCTTTACT ATTTTTATCT TGAAAAATAT TCAATATTTG 37080

TGTAATTAAA GTGAAGTCTT CCTAGAGAAA ATGACAACTC AAATAATCTT AAATGTACCT 37140

CCAAGAAAAA AGCTGTCAAA GTGACATTTA GTAATAGAGT CACATTCTCT AAGGCCTTTG 37200

CTTCTCCTTC TGATTCTTAT CATCTTTGAA GGTTATGTCA TGGGCTGACT TCAAATCAAC 37260

TTTTAAAATT ATTATGGCCT TCTTTAAATG TGAGTTCTGA AGGTGAGGGG CTTTATCTTT 37320

CTTTTGCTCC AGATTTTTTT TACCGCGTCA TTACCAAGCA TCTTAAAACA AAACCTAAAA 37380

ACAAAAATCT TCCTTGACCT GGTTTTTCCC ACTAGCTAAC ATCCTATTTT TATCTTTCCC 37440

CTTTGCACTA AAGGTTTTTA AACGGATCTT TATACCCTCT GTCTCCATTT TCTCATCTGC 37500

TAACTTATAT GGCAAAGATT ACCACTGCCT TTCAACATAA TTGGCCAATC TACAGAAAGT 37560

TTTCAAGTTC TCTTTTTAAT TGACCACCTC CTGCCTACCT CCCCACCTTT GACATCTTGC 37620

TTCTCACTTG GCACCTTACC CAGTGTTCAA GATTCCCTCC TTTAGGATGT CTTCAGAGCA 37680

GCTACACAGT TGGTACTATA ATTTATACAT CCTTGTACAC AGGGCTTGCT GGGATATTGA 37740

TGGAGAGAAG GAGGAAACTG GAAGTAGTTC AGGCCAGAGC TAGGGAAATT GACCCATCTC 37800

CAGGTCTCAG GTCTGCAAGG GGAGCTCACA GCTTAACACA TGGAGTCTAG AAACTTGTGC 37860

TGGACCTTGA CCAACACCAG CCCATGGAGT CCAATACAGT GCTCAATAGG GATTTCCAGG 37920

AAATTGCTAT ATTTATTCAA AGAGAACTTA CCAAGTGTCA GCTACGTGTT GGGCATTGTG 37980

CTAGGCACAG GGACCACAAA GATAAGACAT TGTAGCTTTC CTTAAGTTGC TCACTGAGTA 38040

AATAGAGAGA CAGAAAGGTA AACAGGTAAG TGCAAAAATA CATACAATTC AGCAATAGTG 38100

TTCATAGTGG CTATGGAGAG AACGCTCACT AACTTTGTTT AAACAGTTGT TCTTTCAAGG 38160

ATTTGACATG GATTTGATTG GAAAAGCATG ATACCATTTT TTGCAATTAA ACACAGGAAT 38220

ACATAAATAA AATGCATCAG TATTTTTTAC AAATAGCTAC TAAGAGCTAC TAGAAAACCT 38280

GGGAATTCTT AAAACCTTAC CATGCTACTT GCTCTAAAAT ATTTTATTTT ATGTTATTTT 38340

GTACATTTCT TTACCTACAC AAACACCACT GTTTTCTTCA TTTCTTAGTC TATTTAAACC 38400

TCACACCCTT TCAGCATCTC TTAATTATTT ACTACCATCT GTTAGTTCTC CTGTCCTGAA 38460

TGAAACAAAA ATGGCAGAAT GTAAAACGAG GGCGAACAGA TTTTTGACAG GAAGTATTCA 38520

GAGGTAGAAG GAAATAGTCA AGACACATAT GATAAACGAA AACAATAATA ACTTTATACA 38580

TAACAACTTA TAGACACATT TAAAAAGTTT AAGATCTCAA GAGCTATGTC TGAATAGATA 38640

GGAGTAAAAA CTCTATTAAG TAATTAGGAA AATAACAAGA ACAGTGAATT TCTTAATGAA 38700

TGGCATGTAA TCAAAACTGT ACTTATCGTC TAATTCATAA TCTTGAATGT TTTTATTTTA 38760

TTTATTTATT TTTTTATTTT TTGAGACAGA GTCTTGCTCT GTCACCCAGG CTAGAGTACA 38820

GTGGCGTGAT CTCAGCTCAC TGCAACCTCC ACCTCCCAGG TTCAAGCGAT TCTGCTGCCT 38880

CAGCCTCCTG AGTAGCTGGG ATTACAGAGG CCTGCCACTG CACCCGGCTA ATTTCTGTAT 38940

TTTTAGTAGA GATGGGGTTT CACCATCTTG GCCAGGCTGG TCTTGAACTC CTGACCTCAT 39000

GATCCACCAG CCTTGGCCTC CCAAAGTGCT GGGATTACAG GCGTGAGCCA CCACGCCTGG 39060

TCGAATGTCT TTATTATTTG AAGAGACAAC ATGGGCCTTA AATCTGTCTT CTATTTGACA 39120

GACTTTGATG GAGTCAAATC CCAATGCTGC CACTTACTGA ACGGCCTTAA ATGACTTAGT 39180

CTCTCTCAGC TGTCTTTCTG CATATGTAAG GTGGAATAAT GATGGCTTTC AAGGAGGAAT 39240

AAACCTATGA AAAGTGTTGA GGATAGTGTT TGATATGAAA TAAGGATTTC AACAAGTAGT 39300

AGCTGCTATT GAAGATTTAA GAGTTATTTA TTACAACTAT TTAATAAAAT TTTAAAAACT 39360

AATACACTTA AATTATTAAA GAGCTTTGAA ATGGGCCAGG CGCAGTAGCT CCTGCCTGTA 39420

ATCCCAACAC TTTGGGAGGC CAAGGTGGGC GGATCACCTG AGGTCAGGAG TTTAAGACCA 39480

GCCTGGCCAA CATGGTGAAA CCCTGTCTCT ACTAAAAACG CAAAAATTAG CCAGGTGTGG 39540

TGGCATGCAC CTGTAGTCCC AACTACTCAG GAGGTTGAGG GAGGAGAATT GCTTGAACCT 39600

AGGAGGTGGA GGTTGCAGTA ACCCGAGATG TCACTGCACT CCAGCCTGGC AACAGAGCAA 39660

GACTCCATAA AGACAACAAA AGCTTTGAAA TTGTGTAAAT GAGTTGTACC TATCTTCATT 39720

TAAGAAATTC ATCTTTGTTC ATTTATTTTT ACTTGACATG AGAGCTTCCA GCAATTTTTA 39780

ATTAAGCCCT CACAGATTTT ATGTCACTGG CTATGTGATA AACAAATTAT TTGCTAAAAT 39840

AATATTCTTG CTTCTTTTTT AAGGAATTGT CTCCCTAGAA ACGGTTTGTA CCAAACAATA 39900

CACTGACTTT ACACAAAATC AGATCTGATT GGCAACAGTT GCAGATGTTT TCAAAAGATT 39960

TTCATTTGAG AAGGGGCCCA TTTGGGTTAT TTAGATTCTA AGAACTGAAA CTGCTTTGTT 40020

CTGTTTTTCT GGCTTCTGGG AGAGGAGGAG ACATGAATTC AGTTAGCACC TTGGTATTTT 40080

CTTTATCCTT CATTTCAATA CAGAAGATGC TTCATATGCA CAGTGGTGTC AGGTCACATC 40140

AAAAGAAAGA GAAACAGTTT CTTGGTTTTT AATTTTCAAC CGGAAAGGAA AGGCACCCAT 40200

TTTGTTCCGC TCTAATTAGC CAGTGCATGA CTTAGAGAGC AGGCAGATGC TTTGAAGGCG 40260

TGGTAACACA GGTCTTCATT AATCTCCACG CAGGACTTGC ACTTCTACTA TGCCTAGGCT 40320

GAAGAAAATG GCTCAGGAAG ATGAACAATC TCACAGAGCC CTAACTAACT GAAGCCAGGT 40380

GTTATAAAGC ACAAGTCAAG AGGGTGAGAA ACTAACGTTC TTGAAATCTC CCACTTCTTT 40440

CTACGTCAGA AGAGCCAAGC TGATTATTTT AGTTGGAATT TAGAAATTTT TAAAAATTAT 40500

TCTAAAGTCA TGAACAAGCC TAATTATAAA GATAGTTGCT GTGAAGGTGC TGAAATAACT 40560

CGATTTTACC AACCCCCTCT TCTGGAGGAA GCCATAATGG AATCCTGTAC AATGTTCACT 40620

CTACCAACGA ACTCTTGTTT TTCTAATGAG GAAACAGAGG CCCACAGTAT TAAACTATCT 40680

TAACCAATAC AAAATGACTA GTGCTCTGGT CCTTTTATTA AGCACTAAAA TTTTGATCCA 40740

ATAATAAATC TGTCCATTAG AAGGAGTTTC CCTAATGTAC TGGTTCTAAC TTGTTCCCTT 40800

CAAGGGGCCA GTGTCCCGTA CACATAGCTA AATGGGACTT CTCTTCAACT ACCATTACCC 40860

AGAGGGCAGA ACCTAAAATG CTGTGAATGA CATTCTGCTG TTCACATCTC AGCAGCA 40917

(2) INFORMATION FOR SEQ ID NO:3b:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39678 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3b:

GTGTTGCATT TGAGCTTCTG CAGGGCCACC CAGGACCTAT ATCTGCTCAG ATGTTTAACT 60

CATCTAATTC AGTGAACACT TCATTCTAGT TAACTGAACA TCTACTTTGT ACAAGGCACT 120

ACAGCGGTTC AGAGATGAAT AAAATCATGA GATTCCACTG TCTCCTATAA ACCATCACTT 180

TGGGAAATTT TAGAAATGTG GGTAAGCTCC AGGGCTTCCT GCAGCGTAGA AGTCACAAAC 240

TCAAATGCCT GCAGAGGCCC AGCTGACAAC ATAAGTAAAT GATTCTGGCT GGGCGGAAAA 300

CAATTACGGG TGGGTGGGTT TCCAGCTGGG GAGTGCACGC CTGTGTTAAA GGACAGCTGC 360

TACTCATTTC CAGCCAACTG TGTTCCCATG TAGAACTGCG GCCCAGTGTA GCCAGTACCG 420

AAGATTTCTC AGAAAAAGCC GGAGATCTCA ATGTTAGTGT AAAATCTCTC AAATTTCCAA 480

GAGGATTATA TGGGGCAAAG GTTCTCAGAT CAGTTTGCAG TCTCTTACTT AGCCCATGTG 540

CAGAGCAGTC GTAGAGGGTA GCATGCAGTG TCCTACATAA TAATTCTTTT TTATTTTATT 600

TTATGCCTTC CTCCTTCCTG TCTCTCTTTA ACCTTTCTTC TTCCCTCAGG CTGGCTTCTT 660

CCCTCAGCCT CGTCCGACCC CAGCCTGGGT TCAATGAACA TTCGGTAAAG GAACACGGAA 720

TGTCAAGCGC ATTAGAGACA ACCTTGAGAC ACATTCCTCT TGCGGTAAGC ACTTCACTGT 780

AGATTTTTAA TTTTAAACAA GACAATGTTT ACGACTTGCT TCTTTCAGGG AAGAGCGATA 840

TCAATTTTAG TGAACACTTC AAGGCTGAGA TACGCTAGGA GAGTCGTGTG GTGTTGCACA 900

GCAAAGAATT CCACTTTGAA GCGAGTGGGA AAAAAAGCAT CAAATGCCAC ATGTAACTCA 960

CCGCCTGAAG GGTTACATTG GTATGAAACC TGGGTTTAAA AAGGGACCGA ATAGACTAGC 1020

CATTAAAAGA CCTGCGTACA ACCTCTCTCT CTCTCTTTGA GAGATAATGT ATCTGGACAA 1080

TAAACATGAA CAGAGTGGAG TCTATCCTGT TTAAAACATT GCCTACTGTA CAGGCACCAG 1140

GAGCTGAAGG GTCAGAATAT TAGCAGTGGG AGCTTGATTA GAGTTGATGA GAGATGGGTA 1200

GTAGGAGGAA AGAGTGAGAT AGAGGAAGAG GACATGGGGG TTACCCATAA GTGGAGAGTA 1260

GAAAAGTAGA ATCAGCTGGC CATCAAAGGG CGTGGGACTG AGGAACAGTA TGGCATGTAT 1320

TAAATATACT AAGCGCTGAC ATTGGAGGAG AACTAGGAAG TTAAATGAAA TCAATAGGGG 1380

ATGATGGAGA ATAGTTAGGT GTGCAGGGAT TAGGGTTATG ATAGAAATAC ATGTGAATAC 1440

ATGCAGTATT GTCCTGGAAA ATGGTTAACA GTTGGTTCTC CTGGGGGGTG AGGGGAAGCC 1500

CTGATTTGTA ATATTTGCCT ATTTCTGTGG TGCAAATACT CCCACCATGA CCAGTTTCAA 1560

GCTATGAATG TTGAAGTCAC AGAAAGCAGG TTGGGAGGAG ATGCGCACAT TTGTTCCCCG 1620

GCAAGGTGGA AGGTAAGGAA GGTGAAATCA ACAAGGTCAA AGAAAACTCA AGATTTCGAG 1680

GTGCCTCAGG TCTGAGGGGC AATGAAGTCT AGGAATGGCT GTGCTGAGGT AGCTGAAATA 1740

GAAGTGACTG CAGAGGTCAT GAAGCTGAAG AGGTGAAAAC AGAAATTAGA AAGGCAAACC 1800

CCCACCGCCC AACCCCCACC CCTGCAGCCA GTTTCTGAGG GTGACAATAG AGGAAAGGGT 1860

GGAGATGGAG TTCAGGTCCA GAAGCCATAG AAGCGAGTGT GACATTGTGC TCAAGGTCAG 1920

CACATGTCAG TGTGGGGTGT CACATGCTGT TGTGAACCAT CATTTATCAC CAATTATGGA 1980

AGACCTCCTA TGGGCATCTT GCCATATGCA TTATAAAGAT GTGTAAGAAG ACATTTCCCT 2040

CCACTTGGTG AGGAGAATTA GGGCTGTACA CAGATACTGT AGAGTGCCAT GTGCCTGGTA 2100

CAGATAAGGT GTGTTAGAGG TTAAAAGATG AGGCTCTTAA TATTAATGAT AGATCCCACT 2160

TACCTGAGTC TGACTTACAA TGTGCCTAGC ATTAAGTGTT TTACCTGCAT TCCCTTTGAC 2220

GTTCAGAACA ACCCATTTTA CAGATAGGGA AATTGGGTCA GAAAGTTTCA GTAACTTATC 2280

CAAGGTCAAC ACAATTGGCA AGTGCCAGAG CTGAGCCAGG AACTGAGGTC CTTCTAACAC 2340

CAAACAGCTT GTCTCCCCAA TCACTGTGCT ATTTTCCTCC CCCAGAAGAT AATACTCTGA 2400

TGGAAATGAA GGATAGTGTA ATAGGAGATT CGGTGTTCCT TTTTTTAAAA AAAATTCAGC 2460

TTGCATATTC CTAAAGAGTC AATTCATGTT TAAAAAAAAT TTCCCTTGTG CTTGCATGTG 2520

ACATGTATTT TTAGGATCTG CTGTTAGCAA GTGTATTTTT GTGTGATTGA GTGGGAGAGT 2580

GGGAAAAGTT TTGCAGAGCT GTTGAAGCCA GAATGCAGGG GGGCTGCGCA GCAGAGACTG 2640

TAAAATCTCT GCCATCTCAG GTCTTGGAAC AAGCACAAAG AGATGTGTTC TCGATTTATT 2700

ATTCTATGTA CATCCCCAGA TGAATGACTA GTTAAAGGTA TTGTTAAAGC ATTTTAAATG 2760

ACCCACTTCC AGCAGCGAAC AAAATCACTT GCTGTGCCAA GCCAACTGGC ATTTCTGAGA 2820

TGATAAAACC ACAAAGTGAG GAAAACGTTA AAACTGCTAA AGCAAAAATG ATACACAATA 2880

ATGGAGAAGG AGAAAAATTG AGCTTTATTG TCTGCCTAGG CAGATGGCTG ACCACTAGGT 2940

GGGCCTCGGC GTCACGTCCA GGGTAATTGG TTGCTGGGGT GTTTCTGGCG AGGAAGATTC 3000

ACGCTTCAGC TCGGTCCACA AGATCCTGGC TCATTCTTTC CTAGATTCCA TTTTCTGCCT 3060

CCTCTCCATG ACTGGGTCTG ATGGTTGATC CAAACGGGCA ATTGAAATCA GAAGGTTACC 3120

TTTACCTTAA AATGCTTTTC TGGAAATAAA AGGACATGAA AAGTAACTAA GGACCGGATT 3180

TCCTAGCCGT CTTTCTCTCC TGCATGCGCA ATTTATCCCC AGATATAAAA TTGCCTGCTT 3240

TGATAATTAT ACCCTCTAAA TGAGGGGCAA GTGGCTAATT ATGCCCACAT GTGGCCGATT 3300

GCACTCCCCA TTAGCCAATT ATGTGCTCAA TTATTTGTGC ACATGAATAA TTGCACTCAT 3360

GGAAAATAGC GCCCTCCTTT CAAATCCTCG TGCTTGGAGT GGCTGATGGA GTAATTGTCA 3420

CACTGGAAAT GCACTTGGTG GGGAGGGAAA GAGTATCAGA TACCAGGAAA CGCATAAGTG 3480

ACCAGAGCTC GCAGATGTTC ACTGCCACAA ATGGCCTTAG GAGCCAGAGA GAGCGGGAAG 3540

GACCACAGGA TGGAACGGGC CAGCCTGTGA GTTAGGAAGC CTGCTTCTGA AGTTGCCTGG 3600

GCAGCTCATG TGCGGTGACC TTGGGCAAGT CATTAACTTT CCTTCAGGTC TAACTGGTTC 3660

TGCATACACA ATGAGGATGG TAATAACGCC CAATTCCCAT CACTATCGTG GGATGGATCA 3720

GACTATTTAA AAGGATTTAC AATCTGCTTG GGTAAAAGCT TTACATAAAT ATGAGGCATT 3780

ATCATGTCGC TTGGTACATC TCCAATTATG AAGGAAGGGT AATGACCCTC CACAGCAATG 3840

CAGGACTCCT GGTTTGGAGG GAGGGAAAGT TTGAGAAGGA CAGGAAGCTT GTTGCCCCAG 3900

CACTGATGTT TCTACTGAGG TACCAGAAAA TGTCATGTGG TCATACAGAA TTCATTTATT 3960

CATTCAACAA ACATCTGTCA ATTGTTACAC TGTCCTGAGA ATTTGGAAAA ATGATGAAAG 4020

ACTCAGTCCT GCCTTAGGAG GTCACTGGCA CATTGGCCCG GGCCCCTGTT TTGGGCCTTT 4080

TACTCTGACC TGTGCTGATT TGCAAATAGT GGGAAATTTT ATCTCAAGTC TATGAAATCT 4140

GGCATGCATT TTCACGGTTT GATTGCCAGG TACATTCGAT GGCAATGAGT CTTATAATGT 4200

TTGGTTACCT TCATTTACCT AAGAACTGTG GTTGTTGCTG TGGTTGTTGT TTTTGTTGTT 4260

TTTGAGACGG AGTCTTGCTC TGTCATCCAG GCTGGAGTGC AGTGGCATGA TCTCCGGTCA 4320

CTGCAAACTC CACCTCCCAG GTTCAAGCGA TTCTCATGCC TCAGCCCCCT CAGTAGCTGG 4380

ATTACAGGCG CGCACCACCA TGCCCGGCTA ATTTTTGTAT TTTTTGTTCG GGACACAGAT 4440

TTCACATGTT GGCCAGGCTG GTCTCGAACT CCTGATCTCT GGTGATCCGC CTGCCTCGGC 4500

CTCCCAAAGT GCTGTGATTA CAGGCGTGAG CCACTGTGCC CAGCCAGAAC TGTGGTTTTA 4560

ATGACAATGC TAAAAAGTGG TATATGTCAC AGTGTCGGGT GGGGCTAAGA GGCACATTGC 4620

TGCAGTGATC CATCATTCAT TTCCCACCAT TCTCGCCTGG ATTAGCGCAG CAGCTCCCAG 4680

AGAGGCACCT CACTTTGACC TTCTTCCTCA AAGACATTCT CTGTGACCTG CCTGGCCCTT 4740

ATTACCTCTC TAGCTTTGCC ACTTCCCTAT GTCTCCATCT CCCCTCTCAC ACGTAGTAGA 4800

AAGAGACTCT ACCTCATGGA GTAAGGAGAG GCTTCACAGA GGCAGGATTG CTATTAGTCT 4860

TCAAAGATGA GGTATTTGCT AAATGAATGA GACAAAGGGA TTGGGGCCAC ATTACAGGGA 4920

AATTGAGGTA TGTAATAGCC TGGTGCAGGT TAAGAGTGTG GACTCTGAAA CCAGACTCAG 4980

CCTGGAATTG AATCCTGGCT GTGTGATGTT GGGCCAGTGA CTTAACCTCT CTGTGCTTTT 5040

ATTCACTCTT CTATAAAATG GGGATTATAA TAAACCTACC TTATAAGGTT ATTATAACAG 5100

TCAGTAAATA TAAAAATAGA AGTTTTTGGA TGATGACTAT CACATCAGTA AACACTTGTT 5160

TGCCATTATT TTTATTACTT GACTAAAAAT ATACCAAAAA GACCATCCAA GAAAACCCTT 5220

TAAGCTGCTA GTGCAGAAAG ATTCCCCTTG TGTTTGTGTG CTGGGGGGTC AGTGGTGCCT 5280

GTGGCCCACT GGAGAGGAGA CAGCTATGGC TGGAGTGATT CTCAAACTTC AGAATGTCTA 5340

AAATCATCAC ATGGACAACT TATTAAGGAA AGCAAATGCC TGGGCTCCAT CCTCAGAGAG 5400

TCTCATTCAC TGGGTCAGGA TAGAGCCCAG GAATCTTTAC CTTAAAGAAC CATCCCACCT 5460

CCCACCTCAT ATGATCCTTA TGCAGGTGAT CTGGGGCCCA CACTTTGAGA AATAGACTCA 5520

GGTCAAAGTG GCTCTAACTG CATCTCATTT CTTACCTGGC ATATCTAATA GTAGAGAAGA 5580

AGACAATGCT AAGATTTTTG TTGGAGATCT TTTGCTGGGA TTGCTGCTTC ATTCATTCAC 5640

TCATTTATTT ATTTATTTAT TTATTTTGAA ACAGAGTCTC ACTTTGTCAC CCAGGCTGGA 5700

GGGCAGTGGC ACAATCTGAG CTCACTGCAG CCTCAGGCTC CTGGGTTCAA TCGATTCTCT 5760

TGCCTCAGCC TCCCGAGTAG CTGGGATTAC AGTCATGCAC CACCACGCCC AACTAATTCT 5820

TGTATTTTTA GTAGTGACAG CGTTTCACCA TGTTAGCTAG ACTGGTCTCG AACTCCTGAC 5880

ATCAGGTAAT CTGCCTGCCT CGGCCTCTCA AAATTAGTAG CTGCAATTAC ACGTGTGAGC 5940

TGCCGTGCCT GGCCTGCTGT TTCTTTTAGT TGGGCCTCTT CTGTAATAGA GTGTGAGAAT 6000

TCTGACTTGC TGCAACAGTC TGCTTTGAAG CAGGGCTGTG TTTACACTGG TCAGATGTGG 6060

AATTGTGGGG CACACTTAGC AGCTTCCTTC TCTAATTTTT CTGTATTTTC AGGAGAACAA 6120

TTTTAAAAAA TTTAATAAAA ATGCCTTAAA AATTAACATT ATTATAAGAT GAATCCCATT 6180

TTTCTAATCT TGTAAATTAA AAACAATCAT AAGCATATGA GCACCTGCAC TTAGGGAATC 6240

AAGGTTGGCA AAGCTAAACA CTTCCAGCTC TAGGTGATTC GCGGCAATAC AAATGGAGCT 6300

GGACTTTGGC CACAGTGCAA AAATATTGAT CTGTTGTTAG ATGCTCTGAA GTTTCCAGAA 6360

AGAATTGGTT CTGCCTGCTG TGCTTCAGTG CTTAAGGGAA GTGGTTCCTC AAAATGTTAG 6420

TTTTTAAGCC CAGCTTTCTT AAATAGGAAG ATTCTAATAG TAGCAAAAAT ATAAACTGCT 6480

TCTAGGTTTA AAAAGGACCC AGCACACAAT GGTTATCACA CACCTTTCTC CTCAGGTGAT 6540

GAGTGGATGA GTGGCCTGGT GTATTTCATA ACATCTCCCA GGTCCAAATG CTAAAGCAAT 6600

TGCTGAAAAG ATACCATGTG TACCGGAACC TTGCAGAGGT ATTTTGTTGG CATAAAAAGA 6660

AATATTGATC ATCTATAGTA AAAATGGTTC TACTTTAATA CTACTGAGAA AAGATTTTCT 6720

TTTCCCAGAT CTACATCCTG AATCTTCATG AAGACAAGAT CCCCTAAACT TCCACTAACA 6780

CCATAATGTG TGCTGTCCTT TGTAATGTAG TCCACAGATC TCATAAACTG TCAGAAATAG 6840

CAGAGATTGT AAGGTCATCC ACTTCCCCTG TAAGGCCTGC GTCCCTCACT TACATCCCTA 6900

ATAACGTCCT CTAACCTCTG CTGGAGGGCA GATTTAGCTG CCAGCTGGGA AGAGCTCTGC 6960

CCTAGTCAAC ATTTTTATCT GTGGCTTTCA GATGAGAACA CTGGATGCTT ATCTGAAAAA 7020

AGCTCCTCAG GCTGGAGGGA GGGATTGGCT CTAACAAGAT GCAATGTGAT AAGAATAAAA 7080

GCGAAGCCAA ACTCTAGGCC CAAAGGCTCT AGCAACACAC TTTTGAGAAC CTTGGAGACG 7140

AGTTTTGGCT GATGCGAGCT TCTCCGCCTG CTAAAGTAGC CCATTCCATT TGGACGGCTC 7200

TAGAGGCTGG CATGTTCTTC TCCACGTTGT GTTAATGTAC TCCAGTTTCT TCCTGCCATG 7260

AACTGGCATG CCCTGGCTCC TCCTACCTTC CCCACTTTAA GTCTTCCCTC CCTCCTTCTG 7320

ACCTTCCCAT TCCAGCCACA CTGGCCTTTT GTCTGGTCCT AACAAACCAT GCCTTTCCTG 7380

CCTCCAAGCC CTACACCTGC TATCCATCCC TCTGTCTGAG AGACACTCCC ACCCCTTCAC 7440

AAAGCCTGTT TCTCATCCTT CCAGTTCAGA TGTCTTCTCA GCTTGCCTCA ACTGACCTCT 7500

TTCAGCTATT CTCACTCTTT GTACTCTGTT CATTTCCTTC CTGGCAGTCA CCATAATTTA 7560

TCTTTATTTG AATCAATTTC TTAGTTGTAT TATTTAGTTA TTTGCACACT CTGTCTCTCT 7620

GTGCCTTTCT TATTCACTGC AGGCTTTCTT ATGTAAGTAA TTTATTTACT TAAATTTTTA 7680

AAAATAATTT CAACTTTTGG CCGGGCACAG TGGCTCACGC CTGTAATCCC AGCACTTTGG 7740

GAGGCCGAGG TGGGTAGATC AGCTGAGGTC AGGAGTTCGA GACCAGCCTG GCCAACATGG 7800

TGAAATCCCA TCTCTATTTA AAATACAAAA ACTAGCCGGG CGTGGTGGTA TGCACCTGTA 7860

ATCCCAGCTA CTCGGGAGGT TGAGGGAGGA GAATCACTTG AACCGGGGAG GTGGAGGTTG 7920

CAGTGAGCTG AGATCACGCC ATTGCACTCC AGCCTGGGGC ACGAGAGTGA GACTTCATCT 7980

CAAAAAAACA AAAAACAAAA AACCCCTGCT TTTCAGAGGG GCTGAACTAA TTTACATTCT 8040

CACCAATAGT GTATAAGCAT TCCCCTTTCT CTACAGCCTC ACTAGCATTT ACTTTTTTAA 8100

AAAACTTTTT AATAATAGCC ATTCTGACTG GTATGAGATG GTATCTCCTT GTGGTTTTCA 8160

CTTGCAATTC TCTGATGATT AGTGATATTG AGCATTGTTT TATGTTTGTT GGCTGTTCGT 8220

ATGTCTTCTT TTGAGAAGTG TCTTTTCATA TATTCTGCCC ATTTTTTGAA TGGAGTTGTT 8280

TTGTGCTTGT TGAATTAAGT TCCTTATAGA TTCTAGATAT TAGACTTTTG TTGGATGCAT 8340

AGTTTGTGAA TATTTTCTCC CATCCTATAG TTCTGTTTAC TCTGTTGATA GTTCCTGTTT 8400

TGTTATGTTT TGTTTTTTTG CTGTACAGAA GCTGTTTAAT CTAATTGGTC CCACTTGTCA 8460

ATTTTTGTTT TTGTTGCAAT GGCTTTTGAA TTTTAATAAT AAATTCTTTC CTAAGGCTGA 8520

TGCCCAGAAC AGCATTTTCT AGGTTTTCTT CTAGGATTCT TATAGTTCAA AGTCTTATAT 8580

TTAAGCTTTT AATCCACCTC AAGTTAATTT TTATATATAG TGAAATGCAG GGGTCCTGTT 8640

TCATTCTTTT GCATGTGGCC AGCCAGCAAT CCCAGAACCA TTTATTGAAT AAGGAATCTT 8700

TTCCTCATTG CTTATTTTGT CAACTTTGTC AAAGATCGGA TGACTGTAGG AGTGTGGCTT 8760

TTTCTGGGTT ATCTACTCTG TTACATTGGT CTATGTGTCT GTTTTTGTAT CAGTATCATG 8820

CTGTTTTTGT TACTATGGTC TCATAACATA GTTTAAAGTT GGATAATGTT ATGCCTCTGC 8880

TTTGCTGTTT TTGCTTAAGA TTGCTTTGGC TATTGAGGCT CTTTTTTCAC TTCATATGAA 8940

TTTTAGAATA GTTTTTTCTA ATTCTTTGAA AAATGACCTT GGCAGTTTGA TAGGAATAGC 9000

ATTGAATCTA TAGATTGCTT TGGGCAGTAT GCTATTTTAA TGATATTGAT TCTTCCTATC 9060

CATGAGCATG GAATATTTTT CCATTTGTTT GTGTCATCTA CTATTTCCTT TAGCAATGTT 9120

TTTTAGTTTT CCTTGTAGAG ATCCTCCTAG GTATTTCATT TTTTATGTGA CTATTTTAAA 9180

TGGGATTGCA TTCTTCATGT GGCTCTCAGC TTGAATGTTA TTGGTGTATA GAAATGCTAC 9240

AGAGTTTTGT ACACTGATTC TGTATCCTGA AACCTTACTG AAGTCATTTA TCAGTTCTAG 9300

GAGCCTTTGG CAAAGTCTGT AGTGTTTTCT AGGTATAGAA TCATATCATT AGCAAAGAAA 9360

GATAGTTTGA CTTCTTCTTT TCCTATTTGA ATGCCTTTTA TTTCTTTCCC TTGTCTGATT 9420

GCTCTTCCAG TACTACGTTG AATAGGAGTG CTGAGAGTGA GCATCCTTGT CTTGTTCCAC 9480

CTCTCAGGGG AAATGGTTCC AGCTTTTGCC CATTCAATAT GATGTTGGCC ATGGGTTTGT 9540

CACAGATGGC TCTTATTATT TTGAGGTGTA TTCCTTTGAT GCCTAGTTTG TCAAAGGCCT 9600

TTATCATGAA GGGATGTTGG ATTTTATTGA AAGCTTTTTC TGGGTCTTAT TTGGTGAATT 9660

GCATTTATTG AATTGTGCAT GTTGAGCCAA ACTTCCATCC CAGGGATTAA ACCTACTTAA 9720

TCATGGTGTT AACTTTTTGA TGTGCTGCTG GATTTGGTTT GCTAATTTTT TTTTTTTTTT 9780

TAAGATGGAG TCTCGCTCTG TCGCGCAGGC TGGAGTGCAG TGGTGTGATC TTGGCTCACT 9840

GCAAGCTCCA CCTCCCGAGT TCATGCCATT CTCCTGCCTC AGCCTCCCGA GTAGCTGGGA 9900

CTACAGGCAC CCGCTACCAT ACCCAGCTAA TTTTTGTATT TTTTAGTAGA GACAGGATTT 9960

CACCATGTTA GCCAGGATGG TCTTGATCTC CTGACCTCGT GATCTGCCTG CCTCAGCCTC 10020

CCAAAGTGGC TAGTATTTTT TTAATTACTA TTTTTTCTCA CCCTTGCTGC CATCTTATGA 10080

TTTTCTAGTA TTTTGTTGAA GATTTTTGCA TCTATTTTCA TCAGGGATAT TGGCCTGTAA 10140

TTTTCTTTTT TCATTTCATC TTTACCACAT TTTTGTATCA GGTTCATACT GGCTTCATAG 10200

AATGAGTTCA GGAATGGTCC CTCCTCCTCG AATTTTCTCT GTAGAATTAG TACCAGCTCT 10260

TTGTGTGTCT GGGAGAAGTT GTATGCCAAT AATTTAAATG CAGTTAATAT TTACTGGACA 10320

ATTTCCTCCA GATAATTGTA TATGATTTTT GGTCCACCCT GAGTTGATAC ATGTATTTTA 10380

ATTGTATCAT GGTATGAAAA GAGCAAGAGT TATTTGGTCA CCTAGTCTTG CCTATAGATG 10440

TTGCCTAATG ATTCAAAGTA GATATTTTGG GAGCCTTAAC AGGTGCCGTG GACTAGGCAG 10500

TTTTGTTTTT TTTTTTTTTT GAGGGACAGA GTCTCGTTAT GCTGCGCAGG GCTGGAGTGC 10560

AGGGGCATGA TGTAGGATCA ATGCAACATC CGCCTCGTGG GTTCAGAGCA ATTATACTGC 10620

ATCAGCCTCC CCAGTAGCTG GGACTACAGG CTCACGCCAC CACGCCTGGC TAATTTTTGT 10680

ATTTTTAGTA GAGATGGGGT TTCACCATAT TGGCCAGGCT GGTGTTGAAC TCGTGGCCTC 10740

ATGATCCACC CGCCTCGGCT CCCAATGTGC TGGGCTTACA GGCGTGAGCC ACCGCACCCG 10800

GAGATTAGGC AATTTTATAT TCCCAAATAT CCAACTCTTC TGACCCGCTT TCTCAGCCTG 10860

GGTGTATCAG GCACAAGGCC TGTTCAGATT ATGTGGTCTC TGAAGATATG GCTCTCCAGG 10920

GTTGACAATG TGGATAAGGA TTCACCTGGT TTAGGATTTA CACATTCGCC TTGAATGTCT 10980

GTTGCACCAA GTAGACAGTC CATCCCAACT TGGCCATTTG GTCAGAGCTG TAAGGAGACA 11040

AGGAGGTGGG CAGCCGCTGC TGTGAACTGC TTGGACAAAG ACTGCCAAAT AGCTATCAGA 11100

CAGTGTTAAC AACAGCTGAT TTAGGTTTGA AGGGGGCAGT CTCTTGGGCC ACTTACTATG 11160

CTGCATCATC CTCTTTGGAA AATGCTCTTC AGGTAACTGC CTAACAGACT GAGAAAATAA 11220

AATGCTCACA GAGAAAAAAG ACCCGGAAAG TCTGACTTCT CAGAGCTCAG TGTTTAGGTG 11280

CAGAACTGGA TTGTGAAAGG ATTTTTAAAT TTTTTATATT CATTGCAGGG AACATTCATT 11340

TATTCCATCC TTCTCCACTC CCACCTGTCT GTCGTTGTCT TTGTCTCTGT CTCCCCACCT 11400

CTCTCTCTAG ACACACACAC GCACACACAC ACACACACAC ACACACACAC ACACACACAC 11460

ACACACACAC ACACACACAC ACACACACAC CCCTATTCAT TGCCAACAGT AATAGAGTTG 11520

CTTCTTTACT TCTTGGAGAG AAAAGCCTCA ATCTGAGGAA GCTGTGCTGA CTAGCCTTGC 11580

TCTTAATCAT GGAGACAATG CTTTATGCCT TTATCTTTGC ACAGCTGAAA GCCATGGCAG 11640

AAGCAGTCCT CTAAACGAAA TAAAATAGAA AGGTTCCTGC TAAGCCCTGG CAAATGCAGC 11700

CTTCTATCCC TCCCCCAACA CTCACAGCTT CTGAGCAAGA TGTTGCTGCC TTCCAGGAGC 11760

TGGGTGATGG GCAATAATGA GCAGAGCCAC GTGAAGGAAA GATGGGTGAA GAAATGTGTG 11820

TGGAGTCATG CTGGCTGCAC TGACCATGAA ACAAAGGATC TACCCCTCTA GTAACTGCCC 11880

TACTCCTTTG GTAACTGTTC TGAAATTATA ACTTGCCAGA AGTTCAGAAG GACCTAGTGC 11940

AGGTATTAGA GGAAATTCGT AAGATTGAGC CATTTATTCC TGCACAGATA CATAATAATG 12000

GACACGGGCC ATGGTGGCCA GCATTCTTGC TCTTGACAAT GGTGAAGGGA AGGGTTGTAG 12060

GTCATGGCTA TGCTCTCAGA ATTATAATGG AAAGAAACAG CTCCTGAGTG TTTACTATGA 12120

GCCAAGGGCT GTGCTAAACA CTTTACCATA TGATGACATC TTTTTCTCAC AGGTATCAAA 12180

AAACAATAGG ACATACCGGA TAGCTACAAT CTTTGGGCCC CTGCAAACAC AATAATGTGT 12240

ATTCTCTTCT TCAAATCCTA CATATTGCTA CAAACTGTAT CCCTGAGGCA TATTCATTGT 12300

AAAATAAAAA CATATAAAGT ACTACTTTTG TTTTTTGAGA TGGAGTCTCG CTCTGTCACC 12360

CAGACTGGAG TGCAATAGCA TGATCGTGGC TCACTGCAAC CCCCTGCTCC TGGGCTCAAG 12420

TGATTCTCCT GACTCAGCCT CTCAAGTAGC TGGGATTACA GGCGCACGCC CCCATGCCTG 12480

GCTAATTTTT GTACTTTTAA TAGAGACCAG GTTTCACCAT GTTGGCCAGG CTGGTCTCAA 12540

ACTCCTGACC TCAAGTGATC CACCTGCCTC GGCCTTCCAA AGTGCTGGCA TTACAGCTGT 12600

GAGCCACTGC ACCCGGCCCA TATAAAGTAC TACTAATGTA ACAGGGTGCT AGTCCAGACA 12660

GTGACCACAC GTGGTGTTCA TTGAAGGCTG GACTAACAAC TCCAGCCTCT CCGCCATCAC 12720

AGAGTGATGA CTGCCTTCCC TGAAGCAAAG CTTCTGGTTC AAGGAAAGGC CAGTAAGTGA 12780

CTGCTCTTTG TTGTATACAT GTTAGATGAT CAGGCCTCAA GAAAAGTATA AAGAGATCTT 12840

TGTGCTCTCT GGGACTCAAA AAGCTGCACT CTTTGGGGGA AGGATAGCCA GGTAAAAGTG 12900

GCCCAGGTAA AGAGGGCCTG GTACACCTGG TTCTGCAAGA TGGTAGACAC AAAAATGAGA 12960

GCTACATTTG GAGCTTATGT GCCCCTAACT CTGTACATAA CCTGCAAGAT CTAATTACTA 13020

ACAACTGGAA TCTTGGAAAC ACCTGTAGTA CATCCTTGGC TAAGGTTAGC CCCAACAGAG 13080

AGGGCTCTCC TCTTACAGAG AACCATTACA TTTGTGCCTT CATCCTAGAG TAGAAAAGGC 13140

ATGATCAGAC TACTAAAAAG ACATCAGGAA AGGGCCTGTG ACATCTGAGG GAAGTGGTTG 13200

CCCTCTCTGG GATGTTGGTT CGGGAAGAGG GGCATGGAGG AGTGCCTGCT TTAGATGGTC 13260

ATTCAGGAAC CCAGGCTGAT AGTGAGAGGT GAAGCCAGTT GGGCTTCTGG GCTAGGGGGG 13320

ACTTGGAGAA CTTTTGTGTC TAGCTAAAGG ATTGTAAATG CACCAATCAG CACTCTGTAA 13380

AATGGACCAA TCAGCAGGAT GTGGGCAGGG CCAAATAAGG GAATAAAAGC TGGCCACCAG 13440

AGCCAGCAGT GGCAAACTGC TCAGGTCCCC TTCCACGCTG TGGAAGCTTT GTTCTTTTGC 13500

TCTTCACAAT AAATCTTGCT GCTGCTCACT CTTTGGGTCT GCACTATCTT TATGAGCTGT 13560

AACACTCACC GTGAGGGTCT GTGGCTTCAT TCCTGAAGTC AGTGAGACCA CAAACCCACT 13620

GGGAGGAACA AACAACTCTG GACACGCCAA CTTTAAGAGC TGTAACATTC ACTGCGAAGG 13680

TCTGCGGCTT CACCTCTGAA GTCAGCGAGA CTATGAACCC ACTGGAAGGA AGAAACTCCA 13740

GACACATCTG AACATCTGAA GGAAGAAACT CCAGACACAC CATCTTTAAG AGCTGTAACA 13800

CTCACTGCAA GGGTCTGCGG CTTCATTCTT GAAGTCAGCA AGACCAAGAA CCCACTGGAA 13860

GGAAACAATT CCGGACACAT TTTGGTGACC CAGATGGGAC TATCACCAAG TGGTGAGTAC 13920

CATCAACCCC TTTCACTTGT TATTCTGTCC TATTTTTCCT TAGAATTCGG GGGCTAAATA 13980

TTGGGCACCT GTCAGCCAGT TAAAAGCGAC TAGCATGGCT GCCAGACTTA AGAAACTAAA 14040

GACACGGGTG TCAGACTTTC TGGGAAAGGG CTCTCTAATA ACCCCCAACT CTTTGGAGTT 14100

GGGAGCGTTG GTTTGCCTGG AACCAGCTTC CACATTTCCT GTACTTCTGG GCTGAGACGA 14160

GGGTCAACAT AGAGGAAAGC CATTCAGCTC TGGGGTCCCG ACAGCAAGTT GGTTGACCCT 14220

GTGGCCATGA TCACAACTCT CGAAGTCATG TTGCCCAAGC GAGACTCACC CATCTATCCT 14280

ATCTATCCTG ACTCTTGCTT CCTGGGTCCT AATGCCTGGA AGACAAAACT TCCTCTTGTC 14340

TCTGTTCTCC AAGGCTAGTC CCACTTCTAA AAACCACTCC CTGTCTCTGG TGCTTTTCTA 14400

GTTTCTCCTA TAAGAATGAT TTCTAGTATA AACTCCAGGA CTCTATTCTC TTCTTTAGGC 14460

ACCCGGGCTC ACCAATCAGA AAGCCATAAT TTTTGCCCAA AGCCCCATCT TAGGGGGGAC 14520

TATCTGGAAT TTTAGGATCC CTCCTCAGAC AAGCAGGCCT AACAAAAGCT ATTCCTGAAG 14580

CTAGGATATG GGGAGCCTCA GAAATGATAT CCTTCCTATT CAAGTGAGGA CAAAAGGCAT 14640

CACTCTTCCA ATTCTGGAGA TCCCTTCCCT CCCTCAGGGT ATGGCCCTCC ACTTCACTTT 14700

TGGGGCATAA CGTCTTTATA GGACACGGGT AAAGTCCCAA TACTAACAGG AGAATGTTTA 14760

GGACTCTAAC AGGTTTTCAA GAATGTGTCG GTAAGGGCCA CTAAATCCGA TTTTTCTCGG 14820

TCCTCTTTGT GGTCTAGGAG GACAGGTAAG GGTGCAGGTT TTCAATAATG TGTTGGTAAG 14880

GGCCACTAAA TCTGACATTC CTTGGTCCTC CTTGTGGTCT AGGAGGAAAA CTAGTGTTTC 14940

TGCTGCTGCA TCAGTGAGCG CAACTATTCC AATCAACAGG GTCCAGGGAC CATTGTGGGT 15000

TCTTGGGCAA GAGGTGTTTC TGCTGCTGCA TTGGTGGGCT CAACTATTCC AATCAGCAGG 15060

GTCCAGTGAC CTTTGCGGGT TCTTGGGTCG GGGGGTGGGG GGAACAAACA GACCAAAACT 15120

GGGGGCAGTT TTGTCTTTCA GATGGGAAAC ACTCAGGCAC CAACAGGCTC ACCCTTGAAA 15180

TGTATCCTAA GCCATTGGGA CTAATTTGAC CCGCAAACCC TGAAAAAGAG TGGCTCATTT 15240

TATTCTGCAC TATGGCCTGG TCCCAATATT CTCTCTCTGA TGGGGAAAAA TGGCCACCTG 15300

AAGGAAGTAT AAATTACAAT ACTATCCTGC AGCTTGACCT TTTCTGTAAG AAGGAAAGCA 15360

AATGGAGTGA AATACCTTAT GTCCAAACTT TCTTTTCATT AAAGGAAAAT CCACAACTAT 15420

GCAAAACTTA CAATTCACAT CCCACAAGAA GAACTCTCAC TTACCCCCAT ATCCTAGCTT 15480

CCCTATAGCT CCCCTTCCTA TTAATGATAA GCCTCCTCTA TCTCCCCACC CAGAAGGAAA 15540

CAAGCAAAGA AATCTCCAAA GGACCACAAA AACCCCTGGG CTATCGGTTA TGTCCCCTTC 15600

AAGCTGTAGC GGGGGAGGGG AATTTGGCCC AACCCAGGTA CATGTCCCCT TCTCCCTCTC 15660

TGATTTAAAG CAGATCAAGG CAGACCAGGG GAAGCTTTCA GATGATCCTG ATAGGTATAC 15720

AGATGTCCTA CAGGGTCTAG GGCAAACCTT CAATCTCACT TGGAGAGATG TCATGCTATT 15780

GTTAGATCAA ACCCTGGCCT TTAATTTAAA GAATGTGGCT TTAGCCACAG CCCGAGAGTT 15840

TGGAGATACC TGGTATCTTA GTCAAGTAAA TGATAGAATG ACAGCTGGGG AAAGGGACAA 15900

AGTCTCTCCC GGTCAGCAAG CCATCCCTAG TGTGGATCCC CACTGGGACC TAGACTCAGA 15960

TCATTGGGAC TGGAGTCGCA AACATCTGTT GACCTGTGTT CTAGAAAGAC TAAGGAGAAT 16020

TAGGAAAGAG CCTATGAATT ATTCAATGAT GTCCACCATA ACTCAGGAAA AGGAAGAAAG 16080

TCTTGCCTTC CTTGAGTGGC TACAGGAGCC TTAAGAAAAT ACACTCCCCT GTCACCCAAC 16140

TCACTCAAGG GTTAATTGAT TCTAAAAGAT ATGTTTATTA CTCAATCAGC TGCAGATATC 16200

AGGAGAAAGC TCCCAAAAGC AAGCCCTTGG CCCTGAACAA AATTTGGAGG CATTATTAAA 16260

CCTGGCAACC TTGGTGTTCT ATAATAGGGG CCAAGAGGAG CAGGCCAAAA TGGAAAAGCG 16320

AGATAAGAGA AAGGCCACAG CCTTAGTCAT GGCCCTCAGA CAAACAAACC TTGGTGGTTC 16380

AGAGAGGACA GAAAATGGAG CAGGCCAATC ACCCAGTAGG GCTTGTTGTC AGTGTGGTTT 16440

GCAAGGACAG TTTAAAAAAG ATTGTCCTAT GAGAAACAAG CTGCCCCCTC ACCCATGTCC 16500

ACTATCGCTG AAGCAATCAC TGGAAGCCAC ACTGCCCCAA AGGACAAAGA TTATCTGGGC 16560

CAGAAGCCCC CAAGCAGATG ATCCAACCAC AGGACTGAGG TGCTCAGGGT TAGCGCCAGC 16620

TCATGTCATC ACCTCACTGA GCCCTGGGTA CATTTAACCA TTGAGGGCCA GGAAATTGAC 16680

TTCTACTGGA CACTGGTGCG GCTTTCTCAG TGTTAACCTC CTGTCCTGGA CAGCTGTCCT 16740

CAAGGTCTGT TACCATCCGA GGAATCCTGG GACAGCCTAT ATCCAGGTAT TTCTCCCACC 16800

TCCTCAGTTG TAACTGGGAG ACTTTGCTAC AGATAGTAAG TATGCTTACC TAATCCTACA 16860

TGCCCATGCT GCGATATGGA AAGAAAGGGA ATTCCTAACT TCTGGGTGAA CCCCCATTAA 16920

ATATCACAAG GAAACTATGG AGTTATTGCA CACAGTGCAA AAACCCAAGG AGGTGGCGGT 16980

CTTACATTGC CGAAGCCATC AAAAGGGGAA GGAGAGGGGA GAACTGCAGC ATAAGTGGCT 17040

GGCAGAGGCA GGGAAAGACA AGCAGAAAGG AAAGAGAGAA AGAGCAGAAA GTGAGAGAGA 17100

AAGAGAGATA GGAAGTGATA GCAAAGAGGG AGTCAGAAAG AAAAGAGAGA GGAGAGAGAG 17160

AGGGGGAAAG ACAGAGAGAG ACAGAGGAAG AGACAGAGAG ACAGAAAGAG AGAAGCAAAG 17220

AGAGGAAGAG ACAAAGAAGG AGTCAAAGAG AGGGAAAGAG AAGTAGTAAA GAAAAAACAG 17280

TGTACCCTAT TCCTTTAAAA GCCAGGTTAA ATTTAAAACC TATAATTGAT AATTGAAGGC 17340

CTTTTCTGTT AACCCTATAA TACTCCCAAT ACCACCTTGT TGTTCAGTGT TAAACAAGGG 17400

TTATTAGCCC AAAAGCCACT GAGGCCACTG ACAACCCGTA GCCTTCTTAT CCAAAATCCT 17460

TAACACAGCA GGTTTCCTAA CAGGGATCTA ATCTTAGGTC GACCAGACTG GAGAACTGCC 17520

TTCAGGACAG GATGATAGAT GGTTCCTCCC AGGTGATTAA GGAAAAAGAC ACAATGGGTA 17580

TTCAGTAAGT GATAAGGAAA CTCTTATAGA AGCAGAGTTA GGAAAATTGC GAAATAAGTG 17640

GTCTGCTCAA ACGTTGAAGC TGTTTGCTGT TTGCACTCAG CTAAACCTTA AAGTACTTAC 17700

AGAATCAGGA AGGAGCCATC TATACCAATT CTAAGTTAAT ATGGACTGAA CGAGGTTTTA 17760

TTAATAGCAA AGAAAATTAA AATCTCAAAC TTACGAGGTT TTCAAGTAAA GTAAAGTTTG 17820

GTAAAAGTTA ACAGCGTAAC ATGTATTATC CTAGTACCAC ACATTCTCTC AAAGGATTTG 17880

CTCAGACAGT TTGCAAAAAA GAACGAAATC TGTCCTTACT CTACAATCCC AAATAGACTT 17940

TTGGCAGCAG TGACTCTCCA AAACCGCTGA GGCCTAGACT CTCATGTTGA GAAAGGAAGA 18000

TTCTGCACTT CTTAGGGGTA GAGTGTTGTT TTTATACTAA CCAGTCAGGG ATAGTATGAG 18060

ATACCACCCA GTGTTTACAG GAAAAGGCTT CTGAAATCAG ACAATGCCTT TCAAACTCTT 18120

ATACCAACCT CTGGAGTTGG GCGACATGGC TTCTCCCCTT TCTAGGTCCT GTGACAGCCA 18180

TCTTGCTAAT AGTCGCATTT GGGCCCTGTA TTTTTAACCT CTTGGTCAAA TTTGTTTCCT 18240

CTAGGATCGA GGCCATCAAG CTACAGATGA TCTTACAAAT GTAACCCCAA ATGAGCTCAA 18300

CTAACAACTT CTGCTGAGGA CCCCTGGACC GACCCGCTGG CCCTTTCAAT GGCCTAAAGA 18360

GCTCCCCTCT GGAGGACACT ACCACTGCAG GGCCCCTTCT TCACCCCTAT CCAGCAGGAA 18420

GTAGCTACAG CGGTCATCGC CAAATCCCAA CAGCAGCTGG GGTGTCCTGT TTGGAGGGGG 18480

GATTGAGAGG TGAAGCCAGC TGGGCTTCTG GGTCAGGTGG GGACTTGGAG AACTTTTGTG 18540

TCTAGCTAAA GGATTGTAAA TGCACCAATC AGCACTCTGT GTCTAGCTAA AGGATTGTAA 18600

ATGCACCAAT CAGCACTCTG TAAAATGGAC CAATCAGCAG GATGTGGGCG GGGTCAAATA 18660

AGGGAGTAAA AACTGGCCAC CCGAGCCAGC AGTGGCAACC CACTCGGGTC CCCTTCCACA 18720

CTGTGGAAGC TTTGTTCTTT TGCTCTTCAC AATAAATCTT GCTGCTGCTC ATTCTTTGTG 18780

TCCACACTAC CTTTATGAGC TGTAACACTC ACTGCGAGGG TCTGTGGCTT CATTCCTGAA 18840

GTCAACAGAC CACGAACCCA CTGGAAGGAA CAAAGAACTC CCGATGTGCT GCCTTTAAGA 18900

GCTGTAACAC TCACTGCGAA GCTCTGCAGC TTCACTCCTG AAGTCAGTGA GACCACAAAC 18960

CCACCAGAAG GAAGAAACTC TGGACACACC TGAATATCTG AAGGAACAAA CTCCAGACAC 19020

ACCATCTTTC AGAGCTGTAA CACTCACCGC AAGGGTCTGT GGCTTCATTC TTGAAGTCAG 19080

CAAGACCAAG AACCCACCGG AAGGAACAAA TTCCAGACAC AGTAGGAAAT CTGTATTTTT 19140

GATCTGTGGC TTCCAGGGTT ACTCCAGTCA TTGAAGTCTC CATTGCAGCC TTAAGGAAAC 19200

AGAGAATGGT TTGGAGGAGC ACATGTGGGA ATTGTTATGG ACCAGGCTTG AGATGCACAT 19260

AGGGCATTTC TGATCAAACC TAGCTGGAAG CAGGGCCAGG AAATATAATC TAAGGAAGAC 19320

AGTTTTTGTA GACAGTAGTA GTCTTTGCAT CTGAGACATG TAGATTATCA AGCAATTAAT 19380

TAGAAAAAAT ATAGCCAGGT GCGATGGCTC ATGCCTGTAA TCCCAGCACT TTGGGAGGCC 19440

AAGGGGTGTG GATCACGAGG TCAGGCGTTC GAGACCAGCC TGGCCAACAT GGTGAAACCC 19500

CGTCTCTACT AAAAATACAA AAATTAGCCT GGTGTGGTGG CACGCATCTG TAATCCCAGT 19560

ACTCAGGAGG CTGAGGCAGG GGAATCTCTT GAACTTGGGA GGCAGAGGTT GCAGTGAGCC 19620

AAGATCACAC CACAGCACTC CATCCTGGGT GACAGAGCGA GACTCTGTCT CAAAAAAAAA 19680

AAAAAAAAAA GGAAAGGAAA ATATAATCAA GAATATTGAC AGGTAACATT TATTCAACAC 19740

TTACTATGCA CCAGGCAATA CACTAAGTGT TTTACATGGA TTAACTCATT TAATCTTAAC 19800

AATAGCCCTA TGAAGTCAGT GCTGTTATTA TCTCCACTTT ATAGATAAGG AAACTGAAGT 19860

ACAGAAAGGT CAAGTAGAGA AATGGCCATG CTTGCATTCT CAGTTTTTGA AGCAACTGTT 19920

ACAGGAATCT GGTGTGAGAA ATGCTCTAAC AAGATGTGAG TCAGGGGTTG GGAGGTACTG 19980

AGTCTGAGTT GGGCAGTTGG GGATGGAAGG ATGGATGAAG AACAGCTTGA CAGAGAAGCT 20040

GACACTTGGC AACTCTGTGG GACCTTGAAG GGTTAGAGGG ACTTCACCAA AGAAACTGGT 20100

GGTCAGGGAT ACGGGAGGGT CACGGCAAGG AGGGAAAGGA AACTGTACCA CAGCAGAGAG 20160

TCTGAAGCTA CTACAGTGTA GTTCAGCGTA TAAAGAATAA TTATTTTAAG GTAAACTTAT 20220

AACCTCATGC AAATATAAAA TGAACACGTG TCAAAGATCT TATTTAATTT ATTAATTAAT 20280

GAGGGAACCT GTAAGATGTT ACAGCCAGTT CAAAGGATAA TTCAAATAAA TCCATGCACA 20340

TATGTAGGCA ATAAGGAATG CTGAAATGAA TTTAAAAGTA GATGTAAACT GATTTATCCA 20400

CAGAGAAATA ATCAGTTGCA TTTCACATAA CAAAATTCAG TTGCTTTTCT ACAGAAGGAA 20460

TTGTTTGCAT CATTACCAAT TTTTCTACAA CTAACAGAAT TATAAAATAA CTCAAACACA 20520

ATGAAAGGCA GATATAACCC ACAATGGTAT GATAGATACA ATATCCACAT CCAGGATGTT 20580

TTTTTCTCAT TTCAAAGTCT TTCACAAGTT TTCCTGATAA GGGAGTGTCA ATAATACTGT 20640

ATGGCAGGCA ATAAGACTGG ATGGATGGTT GGGGCCAGGT TTTAAGGGGT AATAAATGCC 20700

ATGTAAAGGT ATGTGCATAC TGTGCAACAT GTCGGGGAAT CTCAAATTAT TGGTAGAGTA 20760

TGTAAGAAAC ACTTGTGGAG CTTGTTAATA AATTCAAATT CCCAGACCCA ACTCCTCAAG 20820

GGTCTAATAC AGTAGGTTTG GAGTAAAGCC TGAAAATCTG CAATTGTGCA AAAAAAAAAA 20880

CCCAGGTGAT TCTGATACAC TTTGAGAAGC ACTGGTGGAA CTAATAGTCA CTGAACGTTT 20940

TTGAGCAGGG GAGAAACCTG AGGACGTCTA TGTTGCAGCA GTGGAAACTT GATTAGAAGT 21000

AGGAGAAGAT GCATGGTCTT AAAAGAATGC AAAATGATGG CTAATATTTG AGTGCTTATG 21060

ATGGGCCAGG GGCTGTGCTA GGCGCGTGGC ACACATTCAA TACGATGGAA GCCTGTACCA 21120

GTCAGTATTA GTGGGGTATC TTTAAGAGTG ACCAGAATTA AGGGGGGTTT TCACCAAAGC 21180

CTGAGGACTG AGCCTCCTCA TCCTAAATTC AGACACAATG CTGTACCTAT GCATTTGCCT 21240

CCAGGCTGTT CCTGGGCCTC CAGGGACTGG CCCAGGCTCC TGATAAATAG GGACTCCCAA 21300

CAACATAAAG CCTGGATTTT GGAACTTCCT GAATGTTACT CAGGCTTTCT AGTAACTGTG 21360

GAGATCTGAA TAATAACACA ATTCTAAGTT CCCCTACTCA TAAAGCTGCT CATCATTTAG 21420

ATGGGGTAAA GCACCTGAAA TACAATGAGC ATCACTATTT TCATTCATCC ATGAAATGAA 21480

CATTCCGGGG AGATCAGTAA GTTGATGTAT CACCCTTGAA CAGGGCAAAA TGAATACTCA 21540

CCAGGAATAT GTGGTATTTT AAAAAGAAGG CAAAGGGAAG AATAGTGGGG ATGGGGCAAA 21600

AACTTTAAAT AGATTCCCCC AATCATATAT GGCAATTGAA GATAATTAAA TTATCATTTT 21660

AATTGAGTAA GTACTCATAG AGCCCTCACT ATTTGAAAAT GAACTGCCTC CTAATTGTTA 21720

TTGTGCAAAT GTGATACATT AAACTTAAGC TATTTTAATA AAACATCCAT TTTCGGAAGC 21780

TGTAGTAGGT TCTCCCAGGT CAGATTTGAT AAGCCATAAA GAACAAATGC CAACTCCTAT 21840

TTTTCTATGG TGCTGGGAAA TAAGAGAGAA ATGTGTAATT CAAAGCAATC ATTTAATTTT 21900

ATCCAATAGC TTGATTCTCC TCTCTCTTCT AGCCTTTTAG CTAAGCTGTT ACCAAGTAAC 21960

CACACTAGTT GGCTTGAGTC TTACCACTGT TTCCCTGACC CCACAGTGGA GAGACTGCAT 22020

CTGTTAAAGA GCAGTTATGT AACCATGGCT ATGCTGAGCT GGGATTCCCA AGGCTTAGGT 22080

TCTTTCTGTG AATGACCTTC ACCAAGACAC CTGAGGTCTG TGTGGAACCA CAGGCTTGTC 22140

ATCTCTAAGG CAGAGTTGAT AATTCCATCT GTTTCTTGAG CCCACACTGA GAAAAAGATT 22200

ACATGACTGC AGTTATTTGA ATGCCTCATG GAAAGACGTC TTATAAATAT TATAATTAAT 22260

GTTATCATTA AGTAATGCTT CAATGCAGAT CTTCCAAGTA TAAATATCAG CTGAGTAAGA 22320

AGTCAATCTT CCCTGAAGCA AAATTGAAAT TTGTAAATGC GATTTCTGGG AGCTTATTTT 22380

GTAATACATG ATTCCAGAGT GTCCATAACA CACACAATTG TCTTTTTTCC CCTACATGGG 22440

CTATTTACAA CAAAATTGGA CTTATAATGT TTATTTCCAG GGATGACTAG AACTTTAATA 22500

ACAAACCTTG GGCCAGGCAT AGTGGCTCAT GCCTATAATC ACAGCACTTC GGGAGGCTGA 22560

GGCTGGTTAG ATTACTTGAG GCCAGGAGTT TGAGAACAGC CTGGCCAACA TGGCAAAACC 22620

CTGTCTCTAC TAAAAATACA AAAATTAGCC GGGTGTGGTG GCGCATGCCA GTAATCCCAG 22680

TTACTAGGTA GGCTGAGGTA CGACAATCGC TGGAACCTGG GAGGCGGAGG TTGCAGTGAG 22740

CTGAGATTGC ACTACTGCAC TCCAGCCTGG GTGACAGAGA AAGACTCTGT CTCAAAAAAA 22800

AAAAAAAAAT AATAATAATA ATAATAAACC CTGATGAAAG GTTTCTAAAA TGTTTTCATC 22860

TAATGGTTTT CTTGACAATT AAATTTTCTA TATAATGTCA GTTCATAAAA AAACTGAGAA 22920

CGACCACATG TCATATCGAC TGCTTAAAAG AAAATACGTA TATTTACAAA CATATACACA 22980

ATACTGTCTT TTGTCTGGTT AGTTTAGAGG TTAGATAAAC TGCAGTATGT TGTAGTGGAC 23040

AGATCATAGA ACTAGGAGTC AGGATGTCTG GATTCCTAGG AAGCAATGAA TAGGTTGCAC 23100

GGTGCAGCTC AAGGTTATTC AAAGTGTGGT GCCCAGACCA GCATCATGAG TATCCTCAGG 23160

GAGCTTGTTA GAACTGCAGA TCCTTTAACT CATTGAATCA GAATCCCTAG GTGTGGGGCC 23220

CTGAAATCTG TATTTTAGCA GGCTCTCTGG GATTGTGATG TGCCTTAGAG TTTGACAACC 23280

ACTGGGTAGC TGATCCTGAC TTAGACTTAT CAGGCATGTG ATCTTGAACA AGTCACATAA 23340

TCTCACTGAG TTCAGTTTTC TTATGTTTAA AATAGGCCCA ATAATATCTA TTTCACATGG 23400

ATTGCTTTGA GGATTAGGCA AGAGATCTGT AACAGACACT GTAGAACAGT GTCTCTGGTC 23460

TACAGCTGAC CTTCCATAAA TGGTAGTTGC CTTGATTCTC TGCTCTGCCA CATAATAGCT 23520

GGTTAACTAT GAGCAAGTAA TTTAGTTCTT CTCAGTTTAG TTTCTTCCCC TGTAAAAGAA 23580

GGAAAATAAC TGTTATACTC CATTTCTGAA TTGCTATAAA AGTCATTTAA TTATGGGCAT 23640

TGAAGCTCTT TGTTCACTGT ATAAGGACTG TACATCTAAG GGATTAATGA GACCAGGCTT 23700

ATGATTTTAA GCATGGAGTA AATAGTAACA CTGACTCTGT TCTATGAACC ACATGGAAAC 23760

TCTAAAGAAT ATGCACATTT GAAACACAGG TATCATCTGG GGAAGGTGAT CTGCTCACCC 23820

AAACCAGTTC ATGAACATCA ATCTCCAGTG GCGTGCTGGA GCTAGCTGTA CCAGCTCATG 23880

AGGGCCAATT GTTTCATTTT TAGGAATTTT GTTTGCTGGT TAAAAATAGT CATTATTTAA 23940

AATTAAATTA TGTAAACAAT AATATTAGAT AAAATAAGTT AAAATAAAAA CAAAGGAACT 24000

AATTATCCCC AAACTCTTCC CCACCTAATT ATTTTACTAT CTGTGCCTTG GGATTATTTA 24060

CATTGATTTT ATCCATATGG TGACAATACT ATTCATATAT AAATGGTGTG CTTCTCTTCA 24120

TAACTCTACA TAGCCTGATG TCAGGCTAGT AGCTTGAAAT TGGCCACAGT GGGAGTGTGA 24180

GCATTTGTAC CATGAGGCTT GGCCAAGGCT ACAAATCCAG ACTTTTGTTT TTCCCTCCTG 24240

GAGAGCTGTC TGTTAAAAAT TTACCAACAC ACCACTGGTC TTACCTTTGT TAATTTACCA 24300

CAGTCCAGGT TCTGACCTAG ACTTAGAAAC CTGGATTTGT CAGCAAGCTG AGGATAGAGC 24360

CATTATTTCT AAGAAGGACT CACATTACCC AAGTGCAAAG CCTGATATAT ACCTTCAGAA 24420

TATCAATTTA TTAATTTACA GTGAAGAAAG CCACCCCAGG GCATTCCCCA GGGGAAGGCA 24480

AAAAGAGCTA GTTGCACATT TTGAATGTTT GATGACATTA GGGTAAGGTG ACACAGAATA 24540

TCCATTTCCA CAACTGAGAT ACCTGCTGCC TTAAGGAAGG GACAGGCAAG TCCTTGGGCA 24600

GGACCTTAGA TTGTCACTGT CCATCTTGCT GTAGGACTCT CCTTTCCAGG CATGACGATG 24660

GCCAACTCTG TCCTCCTACC CTACTGATGG GATTATCTTT TCTTGACACA TGGCAATGCC 24720

TCCAATCAGA GGCTGGTAGC TATTTTTAAT CTTCAGGGCA GTATTTTTCA AAGGGAAGTT 24780

CATGGACCAT ATGCATCTGT ATCATTTAGA TGTATATTAA AAATGCTTAG TCTTCCCCAG 24840

TTATACTAGA TCAGAATCTC TGTTGGTGGG GCCCACGAAT CGGTATTTTC AACAAATCAC 24900

TAGGTAATTT CTGTATATAC TATAGTGTGA AGACCACTGC TTGAAGGTTT CTTTGCATAT 24960

CTCCACTAAA TATAAAAAAT ATTGACTTCT AGATTTAACT CCCAAAGCAC TTGCATTTTT 25020

AAGTTTCTGG GGGCATTATA TTGTGGTACC CCTATACCAC TCACACTCTA GTCAGGAGGT 25080

ATATTATGGA CTGAATGTTT GTGTCCCTCC AAAACTCATA TGTTGAAGTC TTAGCTTCCA 25140

ATGTGATAGT ATTAGGAGAT GGTGCCTTCT GGAGGTAAAA TCAAGCCCTC ATGAATGGGA 25200

TTAGTGCCTT TAGAAAGAGA GCTCGTCACT GTCTTTCCAT CAATTGAAGA TGCAGTGAGA 25260

AGCTGGTAGT CTTGCATCTG GAAGAGGGCC CTCACACAAC CTGATCATGC TGGCACCTGG 25320

TCTCAGACTT TCTGCCTCCA GAACTATGAG ATGATAAATT TCTGTTGTTC ATACCCCACC 25380

CAGGCTACAA TATTAGGTTG CTGCAAAGTA TTTGTGATTT TTGCCTTTAC TTTTCAGGGC 25440

AAAAACTGCA ATTACTTTTG TGCCAACCTA ATATTTTGTT ATAGCAGCCC GAACTAAGGC 25500

AAGGGAGACT ACATCAGACA GTGTAGCTAT GTAAGTACAA ATGTATCCCT GTTGAAGGAA 25560

AACTAAGTTC TAACCCTGAC TTCAGGCCAG TAGCCACCTT TTCAATCTCT TTCATGAAGG 25620

GACCATTATC ATTATCACTG GTGGCAAAAA TAGAGCACGA GAATGGAATT TGCTTTTCTG 25680

TGAAATCTCA GTGTATACAG ATGAAGAGCA AGGGTTTGCT TTCATCTCTA AGAAGCAAAA 25740

GTGAGTACGG ACTGGCACAT TATCAGAGAA AGAATCATTC TAGCTCGGTG GGTCTTAACC 25800

AGGAGTGAAT TTGACTCCAG GGAACAGTTG GCAATGTCTG GAGACGTTTT TATTTGTTAT 25860

AGCTGGGGGA TGAGTGGGTG GGTTGCTACT GGCATCTAGT GGGTGGAGAC CAGAGATGCT 25920

GTTAAACATC CCGCAAAGCA CAGGACAGTC CCCGACAACA AAGAATTATC TGGCCCCAAA 25980

TATCAATAGT GCCAAAGTTG AGAAACCTCA TTCTAGCTTC CTTTTCCCTT CTACGTTCTA 26040

ATCAACTGTT GTTCTTTCAG CATTAGGATT CATCCAGCAG TCTCTTTCCC CAGCAATTTG 26100

TTGAAATTTT TTTAAAAATG GACTCATTTT AGTGTCACAA GAAAAAAATA CATTCACAGG 26160

AAAGGATGGG TCATTTTGTT TAATGATGTT TTGCCTTTCA CATAGCAAAA GCTTAATAAA 26220

GTATTTTTAA ATAAAATGGT GAATAGATCA AAACATTAAT TTCACATGTG TTTTAATAAA 26280

TAACAGGAAG ATGGCTATAT TATATAAATT GTTCTTGTAT ATGTCTTGAG TGGATCATCA 26340

AACACAAACG TATCTACATG CCTTTTCTTG TGAATAGATC TAATAATAAC GCTCTTCTAA 26400

AAACAAATTA AATGGATATT ATTTGCTGAG AATGTAATGC TTGTGTGAAT AGAAGCCAGC 26460

CCTGAATCCA AGCCCCCAGA TCTATTTAAA GAATTTGAAG AATGTCAGAA AAGCACGTGG 26520

CTTCAAGGTT AATGTGTAAG ACTCACAGAA ACTTGAAAAA TCACTATGAC TAAAAAGAAA 26580

GTATGAGCTC CCTGCATGCC TGTAAATTGG AATGACAGCC AAAACCAGTT AATTATAAAA 26640

ACAGCTAATT TAACAGGTTT TCAAATTTGT TTCTTTCTCC AAGTAGCATA TAGTCAATAA 26700

TCCTTAAAGA GAAAGCAAAG AAGGGGAAGC ACTGAACCAA ATTTGCTTTT TTGTACCTGC 26760

TCAGCTCAAA TGCAGAGTTC TCTACCTGGA AATTGACTGC TTCCATAGTT TGATAGCCAC 26820

AGAGAGATGG GAACAGAAGG AGAGGTATAA TCCCAGACTT GATTCAGCTA TAGAGAATGA 26880

CAATAGTGTC AGAGGCCTTC CAACCAGAGC GACTCCATCT TGAATACGGG CTGGGTAAAA 26940

CAGGGCTGAG ACCTACTGGG CTGCATTCCC AGGAGGCTAA GCATTCTAAG TCACAGGATG 27000

AGACAGGAGG TCAGCACAAG ACCTTGCTGA TAAAACAGGT TGTAATAAAG AAGCCAGCCA 27060

AAACCCACCA AAACCAAGAT GGCCATGAGA GTTATCTGTG GTTGGTCTCA CTGCTCATTG 27120

TATGCTAATT ATAATGTATT AGCATGTTAA AAGACACTCC CACCAGTGCT ATGACAGTTT 27180

ACAGGTACAT TGGCAACTTC CGGAAGTTAC CCTCTATGGT CTAAAAAGGG GAGGAACCCT 27240

CACCTCCCAG AATTGCCCAC CCCTTTCCTG GAAAACTTGT GAATAATTCA CCCTTGTTCA 27300

GCATATAATC AAGAAGTAAC TGTAAGTATC CTTAGGCCAG AAGCTCAGGC CACTGCTCTG 27360

AATGTGGAAT AGCCATTCTT TTATCCTTTA CTTTCTTAAT AAACTTGCTT TCACTTTACT 27420

GTATGGACCC CTGTGAATTC TTTCTTGCAA GAGATCCAAA AACTCTCTCT TGGGGTCTGG 27480

ATCAGGACCT CTTCCCAGTA ACAATAGTAG TAAGGGGTCG GGGAAACTGG ACAAAGGAGT 27540

TTAAGAAGCC TTAGATAAAG GGTCCTCATC ATTGTCATAA CATAAAATCA TGGACTCCTA 27600

GAATTTTATA GCTGATAGGA TTAGAAATTT CAAAATTCAA TTTCATTAAT TTTCATCTGC 27660

GAAAACAGAT GGCCAGAGAG GCCAAACAAT TTGTTAAGGA GCACTGAGGC GATGGAACAC 27720

CACACTGGAC CGCAAACCTC CTAGCAGAGT ATACAAGGCC TTTGATCTCC TCAGTCAGAA 27780

TGAACTAGAG CTTTCCAGGG GTACCCTTTC TGACTGTTTA GCATGTTTGC CAGTCTGACT 27840

AATTTTGAAG TTGCTTAAAT ATCTGTCATT TCCACTGTAT CATAATCTCC TCATTCATCT 27900

TCAATCTCCA ATGCCTTGAA CTCAGTAAAT GTTAGTTGAA CAAAAGTAAA TTGAACCCAG 27960

AATTTCTGAT CATAATCTGG AGCACTTTAA AATTGTCAGC TTACTGGGAA ACGGGATAAC 28020

ATGTGATTTG TCTTTGATTT TTTTTTTCTC ATATGCTTTT TCCACCTATA GATGCTACAC 28080

GAATGTTTTT AAAATCTGAT ATAAAAATTA AAATTAAAAA ATTAAAAAAA GAAAATTTGA 28140

TACAATGCTA CATTTAGAGT GTTGTGATTA GATTCCTTAA GTGTATCATG GTGATCTCTA 28200

CATCACGTGG TGATCAAATT GCTTTGGGTT TTAACACATA ACTGACAAAG GCTTGGGGAC 28260

ATGTAAGATC CCAAATACAT TTTTATTGAT TTTTTTTTCT TGTTTGTCCT CTTTTAAATA 28320

ACTTTTTTTT GTTATAAGAA TAATTCATGT TCAGTGGAGA AACCATAGAA AATAGTGACA 28380

AGTGAAGGAA TAAATTTAAA ATGACCCATA ATTGTACCAT ACATTCTGAT TTTTTAAACG 28440

CTGAACAAAT TAGCCTTGGG TAAGTACCAG GAATAGAGTG CAGCATTGAA AGTTAAAGTT 28500

TGGGGAAGGA TAGCTGACTT AAGAAATTAT CTAGTTAGAC ATTTTTTGGA TGGGGTAATT 28560

TTGCAGATGA CATTAGTGAG AGAAAGGACT TGCCACTCTC ACACAGCTAG TAGGGGTGTG 28620

GGAGGATATT GGAACCAAGT TTCAAGTCTT CAGTGAAGAA TCAAGGGAGA AGTTCTAAAA 28680

CCTAACAATA TCCCTCTGGA TGGACATTTA TTTTATTACT ACAATAAGCC ACACGGTGAG 28740

TCATAAGGAG CATTTCATTC TTCTAATATG TCTCTACTGT ATTTAGAATC TGATAAAGCC 28800

CTATTAGAAT TCATCTCTTT AAGAATAAAA GAAGCTGAGG AACTAAAGAG AGGGTTGGAA 28860

TAATCCACTA ATTATATCCG TTAAGCTTCA GTTACGCTAA TAAGGAATAT CACATGACTG 28920

TGGTGTGTGC TTGTTCTGAA CAGTAAAGTA CATGAGGAAA GATAAGATTC AGGGCTGAAA 28980

TGTCCTTCAG CATATGTAGG TAGTGGTGAT GAAAGTCATT AAAAGAAAAA TTGATTGAGG 29040

TATTTTAGTA AACAAAAGAA CTCACCACTT ACCCATCAGG AAGTGTATTG TTAATGCAGT 29100

GCTGTTCAGC CTTCTGGAAG AAAAGGTTTC TTCATGCTTC TCTCTTTAGC CTAATTCTTA 29160

TCCTGTCACT TTTCAGGCAA AATTAAAAAA AAAAAAAGAT TGAAAACGAT GCTCCTATTT 29220

TATTTGCTTC AAAAGAAACA GGCTGTTGCA TTGTGCTTGG AACAGTTTAC TCTTGGCCTT 29280

GATGTAAGTG TGAAAGGAAG CCCATGTAAT TGACTAGGCA GTATCTGAAG AAGCAGGAAA 29340

TACAGTGTTA AGAAAATGAA CAGGCATGAA AACCATGGCT ATTTGATAAA AGTAAATAAT 29400

TTCTGCAGTT CACATGTTCT CAGCATATTT TCTTTGATAC TGACTTGCTT AATATGACAA 29460

TAGCAGAACC ATGGTAGCTT GTAGGCATTA CTTTTCTTTT AATTTCTTTT ACATTTTGAA 29520

TTTACCAGCA CTCACATTTG TATTACTTTT GGGTTATACT GAGGATCTAT AACTTATAGA 29580

TCAAATACCT GACATATATA TGCATTCTCT GAAGTCTTAG GGCAGAACTA GAACATTCTT 29640

GTGAACATCA GTATAAGATA TTAAAATGGA AGTTTTGCCT AAGACTGAAG ACAATAAAAA 29700

TATCATAGTC TGAAATGAAT GCCAGCACAC CATACAGGAT TTAAATATCT ATACATATAT 29760

ATGTGTGTGT ATTATATATA TTTAATATAT ATCTGTGTGG GATAGGAAGA GGTAGGGGGA 29820

AATCAGTTTT ACAATTATTA AGTATTTCAC CCTTGACAAG AGTATATATA TTGGAAATCA 29880

GTTGGAGAGT ATTTTCAAAG ATAAATGTTA GTGTGCTATG AATGAATCCA CCCCTACCAC 29940

CACTGAGGCA GGGTAGGAGA GGCCTGTGCT CCTCAAGCAT AGTTGGAAAA GGACCTCAAC 30000

AAGACCACTT CAAGAGTCTA ATGTGTGGAG ACTGTTGCTT AGGGAGACCT TATGGTCTAG 30060

CTTCTGACTC ACAGCTAAGT CAGGGAGACA GGTTGGCTGC TCTGATCGTG GAGTCCAAAA 30120

GATGGCCTGC ACTGAAAAGC CTCATGAGTG TTGACTTAGG GCTAGTCTAA GAGGTCCCTG 30180

GAAGAAGAAA CACTCAGTAG GAGAGAAGCT GGAGGTACCT TCAGTGCTGA ATTGGAACTA 30240

GATTCATTCC CCCGTGGAGC AAATTACATA GGAAAGATGC CCAGTGATGG AGAGTGGGGG 30300

TGTCTCTAAC AATTACCCAC CCACTGCCCC CACCCTAAGA AAAAGAAAAT CACATACAAC 30360

CAGTCAGCTG TAAACATATG CCGAGCCTAG TAAACTCAGA TACTAAGTTA CCAGGGTACC 30420

TGGCAAGTAA GAACATTCCT GATTCCCTTC CTCTCTTCTC TTTGCCCTCC AACCTTAGTG 30480

GCTAGCAAGA TGGGGAGAGG AGGAGAAGCT GTAAGTGGGG AAAAAAGAGC AGCTTTCTCT 30540

CCTTTTCAGC TGCTGGATTC TCCCTCATCA TAGGCCTGAG CTGGGGAATC AGGAAGAAGG 30600

ATTCTTTTTA AAACTGAAGT AACGTTATCA TTTAATTTTA AAACATTTTA AATTTTGACA 30660

ATGTTGAGAT TAGATATACT AATTATTAAA CTAAGATTAT GTTTTGCAGC TTGAAGTGAT 30720

AAGAAAAACT CTTATCTAAG AGCATCCAGG AAAGTCGGGG GTTTCCTGAA CATCCTTTTA 30780

AATCCTTTGG AAGTCAGCTT TCAGAGAGGA TTTAAAGTGT AGACTGGGCC TTCAGAAACT 30840

TGGTTAATGT AGGGGTTTCC TATGCAGACT TGGGGACTAT ACCTTGTGTG GAAGAGAGAA 30900

AATAAGATTA TCTTACATTT TTCCCATTCC TTTTTCAAAA AGAAAGCTCA GCTAGCATGA 30960

AAGTTAAATT CAAAACGTAA TGGGTATTAT TTGCATATTC AAATCTAGTG CATATCATGT 31020

AAGTACTGAA TTATGGTATT CATTATTTCA AATGACAAGC TGGATTTTTT TTTCTTTCGA 31080

ATTTCACAAA TTAATTTTCC TTGGAACCTT TTGGTTTGGG CTTTAAGAGT TTAGGCTTTC 31140

ATCACAAAGA GAGGACAGCC TTGAAGATTA AAGTGTGTGG CTCTTCTCAA GATGTTCTTA 31200

GTCCAGCAAA GGATTCTATG CATATTTGGG CTTCCTTCTG TCTCATAACC TGTATTTCTT 31260

GATATTCTAT TTATATTCTG TAAGATTTTT TTTTTAAAGG AAAAATTCTT CCATGGTTGA 31320

AGGACATGTC AAAAATAGAG GATACAGTTT TATATCAAAG GAAGTTTCAT GATATGACTG 31380

TAGAAGCTCA TTTGACTTAA GACACATCAT TTCCTCATGG AAGTGTTAAA CAGATCTGTA 31440

CAATAAGGTT GGCAATCTTT GTGTAAAACA GTTTTTTTTC TCCTGCTCTA AAGAAAGTGT 31500

ATATTTCAAA ATGTGAATGT CAGCAGTCAG AAAATAGTAT TTTTTTAACT TCGTTTTCAA 31560

AGTCCTCAAA AACCTGTACC TAATCATGAA TTTTTTTTCC CACAGATTGT TTCTTCTTCT 31620

CCCTCCCAGA AACTTTGAAG TTTTTCTACA TGACACCAGG ACCTATGTCT TTTTTTAATT 31680

ACACAGAAAT GAAAGAAAAA AAGTGTGTTG TATCGTTAAC CAAATATATG AAATCTTTAA 31740

GCTGTATTTT TATTTTTAAC TTTGTTTTGC AAAGAGGCCA TTCCCTTTGG TTAAATAATT 31800

TGTTATTCAC AGTTTCCTTG TCCTCATATT ATCAAGGGGA AAATTGTAGA AATTTTAAAG 31860

GAAGCTCTAG GCAATGTTTT CATCCCTGAA TCTTTGGAGA GTTATAAAAA CAAACAGATT 31920

ACTGAACCTG TAAGAGAACC AATCGTGAAG TCATTACATC TAAGCATAAG CAAAATCTCC 31980

TCTTGGATCA TTAAGTTATA GAAGAAAAGA AAGCCTGCAC TTTGAAATTT AGATAAAGCT 32040

TGGTAACTTG TAAGTCAAAC ACGTAAAATT TTACAATTCA GGAATATCGA TAGCAGTTGA 32100

GTTTAATAGA CTTCTCACAT TCCAAATTTA AAGCTTCCTT CTCTGTGCTA ATAGAGATAC 32160

AATAGCAGTA GGCGTTTAAG AAGAATGAAT CAACAATTTA AAACTATAAT GTGTTTTTTA 32220

TTCATCTCCC TTATTCACAT ATATTTGTTT TGTTTTGAGA AGGAGTTCTG CTCTGTCGCC 32280

CAGGCAGGAG TGCTGTGGCA CGATCTCAGC TCACCGCAAC CTCTGCCTCC CGGGTTCAAG 32340

CGATTCTCTT GCCTCAGCCT CCTGAGTAGC TGCGATTACA GGCGTGCGCC AGCAACCCCG 32400

GCTAATTTTT GTATTTTTAG TAGAGACAGG GTTTCACCAC GTTGGACATC TTGGTCTCGA 32460

ACCCCTGATC TCAAGTGATC AGCCCGCCTC GGCCTCCCAA AGTGCTGGGA TTACAGGCGT 32520

GAGCCATCAC TTCTGGCCCT TATTCGCATA CAATTTAAAA ATCATCACAG AAGGTTTGAA 32580

AGAAGGAAGG GGCAGAAAAT TACCTACTTT TCCTCTCCCC AGCGATCTCC TTCAAATCTG 32640

TGCCTTTTCC TCAGGCCCAG GCCTCAATTT ACTGAGCAGT CACACCTCAC AGAGGGAGGT 32700

CTGGGCAATC CACTCTTGGT CACAGGAAAG CCATTGACCC TCCCACTTCC TCTCCTCCAC 32760

CTTGTTCTCA ACTCTTGACT TTGGGCTTTG TTTCTGTTCA AGTCCTAGAA CTGGTTTCTT 32820

TTATCAGGTT AAGTGATTAG TTCTCTTTCC CTCTAGTTGC TCTCACTCCC TGACTCTTGC 32880

CTTCTGTAAC AACTGGAGAC AACTCTTTCA AAACCAGCTC CAAGCCCCAG ACTTCTCTCT 32940

GGGCTTTAGT TCGTAAGGCA GGTGCCCTAC TGAGTGAGCC TAGATCAGAC AGAAACATAG 33000

CTGTTGGCAA GGATTTAGGT GAATTTCCTT CCATTGTTTT TCTAATACCT TTTTTTTTTT 33060

TTTGTAAATA TAACCATGCA CCTACACACA TATTTGAATA TCCTGCCTTT TTATTTAAAA 33120

TGACATGATA GGTCCGGGAG TGGTGGCTCA TGCCTGTAAT CCCAGCACTT TGGGAGGCCG 33180

AGGTGGGCAG ATCACCTGAG GTCAGGAGTT CGAGACCAGC CTGGCCAACA TGGTGAAACT 33240

CCATCTCTAC TAAAAATCAA AAATTAGCCG GGCATGGTGG CAGGCTCCCA GCTACTCAGG 33300

AGGCTGAGAT GTGAAAATCG CTTGAACCCG GGAGGTAGAG GTTGCAGTGA GCTGAGATCT 33360

TGCCATTGCA TTCCAGCCTG GGCAATAAGA GCGAAACTCC ATCTCAAAAA AAAAAAAAAA 33420

AAAAAGACAG GATAAACATT CTAGATAGTC TCTATAATGG TCATGATTAA GACAATAAAA 33480

TAGTCTGAAA TTGTCAATAT ATATTAATAA TAATTTATTT GGCCATTCTG CCAAGTAGCA 33540

GACACCTGTC ATTCTGCCCA CTCAGCACCT CTCTTTCTTT TAGGGAAATG CTACCCACTC 33600

TTTGCATGGG TTCTGGATGG AACTGTTGAT CACAGTGTTT TCACTCCCCA TTTTGCCTCA 33660

CCAGAGGTAG ACAGAAGACC CAAGCCAGGC CAGTTACACA CAATCTTCAG ATAATTACCG 33720

TATTGATCAC AGTATCACCC CACTCAAGGC TTGGTTGGAG ATGAGCAGAA GAGACTAAAG 33780

CTGGGTCATT TTAATTAACA CCTGTACCCC AAAGAAAGAC TGTCAATGAG GCTTTTATAC 33840

CGACACTCCT GGTTTCCATT CTTCCTGATG CCATTCATTT GACGAACTAC CCAATCTTTC 33900

CAACAGTGTC TTTGGAAGAA AGATAGTCAG AAAAGAAGAT AGAGTTGTTT TCTGTTCTTT 33960

GCAACCAAGG AACTCTAAAT GATAGACTTG TTGCTAGGCA CTTTGGTTAT TTTTATTATC 34020

TTGAATACTT CTGTGATATA CTTCTTTGTG CATGCCTGTT TGTACGGATG TAGCTTTTTA 34080

TATATTTTAT ATAATTTCTC AGAAGTGGAA TTACTTAGTC AAAAGGTATG AACATTTTCT 34140

GATTCTTAAT ATAAATTGTG CAAATGCTTT TTAAGAAGAT TATACCAGTT TACATTTTGT 34200

GTTATATATA ACAGAAAGTA CTACTGAAAA ATATTACAAA AATTGTCTCT CTGTTCAGGA 34260

GGACTTGTAA TAGATGATAA AGTACTTGAA ATAGGAACAT AGAGCATTTT CAGTTTAAAA 34320

TAATTTCATT GGGTTATTTA CGGAATCCTT AGAATTATGG CCAGACATTT ATAGATGATC 34380

TGTACCAAAC CTAGTTGGTT ACATAAATTG CTTATTCAAC TGGCTTAAAT CTATAATAGA 34440

AAGATGACAC TTACTGAATG TTTAATATAC ACTTTGTCAG GGGCTTTGTA TTATTCTATG 34500

ACATCTTCAA AATGACCCTA CTTTCCTATT TTATAAGTAA GGACAGGAAG GCTTCAAGAA 34560

CATGACTAAT TTTCCCAAGG GCTGTACCAA AGCCAGAACC CAAATCTATA AGGCTTTTAA 34620

ACCTGCATTC TAAAACTGCA TCTCGGCCAT CTTATTCCTA CAGAACTTAA GGTTAGAAAG 34680

CCAGATTGGA GTCCCAATTT CACCACTTAG TAACCAGACA AACTTGAGGA ATTCACTCAA 34740

CGTCTTTGAA TCTCCATTTC CTAATCTTTA AAACTAAAAC AATAATACTG GCCCTACCTA 34800

TTTCCTAAAA TTTCGTGAGG CACATAGAGC TAGTGTGGTA GAGTGCTGTA CAGATGTCAA 34860

GTGTTAGCGT GAATTACTTA GATCCCTGAA CACCATGGAT GAATGTGTCT GACTGCTATT 34920

AGAGGTCATA AAGAATATTG GGGCCAGGTA CATTGGCTTA TTCCTATAAT GCCAGCACTT 34980

TGGGAGCCTG AGACAGGAGG ATCACTCGAG GCCACAATTT CAAGACCGGC CTGGGCAACA 35040

TAGTGAGACC CCTTCTCTAC AAAAAAAAAA AAGCAGCCAC GTGTAGTGGC ACACACCTGT 35100

AGTCCCACAT ACTCAGGAGG GTGATTTGGG AGGATAACTT TAGTCCAGGA GTTTCAAGGT 35160

GCAGTGAGCT GTGATTGCAC CACTGTACTC TAACCTGGAC AGCAGAGTGA GACCCTGTCT 35220

CTAAAAAAAA AGAAAAAAAA AATAATAATA ATAAAGAATA ATGGGCCTTG GGATACCCAC 35280

TCCTCTCTTT CTGCTCTGAG TTGTGAAGCA GTTGAGTTAC ATATGCATGT CCAATGGATG 35340

AGGTTGAAAA TATCAACTGG ATTGGAATGT GGCTTACTTG CGTGGCCACA ATGAGCTTCG 35400

TAACACTTCC TGACAGGGTG AGAAGACAAA CTTCCTCACC CAGTCACTGG CAGAGCTGGA 35460

CACTCTGTGT CTCTCCCACA GAACAACCTC TTACTGCATG GAGGTGGATG AAAAAGTCAA 35520

CCGAGAACAG GCTACTCCAA AAAGCAGAGC ACCAAAGGCA CCAGCTGGTC AGGTCCCCCT 35580

TCCTAAGTAA ACAATCACGT AATTCATTCG GGACAAAGCC AGAGAGGTGG TGTGGAGAAA 35640

GAGAGGGCAG TTTCCTCCCA AGTTTTTCCT GGAATTCTTT ATGGGAATAT GAGGTTTAGG 35700

GGAATAAGAC TTCCCTTTAA CAGTGAAGAA TCCCCAGCTC TATTGGTAAT AGGAAATCGC 35760

TTACAAGGAT CATGGGGAGT ATTTCCTCAG CTCGTTCTGC CTCCTACTTG GCTGAGTGGA 35820

ATGGAACCAT CTGTGGCTGC TGCATATGAT ATTGTCAACT TTGTCATTCC ACACCCACTC 35880

CTTGACGCCC TACCATGTGG TCATAAGACT CCCTTTAAAG TGTTCCTTTA AAAAACAAAA 35940

TGTGTTTTGT TTCTATAAAA TACAGCTCAA TGTCAGAACC CTTGTCTTGT TTGCTCTCTG 36000

ATGTAACCCT TTCACAATGT TTGGGCAGCT TATTCTCTCT ATTTCCCTGT AGGGTCCCAT 36060

CCAGGCCAAA GTGAGTGCCA GCCTCATTTG GGCAGCACAT GCCCTGTGGA AGGGCAGGAA 36120

GAGACGAAAG CTAATTGTAA CTTTGTGATT AGCTGTCATG GATGCCTGGT CCTGTCAATA 36180

GCGCTCAATA AAGCCAGAAG GCCAAGCGTT CGCTTCTGCA TACTGATTGC TGAGTCAGAT 36240

TTCTCAGTGC AGAAGGGCTT TCTAGGCAGT CAATTTTAGA ATATTAGTCT TGGTTCTTAA 36300

GTGGTTAAAA TCCCTAGCTG GTCTTTAATC TGAGCCTGGA GAATTTAGTT AGGGCTGACA 36360

TTCTGCTGTG ATATTTTTGC CCTCAATATA TATGTCTTTC CTCCATCTCT TAGATCCCTG 36420

AATCATAGAG ATATATATGT TATATAATCA ACTGTCTCCA GTCTCTAAGA GTGATAAGTA 36480

CACATTGTGT CAGGTTGAGG GGACAGGAGA ACTTTCAAAA GCCTTTCTTG CCCCTTTTTC 36540

CTTCTCACTG CCTCCCACTA AGTCCAGCCA CTTATTATTC AGCTGACACT ATCATCATGA 36600

CCATGAGTCT TTTGGGGCTA CCCTGGTTCG GATCCTTTTG GAGGTTTGTT GCTTAACTCT 36660

GTCTTCAGTC CTATGGAGCT GCTTTTTCAA TAAGTTTCTA TTTTGGCTAA AGTTGGCCAG 36720

AATCTCCTTG TAACCAAAGA ACAAATAAAA TACCAGCTTG CAATGTTCTA TGTTGCTTCC 36780

ACCAAACTTA TGCAGCACTT CCTATCTAAT CCACCTACTA GTCTTTTTTT TTTTTATTTT 36840

TTTGGAGACG GAGTCTCGCT CTGTTGCTCA GGATGGAGTG CAATGGTGCA ATCTCGGCTC 36900

ACTGCAACCT CTGCCTCCCG GGTTCAAGCA ATTCCCCGGC CTCAGCCTCC TGAGTAGCTG 36960

GGACTACAGG TGCATGCCAC CACGTCCGGC TAATTTTTGT ATTTTAGGAG AGAGAGGGTT 37020

TCACCATGTT GCCCAGGCTG GTCACGAACT CCTGAGCTCA GGCAATCCGC CCTCCTCGGG 37080

CTCCCAAAGT GCTGGGATTA CAGGAGTGAG CCACCTCACC TGGCCCCGAC CTACTAGTCT 37140

TTAGTGTTTG CTTCCTTCTA TTGGGTAATT GTCTGTTTAT ATGCATGTCT TGTTTCCTCA 37200

AATAAAATGT GGTCTTCTCA AGGGTATTGG CCCATGTTCT ATCCATCTGT AGATATCACA 37260

GCACCTAGCA GTGTCTTTCA CAGAGGAAGT ACACAACTGG CATTATTGAT TCATTGCTCC 37320

ATTTTTTCCT TCTTTATCCC CAGCATTTCT CAATAATTTC AAACATCTCC ATTGGAGTAC 37380

CGGAGAAAGC AGGTAGCTTT ACTTGCAGCT ATGTTTCTAT CCCCATAGTA ACTAAAAGAG 37440

GACCCAGAGA AACATGTTTA AATGCTGTCC TGTTATCAGG ACCTCAGCCT TCTGATGCTC 37500

CGTGGCTTGG GGGTTAATGC TTGATCATTT CCTCCCCAAC CTACACTGTG TACCTATGCT 37560

AGTCTCTTCA TGAGGACTAA GCCCCATAGT AAAAGGGCTA GATAAATAGA AAATCATTTT 37620

ATGTAATTAT AAGAATGAGA ATACTGAGTA TTACTGGTGT TTGTTTAGGA TAAGCACATC 37680

TTTATTTGTA TGAGAAAAAG AAAAAGAGAG TGAAAAATAT ATTAACGTGC ATATAGTTCA 37740

GGACCATGGA TTGCAAGTGA CAGAAACTCA ATTCAAACCA ACGTAAGTCA AAAGGAAAAT 37800

ATATTGGCTC ATGTAACCTT CTCACAGAGA GGGCAGGATG GAAGGGGCTT TGGGAACAAG 37860

AGAATTGTTC TCAAATTCTA GGAATACTAG GATTAGTCCA GGATGGGTCA CCTTCCTGTC 37920

CCTGAGGTGG TGGTAGCGAT GGTAGAGTCT TATGGGAGGA AAGAGTGCAT GTTAGGATGA 37980

AGGTAGGGCT AAGCAAACAA GGGCAAGGGC CACTATATCA TGCTAAAAAT GGTTTTTTTT 38040

GATGTCTTCC TTAATTTCAC AAATGCTTCC AACAAAGTAG CACACAGGAA AAAGAACATA 38100

GGGACTCTAC TGGTGGGTGC TTTTATCTTA AGCCTTGTAC TTGCTTTTCA CAGCTTACTC 38160

ACTGCTTGTA CCTGAGGCCA TATGCCCTGT AAAAGCTTCT GCAGGGTTTC TACTAAGCTG 38220

GGTTCCTTAT ATGGCTCTCT CCCATTTCTG TTGCCTCACT CTAGTGATCT TTCTCTTTTC 38280

CTCACCTCTG GGACTGGTGG CTGTTTGTAT GGACTGCCTT AGCTTTGCTT TGGGTTTTTT 38340

CCTGGGGACA ATGTCTTCAG ATTATCCTAG ACCAAATAAA CTACAGCCAC TGGGCCAGGC 38400

TCTTCCTCCT CCAACTGGAC CATGTTCCCA GGGCTCTTCA CCTTAGTTTA GGTCAAGCAT 38460

TCTTGGCAAA AGAAAGGCCT AGTTAACAAT AGACATTCTA GCAATTGATT CTTTTTGACA 38520

TGTTGTAAGA TCTATTCACA TTTTGTAATT AAAGCATTCC CCTATGGAAA CCAACACGAA 38580

CTAAGCTGCT CCTGGAATGC AGGGTGGCCT CCTCAATACA GGATGTTCTA GAGAGCTGTA 38640

TTTTGGGCAC TTAACTATTC TCCACTACTT AGGGCACAGC ACTGAAATTA ACACCACTAA 38700

GTTTGTCATG TCCATGTAGT TAGTCTCAGG CAGTGCAGCC TCAGGAGTGG AACTGACCTC 38760

TTATGTGTGT CCAGCCTTTC TTCCTTCAGA AGTCAGCTGT GTTTTCTGCT GACTCTCCAT 38820

AGGAACATCA GTCCTGAATC CTCAGACCAC CATCTGGAGT AGTAAGTGCT CCTGACAGTC 38880

CTAGAAGTTG TCTACCGCTG GATCTCCAAA GCGTGTGACA CACCGTGAGA GAGAAATGAG 38940

AAAGCTGGGC TCTTCAGGTA AATCTTGCTT TTTCACAAGC CCCCTAATTT TACTGCATAA 39000

TTATTTTGAA TTCACTGATA ATTTCTACAA TTTTCCCATA AGTCATCTAC ACACAATACC 39060

CTCTCATGCA ACACTTGGCT TTGCTAATAC ATATCTATTA TGAGAGCTGT GCTTCTTAAG 39120

CGTAAATGTT TTATATGCAC TAAGGCTCTT GGCTTACATA TAAAAGGGGT ATTGAGCAAT 39180

GTGATACAGA AGTCTTTTCT CCACAGGTCT CATATGTAAA GAATTCATTA GATTGGCTGA 39240

AATAGACTGA TCTGTCCATT TCTCTGCTCA CTTATCATAA GGAAGTCATT AGCTAAGGAA 39300

CAAAAACTAC AATCTATGTA ATTAGAAGAA CAAGCTGGTT TTGCTCAATA TAAAAATAAG 39360

AAAAAGAAAC CATGTGAAAG TCAAAATATT TGTTTAATCA GGTCATTGAG AATCTATTAA 39420

AAAGTATTTG AATTCTTTAT GATGAGAACT ATCTTGACTC AAGTGGACAG TGGTGAGCTT 39480

TTTGGCCTGT GGTCCCTACG TAGAAAGGAG GCTTTGTCAT AAAGTCTTAT ATGGTACAGG 39540

TGCCAAGTTA AGTGCCCAAG CTTGCTCTTA AAAGCATACT GGATTTTGTT TTAGACTTTT 39600

AGTGAACTGA AGGGAATAAA CAAATCCCTC TGGGAGAACT TCTCCTCCAT CCTTGGTGAA 39660

GTCATTCTGC CAGAATTC 39678

(2) INFORMATION FOR SEQ ID NO:4a:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41008 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4a:

TGGTTGATTT GTNNATAAGG AAGTTTGGAA TCAATCCCGG AAGGAATTTT TTTTTTAAAA 60

AATTTTTTGG AAGGGTTTGG TAWTAAAAAA RCCAATTTGG GTTTTTAAAA ATAGGAATTT 120

TATGGGAAAA AATTTTCCCT TTTTTTTTTT TTAAGTTTTA GATGTTATGT TTCCTTATAC 180

TTAAAGTGGG TGTCTTATAG GCAGCATATA TCTGGGTCTT GATGTATTAT TTAATCTGAT 240

AATCTCAACC TTTTTGTTGG AGTGTTTAGG CCATTTACAT TTAGTGTAAT TATAGACATG 300

GTTTGATTTG CTATACCATC TTTTCATTTG TTTTATATGT GAGCCATCTT TTCATTGTTC 360

TTTTTTCATC TTTGACCATT TTCTTTAGTA CTGAATACTT TTTTTGTATT TCATTATATC 420

TATTGGCTTT TTAGTTATAC CTCTTAAAAT TTTTTTTTCT GTTTTATGTA GGATTTATAA 480

TATACATCTT TAACTTATCA CAGATTACCT TCAAATAGTA TTTTACCAGC TCAAGTGTAA 540

TGTAGAAACC TTACAAGAGT ATATTTTCAT TTCTGTCTCC TAATTTTTAT GCTATTGTCT 600

ATAATACATT AGGTTTGTTG TTGTTTGTTT TTACCTTATT GCTGTTGGCT GGGGTCAGCA 660

AACATTTTCT GTAAAGGGCT AGATAGTACA GGCATACCTT GGAGATACTG TGGGTTTGGT 720

TCCATACCAC CACAATAATA CAAATATGCA AGAAGTGGAT ATCACAATAA AGTGAGTCAC 780

ACAAGTCTTT TGGCTTCCCA GTGCATATAA AAGTTTTGCT TATACTACAC TGTAGTCTGT 840

TCAGTGTGCA ATAGTGTTAT GTCTAAAAAA ACACATACCT TAATTTTAAA ATGCTTTATT 900

ACTAAAAAAT GCTAACAATC ATTTGAGCAT TCAGTGAGTT GTAATCTTTT TGCTGGTGGA 960

AGGTCTTTTC TTATTGATGA CTGATCGGGG GTCAGGTGCT GAAGCTTAGG GTGGCTGTGG 1020

CAGTTTCTTA AAACAACAGT GAAGATTGCA ATATCAGTTG ACTCTTCCTT TCATGAAAGA 1080

TTTCTCTCTA GTGTGTGATG CTTTTTGATA GCATTTTATG CACAGTAGAA CTTCTTTGAA 1140

AATTGGAGTC AATCCTCTCA AACCCTGCTC TGCTTTAACA ACCTAAGTTA ATATAATATT 1200

CTGAATCCAT TGTTGTCATT TCAACAATTT TCACAGTGTC TTCACCAGGA GTAGATTCCA 1260

TCTCATTTCC TGAGATGGAA TCTTTGCTCA TCCATAAGAA GAAATTCCTC ATCTGTTCAA 1320

GTTTTATCAT GAGATTGCAG CAATACAGTC ATGTCTTCAG GCCTCACTTC ACTTTTAATT 1380

CCAGTTCTCT TGCTGTTTCT ACCACATCTG TGGTTCCTTC CTCCATTGAA GTCTTGAACC 1440

TCTCCAAGTC ATCCATGAGG GCTGGAATCG ACTTCTTCCA AATTCCTGTT AATATTTATA 1500

TTTTGACCTC CCATGAATCA TGAATGTTCT TAATGGCACC TGGAATGGTG AATCCTTTCC 1560

AAAAGGTTTT CAATTTACTT AGTCCAGATC CATCCATCCA GAGGATCCAC TTTCAATGCC 1620

AGTTATAGCC TTATGGAATG TATTTCTTCA ATAATAAGGC TTGAAAGTTG AAATTACTCC 1680

TTGATCCATT TTCTGCAAAA TAGATGTTGT GTTAGCAGGC ATGAAAGCAA CATTAATCTT 1740

TTTGTACATG TCCATCAGAG CTCTTGGGTG ACCAGGTATA TTGCCAGTGA GCAGTAATAC 1800

TTTGAAAGGA ATTATTTTTC TTAGCAGTAG GTCTCAACAA TGGGCTTAAA ATATTTGGTC 1860

CACCATTCTG TAAACTGATG TGCTGTCATC TAAACTTTGT AGTTTCATTT ATAGAGCACA 1920

GGCAGAGTAG ATGTAGCATA ATTCTTAAGG GACTTAGGAT TTTCAGAATG GTAAATGAAC 1980

ATTGGCATCA ATTTAAATCA CTAGCTGTAT TAGCCCCCAA CAAGAGAGTC AGCCTATTTT 2040

TTGAAGCTTT GAAGCCAAGC GTCGACTTCT CCTCCCTGGT TACAAAAGTC CTAAATGGCA 2100

TCTTCTTCCA ATATAAGGCT GTTTTATCTA CATTGAAAAT CTGTTGTTTA GTGTAGCCAC 2160

CTTCATCAAT GATACTATCT AAATCTCTTG GATAACTTGT GCAGCTTCTA CATCAGCATT 2220

TGCTACTTCA CCTTGTACTC TTATGTAATG GAGTGGCATC TTTCCTCGTA CCTCATGAAC 2280

CAACCTCTGC TAGCTTCCAA CTTTTCTTCT GTAGTTTCCT CGCCTCTCTC AGCCTTCATA 2340

GACTTGAGGA TAGTTAGAGA CTTGCTTTGG ATTAGATTTT GGCTTCAGGA AATGTTGTGG 2400

CTGGTTTGAT CTTCTATCCA GACCACTAAA ACTTTATCCA TATCAGCAAT AAGGCTGTTT 2460

TGCTTTCTTA TTATTTGTGT GTTCACTGGA GTAGCACTTT TAATTTGCTT CAAGATATAT 2520

TTCTTTGCAT TCACAACTTG GCTGACTGGT GCAAGAGGCC TAGCTTTCAG ACTATCTTGG 2580

CTTTTGACAT GCCTTCCTCA CTAAGCTTAA TCATTTCTAG CTTTTGATTT AAAATGAGAG 2640

ATGTAGGCCA GGCACAGTGG CAGGCACAGT GGCATATGCC TGTAATTCCA ACACATTAAG 2700

AGGCCAAGGT GGGAGGATTG CTTGAACCCA GGAGGTGGAG GTTGTAGAGA TCACACCACT 2760

GCATTCCGTC CTGGATGACA GAGCAAGACC CTTTCTCAAA ATAAAATGAG AGGTGTGCTT 2820

CTTCTTTTTG TTTGAGCCCA TAGAAGCCAT AGTATGATTT TTAATTGGCC TAATTTCAAT 2880

ACTGTTGTGT CTCAGAGAAT AGGGAGGTCT GAAGAGAGGG AGAGAGGTGG GGGAATGGCT 2940

GGTCAGTGGA GCAGTCAGAA CACACATAAC ACTAATAAAT TGTTTGCTGT CTTATATGGA 3000

TGTGGTTTGT GATGCCCCCA AACAATTACA ATAGTTACAG CAAATATCAC TGATCACAGA 3060

TCACCATAAC AGATATAAGA ATCATGGAAA AGTTTGAAAT ATTTTGAGAA TTAGCAAAGT 3120

GTGACACAGA GAAACAAAGT GAGCACATGC TGTTGGAAAA AATTGGTGTT GATAGACTTG 3180

CTCCATGTAA GTTTGCCATA CGCCTTCAAT TTATAAAAAA CACAATATCT AGGAAGTTCA 3240

ATAAAGTGAA GTGCAATAAG ATGAAGTATG CCTGTAAATA TTTCAGGCTT TCCAGACCAT 3300

AGGGTTTCTG TTGCAACTGC TCACCTCTGC CATTATAGCA TGAAAGCAGC TATAGAAAAT 3360

A ACATAAAT GAGGCCTGTA ATCCCAACAC TTTGGGAGCC CAAGGTGGAT GGATCACTTG 3420

AGGTCAGGAA TTCGAGACCA GCTTGGCCAA CATGGCAAAA CCCCGTCTCT ACTAAAAATA 3480

CAAAAATGAG CCAGGACTAC GCATGCCTGT AGTCCCAGCT ACTTGGGAGG CTGAGGCAGG 3540

AGAATCTCTT GAACCCGGGA AGGGGAGGTT ACAGTGAGCC AAGATTGTGC CACTGCACTC 3600

CAGCCTGGGC AACAGAGTGA GACTGTCTCA CAAAAAAAAA AAAAGGAAAA GAAAATACAC 3660

ATAAATGAAT GTATGTGGCT GTGTACCAGT ATATCCTCAT GCTCTAGCTT GCCAACCCTT 3720

GCTTTACACT GTCAGTTACC TTCTAAAGAG ATTAAAAATC ATAACAATAT CTATTACGTT 3780

TATTCACATC CTAGTGTCAT TTCTTCCTTA TGTAGAATCA AATTTCATTC TGGTATCATA 3840

TTTCTTCTTT CTAAATAATT TCCTTTAATA TTTTTTATAG CACAGGTCTA ATAGCAATGC 3900

ATTATGCAAT TCATTGCTAT TAGACCTGTG CTATAAAATA GCAATGAATT ATGTCAGTTT 3960

TTATTTGTCT GAAAAAGTTT TTTGTTTTTG AAATATACTT TTGCTGGGTA TATAAATCCA 4020

TGTTGCATAA CTTCTCTTTT CTTCAGCACT TTAATGAAGT CACTCAGTTA TCTTCTGGCT 4080

TGTATAGTTT CTCTGGCTGC CTTCAAGATT TTTTCATTGT CTTTAATTTT TAGCAGTTTG 4140

ATGTGTCTAG GAGTGATTTT CTTTGTATTT ATCCTTTTGG GGGCCTCTTA ATTTCTTTGA 4200

TCCTTTTTTT CTTTTTTTTT TTTTTTAAAC CATTTTGGGT CTTTCCCCCC ATTTGGGGTG 4260

AAAAAAAAAA AAAAATAAAA TCATAGTTTA AAAAACTAAT TTTGGAAAAT TTTCAGCTAT 4320

CATTTCTTCA AATATTTATC CTACTCTATG CTCCCCTCCT CCCCTTTCCT TCTGTGACTC 4380

AAATTACAGG TATATTTAAC CATTTTATTT GTTCACGGCA CTTGGATGCT CTGCTTTCTT 4440

ATTTTTTGTC TTTCATTTTG GATAATTTCT ACTGACCTAT CTTCAAGTTC ACTGATTCTT 4500

TTCTCAGTCA TGTCTAGTGT GCTCAACGCC TGTTGAAGAA ATCCTTTGTC TTTAATATCA 4560

TGTTTTTTAT TTCTAGCATT TTCATGTAAC TCTTTGTTCT GGTTTCCATC TCTCTACTCA 4620

CTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTAGA CAGAGTCTCG CTCTGTCACC 4680

CAGGCTGGAG TGTAGTGGCG CGATCTCGGC TCACTGCAAC TTCCGTCCCC TGGGTTCAAG 4740

TGATTCTCCT GCCTCATCCT CCCGAATAGT TGGAATTACA GGTGCCCACC ACCGTGGCTG 4800

GCTAATTTTT GTATTTTTTT AGTGGAAACA GGGTTTCACC ATGTTGGCCA GGCTGGTCTT 4860

GAATTCCTGA CCTCAGGTGA TCCACCTGCC TCAGCCTCCC CAATTGCTGA AATTACTGGC 4920

ATGAGGCACT GCACCCAGCT CTGCTGACAT TTTTTATCTT TTGCTGCATT TTGTCTACCT 4980

TTTCCATGAA ATCCTTTAAC ATAGTAGTCA TAATTACTTT CAATTCCTTG TCTGACAGTT 5040

CTGACATTCA AGTCTAGGTC TGTTAATACT TTGTGAATCT GTTAACAGCT TTTTTTCATT 5100

CTTGTCTGTG TGTTTTGTAT TTCTTGATTG TATGCCAAAT ATTGCCTGTA AAATAAACTT 5160

AGATAAGTCA TACTTCTATC CAGAAATAGC ACATTTTTTG TGTCCAGTCA TTATGTGGAG 5220

GAGTTGGGGC AGTCTATCAG TGGCTGAACT AGTTTGGATT TGTTGATGCT ATACTTAGAA 5280

TGCACCAGAC TTCCATTCAC TGCAAGAGTG GGCTGCTGCG CTTTGTGATT CATGTGAGGC 5340

CTGAATTGTG GAAGGGTTTT TCCTTAGTGT GTCCCTCCAT GCTCAGATTT CAGCAAGTCT 5400

TCATATCTGT GCCACAGAAG GAATCTGACC CATGCTCTTT TTGACCTCCC CAAGTGATCA 5460

ACTGTTGCTT GTTATAGCTT GTCATGGAGT AAGAGGGTGT TTTTTTAGTT TTCATCCTCC 5520

AGCCTTGGTC TTGGGCCCTG AGCTCCTAGA CTCCAGGAGT GGATGGAATC CAGTGATTTC 5580

TCAGTAATTC AGCCCCTTCT CCAGTAGTGG CAGATCTCTG CTTTGTATCA GTGCAAGATC 5640

CTGGGCTGAG CTCATTTTCT GCCCTTCCTC GAGTGGCAGA CAGCTCTTGC TTTCACCCTT 5700

CTACCAAAGG CAGTGCATCT TTTCTTGGGC CTCTCCCCAT TGAACTTATG ACTTTCACAT 5760

AAGAGAAGGG CTCATGTATC AGAGAATTCT GTGACTTTGT GCCACATACA GAGTCTCTCA 5820

GTTCTCTTGC CCTGCCCCAG TCTTTTTTGT GAGCACCTAG TAGAGACCCT TGGAGAAGAG 5880

CAAGGAAGCG AGTATGGACT TCTTTTGTGT CTGTCGATTG CTTTGTTTCT CAACTGCTAC 5940

TCTTGGACTT TAAGAATTCA TTAAAATTTC AGCTGTTTTC TTTTTTTCTT TCGTTTTTCT 6000

TTTTTTTTTT TTTTTTTTTT AGATGGAGTC TTGCTCTGTT GCCCAGGCTG GAGTGCAGTG 6060

GTGTGATCTT GGCTTGCTGC AACCTCCGCC TCCCGGGTTC AAGCGATTCT CCTGCCTCAG 6120

CCTCCCAAGT AGTTGGGATT ACAGGTGCCC ACCACCACAC CTGGCTAATT TTTGTATTTT 6180

TAGTAGACAC AGGGTTTCAC CATTTTGGTC AGGCTTGTCT CAAACTCCTG ACCTCATGAT 6240

CTGCCCGCCT CAGCCTCCCA AAGTGCTGGG ATTACAGGCA TGAGCCACCG CGCCAGGCCT 6300

CAGCTGTTCT CTTTTTACCT GCTGGGATGG CTAGTTTTCT GTGTCAACTT GACTGGGCCA 6360

TGGGATGTCC AGATATGTAA TTAAACAGTA TTTCTGGGTG TTTCTGTGAG GGTGTCTTCA 6420

GAAGAGATTT GCATTTGAAT TGGTGAACTA AGTAAAGCAG AGGGCCCTGT CTAGTAGGGG 6480

TAGGCATCAT CCAGTCTGTT GAGGACTTGA ATAGAACAAA AGGCAGGGGA AGGTTGGAAT 6540

TGCCCCCTCT CTGCTTGAGC TGAGACATCT ATCCTGCCCT TGGCACTCCT GGTTCTCAGG 6600

GGTTCAGACC TGGATTCCTG GTCTCCACCT TGCCCATGGC AGACTGTGGG ACTTCTCAGC 6660

CTCCTATCTA ATTAATAAAT TTTTTTTTAC ACACACACAC ACACACACAC ACACACACAC 6720

ACACACACAC ACACACCCTA TGTATCCTTC TGTTTTTCTG CAGAACCATA TTTAATACAC 6780

CTGCTTTTAT GACGATTACC TATCGATTCT GTATTCTGCC AAAACTGAAA ACAGTTCATT 6840

TTTCCATCTC TTCTCAGAGA GGCTTGTCAG CCATTAGTTC TCTGATGGGC TCAAGAAGTT 6900

ATGCAGTTTT TTTTTTCTCA CTGTTAGGAT GGAATTGATA TTCTGTTGAA ACTTTCTATA 6960

CCTAAGTGGA AACTTGTTTT GAGGTTATTT TCTCTACTTA CTTTTGCTGG AAATGGAACA 7020

CTCTGTATCT AGTTAAGACA CATAAACTGA CTTGTGATAC CATAATGTTG TGTTGAATTT 7080

TATATTCTTA GAAAATCATC TGTCAAGGTG TTAACTAATG GCAAAGCATT TAATAAATCA 7140

GCATTCATGT ATTCAGGTGC TCTGAATTAT CTGACTTTTA AATTCTTACT TTATAAATGA 7200

GAAAATTGGG GCATGGAAAA GTTAACTCTC CTAACCCCGA ATTATTACAT TATTAAGGAC 7260

AGGACTTAGA GGCCAGATAT CTTAAGTCAT TAATATTCTT TGGCTCACAG AATTGGCAGT 7320

ATAACCTAAA GGTAATAACT AGGTGATTTT CTTTTATATC AATTAAATAT GTCAGTTTTC 7380

AAATATTCAT AAGTACCTAC TGTGCAGGGA AAGAACATGC CATACAAAAG ATGTAGTCCA 7440

GGCCTTTAAG AAACTTTCAT TTAATGGGAA CTCAAGAAGT GTACATATAA GGAGGGAAGT 7500

AGCAGTATGG TACAAGATAA TACATACATA TCAGTGAATG ATATTGCCAA AAAGTGCTAT 7560

TGATAGAGCA ATAATTCATT TCTGCAAACA GCTGCTGATC TCCTACTGAA AACAGAGGAG 7620

GGAGAACAGG ACGCCTCGTG GTCAGGATAG AAGAGAAAGA CCTTGAGTTG AGCCTTGAAC 7680

AGTATTTAAT ATTCAAAAGG TTAAGAGAGG AGAGCAATTG AGGAGGGGAG AATAGTTCCA 7740

GCACAAATGA TGGTGTACAA GATGAACACA GTCAGTAAAG AGCAGACTGG TCTGGATGGA 7800

GAGGAGGATT TGCATCATTT GGGATTACGT CATTTAGACC CTTGAAAGCC AGGATTGAGT 7860

AAAGCCACAG TGAAGCGACT GGCTCGTATG GAAGCTTTAT TTTAAGAAGA TTAATCTGGT 7920

AGTGACATGT GCCAAAAACT GAATAGGTAG AAATGAGATG CAGAGAGCCC AGTTAGAACT 7980

AAGTCTGGTG CAGTAATGCA GGATTGAGGC AATAAACACC AAACTACAGT ATCACCAGAT 8040

AATGGATGTT TGAACGGACG GTTTAAAGGA AAATTGATGG TATTTGGTAA TTTATTAGAT 8100

AATCCAGGGC CATGGAATGA GAGGGGAAAA TGACTAACCA TAGTCATCAA ATGGTTTTTC 8160

TTAATGAATC TGAATTTTGG TGTAAGAGCA ACATTTTCTT AGGCCTTGCC TAGTTGGTAC 8220

AGCTGACTAT GATAATGACT GCTACCATGC TTGTTCCTCT TTTAGCAGCT GTGAGTCCCC 8280

CACCAGCCAA ACAATGAGCC TCTTGAAAAG GACGATGCCT TTTCACTTCT CTCCAAGTGC 8340

TTGGCAAATA GGAGGCCTTT TGAAGTTACT TTATAGTTAG GGGTTCCCAG TGAGTATTTG 8400

AAATATTAAG TCATGCCCGT GGTTGACAGC ATGGCCCTAC TGCTCATCAT CAGCTATTAA 8460

CCTTAGGCAA GTTAATGAAC TTTTCTAAGC CCCAGTCTAC TCATTTATAA AGTGGGATTA 8520

TTAATAATGT CTACTTCATA AAATTATGAA GCCTGAGTTA GGTCATTCAG ATAGTGTTTA 8580

GTCTGATTCT TCGAACCTAG TAAACAGTCA GTAAACAGAA GCAAATGCCA CATGCCTGAT 8640

TTATATCCAA GGGGAGAAAG GTAAAAGTGA AATTTTCATG ATTTATGGAT TCAAATT TA 8700

CATTTCAAAG ATGCTTTATA AGCTATTGTT TTGGTAAGAA GAATTGAGCT GAAACAGAAT 8760

TTTCTGACAG CAGTGATTAT TAAATGGTGA AATAGGCTAT TGATGTCTTT AGAGGATATA 8820

GATGTTCACC TTTTGCATAT AAGTGCACAA AAATTCACTA AGTAGATATG TCTGTCTACA 8880

CAGAGAGAGA GAGCGTGAGA GCATTAAAGT TAGTAAACAT CCCCCTCGCT TTTTTTTTTT 8940

TGAGACAGGG TCTTACTCTG TTGCCTAGGC TGGAGTGCAG TGGTGCAATC GTGGCTCACT 9000

GCAGTCTCAA CATCCTGGGC TCAAGCGATC CTCTCGCTCA GCCTCCTGAG TAGCTGAGGT 9060

GTGCACCACC ACACCCGGCT AATTTTTAAA TTTTTTTATT GTAAAGGTGA GGTTTCACCA 9120

TGTTGCCCAG GTCTCAAACT CCTGAGCTCA AGCAATCTGC TCACTTCAGC CTCCAAAAAT 9180

GCTGGGATTA CAGGCGTGAG CCACCACGCC TGGCCAGTAA ACCCCATTCA TTTACATCAT 9240

CTTACTTGTC CCTCCAAAAT CCTGCAAAGT AGGTAGGTTC TGTCTTTATT TGTTATTTAG 9300

GTGAAGAACT TGAAGTGGTG TTGAGGAATA GGTGTTTTGC CAAGAGTCAC GCAGCTGGAG 9360

TGGCAGAGCT GTATACTCTT CTGATTCCAC CAACGCTGTT TACATCACAT CTGGAGAAAA 9420

GTGCTCTGAG GCACAGATGT TTAGTGGGAG GGATGAGACA CAGGCTGCAA TGCCTAAAGA 9480

TAATCGGGAA TAAAAGCAGA AAACAAGACG TTTGTTTCTG TTAAAATGAG ACAGAAAATA 9540

AGGCGTTTGT TGTTTGGGAT TGAGCACTTG GAGAAGTGGG GAGCGATTTG ATTTGGGTGA 9600

GACTGCTCCT GGAATGCTGC ATCTGGTTCT GGACTACTCA TTACTAGGCT TATAGAAACT 9660

AGCTGGAGGA GGTTCAAAGA AAAGCTCCAA AATGATTAGC GGGCTGACGG GATTGATTTA 9720

TAAGAAATAT TAAAAGAATT AAATGTGTAT AGCTCAGCTA AGCAAAGATG AAAGAGACCA 9780

GCTAAATGTA TACAAATATC TGAAACGTGC AAACTTTAAA AAGAGAGATT AATTATTTAA 9840

CATGATACAC GGGGGCACAA TATGCAGTCA CAGGATGAAA ATTTCAGCTG AGTATCTAGA 9900

AGAATTCCCC GATAGTGAAT CTGTTAAGGC TGTCTGTAGT GTGGCCTTTC CCTGGAGAGG 9960

CAATAGAAAT TTCAAGTCTT ACGATTTTAA AAGTTTCTTG GGAACTAGGT ATTAGATGAT 10020

GTTAGAGAAT TATTATTAAT TTGGTCAGGT ATGATAATGG TATTGTAGTT CTATAAGAAA 10080

AATTGTATTT TTTAGAGTTA CATACCCTGA AATATAAGCA TAGAATATGA TGTAGGAGAT 10140

TTGCTTTAAA ATACCACAGT AAGGAAAGAA AGGAAGGAGG AAGAAAAGAA AGGAAGGGGA 10200

AGAAAGGGAA AAAGAGGCAA AGAAGGAAGA GAAGGTAAGA GAAAGAAAAA GAATGAAGGA 10260

AGAAGGCTGG GCACTGTGGC TCATGCCTAT AATCCCAGCA TTTAGGAGGC CAAGTTGGGA 10320

GGATCACTTA ATTAAGCCCA GGAGTTCAAG GCTGCAGTGA GCTGTGATTG CGCCACTGCA 10380

CTCCAGCCTG GGTGGCAGAG TGAAGCCCTG TCTCTAAAAA AAAAAAATAA GTTAAAAAGA 10440

AAGAAAAGGA TAGATGAAGT ATGGCAAGAT GTTGGTAATG TTGAACCTGA AGGAAGTTAA 10500

TATGTGAGTT CACTTTCCTC TTCAGTCTTC TTTATGTATG TTTGCCAACT TTCATAATAA 10560

ACAATTTAAA TTATATTTTC CTGATCAAAA CTTAGTAGCA GTATTAATCC CTGGGCTTCC 10620

TGACTAGAAC AGCCTCATTA CCACATGGGC AGAGTTCTGG CCGACCAGGG ACCACGTAGT 10680

GGTTCACCAT CTTGCTCTGG TAATGTGGTC TGGGCTGAAG GGCCCTTTCT AAGGTTGTAG 10740

ATAGAAATCC AGGAAACTTG TTAGAACTGC AGACCTATCA GGGTACCTGC AGGAGGTGAG 10800

TCTACTAAGG TGAAAAAGCA GAGGGCAGAG GTCGTGATTA GCAGCTGACC GCCCCCTGCT 10860

TTTCTGTCCC TCATTCGTGG AAAATTGAGT GGAGCTCAAT TTTGAGTGGA GCTCTAAGTA 10920

GCTCCACTTG TAGACATTGA GTGGAGCTCT AAGTGTCTTC AGAATAGCAA AACACTAGTT 10980

TTCTTTTTCT TTTCTTTTTT TTTTTTTGGG AGACAGAGTC TTGGTCTGTC CCCCAGGCTG 11040

GAGTGCAATG GCACGATCTC CGCTCACTGA ACTCTGCCTC CCGGGTTCAA GCGACTCTCC 11100

TGCCTCAGCC TCCCGAGTAG CTGGGATTAC AGGTGCCCAC CACCACGCCC AGCTAATTTT 11160

CCTATTTTTA GTAGAGATGA GGTTTCACCG TGTTGGCCAG GCTGGTCTCA AACTCCTGGC 11220

CTCAAGTGAT CCGCCTGCCT TGGCCTCCCA AAGTCCTGGG ATTACAGGTG TGAGCCACCA 11280

CACCCAGCTG CAAAACCCTA TTTTTCTTGA ATGGAGAAAC ACTTTCCCCT TATTTATTGA 11340

GTTTGGGAAG CAAGAAGAGG GGTAATTCAT TAAGTGAAAA TTTCCAAAAT CCAGAAAACA 11400

TCGATAAAGC AGCAGCTTAA TTTTTTTAAG GAAGAATTTT TTAAACTATC TTCTTTTGAG 11460

CCTCTTTAGG AAGACCTCAC GTCCTTGCCT TGAATGTTGA GAGTGGGAAA TCCAGGGAGG 11520

TTTTGGAATG CATGCCTTAT GTCTGCTTTT TTGTTTGTTA GAGAAATATA AATATTTTAT 11580

CTAGGTTTTG CTGATGGCAG TCAAGCATGA ACACAACCCA CTGTTTGAGA AGCTGTAATT 11640

TCTGAATTTC TGCAGAGTGC ACATCTAGGC CAGCAAATGG CAGTAAGAGT GAGGTGGATT 11700

TAGCTCAGTG TAAGGATGAA CTCCAGAACC ATCGGCTCTG ACTGAAAGTG AAGCGGCAGC 11760

CGCGTTGTGG GAAAGCTGGC TGGAGTCTCT CTCATAAGCA GGCATTCTTT TTCTCCAGCC 11820

CGTCACTGTG TTGGTTTGGG CCCACGGTAA GCCTCCTGGC CTCTAGGCTG TAACCCCCAC 11880

CATCCTCCTC TGCCTCGCCT CCAGAGTGAT TGTTCTGAAG CACAACTGGA TGTCATTCCC 11940

CTTCCTGAAC TCCTAGCACC TACAGGGACT CCATCCCTTG TGCCCCACAT ACCTCACACG 12000

TAGACATTCC TAATGAAGAT TTGATTGAAT TATTGTAAAC TCAGTGCCTC CCACTCTTCT 12060

AGTTGCCTCT CTGCCTGCCT TTGTACATTT ATTTATTTAT TTATTTATTT ATTTATTTAT 12120

GAGACAGAGT CTTACTGTAT CACCCAGGCT GGAGTTTAGT GGCACCATCT CAGCTCACTG 12180

CAACCTCTAC CTCCCAGACT CAAGCAATCC TCCCACCTCA GCCTCCCGAG GAGCTGGGAC 12240

CATAGGCACG TGCCACTATG CCCGGTTAAT TTATTGTAAT TTTTGTAGAG ATGGGGTTTC 12300

ATCGTGTTGC CCAGGCTAGT CTTGAACTCC TGGACTCAGG CGATTCGCCC GTCTCAGTCT 12360

CCCAAAGTGC TGGGATTATA GGCGTGAGCC ACCATGCCCA GCCGCTAGCA CTCATCTTAA 12420

TCGTATATTT ACTTATCTGG CTTTCCCACC AGACTGCGGG CTCTTCAAGA GTAAATGCCA 12480

TGTTTTCACC TTTATTTCCC CAGTTTGTGG CACATTCTAG GCACTCGCCA TCATGAAATA 12540

AACCTCTGGA GCTGTGATAT TACAAACGTG AAAAGATGAC GAGCACTCAG CAACTTTCAG 12600

TGAGTAAACA AAGGCTTTCA TTCAGCATGT ATTTATTGAC TGCCCTGATC TGGGCTGCTT 12660

CCTGTCTGTG GTTCAAGGAG AGCATAGTCT ACAGAACCAG AGACCTGGCT ACTCTGGAAG 12720

TTAGACTTAA GCCCACCCCG GTCCTTGAAT GGGGAAATAT TTCCCTTCAT TCCTGTGTTT 12780

TAGGGACAGA AAGATGAGTA ATGCAGTGAT ACATGCTGGA AATGTTTATT CCACTACCCG 12840

AAGCTGCCTC TCAACTTAAC AATCCATGAA AGAAACAAGA TGGTATATAA CTTTTTCTAA 12900

TTTGTGATGC CTTTGTTTAT TTGTTTCCGG TTAAAAGAGG AGGTGGCATT GAATTGTTTG 12960

TTTGGTTTGG TTTCTTCTTC AATAAGAAGC ATCTTAATAT AACTAGACTG GACATCTGTC 13020

CCATTTTCAA AAATTACAAG TTTCGATCAT TGCTAAATTG TACAGATCCC AATCTGTCTG 13080

CTCTGCATAC ATTTGCATTT ATAAAAGCAG AAGCAGACTA GCAGTCTTTC TAATGCAATC 13140

CCCCAAATGC ATGAAGTATT AGATTGCTTC TCCCTATTGG TTCATGCATT GCTAAAGGCT 13200

TAAAAGGATC ATTGATTTTA ATTATTTAAT GTGTACAGCA GGCTGAGCTT CCTTTCTTTT 13260

TTAAGGGAAG AACCTTCAGG GGCATTGCTT TAGTTTTTTA ATGTTAAATC TCATTTTTCT 13320

TTGAAAATAA GAAGTTAAAG CTGTATTCAC ACAAGCTCTC AAAGTGCCAG ATTTTCATTG 13380

TGTTTTTAAA CCATCTAGGA AATGTTTGAT TCTAATGAAA CATTACTGCT GAAAATTGGG 13440

CTGAAATTGC TGGGCTGAAA ATATTGTTAT AACTTCACAT GATTCCAGTG TTGTATTATT 13500

ATTTTTTCTT TTCCTTTTTT TGACCCGATA TAGATGAAGC GAAGAGACAA GGGAGCAATC 13560

CCATGTGTAA TAAAAAAAGG CAGCCTGAAT TGTTGTTGCT GTTTTTGAAA TTTAAGCTGG 13620

TTTTCAATTA AATTCAGTAA ATGGTCCAGG ACTATAAATG TTGAACATTT TTTACCGTGT 13680

GATTTAAAAT TTAGTTTTAA TGTTTTTTTT TTGGGTTTTT TTTTTTTTGA TGGTTTACAT 13740

TTTCCCCATG GAAAGCAGCT ATGTCATGTC GGCATGATTC ATCATGGTAA CATCTCGGGT 13800

TATTTTGGTT TGTGTTATGT TCAGAAAGCG GAATGCCAAA AATAAAGAGT GGTTTGTGAT 13860

GTCTAGTGTG TCTTCCTTTA ACAAATCAAA GGCTTTTATT TAATCCACTT AATGGGACAC 13920

TGCAGAAATT TAAAAAATGG AAGTCCCATC CACAGAAGGC AGGTACTATG ATGTAAAAAG 13980

TTTAGGTGGG GGATTAATAG AGTGATCATA TAATTTATGA GCTAAACCGG AGGCACTTTT 14040

TTTTTTGAGA TCGAGTCTCA CTGTTGCCTA GGCTGGAGTG CAGTGACGTG ATCACAGCTC 14100

ACTGCAACCT CCGCCTCCCG GGTTCAAGCG ATTCTCATGC CTCAGCCTCC TGAGTAGCTG 14160

GGACTATAGG CGCCCACCAC CATGCCCAGC TAATTTTTGT GTTTTTTGTA GAGATGGGGT 14220

TTCACCATGT TGGCCAGGCT TGTCTCAAAC TCCTGACCTC AGGTGATCCG CCCACCTCGA 14280

CCTCCTAAAC TGCTGGGATT ACAGGCGTAA GCCACCATGC CTGGCCCAGA GACACTTTTG 14340

AGAGTGAAGA GGAAGCTGAG AATAATTCAC TGATCTACAA CTGGGACCAT CCAGGGCAAG 14400

CCAGATGCCA TTACCACTAG CTAGAAAGCT TGCCAAGGTC TCATTTACCT TGGTATATAG 14460

CAAATTCTTC TTTGAATTCT GGAAATTCTG GTAAGTCATT GAGGTAGCTC TGTGCCAAGG 14520

AGCAATATGG TAGAATTCTA ATATTTCAGG CAGTACAACA CTTTCCTGCA TTTGTAGCAG 14580

GTAAAGGGAG GTCAGGGCAG AAGACAAAAC CACTGGGACT CGACAAAGGG CATAAACGTC 14640

TAATGCACCT GATGTAGCTG ATGGTAAATT GTTATCAGCT AAAGATCTTT CATAATAAAT 14700

AAACTTATCA TTTGTAGGAG GGCACAGAAA TCGTGGAAAG CTGGGATTCA GGTTGCCTGT 14760

GGCTTTAATT CTGGAATCAG AAATATTAGT CAAGGATATC AGTCTATGAA GTAAGTTTTC 14820

AATGTTATAT GCCACAAGAT GCAGCTGTCC TATTTTCACT TCCAGTAATT CCTTCTGAAT 14880

TAATACACCT TAAAAATAGC TGCAGCTTCT CAAATCTGTG AGAATCGTAT GTGCTGCTTG 14940

CTACACTTTC CTTTTTCCTG AAGGCCTCTT TGAGGTCTTT CAAGAACTCA ATTCAATTCA 15000

GCAACAATTA GGGGGTCTAA GGTATACAGA CGCTGTGCAA GATGCTCCTG AGACACAAAG 15060

AGGAGGTCAA GCCCCTGCCT TCAGGCACCT CTCTATAATA TAGGAGGAGA AAGAGAAGAA 15120

ACACTAATAC ACATAGGTAG GTGCCATTAA AAGGGTGCAT ACATTAAAGC CAGGTGGTAG 15180

GTGCAAGAAG ATTTGTAACG TGAGAATTTT CTGCATGTTT GAAATATCTT ATAATTTTTA 15240

AAAATTAAAA TGGGAGATAC ATATA ATGT ATTTATGTAT GTATATATGT ATGTACATAT 15300

ACACACATAT A ACATAAAT ATATACATAA ATATGTATAT ATGTGTATAT AGACATAAAT 15360

ATGTATATAT GTGTATATAT ACATAAATAT GTATATATGT GTATATAGAC ATAAATATGT 15420

ATATATGTGT A ATAGACAT AAATATGTAT ATATGTGTAT ATAGACATAA ATATGTATAT 15480

GTGTATATAG ACATAAATAT GTATATATGT GTATATAGAC ATAAATATGT ATATATGTGT 15540

ATATAGACAT AAATATGTAT ATATGTGTAT ATAGACATAA ATATGTATAT ATGTGTATAT 15600

AGACATAAAT ATGTATATAT GTGTATATAG ACATAAATAT GTATATATGT GTATATAGAC 15660

ATAAATATGT ATATATGTGT ATATAGACAT AAATATGTAT ATATGTGTAT ATAGACATAA 15720

ATATGTATAT GTGTGTATAT AGACATAAAT ATGTATATAT GTGTGTATAT AGACATAAAT 15780

ATGTATATAT GTGTGTATAT AATAATGTGT GTACATATAC ACACATATAT ACATACATAA 15840

ACATTCTGCA TTATACCATT CACTTTGTAA CCCATCTTCC CTAAAAACTG TCTCATAAAG 15900

AGTCTTCTTT TCCCTGTACC TATGCAATGG TAAGTAGCAA AACACACATT CTTTTGGGTC 15960

CCCATAACAT TCCCTGTAGT TTGCCCTTAA CAGTCTTTGA TGTGAAATTT ACTGTTTCTG 16020

TCTTAACCTT GCCTGTCTCG CGTACATGGA GTTTTGGCTC CTGGCTCCTA GTCTGCATCT 16080

TCACCCCATC CCTTGCCCAA AGAATCTGGT TATGTGACCA CTGCTCATCT TTTCTGCTGT 16140

CACAACTCCA GTCCAAGCCA CAAACCTCTC TCTCCTGGAC TCCTGCGGGG AGTTCCTTTC 16200

TCTCCCTGCA TGAGTCTATT CTCCGCACAA CTGGCAGAGG TAAGTGAGAC TGCGGAAGAG 16260

GCAAGTTTGC AAGTCCAGAG GAAATGAAGA CTCTGCTTGT GCACATGCTG GGTTTGACGG 16320

GTGCTGGATA TCCGATGGAT GGCCCTTAAG GTGAGCTCAA GGCTTAAGGG AGAGATAGGG 16380

GCTGATGATC TGAGATTCAT CAGTGTGTGG CTGATGTTTA AACCCAGGGG ACAGGATAAG 16440

AAGGTTATTC CAGGGAGAGC GTAGATAAAG AAGCTAAATG GCTTCTGGGT CCTTAGTCAT 16500

TCAAAATCGG ACCTCTGAGG CAGGAGGAAA GCCCAGAAAG AGTAGATTCC TGGGACTCAC 16560

GGGATAAAGA CTTTCAAAAA GTGGGGGCTG GCCAGTGCTG CTGAAGGAAG TAGCAGGACC 16620

GGAACAGAAG GG AATCGTT GGACCTGGAG AACTTGAATT TGAATTTTAA GGTTGG AAC 16680

CTTAAAAAAG AGCAATTTTA GATACCTTTT GAAATTATTT GCAAGATTTG TTTGGTATAT 16740

GTGTTATTCC AGGCAAAGGG ACCAGAAAAG TAAAAAATAC TTACTGAACA GTTACTGCAT 16800

GCCTGGCACT GTAACACCCT GTTTAATTCT CACGGCAACC CTATAGAGTA GGTGTCATCA 16860

TCCCCATCTT ACAGATGAGG ATATGAGGTG CAGCTAGATT AAGCAGTTTG CCTCAGGTTA 16920

CACCAACTGG TTAACGTAGA GCTAGGATTT GAACCCGGAT GGGCTGATCC CAGAGCTCAT 16980

GCTTTAAATC GCTAGACTGG TGCTCACAGA AGACTGGGAC CGAAAAAAAT TAATAAAAAA 17040

AATAAGGAGC CCCCTGGGCT AGCAAATTAG GAGTTGTTCA GACAGATGTG AAAAGGAAAG 17100

CAAGGCAGAG GGAAAGTCAC TGTACAGAAG AGAGAGACCC ATGACAGCAG AGACAGTGAG 17160

CTGGTAAAGT GGCTGGCGAT CTAGCCCCTG AAAATACCTC CAGAGAGGCA GGCTCACGCC 17220

TGTAATCCCA GCACTTTGGG AGGCCGAGGT GGGCAGATCA CCTGAGGTCA GGAGTTTGAG 17280

ACCAGCCTGG CCAATGGCGA AATCCCGTCT CTACTAAAAA TACAAAAATT AGCCGAGCAT 17340

GGTGACAGGC ACCTGTAATC CCAGCTGTTC AGTTGGCTGA GTCAGGAGAA TAGCCTGGAT 17400

CCGGGAAGTG GAGGTTGTAG TAAGCCAAGA TTGCGCCACT GCATGCCAGC CTGGGCGACA 17460

GAGCAAGACT TTTCTTAAAA CAAACAAACA AAAAAGAAAA AAGAAAAGGA AAGAAGAAAG 17520

AGACAAAGAA AGAAAGAGAG AAGGAAAGAA AGGAAGGAAG GAAGAGAAGG AAGGAAGGAA 17580

AGAAAGAAAA GGAAAGAAAG AAAAAGAAAG AAGAAAGAAA GGAAAGAAAA GAAAGAAAAA 17640

GAAAGAAAGA AAATACCTCC AGAGAGCCAG GTCTCTTAGG CCTTCTGAGA AACTCACATC 17700

CCTTTTGATG AACACAAATG CTTCACACTC TCAATGTTAT TGGTAATCCA AGTTATCAAT 17760

ATACCTAAAT CACTTAGTAC TGAATCTGGC ATATAGTAAT CACCTAATGA AGAGATAAGA 17820

GTCATGGAGT ATTCTGAAGC AATTAGAATC AATAGACTCA ATATACACAT GGCAACAAAG 17880

TTGGATCTTA AAAACCGACC TGAGTGAAAA AGGAAAGGGA AAGATACATA ACACGGTACC 17940

ATTATGTAAA TTGATAATAT ATGCTTACAC AATTTGTAAG AACACA ACA AATAGATACA 18000

TGTATATTAA ACATACTCGA ACGGTTACCC TATGGGGTGG TGGCTGGAGT GGGGGTAAGT 18060

CCGTAAGCTG TAATGGAACC TAAACAAATA CATGAAACGA GTAGGAATCA GAAGGAGTAA 18120

CAATAAAAAT GTGCCATGAA CTGAGGAGTG TAAATTAATC AACTCACTGC ATCTGAGGTT 18180

AAAAATAGAA AGATGATAAT TGTTATTCTT AT ACTCCTA GGTCTTCCAC TTGCACTCAG 18240

CTTTACAATG TTGGACTATC CTTCAGATGG CACCCTCCTT GCACTTGCTC AGGCAGGAGA 18300

GCTTTTTCCT CCAGCTTTCT AGGTGATTTA ATATATCAGG GAATAAGTAT AAAAAAAGGC 18360

ACGGTGCTCC CTGGGTAGCC TTTCTGGACT TCAGAGCTAA ATTGCAAAGT CAGTTTTACA 18420

CATGTGATTT CATCTATGAA ATTAGGGCAA GGTATAAAAC TGGCACAGAA AAAATGTGAT 18480

TTATTATGGT GTTACTATCC CTTACAAGCG GAGTGTCAGC TGCCTCTTTT TGTCCACTGA 18540

TTTAAGGCAA GATGAACTGA AAGTGGCTAT GATCACGTCT TCAAAAGCAC ACTCTGGCCC 18600

CTCGGCTGCA GGCGCCCTGC ACATTCCCCA GCTGCGTGTC CGGTGGTGAC ACAGTGCATA 18660

ATTGTGGCGC CTTCCTGGTG CAAACTGTCT CACTTAGCTC CGTCTTGCTG GCACAGCAGA 18720

AAGGAAGAAA TCGAAAATGT TTGGATTTCA AAGGTAACAA GAAGCTGGAA AACAACTACT 18780

GGCCGAGTCT GAGAGTTTCA GCGGAGACTG GTGCAGCCTT GTGTTTTTCC ACTGACAGCT 18840

GAAAATGAGC CCAGCTTCAG TGAAGCTTGT TTCCTTCCCT CCTCAAGGTT ACCCACAATT 18900

CTCAGTTCTC TCAGGAAAGC CAAAAAATGA ATTTGAGGGT TTAGGATTGT GGTTCTTTTA 18960

TCTATTACAG GATTGATAAT ATGTTCCTCC ACCAGATGTT CTGCTTGTAA CAATACTCAC 19020

TTCCTGACAC TACTGCATAT GCAGGAGTGT CACTACCAAG GTAAACACAG AATTGGCTGC 19080

CCAATTCCAA ATCCCTGAAC TGAGTGAGAG AAATCAGAAT TATAATAGGG GATTCAACAG 19140

AGCTGGCTAC GGATGTGCCA GTGGTCAGAT ACTTTGCTCA TCATACGCAG GTGCTGCTGC 19200

TCTAGCAACT GCTCACTGCT TCATTTCCTG CCTTGGTCTT TAAATACTGC TTTTCTCAGC 19260

TCAATTGGCT TTCTTCCCTC TGGCAGTCAC GTTTCTTTGG GTCAAACAGC AAATGATTCT 19320

TTAGAATCAC CTGGTACTCA AAGGAGCTAC AAGACATTGG GCATCCACTT CCACTCTCTT 19380

GGAAAAACAA TTTTATGGAA GCCAAGGTTG CCATAGTGCC TCTTGAGGTT GTTTGCTCAG 19440

CCAAGGCCCA AGCTTTGTGC TTCAAACATG AAATTAGAGA GCTTCAGAAC AAGATCCACA 19500

TTTTCAATGG CCTCACCCAA CTGGATAAAA GAACAATTGC CATATCTCAA TGACCACCTT 19560

TTCTCAGGTG GGATGGTAGA TGCTGGAATG GGTCACAGCA TTGCCCAACC AAACTTTGCA 19620

AAAAAGGCTG GAAGCTCTGA CTGGGGACCC TAAATATGCA AAAGTTAATA GGCTCTTCAT 19680

GCAGAATATG AACCCCGTGT ATGGATATAG CTAAAGGGTT GGCCTTTATG TTTCTATTCC 19740

TTCACAAACC TGGTAGAATA GATATGCTTG TTTCCCTTTA AAAAATGTCA ACAATTGCAT 19800

TTATGATGCT GTGTATAGTA ACTCACAGAT CATGCTCCAT GAAAATGCTT CAGAACCCAA 19860

TATAAGGAGA TTTTTTAGCC ATGTGTGACA AAAGAGAGGC CATTTCAGTG TTGAAATTGT 19920

TCAGAGAAGT ATTTGATTAT GTTTTCTCAG ATCTTTTTAT TTTTATTTTT TTTGAAACAG 19980

AGTCTCACTT TGTCACCCAG GCTGGAGTAC AGTGGCTGTG GTCTCGGCTC ACTGCAACCT 20040

CTGCCTCCCA GGTTCAAGCG ATTCTCCTGT CAGCTTCCCG AATAGCTGGG ATTACAGGCG 20100

CATGCACCAC CATGCCTAAT TTTTGTATTT TTAGTAGAGA CAGAGTTTCG CCATGTTGAC 20160

CAGGCTTGCC TTGAACTCCT GACTTCAGGT GATCCACCCA CCTCAGCCTC CCAAAGCACT 20220

GGGATTACAG GCATGAGCCA CCGTGCCCAG CCTGTTTTCT CAGATCCTGT ATTTTGTTTC 20280

TGAAGCCTTC ATTTCTATCT TCTTATTCAT TTTGGAAGTA GTACACCTAA GTAAGGTTTT 20340

TAACAATCAA ATATCTTTGG AAAATTCCCT GGTTCCTTTC TTATTCCTAC AAAAATATGT 20400

TCAGTATAGC TGATGTTATG TTTCTTTCAA ATTATTCATT TCTCTATCTC AGAATTTATC 20460

TCATGCCTAA TTGTTATTGA ATAGTCTTCA CTTCTTGTCA TCCAGTTTCT GGTCTCTTAT 20520

TTCACTCTAA GTCTAATTGG CTATTAGAAT AAAGAGCTTG TAACAGATTC TTTCTCCAAT 20580

ATGTCTTATC TTTTGACTGC ATGCCAGTGA CAAACTGTTA ACTGTTTTGA TTCTTCATAA 20640

CATTCCACAG AACATGCTGA CTCCTCTCTT CCTGAAAGCA ATGCCCAAGC ACAGCATTGT 20700

TAGATAGTAT GTACGCAACA GGGACATGGG TGCATAGCAA AAACTAGAAG GAAGGAGGAC 20760

CTTCCTTAGC AATGGGTGAT ATGGTCCCTG GACTTAGACT CCAAAGGGTC GTGAGGTGAA 20820

ACACACATCG TCCATACCCA GGAAGCACAC AGGTGGGATG GAAGAGCTGT GCCTAATGAA 20880

ACTTCATCCA CGTGGAGGTG GAGGAGGCTG CAGCTGCAAG AACTCAGAGC TGCCTTACCC 20940

AGACCAGGGA CCAGGGAGGG CTTTCTGGAG GAAACAGCCT CTGAACTGCC AGCTGATAGA 21000

GGAGCTCTAC CTCAACTCTT CTGGTTCCCC AGGGCTGCTT TTCCACGTCC ATTTATTGGC 21060

ACTGAAGTTT GAATACCTTC AGGGGCCCGA AAGCCTGCCA GGTCCTCTTC TCTGCAGAGC 21120

AATCACACCA ACCTGCAAAG GGCTAGGAAA GGGCTGTCAT CATCTCCTAC TCAGAAACTG 21180

GTTCACTGGA AGGACTCAGG GGCCACTGAA TACATCCTGG CAGCTTTCAC AAGAAGGGCT 21240

TCTGACTCAA GGATGTTTCC ATCTTTGCCA GGTCGCCTTT TCTCCTTCTC TTAGAGTTTG 21300

GAGGACGCAA ATGTGCTGAG AAGTCAACCT TTCCTGCAAG GTGAGACACA AGGGCCTTTC 21360

CCAGCAGAAA GAAGAGAGCA AATGGAAGGT CCTTCTTCCT CCAGTAGAGG ATGGACTCTG 21420

TCTGGCAGCC ACCCAACAGG AAAAGCACAA TGCATGCCTG CCTGCTTCCC TCCCTCCCTC 21480

CGTTTCTCCC TCCCTCCCTC CTTCCTCCCT TCCATTCTCT TCCCTTCCCC TCCCTTCCCT 21540

TCCCCTCCCT TCCCTTCCCC TCCCCTCCCC TTCCCTTCTC CCTCTCCTTC CCTTCCTCTT 21600

CCCTTCCTTC CTCTTCCCTT CCTTTCCCCT CCCCTTCCTT TCCCTTCCTC CCTCCCTTCC 21660

TCCCTTCTTT CCTTCCCTTC TTTCCTTCCT CATTTCCTCC CTTCCTTCCT TCCTTCCTTC 21720

CTTTCTTCCT ACTTTCCTAC CTTTAGGGCT CTGTGTCTTT GGAGTCCATT CTGATTATGC 21780

TGTAATGTCT GCCCCTTCCT CTTCTCTGTC AAAAAATGAA AGACATGGAA GCCACTTGCC 21840

TTTTACTGAA TTAAAAATTA GTAAAAGAGC TAAAAATTAA TGGTTAAAAA TGTACGCATA 21900

AATTATGCAG TATACTAACC AATGAAAAGA TACACTTCTC TTAATTAAAA GCTGACAGGG 21960

AGGGAAACAA GAAAAGAGAA ACACAAAACA ATAATCTAAA TGACCTATTA GTTGGAAGAA 22020

CAACATCAGA GAAAATAGAT ACTGTGTATA GTCATGTGTA TGTCTATGGA ATAACATTTG 22080

TAGAGAAATC TGGACTGATC CTTTCTGAGT AAAGAGAGCT GTGGGTACAA TTAAGGGGAG 22140

ATTGAAAGGA ATCCAAAAGC ATAGCAGATG CTGTGCCTCA CTGGAATGGT TGCCGATCTC 22200

CTCCAAACTA TGAAGTGTTT GAGGCTCAAC TTTAATATAA TTAAGATACA AAGACAGAAT 22260

GAGAGAAAGA GAGAAGGGAG CTCACTGGAA GAACACTCAA GATTCCTTAC TACTCATTCT 22320

CTAAAATTAC AATTGTTCTA GATGGAAAAG AAAAAAAGCT TCTCTGTTAA AAAAGGAGCT 22380

TGTGCTATAG GAGGTTTAAA ATATACTTCT GACCCATCTC CAACATTCTA AATCCTTCCC 22440

AGAAAAGTAT GCCAATCCCA AGAAATATTC AATCAAATTG CTGGAAAGAA AAATACAAAA 22500

TATTAAAATG TATTAGGAAG CGACAGTAAT TAAATCAGAA CTGGAGCAGG AATAGACCAG 22560

CAGATCAATG AGACAGACAT CAAGTCCCGG AATGTGGACT TGCAAATGCA TTAAGTAATA 22620

TGATATGCAA TAAAGGTGGC ACAGTGAACC AATGGGAAAA AAATTAATCT TATAATAATT 22680

GATATTGCAA TAATTGTCTA GTAATTGGGG GAAGAAATAA GCTTATTCCT TATCTCATTT 22740

CTTTTTTTCT TTTTGAGACA GAGTCTCACT CTGGTAGCCC AGGCTGGAGT GCAGCGATGC 22800

GATCTCTGCC CACTGCAACC TTGCTCTCCC GGGCTCAGGC GATTCTCCCA CCTCAGCCTC 22860

CCGAGCAGCT GAACTACAGG CGTGTGCCAC CACTCCCGGC AATTTTTTTT TCCATTTTTA 22920

GTAGAAATGG GGTTTCACCA TGTTGCCTGG GCTGGTCTTG AACTCCTGGG CTCAGGCAAT 22980

CCACCCGCCT TGGCCTCCCA AAGTGCTAGC ATTACAGGCA TGAGCCACCG CGCCTGGCAG 23040

CTCATTTTTT AGACTAAATA AATTGGAGAT GGCTAAAAGA TTTTTATGTA GGCCAACTAT 23100

GTTTTTAAAA AGTTTTTTTT TTTAAGGATA TCTGCTGGAA CCAATCATGC CACCAACCAA 23160

AGATGCAAGA CTATAAAACA TACCCAGTTT TTCAAAGCAT TTAAAAATTA TTCTAAAAAT 23220

ATTTTTTCTC CAGAAATTTT GCATTGATTC CCTGAAGAAG CATTAATATG GGACCTGACT 23280

TATAAAATGA TGAACTCAAT CTCCCCACTC AAGGTAGGAG TCTCTCAGAT TTAAAAAATA 23340

AGCATCCTAG TCCTCTTGTC CCTGTAAAAG TTAACCCTTA CACCTGAAAC ACCAGGAGAC 23400

TGGCGGTTGT TTGCATAGGG GTTACAATTA AAGTTGAGCT ACCTCTGACA TCTATTAACA 23460

CCAAAATTAG TAAACTATGC ATGTATGGAG ACTTTTATGA TTGAACTTGT TTATTGAGTC 23520

AAGAGATATA GTTTACAATG AAAATTTGGG GCATATCAAA ATGACCTTGG CTTAGCTTAG 23580

CATTTGCTGA TGTTAACTAT TTTCTTCATT GGGCTGATTT TAGTTGCTTA GGAAAAATAC 23640

AAACACACAC ACTTTAAAAT TATATTAAAA TCCCGTCCTA AACCTCAGAG TCCAGAACCG 23700

CATCCTAACA CTGGTCATGC ATAATATGTT TAAATTTTTG TGCTTTAAAA ACTACAAATA 23760

AGGAATGTAT TAATAGTTCC ACAATCAATG GTCAGTTAGC CGAGGGAAGA TTAGCATAGT 23820

TAAAGACTTA AAATGGCTTA ACAACATATA TCAAAAGGAC AAAATAAGGG GAACAGAGTC 23880

TAGAAATGAG GAAACTGGGA CACAGGCAAA AAAAAAAAAT GAGAACTGGG ACATGAATAA 23940

CGCAAGGGAT AAGACTAATA CACAAAACAC CCCAAATAAA TAGCCAGCAT TTGCTGAGCT 24000

CTTACTGTGA GCCTGTTCTA AGCACTTTAC ATATATTAAC TCATTTCATC CTCAAGGAAC 24060

CATCTGAGGC AGGCACTGTT ATCATCTCCA TTTTACAGAT AAGGAATAGA CCCAGAGAGG 24120

CTGAGCAACT GGGCCTATTC CACAGCTACT ATGGTGGAGA TGAGATTTAA ATCTAATCAT 24180

TGGCTCCAGA GCCCATGCAC CCAATGGCTG CACTAAGTGA ATGCATGCGC TATCAACGTT 24240

GCCAAAAGTG GGCCACAGCT CGGATCTGCG TTTTCCAGTA GCCAAAGCAG AGAGTGTGAT 24300

CAGACCTCAC TTTAATAAGC AAGTCTCAAG CCAGAGAGAG GTGGTATCAG GCAGCAAACA 24360

GGCTGCTAGT CGAAATCCCA CTTCTTCTCT GAGTGGTCCA TACAGTTTTA CTCTACTTGC 24420

TTACAGAATG AAAATAGCTG GAGTTCAGGT GCGCTTTCAA TGCCCTGTTG TCAGGATTGG 24480

GCTTTTCAAG TTTATTTTTT GTTGTTGTTT TTAATAGACT GTACTTTTTA GAAAATTTTT 24540

AGATTTACAG AAAGATTGAG AGGATAGTAC AGAGAGTTCC CGTATACCTC ACACCCAGTT 24600

TCTGCAATTA TTAACCTCTT ACATTCATGC GGTACATTTG TTACAATTAA TGAGCCAGGG 24660

CCGGCCGGGC ACAGTGGTTC AGGCCCCTAA TCCCAGCACT TTGGGAGGCA GAGGCAAGCG 24720

AATCACTTGA GGTCAGGAGT TCGAGACTAG CCTGACCAAC ATGGTAAACC CTTTCTGTAC 24780

TAAAAATACA AAAAATTAGC CAGGCATGGT GCTGGTTGCC TGTATTCCCA GATACTCAGG 24840

AGGCTGAGGC ACAAGAATTG CTTGAACCAG GGAGGCGGAG GTTGCAGTAA GCCGAGATCG 24900

TGCCACTGCA CTCCAGCCTG GGCAACAGAG CGAGACTCCA TCTCAAAAAA AAAAAAAAAA 24960

AAAAGAAGGA AGGAAGGAAG GAAAATTAAT GAGCCAATAT TGAGACATTA TTATTACTAA 25020

AGTCCATGCT TTATGCAGAT TTTCTTAGTT TTTACCTGCT GTCATTTTTC AGTTCCAGGA 25080

ATGCATTCAG GATGCCATAC CACATTTAGT TCTCATATCT GCTTAGGCTC CTCTTGGCTA 25140

GACTGAGTTT TAATCTACTT TCTGCAGAGC CTGAGAACTT TAGCATAATT TCCTTGAAAT 25200

TACAGCTCAA TATTTTCAAG CACTTATACA AACAGCCTAA TGTTACGTTG GCCCATAACA 25260

GTGTTTCAAG GTAATAAACT TCTTTGTTTT CTGTGCCGAT TGAAAGAACT GCTGCTTAGC 25320

CTCCTGCCAG ATGATGAACT GGGTACACAC GAGCATTTTT CCAGGTAAAG CATATTTCGT 25380

GCGACTTCTT AAGCTGCAGC CTTATATGCA ATAATTGTCC ATTTACAAGA CTTATGTTCG 25440

AATTTCAGGC ACTCTGTTTT CACTAACCAT ATCTTCAACT TTGATAAGTA CTGCTTTAAT 25500

CACTCAGAAA ATTTAACTTG ACTAATTTTT TTTCACCATC AGTTTTTTTT CTGTTGACTC 25560

TTTCTCCTTT TTCTGTTTGC CCAGAAACAT GCTCAGGATT CTCTCAGGCT TTAAAAAATG 25620

AAAAAATGTT TCCTGCAATC TAGTTACTCC TTGATTCTCT TGTTCTGTTT ATCGCTGGAA 25680

TTCTTGAAAG CTTGGTGTAT TAGTCTTTTT TCATGCTGCT GA AAAGATA TACCTGAGAC 25740

TGGATAATTT ATAAAGAAAA AGAGGTTTAA TGGACTCACA GTTCCACGTG GCTGAGGAAG 25800

CCTCACAATC ATGGTGGAAG GCAAAAGGCA TGTCTTACAT GGCAGCAGAC AAGAGAGAAT 25860

GAGAACCAAG GGATTTCCCC TTATAAAACC ATCAGATCTT GTGAGACTTA TTCACTACCA 25920

CAAGAACAAT ATGGGGTAAA CCGCCCCCAT GATTCAATTA TCTCCCACCG GGGCCCTCCC 25980

ACAACACGTG GGAATTATGG GAGCTACAAT TCAAGATGAC ATTTGGGTGG GGACATGGCC 26040

AAACCATATC ACCTGGCCTA TAGCATTATT TCCATTTCTT CCCCATCCTT TTATTCCTCA 26100

AACCGGTACA ACCAGACCTC TTTTTTTTTT TTTCTACCTG AAACTGCTCT TTTGAGGGTA 26160

GCTGATAAGT CCAAAATACT GTCACCTTTT CTCAATTCCG TTCCTTCTTA TGCCTTTGGA 26220

GCAATTGACT GTGTTGGTTG CCCCCTCCTT TAAAGTGTCT CTCACTTGGT TTTTATGACT 26280

AATGATCATG ATTTTCTTTT TCCTCTCTAA ACATTCCGCT ATCTTTTTAG CTTCCCTTCC 26340

CCCTCCCATC CCCTAAATGT CCTTGTTTCC CAGAATCTGC CTCACCTCTT TGACTTCTCT 26400

ATGCCCTGTC ATTCACTCAT GGGTCTTTAT TACATTATTG CATCTGTGTC AATAACTCTG 26460

GTCTTTCTCT TAAGTTCCAG TCTCCCATTT TCAAATGTCC CCAGACATTT CCAATTGAGT 26520

ATCTCTCCAA TGTATTTAAC CTGCTAAATA TCTAACACAT AATCTTTCCC ATCAAATCGT 26580

TTCCTCTTAA GCTTTTCTTA TTTCCTATTA GTACTCCTGC ACTTCTCCCA GGAGCCCAGA 26640

CTTAAAACCT TGAATTTCTC ACCATAACCT CTCTTTTGTC TCCCATAATC AATTAGTAGC 26700

AAGTGTTATC AATGATTACT TGACAATATC TTTTTCTATT TCCCTCCCTG CTATGATCAT 26760

TCATCTAGCA AGAAGAGTTG GCCCTTTGTA TCTGTGGTTT CTGCATCCCT GGATTCAACC 26820

AACTGTAGAT GGAAAATATT TGAAGAAAAA AGCGTCTATA CTGAGTATGA AAAAATTTTA 26880

TTTCTTGTCA TTATTCCCTA AACAATACAG TATAACAACT ACAGCATTTA CACTGTAGCG 26940

TATAGATCTT ATAATCTAGA AATGATTTCA AGTACACCAT TATATATAAG GGACTTGAGC 27000

ATCTGTGAAG TTTGGTATTT GTGGGGCATA CTGGGACCAA TTCCCCCATG GATACAGAGG 27060

GACAACTATA TTTACTCAGT GCTTACTAAA TACCAGTTGG CCAATGTGTT TTTCTTTTTC 27120

TGTTTTCCTG TCTTTAGTTT GCCCCTTGCC AATTAATTCA ATAGTGCTGC CAATGCCAGG 27180

TGTACCTTCA GAATATTCTA TTCTAATTTT GTCATCTCCA AGCTTAAAAA TATTTAATGG 27240

GCCAGGCGCA GTGGCTCACA CTTGTAATCC CAGCATTTTG GGAGGCCAAG GGGGGGTGTA 27300

TCACTTGAGG TCAGGAGTTC CAGACCAGCC TGGCCAACAT GGCGAAACCC TGTCTCTACA 27360

AAAAAGTATA AAAGTTAACC AGGTGCTGGA GCATTTGCCT GTGGTCCCAG CTACTCACGA 27420

GGCTGAGGCA AGAGAATCGC TTTAATCTGG GAGGTGGAGT TTGCAGTGAG CCAAGATCTC 27480

TCCACTGCAC TCCAGCCTGG GTGACACAGC AAGACTCTAT CTCAAAACAA CAATAACAAC 27540

AACAACGAAA AACATTTAAT GGCTGCACCT TGCCTGTGAA AAATGCATTT CTTGGCCAGA 27600

TGTGGTGGCT CAAACCTGTA ATCCCAACAC TTTGGGAAGC TAAGGCCAGG AGTTCGAGAC 27660

GAGCTGGGAT ATATAGGAAG ACACAATCTC TACAAAAAAA AATCCACAAA ATTAGTCAGG 27720

CTTAGTGTTC ATGCCTGTAG TCCCAGGTAC TCAGGAGGCT GAGGCAGGAT TCCTCAAGCC 27780

CAGGAGTTCA AGGCTTCCGT GAGCTATGAT GGCACAACTG CACTCCATCT TGGGTGACAG 27840

AGCAAGGTCC TATCTCTGGA GAAAAAAAAA AAAGAAGGCA TTTCTTAGGA GAGTTCTTCT 27900

CTGTAGAGTC CTAAGGGTTC CATGGAACTC CTTAAAAGCA TCAGAGTATG TGAGTGCAAT 27960

GGGAGGAAGC ATTTAGCCAG AGCAGTTGTG CTCCCATTGC ATATTAATTT TTAAAAAACA 28020

AAGCTATAAA AAAAAGTTGA AAACTACTAC GTTAGCATCA GCCTGACATT TAATGGCCTC 28080

GTAAATCAAA CCTTAATTGA CTTTTTAGCC AGTTATGCTA CTAGCCAACT ACAGACAACA 28140

CACTTTTTAA CCAAATTAGA CTAATAGTTG TCATCAGTGG AAATCAAGTT TGCCATTCTT 28200

CCATGCCTTT GCTCACACCA TTACCTTTTC TGGAATGTCC TGTACTCATC TTCCTGTGTT 28260

GAACTCTATA CCCAACTTTA AAAACCTAGC TCAAAGTTCA ACACTTCCAT TCCATTTCAA 28320

AAAGAGCTTT CCTCTTCCTT AAAGTTTAAG AACTCATTTT CATGAATCTT TTTGGCATTT 28380

ATTGCACACA TGCTTGCTTT GTGTTATTTG TGTTCATGCC TCATATGCCC CCAAGGTGTT 28440

TTAGACTCCT TAACGGCAAA AATGATGCTC TAAACACCTT TCTATCTTTC ATAGTGTCTT 28500

AGTCTGTTTG TGTTGCTATA AAGGAATACC TGAGGCTGGG GAATTTATTT AAAAAAGAGG 28560

TTTATTTGGC TCACAGTTCT GCAGCTATAT AAGAAGCATA GTGTCAGCAT CTGCTTCAGG 28620

TGAGGGCTTC AGGAAGTTTC CACCCATGGT AGAAGGCAAA GGGGAGCAGG CATCACATAT 28680

CAAGAGAGGA GGAAAAAAAG GAAGGAAGAA AGGAGGGTGC CATTCTCTTT CAACAATCAG 28740

TTCTTGTGGG AACTAATGGG ACAAGAGGCT GGGCACGGTG GCTCATGCCT GTAATCCCAG 28800

CCCTTTGGGA GACCAAGGTG GGTGGATCAC CAGAAGTCAG AAGCCTGAGA CCAGCCTGGC 28860

CAATGTGGTG AAACTCCGTC TCTACTAAAG ATACATAAAT TAGATCTAGC TGGGCCTGGT 28920

GGCGTGTACC TGTAGTCCCA GATACTCAGG AGGCTGAGGT AGGATAATCA CTTGAACCCG 28980

GAAGACAGAG GTTGCAGTGA GCTTGTGCCA CTGCACTCCA GCCGGGGCAA CAGAGTGAGA 29040

CGGTCTCAAA AAATTTTAAA AACTTTAAAA ATAATAGAGC AAGAAAGCAC CAAGTTATTC 29100

AGGAGGGATC CACCCCCAAT GACTCAAATA CCTCCCACCA GGCCTCACTT CCAACACTGG 29160

GGATCAATTT CCGTATGAGA TTTGGAGGAG ACAAATATCC AAACTATATC ACATAGTAAT 29220

GAACATAGTA CCTTATCTAT AGAAAGCAAT GGCTAGACAA CTGTTGAATG GCTAACCAAA 29280

TCTGCTTTCC TATGGTCTCG CTCTAGAGGG GGTCAGTATG AGTTTCTGTC AAAAGGAGAA 29340

AAAAAAATGT ATAGTCAGTT TTGTGTGTGT GTGTGTTCAT GTAAAAGAGA TCAAGAGAAA 29400

AGAACAAGAG AAATCATGAA AAGGAGGGGG AATATAAGAA TAATACATAG AAAAAAGCAA 29460

ATTATCTTGT TTATCAGTAA TACCCAAGGG GGTAGAAATG GTAAGTAATA ATCCTTCTTC 29520

ACTTTGTCTG TAGTTCACTT TTTTGCACCT TTATTTTGAT GAATTCACAT CGAAGACATT 29580

AACTCATTAA GGCTTCCAAT ATTTTTGGAG ATAAGAAGGG CTGCTATGCT CTTTATAGAT 29640

GGAAAACTTG GGTCATTAAT AACTCAAACA AGGACATAAC AAAGAAATGG AGCATAAACT 29700

GCCAGGTCCT GACTGTAGAT TTGGATTCCC AGTTGGTGTC TTGTCACCCT TTGTTACTCT 29760

TCCTAAAGTT ATGATCTTTT CTTGTGCATA GGAAATTCAT AGTGATTTCC CATCACCCTT 29820

GGGATTATCA TAGCTCCTTT AAGGTCCCCT CTATGCACTC AATAACATCA ACAGTAAGTG 29880

TTCTTCGAGC ACTTACTGAG TGTATATCAT TGTGTTCTCA CGCAGCACCC ACAGATCTCA 29940

CCAAGAACCT AGCTGAAGCC TGTAGAATGA ATAGGTAAGT ACTGCCATGC CAATCTGGAG 30000

TACTCAAGCG ATGCAAATGA TTCCTTTAAT TGTACTTTTG CAGGCTTGTC AGTTTTGCTC 30060

ATGGAGAAGT GGCTACTGCA TCCATGTTAT ATCTATGTAA TGTTGGACTG CGAAGCATCA 30120

CTTGACTTTT TCCAAGCAGA AATTACAGCT GATGACAAGC TGCTGCTGAG AAAATGGATA 30180

TTTTTCTGAA TTCAGTTCTA CGTGGAAACA GCTGACTAGT TTCCATTGCT GTAAGATGGC 30240

TCTTTTGCTC TTGGTTGATT TTGAGTAATG GCTTTACTTC TGTAGAAAGG AGATTTCATT 30300

TGAAGTCCAC TCAGGGATTT GGTTCAACAA ACTGGAGTAC AGGTTTCAGA AAATATCTCT 30360

TTAATCCTCC AATAATAAAT TTTCTCATCT ATAATTCCTG GAACACTTCA TCCTTTGCAG 30420

CCGAGCATAT AGATAGATTT GTTGCTCACT GTGTTCTGAT TGCCACTTTG ACCTGCTTTT 30480

TCAACTTAGG TTACAAATAG AACAGAATCT CTCTGATTTT TCTCATTAAT TGTTTGAATT 30540

CCCACTTTTC CTCATTAGCA AGAAGTCCAG TATCTTCCTG AGAACTTCCT TTTCTCAATC 30600

TAGGAACTTA CTTGGTCCAT AAGGTAACAG TCTTATTTCT GACTATCAAG GAGAGAAATA 30660

ACAGGAGCCA TTATCATCTT CATGGTGTCA CTTTTGAAAA CTGGTCCTCT GTAGATCTTC 30720

AGATTCTTGC GTTAGTCCAT TCAGCTGCTA TAACAAAATT GCATAGACAG CATGGCTTAT 30780

AAATAACAGA AATGTATTTC TGACAGTTCT GAAGGCTAGA AAGTCAAAGA TTAAGACACT 30840

GGCTGATTTG GTGTCTGGCG AAGGCCCATT TGCTCATAGA TGGACGATGA CCTTTCACTC 30900

TGTCTGCACA TGGCAGAAGG GCAAGAGAGC TCTCTGGGTC TTTTTTATAA GGGCACTAAT 30960

CTCATTTTTG AGGACCCTGC CCCCATGACT TAATCACCTC CCAAAGGCAC TGTCTCCCAA 31020

TACCATCACC TTGAGGGTTA GGATTTCAAC ATATGATTTT GGGGGGACAG AAACACGCAG 31080

TCCATCTCGC TTGTCCACTC CATGGTGGTA TTCTTGCTGG ATCAGTTTCC TCCTTGGGGT 31140

GCATTTGTGT TCCATGTCTA ACTTGCAAGT TATAGCAGGC CCGATAGCAA AGTATTCCAA 31200

TGTTGGTATG CAGAGGCATT GAATAATCAG AATGAACCCA CGCCATAAAC AACTGGTAGA 31260

GCTGCAGAGA GTACCAGCTG ATTATGAGCC CTGGGTAACA GTGGTTTTTA GTTCCTATGT 31320

CCGTCAGCCC TTTTCTCCCA TAGTAGCCCC ACTGTGTTGA AGTGGCTGAA TCGACAGAAG 31380

CTTCCAGCTT GGGCCACATG CTCATGGAAC CAATTCTCCT TATGAGCCGT ACAAGAGCTG 31440

GGTTGCCATT CTGGATACCC TCTTTCTTCA AGAGATTTTA TTTCAAGGAT ATTTTTTCTT 31500

TTATCAACTA CAGGGATTAT TTAGAATCTT AGGGCAGTGG TGCCCAACCT TTTTGGCCCC 31560

AGGGACAGGT TTTGTGGGAG ACAATTTTTC CATGGACCAG TGTCAGGGGG CTGGGAGGCA 31620

TGGTTTTGGG ATGAGTCAAG TACATTACGT TTGTTGTATA CTTTATTTCT ATTATTATTA 31680

TATTGTAATA TATAATGAAA TAATTACACA ACTCACCATA ATGTAGGAAT CAGTGGGGAG 31740

CCCTAAGTTT GTTTTCCTGC AACTAGACAG TCCCATCTGG GGGCAATGGG AGATAGTGAC 31800

AGATCATCAA GCATTAGATT CTCATAAGGA GTGCTCAGCC TAGATCCCCG GCATGTGCAG 31860

TTCACAATAG GATTTGCTCA CCTATGAGAA TCTAATGCCA CTGCTGATCT GACAGGAGGT 31920

GGAGCTCGGG CAGTAATGCG AGGGTTGGGG AGCAGCTGTC AATATAGATG AAGCTTTGCT 31980

CGCTCGCCTG CCACTCACCT CCTGCTGTGT GGTCCACTTC CTAACAGGTC ACAGACTGGT 32040

ACTGGTCCAT GGCCAGGGAG TTGGGACCCT GTCTTAGGGA GTAGGGGTGG AGTTCCCTTC 32100

ACTTCTAGAA GGCCCTGGAT TAGTATCCCA GAGCTGTCAT TACAGAGTAT CACAAACCAG 32160

GTGGCTAAAA ACAGACATGA ATTCTCTCTT ATTTTTGATG GCTTGGAAGT CCAAAGTCAA 32220

GGTGCTGCCA GGGCCATGCT CCCTCTGAAA TGTGTAGGGG AGAATCCTTC CTTCCTCTTT 32280

CTAGCTTCTG GTGGTTTGCT GGCAATCACT GGCATCGCTT GGCTTGCAGC ACTTCAACAT 32340

CTGCCTTTAC TGTCTCATAG TGTTCTCCCC TCATGTCTCC AGGTCTCTCT GTCTCTCTTC 32400

TTTGTATAAG GAAACTAGTC ATATTGGATT AAGGGCCAAC CCTACTCTAG TATGACCTCA 32460

TCTTAAGGTC ACATGCAATG ACTATTCCAG ATAAGGTCAC ATTCTGAAGA ACTGGGAGTT 32520

AGGACTTCAT ATCTTTTGAA GGAACACAGT TCAACCAATA ACAGCCCCTG TACTGTTTTA 32580

CAAATAGGTA TTCCTCTCCT TCCCAAAGTT CTTCATAGCA GAGACAACTT GTACCAAAAG 32640

GCAAAATACC TTATTATGTA ACCTTAACCT AGGATCATAG ATCCCTACTT GTCTGGTGCT 32700

TTTATAAGCC ACAGAACCAC CCGGGAAATC ATTATTAAGA CAAGGAAAGG CCAAGTGCAG 32760

TGGTTCATGC CTGTAATCCC AGCACTTTGG GAAATTGAGG CGAGTGGATC ACCTGAAGTC 32820

AAGAGTTTGA GACCAAACTG ACCAGCATGA CAGAACCCCA TCTTTACTAA AAATACAAAA 32880

ATTAGTTGGG CATGGTGGCA TGTGCCTGTA ATCCCAGCTA CTCAAAAGAC TGAGGCAGGA 32940

AAATCACTTG AACCGAGGAT GCCAAGATAG CAGTGAGCCA ATATCGTGCC ACTGCACTCC 33000

AGTCTGGATG ATAGAGCAAG ATCCTGTCTC AAAAAATTAA TAAATAAATA AAAAGACAAG 33060

GAAAGCCTTT TCCAAGGAGA CCCTTCTGCT TTGCTAGTTC AGAGAACTTC TCTTTGGAGA 33120

AAACAAACAC CCAGTCCATT AGCAGCAACG TCAGGGATTG AATTCTTAGG GCAGCAGGCT 33180

GGGCACAGTG GCTCATGCCT GTAATCCCAG TACTTTGGGA GGCTGAGATG GGTGGATCAC 33240

TTGACATCAG GTGTTCGAGA CCAGCCTGGC CAACATGGTG AAAACTCATC TCTACAAAAA 33300

ATATGAAAAA AAAAAAAAAG CTGGGTGTGT TGGCTTATGC CTGTAGTCTC AGCTACCTGG 33360

GAGGCTGAAG CAGGAGAATC ACTTGAACCC GGGAGTTGGA GGTTGCAGTG AGCTGAGATT 33420

GCCCTACTGT ACTCCAACCT GGGTGACAGA GAGAGACTCC ATCTCAAAAA AATAAAGAAT 33480

TCTTCGGGCA GCAGTCTTTC CTCCACCTCA TAGACCATGG AGGTGAGCCA GCTCTGACAA 33540

ACCATGAGAA CAATGGCAGA GACATACCTG TAACGTAACT GACTGGGGCA AAGACAAAGG 33600

TGAGGAAAAT GACAAGTTTG AGGAACTATG AGACCAGGCA GTGGGGAACA CCACTAGCAG 33660

AAATGATGGA AGTTCTCAAG AATAACAACA GAGAAATAGA CCATGGCCAG AGTCTAGAAC 33720

CCTCCAGGGA AAGGAGATGG GCTCCAGAGG CAGAAGAGGA CGTTGAAGGG AATGGGGAGT 33780

GGGTGAAATA TATAGACGAT GGGGACCACC CAAGAGCAGT CGCTATTGCA AAACTGAGGA 33840

GAAGGAGAGT CTGGAGGGGG TGGTGGGAAG CTGGGTCTCC TAAGGAGGTT TTGACAAAAG 33900

CAGTCATGGA GCGGGCTTAG AAATCACAGT TGGGGACAGG GTAAAGTTCC TCGGGATATA 33960

GAGGATGAGA TTAGAAGAGG TTCCAACTAG GGTAGTGTGG AGAAAAGCAC TATTGACCCA 34020

AAAAGGAAGG AGAATGTGGG TGGAAGTGGC AGAGAAAGAG GGGTTTGAGC AGAGAGTGGT 34080

GATTTTTCTA ATGCAGAGTT GTGGGAGGTG GAGTGCAGGG AGCCAGGCTG GGTGGCTGTG 34140

CTGATGTGAT TAAGCACTTA CTGACTGCCA GGCAATGGGC TAAGTACCTG AGATGCTTTG 34200

TCTGTTATCC CTCCCGAAAC CCCTCTGAGC AGGTGCAGTT ATTATTCTCA CTTCACAGAT 34260

AAGGAAATTG AGGCACAGAG AATTGAGTAA CTTACCCAAG GTGACATAGC TCATATATGG 34320

TAAAGCAGGC TTTGAACTCA GTCTAGCTCC CGAACCTAAG CTTGTAACTA CTATGCTTTT 34380

CCCAAAAAAA GGGGGCTGGC ACAAAAAGAG CTGAGGGGGG CTGGGCATGG TGGCTCATGC 34440

CTGTAATCCC AGCACTTCGG GAGACTGAGG CAGGTGGTTC ACCAGAGGTC AGGAGTTCGA 34500

GACCAGCCTG GTCAACATGG TGAAGCCCTG TCTCTACTAA AAATACAAAA ATTAGCTGGG 34560

TGTGGTGGTG TGCACCTGTA GTCCCAGCTA CTTTGGGAGG CTGAGGCAGG AGAATCGCTT 34620

GAACCCCAGA GGCGGATGTT GTAGTGAGCC AAGATCATGC CACTGGACTC CAGCCTGGGT 34680

GACAGAGTGA GACTCCATCC AAAAAAAAGA AGAGCTGAGG TGATGGCCAC CATCAGCATC 34740

AGCCTGGAAG TTATAGCAGG ATGCTAAGTT TCTCTAAAGC TGTCTTTCTT AGGACTTGAA 34800

AAAGATAACT TGGGTTTGTA TCCCATCTCT GCCATTAGTA GTTTACTGGC TTTGGATAAA 34860

TTACTTAGCC TTACTGAACC AACTTTGGAT TTTTATAGAG ATACTGTAAT GAAAGGAATA 34920

AGGTATCAGT CTTAGCAGAG CATCCAGAGT GTTCCTATTA AAACCTAAAT CATATCCTGT 34980

CATTGCTCTG CCCCAAACCA TTCAATGGCT TCCCAACTCA AAGTTAAAAA CTCATCTTTC 35040

CAGTGGCCTG CAAGAGCCTA TGCTATCCGG TGTCTGACCT CATCTGTTGT TCCTTTCTCC 35100

CTCCCTTTCT TGGCTCCAGA CGCACTCTGG TCTCCTTGCT GTTCCTTGAA TACACCAGGC 35160

ACACTCTTTT CACCTGAAAC ACTTTACCCC AGATATCTTA GCTTACTCTC TGCCTCCCTC 35220

AATTCATTGA TGAAATGTCT CAGTGAAGTC TTCTCTCTCT CCTCTGTAAA AGTATACTCT 35280

CTGTTCCCCT TCTTTACTGT TCTAGCTACT ATTGCTGTGT AACAAATCAC TCCCCAAATT 35340

TAATGAGTGA AAACATCAGC CATCATCTTA TTTCTCACGG TTTCTGAGGG TCAGGAATTC 35400

TGGAAGGGCT CAGCTGGGAG GTTCTGGCTC TATAATCTCT TATGCAGTGA GAGTCAGATG 35460

CTGGCTAAAA CTGAAACAAA GCAGGGTTCT AGTAGCTGAG GGCTGGCTGG GTCTCTCAGA 35520

TATAGTTCAG ATCTCCTCCA GGGGGTCTCT CCACGTGGGC TAGTCTGAAC TTCCTCACAG 35580

CATGGTGGCC TCAGGGCAGT GGACTCTGCA TAGTGGCTGA AGGCTTCGCA GCTGAGTATT 35640

CCAGCAAGCA AAGTGGGAGC TGTATTGCCT CATATGACCC AACCTTGGAA TCCACACAGC 35700

ATCACTTCCG TGTATTCTAC GGGTTGAAAA GTCACAAAAA CCAACCAGTT TCAAGGAGAA 35760

GGAACAGAGA TCACATTTCT CAATTGGAGA AGGGTCAAAG TCACATTGTA ATCAGAGCCT 35820

ATGGGATACG AAGTATTGCG GTCAGGTATG AAAAATTTGA TTTGCTGCAT CTGCTTTACT 35880

TTCTCCACAG CGTTCATGAT CTGCTTCTCA CATGATATTG ACTTACGTCA TTTCTGCATT 35940

TCCTGTCTTC CACACTAAAA TGTCAGCCTG TTTTGTTCAC TGCTGTATCC CCAGAGCCTA 36000

GCACGGAGCC CAGCATGTAG TGGTATCCAA TAAATACTTG TTGCATGAAT GAATTCTGTC 36060

TTTTAATCCT AGCTATAGGT TTCTAAGTTA AATATTACTA TAATCATCTT ACAGACGAGG 36120

GAAATGAGGC TCAAGAAGAT TTGGTAACTT ATGCGGGATC ACTCAGCCAC ATAATGGAAG 36180

AGACAGCATT GAAGTACACA TGCTTGCTCT GTCTGCTCTT CCAAGCTGCT CATCACACAG 36240

CTGCACCTCT GAGGACTTCC CTCCCCAGTC CACCTCCACC CTTACCCAGA GACACACATG 36300

GCCACAATCC ACTAGCAGAC CAAAATTCAA TTTTTCCCCA GTTGGTTGCA CTCAAGCTGA 36360

GAGCAAAGCA ATTGCACTTT AAATCCCCTT ACAGCAGATA TTTCAGAGCA TGTTCGGAAG 36420

AACCCATCAC ACTTGGCTTT TAGATCTTAT TTCTGGTTTG TTACAAAAAC ACAATTAAAT 36480

GAAAGGTTAG GTAGCTTTTG AATGGCCAGC TCAAAGTTTT GGCTTATTTT TGCCTTGCTG 36540

TCTTTATAGG CATTTTACCA ATATTTATCA CTATTTCCCT TAGGGAACCC TTAGATCTGT 36600

GATATTTGAA ATAATAAAGC CTCTCCATTG GCCCTTTAAA AGGTTTGTGG TAAAACCACA 36660

CCATTAACAT TCACAGTTCC TTATTTATGA GGCCTGATTG CACTTATTTC CATATTTCTC 36720

ACTGTTTCTC CGATGAGGAT TTCACATAAT AGTGTTTGAA GGCTAAAGAC TTCAAAGCAG 36780

ATTCTTTACT ATTTTTATCT TGAAAAATAT TCAATATTTG TGTAATTAAA GTGAAGTCTT 36840

CCTAGAGAAA ATGACAACTC AAATAATCTT AAATGTACCT CCAAGAAAAA AGCTGTCAAA 36900

GTGACATTTA GTAGTAGAGT CACATTCTCT AAGGCCTTTG CTTCTCCTTC TGAGTTCTTA 36960

TCATCTTTGA AGGTTATGTC ATGGCTGACT TCAAATCACT TTTAAAATTA TTATGGCCTT 37020

CTTTAAATGT GAGTTCTGAA GGTGAGGGGC TTTATCTTTC TTTTGCTCCA GATTTTTTCT 37080

ACCGCGTCAT TACCAAGCAT CTTAAAACAA AACCTAAAAA CAAAAATCTT CCTTGACCTG 37140

GTTTTTCCCA CTAGCTAACA TCCTATTTTT ATCTTTCCCT TTGCACTAAA GGTTTTTAAA 37200

CGGATCTTTA TACCCTCTGT CTCCATTTTC TCATCTGCTA ACTTATATGG CAAAGATTAC 37260

CACTGCCTTT CAACATAATT GGCCAATCTA CAGAAAGTTT TCAAGTTCTC TTTTTAATTG 37320

ACCACCTCCT GCCTACCTCC CCACCTTTGA CATCTTGCTT CTCACTTGGC ACCTTACCCA 37380

GTGTTCAAGA TTCCCTCCTT TAGGATGTCT TCAGAGCAGC TACACAGTTG GTACTATAAT 37440

TTATACATCC TTGTACACAG GGCTTGCTGG GATATTGATG GAGAGAAGGA GGAAACTGGA 37500

AGTAGTTCAG GCCAGAGCTA GGGAAATTGA CCCATCTCCA GGTCTCAGGT CTGCAAGGGG 37560

AGCTCACAGC TTAACACATG GAGTCTAGAA ACTTGTGCTG GACCTTGACC AACACCAGCC 37620

CATGGAGTCC AATACAGTGC TCAATAGGGA TTTCCAGGAA ATTGCTATAT TTATTCAAAG 37680

AGAACTTACC AAGTGTCAGC TACGTGTTGG GCATTGTGTT AGGCACAGGG ACCACAAAGA 37740

TAAGACATTG TAGCTTTCCT TAAGTTGCTC ACTGAGTAAA TAGAGAGACA GAAAGGTAAA 37800

CAGGTAAGTG CAAAAATACA TACAATTCTG CAATAGTGTT CATAGTGGCT ATGGAGAGAA 37860

CGCTCACTAA CTTTGTTTAA ACAGTTGTTC TTTCAAGGAT TTGACATGGA TTTGATTGGA 37920

AAAGCATGAT ACCATTTTTT GCAATTAAAC ACAGGAATAC ATAAATAAAA TGCATCAGTA 37980

TTTTTTACAA ATAGCTACTA AGAGCTACTA GAAAACCTGG GAATTCTTAA AACCTTACCA 38040

TGCTACTTGC TCTAAAATAT TTTATTTTAT GTTATTTTGT ACATTTCTTT ACCTACACAA 38100

ACACCACTGT TTTCTTCATT TCTTAGTCTA TTTAAACCTC ACACCCTTTC AGCATCTCTT 38160

AATTATTTAC TACCATCTGT TAGTTCTCCT GTCCTGAATG AAACAAAAAT GGCAGAATGT 38220

AAAACGAGGG CGAACAGATT TTTGACAGGA AGTATTCAGA GGTAGAAGGA AATAGTCAAG 38280

ACACATATGA TAAACGAAAA CAATAATAAC TTTATACATA ACAACTTATA GACACATTTA 38340

AAAAGTTTAA GATCTCAAGA GCTATGTCTG AATAGATAGA AGTAAAAACT CTATTAAGTA 38400

ATTAGGAAAA TAACAAGAAC AGTGAATTTC TTAATGAATG GCATGTAATC AAAACTGTAC 38460

TTATCGTCTA ATTCATAATC TTGAATGTTT TTATTTTATT TATTTATTTT TTTATTTTTT 38520

GAGACAGAGT CTTGCTCTGT CACCCAGGCT AGAGTACAGT GGCGTGATCT CAGCTCACTG 38580

CAACCTCCAC CTCCCAGGTT CAAGCGATTC TGCTGCCTCA GCCTCCTGAG TAGCTGGGAT 38640

TACAGAGGCC TGCCACTGCA CCCGGCTAAT TTCTGTATTT TTAGTAGAGA TGGGGTTTCA 38700

CCATCTTGGC CAGGCTGGTC TTGAACTCCT GACCTCATGA TCCACCAGCC TTGGCCTCCC 38760

AAAGTGCTGG GATTACAGGC GTGAGCCACC ACGCCTGGTC GAATGTTTTT ATTATTTGAA 38820

GAGACAACAT GGGCCTTAAA TCTGTCTTCT ATTTGACAGA CTTTGATGGA GTCAAATCCC 38880

AATGCTGCCA CTTACTGAAC GGCCTTAAAT GACTTAGTCT CTCTCAGCTG TCTTTCTGCA 38940

TATGTAAGGT GGAATAATGA TGGCTTCAAG GAGGAATAAA CCTATGAAAA GTGTTGAGGA 39000

TAGTGTCTGA TATGAAATAA GGATTCAACA AGTAGTAGCT GCTATTGAAG ATTTAAGAGT 39060

TATTTATTAC AACTATTTAA TAAAATTTTA AAAACTAATA CACTTAAATT ATTAAAGAGC 39120

TTTGAAATGG GCCAGGCGCA GTAGCTCCTG CCTGTAATCC CAACACTTTG GGAGGCCAAG 39180

GTGGGCGGAT CACCTGAGGT CAGGAGTTTA AGACCAGCCT GGCCAACATG GTGAAACCCT 39240

GTCTCTACTA AAAACGCAAA AATTAGCCAG GTGTGGTGGC ATGCACCTGT AGTCCCAACT 39300

ACTCAGGAGG TTGAGGGAGG AGAATTGCTT GAACCTAGGA GCTGGAGGTT GCAGTGACCC 39360

GAGATGTCAC TGCACTCCAG CCTGGCAACA GAGCAAGACT CCATAAAGAC AACAAAAGCT 39420

TTGAAATTGT GTAAATGAGT TGTACCTATC TTCATTTAAG AAATTCATCT TTGTTCATCT 39480

ATTTTTACTT GACATGAGAG CTTCCAGCAA TTTTTAATTA AGCCCTCACA GATTTTATGT 39540

CACTGGCTAT GTGATAAACA AATTATTTGC TAAAATAATA TTCTTGCTTC TTTTTTAAGG 39600

AATTGTCTCC CTAGAAACGG TTTGTACCAA ACAATACACT GACTTTACAC AAAATCAGAT 39660

CTGATTGGCA ACAGTTGCAG ATGTTTTCAA AGGATTTTCA TTTGAGAAGG GGCCCATTTG 39720

GGTTATTTAG ATTCTAAGAA CTGAAACTGC TTTGTTCTGT TTTTCTGGCT TCTGGGAGAG 39780

GAGGAGACAT GAATTCAGTT AGCACCTTGG TATTTTCTTT ATCCTTCATT TCAATACAGA 39840

AGATGCTTCA TATGCACAGT GGTGTCAGGT CACATCAAAA GAAAGAGAAA CAGTTTCTTG 39900

GTTTTTAATT TTCAACCGGA AAGGAAAGGC ACCCATTTTG TTCCGCTCTA ATTAGCCAGT 39960

GCATGACTTA GAGAGCAGGC AGATGCTTTG AAGGCGTGGT AACACAGGTC TTCATTAATC 40020

TCCACGCAGG ACTTGCACTT CTACTATGCC TAGGCTGAAG AAAATGGCTC AGGAAGATGA 40080

ACAATCTCAC AGAGCCCTAA CTAACTGAAG CCAGGTGTTA TAAAGCACAA GTCAAGAGGG 40140

TGAGAAACTA ACGTTCTTGA AATCTCCCAC TTCTTTCTAC GTCAGAAGAG CCAAGCTGAT 40200

TATTTTAGTT GGAATTTAGA AATTTTTAAA AATTATTCTA AAGTCATGAA CAAGCCTAAT 40260

TATAAAGATA GTTGCTGTGA AGGTGCTGAA ATAACTCGAT TTTACCAACC CCCTCTTCTG 40320

GAGGAAGCCA GAATGGAATC CTGTAGAATG TTCACTCTAC CAACGAACTC TTGTTTTTCT 40380

AATGAGGAAA CAGAGGCCCA CAGTAGTAAA CTATCTTAAC CAAGACAAAA TGACTAGTGC 40440

TCTGGTCCTT TTATTAAGCA CTAAAATTTT GATCCAATAA TAAATCTGTC CAGTAGAAGG 40500

AGTTTCCCTA ATGTACTGGT TCTAACTTGT TCCCTTCAAG GGGCCAGTGT CCCGTACACA 40560

TAGCTAAATG GGACTTCTCT TCAACTACCA TTACCCAGAG GGCAGAACCT AAAATGCTGT 40620

GAATGACATT CTGCTGTTCA CATCTCAGCA GCAGTGTTGC ATTTGAGCTT CTGCAGGGCC 40680

ACCCAGGACC TATATCTGCT CAGATGTTTA ACTCATCTAA TTCAGTGAAC ACTTCATTCT 40740

AGTTAACTGA ACATCTACTT TGTACAAGGC ACTACAGCGG TTCAGAGATG AATAAAATCA 40800

TGAGATTCCA CTGTCTCCTA TAAACCATCA CTTTGGGAAA TTTTAGAAAT GTGGGTAAGC 40860

TCCAGGGCTT CCTGCAGCGT AGAAGTCACA AACTCAAATG CCTGCAGAGG CCCAGCTGAC 40920

AACATAAGTA AATGATTCTG GCTGGGCGGA AAACAATTAC GGGTGGGTGG GTTTCCAGCT 40980

GGGGAGTGCA CGCCTGTGTT AAAGGACA 41008

(2) INFORMATION FOR SEQ ID NO: 4b:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39238 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4b:

GCTGCTACTC ATTTCCAGCC AACTGTGTTC CCATGTAGAA CTGCGGCCCA GTGTAGCCAG 60

TACCGAAGAT TTCTCAGAAA AAGCCGGAGA TCTCAATGTT AGTGTAAAAT CTCTCAAATT 120

TCCAAGAGGA TTATATGGGG CAAAGGTTCT CAGATCAGTT TGCAGTCTCT TACTTAGCCC 180

ATGTGCAGAG CAGTCGTAGA GGGTAGCATG CAGTGTCCTA CATAATAATT CTTTTTTATT 240

TTATTTTATG CCTTCCTCCT TCCTGTCTCT CTTTAACCTT TCTTCTTCCC TCAGGCTGGC 300

TTCTTCCCTC AGCCTCGTCC GACCCCAGCC TGGGTTCAAT GAACATTCGG TAAAGGAACA 360

CGGAATGTCA AGCGCATTAG AGACAACCTT GAGACACATT CCTCTTGCGG TAAGCACTTC 420

ACTGTAGATT TTTAATTTTA AACAAGACAA TGTTTACGAC TTGCTTCTTT CAGGGAAGAG 480

CGATATCAAT TTTAGTGAAC ACTTCAAGGC TGAGATACGC TAGGAGAGTC GTGTGGTGTT 540

GCACAGCAAA GAATTCCACT TTGAAGCGAG TGGGAAAAAA AGCATCAAAT GCCACATGTA 600

ACTCACCGCC TGAAGGGTTA CATTGGTATG AAACCTGGGT TTAAAAAGGG ACCGAATAGA 660

CTAGCCATTA AAAGACCTGC GTACAACCTC TCTCTCTCTC TTTGAGAGAT AATGTATCTG 720

GACAATAAAC ATGAACAGAG TGGAGTCTAT CCTGTTTAAA ACATTGCCTA CTGTACAGGC 780

ACCAGGAGCT GAAGGGTCAG AATATTAGCA GTGGGAGCTT GATTAGAAGT TGATGAGAGA 840

TGGGTAGTAG GAGGAAAGAG TGAGATAGAG GAAGAGGACA TGGGGGTTAC CCGTAAGTGG 900

AGAGTAGAAA AGTAGAATCA GCTGGCCATC AAAGGGCGTG GGACTGAGGA ACAGTATGGC 960

ATGTATTAAA TATACTAAGC GCTGACATTG GAGGAGAACT AGGAAGGTAA ATGAAATCAA 1020

TAGGGGATGA TGGAGAATAG TTAGGTGTGC AGGGATTAGG GTTATGATAG AAATACATGT 1080

GAATACATGC AGTATTGTCC TGGAAAATGG TTAACAGTTG GTTCTCCTGG GGGGTGAGGG 1140

GAAGCCCTGA TTTGTAATAT TTGCCTATTT CTGTGGTGCA AATACTCCCA CCATGACCAG 1200

TTTCAAGCTA TGAATGTGAA TCACAAAAGC AGGTTGGGAG GAGATGCGCA CATTTGTTCC 1260

CCGGCAAGGT GGAAGGTAAG GAAGGTGAAA TCAACAAGGT CAAAGAAAAC TCAAGATTTC 1320

GAGGTGCCTC AGGTCTGAGG GGCAATGAAG TCTAGGAATG GCTGTGCTGA GGTAGCTGAA 1380

ATAGAAGTGA CTGCAGAGGT CATGAAGCTG AAGAGGTGAA AACAGAAATT AGAAAGGCAA 1440

ACCCCCACCG CCCAACCCCC ACCCCTGCAG CCAGTTTCTG AGGGTGACAA TAGAGGAAAG 1500

GGTGGAGATG GAGTTCAGGT CCAGAAGCCA TAGAAGCGAG TGTGACATTG TGCTCAAGGT 1560

CAGCACATGT CAGTGTGGGG TGTCACATGC TGTTGTGAAC CATCATTTAT CACCAATTAT 1620

GGAAGACCTC CTATGGGCAT CTTGCCATAT GCATTATAAA GATGTGTAAG AAGACATTTC 1680

CCTCCACTTG GTGAGGAGAA TTAGGGCTGT ACACAGATAC TGTAGAGTGC CATGTGCCTG 1740

GTACAGATAA GGTGTGTTAG AGGTTAAAAG ATGAGGCTCT TAATATTAAT GATAGATCCC 1800

ACTTACCTGA GTCTGACTTA CAATGTGCCT AGCATTAAGT GTTTTACCTG CATTCCCTTT 1860

GACCTTCAGA ACAACCCATT TTACAGATAG GGAAATTGGG TCAGAAAGTT TCAGTAACTT 1920

ATCCAAGGTC ACACAATTGG CAAGTGCCAG AGCTGAGCCA GGAACTGAGG TCCTTCTAAC 1980

ACCAAACAGC TTGTCTCCCC AATCACTGTG CTATTTTCCC TCCCCCAGAA GATAATACTC 2040

TGATGGAAAT GAAGGATAGT GTAATAGGAG ATTCGGTGTT CCTTTTTTTA AAAAAAATTC 2100

AGCTTGCATA TTCCTAAAGA GTCAATTCAT GTTTAAAAAA AATTTCCCTT GTGCTTGCAT 2160

GTGACATGTA TTTTTAGGAT CTGCTGTTAG CAAGTGTATT TTTGTGTGAT TGAGTGGGAG 2220

AGTGGGAAAA GTTTTGCAGA GCTGTTGAAG CCAGAATGCA GGGGGGCTGC GCAGCAGAGA 2280

CTGTAAAATC TCTGCCATCT CAGGTCTTGG AACAAGCACA AAGAGATGTG TTCTCGATTT 2340

ATTATTCTAT GTACATCCCC AGATGAATGA CTAGTTAAAG GTATTGTTAA AGCATTTTAA 2400

ATGACCCACT TCCAGCAGCG AACAAAATCA CTTGCTGTGC CAAGCCAACT GGCATTTCTG 2460

AGATGATAAA ACCACAAAGT GAGGAAAACG TTAAAACTGC TAAAGCAAAA ATGATACACA 2520

ATAATGGAGA AGGAGAAAAA TTGAGCTTTA TTGTCTGCCT AGGCAGATGG CTGACCACTA 2580

GGTGGGCTCG GCGTCACGTC CAGGGTAATT GGTTGCTGGG GTGTTTCTGG CGAGGAAGAT 2640

TCACGCTTCA GCTCGGTCCA CAAGATCCTG GCTCATTCTT TCCTAGATTC CATTTTCTGC 2700

CTCCTCTCCA TGACTGGGTC TGATGGTTGA TCCAAACGGG CAATTGAAAT CAGAAGGTTA 2760

CCTTTACCTT AAAATGCTTT TCTGGAAATA AAAGGACATG AAAAGTAACT AAGGACCGGA 2820

TTTCCTAGCC GTCTTTCTCT CCTGCATGCG CAATTTATCC CCAGATATAA AATTGCCTGC 2880

TTTGATAATT ATACCCTCTA AATGAGGGGC AAGTGGCTAA TTATGCCCAC ATGTGGCCGA 2940

TTGCACTCCC CATTAGCCAA TTATGTGCTC AATTATTTGT GCACATGAAT AATTGCACTC 3000

ATGGAAAATA GCGGCCCTCC TTTCAAATCC TCGTGCTTGG AGTGGCTGAT GGAGTAATTG 3060

TCACACTGGA AATGCACTTG GTGGGGAGGG AAAGAGTATC AGATACCAGG AAACGCATAA 3120

GTGACCAGAG CTCGCAGATG TTCACTGCCA CAAATGGCCT TAGGAGCCAG AGAGAGCGGG 3180

AAGGACCACA GGATGGAACG GGCCAGCCTG TGAGTTAGGA AGCCTGCTTC TGAAGTTGCC 3240

TGGGCAGCTC ATGTGCGGTG ACCTTGGGCA AGTCATTAAC TTTCCTTCAG GTCTAACTGG 3300

TTCTGCATAC ACAATGAGGA TGGTAATAAC GCCCAATTCC CATCACTATC GTGGGATGGA 3360

TCAGACTATT TAAAAGGATT TACAATCTGC TTGGGTAAAA GCTTTACATA AATATGAGGC 3420

ATTATCATGT CGCTTGGTAC ATCTCCAATT ATGAAGGAAG GGTAATGACC CTCCACAGCA 3480

ATGCAGGACT CCTGGTTTGG AGGGAGGGAA AGTTTGAGAA GGACAGGAAG CTTGTTGCCC 3540

CAGCACTGAT GTTTCTACTG AGGTACCAGA AAATGTCATG TGGTCATACA GAATTCATTT 3600

ATTCATTCAA CAAACATCTG TCAATTGTTA CACTGTCCTG AGAATTTGGA AAAATGATGA 3660

AAGACTCAGT CCTGCCTTAG GAGGTCACTG GCACATTGGC CCGGGCCCCT GTTTTGGGCC 3720

TTTTACTCTG ACCTGTGCTG ATTTGCAAAT AGTGGGAAAT TTTATCTCAA GTCTAGGAAA 3780

TCTGGCATGC ATTTTCACGG TTTGATTGCC AGGTACATTC GATGGCAATG AGTCTTATAA 3840

TGTTTGGTTA CCTTCATTTA CCTAAAAACT GTGGTTGTTG CTGTGGTTGT TGTTTTTGTT 3900

GTTTTTGAGA CGGAGTCTTG CTCTGTCATC CAGGCTGGAG TGCAGTGGCA TGATCTCCGG 3960

TCACTGCAAA CTCCACCTCC CAGGTTCAAG CGATTCTCAT GCCTCAGCCC CCTCAGTAGC 4020

TGGATTACAG GCGCGCACCA CCATGCCCGG CTAATTTTTG TATTTTTAGT GGAGACAGAG 4080

TTTCACCATG TTTGGCCAGG CTGGTCTCGA ACTCCTGATC TCTGGTGATC CGCCTGCCTC 4140

GGCCTCCCAA AGTGCTGTGA TTACAGGCGT GAGCCACTGT GCCCAGCCAG AACTGTGGTT 4200

TTAATGACAA TGCTAAAAAG TGGTATATGT CACAGTGTCG GGTGGGGCTA AGAGGCACAT 4260

TGCTGCAGTG ATCCATCATT CATTTCCCAC CATTCTCGCC TGGATTAGCG CAGCAGCTCC 4320

CAGAGAGGCA CCTCACTTTG ACCTTCTTCC TCAAAGACAT TCTCTGTGAC CTGCCTGGCC 4380

CTTATTACCT CTCTAGCTTT GCCACTTCCC TATGTCTCCA TCTCCCCTCT CACACGTAGT 4440

AAGAAAGAGA CTCTACCTCC ATGGAAGTTA AGGAGAGGTT TCACAGAGGC AGGATTGCTT 4500

ATTAGTCTTC AAAGATGAGG TATTTGCTAA ATGAATGAGA CAAAGGGATT GGGGCCACAT 4560

TACAGGAAAT TGAGGTATGT AATAGCCTGG TGCAGGTTAA GAGTGTGGAC TCTGAAACCA 4620

GACTCAGCCT GGAATTGAAT CCTGGCTGTG TGATGTTGGG CCAGTGACTT AACCTCTCTG 4680

TGCTTTTATT CACTCTTCTA TAAAATGGGG ATTATAATAA ACCTACCTTA TAAGGTTATT 4740

ATAAGAGTCA GTAAATATAA AAATAGAAGT TTTTGGATGA TGACTAGCAC AGAGTAAACA 4800

CTTGTTTGCC ATTATTTTTA TTACTTGACT AAAAATATAC CAAAAAGACC ATCCAAGAAA 4860

AGCCTTTAAG CTGCTAGTGC AGAAAGATTC CCCTTGTGTT TGTGTGCTGG GGGGTCAGTG 4920

GTGCCTGTGG CCCACTGGAG AGGAGACAGC TATGGCTGGA GTGATTCTCA AACTTCAGAA 4980

TGTCTAAAAT CATCACATGG ACAACTTATT AAGGAAAGCA AATGCCTGGG CTCCATCCTC 5040

AGAGAGTCTC ATTCACTGGG TCAGGATAGA GCCCAGGAAT CTTTACCTTA AAGAACCATC 5100

CCACCTCCCA CCTCATATGA TCCTTATGCA GGTGATCTGG GGGCCCACAC TTTGAGAAAT 5160

AGACTCAGGT CAAAGTGGGC TCTAACTGCA TCTCATTTCT TACCTGGCAT ATCTAATAGT 5220

AGAGAAGAAG ACAATGCTAA GATTTTTGTT GGAGATCTTT TGCTGGGATT GCTGCTTCAT 5280

TCATTCACTC ATTTATTTAT TTATTTATTT ATTTTGAAAC AGAGTCTCAC TTTGTCACCC 5340

AGGCTGGAGG GCAGTGGCAC AATCTGAGCT CACTGCAGCC TCAGGCTCCT GGGTTCAATC 5400

GATTCTCTTG CCTCAGCCTC CCGAGTAGCT GGGATTACAG TCATGCACCA CCACGCCCAA 5460

CTAATTCTTG TATTTTTAGT AGTGACAGCG TTTCACCATG TTAGCTAGAC TGGTCTCGAA 5520

CTCCTGACAT CAGGTAATCT GCCTGCCTCG GCCTCTCAAA ATTAGTAGCT GCAATTACAC 5580

GTGTGAGCTG CCGTGCCTGG CCTGCTGTTT CTTTTAGTTG GGCCTCTTCT GTAATAGAGT 5640

GTGAGAATTC TGACTTGCTG CAACAGTCTG CTTTGAAGCA GGGCTGTGTT TACACTGGTC 5700

AGATGTGGAA TTGTGGGGCA CACTTAGCAG CTTCCTTCTC TAATTTTTCT GTATTTTCAG 5760

GAGAACAATT TTAAAAAATT TAATAAAAAT GCCTTAAAAA TTAACATTAT TATAAGATGA 5820

ATCCCATTTT TCTAATCTTG TAAATTAAAA ACAATCATAA GCATATGAGC ACCTGCACTT 5880

AGGGAATCAA GGTGGCAAAG CTAAACACTT CCAGCTCTAG GTGATTCGCG GCAATACAAA 5940

TGGAGCTGGA CTTTGGCCAC AGTGCAAAAA TATTGATCTG TTGTTAGATG CTCTGAAGTT 6000

TCCACAAAGA ATTGGTTCTG CCTGCTGTGC TTCAGTGCTT AAGGGAAGTG GTTCCTCAAA 6060

ATGTTAGTTT TTAAGCCCAG CTTTCTTAAA TAGGAAGATT CTAATAGTAG CAAAAATATA 6120

AACTGCTTCT AGGTTTAAAA AGGACCAGCA CACAATGGTT ATCACACACC TTTCTCCTCA 6180

GGTGATGAGT GGATGAGTGG CCTGGTGTAT TTCATAACAT CTCCCAGGGT CCAAATGCTA 6240

AAGCAATTGC TGAAAAGATA CCATGTGTAC CGGAACCTTG CAGAGGTATT TTGTTGGCAT 6300

AAAAAGAAAT ATTGATCATC TATAGTAAAA ATGGTTCTAC TTTAATACTA CTGAGAAAAG 6360

ATTTTCTTTT CCCAGATCTA CATCCTGAAT CTTCATGAAG ACAAGATCCC CTAAACTTCC 6420

ACTAACACCA TAATGTGTGC TGTCCTTTGT AATGTAGTCC ACAGATCTCA TAAACTGTCA 6480

GAAATAGCAG AGATTGTAAG GTCATCCACT TCCCCTGTAA GGCCTGCGTC CCTCACTTAC 6540

ATCCCTAATA ACGTCCTCTA ACCTCTGCTG GAGGGCAGAT TTAGCTGCCA GCTGGGAAGA 6600

GCTCTGCCCT AGTCAACATT TTTATCTGTG GCTTTCAGAT GAGAACACTG GATGCTTATC 6660

TGAAAAAAGC TCCTCAGGCT GGAGGGAGGG ATTGGCTCTA ACAAGATGCA ATGTGATAAG 6720

AATAAAAGCG AAGCCAAACT CTAGGCCCAA AGGCTCTAGC AACACACTTT TGAGAACCTT 6780

GGAGACGAGT TTTGGCTGAT GCGAGCTTCT CCGCCTGCTA AAGTAGCCCA TTCCATTTGG 6840

ACGGCTCTAG AGGCTGGCAT GTTCTTCTCC ACGTTGTGTT AATGTACTCC AGTTTCTTCC 6900

TGCCATGAAC TGGCATGCCC TGGCTCCTCC TACCTTCCCC ACTTTAAGTC TTCCCTCCCT 6960

CCTTCTGACC TTCCCATTCC AGCCACACTG GCCTTTTGTC TGGTCCTAAC AAACCATGCC 7020

TTTCCTGCCT CCAAGCCCTA CACCTGCTAT CCATCCCTCT GTCTGAGAGA CACTCCCACC 7080

CCTTCACAAA GCCTGTTTCT CATCCTTCCA GTTCAGATGT CTTCTCAGCT TGCCTCAACT 7140

GACCTCTTTC AGCTATTCTC ACTCTTTGTA CTCTGTTCAT TTCCTTCCTG GCAGTCACCA 7200

TAATTTATCT TTATTTGAAT CAATTTCTTA GTTG AT AT TTAGTTATTT GCACACTCTG 7260

TCTCTCTGTG CCTTTCTTAT TCACTGCAGG CTTTCTTATG TAAGTAATTT ATTTACTTAA 7320

ATTTTTAAAA ATAATTTCAA CTTTTGGCCG GGCACAGTGG CTCACGCCTG TAATCCCAGC 7380

ACTTTGGGAG GCCGAGGTGG GTAGATCAGC TGAGGTCAGG AGTTCGAGAC CAGCCTGGCC 7440

AACATGGTGA AATCCCATCT CTATTTAAAA TACAAAAACT AGCCGGGCGT GGTGGTATGC 7500

ACCTGTAATC CCAGCTACTC GGGAGGTTGA GGGAGGAGAA TCACTTGAAC CGGGGAGGTG 7560

GAGGTTGCAG TGAGCTGAGA TCACGCCATT GCACTCCAGC CTGGGGCACG AGAGTGAGAC 7620

TTCATCTCAA AAAAACAAAA AACAAAAAAC CCCTGCTTTT CAGAGGGGCT GAACTAATTT 7680

ACATTCTCAC CAATAGTGTA TAAGCATTCC CCTTTCTCTA CAGCCTCACT AGCATTTACT 7740

TTTTTAAAAA ACTTTTTAAT AATAGCCATT CTGACTGGTA TGAGATGGTA TCTCCTTGTG 7800

GTTTTCACTT GCAATTCTCT GATGATTAGT GATATTGAGC ATTGTTTTAT GTTTGTTGGC 7860

TGTTCGTATG TCTTCTTTTG AGAAGTGTCT TTTCATATAT TCTGCCCATT TTTTGAATGG 7920

AGTTGTTTTG TGCTTGTTGA ATTAAGTTCC TTATAGATTC TAGATATTAG ACTTTTGTTG 7980

GATGCATAGT TTGTGAATAT TTTCTCCCAT CCTATAGTTC TGTTTACTCT GTTGATAGTT 8040

CCTGTTTTGT TATGTTTTGT TTTTTTGCTG TACAGAAGCT GTTTAATCTA ATTGGTCCCA 8100

CTTGTCAATT TTTGTTTTTG TTGCAATGGC TTTTGAATTT TAATAATAAA TTCTTTCCTA 8160

AGGCTGATGC CCAGAACAGC ATTTTCTAGG TTTTCTTCTA GGATTCTTAT AGTTCAAAGT 8220

CTTATATTTA AGCTTTTAAT CCACCTCAAG TTAATTTTTA TATATAGTGA AATGCAGGGG 8280

TCCTGTTTCA TTCTTTTGCA TGTGGCCAGC CAGCAATCCC AGAACCATTT ATTGAATAAG 8340

GAATCTTTTC CTCATTGCTT ATTTTGTCAA CTTTGTCAAA GATCGGATGA CTGTAGGAGT 8400

GTGGCTTTTT CTGGGTTATC TACTCTGTTA CATTGGTCTA TGTGTCTGTT TTTGTATCAG 8460

TATCATGCTG TTTTTGTTAC TATGGTCTCA TAACATAGTT TAAAGTTGGA TAATGTTATG 8520

CCTCTGCTTT GCTGTTTTTG CTTAAGATTG CTTTGGCTAT TGAGGCTCTT TTTTCACTTC 8580

ATATGAATTT TAGAATAGTT TTTTCTAATT CTTTGAAAAA TGACCTTGGC AGTTTGATAG 8640

GAATAGCATT GAATCTATAG ATTGCTTTGG GCAGTATGCT ATTTTAATGA TATTGATTCT 8700

TCCTATCCAT GAGCATGGAA TATTTTTCCA TTTGTTTGTG TCATCTACTA TTTCCTTTAG 8760

CAATGTTTTT TAGTTTTCCT TGTAGAGATC CTCCTAGGTA TTTCATTTTT TATGTGACTA 8820

TTTTAAATGG GATTGCATTC TTCATGTGGC TCTCAGCTTG AATGTTATTG GTGTATAGAA 8880

ATGCTACAGA GTTTTGTACA CTGATTCTGT ATCCTGAAAC CTTACTGAAG TCATTTATCA 8940

GTTCTAGGAG CCTTTGGCAA AGTCTGTAGT GTTTTCTAGG TATAGAATCA TATCATTAGC 9000

AAAGAAAGAT AGTTTGACTT CTTCTTTTCC TATTTGAATG CCTTTTATTT CTTTCCCTTG 9060

TCTGATTGCT CTTCCAGTAC TACGTTGAAT AGGAGTGCTG AGAGTGAGCA TCCTTGTCTT 9120

GTTCCACCTC TCAAGGGAAA TGGTTCCAGC TTTTGCCCAT TCAATATGAT GTTGGCCATG 9180

GGTTTGTCAC AGATGGCTCT TATTATTTTG AGGTGTATTC CTTTGATGCC TAGTTTGTCA 9240

AAGGCCTTTA TCATGAAGGG ATGTTGGATT TTATTGAAAG CTTTTTCTGG GTCTTATTTG 9300

GTGAATTGCA TTTATTGAAT TGTGCATGTT GAGCCAAACT TCCATCCCAG GGATTAAACC 9360

TACTTAATCA TGGTGTTAAC TTTTTGATGT GCTGCTGGAT TTGGTTTGCT AATTTTTTTT 9420

TTTTTTTTAA AATGGATTCT CCCTCTGTCC CCCAGGCTGG ATTGCAGTGG TGTGATCTTG 9480

GCTCACTGCA AGCTCCACCT CCCGATTTCA TGCCATTCTC CTGCCTCAGC CTCCCGATTA 9540

GCTGGGACTA CAGGCACCCG CTACCATACC CAGCTAATTT TTGTATTTTT TAGTAAAAAC 9600

AGGATTTCAC CATGTTAGCC AGGATGGTCT TGATCTCCTG ACCTCGTGAT CTGCCTGCCT 9660

CAGCCTCCCA AAGTGGCTAG TATTTTTTTA ATTACTATTT TTTCTCACCC TTGCTGCCAT 9720

CTTATGATTT TCTAGTATTT TGTTGAAGAT TTTTGCATCT ATTTTCATCA GGGATATTGG 9780

CCTGTAATTT TCTTTTTTCA TTTCATCTTT ACCACATTTT TGTATCAGGT TCATACTGGC 9840

TTCATAGAAT GAGTTCAGGA ATGGTCCCTC CTCCTCGAAT TTTCTCTGTA GAATTAGTAC 9900

CAGCTCTTTG TGTGTCTGGG AGAAGTTGTA TGCCAATAAT TTAAATGCAG TTAATATTTA 9960

CTGGACAATT TCCTCCAGAT AATTGTATAT GATTTTTGGT CCACCCTGAG TTGATACATG 10020

TATTTTAATT GTATCATGGT ATGAAAAGAG CAAGAGTATT TGGTCACCTA GTCTTGCCTA 10080

TAGATGTGCC TAATGATTCA AAGTAGATAT TTTGGGAGCC TAACAGGTGC CGTGACTAGG 10140

CAGTTTTGTT TTTTTTTTTT TTTGAGACAG AGTCTCGTTA TGCTGCCCAG GCTGGAGTGC 10200

AGTGGCATGA TCTCGGCTCA CTGCAACATC CGCCTCCTGG GTTCAAGCAA TTATACTGCC 10260

TCAGCCTCCC CAGTAGCTGG GACTACAGGC TCACGCCACC ACGCCTGGCT AATTTTTGTA 10320

TTTTTAGTAG AGATGGGGTT TCACCATATT GGCCAGGCTG GTGTTGAACT CCTGGCCTCA 10380

TGATCCACCC GCCTCGGCCT CCCAATGTGC TGGGCTTACA GGCGTGAGCC ACCGCACCCG 10440

GAGATTAGGC AATTTTATAT TCCCAAATAT CCAACTCTTC TGACCCGCTT TCTCAGCCTG 10500

GGTGTATCAG GCACAAGGCC TGTTCAGATT ATGTGGTCTC TGAAGATATG GCTCTCCAGG 10560

GTTGACAATG TGGATAAGGA TTCACCTGGT TTAGGATTTA CACATTCGCC TTGAATGTCT 10620

GTTGCATCAA GTAGACAGTC CATCCCAACT TGGCCATTTG GTCAGAGCTG TAAGGAGACA 10680

AGGAGGTGGG CAGCCGCTGC TGTGAACTGC TTGGACAAAG ACTGCCAAAT AGCTATCAGA 10740

CAGTGTTAAC AACAGCTGAT TTAGGTTTGA AGGGGGCAGT CTCTTGGGCC ACTTACTATG 10800

CTGCATCATC CTCTTTGGAA AATGCTCTTC AGGTAACTGC CTAACAGACT GAGAAAATAA 10860

AATGCTCACA GAGAAAAAAG ACCCGGAAAG TCTGACTTCT CAGAGCTCAG TGTTTAGGTG 10920

CAGAACTGGA TTGTGAAAGG ATTTTTAAAT TTTTTATATT CATTGCAGGG AACATTCATT 10980

TATTCCATCC TTCTCCACTC CCACCTGTCT GTCGTTGTCT TTGTCTCTGT CTCCCCACCT 11040

CTCTCTCTAG ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC 11100

ACACACACAC ACACACACAC ACACACACCC CTATTCATTG CCAACAGTAA TAGAGTTGCT 11160

TCTTTACTTC TTGGAGAGAA AAGCCTCAAT CTGAGGAAGC TGTGCTGACT AGCCTTGCTC 11220

TTAATCATGG AGACAATGCT TTATGCCTTT ATCTTTGCAC AGCTGAAAGC CATGGCAGAA 11280

GCAGTCCTCT AAACGAAATA AAATAGAAAG GTTCCTGCTA AGCCCTGGCA AATGCAGCCT 11340

TCTATCCCTC CCCCAACACT CACAGCTTCT GAGCAAGATG TAGCTGCCTT CCAGGAGGCT 11400

GGGTGATGGG CAATAATGAG CAGAGCCACG TGAAGGAAAG ATGGGTGAAG AAATGTGTGT 11460

GGAGGTCATG CTGGCTGCAC TGACCATGAA ACAAAGGATC TACCCCTCTA GTAACTGCCC 11520

TACTCCTTTG GTAACTGTTC TGAAATTATA ACTTGCCAGA AGTTCAGAAG GACCTAGTGC 11580

AGGTATTAGA GGAAATTCGT AAGATTGAGC CATTTATTCC TGCACAGATA CA AATAATG 11640

GACACGGGCC ATGGTGGCCA GCATTCTTGC TCTTGACAAT GGTGAAGGGA AGGGTTGTAG 11700

GTCATGGCTA TGCTCTCAGA ATTATAATGG AAAGAAACAG CTCCTGAGTG TTTACTATGA 11760

GCCAAGGGCT GTGCTAAACA CTTTACCATA TGATGACATC TTTTTCTCAC AGGTATCAAA 11820

AAACAATAGG ACATACCGGA TAGCTACAAT CTTTGGGCCC CTGCAAACAC AATAATGTGT 11880

ATTCTCTTCT TCAAATCCTA CATATTGCTA CAAACTGTAT CCCTGAGGCA TATTCATTGT 11940

AAAATAAAAA CATATAAAGT ACTACTTTTG TTTTTTGAGA TGGAGTCTCG CTCTGTCACC 12000

CAGACTGGAG TGCAATAGCA TGATCGTGGC TCACTGCAAC CCCCTGCTCC TGGGCTCAAG 12060

TGATTCTCCT GACTCAGCCT CTCAAGTAGC TGGGATTACA GGCGCACGCC CCCATGCCTG 12120

GCTAATTTTT GTACTTTTAA TAGAGACCAG GTTTCACCAT GTTGGCCAGG CTGGTCTCAA 12180

ACTCCTGACC TCAAGTGATC CACCTGCCTC GGCCTTCCAA AGTGCTGGCA TTACAGATGT 12240

GAGCCACTGC ACCCGGCCCA TATAAAGTAC TACTAATGTA ACAGGGTGCT AGTCCAGACA 12300

GTGACCACAC GTGGTGTTCA TTGAAGGCTG GACTAACAAC TCCAGCCTCT CCGCCATCAC 12360

AGAGTGATGA CTGCCTTCCC TGAAGCAAAG CTTCTGGTTC AAGGAAAGGC CAGTAAGTGA 12420

CTGCTCTTTG TTGTATACAT GTTAGATGAT CAGGCCTCAA GAAAAGTATA AAGAGATCTT 12480

TGTGCTCTCT GGGACTCAAA AAGCTGCACT CTTTGGGGGA AGGATAGCCA GGTAAAAGTG 12540

GCCCAGGTAA AGAGGGCCTG GTACACCTGG TTCTGCAAGA TGGTAGACAC AAAAATGAGA 12600

GCCACATTTG GAGCTTATGT GCCCCTAACT CTGTACATAA CCTGCAAGAT CTAATTACTA 12660

ACAACTGGAA TCTTGGAAAC ACCTGTAGTA CATCCTTGGC TAAGGTTAGC CCCAACAGAG 12720

AGGGCTCTCC TCTTACAGAG AACCATTACA TTTGTGCCTT CATCCTAGAG TAGAAAAGGC 12780

ATGATCAGAC TACTAAAAAG ACATCAGGAA AGGGCCTGTG ACATCTGAGG GAAGTGGTTG 12840

CCCTCTCTGG GATGTTGGTT CGGGAAGAGG GGCATGGAGG AGTGCCTGCT TTAGATGGTC 12900

ATTCAGGAAC CCAGGCTGAT AGTGAGAGGT GAAGCCAGCT GGGCTTCTGG GCTAGGGGGG 12960

ACTTGGAGAA CTTTTGTGTC TAGCTAAAGG ATTGTAAATG CACCAATCAG CACTCTGTAA 13020

AATGGACCAA TCAGCACTCT GTAAAATGGA CCAATCAGCA GGATGTGGGC AGGGCCAAAT 13080

AAGGGAATAA AAGCTGGCCA CCAGAGCCAG CAGTGGCAAA CTGCTCAGGT CCCCTTCCAC 13140

GCTGTGGAAG CTTTGTTCTT TTGCTCTTCA CAATAAATCT TGCTGCTGCT CACTCTTTGG 13200

GTCTGCACTA TCTTTATGAG CTGTAACACT CACCGTGAGG GTCTGTGGCT TCATTCCTGA 13260

AGTCAGTGAG ACCACAAACC CACTGGGAGG AACAAACAAC TCTGGACACG CCAACTTTAA 13320

GAGCTGTAAC ATTCACTGCG AAGGTCTGCG GCTTCACCTC TGAAGTCAGC GAGACTATGA 13380

ACCCACTGGA AGGAAGAAAC TCCAGACACA TCTGAACATC TGAAGGAAGA AACTCCAGAC 13440

ACACCATCTT TAAGAGCTGT AACACTCACT GCAAGGGTCT GCGGCTTCAT TCTTGAAGTC 13500

AGCAAGACCA AGAACCCACT GGAAGGAAAC AATTCCGGAC ACATTTTGGT GACCCAGATG 13560

GGACTATCAC CAAGTGGTGA GTACCATCAA CCCCTTTCAC TTGTTATTCT GTCCTATTTT 13620

TCCTTAGAAT TCGGGGGCTA AATATTGGGC ACCTGTCAGC CAGTTAAAAG CGACTAGCAT 13680

GGCTGCCAGA CTTAAGAGAC TAAAGACACG GGTGTCAGAC TTTCTGGGAA AGGGCTCTCT 13740

AATAACCCCC AACTCTTTGG AGTTGGGAGC GTTGGTTTGC CTGGAACCAG CTTCCACATT 13800

TCCTGTACTT CTGGGCTGAG ACGAGGGTCA ACAGAGAGGA AAGCCATTCA GCTCTGGGGT 13860

CCCGACAGCA AGTTGGTTGA CCCTGTGGCC ATGAACAGAA CTCTCGAAGT CATGTTGCCC 13920

AAGCGAGACT CACCCATCTA TCCTATCTAT CCTGACTCTT GCTTCCTGGG TCCTAATGCC 13980

TGGAAGACAA AACTTCCTCT TGTCTCTGTT CTCCAAGGCT AGTCCCACTT CTAAAAACCA 14040

CTCCCTGTCT CTGGTGCTTT TCTAGTTTCT CCTATAAGAA TGATTTCTAG TATAAACTCC 14100

AGGACTCTAT TCTCTTCTTT AGGCACCCGG GCTCACCAAT CAGAAAGCCA TAATTTTTGC 14160

CCAAAGCCCC ATCTTAGGGG GGACTATCTG GAATTTTAGG ATCCCTCCTC AGACAAGCAG 14220

GCCTAACAAA AGCTATTCCT GAAGCTAGGA TATGGGGAGC CTCAGAAATG ATATCCTTCC 14280

TATTCAAGTG AGGACAAAAG GCATCACTCT TCCAATTCTG GAAATCCCTT CCCTCCCTCA 14340

GGGTATGGCC CTCCACTTCA CTTTTGGGGC ATAACGTCTT TATAGGACAC GGGTAAAGTC 14400

CCAATGCTAA CAGGAGAATG TTTAGGACTC TAACAGGTTT TCAAGAATGT GTCGGTAAGG 14460

GCCACTAAAT CCGATTTTTC TCAGTCCTCT TTGTGGTCTA GGAGGACAGG TAAGGGTGCA 14520

GGTTTTCAAA AATGTGTTGG TAAGGGCCAC TAAATCTGAC ATTCCTTGGT CCTCCTTGTG 14580

GTCTAGGAGG AAAACTAGTG TTTCTGCTGC TGCATCAGTG AGCGCAACTA TTCCAATCAA 14640

CAGGGTCCAG GGACCATTGT GGGTTCTTGG GCAAGAGGTG TTTCTGCTGC TGCATTGGTG 14700

GGCTCAACTA TTCCAATCAG CAGGGTCCAG TGACCTTTGC GGGTTCTTGG GTCGGGGGGT 14760

GGGGGGAACA AACAGACCAA AACTGGGGGC AGTTTTGTCT TTCAGATGGG AAACACTCAG 14820

GCACCAACAG GCTCACCCTT GAAATGTATC CTAAGCCATT GGGACTAATT TGACCCGCAA 14880

ACCCTGAAAA AGAGTGGCTC ATTTTATTCT GCACTATGGC CTGGTCCCAA TATTCTCTCT 14940

CTGATGGGGA AAAATGGCCA CCTGAAGGAA GTATAAATTA CAATACTATC CTGCAGCTTG 15000

ACCTTTTCTG TAAGAAGGAA AGCAAATGGA GTGAAATACC TTATGTCCAA ACTTTCTTTT 15060

CATTAAAGGA AAATCCACAA CTATGCAAAA CTTACAATTC ACATCCCACA AGAGGACCTC 15120

TCAGCTTACC CCCATATCAT AGCTTCCCTA TAGCTCCCCT TCCTATTAAT GATAAGCCTC 15180

CTTAATCTCC CCCACCCAGA AGGAAACAAG CAAAGAAATC TCCAAAGGAC CACAAAAACC 15240

CCTGGGCTAT CGGTTATGTC CCCTTCAAGC TGTAGCGGGG GAGGGGAATT TGGCCCAACC 15300

CAGGTACATG TCCCCTTCTC CCTCTCTGAT TTAAAGCAGA TCAAGGCAGA CCAGGGGAAG 15360

CTTTCAGATG ATCCTGATAG GTATACAGAT GTCCTACAGG GTCTAGGGCA AACCTTCAAT 15420

CTCACTTGGA GAGATGTCAT GCTATTGTTA GATCAAACCC TGGCCTTTAA TTTAAAGAAT 15480

GTGGCTTTAG CCACAGCCCG AGAGTTTGGA GATACCTGGT ATCTTAGTCA AGTAAATGAT 15540

AGAATGACAG CTGGGGAAAG GGACAAAGTC TCTCCCGGTC AGCAAGCCAT CCCTAGTGTG 15600

GATCCCCACT GGGACCTAGA CTCAGATCAT TGGGACTGGA GTCGCAAACA TCTGTTGACC 15660

TGTGTTCTAG AAAGACTAAG GAGAATTAGG AAAGAGCCTA TGAATTATTC AATGATGTCC 15720

ACCA AACTC AGGAAAAGGA AGAAAGTCTT GCCTTCCCTT GAGTGGCTAC AGGGAGGCCT 15780

TAAGGAAAAT ATAACTCCCC TGTCACCCAA CTCACTTCAA GGGTTAATTG ATTCTAAAAG 15840

ATATGTTTAT TACTCAATCA GCTGCAGATA TCAGGAGAAA GCTCCAAAAG CAAGCCCTTG 15900

GCCCTGAACA AAATCTGGAG GCATTATTAA ACCTGGCAAC CTTGGTGTTC TATAATAGGG 15960

GCCAAGAGGA GCAGGCCAAA ATGGAAAAGC GAGATAAGAG AAAGGCCACA GCCTTAGTCA 16020

TGGCCCTCAG ACAAACAAAC CTTGGTGGTT CAGAGAGGAC AGAAAATGGA GCAGGCCAAT 16080

CACCCAGTAG GGCTTGTTGT CAGTGTGGTT TGCAAGGACA GTTTAAAAAA GATTGTCCTA 16140

TGAGAAACAA GCTGCCCCCT CACCCATGTC CACTATGCTG AAGCAATCAC TGGAAGCCAC 16200

ACTGCCCCAA AGGACAAAGA TTATCTGGGC CAGAAGCCCC CAAGCAGATG ATCCAACCAC 16260

AGGACTGAGG GTGCTCAGGG TTAGCGCCAG CTCATGTCAT CACCCTCACT GAGCCCTGGG 16320

TACATTTAAC CATTGAGGGC CAGGAAATTG ACTTCCTACT GGACACTGGT GCGGCTTTCT 16380

CAGTGTTAAC CTCCTGTCCT GGACAGCTGT CCTCAAGGTC TGTTACCATC CGAGGAATCC 16440

TGGGACAGCC TATATCCAGG TATTTCTCCC ACCTCCTCAG TTGTAACTGG GAGACTTTGC 16500

TACAGATAGT AAGTATGCTT ACCTAATCCT ACATGCCCAT GCTGCGATAT GGAAAGAAAG 16560

GGAATTCCTA ACTTCTGGGT GAACCCCCAT TAAATATCAC AAGGAAACTA TGGAGTTATT 16620

GCACACAGTG CAAAAACCCA AGGAGGTGGC GGTCTTACAT TGCCGAAGCC ATCAAAAGGG 16680

GAAGGAGAGG GGAGAACTGC AGCATAAGTG GCTGGCAGAG GCAGGGAAAG ACAAGCAGAA 16740

AGGAAAGAGA GAAAGAGCAG AAAGTGAGAG AGAAAGAGAG ATAGGAAGTG ATAGCAAAGA 16800

GGGAGTCCGA AAGAAAAGAG AGAGGAGAGA GAGAGGGGGA AAGACAGAGA GAGACAGAGG 16860

AAGAGACAGA GAGACAGAAA GAGAGAAGCA AAGAGAGGAA GAGACAAAGA AGGAGTCAAA 16920

GAGAGGGAAA GAGAAGTAGT AAAGAAAAAA CAGTGTACCC TATTCCTTTA AAAGCCAGGT 16980

TAAATTTAAA ACCTATAATT GATAATTGAA GGCCTTTTCT GTAACCCTAT AATACTCCAA 17040

TACCACCTTG TTGTCAGTGT AAACAAGGGT ATAGCCCAAA AGCACTGAGG CCACTGACAA 17100

CCCGTAGCCT TCTTATCAAA AATCCTTAAC ACAGCAGGTT TCCTAACAGG GAATCTAAAT 17160

CTTAAGGTCG GACCAGACAT AGGAGGAACT GCCTTCAGGA CAGGATGATA GATGGTTCCT 17220

CCCAGGTGAT TAAGGAAAAA GACACAATGG GTATTCAGTA AGTGATAAGG AAACTCTTAT 17280

AGAAGCAGAG TTAGGAAAAT TGCCTAATAA GTGGTCTGCT CAAACGTTGA AGCTGTTTGC 17340

TGTTTGCACT CAGCTAAACC TTAAAGTACT TACAGAATCA GGAAGGAGCC ATCTATACCA 17400

ATTCTAAGTT AATATGGACT GAACGAGGTT TTATTAATAG CAAAGAAAAT TAAAATCTCA 17460

AACTTACAAG GTTTTCAACT AAAGTAAAGT TTGCTAAAAG TTAACAGCGT AACATGTATT 17520

ATCCTACTAC CTCACACTCT CTCAAAGGAT TTCTCAGACA GTTTGCAAAA AAGAACGAAA 17580

TCTGTCCTTA CTCTACAATC CCAAATAGAC TCTTTGGCAG CAGTGACTCT CCAAAACCGC 17640

TGAGGCCTAG ACTCTCTTAC TGCTGAGAAA GGAAGATTCT GCACTTCTTA GGGGTAGAGT 17700

GTTGTTTTTA TACTAACCAG TCAGGGATAA TATGAGATAC CACCCAGTGT TTACAGGAAA 17760

AGGCTTCTGA AATCAGACAA TGCCTTTCAA ACTCTTATAC CAACCTCTGG AGTTGGGCGA 17820

CATGGCTTCT CCCCTTTCTA GGTCCTGTGA CAGCCATCTT GCTAATAGTC GCATTTGGGC 17880

CCTGTATTTT TAACCTCTTG GTCAAATTTG TTTCCTCTAG GATCGAGGCC ATCAAGCTAC 17940

AGATGATCTT ACAAATGTAA CCCCAAATGA GCTCAACTAA CAACTTCTGC TGAGGACCCC 18000

TGGACCGACC CGCTGGCCCT TTCAATGGCC TAAAGAGCTC CCCTCTGGAG GACACTACCA 18060

CTGCAGGGCC CCTTCTTCAC CCCTATCCAG CAGGAAGTAG CTACAGCGGT CATCGCCAAA 18120

TCCCAACAGC AGCTGGGGTG TCCTGTTTGG AGGGGGGATT GAGAGGTGAA GCCAGCTGGG 18180

CTTCTGGGTC AGGTGGGGAC TTGGAGAACT TTTGTGTCTA GCTAAAGGAT TGTAAATGCA 18240

CCAATCAGCA CTCTGTGTCT AGCTAAAGGA TTGTAAATGC ACCAATCAGC ACTCTGTAAA 18300

ATGGACCAAT CAGCAGGATG TGGGCGGGGT CAAATAAGGG AGTAAAAACT GGCCACCCGA 18360

GCCAGCAGTG GCAACCCACT CGGGTCCCCT TCCACACTGT GGAAGCTTTG TTCTTTTGCT 18420

CTTCACAATA AATCTTGCTG CTGCTCATTC TTTGTGTCCA CACTACCTTT ATGAGCTGTA 18480

ACACTCACTG CGAGGGTCTG TGGCTTCATT CCTGAAGTCA ACAGACCACG AACCCACTGG 18540

AAGGAACAAA GAACTCCCGA TGTGCTGCCT TTAAGAGCTG TAACACTCAC TGCGAAGCTC 18600

TGCAGCTTCA CTCCTGAAGT CAGTGAGACC ACAAACCCAC CAGAAGGAAG AAACTCTGGA 18660

CACACCTGAA TATCTGAAGG AACAAACTCC AGACACACCA TCTTTCAGAG CTGTAACACT 18720

CACCGCAAGG GTCTGTGGCT TCATTCTTGA AGTCAGCAAG ACCAAGAACC CACCGGAAGG 18780

AACAAATTCC AGACACAGTA GGAAATCTGT ATTTTTGATC TGTGGCTTCC AGGGTTACTC 18840

CAGTCATTGA AGTCTCCATT GCAGCCTTAA GGAAACAGAG AATGGTTTGG AGGAGCACAT 18900

GTGGGAATTG TTATGGACCA GGCTTGAGAT GCACATAGGG CATTTCTGAT CAAACCTAGC 18960

TGGAAGCAGG GCCAGGAAAT ATAATCTAAG GAAGACAGTT TTTGTAGACA GTAGTAGTCT 19020

TTGCATCTGA GACATGTAGA TTATCAAGCA ATTAATTAGA AAAAATATAG CCAGGTGCGA 19080

TGGCTCATGC CTGTAATCCC AGCACTTTGG GAGGCCAAGG GGTGTGGATC ACAAGGTCAG 19140

GCGTTCGAGA CCAGCCTGGC CAACATGGTG AAACCCCGTC TCTACTAAAA ATACAAAAAT 19200

TAGCCTGGTG TGGTGGCACG CATCTGTAAT CCCAGTACTC AGGAGGCTGA GGCAGGGGAA 19260

TCTCTTGAAC TTGGGAGGCA GAGGTTGCAG TGAGCCAAGA TCACACCACA GCACTCCATC 19320

CTGGGTGACA GAGCGAGACT CTGTCTCAAA AAAAAAAAAA AAAAAAGGAA AGGAAAATAT 19380

AATCAAGAAT ATTGACAGGT AACATTTATT CAACACTTAC TATGCACCAG GCAATACACT 19440

AAGTGTTTTA CATGGATTAA CTCATTTAAT CTTAACAATA GCCCTATGAA GTCAGTGCTG 19500

TTATTATCTC CACTTTATAG ATAAGGAAAC TGAAGTACAG AAAGGTCAAG TAGAGAAATG 19560

GCCATGCTTG CATTCTCAGT TTTTGAAGCA ACTGTTACAG GAATCTGGTG TGAGAAATGC 19620

TCTAACAAGA TGTGAGTCAG GGGTTGGGAG GTACTGAGTC TGAGTTGGGC AGTTGGGGAT 19680

GGAAGGATGG ATGAAGAACA GCTTGACAGA GAAGCTGACA CTTGGCAACT CTGTGGGACC 19740

TTGAAGGGTT AGAGGGACTT CACCAAAGAA ACTGGTGGTC AGGGAAACGG GAGGGTCACG 19800

GCAAGGAGGG AAAGGAAACT GTACCACAGC AGAGAGTCTG AAGCTACTAC AGTGTAGTTC 19860

AGCGTATAAA GAATAATTAT TTTAAGGTAA ACTTATAACC TCATGCAAAT ATAAAATGAA 19920

CACGTGTCAA AGATCTTATT TAATTTATTA ATTAATGAGG GAACCTGTAA GATGTTACAG 19980

CCAGTTCAAA GGATAATTCA AATAAATCCA TGCACATATG TAGGCAATAA GGAATGCTGA 20040

AATGAATTTA AAAGTAGATG TAAACTGATT TATCCACAGA GAAATAATCA GTTGCATTTC 20100

ACATAACAAA ATTCAGTTGC TTTTCTACAG AAGGAATTGT TTGCATCATT ACCAATTTTT 20160

CTACAACTAA CAGAATTATA AAATAACTCA AACACAATGA AAGGCAGATA TAACCCACAA 20220

TGGTATGATA GATACAATAT CCACATCCAG GATGTTTTTT TCTCATTTCA AAGTCTTTCA 20280

CAAGTTTTCC TGATAAGGGA GTGTCAATAA TACTGTATGG CAGGCAATAA GACTGGATGG 20340

ATGGTTGGGG CCAGGTTTTA AGGGGTAATA AATGCCATGT AAAGGTATGT GCATACTGTG 20400

CAACATGTCG GGGGAATCTC AAATTATTGG TAGAGTATGT AGGAAACACT TGTGGGAGCT 20460

TGTTAATAAA TTCAAATTCC CAGACCCAAC TCCTCAAGGG GTCTAATACA GTAGGTTTGG 20520

AGTAAAGCCT GAAAATCTGC AATTGTGCAA AAAAAAAACC CAGGTGATTC TGATACACTT 20580

TGAGAAGCAC TGGTGGAACT AATAGTCACT GAACGTTTTT GAGCAGGGGA GAAACCTGAG 20640

GACGTCTATG TTGCAGCAGT GGAAACTTGA TTAGAAGTAG GAGAAGATGC ATGGTCTTAA 20700

AAGAATGCAA AATGATGGCT AATATTTGAG TGCTTATGAT GGGCCAGGGG CTGTGCTAGG 20760

CGCGTGGCAC ACATTCAATA CGATGGAAGC CTGTACCAGT CAGTATTAGT GGGGTATCTT 20820

TAAGAGTGAC CAGAATTAAG GGGGGTTTTC ACCAAAGCCT GAGGACTGAG CCTCCTCATC 20880

CTAAATTCAG ACACAATGCT GTACCTATGC ATTTGCCTCC AGGCTGTTCC TGGGCCTCCA 20940

GGGACTGGCC CAGGCTCCTG ATAAATAGGG ACTCCCAACA ACATAAAGCC TGGATTTTGG 21000

AACTTCCTGA ATGTTACTCA GGCTTTCTAG TAACTGTGGA GATCTGAATA ATAACACAAT 21060

TCTAAGTTCC CCTACTCATA AAGCTGCTCA TCATTTAGAT GGGGTAAAGC ACCTGAAATA 21120

CAATGAGCAT CACTATTTTC ATTCATCCAT GAAATGAACA TTCCGGGGAG ATCAGTAAGT 21180

TGATGTATCA CCCTTGAACA GGGCAAAATG AATACTCACC AGGAATATGT GGTATTTTAA 21240

AAAGAAGGCA AAGGGAAGAA TAGTGGGGAT GGGGCAAAAA CTTTAAATAG ATTCCCCCAA 21300

TCATATATGG CAATTGAAGA TAATTAAATT ATCATTTTAA TTGAGTAAGT ACTCATAGAG 21360

CCCTCACTAT TTGAAAATGA ACTGCCTCCT AATTGTTATT GTGCAAATGT GATACATTAA 21420

ACTTAAGCTA TTTTAATAAA ACATCCATTT TCGGAAGCTG TAGTAGGTTC TCCCAGGTCA 21480

GATTTGATAA GCCATAAAGA ACAAATGCCA ACTCCTATTT TTCTATGGTG CTGGGAAATA 21540

AGAGAGAAAT GTGTAATTCA AAGCAATCAT TTAATTTTAT CCAATAGCTT GATTCTCCTC 21600

TCTCTTCTAG CCTTTTAGCT AAGCTGTTAC CAAGTAACCA CACTAGTTGG CTTGAGTCTT 21660

ACCACTGTTT CCCTGACCCC ACAGTGGAGA GACTGCATCT GTTAAAGAGC AGTTATGTAA 21720

CCATGGCTAT GCTGAGCTGG GATTCCCAAG GCTTAGGTTC TTTCTGTGAA TGACCTTCAC 21780

CAAGACACCT GAGGTCTGTG TGGAACCACA GGCTTGTCAT CTCTAAGGCA GAGTTGATAA 21840

TTCCATCTGT TTCTTGAGCC CACACTGAGA AAAAGATTAC ATGACTGCAG TTATTTGAAT 21900

GCCTCATGGA AAGACGTCTT ATAAATATTA TAATTAATGT TATCATTAAG TAATGCTTCA 21960

ATGCAGATCT TCCAAGTATA AATATCAGCT GAGTAAGAAG TCAATCTTCC CTGAAGCAAA 22020

ATTGAAATTT GTAAATGCGA TTTCTGGGAG CTTATTTTGT AATACATGAT TCCAGAGTGT 22080

CCATAACACA CACAATTGTC TTTTTTCCCC TACATGGGCT ATTTACAACA AAATTGGACT 22140

TATAATGTTT ATTTCCAGGG ATGACTAGAA CTTTAATAAC AAACCTTGGG CCAGGCATAG 22200

TGGCTCATGC CTATAATCAC AGCACTTCGG GAGGCTGAGG CTGGTAGATT ACTTGAGGCC 22260

AGGAGTTTGA GAACAGCCTG GCCAACATGG CAAAACCCTG TCTCTACTAA AAATATAAAA 22320

ATTAGCCGGG TGTGGTGGCG CATGCCAGTA ATCCCAGTTA CTAAGTAGGC TGAGGTACGA 22380

CAATCGCTGG AACCTGGGAG GCGGAGGTTG CAGTGAGCTG AGATTGCACT ACTGCACTCC 22440

AGCCTGGGTG ACAGAGAAAG ACTCTGTCTC AAAAAAAAAA AAAAAAAATA ATAATAATAA 22500

TAAACCCTGA TGAAAGGTTT CTAAAATGTT TTCATCTAAT GGTTTTCTTG ACAATTAAAT 22560

TTTCTATATA ATGTCAGTTC ATAAAAAAAC TGAGAACGAC CACATGTCAT ATCGACTGCT 22620

TAAAAGAAAA TACGTATATT TACAAACATA TACACGATAC TGTCTTTTGT CTGGTTAGTT 22680

TAGAGGTTAG ATAAACTGCA GTATGTTGTA GTGGACAGAT CATAGAACTA GGAGTCAGGA 22740

TGTCTGGATT CCTAGGAAGC AATGAATAGG TTGCACGGTG CAGACCAGCA TCATGAGTAT 22800

CCTCAGGGAG CTTGTTAGAA CTGCAGATCC TTTAACTCAT TGAATCAGAA TCCCTAGGTG 22860

TGGGGCCCTG AAATCTGTAT TTTAGCAGGC TCTCTGGGAT TGTGATGTGC CTTAGAGTTT 22920

GACAACCACT GGGTAGCTGA TCCTGACTTA GACTTATCAG GCATGTGATC TTGAACAAGT 22980

CACATAATCT CACTGAGTTC AGTTTTCTTA TGCTTAAAAT AGGCCCAATA ATATCTATTT 23040

CACATGGACT GCTTTGAGGA TTAGGCAAGA GATCTGTAAC AGACACTGTA GAACAGTGTC 23100

TCTGGTCTAC AGCTGACCTT CCATAAATGG TAGTTGCCTT GATCTCTGCT CTGCCACATA 23160

ATAGCTGGTT AACTATGAGC AAGTAATTTA GTTCTTCTCA GTTTAGTTTC TTCACCTGTA 23220

AAAGAAGGAA AATAACTGTT ATACTCAATT TCTGAAGTGG CTATAAAAAT CAGTTTAAAT 23280

TATGGGCATT GAAGCTCTTT GTACACTGTA TAAGGACTGT ACATCTAAGG GATTAATGAG 23340

ACCAGGCTTA TGATTTTAAG CATGGAGTAA ATAGTAACAC TGACTCTGTT CTATGAACCA 23400

CATGGAAACT CTAAAGAATA TGCACATTTG AAACACAGGT ATCATCTGGG GAAGGTGATC 23460

TGCTCACCCA AACCAGTTCA TGAACATCAA TCTCCAGTGG CGTGCTGGAG CTAGCTGTAC 23520

CAGCTCATGA GGGCCAATTG TTTCATTTTT AGGAATTTTG TTTGCTGGTT AAAAATAGTC 23580

ATTATTTAAA ATTAAATTAT GTAAACAATA ATATTAGATA AAATAAGTTA AAATAAAAAC 23640

AAAGGAACTA ATTATCCCCA AACTCTTCCC CACCTAATTA TTTTACTATC TGTGCCTTGG 23700

GATTATTTAC ATTGATTTTA TCCATATGGT GACAATACTA TTCATATATA AATGGTGTGC 23760

TTCTCTTCAT AACTCTACAT AGCCTGATGT CAGGCTAGTA GCTTGAAATT GGCCACAGTG 23820

GGAGTGTGAG CATTTGTACC ATGAGGCTTG GCCAAGGCTA CAAATCCAGA CTTTTGTTTT 23880

TCCCTCCTGG AGAGCTGTCT GTTAAAAATT TACCAACACA CCACTGGTCT TACCTTTGTT 23940

AATTTACCAC AGTCCAGGTT CTGACCTAGA CTTAGAAACC TGGATTTGTC AGCAAGCTGA 24000

GGATAGAGCC ATTATTTTTA AGAAGGACTC ACATTACCCA AGTGCAAAGC CTGATATATA 24060

CCTTCAGAAT ATCAATTTAT TAATTTACAG TGAAGAAAGC CACCCCAGGG CATTCCCCAG 24120

GGGAAGGCAA AAAGAGC AG TTGCACATTT TGAATGTTTG ATGACATTAG GGTAAGGTGA 24180

CACAGAATAT CCATTTCCAC AACTGAGATA CCTGCTGCCT TAAGGAAGGG ACAGGCAAGT 24240

CCTTGGGCAG GACCTTAGAT TGTCACTGTC CATCTTGCTC TAGGACTCTC CTTTCCAGGC 24300

ATGACGATGG CCAACTCTGT CCTCCTACCC TACTGATGGG ATTATCTTTT CTTGACACAT 24360

GGCAATGCCT CCAATCAGAG GCTGGTAGCT ATTTTTAATC TTCAGGGCAG TATTTTTCAA 24420

AGGGAAGTTC ATGGACCATA TGCATCTGTA TCATTTAGAT GTATATTAAA AATGCTTAGT 24480

CTTCCCCAGT TATACTAGAT CAGAATCTCT GTTGGTGGGG CCCACGAATC GGTATTTTCA 24540

ACAAATCACT AGGTAATTTC TGTATATACT ATAGTGTGAA GACCACTGCT TGAAGGTTTC 24600

TTTGCATATC TCCACTAAAT ATAAAAAATA TTGACTTCTA GATTTAACTC CCAAAGCACT 24660

TGCATTTTTA AGTTTCTGGG GGCATTATAT TGTGGTACCC CTATACCACT CACACTCTAG 24720

TCAGGAGGTA TATTATGGAC TGAATGTTTG TGTCCCTCCA AAACTCATAT GTTGAAGTCT 24780

TAGCTTCCAA TGTGATAGTA TTAGGAGATG GTGCCTTCTG GAGGTAAAAT CAAGCCCTCA 24840

TGAATGGGAT TAGTGCCTTT AGAAAGAGAG CTCCGTCACT GTCTTTCCAT CAATTGAAGA 24900

TGCAGTGAGA AGCTGGTAGT CTTGCATCTG GAAGAGGGCC CTCACACAAC CTGATCATGC 24960

TGGCACCTGG TCTCAGACTT TCTGCCTCCA GAACTATGAG ATGATAAATT TCTGTTGTTC 25020

ATACCCCACC CAGGCTACAA TATTAGGTTG CTGCAAAGTA TTTGTGATTT TTGCCTTTAC 25080

TTTTCAGGGC AAAAACTGCA ATTACTTTTG TGCCAACCTA ATATTTTGTT ATAGCAGCCC 25140

GAACTAAGGC AAGGGAGACT ACATCAGACA GTGTAGCTAT GTAAGTACAA ATGTATCCCT 25200

GTTGAGGAAA ACTAAGTTCT AACCCTGACT TCAGGCCAGT AGCCACCTTT TCAATCTCTT 25260

TCATGAAGGG ACCATTATCA TTATCACTGG TGGCAAAAAT AGAGGCACGA GAATGGAATT 25320

TGCTTTTCTG TGAAATCTCA GTGTATACAG ATTGAAGAGC AAGGGTTTGC TTTCATCTCT 25380

AAGAAGCAAA AGTGAGTACG GACTGGCACA TTATCAGAGA AAGAATCATT CTAGCTCGGT 25440

GGGTCTTAAC CAGGAGTGAA TTTGACTCCA GGGAACAGTT GGCAATGTCT GGAGACGTTT 25500

TTATTTGTTA TAGCTGGGGG ATGAGTGGGT GGGTTGCTAC TGGCATCTAG TGGGTGGAGA 25560

CCAGAGATGC TGTTAAACAT CCCGCAAAGC ACAGGACAGT CCCCGACAAC AAAGAATTAT 25620

CTGGCCCCAA ATATCAATAG TGCCAAAGTT GAGAAACCTC ATTCTAGCTT CCTTTTCCCT 25680

TCTACGTTCT AATCAACTGT TGTTCTTTCA GCATTAGGAT TCATCCAGCA GTCTCTTTCC 25740

CCAGCAATTT GTTGAAATTT TTTTAAAAAT GGACTCATTT TAGTGTCACA AGAAAAAAAT 25800

ACATTCACAG GAAAGGATGG GTCATTTTGT TTAATGATGT TTTGCCTTTC ACATAGCAAA 25860

AGCTTAATAA AGTATTTTTA AATAAAATGG TGAATAGATC AAAACATTAA TTTCACATGT 25920

GTTTTAATAA ATAACAGGAA GATGGCTATA TTATATAAAT TGTTCTTGTA TATGTCTTGA 25980

GTGGATCATC AAACACAAAC GTATCTACAT GCCTTTTCTT GTGAATAGAT CTAATAATAA 26040

CGCTCTTCTA AAAACAAATT AAATGGATAT TATTTGCTGA GAATGTAATG CTTGTGTGAA 26100

TAGAAGCCAG CCCTGAATCC AAGCCCCCAG ATCTATTTAA AGAATTTGAA GAATGTCAGA 26160

AAAGCACGTG GCTTCAAGGT TAATGTGTAA GACTCACAGA AACTTGAAAA ATCACTATGA 26220

CTAAAAAGAA AGTATGAGCT CCCTGCATGC CTGTAAATTG GAATGACAGC CAAAACCAGT 26280

TAATTATAAA AACAGCTAAT TTAACAGGTT TTCAAATTTG TTTCTTTCTC CAAGTAGCAT 26340

ATAGTCAATA ATCCTTAAAG AGAAAGCAAA GAAGGGGAAG CACTGAACCA AATTTGCTTT 26400

TTTGTACCTG CTCAGCTCAA ATGCAGAGTT CTCTACCTGG AAATTGACTG CTTCCATAGT 26460

TTGATAGCCA CAGAGAGATG GGAACAGAAG GAGAGGTATA ATCCCAGACT TGATTCAGCT 26520

ATAGAGAATG ACAATAGTGT CAGAGGCCTT CCAACCAGAG CGACTCCATC TTGAATACGG 26580

GCTGGGTAAA ACAGGGCTGA GACCTACTGG GCTGCATTCC CAGGAGGCTA AGCATTCTAA 26640

GTCACAGGAT GAGACAGGAG GTCAGCACAA GACCTTGCTG ATAAAACAGG TTGTAATAAA 26700

GAAGCCAGCC AAAACCCACC AAAACCAAGA TGGCCATGAG AGTTATCTGT GGTTGGTCTC 26760

ACTGCTCATT GTATGCTAAT TATAATGTAT TAGCATGTTA AAAGACACTC CCACCAGTGC 26820

TATGACAGTT TACAGGTACA TTGGCAACTT CCGGAAGTTA CCCTCTATGG TCTAAAAAGG 26880

GGAGGAACCC TCACCTCCCA GAATTGCCCA CCCCTTTCCT GGAAAACTTG TGAATAATTC 26940

ACCCTTGTTC AGCATATAAT CAAGAAGTAA CTGTAAGTAT CCTTAGGCCA GAAGCTCAGG 27000

CCACTGCTCT GAATGTGGAA TAGCCATTCT TTTATCCTTT ACTTTCTTAA TAAACTTGCT 27060

TTCACTTTAC TGTATGGACT CCCTGTGAAT TCTTTCTTGC AAGAGATCCA AGAACTCTCT 27120

CTTGGGGTCT GGATCAGGAC CTCTTTCCAG TAACAATAGT AGTAAGGGGT CAGGGAGACT 27180

GGACAAAGGA GTTTAAGAAG CCTTAGATAA AGGGTCCTCA TCATTGTCAT AACATAAAAT 27240

CATGGACTCC TAGAATTTTA TAGCTGATAG GATTAGAAAT TTCAAAATTC AATTTCATTA 27300

ATTTTCATCT GCGAAAACAG ATGGCCAGAG AGGCCAAACA ATTTGTTAAG GAGCACTGAG 27360

GGCAGACCAC ACTGGAACGC AAACCTCTTA GCAGAGTATA CAAGGCCTTT GATCTCCTCA 27420

GTCAGAATGA ACTAGAGCTT TCCAGGGTAC CCTTTCTGAC TGTTTAGCAT GTTTGCCAGT 27480

CTGACTAATT TTGAAGTTGC TTAAATATCT GTCATTTCCA CTGTATCATA ATCTCCTCAT 27540

TCATCTTCAA TCTCCAATGC CTTGAACTCA GTAAATGTTA GTTGAACAAA AGTAAATTGA 27600

ACCCAGAATT TCTGATCATA ATCTGGAGCA CTTTAAAATT GTCAGCTTAC TGGGAAACGG 27660

GATAACATGT GATTTGTCTT TGATTTTTTT TTTCTCATAT GCTTTTTCCA CCTATAGATG 27720

CTACACGAAT GTTTTTAAAA TCTGATATAA AAATTAAAAT TAAAAAATTA AAAAAAGAAA 27780

ATTTGATACA ATGCTACATT TAGAGTGTTG TGATTAGATT CCTTAAGTGT ATCATGGTGA 27840

TCTCTACATC ACGTGGTGAT CAAATTGCTT TGGGTTTTAA CACATAACTG ACAAAGGCTT 27900

GGGGACATGT AAGATCCCAA ATACATTTTT ATTGATTTTT TTTTCTTGTT TGTCCTCTTT 27960

TAAATAACTT TTTTTTGTTA TAAGAATAAT TCATGTTCAG TGGAGAAACC ATAGAAAATA 28020

GTGACAAGTG AAGGAATAAA TTTAAAATGA CCCATAATTG TACCATACAT TCTGATTTTT 28080

TAAACGCTGA ACAAATTAGC CTTGGGTAAG TACCAGGAAT AGAGTGCAGC ATTGAAAGTT 28140

AAAGTTTGGG GAAGGATAGC TGACTTAAGA AATTATC AG TTAGACATTT TTTGATGGGG 28200

TAATTTTGCA GATGACATTA GTGAGAGAAA GGACTTGCCA CTCTCACACA GCTAGTAGGG 28260

GTGTGGGAGG ATATTGGAAC CAAGTTTCAA GTCTTCAGTG AAGAATCAAG GGAGAAGTTC 28320

TAAAACCTAA CAATATCCCT CTGGATGGAC ATTTATTTTA TTACTACAAT AAGCCACACG 28380

GTGAGTCATA AGGAGCATTT CATTCTTCTA ATATGTCTCT ACTGTATTTA GAATCTGATA 28440

AAGCCCCTAT TAGAATTCAT CTCTTTAAGA ATAAAAGAAG CTGAGGAACT AAAGAGAGGG 28500

TTGGAATAAT CCACTAATTA TATCCGTTAA GCTTCAGTTA CGCTAATAAG GAATATCACA 28560

TGACTGTGGT GTGTGCTTGT TCTGAACAGT AAAGTACATG AGGAAAGATA AGATTCAGGG 28620

CTGAAATGTC CTTCAGCATA TGTAGGTAGT GGTGATGAAA GTCATTAAAA GAAAAATTGA 28680

TTGAGGTATT TTAGTAAACA AAAGAACTCA CCACTTACCC ATCAGGAAGT GTATTGTTAA 28740

TGCAGTGCTG TTCAGCCTTC TGGAAGAAAA GGTTTCTTCA TGCTTCTCTC TTTAGCCTAA 28800

TTCTTATCCT GTCACTTTTC AGGCAAAATT AAAAAAAAAA AAAGATTGAA AACGATGCTC 28860

CTATTTTATT TGCTTCAAAA GAAACAGGCT GTTGCATTGT GCTTGGAACA GTTTACTCTT 28920

GGCCTTGATG TAAGTGTGAA AGGAAGCCCA TGTAATTGAC TAGGCAGTAT CTGAAGAAGC 28980

AGGAAATACA GTGTTAAGAA AATGAACAGG CATGAAAACC ATGGCTATTT GATAAAAGTA 29040

AATAATTTCT GCAGTTCACA TGTTCTCAGC ATATTTTCTT TGATACTGAC TTGCTTAATA 29100

TGACAATAGC AGAACCATGG TAGCTTGTAG GCATTACTTT TCTTTTAATT TCTTTTACAT 29160

TTTGAATTTA CCAGCACTCA CATTTGTATT ACTTTTGGGT TATACTGAGG ATCTATAACT 29220

TATAGATCAA ATACCTGACA TATATATGCA TTCTCTGAAG TCTTAGGGCA GAACTAGAAC 29280

ATTCTTGTGA ACATCAGTAT AAGATATTAA AATGGAAGTT TTGCCTAAGA CTGAAGACAA 29340

TAAAAATATC ATAGTCTGAA ATGAATGCCA GCACACCATA CAGGATTTAA ATATCTATAC 29400

ATATATATGT GTGTGTATTA TA ATATTTA ATATATATCT GTGTGGGATA GGAAGAGGTA 29460

GGGGGAAATC AGTTTTACAA TTATTAAGTA TTTCACCCTT GACAAGAGTA TATATATTGG 29520

AAATCAGTTG GAGAGTATTT TCAAAGATAA ATGTTAGTGT GCTATGAATG AATCCACCCC 29580

TACCACCACT GAGGCAGGGT AGGAGAGGCC TGTGCTCCTC AAGCATAGTT GGAAAAGGAC 29640

CTCAACAAGA CCACTTCAAG AGTCTAATGT GTGGAGACTG TTGCTTAGGG AGACCTTATG 29700

GTCTAGCTTC TGACTCACAG CTAAGTCAGG GAGACAGGTT GGCTGCTCTG ATCGTGGAGT 29760

CCAAAAGATG GCCTGCACTG AAAAGCCTCA TGAGTGTTGA CTTAGGGCTA GTCTAAGAGG 29820

TCCCTGGAAG AAGAAACACT CAGTAGGAGA GAAGCTGGAG GTACCTTCAG TGCTGAATTG 29880

GAACCTAGAT TCATTCCCCC GTGGAGCAAA TTACATAGGA AAGATGCCCA GTGATGGAGA 29940

GTGGGGGTGT CTCTAACAAT TACCCACCCA CCTGCCCCCA CCCCTAAGAA AAAGAAAATC 30000

ACATACAACC AGTCAGCTGT AAACATATGC CGAGCCTAGT AAACTCAGAT ACTAAGTTAC 30060

CAGGGTACCT GGCAAGTAAG AACATTCCTG ATTCCCTTCC CTCCTCTTCC TCTTTGCCCT 30120

CCAACCTTAG TGGCTAGCAA GATGGGGAGA GGAGGAGAAG CTGTAAGTGG GGAAAAAAGA 30180

GCAGCTTTCT CTCCTTTTCA GCTGCTGGAT TCTCCCTCAT CATAGGCCTG AGCTGGGGAA 30240

TCAGGAAGAA GGATTCTTTT TAAAACTGAA GTAACGTTAT CATTTAATTT TAAAACATTT 30300

TAAATTTTGA CAATGTTGAG ATTAGATATA CTAATTATTA AACTAAGATT ATGTTTTGCA 30360

GCTTGAAGTG ATAAGAAAAA CCTCTTATCT AAGAGCATCC AGGAAAGTCG GGGGTTTCCT 30420

GAACATCCTT TTAAATCCTT TGGAAGTCAG CTTTCAGAGA GGATTTAAAG TGTAGACTGG 30480

GCCTTCAGAA ACTTGGTTAA TGTAGGGGTT TCCTATGCAG ACTTGGGGAC TATACCTTGT 30540

GTGGAAGAGA GAAAATAAGA TTATCTTACA TTTTTCCCAT TCCTTTTTCA AAAAGAAAGC 30600

TCAGCTAGCA TGAAAGTTAA ATTCAAAACG TAATGGGTAT TATTTGCATA TTCAAATCTA 30660

GTGCATATCA TGTAAGTACT GAATTATGGT ATTCATTATT TCAAATGACA AGCTGGATTT 30720

TTTTTTCTTT CGAATTTCAC AAATTAATTT TCCTTGGAAC CTTTTGGTTT GGGCTTTAAG 30780

AGTTTAGGCT TTCATCACAA AGAGAGGACA GCCTTGAAGA TTAAAGTGTG TGGCTCTTCT 30840

CAAGATGTTC TTAGTCCAGC AAAGGATTCT ATGCATATTT GGGCTTCCTT CTGTCTCATA 30900

ACCTGTATTT CTTGATATTC TATTTATATT CTGTAAGATT TTTTTTTTAA AGGAAAAATT 30960

CTTCCATGGT TGAAGGACAT GTCAAAAATA GAGGATACAG TTTTATATCA AAGGAAGTTT 31020

CATGATATGA CTGTAGAAGC TCATTTGACT TAAGACACAT CATTTCCTCA TGGAAGTGTT 31080

AAACAGATCT GTACAATAAG GTTGGCAATC TTTGTGTAAA ACAGTTTTTT TTCTCCTGCT 31140

CTAAAGAAAG TGTATATTTC AAAATGTGAA TGTCAGCAGT CAGAAAATAG TATTTTTTTA 31200

ACTTCGTTTT CAAAGTCCTC AAAAACCTGT ACCTAATCAT GAATTTTTTT TCCCACAGAT 31260

TGTTTCTTCT TCTCCCTCCC AGAAACTTTG AAGTTTTTCT ACATGACACC AGGACCTATG 31320

TCTTTTTTTA ATTACACAGA AATGAAAGAA AAAAAGTGTG TTGTATCGTT AACCAAATAT 31380

ATGAAATCTT TAAGCTGTAT TTTTATTTTT AACTTTGTTT TGCAAAGAGG CCATTCCCTT 31440

TGGTTAAATA ATTTGTTATT CACAGTTTCC TTGTCCTCAT ATTATCAAGG GGAAAATTGT 31500

AGAAATTTTA AAGGAAGCTC TAGGCAATGT TTTCATCCCT GAATCTTTGG AGAGTTATAA 31560

AAACAAACAG ATTACTGAAC CTGTAAGAGA ACCAATCGTG AAGTCATTAC ATCTAAGCAT 31620

AAGCAAAATC TCCTCTTGGA TCATTAAGTT ATAGAAGAAA AGAAAGCCTG CACTTTGAAA 31680

TTTAAATAAA GCTTGGTAAC TTGTAAGTCA AACACGTAAA ATTTTACAAT TCAGGAATAT 31740

CGATAGCAGT TGAGTTTAAT AGACTTCTCA CATTCCAAAT TTAAAGCTTC CTTCTCTGTG 31800

CTAATAGAGA TACAATAGCA GTAGGCGTTT AAGAAGAATG AATCAACAAT TTAAAACTAT 31860

AATGTGTTTT TTATTCATCT CCCTTATTCA CATATATTTG TTTTGTTTTG AGAAGGAGTT 31920

CTGCTCTGTC GCCCAGGCAG GAGTGCTGTG GCACGATCTC AGCTCACCGC AACCTCTGCC 31980

TCCCGGGTTC AAGCGATTCT CTTGCCTCAG CCTCCTGAGT AGCTGCGATT ACAGGCGTGC 32040

GCCAGCAACC CCGGCTAATT TTTGTATTTT TAGTAGAGAC AGGGTTTCAC CACGTTGGCC 32100

AGGTTGGTCT CGAACCCCTG ATCTCAAGTG ATCAGCCCGC CTCGGCCTCC CAAAGTGCTG 32160

GGATTACAGG CGTGAGCCAT CACTTCTGGC CCTTATTCGC ATACAATTTA AAAATCATCA 32220

CAGAAGGTTT GAAAGAAGGA AGGGGCAGAA AATTACCTAC TTTTCCTCTC CCCAGCGATC 32280

TCCTTCAAAT CTGTGCCTTT TCCTCAGGCC CAGGCCTCAA TTTACTGAGC AGTCACACCT 32340

CACAGAGGGA GGTCTGGGCA ATCCACTCTT GGTCACAGGA AAGCCATTGA CCCTCCCACT 32400

TCCTCTCCTC CACCTTGTTC TCAACTCTTG ACTTTGGGCT TTGTTTCTGT TCAAGTCCTA 32460

GGAACTGGTT TCTTTTATCA GGTTAAGTGA TTAGTTCTCT TTCCCTCTAG TTGCTCTCAC 32520

TCCCTGACTC TTGCCTTCTG TAACAACTGG AGACAACTCT TTCAAAACCA GCTCCAAGCC 32580

CCAGACTTCT CTCTGGGCTT TAGTTCGTAA GGCAGGTGCC CTACTGAGTG AGCCTAGATC 32640

AGACAGAAAC ATAGCTGTTG GCAATGATTT AGGTGAATTT CCTTCCATTG TTTTTCTAAT 32700

ACCTTCTTTT TTTTGTAAAT ATAACCATGC ACATACACAC ATATTTGAAT ATCCTGCCTT 32760

TTTATTTAAA ATGACAATAG GTCCGGGAGT GGTGGCTCAT GCCTGTAATC CCAGCACTTT 32820

GGGAGGCCGA GGTGGGCAAT CACCTGAGGT CAGGAGTTCG AGACCAGCCT GGCCAACATG 32880

GTGAAACTCC ATCTCTACTA AAAATCAAAA ATTAGCCGGG CATGGTGGCA GGCTCCCAGC 32940

TACTCAGGAG GCTGAGATGT GAAAATCGCT TGAACCCGGG AGGTAGAGGT TGCAGTGAGC 33000

TGAGATCTTG CCATTGCACT CCAACCTGGG CAATAAGAGC GAAACTCCAT CTCATGGAAA 33060

AAAAAAAAAA AAAGACAGGA TAAACATTCT AGATAGTCTC TATAATGGTC ATGATTAAGA 33120

CAATAAAATA GTCTGAAATT GTCAATATAT ATTAATAATA ATTTATTTGG CCATTCTGCC 33180

AAGTAGCAGA CACCTGTCAT TCTGCCCACT CAGCACCTCT CTTTCTTTTA GGGAAATGCT 33240

ACCCACTCTT TGCATGGGTT CTGGATGGAA CTGTTGATCA CAGTGTTTTC ACTCCCCATT 33300

TTGCCTCACC AGAGGTAGAC AGAAGACCCA AGCCAGGCCA GTTACACACA ATCTTCAGAT 33360

AATTACCGTA TTGATCACAG TATCACCCCA CTCAAGGCTT GGTTGGAGAT GAGCAGAAGA 33420

GACTAAAGCT GGGTCATTTT AATTAACACC TGTACCCCAA AGAAAGACTG TCAATGAGGC 33480

TTTTATACCG ACACTCCTGG TTTCCATTCT TCCTGATGCC ATTCATTTGA CGAACTACCC 33540

AATCTTTCCA ACAGTGTCTT TGGAAGAAAG ATAGTCAGAA AAGAAGATAG AGTTGTTTTC 33600

TGTTCTTTGC AACCAAGGAA CTCTAAATGA TAGACTTGTT GCTAGGCACT TTGGTTATTT 33660

TTATTATCTT GAATACTTCT GTGATATACT TCTTTGTGCA TGCCTGTTTG TACGGATGTA 33720

GCTTTTTATA TATTTTATAT AATTTCTCAG AAGTGGAATT ACTTAGTCAA AAGGTATGAA 33780

CATTTTTCTG ATTCTTAATA TAAATTGTGC AAATGCTTTT TAAGAGGATT ATACCAGTTT 33840

ACATTTTGTG TTATATATAA CAGAAAGTAC TACTGAAAAA ATATTACAAA AATTTGTCTC 33900

TCTGTTCAGG AGGACCTTGT AATAGATGAT AAAGTACTTG AAATAGGAAC ATAGAGCATT 33960

TTCAGTTTAA AATAATTTCA TTGGGTTATT TACGGAATCC TTAGAATTAT GGCCAGACAT 34020

TTATAGATGA TCTGTACCAA ACCTAGGTTG GTTACATAAA TTGCTTATTC AACTGGCTTA 34080

AATCTATAAT AGAAAGATGA CACTTACTGA ATGTTTAATA TACACTTTGT CAGGGGCTTT 34140

GTATTATTCT ATGACATCTT CAAAATGACC CTACTTTCCT ATTTTATAAG TAAGGACAGG 34200

AAGGCTTCAA GAACATGACT AATTTTCCCA AGGGCTGTAC CAAAGCCAGA ACCCAAATCT 34260

ATAAGGCTTT TAAACCTGCA TTCTAAAACT GCATCTCGGC CATCTTATTC CTACAGAACT 34320

TAAGGTTAGA AAGCCAGATT GGAGTCCCAA TTTCACCACT TAGTAACCAG ACAAACTTGA 34380

GGAATTCACT CAACGTCTTT GAATCTTCAT TTTCTAATCT TTAAAACTAA AACAATAATA 34440

CTTGCTCTAC CTATGTCCTA AGATTTCGTG AGGCACATAG AGATAGTGTG GAAGAGTGCT 34500

GTACAGATGT CAAGTGTTAG CGTGATTACT TAGATCCCTG AACACCATGG ATGAATGTCT 34560

CTGACTGCTA TTAGAGGTCA TAAAGAATAT TGGGGCCAGG TACATTGGCT TATTCCTATA 34620

ATGCCAGCAC TTTGGGAGCC TGAGACAGGA GGATCACTCG AGGCCACGAG TTCAAGACCG 34680

GCCTGGGCAA CATAGTGAGA CCCCTTCTCT ACAAAAAAAA AAGCAGCCAC GTGTAGTGGC 34740

ACACACCTGT AGTCCCACAT ACTCAGGAGG GTGAGTTGGG AGGATAACTT TAGTCCAGGA 34800

GTTTCAAGGT GCAGTGAGCT GTGATTGCAC CACTGTACTC TAACCTGGAC AGCAGAGTGA 34860

GACCCTGTCT CTAAAAAAAA AGAAAAAAAA ATAATAATAA TAAAGAATAA TGGGGCCTTG 34920

GGATACCCAC TCCTCTCTTT CTGCTCTGAG TTGTGAAGCA GTTGAGTTAC ATATGCATGT 34980

CCAATGGATG AGGTTGAAAA TATCAACTGG ATTGGAATGT GGCTTACTTG CGTGGCCACA 35040

ATGAGCTTCG TAACACTTCC TGACAGGGTG AGAAGACAAA CTTCCTCACC CAGTCACTGG 35100

CAGAGCTGGA CACTCTGTGT CTCTCCCACA GAACAACCTC TTACTGCATG GAGGTGGATG 35160

AAAAAGTCAA CCGAGAACAG GCTACTCCAA AAAGCAGAGC ACCAAAGGCA CCAGCTGGTC 35220

AGGTCCCCCT TCCTAAGTAA ACAATCACGT AATTCATTCG GGACAAAGCC AGAGAGGTGG 35280

TGTGGAGAAA GAGAGGGCAG TTTCCTCCCA AGTTTTTCCT GGAATTCTTT ATGGGAATAT 35340

GAGGTTTAGG GGAATAAGAC TTCCCTTTAA CAGTGAAGAA TCCCCAGCTC TATTGGTAAT 35400

AGGAAATCGC TTACAAGGAT CATGGGGAGT ATTTCCTCAG CTCGTTCTGC CTCCTACTTG 35460

GCTGAGTGGA ATGGAACCAT CTGTGGCTGC TGCATATGAT ATTGTCAACT TTGTCATTCC 35520

ACACCCACTC CTTGACGCCC TACCATGTGG TCATAAGACT CCCTTTAAAG TGTTCCTTTA 35580

AAAAACAAAA TGTGTTTTGT TTCTATAAAA TACAGCTCAA TGTCAGAACC CTTGTCTTGT 35640

TTGCTCTCTG ATGTAACCCT TTCACAATGT TTGGGCAGCT TATTCTCTCT ATTTCCCTGT 35700

AGGGTCCCAT CCAGGCCAAA GTGAGTGCCA GCCTCATTTG GGCAGCAGAT GCCCTGTGGA 35760

AGGGCAGGAG GAGACGAGAG CTAATTGTAA CTTTGTGATT AGCTGTCATG GATGCCTGGT 35820

CCTGTCAATA GCGCTCAATA AAGCCAGAAG GCCAAGCGTT CGCTTCTGCA TACTGATTGC 35880

TGAGTCAGAT TTCTCAGTGC AGAAGGGCTT TCTAGGCAGT CAATTTTAGA ATATTAGTCT 35940

TGGTTCTTAA GTGGTTAAAA TCCCTAGCTG GTCTTTAATC TGAGCCTGGA GAATTTAGTT 36000

ATGGCTGACA TTCTGCTGTG ATATTTTTGC CCTCAATATA TATGTCTTTC CTCCATCTCT 36060

TAGATCCCTG AATCATAGAG ATATATATGT TATATAATCA ACTGTCTCCA GTCTCTAAGA 36120

GTGATAAGTA CACATTGTGT CAGGTTGAGG GGACAGGAGA ACTTTCAAAA GCCTTTCTTG 36180

CCCCTTTTTC CTTCTCACTG CCTCCCACTA AGTCCAGCCA CTTATTATTC AGCTGACACT 36240

ATCATCATGA CCATGAGGTC TTTTGGGGCT ACCCTGGTTC GGATCCTTCT GGAGGTTTGT 36300

TGCTTAACTC TGTCTTCAGT CCTATGAGCT GCTTTTTCAA TAAGTTTCTA TTTTGGCTAA 36360

AGTTGGCCAG AATCTCCTTG TAACCAAAGA ACAAATAAAA TACCAGCTTG CAATGTTCTA 36420

TGTTGCTTCC ACCAAACTTA TGCAGCACTT CCTATCTAAT CCACCTACTA GTCTTTTTTT 36480

TTTTTATTTT TTTTGAGACG GAGTCTCGCT CTGTTGCTCA GGATGGAGTG CAATGGTGCA 36540

ATCTCGGCTC ACTGCAACCT CTGCCTCCCG GGTTCAAGCA ATTCCCCGGC CTCAGCCTCC 36600

TGAGTAGCTG GGACTACAGG TGCATGCCAC CACGTCCGGC TAATTTTTGT ATTTTAGGAG 36660

AGAGAGGGTT TCACCATGTT GCCCAGGCTG GTCACGAACT CCTGAGCTCA GGCAATCCGC 36720

CCTCCTCGGG CTCCCAAAGT GCTGGGATTA CAGGAGTGAG CCACCTCACC TGGCCCCGAC 36780

CTACTAGTCT TTAGTGTTTG CTTCCTTCTA TTGGGTAATT GTCTGTTTAT ATGCATGTCT 36840

TGTTTCCTCA AATAAAATGT GGTCTTCTCA AGGGTATTGG CCCATGTTCT ATCCATCTGT 36900

AGATATCACA GCACCTAGCA GTGTCTTTCA CAGAGGAAGT ACACAACTGG CATTATTGAT 36960

TCATTGCTCC ATTTTTTCCT TCTTTATCCC CAGCATTTCT CAATAATTTC AAACATCTCC 37020

ATTGGAGTAC CGGAGAAAGC AGGTAGCTTT ACTTGCAGCT ATGTTTCTAT CCCCATAGTA 37080

ACTAAAAGAG GACCCAGAGA AACATGTTTA AATGCTGTCC TGTTATCAGG ACCTCAGCCT 37140

TCTGATGCTC CGTGGCTTGG GGGTTATTGC TTGATCATCT CCTCCCCAAC CTACACTGTG 37200

TACCTATGCT AGTCTCTTCA TGAGGACTAA GCCCCATAGT AAAAGGGCTA GATAAATAGA 37260

AAATCATTTT ATGTAATTAT AAGAATGAGA ATACTGAGTA TTCTGGTGTT TGTTTAGGAT 37320

AAGCACATCT TTATTTGTAT GAGAAAAAGA AAAAGAGAGT GAAAAATATA TTAACGTGCA 37380

TATTGTTCAG AACCCTTGGA TTGCAAGTGA CAGAAACTCA ATTCAAACCA ACGTAAGTCA 37440

AAAGGAAAAT ATATTGGCTC ATGTAACCTT CTCACAGAGA GGGCAGGATG GAAGGGGCTT 37500

TGGGAACAAG AGAATTGTTC TCAAATTCTA GGAATACTAG GATTAGTCCA GGATGGGTCA 37560

CCTTCCTGTC CCTGAGGTGG TGGTAGCGAT GGTAGAGTCT TATGGGAGGA AAGAGTGCAT 37620

GTTAGGATGA AGGTAGGGCT AAGCAAACAA GGGCAAGGGC CACTATATCA TGCTAAAAAT 37680

GGTTTTTTTT GATGTCTTCC TTAATTTCAC AAATGCTTCC AACAAAGTAG CACACAGGAA 37740

AAAGAACATA GGGACTCTAC TGGTGGGTGC TTTTATCTTA AGCCTTGTAC TTGCTTTTCA 37800

CAGCTTACTC ACTGCTTGTA CCTGAGGCCA TATGCCCTGT AAAAGCTTCT GCAGGGTTTC 37860

TACTAAGCTG GGTTCCTTAT ATGGCTCTCT CCCATTTCTG TTGCCTCACT CTAGTGATCT 37920

TTCTCTTTTC CTCACCTCTG GGACTGGTGG CTGTTTGTAT GGACTGCCTT AGCTTTGCTT 37980

TGGGTTTTTT CCTGGGGACA ATGTCTTCAG ATTATCCTAG ACCAAATAAA CTACAGCCAC 38040

TGGGCCAGGC TCTTCCTCCT CCAACTGGAC CATGTTCCCA GGGCTCTTCA CCTTAGTTTA 38100

GGTCAAGCAT TCTTGGCAAA AGAAAGGCCT AGTTAACAAT AGACATTCTA GCAATTGATT 38160

CTTTTTGACA TGTTGTAAGA TCTATTCACA TTTTGTAATT AAAGCATTCC CCTATGGAAA 38220

CCAACACGAA CTAAGCTGCT CCTGGAATGC AGGGTGGCCT CCTCAATACA GGATGTTCTA 38280

GAGAGCTGTA TTTTGGGCAC TTAACTATTC TCCACTACTT AGGGCACAGC ACTGAAATTA 38340

ACACCACTAA GTTTGTCATG TCCATGTAGT TAGTCTCAGG CAGTGCAGCC TCAGGAGTGG 38400

AACTGACCTC TTATGTGTGT CCAGCCTTTC TTCCTTCAGA AGTCAGCTGT GTTTTCTGCT 38460

GACTCTCCAT AGGAACATCA GTCCTGAATC CTCAGACCAC CATCTGGAGT AGTAAGTGCT 38520

CCTGACAGTC CTAGAAGTTG TCTACCGCTG GATCTCCAAA GCGTGTGACA CACCGTGAGA 38580

GAGAAATGAG AAAGCTGGGC TCTTCAGGTA AATCTTGCTT TTTCACAAGC CCCCTAATTT 38640

TACTGCATAA TTATTTTGAA TTCACTGATA ATTTCTACAA TTTTCCCATA AGTCATCTAC 38700

ACACAATACC CTCTCATGCA ACACTTGGCT TTGCTAATAC ATATCTATTA TGAGAGCTGT 38760

GCTTCTTAAG CGTAAATGTT TTATATGCAC TAAGGCTCTT GGCTTACATA TAAAAGGGGT 38820

ATTGAGCAAT GTGATACAGA AGTCTTTTCT CCACAGGTCT CATATGTAAA GAATTCATTA 38880

GATTGGCTGA AATAGACTGA TCTGTCCATT TCTCTGCTCA CTTATCATAA GGAAGTCATT 38940

AGCTAAGGAA CAAAAACTAC AATCTATGTA ATTAGAAGAA CAAGCTGGTT TTGCTCAATA 39000

TAAAAATAAG AAAAAGAAAC CATGTGAAAG TCAAAATATT TGTTTAATCA GGTCATTGAG 39060

AATCTATTAA AAAGTATTTG AATTCTTTAT GATGAGAACT ATCTTGACTC AAGTGGACAG 39120

TGGTGAGCTT TTTGGCCTGT GGTCCCTACG TAGAAAGGAG GCTTTGTCAT AAAGTCTTAT 39180

ATGGTACAGG TGCCAAGTTA AGTGCCCAAG CTTGCTCTTA AAAGCATACT GGATTTTG 39238

(2) INFORMATION FOR SEQ ID NO : 5 :

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5596 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 :

TATGGACATA TTGTCGTTAG AACGCGGCTA CAATTAATAC ATAACCTTAT GTATCATACA 60

CATACGATTT AGGTGACACT ATAGAACCAG ATCTGATATC GAATGAATTC TTTCTTGCAA 120

GAGATCCAAG AACTCTCTCT TGGGGTCTGG ATCAGGACCT CTTTCCAGTA ACAATAGTAG 180

TAAGGGGTCA GGGAGACTGG ACAAAGGAGT TTAAGAAGCC TTAGATAAAG GGTCCTCATC 240

ATTGTCATAA CATAAAATCA TGGACTCCTA GAATTTTATA GCTGATAGGA TTAGAAATTT 300

CAAAATTCAA TTTCATTAAT TTTCATCTGC GAAAACAGAT GGCCAGAGAG GCCAAACAAT 360

TTGTTAAGGA GCACTGAGGG CAGACCACAC TGGAACGCAA ACCTCTTAGC AGAGTATACA 420

AGGCCTTTGA TCTCCTCAGT CAGAATGAAC TAGAGCTTTC CAGGGTACCC TTTCTGACTG 480

TTTAGCATGT TTGCCAGTCT GACTAATTTT GAAGTTGCTT AAATATCTGT CATTTCCACT 540

GTATCATAAT CTCCTCATTC ATCTTCAATC TCCAATGCCT TGAACTCAGT AAATGTTART 600

TGAACAAAAG TAAATTGAAC CCAGAATTTC TGATCATAAT CTGGAGCACT TTAAAATTGT 660

CAGCTTACTG GGAAACGGGA TAACATGTGA TTTGTCTTTG ATTTTTTTTT TCTCATATGC 720

TTTTTCCACC TATAGATGCT ACACGAATGT TTTTAAAATC TGATATAAAA ATTAAAATTA 780

AAAAATTAAA AAAAGAAAAT TTGATACAAT GCTACATTTA GAGTGTTGTG ATTAGATTCC 840

TTAAGTGTAT CATGGTGATC TCTACATCAC GTGGTGATCA AATTGCTTTG GGTTTTAACA 900

CATAACTGAC AAAGGCTTGG GGACATGTAA GATCCCAAAT ACATTTTTAT TGATTTTTTT 960

TTCTKGTTTG TCCTCTTTTA AATAACTTTT TTTTGTTATA AGAATAATTC ATGTTCAGTG 1020

GAGAAACCAT AGAAAATAGT GACAAGTGAA GGAATAAATT TAAAATGACC CATAATTGTA 1080

CCATACATTC TGATTTTTTA AACGCTGAAC AAATTAGCCT TGGGTAAGTA CCAGGAATAG 1140

AGTGCAGCAT TGAAAGTTAA AGTTTGGGGA AGGATAGCTG ACTTAAGAAA TTATCTAGTT 1200

AGACATTTTT TGGATGGGGT AATTTTGCAG ATGACATTAG TGAGAGAAAG GACTTGCCAC 1260

TCTCACACAG CTAGTAGGGG TGTGGGAGGA TATTGGAACC AAGTTTCAAG TCTTCAGTGA 1320

AGAATCAAGG GAGAAGTTCT AAAACCTAAC AATATCCCTC TGGATGGACA TTTATTTTAT 1380

TACTACAATA AGCCACACGG TGAGTCATAA GGAGCATTTC ATTCTTCTAA TATGTCTCTA 1440

CTGTATTTAG AATCTGATAA AGCCCCTATT AGAATTCATC TCTTTAAGAA TAAAAGAAGC 1500

TGAGGAACTA AAGAGAGGGT TGGAATAATC CACTAATTAT ATCCGTTAAG CTTCAGTTAC 1560

GCTAATAAGG AATATCACAT GACTGTGGTG TGTGCTTGTT CTGAACAGTA AAGTACATGA 1620

GGAAAGATAA GATTCAGGGC TGAAATGTCC TTCAGCATAT GTAGGTAGTG GTGATGAAAG 1680

TCATTAAAAG AAAAATTGAT TGAGGTATTT TAGTAAACAA AAGAACTCAC CACTTACCCA 1740

TCAGGAAGTG TATTGTTAAT GCAGTGCTGT TCAGCCTTCT GGAAGAAAAG GTTTCTTCAT 1800

GCTTCTCTCT TTAGCCTAAT TCTTATCCTG TCACTTTTCA GGCAAAATTA AAAAAAAAAA 1860

AAGATTGAAA ACGATGCTCC TATTTTATTT GCTTCAAAAG AAACAGGCTG TTGCATTGTG 1920

CTTGGAACAG TTTACTCTTG GCCTTGATGT AAGTGTGAAA GGAAGCCCAT GTAATTGACT 1980

AGGCAGTATC TGAAGAAGCA GGAAATACAG TGTTAAGAAA ATGAACAGGC ATGAAAACCA 2040

TGGCTATTTG ATAAAAGTAA ATAATTTCTG CAGTTCACAT GTTCTCAGCA TATTTTCTTT 2100

GATACTGACT TGCTTAATAT GACAATAGCA GAACCATGGT AGCTTGTAGG CATTACTTTT 2160

CTTTTAATTT CTTTTACATT TTGAATTTAC CAGCACTCAC ATTTGTATTA CTTTTGGGTT 2220

ATACTGAGGA TCTATAACTT ATAGATCAAA TACCTGACAT ATATATGCAT TCTCTGAAGT 2280

CTTAGGGCAG AACTAGAACA TTCTTGTGAA CATCAGTATA AGATATTAAA ATGGAAGTTT 2340

TGCCTAAGAC TGAAGACAAT AAAAATATCA TAGTCTGAAA TGAATGCCAG CACACCATAC 2400

AGGATTTAAA TATCTATACA TATATATGTG TGTGTATTAT ATATATTTAA TATATATCTG 2460

TGTGGGATAG GAAGAGGTAG GGGGAAATCA GTTTTACAAT TATTAAGTAT TTCACCCTTG 2520

ACAAGAGTAT ATATATTGGA AATCAGTTGG AGAGTATTTT CAAAGATAAA TGTTAGTGTG 2580

CTATGAATGA ATCCACCCCT ACCACCACTG AGGCAGGGTA GGAGAGGCCT GTGCTCCTCA 2640

AGCATAGTTG GAAAAGGACC TCAACAAGAC CACTTCAAGA GTCTAATGTG TGGAGACTGT 2700

TGCTTAGGGA GACCTTATGG TCTAGCTTCT GACTCACAGC TAAGTCAGGG AGACAGGTTG 2760

GCTGCTCTGA TCGTGGAGTC CAAAAGATGG CCTGCACTGA AAAGCCTCAT GAGTGTTGAC 2820

TTAGGGCTAG TCTAAGAGGT CCCTGGAAGA AGAAACACTC AGTAGGAGAG AAGCTGGAGG 2880

TACCTTCAGT GCTGAATTGG AACCTAGATT CATTCCCCCG TGGAGCAAAT TACATAGGAA 2940

AGATGCCCAG TGATGGAGAG TGGGGGTGTC TCTAACAATT ACCCACCCAC CTGCCCCCAC 3000

CCCTAAGAAA AAGAAAATCA CATACAACCA GTCAGCTGTA AACATATGCC GAGCCTAGTA 3060

AACTCAGATA CTAAGTTACC AGGGTACCTG GCAAGTAAGA ACATTCCTGA TTCCCTTCCC 3120

TCCTCTTCCT CTTTGCCCTC CAACCTTAGT GGCTAGCAAG ATGGGGAGAG GAGGAGAAGC 3180

TGTAAGTGGG GAAAAAAGAG CAGCTTTCTC TCCTTTTCAG CTGCTGGATT CTCCCTCATC 3240

ATAGGCCTGA GCTGGGGAAT CAGGAAGAAG GATTCTTTTT AAAACTGAAG TAACGTTATC 3300

ATTTAATTTT AAAACATTTT AAATTTTGAC AATGTTGAGA TTAGA ATAC TAATTATTAA 3360

ACTAAGATTA TGTTTTGCAG CTTGAAGTGA TAAGAAAAAC CTCTTATCTA AGAGCATCCA 3420

GGAAAGTCGG GGGTTTCCTG AACATCCTTT TAAATCCTTT GGAAGTCAGC TTTCAGAGAG 3480

GATTTAAAGT GTAGACTGGG CCTTCAGAAA CTTGGTTAAT GTAGGGGTTT CCTATGCAGA 3540

CTTGGGGACT ATACCTTGTG TGGAAGAGAG AAAATAAGAT TATCTTACAT TTTTCCCATT 3600

CCTTTTTCAA AAAGAAAGCT CAGCTAGCAT GAAAGTTAAA TTCAAAACGT AATGGGTATT 3660

ATTTGCATAT TCAAATCTAG TGCATATCAT GTAAGTACTG AATTATGGTA TTCATTATTT 3720

CAAATGACAA GCTGGATTTT TTTTTCTTTC GAATTTCACA AATTAATTTT CCTTGGAACC 3780

TTTTGGTTTG GGCTTTAAGA GTTTAGGCTT TCATCACAAA GAGAGGACAG CCTTGAAGAT 3840

TAAAGTGTGT GGCTCTTCTC AAGATGTTCT TAGTCCAGCA AAGGATTCTA TGCATATTTG 3900

GGCTTCCTTC TGTCTCATAA CCTGTATTTC TTGATATTCT ATTTATATTC TGTAAGATTT 3960

TTTTTTTAAA GGAAAAATTC TTCCATGGTT GAAGGACATG TCAAAAATAG AGGATACAGT 4020

TTTATATCAA AGGAAGTTTC ATGATATGAC TGTAGAAGCT CATTTGACTT AAGACACATC 4080

ATTTCCTCAT GGAAGTGTTA AACAGATCTG TACAATAAGG TTGGCAATCT TTGTGTAAAA 4140

CAGTTTTTTT TCTCCTGCTC TAAAGAAAGT GTATATTTCA AAATGTGAAT GTCAGCAGTC 4200

AGAAAATAGT ATTTTTTTAA CTTCGTTTTC AAAGTCCTCA AAAACCTGTA CCTAATCATG 4260

AATTTTTTTT CCCACAGATT GTTTCTTCTT CTCCCTCCCA GAAACTTTGA AGTTTTTCTA 4320

CATGACACCA GGACCTATGT CTTTTTTTAA TTACACAGAA ATGAAAGAAA AAAAGTGTGT 4380

TGTATCGTTA ACCAAATATA TGAAATCTTT AAGCTGTATT TTTATTTTTA ACTTTGTTTT 4440

GCAAAGAGGC CATTCCCTTT GGTTAAATAA TTTGTTATTC ACAGTTTCCT TGTCCTCATA 4500

TTATCAAGGG GAAAATTGTA GAAATTTTAA AGGAAGCTCT AGGCAATGTT TTCATCCCTG 4560

AATCTTTGGA GAGTTATAAA AACAAACAGA TTACTGAACC TGTAAGAGAA CCAATCGTGA 4620

AGTCATTACA TCTAAGCATA AGCAAAATCT CCTCTTGGAT CATTAAGTTA TAGAAGAAAA 4680

GAAAGCCTGC ACTTTGAAAT TTAAATAAAG CTTGGTAACT TGTAAGTCAA ACACGTAAAA 4740

TTTTACAATT CAGGAATATC GATAGCAGTT GAGTTTAATA GACTTCTCAC ATTCCAAATT 4800

TAAAGCTTCC TTCTCTGTGC TAATAGAGAT ACAATAGCAG TAGGCGTTTA AGAAGAATGA 4860

ATCAACAATT TAAAACTATA ATGTGTTTTT TATTCATCTC CCTTATTCAC ATATATTTGT 4920

TTTGTTTTGA GAAGGAGTTC TGCTCTGTCG CCCAGGCAGG AGTGCTGTGG CACGATCTCA 4980

GCTCACCGCA ACCTCTGCCT CCCGGGTTCA AGCGATTCTC TTGCCTCAGC CTCCTGAGTA 5040

GCTGCGATTA CAGGCGTGCG CCAGCAACCC CGGCTAATTT TTGTATTTTT AGTAGAGACA 5100

GGGTTTCACC ACGTTGGCCA GGTTGGTCTC GAACCCCTGA TCTCAAGTGA TCAGCCCGCC 5160

TCGGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCATC ACTTCTGGCC CTTATTCGCA 5220

TACAATTTAA AAATCATCAC AGAAGGTTTG AAAGAAGGAA GGGGCAGAAA ATTACCTACT 5280

TTTCCTCTCC CCAGCGATCT CCTTCAAATC TGTGCCTTTT CCTCAGGCCC AGGCCTCAAT 5340

TTACTGAGCA GTCACACCTC ACAGAGGGAG GTCTGGGCAA TCCACTCTTG GTCACAGGAA 5400

AGCCATTGAC CCTCCCACTT CCTCTCCTCC ACCTTGTTCT CAACTCTTGA CTTTGGGCTT 5460

TGTTTCTGTT CAAGTCCTAG GAACTGGTTT CTTTTATCAG GTTAAGTGAT TAGTTCTCTT 5520

TCCCTCTAGT TGCTCTCACT CCCTGACTCG GGGGATCCAC TAGTTCTAGA GCGGCCGCCA 5580

CCGCGTGGAC TCACAG 5596

(2) INFORMATION FOR SEQ ID NO : 6 :

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 :

GAGGGCGGGA ACCCCCTTTC CAAAAAAAAA GAAACAAAGA CAGGATAAAC ATTCTAGATA 60

GTCTCTATAA TGGTCATGAT TAAGACAATA AAATAGTCTG AAATTGTCAA TA ATATTAA 120

TAATAATTTA TTTGGCCATT CTGCCAAGTA GCAGACACCT GTCATTCTGC CCACTCAGCA 180

CCTCTCTTTC TTTTAGGGAA ATGCTACCCA CTCTTTGCAT GGGTTCTGGA TGGAACTGTT 240

GATCACAGTG TTTTCACTCC CCATTTTGCC TCACCAGAGG TAGACAGAAG ACCCAAGCCA 300

GGCCAGTTAC ACACAATCTT CAGATAATTA CCGTATTGAT CACAGTATCA CCCCACTCAA 360

GGCTTGGTTG GAGATGAGCA GAAGAGACTA AAGCTGGGTC ATTTTAATTA ACACCTGTAC 420

CCCAAAGAAA GACTGTCAAT GAGGCTTTTA TACCGACACT CCTGGTTTCC ATTCTTCCTG 480

ATGCCATTCA TTTGACGAAC TACCCAATCT TTCCAACAGT GTCTTTGGAA GAAAGATAGT 540

CAGAAAAGAA GATAGAGTTG TTTTCTGTTC TTTGCAACCA AGGAACTCTA AATGATAGAC 600

TTGTTGCTAG GCACTTTGGT TATTTTTATT ATCTTGAATA CTTCTGTGAT ATACTTCTTT 660

GTGCATGCCT GTTTGTACGG ATGTAGCTTT TTATATATTT TATATAATTT CTCAGAAGTG 720

GAATTACTTA GTCAAAAGGT ATGAACATTT TTCTGATTCT TAATATAAAT TGTGCAAATG 780

CTTTTTAAGA GGATTATACC AGTTTACATT TTGTGTTATA TATAACAGAA AGTACTACTG 840

AAAAAATATT ACAAAAATTT GTCTCTCTGT TCAGGAGGAC CTTGTAATAG ATGATAAAGT 900

ACTTGAAATA GGAACATAGA GCATTTTCAG TTTAAAATAA TTTCATTGGG TTATTTACGG 960

AATCCTTAGA ATTATGGCCA GACATTTATA GATGATCTGT ACCAAACCTA GGTTGGTTAC 1020

ATAAATTGCT TATTCAACTG GCTTAAATCT ATAATAGAAA GATGACACTT ACTGAATGTT 1080

TAATATACAC TTTGTCAGGG GCTTTGTATT ATTCTATGAC ATCTTCAAAA TGACCCTACT 1140

TTCCTATTTT ATAAGTAAGG ACAGGAAGGC TTCAAGAACA TGACTAATTT TCCCAAGGGC 1200

TGTACCAAAG CCAGAACCCA AATCTATAAG GCTTTTAAAC CTGCATTCTA AAACTGCATC 1260

TCGGCCATCT TATTCCTACA GAACTTAAGG TTAGAAAGCC AGATTGGAGT CCCAATTTCA 1320

CCACTTAGTA ACCAGACAAA CTTGAGGAAT TCACTCAACG TCTTTGAATC TTCATTTTCT 1380

AATCTTTAAA ACTAAAACAA TAATACTTGC TCTACCTATG TCCTAAGATT TCGTGAGGCA 1440

CATAGAGATA GTGTGGAAGA GTGCTGTACA GATGTCAAGT GTTAGCGTGA TTACTTAGAT 1500

CCCTGAACAC CATGGATGAA TGTCTCTGAC TGCTATTAGA GGTCATAAAG AATATTGGGG 1560

CCAGGTACAT TGGCTTATTC CTATAATGCC AGCACTTTGG GAGCCTGAGA CAGGAGGATC 1620

ACTCGAGGCC ACGAGTTCAA GACCGGCCTG GGCAACATAG TGAGACCCCT TCTCTACAAA 1680

AAAAAAAGCA GCCACGTGTA GTGGCACACA CCTGTAGTCC CACATACTCA GGAGGGTGAG 1740

TTGGGAGGAT AACTTTAGTC CAGGAGTTTC AAGGTGCAGT GAGCTGTGAT TGCACCACTG 1800

TACTCTAACC TGGACAGCAG AGTGAGACCC TGTCTCTAAA AAAAAAGAAA AAAAAATAAT 1860

AATAATAAAG AATAATGGGG CCTTGGGATA CCCACTCCTC TCTTTCTGCT CTGAGTTGTG 1920

AAGCAGTTGA GTTACATATG CATGTCCAAT GGATGAGGTT GAAAATATCA ACTGGATTGG 1980

AATGTGGCTT ACTTGCGTGG CCACAATGAG CTTCGTAACA CTTCCTGACA GGGTGAGAAG 2040

ACAAACTTCC TCACCCAGTC ACTGGCAGAG CTGGACACTC TGTGTCTCTC CCACAGAACA 2100

ACCTCTTACT GCATGGAGGT GGATGAAAAA GTCAACCGAG AACAGGCTAC TCCAAAAAGC 2160

AGAGCACCAA AGGCACCAGC TGGTCAGGTC CCCCTTCCTA AGTAAACAAT CACGTAATTC 2220

ATTCGGGACA AAGCCAGAGA GGTGGTGTGG AGAAAGAGAG GGCAGTTTCC TCCCAAGTTT 2280

TTCCTGGAAT TCTTTATGGG AATATGAGGT TTAGGGGAAT AAGACTTCCC TTTAACAGTG 2340

AAGAATCCCC AGCTCTATTG GTAATAGGAA ATCGCTTACA AGGATCATGG GGAGTATTTC 2400

CTCAGCTCGT TCTGCCTCCT ACTTGGCTGA GTGGAATGGA ACCATCTGTG GCTGCTGCAT 2460

ATGATATTGT CAACTTTGTC ATTCCACACC CACTCCTTGA CGCCCTACCA TGTGGTCATA 2520

AGACTCCCTT TAAAGTGTTC CTTTAAAAAA CAAAATGTGT TTTGTTTCTA TAAAATACAG 2580

CTCAATGTCA GAACCCTTGT CTTGTTTGCT CTCTGATGTA ACCCTTTCAC AATGTTTGGG 2640

CAGCTTATTC TCTCTATTTC CCTGTAGGGT CCCATCCAGG CCAAAGTGAG TGCCAGCCTC 2700

ATTTGGGCAG CAGATGCCCT GTGGAAGGGC AGGAGGAGAC GAGAGCTAAT TGTAACTTTG 2760

TGATTAGCTG TCATGGATGC CTGGTCCTGT CAATAGCGCT CAATAAAGCC AGAAGGCCAA 2820

GCGTTCGCTT CTGCATACTG ATTGCTGAGT CAGATTTCTC AGTGCAGAAG GGCTTTCTAG 2880

GCAGTCAATT TTAGAATATT AGTCTTGGTT CTTAAGTGGT TAAAATCCCT AGCTGGTCTT 2940

TAATCTGAGC CTGGAGAATT TAGTTATGGC TGACATTCTG CTGTGATATT TTTGCCCTCA 3000

ATATATATGT CTTTCCTCCA TCTCTTAGAT CCCTGAATCA TAGAGATATA TATGTTATAT 3060

AATCAACTGT CTCCAGTCTC TAAGAGTGAT AAGTACACAT TGTGTCAGGT TGAGGGGACA 3120

GGAGAACTTT CAAAAGCCTT TCTTGCCCCT TTTTCCTTCT CACTGCCTCC CACTAAGTCC 3180

AGCCACTTAT TATTCAGCTG ACACTATCAT CATGACCATG AGGTCTTTTG GGGCTACCCT 3240

GGTTCGGATC CTTCTGGAGG TTTGTTGCTT AACTCTGTCT TCAGTCCTAT GAGCTGCTTT 3300

TTCAATAAGT TTCTATTTTG GCTAAAGTTG GCCAGAATCT CCTTGTAACC AAAGAACAAA 3360

TAAAATACCA GCTTGCAATG TTCTATGTTG CTTCCACCAA ACTTATGCAG CACTTCCTAT 3420

CTAATCCACC TACTAGTCTT TTTTTTTTTT ATTTTTTTTG AGACGGAGTC TCGCTCTGTT 3480

GCTCAGGATG GAGTGCAATG GTGCAATCTC GGCTCACTGC AACCTCTGCC TCCCGGGTTC 3540

AAGCAATTCC CCGGCCTCAG CCTCCTGAGT AGCTGGGACT ACAGGTGCAT GCCACCACGT 3600

CCGGCTAATT TTTGTATTTT AGGAGAGAGA GGGTTTCACC ATGTTGCCCA GGCTGGTCAC 3660

GAACTCCTGA GCTCAGGCAA TCCGCCCTCC TCGGGCTCCC AAAGTGCTGG GATTACAGGA 3720

GTGAGCCACC TCACCTGGCC CCGACCTACT AGTCTTTAGT GTTTGCTTCC TTCTATTGGG 3780

TAATTGTCTG TTTATATGCA TGTCTTGTTT CCTCAAATAA AATGTGGTCT TCTCAAGGGT 3840

ATTGGCCCAT GTTCTATCCA TCTGTAGATA TCACAGCACC TAGCAGTGTC TTTCACAGAG 3900

GAAGTACACA ACTGGCATTA TTGATTCATT GCTCCATTTT TTCCTTCTTT ATCCCCAGCA 3960

TTTCTCAATA ATTTCAAACA TCTCCATTGG AGTACCGGAG AAAGCAGGTA GCTTTACTTG 4020

CAGCTATGTT TCTATCCCCA TAGTAACTAA AAGAGGACCC AGAGAAACAT GTTTAAATGC 4080

TGTCCTGTTA TCAGGACCTC AGCCTTCTGA TGCTCCGTGG CTTGGGGGTT ATTGCTTGAT 4140

CATCTCCTCC CCAACCTACA CTGTGTACCT ATGCTAGTCT CTTCATGAGG ACTAAGCCCC 4200

ATAGTAAAAG GGCTAGATAA ATAGAAAATC ATTTTATGTA ATTATAAGAA TGAGAATACT 4260

GAGTATTCTG GTGTTTGTTT AGGATAAGCA CATCTTTATT TGTATGAGAA AAAGAAAAAG 4320

AGAGTGAAAA ATATATTAAC GTGCATATTG TTCAGAACCC TTGGATTGCA AGTGACAGAA 4380

ACTCAATTCA AACCAACGTA AGTCAAAAGG AAAATATATT GGCTCATGTA ACCTTCTCAC 4440

AGAGAGGGCA GGATGGAAGG GGCTTTGGGA ACAAGAGAAT TGTTCTCAAA TTCTAGGAAT 4500

ACTAGGATTA GTCCAGGATG GGTCACCTTC CTGTCCCTGA GGTGGTGGTA GCGATGGTAG 4560

AGTCTTATGG GAGGAAAGAG TGCATGTTAG GATGAAGGTA GGGCTAAGCA AACAAGGGCA 4620

AGGGCCACTA TATCATGCTA AAAATGGTTT TTTTTGATGT CTTCCTTAAT TTCACAAATG 4680

CTTCCAACAA AGTAGCACAC AGGAAAAAGA ACATAGGGAC TCTACTGGTG GGTGCTTTTA 4740

TCTTAAGCCT TGTACTTGCT TTTCACAGCT TACTCACTGC TTGTACCTGA GGCCATATGC 4800

CCTGTAAAAG CTTCTGCAGG GTTTCTACTA AGCTGGGTTC CTTATATGGC TCTCTCCCAT 4860

TTCTGTTGCC TCACTCTAGT GATCTTTCTC TTTTCCTCAC CTCTGGGACT GGTGGCTGTT 4920

TGTATGGACT GCCTTAGCTT TGCTTTGGGT TTTTTCCTGG GGACAATGTC TTCAGATTAT 4980

CCTAGACCAA ATAAACTACA GCCACTGGGC CAGGCTCTTC CTCCTCCAAC TGGACCATGT 5040

TCCCAGGGCT CTTCACCTTA GTTTAGGTCA AGCATTCTTG GCAAAAGAAA GGCCTAGTTA 5100

ACAATAGACA TTCTAGCAAT TGATTCTTTT TGACATGTTG TAAGATCTAT TCACATTTTG 5160

TAATTAAAGC ATTCCCCTAT GGAAACCAAC ACGAACTAAG CTGCTCCTGG AATGCAGGGT 5220

GGCCTCCTCA ATACAGGATG TTCTAGAGAG CTGTATTTTG GGCACTTAAC TATTCTCCAC 5280

TACTTAGGGC ACAGCACTGA AATTAACACC ACTAAGTTTG TCATGTCCAT GTAGTTAGTC 5340

TCAGGCAGTG CAGCCTCAGG AGTGGAACTG ACCTCTTATG TGTGTCCAGC CTTTCTTCCT 5400

TCAGAAGTCA GCTGTGTTTT CTGCTGACTC TCCATAGGAA CATCAGTCCT GAATCCTCAG 5460

ACCACCATCT GGAGTAGTAA GTGCTCCTGA CAGTCCTAGA AGTTGTCTAC CGCTGGATCT 5520

CCAAAGCGTG TGACACACCG TGAGAGAGAA ATGAGAAAGC TGGGCTCTTC AGGTAAATCT 5580

TGCTTTTTCA CAAGCCCCCT AATTTTACTG CATAATTATT TTGAATTCAC TGATAATTTC 5640

TACAATTTTC CCATAAGTCA TCTACACACA ATACCCTCTC ATGCAACACT TGGCTTTGCT 5700

AATACATATC TATTATGAGA GCTGTGCTTC TTAAGCGTAA ATGTTTTATA TGCACTAAGG 5760

CTCTTGGCTT ACATATAAAA GGGGTATTGA GCAATGTGAT ACAGAAGTCT TTTCTCCACA 5820

GGTCTCATAT GTAAAGAATT CATTAGATTG GCTGAAATAG ACTGATCTGT CCATTTCTCT 5880

GCTCACTTAT CATAAGGAAG TCATTAGCTA AGGAACAAAA ACTACAATCT ATGTAATTAG 5940

AAGAACAAGC TGGTTTTGCT CAATATAAAA ATAAGAAAAA GAAACCATGT GAAAGTCAAA 6000

ATATTTGTTT AATCAGGTCA TTGAGAATCT ATTAAAAAGT ATTTGAATTC TTTATGATGA 6060

GAACTATCTT GACTCAAGTG GACAGTGGTG AGCTTTTTGG CCTGTGGTCC CTACGTAGAA 6120

AGGAGGCTTT GTCATAAAGT CTTATATGGT ACAGGTGCCA AGTTAAGTGC CCAAGCTTGM 6180

TCTTAAAAGC ATACTGGATT TTGTTTTAGA CTTTTAGTGA ACTGAAGGGA ATAAACAAAT 6240

CCCTCTGGGA GAACTTCTCC TCCATCCTTG GTGAAGTCAT TCTGCCAGAA TTCTATCTGG 6300

TAGTTACCTT CTCCGATTCA TTAAATGTTG TCCCATGGTC CGACATGGGT AATTTTTCTC 6360

TCATTTGTGA TTAGTTCCAC TACAAGGAAT TAAATATTCA ACTTCTTGCC TTCTGGGATA 6420

TACTCAGCCT TATCACAGAG CTCCTCCAGG GAAGGAACTT AGATTCTTTG AAGAACTTCC 6480

CTGCTCTTAC CCAAACCGAT TCAGTTGTTA ATTCTGTCCA CCTTGCTCCA TTTTCAGTGC 6540

AGGAGAAAAA GCATTTGTGG CAAGTCTGAC CTTACAAAGG CTCGTTAATG CTCAATAACT 6600

GTGAGGACCT GCTATAAGTC ATGCCTTTTA AGAAAAAATA CACACATGCA CACACTCACG 6660

ACAAGACTGC AACACAACTG TGATGGCAGC TTGCATATTG AACCAGCTGT TTCCCTAAAA 6720

CATTTGATTC GGCATCCTTT GTAGACAGTA AATGCAAAAG ACTTAGGTTG GAAAAGTGCA 6780

TTAGGTTTTG ATTAACGATT GGATGAGGGC CAGTTAAATT TTTAAATCTG AATGAGCTTG 6840

CTGACTCAGG AGCCTTAGCA GCATAATGGA CAGACAGTCC TCAAAGCTTT CATTAAAAGG 6900

GTTTCTGGTA ACTGATGTCT ARAGAAATGA GTTGAAATAC AATTCACTGA ACCACTCAGC 6960

TTTCATCTAA AACAGAATAT GTAATCTCAA AGAACTCAAC TGGTCTCTTG AAATATTCAG 7020

GTAAAATTAA ATGTAAAGAA GCTAGAGCTT AAATATTTTG AGGAAAGGAA GCCTCCTGTA 7080

GCTTTGTGAC TATATCACTT TATCCTTTTG AATGCCGTAT TTAATTATGT TAATTGCATT 7140

TTAAGTATAG CTGGAGTCAC CGATCTGCTG AAAACAAACT CTASAATGGT TTGTGGGAGG 7200

TGCTCAGGAT GTATCAGAGA CTGATTTGAT TTGCATTTTA TTTTTAACTT TAGTTCCTCT 7260

CTGAACTCTG CCTTCTCATG TTTGTTTTTT WTGTTGTTGT TGCTTAATAC AGTCATGTGC 7320

CACCTAATGA CAGGGATATG TTCTGAGAAA TGCATTATTA GGTGATTTTG CCATTGTGCA 7380

AACATCACAG TGTACTTACA CAAACCTAGA TGGCATAGCC TACTACACAC GTCTGCTATA 7440

TGGTAGAGCC TATTGCTTCC AGACTACAAA CCTGTATAGC ATGTTACTGT ACTAACTACT 7500

GTAGGCAGTT GTAACACTGG TATTTGTGTA TCTAAACCTA TCTAAACATA GAAAAGGTAC 7560

AATAAAAATA CAGTATTATA ATCTTATGGG ACCACTGCTA TATATGCAGT CCATCATTGA 7620

CTGAAACATT ATGTGGTGCA TGACTATAAT AGGATCAAAC TATGCCTTTG CAGAAATCCC 7680

CCTGGAAAGC CTCTGAAACT ACCCTGATCT TAGAGGCAGT TTTATAAATC ACGGCCAATG 7740

ATTCTCAGCC TTTGGGTTGT GCCAGAGATG TGTCCGCTCT CCTTTTGCAA TGACCCTAGA 7800

GGTAAAGGTG CTCTTTCTTC TTCTGCTTCT CATGAAAAAA TGTAAATGTT GTATTTTAGC 7860

TTCTTTTCCC AGTCTAGTAA TATCTTGTTA AATTTACAAG ATTGTAGCGG TGCCTCCAAA 7920

AGGGGATAGC AATAGTTACT TTGAAAATGG GTGAGTTCTT TGCAACCATC TCTGAGTTGA 7980

ACAGTTCTTG TATAATCTGT CTTCCCAGTT AGGCTGTGAG CCGCCTGAAG GCAGCAAGTG 8040

TATCTTTCAC TCTTCTCTGA TCTCCTCAGC CACTCTTCTG CCCCACAATT CCAAAAATCA 8100

GTTACCAAGC CATTGTAATT CCTTTTCTGA AATGTGTAGT AGACTCCTTT TAGGGTATTT 8160

GCCCAGTTCA CAAAGACCCC TGCCCTCTTT GGAAATCTGT CCTTGCAGCC ATATATGGTT 8220

TTTGTTTGTT TGTTTGTTTG AGACAGAGTT TCACTCTGTC GCCCAGGCTG GAGTGCAGTG 8280

GTGCGATCTC GGCTCACTGC AAGCTCCCCC TCCCGGGTTC ACGCCATTCT CCTGCCTCAG 8340

CCTCCCAAGT AGCTGGGACT ACAGGCGCCT GCCACCATAC CCAGTTAATT TTTTTGTATT 8400

TTTAGTAGAG ACGGGCTTTC ACCATGTTAG CCAGGATGGT CTCGATCTCC TGACCTCGTG 8460

ATCTGCCCGC TTTGGCCTCC CAAAGTGCTG GGATTACAGG CGTGAGCCAC TGCACCCGGC 8520

AGCCATATAT GTTCTATATG ACTCTTTCTG AGACAATAGC TGATTAGAAC AGTGATTAGA 8580

ACTGTGATTT CTGAGACAAT AGCTGATTTC TGAGACAATA GCTGATTAGA ACAGTTGCCA 8640

CGAGCTGGAC CAATCATATT AATATTCTCT ATCTCTCTCT TTTGCTCTCG AAATCTCAAA 8700

TTGAGATTCA GAAACAGCTA TGTAGTCTCT GTTTGTGGCT AGAACTGTAA CATATGAACC 8760

CAGAGCTAGA GAGATGCAAT ATTCTATCAA GCAGAGAGAG AAGCAGAGGA AGCCGGTCGG 8820

CACAGACGGA ATGCAGTAGC ACACAGAGAG AAGCAGACAC TCGGAGATGT CTGACACCTT 8880

TCTGCTTAGA TTCCAGTCAG TTCAGAGGCC CAGACGCATT CCTGTCTGGA AGCATTCTGA 8940

TCCTGTTTTG TAAATCAACA ATAAATCCCT TGCCACCCTC TTTGCGTGTT AGCTTAAGTT 9000

GTCTTGCTCT TAAAAATCTA AAGAGTTCTA AATGATATGA AATGTCTGTT ATACAGAAAG 9060

TAGAATGACA ATTGCCAGGG GCTGAGAGGA GAGGGAAATG GAAAATTGCT CAATGGTTAT 9120

AGTTTTAGCT TTGCAAGAGG AAAAAGTTGT GGATATTGGT GGCACAACAA TGCGAATATA 9180

CTTACCACTA CTGAGCTCTA TGCTTAGATA CGGTTAAGAT GGTAAATTTT ATGTTATGTA 9240

TATTTTATCG CTGTTTTTAA AAAAGTTTAA AATAGCCTGT TGTAGTCAGC TTCCTTGTCT 9300

TCCTTACTAC TGCAGCCATA TTCAGGTCTC CATGGCCCAA GGTATGGACA ACTGTAGTCA 9360

CCAAACTGGT CTCCCCACTT CCACCCCTTG GAATTTGGTC CCCAGCAATC TACCCTACAT 9420

GCATGGAGCA ATCAATATTA CCCATAAAGC ACTAACGCTG TGCTGTACTC CAAAATGCAA 9480

ACCTTCATGG TGTCCCATTG AATTCAGGAT CAAGTTCATA CTCCCCAGCT TGTCATACAG 9540

GACCCAGTGA TCCTTTCCAA CCTTCTGACC TACTGATTCC CAGTAGGAAG CAAACCCTAG 9600

CAAGACTGGT CTGCCTCATC CCAGAACAGT ACTTACTCAT GCTGTTTCCT TGCCATGATT 9660

ACCTTCCTTC TCCTCACCAC ATCTTATCTT TCTTTCACTT GATCTTAGTC CAAATGCCGA 9720

GAAGCAATCT TATCTTACTT TCAAAGCCCA GGTTCAGACC CATCAATTCT ATAAAACATT 9780

TCTGACCACA CTAGTCCTCC ATGGACATTT ATTTGAATTG AACTTCTTAG CATTTAAATA 9840

TACACAGTTT CTTATTCATC TGTCTTGTTC TTCTGCTAGT TTATAAATTG CTTGATTATA 9900

GAACATGAGC TTGATAATCT TTGATTTTTC CTGGATACTG TGTTCTTGCT AGGCTGTTAA 9960

TAATGCTTGT TGAATGAAAT GAGAAATGAA GAACGGCTGC TTTACCAGTT TGTCTCTTCT 10020

GCCAACTTTT TTACATGGAT TTTACACGTC AACTTTTTTA CACAATGATT AAATATACCT 10080

AATTTGATCA TCCCAACAAC ACTAGTAAAT ATATATGATC ATTATCCTCA TACTACAGAT 10140

GAGGAAACAC AGGCACACAT CGTTTGTTTG TTTTTTTTTT TGAGACGGAG TCTTGCTCTG 10200

TTGCCCAGGC TGGAGTACAG TAGCACGATC TTGGCTCACT GCAACCTCTG CTCCTGGGTT 10260

CAGGCCATTY TCCTGCYTCA GCCTCCCGAG TAGCTGGGAC TACAGGCATG TGCCACAATG 10320

CCTGGCTAAT TTTTGTACTT TCAGTAGAGA TGGGGTTTCA CTATGTTGGC CAGGCTGATC 10380

TCGAACTCCT GACCTGATGA TCTGCCTGCT TCGGACTCCC AAAGTGCTGG GATTACAAGC 10440

ATGAACCACT GTGCTGGGCC AAGCACACAT AGTTAAATAA CTTGCAAAAA AAAAAAAATC 10500

GTATCTATTT GTAGGAGGCA GAGTCGTGAT TCTGAGCTGA ATCTATTTGG CTCCTAAGCT 10560

TATGCTTTTT CTACAGTATC ACCACATATC CCATACTCTA TTGTTATTGT TGGCTTTATT 10620

GCCTGTTTTT CCTGTGAATT TTAACCTTCC CAAAAGCAGG AATCTTATCT CAGTATATCA 10680

CAGAGAATCA CTAAGTATCT A AGAGGAAA GGAAGGAGAG AAGGAAAGAA GAAAAGGAAG 10740

AAGGAAAGGA GGGAAGAAAG GAAGAAGGAA AGGAGGGAAG AAAGGAAGGA AGGAAGGAGG 10800

GAAGGCAAGA GGGCAGGAAG ACAGAAAAGA AGGAAGGAAG AAGGAAGGAA GGGAGGGAGG 10860

AAGGAAAGAA GGGAGGGAGG GAGGAACGGA TAGGAGGGCA GAAACTCTGG AAAGGAGCTT 10920

GTCTTACTCC TAAGCTTGGT AAAGATCAGT CTTGCAAGGG GCTTGACTAG AAAACACTGG 10980

CTTATCTCAC TGAACCATAT TCCCAATGTC ATTGACTCCT TTCCCCTGGG GAGTAATTCA 11040

ACCATGTGTT CACTGTATGG ATCAGAGTTG ATGATGAATA TTCTCTTGCC TCAGTCTCTT 11100

TTGGCCAGAG TTCCTTGGCT TCCAGCCTGC TCCTTGCTTG TTTTGAACGA ATAATATATG 11160

ACTTTCCTTC TTAACTGGCA AATGCTGAAC TGTGGCCTCT CTTAACCCTC AAGTCTCCCG 11220

ATAAAAAGCA AAATATTAGA TTCGCTGACC AGCGCTACTC CTTACCCCGG CTGATTTCAC 11280

ATGAAGAGCT ATATATGGGG TGGTAACATA GGTTTAAGGA TGGATGTGCA TATAACTCCT 11340

GGATACCGTT CCTGAAAATA TACTATTGGG GATTATTTCT TTGGTTGAAG AGTCCCTTCA 11400

CTACCACATG TCAGTCCCCT TACCTATAAA ATGGGAACCT TAGGGTTGTT ATAAGGATTA 11460

AATGAGTTAA TGTGTATAAT GTGCTTAGCA CAGTACCTGC CACTCAATGC TATTATTGTT 11520

GTTGTTGTTA TTATTATTGG TAGTAGTAGT AGCAGTAGTT GTTGTATGAA GATGCATGAT 11580

TTCCTGGGAA AGGTAGCACA TTAAGGCAGG ATCAGTCATG AGTTACCTCA AGCAGATTAA 11640

TTTACTAGCC CTTTCATGCT ATTTCCCAAA GGGATGGTTT ATCAAGTTGA GGAAGATGTA 11700

GATGTGATTT ATGATGGATT TGAGGTTAGT ACTGTGTATC CAGGTTGTGT GTGAGAAGAC 11760

AAGAAGGAAC TGAGGGCACA GCTGTACTTA GGAAGAACTC TGGTTTGCAA GGTACATAAG 11820

CTAATTCAGA CGAGTTTAAA CCATAGGAGA TTTTGTTACA AAGGCACTAG GTAACTGCAG 11880

GGACCAGGGA GCAGGGTGTC CACTCTCATT CCAGATTCTT TTGAATTCTG TATATTTTAT 11940

TCTCTTTCCA CAAACAGACT TTCTATCCAC GGTGGTGATG ATAACCAATA ACATTTCCTT 12000

CAGTCTCACC CTTGTAGCTC TGTGACCAAA AATGCAAAGC TGCTGCTTCT CCAGCTTCAA 12060

AATTTAATAA GAATCACAGG GCAGAACATT TATTGGCTAG GCCTGAGTTG CATGTCTAAC 12120

CTTGGAGAAC TCACTTTGAA TAGGGGAATT CAGAACTAGG ATTGGTGGCT CCACAAATCT 12180

CACAAAAATG GAGCAAARTA GGAACTCATC AAACAGAAAT CAATAGATCT CCACTGGCTT 12240

TA AGTACGT GGTTCTGGGA ATCCAGATAT TCAGAGCCTA GGTGAACCTG AACATTTCCC 12300

TTTAGGCAGA TGGAAATCCA CGTTCTTCTA GCTAAAATTT TTCCATTCTC TTTGAGGGGA 12360

GTTTCCATGG AGAGGCTAGC TTTGTGGGAG AGAGTGGGAA RAAACAACTC ATGCTGTTTT 12420

TCATTGGGGA CCATTCTTAT TGCTACTTTA GTCCAGTCCT GCCCACGGAT CACACATTAT 12480

TCCTTACTCT TGTTGCTTCT GGGCTTTTTC TTTTTCCTTT GCATGCTGCT TATATTCCCT 12540

TCCCTAAAAG CTACTCTATT AAGAGGGAGA TTAGGCAAGT AGGCTGGTTT GATTATGTGC 12600

TGGTTTAACC CATAATCACA TACCTCAAAA AGAAAATGTC AGACACACTA TAATAGCTCC 12660

AGATACAAAA CATGAAGTAC GAAGACCTCT TCAGAAAACT GCAGGCTTGC TACTCACCCA 12720

CAGACAAATA GAGCTGATTC TATTAGAACA GTGAGGAAAG AACACAGTAA AGAATGGCAT 12780

TTAAGATCAA TTGTGGCAAT GTCTAATTTT GTCTGGGAAG ACCATGGCAG TGAGGGATGC 12840

AAAGGGATGA CATCAAGTTT TCAGAACAGT GCCTATATGT TTAGGACGAA GAGTTAAATA 12900

ATGAGAGAAA ACAAATGCAA TACAATTTCA TTGGCTACCT GGTTAGACCT AGCATGAACT 12960

GTGTCTGTGA TGGTGCTATT AATTTGTGAT GGAGACATTG GATATTGTCT TTCCCTATTT 13020

GGTAAGAGCT TGATTCAGGT AGAGAGAAAC AATAATTATT TTACAGTGTA CAAAGCACTT 13080

TCTTATACGA TATATTATTT TCATCCTCCC AACTAGTTTG ATAGGCAGTA ATATTATTCC 13140

CATTTCACAG AGGGGGAAAC CTGGGTTAGG GCCCAGGAAC TTGGCTGGTG AGTTTGGAAA 13200

GCTTGAATAG CAATGATTAT AATCTTGGTG CACAGAAGCA GCCAGTGAAA TTCTGAAATG 13260

CATATTTCTG TTCTCTACTT CCAGAGGGTC TGATTGAGTT AGCTTGGGGA AGGGCCTAAG 13320

AAATGGAATC TTTTTTATTC ACACCAGGTG ATTTTGAAGC ATGGGGTCTA CTGAGTATGC 13380

TTATGAAACA TTAACTTTAG GTCCTAGGCA CTGGCTTAGT TGACTGTGAG AAACTGAAGC 13440

ACAAAATTGT GTGACCAAGT TCTTTCTGAG CCTCAGTTTC CTCACCTGAA AAATGAATGA 13500

TGATGATAAA AATAACTAGG CTCCATGCCA AGTGATTTAC ATATTTCCCC TCAAATCATC 13560

TTTCTTACAA ACCTAGGAGT TCGGAGGCAT TGTTGTTCCT ATGCTATGGG ACTCAAACCC 13620

AAATCATTTC TACTCACTCT TCCTTTCATA ATTGTCAGGA AGATTAGACA TAGAAAGTAT 13680

CTAGCACATA TTCCTGATGT TGAAGGAATA GCAGCAGCTG TTATAACTAC TACTAAAACT 13740

GACAATACTG ACCATACAGC CACCACTAAA ATGYTGGGGT TGAATTCAGA TAATCTCTAA 13800

GGTTCTTCCC AGCTCCACCA TACCCTGATT TCAGCATTTC AAATATATGC TGTATTTGTG 13860

GGGGGGGTTC CTAGAAAGAG TGTGGCAGTA ACTGAACTCA ACTATACAAA AGACCGAATT 13920

CTTCCTTTAG TTGGAGATTT ATTGATTTTT GTAAGTGAGT TTATAGACAA AAACGAGGAA 13980

GATACAGAGA AAAAAGAGAA GAATTACTGT GCTTTGATAG TAGGGCTATG GGTGATTATT 14040

TTATTTTTAA AATTTTATTT TTTATACATT AATGTGGTTT CTATAACAAA CACAAATTTA 14100

GAATAAAAGT AAGATATTTC TCTTGTGCTT CCAATTTACC ATATACTTCT TAAATGTATT 14160

TGTATCATAA TCATCAGCTG TAAGTTTACT ATTAAAAAAA ATCAACAAAA GAACAATATC 14220

AGAGCTAAAG GACTTCAGGC CTGATGAACC TAAGTCTAGT TTCTGTGCTC ACTAGCCTTG 14280

GCTTATCCCA AAATATTAAA AGTAAAATAT GATCCAATCT GCATCTCTTG CACATGTCAT 14340

GTTTTGTAAA TAGAAAGTTC TTGGAACAAT CTGTAACATC GTTGAAGTAC TTCATTCAAT 14400

TCTTGGGCAT TAAATTTTAT CTTCTGTTCC TGCCTCATAT CATTAAACAG TACCTTCACC 14460

TACATTGCAG TCAACTATGG AGGACTAATG CTCTATTTTT TTTATGTTGA ACATGAAGCA 14520

TAAACATGTA CAGCTCTGAA CCTGAGTTTT CCTTGCTTTA GAAATAAGAG GTGTTGATGA 14580

AAGAGGAAAT CCCTGAGACT CTGTAAACCT WACCTGCAGG TATGAGAATA CAATCTGTGT 14640

TTWATTTATK GTATTCTTWA GCAAAATTAT AGTAAAATTA GTATTTTTCT TTTCATTTGC 14700

TCTCGAATTA TCCTTTAGTA ACAGAGTGAA CTTGTATGTC CATATTTTGG GTTTAAAGAA 14760

CATGGTTACT GTAGCAAAGA AGGGGCTAGC CCATGTATTA AGGTCCTGGA TTATACTGTT 14820

GCTCACAGGA GAGCATGGGT TTGAAGATGA GGCTGCATAG TAAAGTAGGT AAAAGTTTGG 14880

ACCTTGGGGC CAAACTGCCT AAGCTCAAAT CATGGTCCTG CCAGTACTCT CTGTTCGACC 14940

TTTAGCAAGT TACTTAATCC TTGTAGACCT CTGATTTGGT CTCTTCAAAA TAGGGATAGC 15000

AATAATGCCT GTCTTATAGA GACATTGTGA GGATTCAATG AATTGATATT TGTAGAAGAA 15060

TATTGAGTTG GTTTTGCTAG AAGATATTAA GTGCGCAGTC TTTCTAAAAT AACTAAATGC 15120

TACAAAAAGC AAAATAGCCA TTCTGCAAAG AGCAGTGATT GAAGCAGGAA AAATGCCTGC 15180

CTTCATAAAG CTTACATTAT AAGGAGAGAA AAATAAGCAA AACAAACTAC GTGGTATATA 15240

TGTAAAATAA AAATAAAGAG GGGGAAGCAT GGGGTGGGGC AGATATTGCA GTTATAAATA 15300

GAATGGTCAT TGGAGGCTTT ATTGAAAAGG GGACATTTGA GCAAAGTCTT CAAGGGGGTA 15360

TGGAAGTGAG CCATGTGAGT ATTTTGGTGT AGGGAAGGAA AAACATCCTT CTACCCTCTT 15420

AGGTTTGGTG GCTAACCTAA GAATTAAAAC AACATAGATT AACAAGAGAA AAGCATGCAC 15480

ATTTATTTAA TGTTTTTATG TATACATGGG AGTCCTCAGA GAAAAATGAA GACCCAAAGA 15540

AGACTTTATG CCCCAAAGCT TATATACATT TTTTACACAA AGAATGATAA ACTGTGGAGA 15600

TGTGACAAGA CAAAAGGCCT TGGGCTAGAA GCAGTAAATT GTGGGAGTAA GGGATATACA 15660

GGCGAAACTA GTGGAAAATG AGGATGATTT TAGTTTTTTT TTACAGGTCC ATTTCGATGA 15720

TAACTCCAGT CATCTCTGGT GATACTATTC TTCTCTTCCT GGCACAAGGA GGGCACCTTT 15780

CTCATGGGAA ATTTTATGAC CTGCTTTTTG GTAGAAAGGG GAAGTCTGAG AGCTCTTCCT 15840

GCCCCTAGTG TTTCTCAAGC GCCTTCAGCT CAAAATAATC ATTATGCCAA AGTGGCATAT 15900

TTTGAGGTGG CATGTTCTGA GCCATTTCAT GGGGTAAGGA TATTCCAGGC TGAAGGAACT 15960

GGGAATGCAA AGGCCCTTAG ACAGGAACAT GCCTGGTATA TTCAAGAGAC ATCTGGGAAG 16020

CCAAGGTAAT GAATGACAGC AGAGCATGAG GGTGTGGGTG GCAGGAGATG AGGAGATGGT 16080

ACAGGAGGCA CAAATCAGGC AGCATGTTAT TGATCACCGG CAGAGCTCCA GGTTTCATTC 16140

CATTCTGAGT GACATGAACG GCCATCAAAG GTGTTTGAGT AGAGGAGTGA CTGTGTTTAG 16200

AATGGACTGC AGGGGAATAA GGGTAGAAGC GGGAAGACCA GTTAGAAACT GTTAGAGATG 16260

ATAGTGGCTT AGACCTGAGT GACAGCAGTA GAATAGGTAA GAGATGGATT ATGAGTGTGT 16320

CTGGCTGATT CACTCTTATA TCCCCTATGC TAAGGCATCA TGCTTGGCAC ATAGTAGGGA 16380

CTCAATAAAT ACTTGCAGAG CGAATGAATA AATGGGAGTT CAACTTGGGT AAGGCAACTT 16440

CTCTAAGGCT CTGTTTCCTC ATCTCTAAAA TGAGGGTAAG AAAAATATTA ATAGATCTAC 16500

CTCCAACGGT TATTGTGGAG ATTAAATGAG GTCATTCCCA TGCATTGCTT AGCATAGTAA 16560

CTGAAACATA AGATAGGGCT AAGATGTATA CATACACATA AATATAAAGC ATTTTTGCAA 16620

GAGTTTACCT TTGGAGACAT GGAGGAAGGT AGACTTTTAT TCTTCATTTT ATGAACTAAA 16680

AGCAAAAGAA GAAAACAAGT GTTGAAATTA TGAGTCATTT TCAAGTTCTT TTTGTACTTT 16740

TCACTACCAT TTGGAATTTT CCTATAATGA ATATGCGAGG CAAAGACAGA AATGAAAGGA 16800

TAAGATCACT CAGAATTTCA GGTTTTTATA AAGCATCAGA AATGTAAGAC TTTTTTCTGC 16860

TACTGCATGG CCCATTTCTC TGACTCTTTG AATGTGGGTA TTATTCTCAT CTTTCTCCCT 16920

CCTCTTCTCT TTTTGGTTAA AAGTAAAGAG AGCTTTTGAA GCTATTATGG AACAAGAACA 16980

ACAGCCTAGT TCATCCTCAC ATTTTGGAGC CTCTTATTCC TTCCAAAGAA CAAACACATC 17040

TATTTAGTGG CTAAGAGTCT CTTGAGCTGA AACCATTCAT CACCATAACT ACATTCAAAC 17100

TGTCTGAGGT ATACATTATA ACTAAGAAAA TGGGGTTCCT CATTGGAATT TACAAACTAA 17160

ATATTCAAAG AAGGGTTCTG ATGCTTTTAA AATAGGGGCG CCACCAAAAG GTAAAGTAAG 17220

ACATGTGGTT GAAGACACAG GAAAGGGCAG AGGTCACCAG AAAAGTTGGT TGTCACGCCT 17280

GATCTTAGGG CCTCATAAAG AAATAATTAT GGCAGAATGA GCCCTAAGAA GCAAGCACTT 17340

TAGCATGGCT CTCCCTGGAC AAAGTGGAGA GGCCCTTCCA CCCTAACTTA TCCTATTGTC 17400

CTGGTCTTCA GTCTTTCCTG TCTGTTTGCC TTTCCTGGTG TTAATATACT TGTTCCTAAG 17460

GTTTTCACCC TGCTGACTTT TAGCTCTTCT TGCTAAGATT CCTGGCTGTA CATTAGAAAA 17520

CTCCTGAGCA ACTAAACACA AAAAAATATT TGGCAGGGGG ATAGGGGGTG CTTCTAGGCC 17580

CTAACTAAGA CCTGTTAAAT TAGAGTCTCT TTCGGGTGGC TCCTGGGCAT TGGGGTTTTT 17640

TTGTCCTTTT TTTTTTTTTT TTTAAATCTA AAGCTTCCCA GTTGATTCCA ATATGTAGCC 17700

AGAATTGAGA CCAGAAAGCT GTTAATACCC AAGTAGTATA CTAATATTAA TAATGATCAT 17760

AATAGATTAA TAACTAACAT TGAATGAACT TTAAATGTGT TAGCTGATTT AATTCTCAAT 17820

GACTCTGAGG CAGTTACTAT TATTATTAAT GTACCCCTTC TACAGATGAA GAATTCAAGA 17880

TACCAAAAAT CTACATAATT TGGCAAACAA GTAAATGCTA AAGTTGGAAT TCAAACACAG 17940

GTAGTTTAGT GTCCGAGCCC ACACTCTTCA CCACCACACT GGTGGATTGC CCACCTGCAA 18000

TGTTAAAAAT CGCAGAGGAT AGTGATGATA CTGCAGACAC ACTGCCTGCA TTTTATCTCC 18060

TCCTTGTTAG GCTGAGCCAT TCATACCTCA GTGGTCCACA CCTTAAAGGC AGGATATAAA 18120

GGTAAATATA TGTACCTTCT CTGATATGAA CTAGAGACTC CATCCCTTCT TTTTAAGTAA 18180

TGTAAATGAT TAACCAGCTT TCTGTTATTC CTTTCAGAAT CTCATTCATA GAATAAATTC 18240

CTGGCATAAA TTAGTATCAT AAGTTTTCTA TTATTGCTCA TTAATCAGTA TGTGATGTAA 18300

GATCAAGCAG TAAGAGTTCC CCCCAACCCC AAAGAATGGT CTTTCTGTTT GTGACAAATT 18360

ATTCTTGGCA ATGTAATTAG CCAGTTGGGT TATTGAGGGG GATCCACTAG TTCTAGAGCG 18420

GCCGCCACCG CGGTGGACTA GAT 18443

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11811 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CCTGTTAAAG TTTACCTTGT ATCTTAAAAC TTGCCCTAAC CGGATTAATT TTCTGGCCAA 60

ATAGGGAGGC TGAATGAAAG TTTCACATAA ACCTTAGATA CTCCTAATTA ACTGTTTTTT 120

ATGTCTGTTT TTCTAGGACA CATGTTCAAA GAGCATAATT AACTTTTTAA AAGAAGCTAG 180

TAAGTACTGA AATAGTTTTT TAAGTTTTTT CTACAAGAAT AGAGGAAGAA AGGAAACATG 240

GAATTCTGAA GGGCTACTTA GCAAGCTGCT TATGGCATAA TCTGGGGTGG GGGTGCATAG 300

TAAAGGATTT GCATTTTACT GAGACCGATA CATGTCAAGG GAATGGTATT TAAAATTAGT 360

GATATGTGTT GATTTTTCAA GGACTATAGC CCATCAACTA CAATAGGCTC CAAAAAATTC 420

TGGTGAAATT AGCTTCTTGG AGCCTTCCAG TTTACCTACT ATGTTATTCC CACTATAAAA 480

TATTCTCAAC TTTTGGGGTT TTAGCCACTT AAGTTTTTTA TTTTCTCTAA TGTCTCTAGT 540

ATCTGCTTTA GTTTCCTGTC AATGCTAGAC TCTGTGGTTC AGCAGTTCAT CCATTCTCTT 600

CCCAGTACTC AACCTCGTTG CTTATAGTTT CATTACATTC ATCTAGCAAA ACCTTAATTC 660

TGTATGTTTG CCATACCATT AGTGCTTAGA GCATTTTTTC AGAAAAGAAT CCTGGAAAAA 720

TGGATCTTAT CTCACCTGGG CCCTCAGGAC TGCTGGGCTG CCTGGTGTCA GCACTTCCCG 780

CCATTTTCTA TAGCACCAGT ATTATTCTTA ATACTTTAAA AAACCACCAG GCACGGTGGC 840

TCACGCCTGG AATCCCAGCA CTTTGGGAGG CCAAGGTGGG CGGATCACAA GGTCAGGAGA 900

TCAAGACCAT CCTGGCTAAC ACGGTGAAAC CCTGTCTGTA CTAAAAATAG AAAAAAATTA 960

GCTGGGCGTG GTGGCATGCA CCTGTAGTCC CAGCTGCTGG GGAGGCTGAG GCAGGAGAAT 1020

GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGAT TGCACCACTG CACTCCAGCC 1080

TGGGTGACAG AGCGAGACTC CGTCTCAAAA AAAAAAAGTA AATAAAAATA AAAAACCATA 1140

TCCCACTATC TCCCCCTTCT CTCTTTGCCT GTGATCTTGC TGCATACTTA TGGGGAAATC 1200

TTTAAGATGT CAGATTTCAG TTCTCTCACT TTTCTACAAC TTCTCCCACT TTTGCCTTTC 1260

TTATGTACCT TCCCTTCCTT CCCATCTGAT TCCTTATCAG TATTTACACA TGATTAGTTC 1320

TTGCCTAACC TAATAGACCC TTTCTTGAGT GCAAATCAGT GGCTATTTTT GCTAGGGTAT 1380

AAAAATTACC TATCTAATCA CCTTGACAAA GTTACCCTGT TATTTCCAAT AACTTACTTC 1440

CTATGGATTC TTGTAGATTT TCTTTTTTTT TTTTTTAATT TTTTTATTTT CAGATGTTTT 1500

CTCGCTTTGT CACCATGCCT GGCCTAAATT CTCGTAGGTT TTCTATGTAA ACAATCAGAT 1560

TTTCTGCAAG TATTAGTCTC CTTTCTAATT GTTATAATTT TAATTTCTTT TTCTTTTTAA 1620

AATTTTTCGT AGAGACAAGG TTTTGCTATG TTGTCCAGCC TGGTCTTGAA CTCCTGGGCT 1680

CAAGCAATCC TCCCATCTCA GCCTCCCAAA GTGCCATTAC AGTGGCATGA GCCACTGTGC 1740

CTGGCCAAAT TTCTTTTCTT GTTGCGAAGG CAGACTTTTC ATACAATACT GAATAGAAGT 1800

GATAGTAGAT TACTTTATTT CTGATTTTCA AAGGAATGCT TTCCGTTTCT CTCTGTTGAA 1860

GATAATTGCG TATTGTTTTT TTTTTTAAAT AGTAACTTTT ATCAGGTTAA GGAAGGTTTC 1920

TTCTATTTCT ATTTAAAAGG ATTTTTTAAA ATCTTGAATT CATATGTTTT TATCTAATGC 1980

ATTTTCTACA TCAGTTGAAA TGGTTGTATG AACTCTTTTA ATATGGGTGA ATTATATTTA 2040

TAGATTTTAT GTTAAAATAT CCTTGTATAT CTTGGATAAA CTCAACTGGA TCATGATTTA 2100

TCTTTTTTAT ATGCTAGATT CAATTTGTTG ATACTTTGTT ATGATTTTTG AATATATATT 2160

ATTGTGTAAA AGTGAGCCTG TGATTTTCTT TCTTGTAATG TTTCTGTCCA GTTTTGGTGC 2220

CTGGTTTTGC TCTCTCCTTA GAATGAGCTG GGAACTAGTC ACTCTTGTTT TCTCACCTAT 2280

AATAGCATCT GGGTCCAGTG TTTTTTATGT GGGACAAATT TGAACTTGTG GTCAACCTCT 2340

TTAATTGTAA GAATATTCAG GTCTTTTGTT CTTCCTGGGC TAGTTTTTTA TTCTTTTTCT 2400

AGAGATTCGT TCATTTTTCT TAGTTTTATT TGCCTATAAT TGTGGATAAT CTGTTTTTTA 2460

TCTGCTACTT CTGTAATTAT TTCCACATTT GATTTATAAT ATTAACTTGT GGGCCAGGCG 2520

TCGTGGCTCA CACCTGTAAT CCCAGCACTT TGGGAGGCCG AGGCGGGCGG ATCACGAGGT 2580

CAAGAGATCG AGACCATCCT GGCCCATGGT GAAACCCCGT CTCTACTAAA AATACAAAGA 2640

AAAAAATTAG CCGGGCGTGG TGGCAGGCAC CTGTAGTCCC AGCTACTCAG AAGGCTGAGG 2700

CAGGAGAATG GCGTGAACCC AGGAGGCGGA GGTTGCAGTG AGCCGAGATC GCACCACTGC 2760

ACTCCAGCCT GGGCGACAGA GCGAGACTCC ATCTCAAAAA AAAAAAAAAT TTACTTGTGT 2820

CTTCTCTTTT TACCTGTTTG TTAATTTATC AAATAACTAC TTTTGGCTTT GTTTCATTTT 2880

TATTATACAA TAAAATGAAA TTCTTTTCAT TGTATTTCTT TTCATTGATT ATTCCTATAA 2940

TTCTTAAACA ACTTTATAAT TGATGTAACA ATAACCTGTA CACATTTAAA GTGTAAAATT 3000

TATTACATTT TGATCCATGT ATATAGCAGG GAAATATCAC CACAACAAGA GTGTGAACAT 3060

ATAATCTCTC CCCAAAGTTT TCTTGTGTCT TTTATAATCA CTGCCTCTTG CCCCTGCCCA 3120

CTCCCTCATC CTTAAGCAAC CATTGGTCTG TTTTCTGCCA CTATAGATTA GATTGTATTT 3180

TCTAGAGTTT TATACAAGTG AAATCATGTA GTATAGTATT AACCATGTGT TTGTTTGTTT 3240

GTTTGTTTCT TTCTTTCTTT CTTTTTTTTT TAGACGGAGT CTCGCTTTGT CACCCAGGCT 3300

AAAGTGCAGT GGGGCGATCT CGGCTTACTG CCAGCTCCGA CTCCGGGGTT CACACCATTC 3360

TCCTACCTCT GCCTCCCGAG TAGCTGGGAC TCCAGGCGTG CCCGCCACCA CGCCCAGCTA 3420

GTTTTTGTAT TTTTAGTAGA GACGGGGTTT CACCATGTTA GCCAGGATGG TCTCGATCTC 3480

CTGACCTCGT GATCCGCCCA CCTCAGCCTC CCAAAGCGCT GGGATTACAG GCAGGAGCCA 3540

CTGCGCCCAG CAACTATGTG TTTCTGATCC TTTGTCAGGG CTAGCCAATT CCTAGAGACA 3600

GTGAATAACT CACTCATAAT CTAGCTGCCT CCTTTATGTC GCTCTCATAG GACTTTGACA 3660

CCTCTCTGCT ACAATCCACC TGCCCTGTTC ATTTCAAGAT CAGGTACCAG GAAACTCGGG 3720

ACATCCCTAT GCTGCAGAAC TCACTGAAAT TATTCAAACT AGCCAGTCCT AAACATGCTT 3780

ACCCTGCCTT GCCCATTCCT TCCGCTGAAA CCACATAAAG GCTCTTGCCC ATGTTTTCAT 3840

CCCATTCCAT TGACCTCCTT ACTGACCCTA GCTAGTGCTT CCTCATGTGG CCCCTGCATG 3900

GCATGGTGTG CACCTTCCTC TTCGGAACTG CGAGTAACTG TCTTGTCAGC GGCAATCATC 3960

TTGTGATCTG TTGGCCTCAT CATATTTGAA TAACAATAAA ATCTGTTTTA AGGCTGGGCG 4020

CGGTGGCTCA TGCCTGTAAT CCCAGCACTT TGGGAGGCCA AGGCAGGCGG ATCACGAGGT 4080

CAAGAGATTG AGGTGAAACC CCCTCTCTAC TAAAAGTAGA AAAATTAGCT GGGCATGGTG 4140

GTGCGTGCCT GTAATCCCAG CTACTCAGGA GACTGAGGCA GGGAATCTCT TGAACCCAGG 4200

AGGCAGAGGT TGCGGTGAGC CAAGATTGCA CCACGGCACT CCAGCCTGGT GACAGAGCGA 4260

GACTCCATCT CAAAAAAAGA AAAAAAAAAA ACTGTCAAAT GATACTCCAA AATGGTTGTA 4320

CCATTTTATA TTTGCAACAA CAATGTCTGA GGGTACTGAT TGCTCCATAT CCTTGACAGC 4380

ACTTGGTATA GCTGATCTTT TAATTTTAGT CACTTTAGTG GGCATATACT GGTATTTTAT 4440

GTTTTACTTT TTATTTTCCT AATGATTAAT AGTTTGCAGC ATCTTTCATG TGCTTATTTC 4500

CCTTTCATAT ATCTTCTTTG ATAAAAATAT CTGTTCAAAT ATTTTGCCCA TTATTTTGTT 4560

GGAATACTTA TTTTCTTACT GTTGAGCTTT GAGAGTTCTT TATATATCTG GATACCAATC 4620

CTTTGTCAGA TATATTTTTT GCAAAATTTT TTCCCAGCCT GTGATTTAGT TTGTTATTCT 4680

CATGTC TTT AAAAAAAATT GTAGTTAAAA TATACACATA ATACAAAATT TAACATTTTA 4740

ACTCTTTGTA AGTATACAGT TTTGTGGTAT TAAGCATAGT CACATTGTTG TGCAACCATC 4800

ACCGCCATCC ATCTCTGGAA CTTTTTCATC CTCCCTGACT GAAATTCTGT ACCCATTTAA 4860

ACACTAACTT CTCATTCCCC CTTACTCCAG CCCCTGGCAA CCATCGTTCT GTTTTCCTTC 4920

TCTATGAGTT TGACTGCTCT AAGTACTTCA TATAAGTGGA GTCATACAAT ATTTTCATTT 4980

TGTGACTGGC TTATTAGTAT AATGTCTTCA AGTTTCATCC ATGTGGTAGC ATGTGTCAGA 5040

ATTTCCTTCC TTTTTAAGGC TAACATTCCA TCCTATGTAT ATACCACATT TTATCCATTC 5100

ATCTGTTGAT GGACATTTAA GTTGCTTCCT CCTTTTGGCT ATTGTGAATA ATGCTGCTGT 5160

GAATGTTGTT GTATAAATAT CTGTTCGAGT TCCTGCTTTC AATTCTTTTG AGTATGTTCC 5220

CAAAAGTAGA ATTGCTGGGT CATATGTTAA TACTGTATTT AGTTTTTTGA GGAATTGCCA 5280

TACTGATTTC TATAGTAGTG GTACCATTTA CATTCCAACC AGCAGTGTTC AGGGTTCCAA 5340

TTTGTTAACA TTCTTGCCAA CCCTTGTTGT TTTCTGGATT TTTTTTATTT TGGGGTTTTT 5400

TATTTTATTT ATTTATTTTT TTTTTGAGGC AGAGTCTCAC TCTGTCACCC AGGCTGAAGT 5460

GTAGTGGCGC AATCTCGGCT CACTGCAACC TCTGCCCCCC GGGTTCAAGC GATTCTCCTG 5520

CCTCAGCCTC CGAGTAGCTG GGACTACAGG CGCGCGTTAC CACGCCTGGC TAATTTTTTG 5580

TATTTTTAGT AGAGGTGGGG TTTCACTGTG TTAATCAGGA TGGTCTCGAT CTCCGGACCT 5640

TGTGATTCAC CCGCCTCAGC CTCCCGAAGT GCTGGGATTA CAGGCGTGAG CACTATGCCT 5700

GGCCATTTTT TATTTTTAAA CAATAGCCAT CCTAATGGGT ATGAAATAGG TTTTTTGGTG 5760

TTTTGTTTTT TTTTTTTGAG ACAGAATCTT GCTGTGTTGC CCTGGCTGGA GTTTAGTGAC 5820

GTGATCTCGG CTCACCTCAA CCTCCGTCTC CTGGGTTCAA GCACTTCTCC TGCCTCAGAC 5880

TTCCAAGTGG CTGGGACTAC AGGCGCCCGC CACCACACCC AGCTAGTTTT TGTATTTTTA 5940

GTAGAGATGG GGTTTCACTG TGTTGGCCAG GCTGGTCCAC GATCCATCCA CCTTGGCCTC 6000

CCAAAGTGTT GGGATTACAG GGGTGAGCCA CCATGCACAG CCAGGGTTTT GTTTTGTTTT 6060

GTTTTTACTA TTTTTTTTTT TTTTTAGAGA CAAGCTGTCT CCCAAGCTGT AGTGCAGTGG 6120

CACCATTCGT ATCTCACTGT AACCTCAAAA TCCTGGACCC AAGCAATCCT CCTGCCTCAG 6180

CCTTCCATGT AGCTACCTCT ACAGGGAATT GCCCCCATAC CCCGGGAAAT TTTTTTTTTT 6240

TTTTTTTTTT GAGAGTTTTG CTCTTGTTGC CCAGGCTGGA GTGCAATGGC ATGATCTTGG 6300

CTCACTGCAA CCTCCTCTTC CTGGGTTCAA GTGATTTTCC TGCCTCAGCC TCCTGAGTAG 6360

CTGGGATTAC AGGCGCCCGC CACCACGCCT GGCTAATTTT TTGTATTTTT AGTAGAGATG 6420

GGGTTTCACC ATGTTGGCCA GGCTGGGCTC GAACTCCTGA CCTCAGGTGA TCCACCCACC 6480

TTGACCTCCC AAAGGGCTGG GATTACAGGC GTGCGCCACC ACACCTGGCC CCCAGCTAAC 6540

TTTTAAATGT ATTTTGTAGA GATGAGGTCT CACTGTGTTG GCCAGGCTGG TCTTGAACTT 6600

CTGAGCTCAA GTCATTCTCC CACCTCGGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC 6660

CACCACACCT GGCCCCTTTG CCCATTTTAA AAATTAGGTT GTTTTTGTTG TTGTTGAGTT 6720

GTAGGAGCTC TTTGTATATT CTGCATTTCG GTTCCTTATT GGATATGTGA TTGGCATACA 6780

TTTTTTCCCA TCCATGGATT GCTTTTTCAT TCTGTTATAG TATCCTTGAT TCACAGAAGT 6840

TTTTAATATT GATGAGGTCC TGCTTAGTCT GTGTTTTGTT TTGTTGCTTG TGCTTTTGGT 6900

GTTATATCCA AGAAATTTTT GCCAAATCCA AAGTCATGAA GCTTTGCCCT CTGTTTCCTT 6960

CTGAGTTTTA TAGTTTTAGG ACTTAAATTT AGGTTTTCGA CCCATTTTTA GTTAATTTTT 7020

GCAAGTGGTA TAAGGGAGGG GTCCAGCGTT ATTGTTTCAC GTGTAGATAT ACAGTTTTCT 7080

GAGTACCATT TGATGAAAAG GCTGTCCATT GAATTGCTTT TGCAACTTTT ATTTGGGCAT 7140

ATTTATGTGA GTCTGTTACT GGTTCTATAT TTTACTCCAT TGATCTATGT GTCTATTCCT 7200

CTGCTAATAC TGTCTTAAAT ATGGTAGCTA TATAGTAAGC CTTAACACTG AGTAGATAGA 7260

TTTCTCCCCT TTTTTTGTTC TTTTTCAAAA TTGTCACTGG TTTGTTTTTA TTTTTTACTT 7320

TATGCAGATA ATCTGTACTA TACTTTGGTT TCATGTATCA AGTAGTTTGT TCCAAGTTGT 7380

GCTTTAAGCA GAACAAATAA ATTTTCATAT TGTTCTTTGT GTTAATCTGC AATATAAACC 7440

TATACCAAAT TCTATTTTGT GTATTTGTTT ATTGTAGTAA TCTGACTGAC TCTTTTGCCT 7500

CCAGACTCAT CTCTTTCAAG GTCCCCAACT GAATCTTGTT TTAGGTGGAA CTTAGAAGCA 7560

GTAGAAGTTA AGAATCTATT TCACAGCCTT AGTAGTCTAG TTTCATTCTC TATATAATGT 7620

TGTCTATGCA AGTGAGCTGC TCTCCAGTGC CTTAGTTTCA CTAATGTTGG GGAAGGTCTC 7680

TTCTCTTGTT TTGGACTTCT CTATCACATT GCCTTTCTCA AGAGAAGACA TATAATGAAA 7740

GTTGATATCT GGTGTTCTAG GACTTCTTCA GAAGCTTGCC AGTTTTTCAA GCTGATTTCT 7800

CTCACTGGCA ACTCTTCAGA GTGCTGTTCC TACTCCACCC TCCCCTGGTG GTATGTATCA 7860

GTTTTCTACT CATCAGCACC CACCTACTCC TGCCTACTGT GTTTCTCAGA TGTCTGCTGC 7920

CTGGCTAGCT CATTGCTGCT TTTGTCACTC ATAGAGCTGT CTTCTTCCCT TTTTTTGGCT 7980

TTCTGCCTGA CTTCCAGGGC AGCTGCTCTG TCATTGCCTG TCTGCCATTC TGTCTTTTTT 8040

CCCCCTACCC CCCACAGATA CAACATCTAC TCTAATACCA CACATTCTCC ATGTTCAAAC 8100

TAACCTCATC ACTTTCCCCA CCACATTCCC CAAAACTGGT CATCCTCCAG CTTATAGCAT 8160

TGCAGTTCAC TGAAGTTAGA CATCTGGGCC TTGCTTACCT CCAACATCTC ATTAGCCTTC 8220

GATTCTACCC CTATAAATCC TCTTCTCAGT CTCCTTTAGA TATTCCTGCC CTGCTGTGAG 8280

ATCCATCTGG TTTATTGGCT AGATTACTTC AGAAAGCTTC AGTCAGTGAC CCTCCTTACT 8340

TCAAACCCCA CCAGTTGATC CTTCACTCTG CCATCAGTCA TTGCTTCTAA AATCTAAATT 8400

GTTCCATTTA ACCTTGCTGT GATAAAACCT TTGGTAGTTC TTCAGTGTGT TCAGTGGTAA 8460

GTTAAAACTT TCACTGTAAT GTACAGGCCC CTTCATGATA TGATCGCTGC CTCCTCGAGC 8520

CTCATTGTGT GCATTTCCCC GCCCCACCCT TTCCTCACCC ACCCTAGTCT TTCATGTCTG 8580

CCATTTTTAC ATTCATTTAG CAGATATTTA TTGAAGCCCC CTGTGATGTC CTTACCTAGG 8640

TCTTTCTTGT TGCCAGGACC AGACAGGCTT TTTCAAGCTT CCAAGTCATC TCAGTTTGAA 8700

AGACTATGTC TGACCCTTGT CTTGGCCAAT TACTCTTTAT CCTTCCAAGT TCAATGATTG 8760

TCCCACTGCA CTCCAACCAG AGTGAGAGAG CAAGACCCTG TCTCAGTAAA TAAAAATAAA 8820

TAAATAAATA AATAAATAAA TAAATAAATC AGCCATAATT TATTTAATCA TGTCTCTCTC 8880

CCCCATTGAT AGACGTTAAG GGTATTTCCA GTATTCTTCT CTTGAAAACA ATGCTACATT 8940

GAATAACCTT GTACATGGGT CACTTTGAAA GTATGGATAT GTATCCGTGG AATAAGTTTC 9000

CAGAAGTGGA ATTGTGTCAG AGGGGTTGTG CATTTGTAAT TCTGATGAAT ATTTATAGAT 9060

TATATGAGAG TACCTGTTTA CTCAAACTCT TGCCAATGCA GCATTATCAA AGTTTTTTAT 9120

GTTCGCCAGT GTGATAGATT AAAAAATGGT ATCTCAGCCA GGCGCAGTGG CTCACGCCTG 9180

TAATCCCAGC ACTTTGGGAG GCTGAGGCGG GCAGATCACG GGGTCAGGAG ATCGAGACCA 9240

TCCTGGCCAA CACAGTGAAA CCCTGTCTCT ACTAAAAATA CAAAAAATTA TCCAGGCGTG 9300

GTGGCGGGCA CCTGTAGTCC CAGCTACTCG GAAGGCTGAG GCAGGAGAAT GGCATGAACC 9360

TGGGAGGCGG AGCTTGCACT GAGCCGAGAT CGCGCCACAA CATTCGAGCC TGGGCGACAG 9420

AGCGAGACTC CGTCTCAAAT AATAAAAAAA AAAGATGGTA TCTCAGCATT GATTTCTTTG 9480

ATCATCAGTG AGGTTGAGCA TCTTTTCATA GATTTAAGAG AACTGTATGG TTTTTTGTGA 9540

GTTATGTTTC ATATCGTTTA CCCATTTTAC TTTTAGGCTG GAAGCAGCTG TTTTAGTGGA 9600

ATGGTGGAAC AAGAAGCCAG ATTGCCATGG AGAGACAACT CTTTCTAGAG ATTTGGCTAT 9660

GAAGCAGAGT AGAGACAATG ATAGCTGAAG GATTGATGTA GATGCAAAGA AATTTTTCAT 9720

CTTCTTTGAA AACTTAATTG TGTTAAAAAC TGGTATGAAA GGGAGGGGTT AAAGCTAGAG 9780

ATGGTGGTAG AAAAAAATGC AGGGTTCCTA AAGGACTGAG ATTCCTGGAT GGAATTTCAG 9840

GGAAGGGGAA AATTTCTGGA TATAGTGACT GGGGAGTTAA GGGTGTCTAG TCCAATGGCT 9900

TTTATTTTCT TGGAAGGGTA GGCAAGGCCA ACAGCCACAT GTGTGGGAGG AGATGGTTAG 9960

AGGGGAGAGG AGGTTTGAAG GCACCGCTAT GGAGAATTGG AGAGAGCTAA GGAAAGACAG 10020

AAAGACTGCA GAAAGTGCTT AGGGTTCCAC TGAAGCGGAA ATAGTGATTT GTAGTGATAC 10080

AACCCTTATG AGTTATTTGA TTTTTTTTTT TTTTTAAGCA GCATCTGGCA GTCCAAGTAT 10140

AGGGCTGACA GTTTGGGATT TTTCTTTCCA TGTTGGTGTA AAAGAAGAAC AGTGTAGTGA 10200

AGGAAGTTAG GACAAAAGAA TGATTGAACT GACACCAAGT TTTCTTGATT TGGTAGAAAA 10260

GGAAATAAAG ATAGAGCAGA GATATTGAAA AGAATTAGAG AGGGGTTCAA GAGACTGAAG 10320

GCCTGGGTGA GGTCAGAGAG CAGGTGTGGT AGACATAACA GAGAGAACTA CAAGGATAGA 10380

AAGTGTGGTT GGAGAGTGGG AAGGCAAGAT TTATTCAGTA TGGGGGCTTT TCTGGGTGAT 10440

GACAGCATCT GGAGTACAGC CATTGTCGTG AGTGGCCCAA GTGTAGCAGA GATAAAGCGT 10500

TGTTGGAGTG AAGGAAGTCA AGGAACTGAG AGGCTGGCCT AGATGGGGAT TTTGGTTGTC 10560

ATCCATGAGG ATATTGAAGT CATCCAGGAG AATAGCAGGC CTGGGGGACA GGAAGGAAAC 10620

TGAGCCACTT ACAGTGTCTT CAGTGATAGG AAAGCACAGG GCAAAAAGCT TTCAAGAACA 10680

GGGACTGTTA AGCCGGGTAC AGTGGCTCAC ACCTATAATC CTAGCATTTT GGGAGGCCAA 10740

GGCGGGTGGA TCACTTGAGG TCAGGAGTTC AAGACCAGCC TGGCCAACAT GGTGAAACCC 10800

CATCTCTACT AAAAATACAA AAATTAGCCA GGCATGGTGG CACGCGCCTG TAATCCCAGC 10860

TACTTGGGAG GCTGAGGCAG GAGAATTGCT TGAACCTAGG AGGCGGAGGT GGCAGTGAGC 10920

CTAGATCGCG CCCTTGGCTG CGATCCAGAC TTCACTCCAG CCTGGGTGAC AGAGCAAGAC 10980

TCTGTCTCAA AAAAAAAAAA GAAAATCAGA CTCTTAATAT TTGTAAAGAA GTAGTCCTTG 11040

AGCTACTACT TAAGTCTAGA AAGAGTTGAT ATTCTTGTTT TAAGAGTGTT AGGGCACTTT 11100

GGGAGGCTGA GGCAGGTGGA TCACTTGAGC CCAGGAGTTC CAGACCAGCC TGAGCAATAT 11160

GGGGAAACCT TGTCTCTACT AAAAATACAA AAATTAACCA GGCATGTGGT ACGTACCTGT 11220

AGTCCCAGCC ACTTGGGACG CTGAGGTGGG AGGATCACCT GAGCCCAGGA AATGGAGGTT 11280

GCAGTGAGCC AAGATTGCGT GACTGTACTC TAGCCTGGGC AACAGAGCAA GACTCTGTCT 11340

CAAAAAAAAA AAGGGCGGGG ATTATCATAG TGCCATTATT ATTATGAGTT TATGATGGCT 11400

TTCTCTAAGC ACCTTTTACA TTCGGCATTT ATTCAGTACC TATTAAGCAT CAAGGAGTCC 11460

AGAAAAAATT TTATATATAA ATATATATAA AATATGTAAA TATATATATG CATATGCTTC 11520

CCTATCTCAG GAAGGAAATA TGTGAACATC AGGAACCGAA GTCTACTCAG TTACATGCCA 11580

TTGGATATAT CACACAAAGT GCTGAGGGAA CTCAGAAGGC TCATTATATC TGGGGAGTGG 11640

GAAGGAGGCA CAGAGATGTG CTTTGGGAAG TTTAAATTAA AATAGCAAAT GGGGAAAATG 11700

AAGACACACC AGACAGGGCA CAAGCAAAGA GACATGAAAG AGTAAGTCAT GTGTTTGAGG 11760

ATCTGGGGAT CCACTAGTTC TAGAGCGGCC GCCACCGCGT AGCAGTTACG G 11811

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1241 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCGTGATGCG GTATTTTCTC CTTACGCATC TGTGCGGTAT TTCACACCGC ATAGATCCGT 60

CGAGTTCAAG AGAAAAAAAA AGAAAAAGCA AAAAGAAAAA AGGAAAGCGC GCCTCGTTCA 120

GAATGACACG TATAGAATGA TGCATTACCT TGTCATCTTC AGTATCATAC TGTTCGTATA 180

CATACTTACT GACATTCATA GGTATACATA TATACACATG TATATATATC GTATGCTGCA 240

GCTTTAAATA ATCGGTGTCA CTACATAAGA ACACCTTTGG TGGAGGGAAC ATCGTTGGTA 300

CCATTGGGCG AGGTGGCTTC TCTTATGGCA ACCGCAAGAG CCTTGAACGC ACTCTCACTA 360

CGGTGATGAT CATTCTTGCC TCGCAGACAA TCAACGTGGA GGGTAATTCT GCTAGCCTCT 420

GCAAAGCTTT CAAGAAAATG CGGGATCATC TCGCAAGAGA GATCTCCTAC TTTCTCCCTT 480

TGCAAACCAA GTTCGACAAC TGCGTACGGC CTGTTCGAAA GATCTACCAC CGCTCTGGAA 540

AGTGCCTCAT CCAAAGGCGC AAATCCTGAT CCAAACCTTT TTACTCCACG CACGGCCCCT 600

AGGGCCTCTT TAAAAGCTTG ACCGAGAGCA ATCCCGCAGT CTTCAGTGGT GTGATGGTCG 660

TCTATGTGTA AGTCACCAAT GCACTCAACG ATTAGCGACC AGCCGGAATG CTTGGCCAGA 720

GCATGTATCA TATGGTCCAG AAACCCTATA CCTGTGTGGA CGTTAATCAC TTGCGATTGT 780

GTGGCCTGTT CTGCTACTGC TTCTGCCTCT TTTTCTGGGA AGATCGAGTG CTCTATCGCT 840

AGGGGACCAC CCTTTAAAGA GATCGCAATC TGAATCTTGG TTTCATTTGT AATACGCTTT 900

ACTAGGGCTT TCTGCTCTGT CATCTTTGCC TTCGTTTATC TTGCCTGCTC ATTTTTTAGT 960

ATATTCTTCG AAGAAATCAC ATTACTTTAT ATAATGTATA ATTCATTATG TGATAATGCC 1020

AATCGCTAAG AAAAAAAAAG AGTCATCCGC TAGGTGGAAA AAAAAAAATG AAAATCATTA 1080

CCGAGGCATA AAAAAATATA GAGTGTACTA GAGGAGGCCA AGAGTAATAG AAAAAGAAAA 1140

TTGCGGGAAA GGACTGTGTT ATGACTTCCC TGACTAATGC CGTGTTCAAA CGATACCTGG 1200

CAGTGACTCC TAGCGCTCAC CAAGCTCTTA AAACGGGAAT T 1241

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1701 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATAAAAAACA GTTAATTAGG AGTATCTAGG TTATGTGAAG CATTCATCAC CYYCCTAYTG 60

RCAGAAAWTW TCGWTAGGCA AATTTTATAT TWTAAGTAAC TTTAACATGA ACACTTCTTA 120

AACTTTGGCT CATAATTTCA CAAAAATTAG GCTGCAAGTC ACCATATTCA TCAGATACTG 180

GCAGACACTA ACTTCTGCGG CTATGACACC AAGCAATACT GAAATCTCTT ATCTTTCCAG 240

GGGGGTTGTT CATGTATTCA GTGTTTGCAA AGAGTTCCTG CTGAGCTAAA CACAGTCCAC 300

TGTGCACTCT ACGAAAGAGT CCATGAGACA AGCATGGGGG AGGGTAGGAA GTTTAATACT 360

TTCACAATGC CTGTGGAGAC GCTGGCAGTG ATGAAAGCCT AGAAAACTCA TGAAAGGACC 420

TTTTATGAGC AGGGTGAATG TAGAGCACAA AAGCAAAGTC AGATGACCCA CTTAAAGCTT 480

TGCCTTTACT GATGAGAATT CATTCTCATT CCAGATTAGT CTCTCTCTAG AAAAAGCAAA 540

CCTTATATAA GAGTTGGAAA ATTAAGATAC AGGAAGTATA ATTCTACTAA ATTCCAGTTT 600

TTCCTTCTCA AATATCAGCC TAAGTCCTAA GGTCTGTGGC CAAAGACAGA AAATACAAGG 660

CGCTGAGAAA TATGCTATTT ATCTTGGTGT AACAATCTCT GACTGTTGGG GTTTGAGGAA 720

ATTTAAGCTC TACAATCCAT AGATCAGACC AGAAGTTTAG GGTAGTAATA TTATGAGAGG 780

AAATAGTTTC TTTCTGGAAC TTATATAAAG CAAATAACTG GTAAACCTGA TTTGCAAGGT 840

AATGACAGTC CAAGTTCCTT CAAAGCAGAG AACCACTTAT TTGCTCATTC ATTCAACTAA 900

GTTCCTTGTC TTGTGCCAGG CTGGAGAGAG AAAGCAGCTC CTGTCCTCAA GGAGCTCACA 960

TCTCAGGCAT CTTCTCACCC TCCTTTCTCA TGTTAACCAA AACATTTCAG GTTCATCAAT 1020

GAAACTCTTC ATCCAGGAGG CAGATAAAAT GGCTTCTCTT CATTTTGATT CATTTACTCT 1080

TTCTTTTATT TATTTTATTA TTATTATTTT TTTTTTTTCT GAGAAGGAGT CTCGCTCTGT 1140

TGCCCAGGCT GGAGTGCAGT GGCGTGATCT CGGCTCACTG CAACCTCTGC CTCCCGGGTT 1200

CAAGCGATTC TCCTGCCTCA GCCTCCCAAG TAGCTGGGAT TACAGGCATG CGCCACCACG 1260

CCCGGCTAAT TTTTGTAATT TTAGTAGAGA TGGGGTTTCA CCATGTTGGT CAGGCTGGTG 1320

TCAAACTCCT GACCTTGTGA TCCGCCTGCC TCAGCCTCCC AAAGTGCTGG GATTACAGGT 1380

GTGAGCCACC ATGCCCGGCC TACTCTTTCT TTTAAACAGA GAAATAAGAT GGAATATTTT 1440

TATCCCATCT TTTCTTCTGT AATTAAAAAA GGAATACGAA GAAACTTGAC ATAGTCTCTC 1500

TCCTCATGTG CTCTCTTACT TCCCATCCCA ATTCCATGTT TGCTCTCTTT TTCCTCTCTC 1560

CTTCTGTTTT GTTGTGAATG AAGAATTAGG TAACTAGTCC AAAACTACAG AGCTACACCT 1620

GGAGCCTAGA TTCACTGGTA GCAAATCACT AATTTTCTGA AGGTAAATGG GAGAAAATGG 1680

GGGTGGGGGG AAACTCATTA A 1701

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1293 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGAGATAATA AGTATACACT ATGTGTGAAG GGGGTGTCTC TATTGTTGTT GTGGCGATTA 60

GGTGAGTAAT TTTACACCTG GTTGTGAATA AAGTCCGAGA TTGGGGGACT CACGCTTTGT 120

AGAGTCTCCC AGGACAATGG GTTTTGCCCC CGTGCCCAAT TAATAGTTAA AGGTTGGGGG 180

CTTTTCGATT CCCTTATTCC AACTGGATAG GGCTCTTGAA ATGCCCCCAA AAAAGGTTGA 240

CCCTTTCCCC ACACGTCAAA GAGGGAATTC TCCCGCTAGA CTACCCTTGA ACCTGAAGTG 300

CAGTCCCTAC AGGGTATTCT AGCTTGTTAG CATCCCCCAC TGTGAATCAA TCCCTTAAAA 360

TAAACCTATA TAAGATGTAT GTAATAGAGG ACTAATCTTT AATATAATAA GCATATATTT 420

AATATAATTT CGGTACTACC CCCTTATCTG GGGGGGGGGT GGGGGGATAT GTTCCAAGAC 480

TCCCAGTAGA TGCCTGAAAC CACAGATGGT ACTGAACCCT ACGTAAACTG TATTTCATTC 540

CTATACATGC AGGCTATGTG TTGTAATCTG TAGGGTAACC ACTAAAAGAA CAGGGTCTAT 600

AACTTGGCAA GAGGGAAAAA AGCTAGGATA GTAAAAAAGT CTATCAATCC AAAAAGCAAG 660

AAAAAAGAGA AAAAGGAACA TGCTGGCATA TTATTATAAG TATTGTATTT TATTATTAGT 720

TATTGTTAAT TTTTTACTGT GCCTAATTTA TAAATTAAAC TTTATCACAG CTATGTATGT 780

ATAGGAAAAT ATATATCTGT GGTTTTAGGC ATCCACTGGG GGTCTTGGAA TATAATGCTT 840

CCCCCAGATA AGAAGGTACT ACTGTAATTA TATTATATGT CATATTAAGT ATACATTAAT 900

TCTACTAGGT AGTAGCCACA TTATATATTA ATTATATTAA ATATATATCA TATAGAATTA 960

TTTTAAGGAA TTGACTCATA ATAGAAGAGG CTGGCAGGCT GGAGATTCAG GGAGGAGTTG 1020

CATTTCAAGT GCAAAGGCAG ACTGCCAGAG AATTCCCTCT TGCTTGGGGG AGGTCAGCCT 1080

TTTGTTCTAT TCAAATCTTT GAGGAAAATA GAAAGCAAAG AATATATTAA CTATATTAAA 1140

CAAACTAAAT GTTCCAATTA AAATACAAAA ATTATAAAGC CTAATAATAA AAGCCCTCAA 1200

TTATATGCTG TTTAAAAGAG ACATTTTTAA GCTTAAGGAT ATAGAAAAGT TGAACA ACA 1260

AGCATGGAAT AAAATAAGCA TGCAAAA AC TAG 1293

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 529 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GCTGAGGTGC ATCGCGGTGG CGGACGCTCT AGAACTAGTG GATCCCCAAA CAAAACCTGT 60

CCCTGCTAAT GATGGTAGAC CCAATCAGAT CCCCGGAGAA GCCGAAATAC GGAAACCATA 120

TCAGCATACG CATGGCATAC ATAGAACCCC ATACATGGAT TGCTTACTCA GCCAGATATA 180

GAAATCTATC TTCACGATAG AGATATATAT ATATAGACAC ACTGCATATA CAGATGTGAG 240

ATGGAGGCTC ACTCTGCCAC CCGTGCTGGA TCTACAGTGG CACAAGCTCA GTCCACAGTC 300

ACGTCGATCT GCCGGGCGTG ACCGACTGAG ATGCAGCGGC CTCGGGCGTA GCTGTGAGTA 360

CACGCACCAG TCATCGCGAC TGGCTGCAAG TGGTATAAGC GGAGGGGACA GGGTTACAGC 420

ATGACGGCTA GGCAGGCCGC AAACTGAGGA CCACAAGAGT GCCACGCTGC CCGAACGCAT 480

GCAGTGGCGA GATTACATGG GGCAGCCACT AGAGCCGCCG TATCAGAAA 529

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18073 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AGCGCGTGCG CCGCTCTAGA ACTAGTGGAT CCCCCGAGGA GTGAGGAGGA CCTCAACCCT 60

ACTTCCTGAA ATGGAGCTCT GAGATGTTGG AGTAGAAATT TGGAAACCAG AGAGAGAAGT 120

AAGGGTAGTG TTGTTGCAAC ATGCATTGTA TATGGGGGGT CGGGAAGTCA CAGGAGTTTG 180

CCTCAAAGTC TTTCTCGGAG ACGGATGAGG TTTTCACTGT GATTTTCCTG GTCGTGGTCT 240

ATGGATATAG TACCTGTTAG TGACATGGAT CTTCTTAACT TCTGATGTGT CTTTTCCTCC 300

CTAGTGTACG CATACCAATT CTCTCCACAG CTTCCATCAC CATGCATTTG TTCTTTTCCC 360

TTGTTCTTGT ATTACCTTTC TGGAAAGGAA TTTTTATTGT AGGCTAATTG TTACTCCCAC 420

CAGTATTTAA CCACTGGATA TTTCATATGA TTGATCTCTT CTGATTTGGA AAATAAAAAT 480

GTAATCTCAT TATATTCATT TGATTAGTGG GGACAGTCAA CACTTCTTTG TGTATTTTCT 540

TAGCTGTTCG TTTTTCTCGT CTGTAAATTA TCTGTTTAGG TCCTTCAGAT TTTTCAAAAT 600

TGGACTGTTA TGTTTTCAGT ATTGTTATGA GTTCTTGTTT CAATTATTTA TGACAGTTCA 660

TTTTCTTTTT TAAAATAGAC TTTTTTTTTC TTAGAGAAAT AAGAAAAAAT AAAAATTAAA 720

ATAGACTTTG TGTTTTAGAG AGTTTCAGGT TCACAGCAAA ATTGATCAAA AAGTATGGAG 780

AGTTCCGGCC AGGCGCGGTG GCTCACACCT GTAATCCCAG CACTTTGGAA GGCCAAGGTG 840

GGCAGATCAC AAGGTCAGGA GTTTAAGACC AGCCTGGCCA ATATGATGAA ACCCCATGTC 900

TACTAACAAT ACACAAATTA GCTGGGTGTG GTGGTGCACA CCTGTAACTG TACCTACTCA 960

GGAGGCTGAG GCAGAAGAAT CTCTTGAACC TGGGAGGTGG AGGTTACAGT GAGCCACAGT 1020

CATGCCCCTG CACTCCAGCC TGGGCAACAG AGTGAGACTC CGTCCTAAAA AAAGAAAGAA 1080

AGAAAATATA GAGCATTCCT AAATACCACC TGTCCCCAAC ACCTGCACAG CCTCCTCATT 1140

ATCCACATCC TACACCACTG TGGTACCTTT GTTGCAATTG ATGGACCAAC ATTGACTCCT 1200

CATTATCACC CAAGCTTTGG TGTTGTACAT TCTGTAGATT TGGACAAATG TATAATGACA 1260

TGTGTCTACC ATTGTAGTAT CATACAGAAG AATTTGACTG CCCTGACAGT CCTCTGCTCC 1320

ACCTGCTTAC TCCTCTCTCC CTTTTCCTAA CTGCACAACC ACTGATTTTT TTTTTTTTTT 1380

TTTGAGAGGG GGTCTCACTC TGTCCCCCAG GCCGGAGTGC AGTGGGGCCA TTTGGGGTCA 1440

CTGAAAGCTC CACCTCCGGG GTTAATGCAA TTCTCCGGCC TCAGCCTCCC GGGTAACTGG 1500

GATTAAAGGG GCCCGCCACC AAATCGGGGT AATTTTTGGA ATTTGAAGTA AAAAGGGGGT 1560

TTCCCCATTT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC CACCTCGGCC 1620

TCCCAAAGCT GGGATTACAG GCATGAGCCA CCACGCCCTA CCTTTTTTTT AAAAAACAAG 1680

GTCTTGCTCT GTCACCCAGG CCTGAGTGCA GTGATGATCA CTCCTCACTG AAGCGTCGAC 1740

CTCCCAGGCT CAAGTGATCC TCCCACCTCA GCCTCCTAAA TAGCTGAGAC TACACACACA 1800

CACCACCATG CCCAGCTAAG TTTTGTATTT TTTATAGAAA TGTGGTCTTG CTGTGTTGTC 1860

CAGGCTGGTC TTGAACTCCT GAGTTCAAGC AATTTGCCTG CCTTGGCCTC TCAAGGTGTT 1920

GGGATTACAG GCATGAGTCA CCGCACCTGG CCTTTTTTAT TTTCTTTTTT TTTTTTTAAC 1980

CAGTGATCTT TTACTGTCTC CATGGTTTTT CACATTGGCT TCTGTCACTT AGTAATATAT 2040

GTTTAAGTTT CTTCTACGTA TTTTCATGTT TTTAGCTTAT TTCTTTTTAG CAGTGAGTAA 2100

TATTTCATTG TCTGGATGTG CCATCACTTA TTTATCCATT CGCCTGCTGA AGGATATCTT 2160

GATTGCTCCC AGTCGTGGCA ATTATAAATA AAGTTGCTGT AAACATCCAT GTGCAGGTTT 2220

TTTTTAAGTG GCATAAGTTT TCATCTCATT TGGTTAAATA CCAAGGAGCA CAATTGCTGG 2280

ATCATATGGT AAGAGCTTAT TTATTTTTTT GAGAGACTAC CAAGCTGCCT TCCAAAGTGG 2340

ATGTACCATT TTGCATTCCC ACCAGCAGTG AATGAGAGTT CCTGCTGCTC CATATTCTTA 2400

CAAACATGTA GTATTGTCAA ATGTTTTGGA TTTTAAAACC AAAATCCATT TTCATAGATG 2460

TGTAGTGGTA TCCCGTTTTA ATTTGCAATT ACCTAATGAC TTGATGTTCT GTGTCTTTTC 2520

AGATGCTTAT TTGCCGTACT GTTTATCTTC TTTGGTGAGG TGTCTATTCA GGTCTTTTGC 2580

CCATTTTTAA TCTGGTTGTT ATTTTTCTTG TTGAGTTTAA GAATTCTCTG TCCTTTGTCA 2640

GATCTATCTT TTGCAAATAT TTTCTCCTAG TCTGTGGCTT ATCCTCTGAT TCTCTTGGCA 2700

TTGTCTTTCA CAGAGTAGAC ATTTTATATT TTAATGAAGT CCAGACTATC AATTATGTTC 2760

TCATGGATCA TGCCTTTGAT GTTATATCTA AAAAGTTCTC GCCATACCCA AAGTCATCTA 2820

GATTTTCTCC TGTTATCTTC TTGGCATTTT ATAGTCTTAT GATTGATATT TAGGTCTATG 2880

ATTCATTTTT AGTTAAATTT TTGTGAAAGA TAATAAGGTC TGATATGGAT TAATTTTTCT 2940

ATATGTAGCT GTCCCGTTCC AGTATCATTT GTTGAAAAGA CTATCTTGCT CCATTTTATT 3000

GCCTTTGCTC CTTTGTCAGT TGACTATATT TATGTGGGTC TGTTTATGAT CTCTGTTCCG 3060

TTCCATTGAT CTGTTTGCCT TTTCTTTTGC TAATACCACA GTCTTAATTA CCATAGCTTT 3120

AAAGTAAGTC TTGAAGTCCA ATAGCATTAA TCTTTGACTC TTCTTTAATA TTGAGTTGCC 3180

CCTTCAGAAT CTTAATGTCT CTCCATGTAA ACTTTAGAAT CAGCATTTTT ATATTCACAA 3240

AATAACTTGC TGAGATTATG ATTGAGATTG CATTGAATCT ATAGGCTTAT TTGGGAATAA 3300

CTGACATCTT GACAATATTG AGTCTTCCTG TCCATAAACA TTATTTATGA TGGGCTTCTT 3360

CTTTATGTTT AGGAGCTTTT GTTTTTTCTG TCAGATATTC CACTTCTACC TTTATGATTT 3420

CTTAATTGCC TTTTATGCTT AGAAAGTTTT TCCTCATCCT GAGCTCACAT ATTCATTTAT 3480

TTTCTTTTAA AATGTGTTTT CAAGCATTTA ATTTTTAAAC CTATGTGGAA TTTATTTTGG 3540

TATATGGAAT GAGGTGGTGG TCTAACTCCC TCCTCTCAAA TATGTAGTTA TTTTTCCCAA 3600

AACCATTTTC TATTAATTTA TCAAGAATAG ACATGTATAC ATATACATAT ATAATAGTCA 3660

GCCTTCCACT TGTTGTTTGA CCCTTGTGAA GGAAATTGTA TGAGTTTCCA ATTTTGGATT 3720

AGGCTCAGGT AGTAATTGAG CTGGGTTCTG CCAGAGATCC ATGTTAATTC ACTATCCAAA 3780

CAGAGTTATA AAATGTAAGT TTTATGAAAA TCTAACAGTA TATCACTGGT TTAATGATCA 3840

CAGCCTAGGA AGAATGGGGA AATTGTCAAA ATCTTCTGTG GATGCACCTG AAGGCCACTG 3900

CTGAACCCAT TTCCCTGCTA GGCACGGCTG CTGGTACCAG GGGCAAACTC CTGGAGTATA 3960

TATGAACCAC CTACATCTCC TTCTCTTCCC CCCCTACCCT TGAGATTTTC ATGTGTCCCT 4020

TAAGGATGTG TGTCCTACTT CCCTTGGAGA GTCACTACCA CATTGAACAC TTTAGACTGT 4080

GAGTCCTGTG AAGATGGGGC TCATGAGTGT ATTGCTCCCC AGTTGTTTCT CTAGCACTAG 4140

CTCAGTATAG GGCATAAAAA TCTGAATGGA TGAACAAACC ACTATTACTG GTGGGGACAT 4200

GCTACTATCT TACATGGTTC GAGGTGGAAT AAAGGTTGAG AACAGCTATA TAATGTGTTC 4260

CTTGAAGGGC AGCAGTACAT CAGTGCAATC AGCCTACCTT CTCCATACTT CTCACTCTGA 4320

AAACTGTAAA GCTGCACCTA GCAATCAACT TGGGAGCTTT AAAAGGGACT GCTCCCTAGC 4380

TCTCACCCAC AAAGCTGTAG TCTAGCACAG GTGACTTTTT TAAAAAAGTT TTTTGGTCCA 4440

GATGTGATGA CTCACGCCTG TAATCCCAGC ACTTCGGGAG GCTGAGGCTG GGAGGTCACC 4500

TGGGGTCAGG AGTTTGAGAC CAGCGTGACC AACATGGAGA AACCCCATCT CTACTAAAAA 4560

TTTGCCGGGC ATGGTGGCAC ATGCCTCTAA TCTCAGCTAC TCGGGAGGCT GAGGCAGGAG 4620

AATTGCTTGA ACCCGGGAGG CGGAGGTTGC CGTGAGCCAA GATCACACCA TTGCACTCCA 4680

GCCCGGGCGA CAGTGCAAGA CTCCGTCTCA AAAAAAAATA AAAAAGGAGT CCTATTAAGA 4740

CTTATTTTTA CAGGTTGGAT ATCTCTAATC CCAAAATCTG AAATGCTCCA AAATTTGAAA 4800

CTTTTTGAGC GCAGACATGA TGCTCAAAAA AATGCTCACT GGGACATTTT GGATTTCAAA 4860

ATTTGGATTA GGGACTAGGT GTGGGAGCTC ACACCTGTAA TCATAGCACT TTGGGAAGTT 4920

GAAGCAAGAG GATCAGTTGA ACCCAAGAGT TTGAGAGCAG CCTAGACAAC ATAGTGAGAC 4980

GCCGTCTCTA CAGAAAATTT TAAAAATTAG CCAGGCATCG TAGTACATGC CTATAGTCCC 5040

AGCTACTCAG GAGGCTGAGA CAGAAGGATC ACTTGAGTCC AGGAGGTAGA GGCTGCACTG 5100

AGCTATGATC ATAACCACTG TCTCCATCCT GGGCAACAGA GCAAGACCCT ATCTCTTAAA 5160

AAAAATCTGA AACACTGCTA GTCCTCAAGA TAAGGGATAG TCAGTCTTTA TAAAGACTCA 5220

ATTAGTTATT GGATATCTGA GGAAGCATGC ATATCAGGCT CCCAAAAGAT CATTGGTTTA 5280

GGCACACATT TTAATAGCTT GGAAATCCAG AATACTCTTC TGGTGACCAG CTCAGACATA 5340

GTCCTGATAA TATAGGACCT CATCTAACAT GACTCCCTAT TTTCCAGATA AGCATGGATT 5400

CCTGGTTCAT TCTTGTTCTG CTCGGCAGTG GTCTGATATG TGTCAGTGCC AACAATGCTA 5460

CCACAGGTAA ATTGTCATTT GATAAGGCTG CTATTTGAAA TGAAATTTTG CTTTCACATT 5520

TAATGAGCCA CATTTGAAAA CCGAGATGGT ATTTGAAGAA AGGAATATAA AAATTTTATT 5580

CAAAGTGATG GTAAAATAGG TGTCTTCAGA AATCTTGGAA TTGAATGCTC AGCATTGTTT 5640

TTCATACATA CATAACTGCT TTAAATAAAT CAAAGAGATT ATGTGTTCTT TCCTGAAAAG 5700

TAAAATAAAT TGTTGACATT TACAACTCTA TATATGGTTT CTGAGGAACT AAGTGAAGAA 5760

TCTTGTGTCT TTCTCCCTTA AACCGTAGTC CTTTGGAGGA GGTAGGAAAG GTCCAGCATG 5820

AGATAAAAAC GTAGGGGGTG GGTGGTGTTG AGGGGGATTG GTCTTTGCTT GGTCTCCATA 5880

TGTTTGAGAG TTTATTAAGG CTTGCTGCTT TGTGTCTCAC AGCTTTTTAG CCTCACATTC 5940

TTCATGTGCT ATTTCCTTGT TTTTTGGTGT TTGTAGTTGC ACCTTCTGTA GGAATTACAA 6000

GATTAATTAA CTCATCAACG GCAGAACCAG TTAAAGAAGA GGCCAAAACT TCAAATCCAA 6060

CTTCTTCACT AACTTCTCTT TCTGTGGCAC CAACATTCAG CCCAAATATA ACTCTGGGAC 6120

CCACCTATTT AACCACTGTC AATTCTTCAG ACTCTGACAA TGGGACCACA AGAACAGCAA 6180

GCACCAATTC TATAGGCATT ACAATTTCAC CAAATGGAAC GTGGCTTCCA GATAACCAGT 6240

TCACGGATGC CAGAACAGAA CCCTGGGAGG GGAATTCCAG CACCGCAGCA ACCACTCCAG 6300

AAACTTTCCC TCCTTCAGGT ACTAGAGATG ATTCTGTTTG TTCTTTTGCT CTTTGAGTTT 6360

AGTCTTCCTT TTATTATCTT GTTTGTGTTT TCTAGCCTTA AAATTTCTTC AAATAAGTAA 6420

AATTGCTCAA GTGAAGTAAT GAAACCTGTA TGTGGAATTT TTGGGTTAGC ATGAGTGAAG 6480

AGGAAAGAAG AAAGATTCTG GAGAATATCT TTCTGCTAGG TGGGATCCTG GTTAGATTGA 6540

GAGGACTTAA ATGTGTTTAA AGGTAGAGAA GAAGGCTTAA AAAGACAAGA GAAATAGAGG 6600

AGCTCATTGA CGATGCAAGA GACTGAAGAT GAAAAGATAC AGAGAATGAG TAATAAGATT 6660

AGGTTTGGAA AGGGAGGGAT CCGTGGAGAC CATGGAAAGG AGAATGGGTA TTGATGTCCA 6720

TGACAGTTAG ATGTGAGATA CAGAGAATGA GTAATAAGAT TAGGTTTGGA AAGGGAGGGA 6780

TCCATGGAGA CCATGGAAAG GAGAATGGAC ATTGATGTCC ATGACAGTTA GATATGGAGT 6840

GGCAGGCCAG TGGCCAGGGG TGGCATCAGG CTCTGGGAAA TGGTTACATT GCAGTGCCAG 6900

TTGTTCAGGG CCTCAGGTTG AAGCAGTAGT CCCAAGGAGA AAATCAGAGA CGTGGATCTG 6960

AGACCAGGGC AGGTAAGACA AGTTTCTGAC CTCTTTGAAC CTTAGGTACC TTGTCTGTAA 7020

AAGAGGATTA GAGATACCCT CAAAGGGCTT CTATGAGGAG TAAAGGAAAT AATCATTACC 7080

TGATTGCTAT GTAACTGTCA TCCCTTTTCT AGCAAAAATC ACTCTTTCCT CTTCTGTGTT 7140

CCCAGTTAGA TGGTGAGTGC CCCTAAGCAG AATCACATCT CGCTCATGTG GAACATTCAG 7200

GAACTGTTTG CTCAGTTGAT TCTCATTTGT TACTACAGAT GATATCTTTT ACTGCGCCTT 7260

ATAACTCAGA CCCTTCACCT GCCAGCTTTT CCCCATATTT TCTACCGTAA AGACAAGACA 7320

GCATTTGCAG TTAAGAGCAC AGTCTTCAGT GCCACACTGA GTTTGAATCC CAGCTCTTCC 7380

ATAAACCAGC CATGTTTATG GCATAGCTGG CTTACTTTAT CTCTCTACCT CGGTTTGTTC 7440

ATCTGTGAAA CAAGAATGAG TGATAGTAAT AGTTCTTACC TCATAGAGGA GATATTAGGA 7500

TTAAACAAGT TAATATGGGT AAAGCACTTA TAAAGGTGCC TACACATGGT AAGCACTATT 7560

TTTAAGTGTG AGCTGTTAGT ATTGTTGTGG TTATTGCTCT GATAGTTACC AGTAAAATAT 7620

ATGAAGGTAC CTTTAATGCA GATGGCATCC CACTATTCTT GATGAGATAG GGGACTGCAG 7680

ACAAATAATG TCTGATACTT GCTTTGTGCT TTAGAGTTAA TGTAGTTTTG TCATAGTTAT 7740

TACTGTGTGC TAGGCATCGT ACTAAGAGTT TTCTAGAATA ATCCTATGAA TTAAGTTCTA 7800

TTTTATGTTT TATAGGTGAA AGTATTTTAC AATGATGAAA CCATAATTTG TGGAATGTTT 7860

TTCAGTGTAC AGGTCATGAC ACAATTCATG AAATCACTTT AGCAGGCCAC CACTAGTTGT 7920

TTGTTTTGTT TTATTTTAAT GGATGATCCA GTTCCATGTT TATTCTTTTA ATGTTACATA 7980

CAATTTTTTG AAATTTTAGT AACAACATAA AATGTTGGGT TGTGGCCATT GCTTAGGGAG 8040

AAAGGCAGGA TAACTTGTAC AAACTGTATG AGTGAATGGA AAAGGTGGAG ACTGTAACAC 8100

AGGCCTGACT GACTGAACAG CCCATGTTCT ATTGTGTACT GTCTTTCATT TAACAGTTCT 8160

GTGACATGAC CATGGATAAT CATCTCCTTT TAACAGATGC TTGATTTCAG ACTGTATATA 8220

GAGGTTAAAT GATTTGTTTT AGATCTCAAG GCTGACAAAT TAGGCCTATT TCTCACTTTT 8280

GCGGTCTTTC CACTCTGCTT GTAGGGAACT TAGTTTTCCA TAAACTGACT TAGGTCCAAA 8340

TTGTGCCACA GCTAAGAATC TAGTTATTGT ACATTTAACA CAGTTCACGT CATAGGAGGC 8400

TGAGACTATG TTTCTCTAGT GGCGTTTATT CAAGATGAGT AAAACACAAG AAACCATTAT 8460

CGCACATGGG AATTTCATAG TCTTAAACCC CACATCCCAC TTATCACCAC CATTTACCAG 8520

TCCTCCTGTA ACAGTTACAA TTTTTTATTA AATCAGTATT TGATGTATAT TATTGTAATT 8580

ATGAAATATT CATTGCTGAG CTATAAGTAT AAATGGATTG TTTTTCTTGT ACAGTTTTTT 8640

TTCTGGATTT AATACTTACC TTATTTTTTG TTTATTTAGT TTTCTATTTA GTCAGGCCAG 8700

GCACACTGGC TAACACCTGT AATCCCAGCA CTTTGGGAGG CCAAGGTGGA CAGATCACTT 8760

GAGCTCAAGA GTTTGAGACC AGCCTGGGGA ACATGGTGAA ACCCCATCTC TACAAAAAAT 8820

ACAAAAATTA GCTGGGCATG GGTGCATGTG CTTGTAGTCC CAGCTACTCA GGAGCCTGAG 8880

GTGGGAGGAT TGCTTAAGCC CAGGAGGTTG AGGCTGCAGT GAGCTGTGTT CATACCACTG 8940

CACTCCAGCC TGGGTGACAA AGCGAGACCA TGTCTCAAAA AAGTTATTGC TACTCAATTC 9000

TTACCATGCT CTCCAGAGCC TCTCAAAACA GCTTTCTACA AAGTGAGATC TGTTAGATAA 9060

TCTATTTCTT TTTTACCTCT AGAAATTCCT CCTGAGCCCT CCATTGTCTT ATTCCAGTCT 9120

AGGCTTGTCG ATCTCTAGGG CTACTACACA GATACATCAG CCTGAGATTT CCCTTCTCTG 9180

TCATTCTGGG AATTCCCCTT GCTGCTGCTT CCTGACTTCC ATATTGTCTT CCTTTTTGTC 9240

TTCTCATCAT TCGGTAGATT CCTGAGAAAA GGGGTCCATG GGAGGCAAAT TGCATCCTTA 9300

CATATCTAAA AATATCTTTA GGGCTGTGCA TAGAATTTGA GGAATATTTT TCCCCCAGAA 9360

TTTTTAAAGT AATGCCCTAA CTGACACCTG TTTACCAGGT TTGGAGGATT TTACTGCTAT 9420

CTTAATCCCT AATTGTTTGT ATGCTTTCTA GGATCTTCTC TTTATCATCA GTATCCTGAA 9480

ATTTCACAGA GATGTATCTT GATGTGGGTC TTTTTCGTTC AT ATTATGG ATACTTAATA 9540

GGCCCTTTAG AGCCTTGATC TTGCATTTCT GAAAATTTTC TCCCATTTCT TTGAAACCTT 9600

CTCCCCCTCT TCCTTTTTTT TTTTTCTCAA ATTCTTAATA TTTGGATATT GGATGTATCC 9660

TGAATTAATT CTTTAATCTT TAAAATTTTT CCTTTCTGTT GATCTTTGCT TTGAGTCTTT 9720

TTCTCCTTTT AAAAATAAAC AAAGGCCAGC TAGGCACAGT GGCTTATATC TGTAATTCCA 9780

GCACTTTGGG AGGCTGAAGC AGGAGGATCG CTTAAGCCCG GGAGTTTGAG ACCAGCCTAA 9840

GCATCGCAGC AAAACCTCAT CTCTACAAAT GATTTAGAAA TTAGCAGGGC CTAATGGCTC 9900

ATGCCTGTGG TCCCAGCTAC TCAGGGCTGA GGCAGGAGGA TTACTTGAGG CCTGGCAGTT 9960

GAGGCTGCTG CAGTGAGCTG TGATCGCACC ACCGTACTCC AGTCTGGGCA ACAGAGGGAG 10020

ACCTCATCTC AAAAATAAAT AGGCCTGGTG TGGTGGCTCA CTCCTGTAAT CCCAGCACTT 10080

TGGGAGGCCA AGGCAGGTGG ATCACTTGAA GCCAGGAGCT CAAGACCAGC CTAGCCGACA 10140

TGGCAAAACC CTCTGTCTAC CTACTAAAAA TAAAAAAATT AGTCAAACGT GTTGGCATAT 10200

ACTTGTAATC CCAGCTACTT GGGAGGCTGA GACATGAGAA TTGCTTGAAC CTGGGAGGTG 10260

GAGGTTGCAG TGAGTCAAGT CCCTGCACTA TAGCCTGGGG AACAGAGTGA GACCCGAGAC 10320

TCTATCTCAA AAAAAAAAAA TCAGTGACAA GTAAAAAGGT AGAATACCTT TTTTTTTTTC 10380

TTTGAGACAG TCTCACCCTG TCGCCCAGTC TGGAGTGCAA TGGCGCAGTC TCGGCATACT 10440

GCAAACTCTG CCTTCAGGGT TCAAACAATT CTCCTGCCTC AGCCTCCTGA GTAGCTGGGA 10500

TTACACATGC CCACGACCAC ACCCAGCTTT TTTTTGTATT TTTAGTAGAG ACAGGTTTCA 10560

CCATGTTGGC CATGCTGGTC TCGAACTCCT GACCTCATGA TCCACCTGCC CCGGCCTCCC 10620

AAAGTGCTGG TATTACAGGC GTGAGCCACT GCGCCCAGCC TAGAATACCT TTTAAAAATA 10680

AATAAATAGG CCGGGCGCGG CGGCTCATGC CTGTAATCCC AGCACTTTGG GAGGCTGAGG 10740

CGGGCAGATC ACGAGGTCAG GAGATCAAGA CCCTCCTGGC TAACATGGTG AACCCCATCT 10800

CTACTAAAAA ATACAAAAAA AAATTAGCTG GGCGTGGTGG CAGGTGCCTG TAGTCCCAGC 10860

TACTCTGGAG GCTGAGGCAG GAGAATGGCG TGAACCCAGG AGGTGGAGCT TGCAGTGAGC 10920

CGAGATTGCG CCACTACACT CCAGCCTGGG CAACAGAGCA AGACTCTCTC TCTAAATAAA 10980

TAATAAATAA ATAAATAAAT AAATAAATAA CTCCTTTTAC AAAAGCATAT ATATTCATTT 11040

TTTCCATTTA TAATATAAAT AATAGATATG CTGAGTTGAT TTCTGCATAT TGCTTTTTCA 11100

GTTACCCTAT CATACTTGCT CTTTGTTTTA GTAAAGAGCT GCTGTATTGA AGGATATACC 11160

TTAATCTCTT TATCCAGTTT CCCCATCAGT GGACACTAAG ATTGTTTTCA GAGTACTCTT 11220

ATAAACAATA CAGTTTGTCA TTTCAGACAC ATATGAGAAT ATTAGCAGGA TGAATTATTT 11280

TAAGTCTGCA TTTATAAATT TATGGATATT GCCACATTTA CCTCTGCTAG GAAGTCTATT 11340

CCTATTAACA ATATGTCAAA GTGCCTATTT TTCTAAACTC TCTTCAGTGT GGTGAATTGT 11400

TAAACTTGGG GATCTCTGCC AATCTGACAG GTGAAAAATA ACATCTCAGT GTAAGTTTAA 11460

TTTGCATTTT GCTGAGATTG AGCAATTTTG TGTAATTTAA AAGATCATTT ATTTTTCTGA 11520

GCATTCTCTG TTGATATTCT TTACCCATTT TTATTAGAGT GTCAAGGTTT TCCTGACTCG 11580

TTTGTAGATG TTCTTTGTAC GTTTGGGAAA TGAGTCCTTT GCCTATGGTA AAACTGCAAA 11640

TGTTGTTCCC TAGGTGGTCA TCTAGATTTT CTGCATTGCA GAAGATATCA TTAGCTATTT 11700

TTAATTTTTT TAATTTAAAT ATTTCTCAGT TTAGGTTTTC TAGGAATTGG GTCA ATCTA 11760

GGAAGGCTTT CCTTACTCCA AGATTATAAA AATAATTTTC TTCTGGACTT CTATGGTTTC 11820

GTGTGTGTGT GTGTGTGTGT GTACACGCAC TTAAGTCTGT CTCGAATTTA TTCTGATGCA 11880

GAGTGAGCTA TGGATCTGTT TTTCCCCAAA TATCTAACTT GTCCCAATAC CCCTTAATAA 11940

TTTATTTTTC CTCATTGATT TGAAATGCCA CCTATCTTAT ATATTGAATT CAGATATTTA 12000

TTTACCTCTT CATATGTATT TGAGTATTTG GGAACATTCA TTTTATTTTC TATTAATCTT 12060

TTTCTCTGTC CATGTGCAAA GCCTCACTGT CTCAATAATT GTAACTTTGT AAAGTATTTA 12120

ATATCCAGTA AAATGAGTCA TTCCTTGTTA ATTTTATTTT TCAGAATTTT GTTAGCAATT 12180

CTTATTATAA ACATTAGAAT TAACTTGTCT AGCAGGAAAA AAAGTTTGTA TTGATCATGT 12240

TAAATACGTA GATTAACAGA GAAAATGGCA TCTTACAGAT GTTGAGTCTA ACTATCCAAG 12300

AATGCAATAT ATTCCATTTT CTGAAGTCTT TTTTTTTTAA ATCTTCTGTT TTTGTAATTA 12360

TAAATGGAGC ATTTTCTTCC ATCAGATCTT CTAACTGGCT GCTGTTGGGG ATATGAAGGC 12420

TACTGATTTT TGTAGAGACA TTTTGTACTG GCCACCTTAA ACTCTCTTAG TATTGGAAGT 12480

AATTTTCTTC ATTAATTTTT ATGGCTTCAA GTCATCTCAT CTGCATATAT CTTCCAAATT 12540

TTTAGAACTT TCTTTTTCTT CTGTTTAATC GCATTGATGA ATACCTCCAG AACAAAGTTA 12600

AGCAGCTGGT AAATGCAGAC AGCATTCTCT TGTATCTGAC ACTAAGGAGG ACACTTTCAG 12660

TGGTTTTTCA TTATACGTGG TACTGACTCT TGAGTTGAGA TAAACATATT TTATTGTGTT 12720

CAGGATTTAA TGAGCGTTTA TGTTAGGAAT GGGTGTTAAA TTTTGCCAGT TGCCTGTTCA 12780

GGATCAATGA GAAAGATCTG AATGATTTTT TTTCTCTTTT GGTCTGTTTC TATGGTGGAT 12840

TCTATTCCTA GGTTTGTTTG TTTGTTTGTT TATTTTGAGA TGGAGTCTGT TACCAGGCTG 12900

GAGTGCAGTG GCGCCATCTC AGCTCACTGC AACCTCCACC TCGCGGGTTC AAGTGATTCC 12960

CCTGCCTCAG CCTCCGAGTA GCTGGGACTA CAGGCACGCA CCACCATGCC CGGCTAATTT 13020

TTTGTATTTT AGTAGAGACG TGGTTTCACC ATGTTGGCCA ACCTGGTCTC GAACTCCTGA 13080

CCCCATGATC CTGCCTCAGC CTCCCAAAGT GCTGGGATTA TAGGTGTGAG CCACTGCGCC 13140

CTGCCAGTTT TTATTTATTC ATTTTTTAGA GACAGGGTCT TGCTCTGAAT TAATTCTTTA 13200

ATCTTCTTAA TTTTTCTTTT CTGTTGACCT TTGCTTTGCT TTAAGTCTTT TCCTTTGAGT 13260

CATCCAGGCT GAAGTACAGT GGCACGATCA TGGCTCACTG TAACCTTGAA CTCCCAGACT 13320

TAAGCAAACC CCACCTCAGA CTTCTGAGTA GCTAAGGACT ATAGGCGCAT GTCACCACGC 13380

CCAGCTAATT TTTAAATTTT CTCAGAAACA GGGACTCACT GTGTTGCCCA GACTGGTCAT 13440

GAACTCCTGG CCTCAAGCAG TCCTCAGCCT TAGCCTTCCA AAGCACTGGG ATTATAGGCA 13500

TGAGCCAAGG CCGCCCAAAC ATATTGTATC GTTCCTGTAA CAAGCTGTTG CAGTCTATTT 13560

GATATTATTT CTTATTTTTT TCATTTAGAA TTTTCTCTGT CTAGATATTC TCAAATTATC 13620

TCTAAATGAG ATTGATCTAT GTTTTTCCTT TGTGTGTGTA TTCTTTTTGA TAAGTTTTAG 13680

TTTTTAGTGT TTTGTTTTGC TACATGGAAA GGATTTGAAA GTTTACACTA AAAAATATGC 13740

TTTTTTTTTT TAAGACAGGC TTTTTCACTG TTGCCTAGTG CTGGAGTGCA GTGGCATGAT 13800

CTCGGCTCAT TGCGGCCTGC ACCTCCTGGG CTCAGGTGAT CCTCTCACCT CAGCCTCCCA 13860

AGTAGCTGGG ATTACAGGTG TGTTCCACCA TGCCCAGCTA ATTTTTTGTA TTTTTTTGTA 13920

GAGATGGGGT TTCGCCATGT TGCCCAGGCT GGTCTTGAAC TCCTGGGCTC ACATGATTCT 13980

CCTGTCTTAG CCTCCCAAAG TGCTAGGATT ACAGGTGTGA GCCACCACAT CTGGCCATTT 14040

CATTCATGTT TTCAAATGTA TTTGAATGAG GAAAAGTTCT CCCTTGTGAT TATTTATTAT 14100

AATAGCCTAC AGAGCTATTA ATTTTTAAAT TTTGTTTACT TTATGTCTCC TTTTTTTTTT 14160

TGTTTAGGCT GAATAACCAT TTATTTCATA GGTTTATTGC CTTTTTTCTT CCAAAGAACT 14220

TGCTATTGTG CATTTATAGT CCTTTTATGT TTACGTTTTC TATTTCATTG ATTTTAACTT 14280

TCTACCTTCT TTAGATTTAT TTTGTTCTTT TTCTATCTTC TTGAATTGAG TGTGCTTTAA 14340

TTGCATTCTT TCCAGTTAAT TAACATATTT AGTGCTGTGA ATTTTGAACA AGCACAGCTT 14400

TAGCCACATC CCATAGGTGT TTCTATAGGC AGTTGTATTA GGATGCGCTA TAAGCTGCTC 14460

TGACAAAGAT ACCAAAATTC AGTGACTTAA ATAAGACCAA AGTGTCTTTC TCTCCCCAGT 14520

TACATTCCAG AGGTAGACAG GGCCTTCGTC TCAGTAGGGA CCAAATTCCT TTCCTCTTGT 14580

GGCCCTGCCA TCCTAACAAT ATTGCCCTTA TCTGTTTGGT TAGAGATAGT TCTCACCATT 14640

GGGTTCTAGT TCCAACCACT GCGAAGGACA AACAAAGGGA ATAGGGGCCA TTTCTCTTCC 14700

AAAAGATGTG ACCTGGAAGT TACTCACATT GCTTTAGCTC ACATCCCGTT GGCTAGAATT 14760

CATCACATGA CCACACCTAG CACAAAGGAG TCTCAAATAT AGTCTGCCAG GAGAGCTTGG 14820

TGCTCAGCTA AAAAACAAAG GTTCTGTATC AAGGCAAGAA GAGAAAGAGA CTGATCTGAG 14880

GGGAGGAGAG TTGGCAGGTT CTGTCACAAA ACTTCTCGTC ATTGTTATTT TTAAGGTATT 14940

TTTCCATTTT GGGTTTTTTG TTTGTCTGAT TTTTTTTTTT TTTTTTGAGA TGGAGTCTCG 15000

CTCTGTTGCC CAGGCTGGAG TGCAGTGGCG TGATCTCTGC TCACCGCAAG CTCTGCCTCC 15060

TGGTTCACGC CATTCTCCTG CCTCAGCCTC CCAAGTAGCT GGGACTACAG GCGTACACCA 15120

CCACGCCTGG CTAATTTTTT TTTTGTATTT TTATTAGAGA CAGGGTTTCA CTGTGTTACC 15180

CAGGATGGTC TCATTCTCCT GACTTTGTGA TCTGCCCACT TCGGCCTCCC AAAGTGTTAG 15240

GATTACAGGC GTGAGCCACC GCGCCCGGCC GTCTGTTTGA TTTTTGAGAT GGAATCTCAC 15300

TCTGCCCCCC TTCTGGAGTA CAGTGGTGTG ATCTTGGGTC ACTGCAACCT CTACCCTCCC 15360

AGGTTTAAGC AATTCTTGTG CCTCAGCCTC CCAAAGTGCT GGGATTAAAG ACGTGAGCCA 15420

CTGTGCCCAG CCCATTTTGG TTTTGATTTT TTTTTTTCTT TGAAATAGAG TCTCGCTCTG 15480

TTACCTAGGC TGGAGTACAG TGGCATGATC TCGGCTCACT GCAACCTCCC CCTCCTGGGT 15540

TCAAGTGATT CTCGTGCCTC AGCCTCCCAA GTAGCTGGGA TTATAGGCAC CCACCACCAC 15600

GCCCAGCTAA TTTGTTTTGT ATTTTTAGTA GAGACGGGGT TTTACCATGT TGGCCAGGCT 15660

GGTCTCGAAC TCCTGACCTC AGGTGATCCA CTGCACCCGG CCTCATTTTG GTTTTGATTT 15720

TTATTTTCAA ATGTTTTCTT ACTTTGTCAA TTTCTAATTT TATTGCATTG GGACAAAAGA 15780

ATATTGTACT CTTTCTACTG TTGGGGTTTA TAAGGGCTGT GGATATTTCA CTCGCCTTTG 15840

AAAAGAAGGT TTTCTCTGTT AGTCTGTAGA GTTTGGTATG TACCAATTAG ATTTTATTAC 15900

TTATCATTTT GGTCTTTTGT ATCCTTACTT AATTTTGTCC TCTTGAATTT TAATGGAGCA 15960

AAAGACATAA AGTCCTCTAA TAACATGCGT TCTGTTTGCA TTCTCATACT TTTTATGAAT 16020

ATTGATGCTG CACTATTTGT GTACCCAGGG AGAAGGCCAG ACCACTGTCC AAAGTTTAGT 16080

GAATCTGGGC AGCCTTGTTT CCCAGTTGTT GGAGGATGCC TCATGGAGGA AAGCATTCCT 16140

AATCCTGGAG CTTGTTTTGT TGTACTCTAA TTGAATTGTA ATGTGTTTCT TTAACCTGAA 16200

TGAATGTTTC TATTTTTTAC TTATTACACA GGTAATTCTG ACTCGAAGGA CAGAAGAGGT 16260

GAGCTGCTCA CCTTATATCT GTTGTTCCTT TTACACAGTG TACAGTATTC ATTTATTTCC 16320

TCTGCTCACA GTCTGTGGTA ACCGTGTGCA TCTGTGGCTG TGTTGTTTGT TTACTTTCCC 16380

TTAAGTTATT TCCATGTTAA TCTCATGGAG AAGAGCAATA GAAACAAGTA CTGTATTCAG 16440

TATGTTTTTT AATATAGACT ATGGATTCTA ACAGCTATGA TGTATTTTAA CAAGTAACAA 16500

AATATATCTT ACTTTGACAT GTCACTTTGT TAACATTACT TTTTGGTGAT ATTAGGTCAT 16560

AATTTCTATA CCATTAGTTA CTTCTGATTT CTAGGCCACA GTTCCCTTTA AATATTCTTT 16620

GTGTTGTTTT TCCCCTAGTG TATAAAATGT CAACCCTTTG TGGCTTTATA TGGATTTTAT 16680

GGATTTTCAG CCCTTAAATG TAAAGTCTCT ATGGCCTGAG ATGTTGTGTC TGTGGTTTAA 16740

GCTGGACTGC TGAGTCCCTG GTCACTAGAG AGTAGGGGGA CATGGGTACT TGTCTGCAGA 16800

AGTGTGGCAC ATTTTGCCTA GAATGACAGT AAGGCTGCTA TCAAAGAGCA TGAGAGAAAG 16860

AGAAAGAGAT CATCTAACAT TCTAAGAAGT GATTATTACA TTTGAGTTTT AAAAATGTTA 16920

CTATTCGAAG CAGTGTTTTT ATCATAATTT TCTATTTTAT CAAATCAGAC TTGAGTTTTT 16980

TTTCTGATTC TGTTATTTAA CCATACACAA TTTTCCCTGT GTAATTAAGT AATGGAACAC 17040

TTGGAGGCAT ATGAAGTCCC ACTAAGTAGG GAGCATTTGA GTCAGAAAAG TGGGTACTCT 17100

CTTCCTTTAT GTGATGTCCA TCTGCCATTG TATTTGGTAA GGAATAGTGA GGTGTTACCA 17160

TACTGTGTAC AGATTTCCCT CACTTTTCCA CCTCTCACTT TCCTAAACTT GGGAACTAAA 17220

CATTGGATTA ATACAGTGTC TTTGCTGTTC AGATTCACTT GCCAGATTTT ATCAAATGTA 17280

GACTTAAATA GGTTTTATTG TGATAGATAT TTACTTGCTC CCTAAAACTG CTCTCTTAAC 17340

CAGCCTTACA ATAAAGTCAA AAGTCAAAGT GGTAGGCTTC AAGATGAAAC ATAAGATCTG 17400

TTGACTCCTT CCTCTATTTA GTATATATTT TCATAATATT CAGCCTTTTC TTGCCCCAGA 17460

TATCATATCT ATTTTACCTA CCCAATATTT AAGTAGTTTC CATGTTGTGA TTAAGAAAAC 17520

AAAATTACCA TAATTACCTA GATTATTGCT AATTGTGACA TATGTAAAGT CTATTAATGT 17580

AATAAATCTC CTTTCTTAAG TCAAAAAATA ATTTTGTGTA ATTCCAAACA GGAAACTGAA 17640

AAGGCATAGG TATTCTCAGC AGTCTCTAAA GTCCCAAAAT CTAATGGCAA TTTTACCAGA 17700

GCAGATCTTT AGAAGTATTG CTATAAATTT GGATATCCCA TTCTAATTTT AAGCCAAATG 17760

CTTTTTGAGA AATAAGCCAG CTGTTTGGAA ATGCTTGTAT TATAATCGGT TTGATAAGCA 17820

GTTATGTCTT ATGCAGATGA ATTAGGGGCT ACCTGTTTTT ATGCACTGGT CTTTGGGGTG 17880

CTTTTGAACA GTAGTGTCTG ATGTTTTAAT TGTCAAAGCA AAAAGAAATG AGAGGGAGGG 17940

CAACTTTTCT TCCTCTTCTG AATTCCAGGA AACTGGTTAT TTTCTCATGC CATATGATTT 18000

TAAAATATAT TCCCAGCCAG GTGCAGTGGG TCACGCTTGT AATCCCAGAT TTTTGGGATG 18060

CCAAGCGGGG GGA 18073

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TCCGAGCTCC ACGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCCCTCTGG TGGCCCATTG 60

AGAATCAAAA CTTGCAGTGA GTGACTCTAT AAAATGGAAA ATTGAATCAA GTCTGAAAAT 120

GATCCACATA GTTCTACAGC AGGGCTGGAC ACCGTGGTCA GGACCTCAAT ATATTCTGCT 180

TCCACAGAAT TCAGACAGTT CAGAGTTTGG TGAATTAACC TCAAAGGCAG CAAGATATCT 240

GTCCCGGGAG TCAGCAGGTA AGCATAGCAG AAATGGCTGG AGCAGCGGGA GCCTGCTTTC 300

CTTCTGTTGG CTGCTAGCGT CCACTCCATT ATAGCTCCTG ATGGAAGATT TCTACAGAGT 360

GATGCCTCAG AATCTTCCTT ATACCTTTCT TCCATGATCC TTGCACCTCT TTTTCTAGAT 420

TTGCCCACAT TCTTATGTGC AAGTAACTAG ATATACATTA TCAGACAAGC TAGCAGACCT 480

GCATATATCC ACTTCCCTAC TTTTCCTATA ATTTCTTCAC CTGAACCTCT ATCATTCTTC 540

TCTTTCTGTG TTGACTCTGG TGTTAACCTT GCAGGCAAGT TGAGCGTGGG TTTGGTGTCA 600

CAGTGAAGGA CTAAGGGAAT AGTTAGCCTT CTATTTATTA ACAAATCTTC CCTTTGATGT 660

CTGGATCAGT GTCTCTCTAA TAGGAATTAT TGGCATGTTA AGGCAAAGAA CATATGCTTA 720

TTGAGTGCTG ACTGATTGGG GTTAATACTA ATTTGATACT ATTAAGGTGT GGGGCCCAGG 780

AATGCCAAAA TTCTACCTCA ATGTAGAGCC ACCATTCCCC TTGAGGTAAC CTAGGTGGGA 840

TAGATATACG TGTAAGGGCT AATGGAAGAT AGGGAATCAA AGTATCACTT TATTTTTTAT 900

TTTTATTTTT TATTTAATTT TTTTGAGATG GAGTCTTGCT CTGTTGCTAG GCTGTAGCGC 960

AGTGGCACAA TGAAAGTATC ACTTTATTAT TATCTGAGCT TGTGCCCTAA ACTTCACTGC 1020

AGAATATGCT GGTAAAATGG ACTGGATTAC AGGATTTAGA GGCAAGGTCC ACAGGTCAGG 1080

ATAAGAGGTA AAGAGGGAAA TCTTTCTCTC TTCCTAAGCC CAAACCCTCC ATGACAATTG 1140

AGATTAAAAA AAAAAAATAA ACTGATGAGA GAATCCAAGC ACAGTTGATC AAAGAGGAAA 1200

GAGAAATGAT GATGTTTCCC TCTTTCTTTT TCATGAGAAA GTGGCTCTCT TATTGATCGG 1260

CTACTTGATT AGAGAAACAG TGGGGGAAAG AACTGCCATA TCCACATGTG CAATTTTTTA 1320

AAACACACAG TGATTCTGAA CACTAGTATA AATTCCCAGT CAGTGTTCTG GCCATCTGAC 1380

TACTCAGGTT ATAATACCTA ATTTTTACAA GGGAGTTGGG AAGTGTGCCA AACCTGTAGA 1440

AGTCTATATC TACTGTATTC AGATTTTATA TGCATTATTT TATATAACCT TTTGACCTCT 1500

CTCCTCTATC ATCACTTGAG TGATTTCATC CAGCGTCATC ATTTAACATA TTTTAAATAA 1560

CTCTATATAC TGATAATTCC CAAATTTATA TCTCCATCCC CGATTGTTCT CCTAACCTCC 1620

AGCCTCTAAT ATCCAACTGC CTACTCAAGC CTCAGCAATG GTGAGCGCCC CTGCCCCAGC 1680

CTCGCTGCTG CCTTGCAGCT CGATCTCAGA CTGCTGTGCT GGCAATGAGC GAGGCTCCGT 1740

GGGCGTGGGA CCTTCCGAGC CAGGCGCAGG ATATAATCTC CTGGTGTGCT GTTTGCTAAG 1800

ACCGTTGGAA AAGCACAGTA TTAGGGTGGG AGTGACCCAA TTTTCCAGGT GTCGTCTGTC 1860

ACAGCTTTGC TTGGCTACGA AAGGGAATTC GCTGACCCCT TGCACTTCCT GGGTGAGGCA 1920

ATGCCTCGCC CTGCTTCGGC TCATGCTCAG TGCGCTGCAC CCACTGTCCT GCACCCAGTG 1980

TCCGACGAGC CCCAGTGGGA TGAACCCGGT ACCTCAGTTG GAAATACAGA AATCACCCGT 2040

CTTCTGTGTC CCTCATGCTG GGAGCTGTAG ACTGGAGCTG TTCCTATTTG GCCATCTTGG 2100

AACTGCCTTG CATTCAGTTT TTAATATCCA ACTGCCTATA CGATATCTTC ACTTGGATTT 2160

TGAATAGGCA TATCAAACTT GTCATGTTCA AAAGTGAGGT TCTAATCTTC CCTCCCAAAC 2220

CTGCTTCTCC CATGGCTTTC CCCATCTCAG TAAATAGGAA TTTCATCCTT CCAATTGCTC 2280

ATGCCAAAAA TTTGGGAGTT ATCCTTGACT CTTCTCTTTC TCACACCCCA CATTCAATCC 2340

ATCACCACAT TCTGATGCCT CTATCTTCAA GATATACTTA GACTTTCACC ACTTTTCTTC 2400

ACTCTGCAAT TACCACTTTG GTCCAAGCCA CTGTTATCTC TTTCTTGGAT TATTGTAATA 2460

GCTTCCTAAT AATTTGTCCC CTTTCTTCCA CCTTTGTTTC CCCTACAGTA TAATCTTAAC 2520

GAAGCAGCCA GAATGGTTGC CTACAAACCT TTAAAATGGT AAGCCAGAAC ATGTAGGTAT 2580

ATTCAAAACC TTCCAATGGC TTGTCATGGA ACTAAAAGTC TCTACATTGG CCTATAAGAC 2640

CCTATGTCAT CTACCCCTAG TCTCCTCCTT TCTAACTTCA TCTCCTGCTA TGCTGTCCTT 2700

CAACTCACTC TGCTCCAGGT GCTCTGGCCT CCTCAAACAC ACCACACACA CTTGCAGCTC 2760

ACAGTCTTGG CACTTGCTGT TCTTCTCCTC TAGGACCTTC TTCCTCCAAC TGTCTGGTTC 2820

ACCCACCCCT TCCTTCTGGA TTTCTGCTCT GATGTCATTT TATCAGTGGG CACTTCCCAA 2880

TTTCTCTATT TAAGACCACA ATTCCAGGCC AGGGTGGTGG TTCATGCCTG TAATCCCAGC 2940

ACTTTGGGAA GCCGAGGTGG GCAGATCATG AGGTCAAGAA TTCGAGACCA GCTTGGCCAA 3000

CATGGTGAAA CCCCATCTCT ACTAAAAATA CAAAAAAAAT TAGCCAGGTG TGGTGGCACA 3060

TGCCTGTAAT CTCAGCTACT TAGGAGGCTG AGGCAGGAGA ATCGCTTGAA CCTGGGGGGC 3120

AGAGGTTGTA GTGAGCCGAG ATTGCGCCAC TGCACTTCAG CCTGGGCAAT AGAGCGAGAC 3180

TCTGTCTCAA AAAAAAAAAA AAATTTGCTG TTATTTCCTA TACTATTTTT GTAAGGCAAG 3240

GACCTTATTA TTTTCCTTGA TAATACCTCT CACACTTTAT AATTACATAT TTGACTTTGT 3300

TGATTAATGA ATATCCCTCC TTTATAGCAT AAATTCCACA AGAGCAAGGA TTACATGTCT 3360

GCTTCATTCT CACTGTACAC CTAAAACCTA GCACAGGGTC TCACACATAA CAGGCACAAA 3420

ACAAACAATG GATTACGTTG AGCCAAAGAA CAAAAAAAAA TAGTAATTTA TCACTAAATG 3480

TCTTTGTTAA ATTCCAACAA CAGGGGGCAG TATATCAGGT ATTATAAGAA AGTAATTAGG 3540

CACATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC AAGGTCAGGA GTTCAAGACC 3600

AGCCTGGCCA ATATGGTGAA ACCCCGTCTC TGCTAAAAAT ACAAAATTAG CGGGTGTGGT 3660

GGCACACCCC TCTGGTCCCA GCTACTCAGG AGGCTGAGGC AGGAGAATCG CTTGTACCCA 3720

GGAGGCGGAG GTTTCAGTGA GCCAAGATCG TGCCACTGCA CTCCAGCCTG GGTGACGGAG 3780

CGAGACTCTG CCTCAAAAAA AAAAAAAAAA AGAAGAAGAA GAAAGTAATT AGGCACCTTT 3840

GGCTTAAGAC ACTGGGCTAA ATCCATGAAT TTACTTCATC TTCCCCCAAA GCACACTGAC 3900

ATGGTAGAAG AAATATAAAA ATACTAATGA ATCAACAGCA TATCTGAAAG GCAGCAAACG 3960

GTGGCATATG TAGATCAGAA TCTTTGAGAG ATTTCTGGAA GACAAAACAG ACCAGACTCG 4020

ATGTCCAAGA GATCAAACAG AGCCAAAGAG CCTCCAGCTG AAAACTAAGT ACTAGTTCTA 4080

CCAGTTTGGG CCTGGAAACA CCTCAAGCTC AGAGGGAATT GGGACTGGGG TTGAAAGTGG 4140

ACCTTGAGGT ACCAGGATGG TACTTAAGCA AAGGCCTGCC AACCCAGCAC CAGTACACCC 4200

ACAGCCCAAA TGACAAGCGG GGCTTCCCAT CTAGACTCAG CTGGAAAAAC AGTGCTCTAC 4260

ACAGAGTAGA GAGTTTGTCA CAGAGACTGG TAAGGGCTTC TTTTTTACAA AACATATGCT 4320

GCATATATAT TTTCTCAACG TCACACTAAT GACATTTTGG GCTATACAAT TCTCTGTTAT 4380

GTGGGTCTGT CATGTGCACT GTAGGACATT TAACAATATC CCTAGCCTCT AATTATTAGA 4440

TGTCTGTAGC AAATTCCCAA TTTTGATGAC CAAAAGTATC TCCAAGCATT GCTAAATGCC 4500

TTTGTGGGGG AAATAGCCCC CAGTAAGGAA CCACTGGTCT ATACTCACGC CATTCTAACT 4560

GAATTCTTTT AAGGCAAATC CGAGACCTAG CATTTCAAAT GCAATTACTT AGGTATGTAT 4620

CACCAAGAGA TCAAGATTCT TAACATAAAC ATAATACTAT TATCCAATTT AAAAAGTAAC 4680

ACTAATTCCT TAGTATCATC TAATATTATT CAGTTACTGC TTGAATTTCC CTGAGTGTCT 4740

CATAAATGCT TTTTTTTTGT TTTGGTTAGA ATTGACACCA GAGCAGGTCT ACACTGCATA 4800

TGATTGTTAA GTATATTGGG TCCACAGAAG GTCTCCTGGG GCCTGCAGAC AGAAAAAAAC 4860

CATAGTAGTG CCCAAGCTAA TTCTAGGCAA CCACAAGAGA GGAAAGGAAA AAGAAAACGG 4920

CAGCTCGCCT AGAGGATAAC TGCACCCTGC CCCGATTTTC CTGAGCCATC ACTGAACCCC 4980

TTCCTGGTTT AGGACGTATG TCCATGTTTG TCTTCTGAAG GGATGAAGGG ACACCTATTG 5040

TGAGCACAGT CTAAGCCACT CAATGGTCCA GGGCATAGCT CAAACAGAGC AACAGTAGCC 5100

CTGGGAAATG GAGGTGACAA AAGAAACAGA ATAAATCTTT CAAAATATAC TGCAATTTGT 5160

GCAACAGGAT GCCATATTGA TTTAAAAAAA TTTTTTTTCT TAAATTTTTT GTAGAGATGG 5220

GGGGAGGGGG TCTTGTTGTT GCCCAGGCTG GTCTTGAACT CTTGGTCTCA AGTGATCTTC 5280

TTGCCTTGGC CTCCCAAAAT GCTATGATTA TGTGCGTGAG CCACTGCTGC ATTGCGTTTT 5340

TTTTTCTTTT CTCGAGACGG AGTCTCACTC CGTCACCCAG GCTGAAGTGC ACTGGCGTGA 5400

TCTTGGTTCA CTGCAACGGC CTCCTGGTTC GAGCGATCCT CACACCTTAG CCTCCCTAGT 5460

AGCTGGAACT GCAGGCCTGG CTAAGTTTTG TATTTTTAGT AGAGACAGGG TTTCACTATG 5520

TTGGCCAGCC TGGTCTTGAA CTCCTGACCT CAGGTGATCA GCCTGCCTCA GCCTCCCAAA 5580

GTGCTGGGAT TATAGGTGTG AGCCACTGTG CCCAGCCTAC ATTGATATTT TTTAAAAGCC 5640

ACTATTTAAA AAGGAGTAAT CTGAGTAGTA AGAAGGAGTT CTTTAAAAAC TGGCCGGGCA 5700

TGGTGGCTCA CGCCTGTAAT CCCAACACTT TGGGAGGCCG AGGCAGGCAG ATCACCTGAG 5760

GTTGGTAGTT TAAGAGCAGC CTGACCAACA TAGAGAAACC CCATCTCTAC TAAAAATACA 5820

AAATTAGCCA GGTGTGGTGG CACATGCCTG TAATCCCAGC TACTCTGGGG GCTGAGGCAG 5880

GAGAATCGTT TGAACCTGGA AGGCAGAGGT TGCGGTGAAC CGAGATCGTG CCATTGCACA 5940

CCAGCTTGGG CAACAAGAGC AAAACTCCGT CTCAAAACAA AACAAAACAA AAATGAAAAC 6000

AAACAAAAAA ACACCAACAT GATTAGGAGG GAAAAAATCT AGATAGAAAG GCTTAACAGG 6060

GCCGGGCACG GTGGCTCATG CCTGTAAGCC CAACACTTTG GGAGGCCAGG GTGGGAGGAC 6120

TGCTTGAGGC CAGGAGTTTG AGACCAGCCT GGGCAACTTA GCGAGACTCT GGTAGTCTGT 6180

CTCTACCAAA CAAACAAACA AACACCTGAT TAGCTGGGCA TGGTGGCATA TGCCTATAGT 6240

CCCAGCTACC CGGGAGGCTG AGGCTGGAGG ATCGCTTGAG TCCCAGAGGT CAAGGCTGCA 6300

GTGAGCTGTG ATCAGGCCAC TGCACTCCAG CCTGGGCGAC AGAGCATGAG TCTGCCCCAG 6360

CCCTGCCTCC AAAAAAAGAA AGGCTAAATA GGAGAACTGA TATAACTGAA AACCAAATTA 6420

GTTGTGTGAA AGAGCAACTG TCCTGGAAGC TCCCAGAACA CAGAGCAATA AGAGATGAAA 6480

AATATGACAG CATAGAAAAG AAAGGAACTG GATAGGTCCA GGAGATCCAA TACCTGTGCA 6540

ACAGGAGAGT CCAAAGAAGA AACCAGTAAG AAGGGAGAGA AGTAATACAA GAAAGTTCCT 6600

GAGTTATCAG GCCAAAAGAA ATAATCTAGT TTGTGGAGTA ATATTGACAA AAAAATCTTT 6660

ACACCTAGAT GTATTCTGAA AAAATTCTTA AATTCTAATT GAAATCAACC AACGAACCAC 6720

AGGCCAGCCT TAGAAAACCA TTTCCAGGGC ATGGGGTTTT AGGGTCTGAC AGACCTGAAG 6780

TTCAAATTCC TACTATCCTA ACTTACTAGT AGTGTGATAA TCTCTTAGAA CAATGTATGA 6840

AATGGAAGCA TAATAGCACC CTCCACCTTT TAGAGTTAAT GGGAGATCTA AAAGAGGTAA 6900

CATTTGCAAA GTGTCTGACA TGAAGGGAAG AGATTGGCTT TGGCATCCAC AAGTTCACAC 6960

ACTAGCAGAG AACCTCAGTC CAGCTTCCTA CGCTCAGGCA GTTCTTTGCC TAGAAGAGGG 7020

GTCGGCAAAC TATAGCCCAA ATTTAGCCCA CTGCCTGTTT TTGTAAATAA AATGCTATCA 7080

GAACATGGCC ATGTTCATTC ATTTACATAC CATCTATGGC TGCTTTTACA TTACAAAGGC 7140

AGAGCTGAGT AGATGAGACA GAGACAGTAT GGTTACAAAC CGAAACTGTT TCAACCCCAA 7200

CTTCATTCCA GCAAAGTTTT ACTTTCTAGA TTCAGGCCAG GGAGCAAGCA TGAAAATGAA 7260

AACCACTAAA ATGGTGTCCC GGGACAACAG ATACCTACTT GCTATAACTT CTTTCCTTGA 7320

AAACAAAGGG CCATATTAAT TGAAGGGCTC ACCTCTAAAC AGGTGAGTGA CTTAAGGACT 7380

TCAGACACAC ACTGGTCAAC TACAAACTAG TCAGTAAAGG AATAGCCATA GTCCTATAGC 7440

CCCAGTTCCT ATGGCCCAGG GGGATCCACT AGTTCTAGAG CGGCCGCCAC CGCGGTGGAC 7500

TCCAG 7505

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 529 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GCTGAGGTGC ATCGCGGTGG CGGACGCTCT AGAACTAGTG GATCCCCAAA CAAAACCTGT 60

CCCTGCTAAT GATGGTAGAC CCAATCAGAT CCCCGGAGAA GCCGAAATAC GGAAACCATA 120

TCAGCATACG CATGGCATAC ATAGAACCCC ATACATGGAT TGCTTACTCA GCCAGATATA 180

GAAATCTATC TTCACGATAG AGATATATAT ATATAGACAC ACTGCATATA CAGATGTGAG 240

ATGGAGGCTC ACTCTGCCAC CCGTGCTGGA TCTACAGTGG CACAAGCTCA GTCCACAGTC 300

ACGTCGATCT GCCGGGCGTG ACCGACTGAG ATGCAGCGGC CTCGGGCGTA GCTGTGAGTA 360

CACGCACCAG TCATCGCGAC TGGCTGCAAG TGGTATAAGC GGAGGGGACA GGGTTACAGC 420

ATGACGGCTA GGCAGGCCGC AAACTGAGGA CCACAAGAGT GCCACGCTGC CCGAACGCAT 480

GCAGTGGCGA GATTACATGG GGCAGCCACT AGAGCCGCCG TATCAGAAA 529

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TACCACGCGG TAGCGCCGCT CTAGAACTAG TGGATCGGGT AATCCAGCAC TTTGGGAGGC 60

CAAGGAGGGC AGATCACCTG AAGTCAGGAG TTTGAGACCA GCCTGGCCAA CATGGTGAAA 120

CTCCATCTCT ACTAAAATTA CAAAAATTAG CCGGGCGTGG TGGCGCATGC CTGTAATCCC 180

AGCTACTCGA GAGGCTGCGG CATGACAGTC ACTCAAGCCC GGGAGGTAGA GGTTGCAGTG 240

AGCTGAGATT GTGCCACTGC ACTCCAGCCT GGGTGGCAGA GTGAGACCCT GTCTAAAAAA 300

AAAAAAAAAA AAAGGCCCAT TAGGGGACCC AAACGGTTCC CCAGCTTTGT TGGATTTCCC 360

CAAATTTGGG GCCAATTTTT GGAGGGTTGT CCCTTAAAAA TTTAAATTTG GGGGTTTTTT 420

TCCAGGCGCC CATTAGAAAT GGGTTCCGAA AATTTTTTGG CCAAAAAAAT TTGGTTTAAC 480

CGCGGACCAA AATCCTAAGG TTTAACTTTT TCCTAAACCT TTTAGAATTT AAAGTTTCCG 540

GGGTTTCTCA GGAGGGGGTA ACCCTTCACC CCAATATAAC TCGGAAACCC CCCTTTTTTA 600

GGAAAAGGGG AATTAGTGGT GCTTTCCGGG CCAAA 635

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 938 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCCAGGGACC AAGCGAGTGC GACCGCTCTA GAACTAGTGG ATCCCCCTTG AAGACTATAT 60

TTCTTTTCAT CACGTGCTAT AAAAATAATT ATAATTTAAA TTTTTTAATA TAAATATATA 120

AATTAAAAAT AGAAAGTAAA AAAAGAAATT AAAGAAAAAA TAGTTTTTGG TTTCCGAAGA 180

TGTATAATAG GTTGAAAGTT AGAAATTATT ATTATAATAG CAAAAAAAAT TTAAAGTTAG 240

AAATTAGAAT TTAAGGCTCT ACACACGTTT ACGATGATAT TGGACGAACG ACACGATTAG 300

ACAGTTGTAG GTTGTGTGTT GTGATGTTTT TGAGTGATTT GTAGTGTTTA ACCTTGTGGT 360

TTGGAAAGGT NGTATGAGTA TTAATCTCGG GCTTATTGGG AGGTTTATGT GCAATGCATT 420

TTGTGGTTTT TTTATAATGT TGTGTTTAGG GTTAAAACCT GTTGTGTATA TTGTGTTGGT 480

TTGTTGCTTG TTTGTACATT GGTATGATGC CTNTTTTGCT TATGGGTTNG GTGTTTGGTT 540

TTGGTTGTGT TTTTTGTGGT GTGTTGTTTG ATAGTTTTAG CGGTTGTTTT TGGGTTGTTG 600

TTTTATGTTG TGGTGGTGTT TTGTGTGTAG AGTTGTGGTT TGTGTGTTTT GTTGGTTGTG 660

TTGTGGTATT GTTTATGTTT GTCGTGTGTA TGGTTTGTTG TTAGTCGTTG TTGTAGGCTT 720

GTGTGTTGTG TGTTGTGTGT GCGTGTGGTC TAGTTTGGGT GGTATTGTTG ATTTAGTGTG 780

ATAGTCTGTT AGAGTTTGGG TTGTTGTGTG TATTGGGTTT GTCTGTGTGT GGTTTTTTTG 840

TGGGTGTAGA TGATGATTTG TGTATGTGGG TGAGGTATAT GTTATTTGTG GTATTTCGGT 900

TGTGATGTGT TGGTTATTAT GTGTTTGTTA TGTGTATT 938

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1145 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GTCTCCGAGC TCACCGCGGT GGCGGCCGCT CTAGAACTAG TGGATCCCCC GCTCTCACTC 60

CCTGACTCTT GCCTTCTGTA ACAACTGGAG ACAACTCTTT CAAAACCAGC TCCAAGCCCC 120

AGACTTCTCT CTGGGCTTTA GTTCGTAAGG CAGGTGCCCT ACTGAGTGAG CCTAGATCAG 180

ACAGAAACAT AGCTGTTGGC AAGGATTTAG GTGAATTTCC TTCCATTGTT TTTCTAATAC 240

CTTTTTTTTT TTTTGGAAAA TATAACCATG CACCTACACA CATATTTGAA TATCCTGCCT 300

TTTTATTTAA AATGACATGA TAGGTCCGGG AGTGGTGGCT CATGCCTGTA ATCCCAGCAC 360

TTTGGGAGGC CGAGGTGGGC AGATCACCTG AGGTCAGGAG TTCGAGACCA GCCTGGCCAA 420

CATGGTGAAA CTCCATCTCT ACTAAAAATC AAAAATTAGC CGGGCATGGT GGCAGGCTCC 480

CAGCTACTCA GGAGGCTGAG ATGTGAAAAT CGCTTGAACC CGGGAGGTAG AGGTTGCAGT 540

GAGCTGAGAT CTTGCCATTG CACTCCAGCC TGGGCAATAA GAGCGAAACT CCATCTCAAA 600

AAAAAAAAAA AAAACCCAGG GATAAACTTT CCAAAAGGCC CCAAAAAGGG GCATGATTAA 660

GACAATAAAT TAGTCGAAAA TTGTCAATAT AAATGAATAA TAATTTTTTT GGCCATTCTG 720

CCAAGTGGCA TAACCCTGTC ATTCTGCCCA TTCGGCAACT CTTTTTCCTC CCGGGGAATC 780

GCTCCCACTT TTTGCATGGG TTTTGGATGG AACTGTTGGT CACAGGTTTT TCACCCCCAT 840

TTGGCCCTCC CAGAGGTGTA CAAAGTACCC CAGCCTGGCC CTTTTTCACC CAATTTTCCC 900

AGGTATATTC CCCCGGTTTT GGTCCCAGGT TTTAACCCCC CCCTCCAAAG GGCTTTGGGT 960

TTTGGAAGGA TTAAGTCCTC GAAATAGGCC CCTCATAATA CCTGGGGGGG GGACCTTTTT 1020

CAAAGTTGTG GGCACCTCTT GTGTCGCCCC CACGGGGGAC TGATGTATTT ACGCCCCNTT 1080

GGGGNNTAAT ATGGATTGNT ATGTATTGGG CGAGGAGAAA ATATTTTTGA TGGGGTTTTT 1140

CTCTT 1145

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 852 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TCACCGCGGT GGCGGCCGCT CTAGAACTAG TGGATCCCCC GTTTTGCTCT CTCCTTAGAA 60

TGAGCTGGGA ACTAGTCACT CTTGTTTTCT CACCTATAAT AGCATCTGGG TCCAGTGTTT 120

TTTATGTGGG ACAAATTTGA ACTTGTGGTC AACCTCTTTA ATTGTAAGAA TATTCAGGTC 180

TTTTGTTCTT CCTGGGCTAG TTTTTTATTC TTTTTCTAGA GATTCGTTCA TTTTTCTTAG 240

TTTTATTTGC CTATAATTGT GGATAATCTG TTTTTTATCT GCTACTTCTG TAATTATTTC 300

CACATTTGAT TTATAATATT AACTTGTGGG CCAGGCGTCG TGGCTCACAC CTGTAATCCC 360

AGCACTTTGG GAGGCCGAGG CGGGCGGATC ACGAGGTCAA GAGATTGAGG TGAAACCCCC 420

TCTCTACTAA AAGTAGAAAA ATTAGCTGGG CATGGTGGTG CGTGCCTGTA ATCCCAGCTA 480

CTCAGGAGAC TGAGGCAGGG AATCTCTTGA ACCCAGGAGG CAGAGGTTGC GGTGAGCCAA 540

GATTGCACCA CGGCACTCCA GCCTGGTGAC AGAGCGAGAC TCCATCTCAA AAAAAGAAAA 600

AAAAAAAACT GTCAAATGAT ACTCCAAAAT GGTTGTACCA TTTTATATTT GCAACAACAA 660

TGTCTGAGGG TACTGATTGC TCCATATCCT TGACAGCACT TGGTATAGCC GATCCTTTAA 720

TTTTAGGCAC TTTAAGGGGG CAAATACCTG GGATTTTAAA GGTTTAACCT TTTTATTTTC 780

CCAAATGGGT TAATAGGTTC TCAGCAACTT TTCAAGGGGC CTAATTCCCC CCTTCAAAAT 840

AACCTCCCCT GG 852

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1854 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

CCGGCACTCA CCGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCCCCGGAA ATGTTACTTC 60

CAACATTTTA GAACTGAAAT GATTCTTAGT CTGGTGATAA ATGTCAATTA AAATAGTTCT 120

CCTTTCACAG AGAAAATTAA GAAAAAATTA GTTCAAGAAA ATATCAATCA TGATTGCCAG 180

CGGAAATTTG TTTCTGCAGT AAAACAAGCA AAACAAATCA AATCCATTAA AACTAGCAAC 240

AGACTGTCTT CTAAAGTCAA GTTCACATCT GGAGATTTTT ATAAACTTTA TTGGAAAAGT 300

TCTGGTTATC TATATTTTTA GCATAGCAAA ATATTCTTCT TGTTTGTTGA ATTTGATATA 360

AAATGTTATT TTTAGCCAAG TCCTGGGGCA ACTCCTACAT GGCTGGAAAA TGTTCTCGGT 420

GTTAACAAAG ATGCAAAGAT CTTAAATATT AATGTTATCA ATCAACTGGA TACTCTTAAG 480

TATTATTTGT AATTATGTCC AATGTCATCA CCACAGGGCT GACCAACAAG CAAAGAGCTG 540

ACAGTAGTAG CAAAATGTAG AAATCTCTGG TAAGCATGTT GTGTTTATCA ATCCTCTTCA 600

AATAGATGAA ATTAAATTGC ATTTAAAGAA TGTTACTTAT ATTAGGCATT TTTTGTGAAA 660

GACGTTTTAA ACTATGGTGT CAGAAAACAG AAATACTAAA CAGAATGCAT TTAACAGGAC 720

CTTGAAATCA CTGAATACTC ACCTGTGTAA AAGTCAAAGT TCAGATAATT GAAATGTTCT 780

TACTAGTCTC AAGATGTCTT TTGGTTACAT AGAAATTTCC ATGCTGAATT TTGATTTTTT 840

TAAAAAGCCA TTAATATGAG TCAAAATCCA TTATTTCACA AGTAAATGAC CTTTTTATTA 900

AAAAAAAAAA AGAGAGAGAG AGAAGAGCAA GGAACCACCC ACATCTAACC TCTTAAATCT 960

GAGATCAATA TATCAAAATT TTAATGTACA TTGAAAACAT TTTCATTTTA TTCCACACAC 1020

TACCTTTTCT TCATAATTTC TTATTCTGGA CATATAGCAG TTTTTTTTGT CTTTTAAAAC 1080

AGGAAAAATA AACAAACATG GTCTTATTAT TGTTACTAAG TCACAGGTAG TAAAGATGGG 1140

ACCAGGAGAA CCTTGGAGGA CTAGAAACTT CTCAAGAGTA GTTAGATTTC ACATTCAGAG 1200

GGAGGACTCA GAGTCCTGCC TGGGACATAC ATTTGCATTC TAGGCTCAAG AGCAAATATG 1260

TCAGCTTTCC TTTGGTCAAA CAATCTTTGC TACAGGTCCT AGGTAGTTAT ATCAGTGGAA 1320

CCTACTAAAG ATGATGGAAT TTGTGGTATT TCAGGGTAGG AGGTAAAGTC TTAGCAGGCT 1380

CAACTATACA TGATCTTAAA ACTAAATTTG AAATGCAGAT GTTCTATGAG TTAGTTGGAT 1440

ATTGTAGTTA TCCCATCTAT CAACTGATCA CATTTGGTAT GAGCTTGTTA GTTCTGATTA 1500

GGACTCATCT CAACATAATA AGAAGGGTGG CATTTAGGGC CCAGTGTGGG GGCCTAGTGA 1560

TCACTGCTGG GACACTGCTT CTAAATCAAC ATAACTAACC TCTCTAGGAT GGCAGGCTGA 1620

GGCTGCTCAA GTACTTCCTG TCTGGCATCT GGGACAGGGC TGAGTCTCTG GGTGGGAAGA 1680

TGGGTGGGAG GACTGAGGCT GATGAGTATA TGATATAAAT GAGAGCCATT GGAATGGCTC 1740

CACATACAGG ACATGTTGAT AAATCATTTT AACATATTTT GCTTTCTCTC TCTGGTGGCC 1800

CATTGAGAAT CAAAAGGGGG ATCCACTAGT TCTAGAGCGG CCGCCACCGC GGTA 1854

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CCACCTTTTC AATTCATCAT TTTTTTTTTA TTCTTTTTTT TGATTTCGGT TTCCTTGAAA 60

TTTTTTTGAT TCGGTAATCT CCGAACAGAA GGAAGAACGA AGGAAGGAGC ACAGACTTAG 120

ATTGGTATAT ATACGCATAT GTAGTGTTGA AGAAACATGA AATTGCCCAG TATTCTTAAC 180

CCAACTGCAC AGAACAAAAA CCTGCAGGAA ACGAAGATAA ATCATGTCGA AAGCTACATA 240

TAAGGAACGT GCTGCTACTC ATCCTAGTCC TGTTGCTGCC AAGCTATTTA ATATCATGCA 300

CGAAAAGCAA ACAAACTTGT GTGCTTCATT GGATGTTCGT ACCACCAAGG AATTACTGGA 360

GTTAGTTGAA GCATTAGGTC CCAAAATTTG TTTACTAAAA ACACATGTGG ATATCTTGAC 420

TGATTTTTCC ATGGAGGGCA CAGTTAAGCC GCTAAAGGCA TTATCCGCCA AGTACAATTT 480

TTTACTCTTC GAAGACAGAA AATTTGCTGA CATTGGTAAT ACAGTCAAAT TGCAGTACTC 540

TGCGGGTGTA TACAGAATAG CAGAATGGGC AGACATTACG AATGCACACG GTGTGGTGGG 600

CCCAGGTATT GTTAGCGGTT TGAAGCAGGC GGCAGAAGAA GTAACAAAGG AACCTAGAGG 660

CCTTTTGATG TTAGCAGAAT TGTCATGCAA GGGCTCCCTA TCTACTGGAG AATATACTAA 720

GGGTACTGTT GACATTGCGA AGAGCGACAA AGATTTTGTT ATCGGCTTTA TTGCTCAAAG 780

AGACATGGGT GGAAGAGATG AAGGTTACGA TTGGTTGATT ATGACACCCG GTGTGGGTTT 840

AGATGACAAG GGAGACGCAT TGGGTCAACA GTATAGAACC GTGGATGATG TGGTCTCTAC 900

AGGATCTGAC ATTATTATTG TTGGAAGAGG ACTATTTGCA AAGGGAAGGG ATGCTAAGGT 960

AGAGGGTGAA CGTTACAGAA AAGCAGGCTG GGAAGCATAT TTGAGAAGAT GCGGCCAGCA 1020

AAACTAAAAA ACTGTATTAT AAGTAAATGC ATGTATACTA AACTCACAAA TTAGAGCTTC 1080

AATTTAATTA TATCAGTTAT T 1101

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 120 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

AACTAATGTA TCCCCCGGGC TGCAGGAACA CGATATAAAG CCTTAAAATT GTGCGAATGT 60

GRTAAGTCGA TCCAATCTCA ACTGCTATCT RTGTACCAGA ATAGTTTCAT AATTACGTGT 120

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GAATTCTCTG WKATTAKAAC TATCTTGMCT CAAATTSACT TGGTGAGCTA ACCTGGCCTG 60

TGGTCCCTTG GCTTTAATGG AGGCTTTGTC ATATAGATCA TMTGTGGTAC TKGTGCCTAG 120

TTGTAGTGCC CTGCCTTGCT STTCTWGGCT TACTKGATTT WGGGGTATAC ATCWATKTAA 180

YTSAAAGGTC TTTCTCCTCC CGYYGGGAGA ATTTCTCCTC CTCCCTCGGA GAACTCTTTC 240

TSCCGAAATT CTATTCCGGG CTGGGTCTCC ATTCTGCTTA CCTCCCACAC TTTTAATMAA 300

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 599 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GAATTCCCTC TTGCTTGGGG GAGGTCAGCC TTTTGTTCTA TTCAAATCTT TGAGGAAAAT 60

AGAAAGCAAA GAATATATTA ACTATATTAA ACAAACTAAA TGTTCCAATT AAAATACAAA 120

AATTATAAAG CCTAATAATA AAAGCCCTCA ATTATATGCT GTTTAAAAGA GACATTTTTA 180

AGCTTAAGGA TATAGAAAAG TTGAAAATAA AAGAATGGAA TAAAATAAGC CATGAAAATA 240

CTAGTATAAC ACTGATGTCA AAATCTGACA AAGCACACAA AAAAGAAAAT AACTTTAACT 300

GCAAAATCTT AAAATCCTAG CAAAGAAAAA GCAGCATATG TTATAATTAT ACCACAACCT 360

GATCAAGTAA GGCTTACTTC AAAAATTTAA CCATGGTCCA TTATTGGAAA ACATATTAAT 420

AAAAATCCTC ACAAAAATAA TTCAAAATAT AAAAAGCCAT ATGATAAGCC TGATGAATGC 480

TGGTTTACAG AACTGGTTTT CTTTAAAAAG GCAATCATTG GGGAAATAAC CCGCTTACTC 540

AGTATTTACT ATGTGCTAGC CCTGTTCCTT CTACTAGAAA TTAGTGAACA AATTCTAAC 599

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

AAGCTTTCAA GAACAGGGAC TGTTAAGCCG GGTACAGTGG CTCACACCTA TAATCCTAGC 60

ATTTTGGGAG GCCAAGGCGG GTGGATCACT TGAGGTCAGG AGTTCAAGAC CAGCCTGGCC 120

AACATGGTGA AACCCCATCT CTACTAAAAA AAAAAAAAAA AAAAAAAAAA AAAGAAATWC 180

MAAAATTACC CAGGCATGGT GGCACGCGCC TGTAATCCCA KCTACTTGGG AGGCTGAGGC 240

AGGAAAATTG CTTGAACCTA GGAGGCGGAG GTGGCAGTGA CCTAATCACA CCACTGTTCT 300

CCATCCTGGG CAACAGAACG AAACTGTTTC 330

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 258 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAGCTTGGGT GATAATGAGG AGTCAATGTT GGTCCATCAA TTGCAACAAA GGTACCACAG 60

TGGTGTAGGA TGTGGATAAT GAGGAGGCTG TGCACGTGTT GGGGACAGGT GGTATTTACG 120

AATGCTCTAT ATTTTCTTTC TCTCTTTTTT TAGGACGGAG TCTCACTCTG TTGCCCACGC 180

TGGAATGCAY GGGCATGACT GTGGCTCACT GTACCCCCCA CTCCCCATGT TCAAGAGATT 240

CTCTTGCCTC ACCTCCTG 258

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 622 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

CTCGAGTCCA CCGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCCCCCGAT TTATTTAAAG 60

CAGTTATGTA TGTATGAAAA ACAATGCTGA GCATTCAATT CCAAGATTTC TGAAGACACC 120

TATTTTACCA TCACTTTGAA TAAAATTTTT ATATTCCTTT CTTCAAATAC CATCTCGGTT 180

TTCAAATGTG GCTCATTAAA TGTGAAAGCA AAATTTCATT TCAAATAGCA GCCTTATCAA 240

ATGACAATTT ACCTGTGGTA GCATTGTTGG CACTGACACA TATCAGACCA CTGCCGAGCA 300

GAACAAGAAT GAACCAGGAA TCCATGCTTA TCTGGAAAAT AGGGAGTCAT GTTAGATGAG 360

GTCCTATATT ATCAGGACTA TGTCTGAGCT GGTCACCAGA AGAGTATTCT GGATTTCCAA 420

GCTATTAAAA TGTGTGCCTA AACCAATGAT CTTTTGGGAG CCTGATATGC ATGCTTCCTC 480

AGATATCCAA TAACTAATTG AGTCTTTATA AAGACTGACT ATCCCTTATC TTGAGGACTA 540

GCAGTGTTTC AGATTTTTTT TAAGAGATAG GGTCTTGCTC TGTTGCCAGG ATGGAGACAG 600

TGGTTATGAT CATAGCTCAG TG 622

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 602 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TCGGACTCCA CCGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCCCCCGGG CCCTCAGGAC 60

TGCTGGGCTG CCTGGTGTCA GCACTTCCCG CCATTTTCTA TAGCACCAGT ATTATTCTTA 120

ATACTTTAAA AAACCACCAG GCACGGTGGC TCACGCCTGG AATCCCAGCA CTTTGGGAGG 180

CCAAGGTGGG CGGATCACAA GGTCAGGAGA TCAAGACCAT CCTGGCTAAC ACGGTGAAAC 240

CCTGTCTGTA CTAAAAATAG AAAAAAATTA GCTGGGCGTG GTGGCATGCA CCTGTAGTCC 300

CAGCTGCTGG GGAGGCTGAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG ACTTGCAGTG 360

AGCCGAGATT GCACCACTGC ACTCCAGCCT GGGTGACAGA GCGAGACCCC GTCTCAAAAA 420

AAAAAAGTAA ATAAAAATAA AAAACCATAT CCCACTATCT CCCCCTTCTC TCTTTGCCTG 480

TGACTANNNG GCATACTTAT GGGGAAATCT TTAAGATGTC AGATTTCAGT TCTCTCACTT 540

TTCTACAACT TCTCCCCATT TTGCCTTTCT TAGGAACTTC CCTTCTTCCC ATCTGATTCC 600

TN 602

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 546 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

TATCAAGGCG GAGTCCACGG TGGCGGCCGC TCTAGAACTA GTGGATCCCC GAACCAGGAA 60

TCCATGCTTA TCTGGAAAAT AGGGAGTCAT GTTAGATGAG GTCCTATATT ATCAGGACTA 120

TGTCTGAGCT GGTCACCAGA AGAGTATTCT GGATTTCCAA GCTATTAAAA TGTGTGCCTA 180

AACCAATGAT CTTTTGGGAG CCTGATATGC ATGCTTCCTC AGATATCCAA TAACTAATTG 240

AGTCTTTATA AAGACTGACT ATCCCTTATC TTGAGGACTA GCAGTGTTTC AGATTTTTTT 300

TAAGAGATAG GGTCTTGCTC TGTTGCCCAG GATGGAGACA GTGGTTATGA TCATAGCTCA 360

GTGCAGCCTC TACCTCCTGG ACTCAAGTGA TCCTTCTGTC TCAGCCTCCT GAGTAGCTGG 420

GACTATAGGC ATGTACTACG ATGCCTGGCT AATTTTTAAA ATTTTCTGTA GAGACGGCGT 480

CTCACTATGT TGTCTAGGCT GCTCTCAAAC TCTTGGGTTC AACTGATCTC TTGCTTCAAC 540

TTCCAG 546

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 498 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GTGGATTCAG ACGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCCCCCGAG CAGAGGTTGC 60

AGTGAGCCAA GATCGTGCTA CTGTACTCCA GCCTGGGCAA CAGAGCAAGA CTCCGTCTCA 120

AAAAAAAAAA CAAACAAACG ATGTGTGCCT GTGTTTCCTC ATCTGTAGTA TGAGGATAAT 180

GATCATATAT ATTTACTAGT GTTGTTGGGA TGATCAAATT AGGTATATTT AATCATTGTG 240

TAAAAAAGTT GACGTGTAAA ATCCATGTAA AAAAGTTGGC AGAAGAGACA AACTGGTAAA 300

GCAGCCGTTC TTCATTTCTC ATTTCATTCA ACAAGCATTA TTAACAGCCT AGCAAGAACA 360

CAGTATCCAG GAAAAATCAA AGATTATCAA GCTCATGTTC TATAATCAAG CAATTTATAA 420

ACTAGCAGAA GAACAAGACA GATGAATAAG AACTTGGGTA TATTTAAATG CTAAGAAGTT 480

CAATTCAAAT AAATGTCC 498