1. (WO2018187796) METHODS FOR INCREASING RESISTANCE TO COTTON BACTERIAL BLIGHT AND PLANTS PRODUCED THEREBY
Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

METHODS FOR INCREASING RESISTANCE TO COTTON BACTERIAL BLIGHT AND PLANTS PRODUCED THEREBY

CROSS REFERENCE TO RELATED APPLICATIONS

[001] The present application claims priority to U.S. Provisional Application No. 62/483,174, filed April 7, 2017, entitled "Methods For Increasing Resistance To Cotton Bacterial Blight And Plants Produced Thereby," which is herein incorporated by reference.

INCORPORATION OF SEQUENCE LISTING

[002] The sequence listing that is contained in the file named "DDPSC0076-101-US_seq.txt", created on April 7, 2018, is filed herewith by electronic submission and incorporated herein by reference in its entirety.

BACKGROUND

[003] Upland cotton (Gossypium hirsutum L.) is the world's leading natural fiber crop. Cotton is commercially grown in over 84 countries and in the United States is responsible for $74 billion annually. Numerous foliar diseases affect cotton throughout the world's cotton growing regions, including Fusarium wilt (Fusarium oxysporum f.sp. vasinfectum), Verticillium wilt (Verticillium dahliae), and several viral diseases such as cotton leaf curl virus. Historically, one of the most significant foliar diseases has been bacterial blight, caused by Xanthomonas citri pv. malvacearum. Cotton bacterial blight significantly limited cotton yield in the early 20th century. In the 1940s and 1950s, breeders identified and introgressed multiple resistance loci into elite germplasm. This strategy proved durable for over half a century. In 2011, cotton bacterial blight (CBB) returned and caused significant losses to farmers in the southern United States, more specifically in Arkansas and Mississippi. Nonetheless, CBB has received little research focus during the last several decades because this disease had been considered "tamed". Modern molecular and genomic technologies can now be employed expeditiously to deduce the underlying cause of the disease re-emergence and pinpoint optimized routes towards the development of durable resistance.

[004] CBB is caused by X. citri pv. malvacearum (Xcm); however, the pathogen has previously been placed within other species groupings. The Xcm pathovar can be further divided into at least 19 races according to virulence phenotypes on a panel of historical cotton cultivars: Acala-44, Stoneville 2B-S9, Stoneville 20, Mebane B-1, 1-10B, 20-3, and 101-102.B. Historically, the most common race observed in the U.S. has been race 18, which was first isolated in 1973. This race is highly virulent, causing disease on all cultivars in the panel except for 101-102.B. CBB can occur at any stage in the plant's life cycle and on any aerial organ. Typical symptoms include seedling blight as either pre- or post-emergent damping-off, black arm on petioles and stems, water-soaked spots on leaves and bracts, and most importantly boll rot. The most commonly observed symptoms are the angular-shaped lesions on leaves that, in some cases, can coalesce and result in a systemic vein infection where leaf lesions coalesce on major leaf veins. Disease at each of these stages can cause yield losses either by injury to the plant or direct damage to the boll. No effective chemical treatments for the disease have been released to date. Therefore, the most important methods to reduce loss as a result of CBB include field methods that rely on cultivation to reduce potential sources of overwintering inoculum and planting cultivars with known sources of resistance.

[005] Most pathogenic bacteria assemble the type three secretion system (T3SS), a needle-like structure, to inject diverse type three effectors (T3Es) into the plant cell to suppress immunity and promote disease. For example, transcription activator-like (TAL) effectors influence the expression levels of host genes by binding directly to host gene promoters in a sequence-specific way. Up-regulated host genes that contribute to pathogen virulence are termed susceptibility genes and may be modified through genome editing for the development of resistant crop varieties.

[006] Plants have specialized immune receptors, collectively known as nucleotide-binding leucine rich repeat receptors (NLRs), that recognize, either directly or indirectly, the pathogen effector molecules. Historically, this host-pathogen interaction has been termed the 'gene-for-gene' model of immunity, wherein a single gene from the host and a single gene from the pathogen are responsible for recognition. Recognition triggers a strong immune response that often includes a localized hypersensitive response (HR) in which programmed cell death occurs around the infection site. Nineteen CBB resistance loci have been reported in Gossypium hirsutum breeding programs; however, none have been molecularly identified.

[007] Here we combine comparative genomics of the pathogen Xcm with transcriptomics of the host to identify the molecular interactions underlying this re-emergent disease. This has led to the discovery of several cotton genes which, when manipulated at the DNA level, can provide resistance to Xcm and possibly other cotton pathogens. Such cotton pathogens include but are not limited to fungal pathogens such as Fusarium wilt (Fusarium oxysporum f.sp. vasinfectum) and Verticillium wilt (Verticillium dahliae), viral pathogens such as Cotton leaf curl virus, and/or oomycete pathogens. The invention described herein will inform the development of durable resistance strategies.

SUMMARY

[008] Cotton bacterial blight (CBB), caused by Xanthomonas citri pv. malvacearum (Xcm), significantly limited cotton yields in the early 20th century but has been controlled by classical resistance genes for more than 50 years. In 2011, the pathogen re-emerged with a vengeance. In this study, we compare historical and contemporary pathogen isolates to reveal that no major shift has occurred in the pathogen population since the mid 1900s. Next, we determine that the percentage of susceptible cotton planted in the U.S. has increased dramatically since 2009.

[009] To further understand the virulence mechanisms employed by Xcm and to identify promising resistance strategies, we generate fully contiguous genome assemblies for two diverse Xcm strains and identify pathogen proteins used to modulate host transcription and promote susceptibility. Together, the data presented reveal the underlying cause of CBB re-emergence in the U.S. and highlight several promising routes towards the development of durable resistance including classical resistance genes and potential manipulation of susceptibility targets.

[010] Provided herein are cotton plants that are resistant to cotton bacterial blight. In certain aspects, cotton blight is caused by Xanthomonas pathovars. In certain aspects, a TAL effector binding element (EBE) is associated with a cotton susceptibility gene. In certain aspects, the susceptibility gene is a sugar transporter (SWEET) gene. In certain aspects, the TAL effector binding elements (EBEs) are modified to inhibit binding of the cognate pathovar TAL effectors. As a non-limiting example, the EBE region (SEQ ID NO:11) within the promoter region (SEQ ID NO:1) of SWEET gene A04G0861 (SEQ ID NO:2) can be modified to inhibit binding of a cognate pathovar TAL effector, such as TAL 14b (SEQ ID NO:9). In certain aspects, this reduces or completely inhibits pathogen induction of expression of the cotton plant genes, thereby preventing or reducing infection and/or reducing pathogen-induced plant damage. These modifications still permit normal cotton plant growth, development, and seed production. Also provided herein are nucleic acid and amino acid sequences, and expression vectors and other constructs, for use in methods for producing such cotton bacterial blight-resistant plants, as well as parts, products, hybrids, and progeny of such plants.
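The effect of modifying an EBE on TAL effector recognition can be illustrated with a minimal sketch. It assumes the widely reported repeat-variable diresidue (RVD) preferences (NI for A, HD for C, NG for T, NN for G or A, NS for any base) and simply counts how many EBE positions are compatible with the aligned RVD. The RVD list and both EBE strings below are hypothetical toy values, not sequences from the Sequence Listing, and the match count is only a rough proxy for binding, not a measured affinity.

```python
# Commonly reported RVD-to-nucleotide preferences (an assumption of this sketch;
# real binding is quantitative and context dependent).
RVD_PREFERENCES = {
    "NI": {"A"},
    "HD": {"C"},
    "NG": {"T"},
    "NN": {"G", "A"},
    "NS": {"A", "C", "G", "T"},
}


def count_matches(rvds, ebe):
    """Count EBE positions whose base is preferred by the aligned RVD."""
    if len(rvds) != len(ebe):
        raise ValueError("RVD list and EBE must have the same length")
    return sum(base in RVD_PREFERENCES[rvd] for rvd, base in zip(rvds, ebe.upper()))


def match_fraction(rvds, ebe):
    """Fraction of EBE positions compatible with the RVD sequence."""
    return count_matches(rvds, ebe) / len(rvds)


# Hypothetical TAL effector and EBEs, for illustration only.
toy_rvds = ["NI", "HD", "NG", "NN", "HD", "NG", "NI", "NG"]
wild_type_ebe = "ACTGCTAT"  # every position matches the toy RVDs
edited_ebe = "ACTTTTAT"     # two substitutions at positions 4 and 5

print(match_fraction(toy_rvds, wild_type_ebe))
print(match_fraction(toy_rvds, edited_ebe))
```

In this toy case the two substitutions lower the compatible fraction, which is the kind of change an EBE modification is intended to produce. Whether a given edit actually reduces binding in planta would have to be tested experimentally.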

[011] Also provided herein are methods of rendering cotton plants Xanthomonas-resistant, methods of preventing, treating, controlling, reducing, or inhibiting Xanthomonas damage to normally Xanthomonas-susceptible cotton, and methods of cultivating cotton plants. Also provided are fields of cultivated, transgenic Xanthomonas-resistant cotton plants. In certain aspects, provided herein are methods of producing cotton fibers, cottonseed meal, cottonseed oil, etc. Also provided herein are methods of making various consumer and industrial products from cotton bacterial blight-resistant cotton plants: articles of clothing, homewares, and industrial products, including fabrics (e.g., velvet, corduroy, chambray, velour, jersey and flannel); textile products (e.g., underwear, socks and t-shirts); tarpaulins; tents; sheets/pillow cases; uniforms; fishnets; coffee filters; book binding; archival paper. Cotton seed can be used as cattle feed, and as a source of oil (cottonseed oil) for manufacturing soaps; margarines; emulsifiers; cosmetics; pharmaceuticals; rubber; and plastics. Linters (the very short fibers that remain on the cottonseed after ginning) are used to produce goods such as bandages, swabs, bank notes, cotton buds, and x-rays.

1. A transgenic cotton plant resistant to infection by, or damage from, a cotton pathogen selected from the group consisting of a bacterial pathogen, a fungal pathogen, a viral pathogen, and an oomycete pathogen,

wherein the genome of cells of the transgenic cotton plant comprises one or more pathogen susceptibility genes.

2. The transgenic cotton plant of 1, wherein the infection by, or damage from, the cotton pathogen is reduced within a range of from about 50%, from about 60%, from about 70%, from about 80%, from about 90%, or from about 95%, to about 100%, compared to the extent of infection by, or damage from, the cotton pathogen within an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

3. The transgenic cotton plant of 1 or 2, wherein the reduced infection by, or damage from, the cotton pathogen comprises one or more plant characteristics selected from yield, growth, development, boll formation, boll rot, seed set, seedling pre-emergent damping-off, seedling post-emergent damping-off, black arm on petioles, black arm on stems, water-soaked spots on leaves, water-soaked spots on bracts, angular-shaped lesions on leaves, coalesced angular-shaped lesions on leaves, systemic vein infection, wilting, root rot, decreased photosynthetic activity and/or stunted growth.

4. The transgenic cotton plant of any one of 1 to 3, wherein the genome of cells thereof comprises one or more nucleotide sequences having at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identity to one or more nucleotide sequences selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, and SEQ ID NO:28,

wherein each of the nucleotide sequences comprises a transcription activator-like effector binding element to which a cognate transcriptional activator-like effector binds, and

wherein the nucleotide sequence of the transcription activator-like effector binding element is modified.

5. The transgenic cotton plant of 4, wherein the modification comprises insertion of nucleotides and/or deletion of nucleotides and/or substitution of nucleotides and/or modification of nucleotides within the nucleotide sequence of the transcription activator-like effector binding element.

6. The transgenic cotton plant of 4 or 5, wherein the transcription activator-like effector binding element modification results in reduced binding of the cognate transcription activator-like effector to the transcription activator-like effector binding element.

7. The transgenic cotton plant of any one of 4 to 6, wherein the binding of the cognate transcription activator-like effector is reduced by about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100% compared to binding thereof in the unmodified transcription activator-like effector binding element.

8. The transgenic cotton plant of any one of 1 to 7, wherein the expression of one or more pathogen susceptibility genes, comprising a nucleotide sequence having at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identity to one or more nucleotide sequences selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, and SEQ ID NO:29, is reduced compared to the expression of the one or more corresponding pathogen susceptibility genes in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

9. The transgenic cotton plant of 8, wherein the expression of the one or more pathogen susceptibility genes is reduced by about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100% compared to the expression of the corresponding pathogen susceptibility gene in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

10. The transgenic cotton plant of any one of 1 to 9, which further exhibits insect resistance, and/or fungal resistance, and/or herbicide resistance.

11. The transgenic cotton plant of any one of 1 to 10, wherein plant growth, development, reproduction, and boll formation are normal or near normal compared to plant growth, development, reproduction, and boll formation in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

12. A method of making a transgenic cotton plant resistant to infection by, or damage from, a cotton pathogen selected from the group consisting of a bacterial pathogen, a fungal pathogen, a viral pathogen, and an oomycete pathogen, the method comprising reducing the expression of one or more pathogen susceptibility genes present within the genome of cells of the cotton plant.

13. The method of 12, wherein reducing the expression of one or more pathogen susceptibility genes comprises modifying a nucleotide sequence comprising a transcription activator-like effector binding element of a nucleotide sequence of one or more pathogen susceptibility genes present within the genome of cells of the cotton plant, wherein the binding of a cognate transcription activator-like effector is reduced compared to binding thereof in the nucleotide sequence of the unmodified transcription activator-like effector binding element present in the nucleotide sequence of a pathogen susceptibility gene present within the genome of cells of the cotton plant.

14. The method of 13, wherein the modification comprises insertion of nucleotides, and/or deletion of nucleotides and/or substitution of nucleotides and/or modification of nucleotides within the nucleotide sequence of the transcription activator-like effector binding element.

15. The method of 13 or 14, wherein the binding of the transcription activator-like effector is reduced by about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100% compared to the binding thereof in the unmodified transcription activator-like effector binding element nucleotide sequence.

16. The method of any one of 13 to 15, wherein the nucleotide sequence comprising the transcription activator-like effector binding element is at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% identical to one or more nucleotide sequences selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, and SEQ ID NO:28.

17. The method of any one of 12 to 16, wherein the expression of one or more pathogen susceptibility genes, comprising a nucleotide sequence having at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identity to one or more sequences selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, and SEQ ID NO:29, is reduced compared to the expression of the corresponding one or more pathogen susceptibility genes in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

18. The method of any one of 12 to 17, wherein the extent of infection by, or damage from, the cotton pathogen is reduced within a range of from about 50%, from about 60%, from about 70%, from about 80%, from about 90%, or from about 95%, to about 100%, compared to the extent of infection by, or damage from, the cotton pathogen within an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

19. The method of 18, wherein the parameters of reduced infection by, or damage from, comprise one or more plant characteristics selected from yield, growth, development, boll formation, boll rot, seed set, seedling pre-emergent damping-off, seedling post-emergent damping-off, black arm on petioles, black arm on stems, water-soaked spots on leaves, water-soaked spots on bracts, angular-shaped lesions on leaves, coalesced angular-shaped lesions on leaves, systemic vein infection, wilting, root rot, decreased photosynthetic activity and/or stunted growth.

20. A transgenic cotton plant resistant to infection by, or damage from, a cotton pathogen selected from the group consisting of a bacterial pathogen, a fungal pathogen, a viral pathogen, and an oomycete pathogen made by the method of any one of 12 to 19.

21. The transgenic cotton plant of 20, wherein plant growth, development, reproduction, and boll formation are normal or near normal compared to plant growth, development, reproduction, and boll formation in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

22. A nucleotide sequence, comprising a first nucleotide sequence having at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or SEQ ID NO:29.

23. The nucleotide sequence of 22, further comprising a second nucleotide sequence operably linked upstream of the first nucleotide sequence, wherein the second nucleotide sequence is at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identical to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:28.

24. The nucleotide sequence of 22, further comprising a second nucleotide sequence operably linked upstream of the first nucleotide sequence, wherein the second nucleotide sequence is at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or 100% sequence identical to SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:21, SEQ ID NO:24 or SEQ ID NO:27.

25. The nucleotide sequence of 23 or 24, wherein the second nucleotide sequence comprises a modified transcription activator-like effector binding element nucleotide sequence to which a cognate transcription activator-like effector binds in a reduced amount compared to the binding of the cognate transcription activator-like effector to the unmodified transcription activator-like effector binding element nucleotide sequence.

26. The nucleotide sequence of 25, wherein the binding of the cognate transcription activator-like effector is reduced by about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100% compared to the binding thereof in the unmodified transcription activator-like effector binding element nucleotide sequence.

27. The nucleotide sequence of any one of 23 to 26, wherein the second nucleotide sequence comprises an intact TATAAA box and/or wherein the second nucleotide sequence is operably linked to an intact TATAAA box.

28. The nucleotide sequence of any one of 25 to 27, wherein the modified transcription activator-like effector binding element nucleotide sequence imparts resistance to a cotton pathogen selected from the group consisting of a bacterial pathogen, a fungal pathogen, a viral pathogen, and an oomycete pathogen when it is present in cells of a transgenic cotton plant, while still permitting normal, or near normal, plant growth, development, reproduction, and boll formation.

29. A transgenic cotton plant, cells of which comprise within their genome one or more nucleotide sequences of any one of 25 to 28.

30. The transgenic cotton plant of 29, wherein plant growth, development, reproduction, and boll formation are normal or near normal compared to plant growth, development, reproduction, and boll formation in an otherwise identical control cotton plant grown under the same conditions as the transgenic cotton plant.

31. A recombinant vector or recombinant construct, comprising one or more nucleotide sequences of any one of 22 to 28.

32. A part, product obtained from, material from, progeny, or derivative of the transgenic cotton plant of any one of 1-11, 20, 21, 29, or 30, or part, product obtained from, material from, progeny, or derivative of a transgenic cotton plant made by the method of any one of 12 to 19.

33. The plant part of 32, which is selected from among a protoplast, a cell, a tissue, an organ, a cutting, an explant, a reproductive tissue, a vegetative tissue, a biomass, fiber, an inflorescence, a flower, a sepal, a petal, a pistil, a stigma, a style, an ovary, an ovule, an embryo, a receptacle, a seed, a fruit, a stamen, a filament, an anther, a male or female gametophyte, a pollen grain, a meristem, a terminal bud, an axillary bud, a leaf, a stem, a root, an offset, a cell of said plant in culture, a tissue of said plant in culture, an organ of said plant in culture, a homogenate, and a callus.

34. The progeny or derivative of 32, which is selected from among clones, hybrids, samples, seeds, and harvested material thereof.

35. The progeny or derivative of 32 or 34, which is produced sexually or asexually.

36. A consumer or industrial product obtained from, or produced from, a transgenic cotton plant or progeny of any one of 1-11, 20, 21, 29, 30, 34 or 35, or a transgenic cotton plant made by the method of any one of 12 to 19.

37. The transgenic cotton plant or progeny of any one of 1-11, 20, 21, 29, 30, 34 or 35, or a transgenic cotton plant made by the method of any one of 12 to 19, wherein the infection by, or damage from, a cotton pathogen is cotton bacterial blight caused by a species of Xanthomonas.

38. The transgenic cotton plant or progeny of 37, wherein the species of Xanthomonas is Xanthomonas citri pv. malvacearum.

39. The transgenic cotton plant or progeny of 37 or 38, wherein the species of Xanthomonas is Xanthomonas citri pv. malvacearum Race 18.

BRIEF DESCRIPTION OF THE FIGURES

[012] The foregoing and other aspects, features, and advantages of the present disclosure will be better understood from the following detailed description taken in conjunction with the accompanying figures, all of which are given by way of illustration only, and are not limitative of the present specification, in which the figures show:

[013] Fig. 1: Cotton Bacterial Blight (CBB) symptoms and reemergence across the southern United States. (Left) Typical CBB symptoms present in cotton fields near Lubbock, TX during the 2015 growing season include angular leaf spots, boll rot, and black arm rot. Acres of cotton planted per county in the United States in 2015 (blue) and counties with confirmed CBB in 2015 (red outline). Statistics on cotton planted in the U.S. were acquired from the USDA. CBB was reported by Extension agents, Extension specialists, and Certified Crop Advisers in their respective states, and compiled by Tom Allen.

[014] Fig. 2: Maps of CBB incidence in the US from 2011-2012 and 2014-2016. CBB incidence was reported by farmers, Extension specialists and Certified Crop Advisers in their respective states for the years 2011-2012 and 2014-2016, and compiled by Tom Allen. CBB reports for 2013 were infrequent.

[015] Fig. 3: Phylogenetic analysis of Xcm isolates and 13 species of Xanthomonas. A) MLST (Multi Locus Sequence Typing) analysis of 12 Illumina sequenced Xcm isolates (disclosed herein) and 40 other Xanthomonads using concatenated sections of the gltA, lepA, lacF, gyrB, fusA and gap-1 loci. B) SNP based Neighbor-Joining Tree generated from 17853 variable loci between 14 Xcm isolates and the reference genome Xanthomonas citri subsp. citri strain Aw12879. The tree was made using the Simple Phylogeny tool from ClustalW2.

[016] Fig. 4: Molecular and phenotypic analysis of Xcm and G. hirsutum interactions. A) Type three effector profiles of Xcm isolates were deduced from de novo, Illumina based genome assemblies. Effector presence/absence was determined based on homology to known type three effectors using the program Prokka. B) Commercial and public G. hirsutum cultivars were inoculated with 14 Xcm isolates. Susceptible (S) indicates water soaking symptoms. Resistant (R) indicates a visible hypersensitive response. Plants were screened with a range of inoculum concentrations from OD600 = 0.001-0.5. C) Disease symptoms on G. hirsutum cultivars Stoneville 5288 B2F and DES 56 after inoculation with Xcm strain AR81009 (OD600 = 0.05). Symptoms are visualized under visible (VIS) and near infrared (NIR) light. D) The proportion of US fields planted with susceptible and resistant cultivars of G. hirsutum was determined based on planting acreage statistics from the USDA-AMA and disease phenotypes based on previous reports for common cultivars.

[017] Fig. 5: SMRT sequencing of two phenotypically and geographically diverse Xcm isolates: MS14003 and AR81009. Circos plot comparing the circular genomes. Tracks are as follows from inside to outside: synteny of gene models; GC content; methylation on + and - strands; location of type three effectors (teal) and TAL effectors (red). On each side, accompanying plasmids are cartooned. Type three effector repertoires and the type IV secretion systems were annotated using Prokka, homologous regions greater than 1 kb were identified using MAUVE, and TAL effectors were annotated using AnnoTALE.

[018] Fig. 6: SMRT sequencing and western blot reveal diverse TAL effector repertoires between Xcm strains MS14003 and AR81009. A) Gene models of TAL-effectors identified by AnnoTALE. Blue and Green highlighted gene models represent TALs grouped in the same clade by RVD sequence using AnnoTALE. B) Western Blot of TAL effectors using polyclonal TAL-specific antibody.

[019] Fig. 7: RNA-Sequencing analysis of infected G. hirsutum tissue demonstrates transcriptional changes during CBB. A) Disease phenotypes of Xcm strains MS14003 and AR81009 on G. hirsutum cultivars Acala Maxxa and DES 56, 7 dpi. B) RNA-Seq Experimental Design: Acala Maxxa and DES 56 were inoculated with Xcm strains MS14003 and AR81009 at an OD of 0.5 and a mock treatment of 10 mM MgCl2. Inoculated leaf tissue was collected at 24 and 48 hpi (before disease symptoms emerged). C) Venn diagram of upregulated G. hirsutum genes (Log2(fold change in FPKM) > 2 and p value < 0.05) in response to Xcm inoculation. Cuffdiff output was parsed using a custom script and visualized with the VennDiagram package in R.
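The thresholding described for Fig. 7C can be sketched as follows. This is not the custom script referred to above, only a minimal illustration under stated assumptions: the input is assumed to be tab-separated text with the column names used in Cuffdiff's gene-level output (gene_id, status, log2(fold_change), p_value), and the gene identifiers and values are invented toy data. The regions of a two-set Venn diagram are then obtained with ordinary set operations.

```python
import csv
import io


def upregulated_genes(diff_text, min_log2_fc=2.0, max_p=0.05):
    """Return gene IDs with log2(fold_change) > min_log2_fc and p_value < max_p."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    hits = set()
    for row in reader:
        # Skip tests that were not successfully performed.
        if row["status"] != "OK":
            continue
        if float(row["log2(fold_change)"]) > min_log2_fc and float(row["p_value"]) < max_p:
            hits.add(row["gene_id"])
    return hits


# Toy tables (hypothetical); a real file has additional columns, which
# DictReader ignores because fields are looked up by name.
HEADER = "gene_id\tstatus\tlog2(fold_change)\tp_value\n"
ms14003_vs_mock = (
    HEADER
    + "geneA\tOK\t3.1\t0.001\n"
    + "geneB\tOK\t2.5\t0.20\n"
    + "geneC\tOK\t4.0\t0.01\n"
)
ar81009_vs_mock = (
    HEADER
    + "geneA\tOK\t2.2\t0.03\n"
    + "geneC\tOK\t1.0\t0.01\n"
    + "geneD\tNOTEST\t5.0\t0.001\n"
)

up_ms = upregulated_genes(ms14003_vs_mock)
up_ar = upregulated_genes(ar81009_vs_mock)

# Venn regions for the two inoculations.
shared = up_ms & up_ar
only_ms = up_ms - up_ar
only_ar = up_ar - up_ms
print(sorted(shared), sorted(only_ms), sorted(only_ar))
```

The same function can be applied to each cultivar and strain comparison, with the resulting sets passed to any Venn plotting tool.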

[020] Fig. 8: Growth assay of MS14003 and AR81009 on cotton cultivars Acala Maxxa and DES 56.

[021] Fig. 9: Expression levels of significantly upregulated genes with a Log2 fold change of 2 in G. hirsutum. A) All significantly upregulated genes with a Log2 fold change of 2. B) All significantly upregulated genes (p < 0.05) with a Log2 (fold change in FPKM) > 2 that are unique to each cultivar/pathovar disease interaction in G. hirsutum.

[022] Fig. 10: Expression of homeologous pairs across the A and D G. hirsutum genomes in response to Xcm inoculation. Genes considered up or down regulated meet both criteria: differential expression from mock with a q-value < 0.05, and an absolute value of the log2 fold change greater than 2. A) Acala Maxxa inoculated with MS14003. B) DES 56 inoculated with MS14003. C) Acala Maxxa inoculated with AR81009. D) DES 56 inoculated with AR81009.
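The criteria stated for Fig. 10 (q-value < 0.05 and an absolute log2 fold change greater than 2) can be applied to each member of a homeologous pair and the pair then summarized by the two calls. The sketch below is illustrative only: the pair names and the (log2 fold change, q-value) tuples are hypothetical, and the representation of a pair as one A-genome copy plus one D-genome copy is an assumption of this example.

```python
from collections import Counter


def call_direction(log2_fc, q_value, min_abs_log2_fc=2.0, max_q=0.05):
    """Call a gene 'up', 'down', or 'unchanged' relative to mock."""
    if q_value < max_q and abs(log2_fc) > min_abs_log2_fc:
        return "up" if log2_fc > 0 else "down"
    return "unchanged"


def classify_pair(a_copy, d_copy):
    """Return the (A-genome call, D-genome call) for one homeologous pair."""
    return (call_direction(*a_copy), call_direction(*d_copy))


# Hypothetical pairs: each copy is (log2 fold change, q-value).
toy_pairs = {
    "pair1": ((3.0, 0.01), (2.8, 0.02)),    # both homeologs induced
    "pair2": ((3.5, 0.001), (0.4, 0.60)),   # only the A copy induced
    "pair3": ((-2.6, 0.01), (-2.1, 0.04)),  # both homeologs repressed
    "pair4": ((2.5, 0.20), (1.0, 0.01)),    # neither copy passes both criteria
}

calls = {name: classify_pair(a, d) for name, (a, d) in toy_pairs.items()}
summary = Counter(calls.values())
print(calls)
print(summary)
```

Tallying the call pairs in this way gives one count per expression pattern, for example pairs in which a single homeolog responds versus pairs in which both do.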

[023] Fig. 11: Three candidate G. hirsutum susceptibility genes are targeted by two different Xcm strains. (Left) Bioinformatically predicted Xcm TAL effector binding sites on the 300 bp promoter region of four SWEET genes. These were predicted with TALEsf using a quality score cutoff of 4. (Right) Heat-map of Cuffdiff results of significantly upregulated G. hirsutum SWEET genes (p < 0.05) with a Log2 (fold change in FPKM) > 2, 48 hours after inoculation with Xcm.
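As a rough illustration of what a binding-site prediction over a promoter involves, the following sketch slides a window along a sequence and reports windows with few mismatches to an RVD sequence. It does not reproduce the scoring used by the prediction tool named above. The mismatch count, the cutoff, the RVD preferences table, the toy RVD list, and the toy promoter string are all assumptions made for this example, and only the forward strand is scanned.

```python
# Commonly reported RVD-to-nucleotide preferences (an assumption of this sketch).
RVD_PREFERENCES = {
    "NI": {"A"},
    "HD": {"C"},
    "NG": {"T"},
    "NN": {"G", "A"},
    "NS": {"A", "C", "G", "T"},
}


def scan_promoter(rvds, promoter, max_mismatches=1):
    """Return (start, site, mismatches) for forward-strand windows
    with at most max_mismatches positions not preferred by the RVDs."""
    promoter = promoter.upper()
    width = len(rvds)
    hits = []
    for start in range(len(promoter) - width + 1):
        site = promoter[start:start + width]
        mismatches = sum(
            base not in RVD_PREFERENCES[rvd] for rvd, base in zip(rvds, site)
        )
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


# Hypothetical inputs, far shorter than a real effector or a 300 bp promoter.
toy_rvds = ["HD", "NG", "NI", "NN"]  # prefers C, T, A, then G or A
toy_promoter = "GGCTAGTTCTCAGG"

hits = scan_promoter(toy_rvds, toy_promoter)
for start, site, mismatches in hits:
    print(start, site, mismatches)
```

A real prediction would use a full-length RVD sequence, a position-weighted score, and both strands. The sketch only shows the sliding-window structure of the search.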

DETAILED DESCRIPTION

[024] The following detailed description is provided to aid those skilled in the art in practicing the embodiments disclosed and claimed herein. Even so, the following detailed description should not be construed to unduly limit the present disclosure and claims as modifications and variations in the embodiments discussed herein can be made by those of ordinary skill in the art without departing from the spirit or scope of the present discoveries.

[025] To describe every conceivable application of the embodiments of this disclosure would be prohibitively time consuming and expensive. Those of ordinary skill in the art will recognize that the present disclosure and claims are not limited to the combinations of features, examples, or embodiments specifically disclosed herein, and that any and all features, combinations of features, sub-combinations of features, or permutations of features described or possible herein, including those in the description, abstract, figures, Sequence Listing, examples, and claims, is(are) linked, apply directly and unambiguously to the general context of the present disclosure, and are clearly and unambiguously intended to be included within the scope of the present disclosure and claims, provided that the features included in any such combination, subcombination, or permutation are not mutually inconsistent as will be apparent from the context, this specification, and the knowledge of one of ordinary skill in the art. Features disclosed in connection with a single embodiment or example herein are generally applicable to any embodiment. All combinations, sub-combinations, and permutations of features disclosed or possible herein and their additional advantages that would be readily apparent and enabled, but not explicitly described and claimed, to those of ordinary skill in the art are encompassed by the present disclosure just as if each was individually and explicitly disclosed, permitting practice of the present methods, uses, products, etc., across their entire scope.

[026] The contents of each of the publications, patent applications, patents, and other references mentioned herein are herein incorporated by reference in their entirety. In case of conflict, the present disclosure, including explanations of terms, will control.

OVERVIEW OF THE DISCLOSURE

[027] Cotton bacterial blight (CBB), an important disease of cotton (Gossypium hirsutum) in the early 20th century, had been controlled by resistance genes for over half a century. Recently, CBB re-emerged as an agronomic problem in the United States. Here, modern molecular and genomic tools were employed to illuminate the cause(s) of the disease resurgence. Phylogenetic analysis revealed that strains from the current outbreak cluster with historical Xanthomonas citri pv. malvacearum (Xcm) strains. Contemporary strains encode virulence protein repertoires and elicit susceptibility and resistance phenotypes consistent with historical strains. Genome assemblies for two geographically and temporally divergent strains of Xcm yielded circular chromosomes and accompanying plasmids. Both genomes encode transcription activator-like effector genes. RNA-sequencing revealed that, in diverse cotton cultivars, both strains induced a homeologous pair of genes with homology to the known susceptibility gene MLO. In contrast, the two strains of Xcm induced different SWEET sugar transporters. In one case, only one homeolog was significantly induced. Subsequent genome wide analysis revealed the overall expression patterns of the homeologous gene pairs in cotton after inoculation by Xcm. These data reveal susceptibility genes that can be modified as well as host-pathogen specificity in the Xcm-G. hirsutum pathosystem, offer explanations for the CBB re-emergence, and suggest strategies for future development of pathogen resistant cultivars.

Terms

[028] In the description and examples that follow, a number of terms are used herein. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided:

[029] As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a plant" includes a plurality of such plants, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth. Similarly, the word "or" is intended to include "and" unless the context clearly indicates otherwise. Hence "comprising A or B" means including A, or B, or A and B. Furthermore, the use of the term "including", as well as other related forms, such as "includes" and "included", is not limiting.

[030] The term "comprising" as used in a claim herein is open-ended, and means that the claim must have all the features specifically recited therein, but that there is no bar on additional features that are not recited being present as well. The term "comprising" leaves the claim open for the inclusion of unspecified ingredients even in major amounts. The term "consisting essentially of" in a claim means that the invention necessarily includes the listed ingredients, and is open to unlisted ingredients that do not materially affect the basic and novel properties of the invention. A "consisting essentially of" claim occupies a middle ground between closed claims that are written in a closed "consisting of" format and fully open claims that are drafted in a "comprising" format. These terms can be used interchangeably herein if, and when, this may become necessary.

[031] Unless otherwise stated, nucleic acid sequences in the text of this specification are given, when read from left to right, in the 5' to 3' direction. Nucleic acid sequences may be provided as DNA or as RNA, as specified; disclosure of one necessarily defines the other, as is known to one of ordinary skill in the art and is understood as included in embodiments where it would be appropriate. Nucleotides may be referred to by their commonly accepted single-letter codes. Unless otherwise indicated, amino acid sequences are written left to right in amino to carboxyl orientation, respectively. Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description. Unless otherwise provided for, software, electrical, and electronics terms as used herein are as defined in The New IEEE Standard Dictionary of Electrical and Electronics Terms (5th edition, 1993). The terms defined below are more fully defined by reference to the specification as a whole.
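As an illustration of the statement that disclosure of a DNA sequence defines the corresponding RNA sequence and vice versa, the conversion can be sketched as below. This is a minimal sketch only; the function names and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence written 5' to 3' (T becomes U)."""
    return dna.upper().replace("T", "U")


def rna_to_dna(rna: str) -> str:
    """Return the DNA form of an RNA sequence written 5' to 3' (U becomes T)."""
    return rna.upper().replace("U", "T")


# Hypothetical example sequence, read left to right in the 5' to 3' direction.
example_dna = "ATGGCTTAA"
example_rna = dna_to_rna(example_dna)
```

Converting the result back with rna_to_dna recovers the original DNA sequence, which is the sense in which either form defines the other.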

[032] If ranges are disclosed, the endpoints of all ranges directed to the same component or property are inclusive and independently combinable (e.g., a range of "up to about 25 wt.%, or, more specifically, about 5 wt.% to about 20 wt.%," is inclusive of the endpoints and all intermediate values of the ranges of "about 5 wt.% to about 25 wt.%," etc.). Numeric ranges recited within the specification are inclusive of the numbers defining the range and include each integer within the defined range.

[033] The term "about" as used herein is a flexible word with a meaning similar to "approximately" or "nearly". The term "about" indicates that exactitude is not claimed, but rather a contemplated variation. Thus, as used herein, the term "about" means within 1 or 2 standard deviations from the specifically recited value, or ± a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 4%, 3%, 2%, or 1% compared to the specifically recited value.

[034] As used herein, "altering level of production" or "altering level of expression" means changing, either by increasing or decreasing, the level of production or expression of a nucleic acid sequence or an amino acid sequence (for example a polypeptide, an siRNA, a miRNA, an mRNA, a gene), as compared to a control level of production or expression.

[035] The phrase "conservative amino acid substitution" or "conservative mutation" refers to the replacement of one amino acid by another amino acid with a common property. A functional way to define common properties between individual amino acids is to analyze the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz, G. E. and R. H. Schirmer (1979) Principles of Protein Structure, Springer-Verlag). According to such analyses, groups of amino acids can be defined where amino acids within a group exchange preferentially with each other, and therefore resemble each other most in their impact on the overall protein structure.

[036] Examples of amino acid groups defined in this manner include: a "charged / polar group," consisting of Glu, Asp, Asn, Gln, Lys, Arg and His; an "aromatic, or cyclic group," consisting of Pro, Phe, Tyr and Trp; and an "aliphatic group" consisting of Gly, Ala, Val, Leu, Ile, Met, Ser, Thr and Cys. Within each group, subgroups can also be identified, for example, the group of charged / polar amino acids can be sub-divided into the sub-groups consisting of the "positively-charged sub-group," consisting of Lys, Arg and His; the "negatively-charged sub-group," consisting of Glu and Asp, and the "polar sub-group" consisting of Asn and Gln. The aromatic or cyclic group can be sub-divided into the sub-groups consisting of the "nitrogen ring sub-group," consisting of Pro, His and Trp; and the "phenyl sub-group" consisting of Phe and Tyr. The aliphatic group can be sub-divided into the sub-groups consisting of the "large aliphatic non-polar sub-group," consisting of Val, Leu and Ile; the "aliphatic slightly-polar sub-group," consisting of Met, Ser, Thr and Cys; and the "small-residue sub-group," consisting of Gly and Ala. Examples of conservative mutations include substitutions of amino acids within the sub-groups above, for example, Lys for Arg and vice versa such that a positive charge can be maintained; Glu for Asp and vice versa such that a negative charge can be maintained; Ser for Thr such that a free -OH can be maintained; and Gln for Asn such that a free -NH2 can be maintained.
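By way of illustration only, the sub-groups listed above can be encoded as a lookup table to check whether a given substitution falls within a single sub-group. The sketch below uses the standard three-letter codes (for example Gln and Ile); the dictionary and function names are hypothetical, and membership in a shared sub-group is used here as one possible working test for a conservative substitution, not as the only one.

```python
# Sub-groups as listed in the paragraph above (standard three-letter codes).
SUBGROUPS = {
    "positively-charged": {"Lys", "Arg", "His"},
    "negatively-charged": {"Glu", "Asp"},
    "polar": {"Asn", "Gln"},
    "nitrogen ring": {"Pro", "His", "Trp"},
    "phenyl": {"Phe", "Tyr"},
    "large aliphatic non-polar": {"Val", "Leu", "Ile"},
    "aliphatic slightly-polar": {"Met", "Ser", "Thr", "Cys"},
    "small-residue": {"Gly", "Ala"},
}


def shared_subgroups(original: str, replacement: str) -> list:
    """Return the names of all sub-groups that contain both residues."""
    return sorted(
        name
        for name, members in SUBGROUPS.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one sub-group."""
    return original != replacement and bool(shared_subgroups(original, replacement))
```

For example, is_conservative("Lys", "Arg") is true because both residues belong to the positively-charged sub-group, whereas is_conservative("Lys", "Glu") is false.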

[037] As used herein "control" or "control level" means the level of a molecule, such as a polypeptide or nucleic acid, normally found in nature under a certain condition and/or in a specific genetic background. In certain embodiments, a control level of a molecule can be measured in a cell or specimen that has not been subjected, either directly or indirectly, to a treatment. A control level is also referred to as a wildtype or a basal level. In certain aspects the term "near normal" can be used as a "control." These terms are understood by those of ordinary skill in the art. A control plant, i.e., a plant that does not contain a recombinant DNA that confers (for instance) an enhanced trait in a transgenic plant, is used as a baseline for comparison to identify an enhanced trait in the transgenic plant. A suitable control plant may be a non-transgenic plant of the parental line used to generate a transgenic plant. A control plant may in some cases be a transgenic plant line that comprises an empty vector or marker gene, but does not contain the recombinant DNA, or does not contain all of the recombinant DNAs in the test plant.

[038] As used herein "damage" means one or more plant characteristics adversely affected by cotton bacterial blight infection. Such characteristics include, but are not limited to, yield, growth, development, boll formation, boll rot, seed set, seedling pre-emergent damping-off, seedling post-emergent damping-off, black arm on petioles, black arm on stems, water-soaked spots on leaves, water-soaked spots on bracts, angular-shaped lesions on leaves, coalesced angular-shaped lesions on leaves, and/or systemic vein infection.

[039] The terms "enhance", "enhanced", "increase", or "increased" refer to a statistically significant increase. For the avoidance of doubt, these terms generally refer to about a 5% increase in a given parameter or value, about a 10% increase, about a 15% increase, about a 20% increase, about a 25% increase, about a 30% increase, about a 35% increase, about a 40% increase, about a 45% increase, about a 50% increase, about a 55% increase, about a 60% increase, about a 65% increase, about a 70% increase, about a 75% increase, about an 80% increase, about an 85% increase, about a 90% increase, about a 95% increase, about a 100% increase, or more over the control value. These terms also encompass ranges consisting of any lower indicated value to any higher indicated value, for example "from about 5% to about 50%", etc.
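For illustration, the percent increase of a measured value over a control value, and a check of whether that increase falls within a stated range, can be computed as sketched below. The function names and the example numbers are hypothetical, and the sketch does not assess statistical significance, which must be established separately.

```python
def percent_increase(value: float, control: float) -> float:
    """Percent change of value relative to control; positive means an increase."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (value - control) * 100.0 / control


def within_range(percent: float, low: float, high: float) -> bool:
    """True if percent lies in the inclusive range [low, high]."""
    return low <= percent <= high


# Hypothetical measurement: 120 units in a test plant against 100 units in a control plant.
increase = percent_increase(120.0, 100.0)
```

With these hypothetical numbers the increase is 20%, which lies within a range of "from about 5% to about 50%".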

[040] As used herein, "expression" or "expressing" refers to production of a functional product, such as, the generation of an RNA transcript from an introduced construct, an endogenous DNA sequence, or a stably incorporated heterologous DNA sequence. A nucleotide encoding sequence may comprise intervening sequence (e.g. introns) or may lack such intervening non-translated sequences (e.g. as in cDNA). Expressed genes include those that are transcribed into mRNA and then translated into protein and those that are transcribed into RNA but not translated (for example, siRNA, transfer RNA and ribosomal RNA). The term may also refer to a polypeptide produced from an mRNA generated from any of the above DNA precursors. Thus, expression of a nucleic acid fragment, such as a gene or a promoter region of a gene, may refer to transcription of the nucleic acid fragment (e.g., transcription resulting in mRNA or other functional RNA) and/ or translation of RNA into a precursor or mature protein (polypeptide), or both.

[041] As used herein, a "TAL Effector Binding Element" or "TAL EBE" or "TAL Effector Binding Site" is a nucleotide sequence to which a cognate TAL normally binds and upregulates gene expression. Such element or site occurs within a promoter region of a susceptibility gene, wherein the promoter region of said susceptibility gene is operably located upstream of the susceptibility gene. As a non-limiting example, TAL 14b (SEQ ID NO: 7) binds to TAL Effector Binding Element (SEQ ID NO:9) within the promoter (SEQ ID NO: 1) which is upstream of SEQ ID NO:2.

[042] An "expression cassette" refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively.

[043] The term "genome" as it applies to a plant cell encompasses not only chromosomal DNA found within the nucleus, but organelle DNA found within subcellular components (e.g., mitochondrial, plastid) of the cell. As used herein, the term "genome" refers to the nuclear genome unless indicated otherwise. However, expression in a plastid genome, e.g., a chloroplast genome, or targeting to a plastid genome such as a chloroplast via the use of a plastid targeting sequence, is also encompassed by the present disclosure.

[044] A polynucleotide sequence is "heterologous to" a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified by human action from its original form. For example, a promoter operably linked to a heterologous coding sequence refers to a coding sequence from a species different from that from which the promoter was derived, or, if from the same species, a coding sequence which is different from naturally occurring allelic variants. Heterologous nucleic acid fragments, such as coding sequences that have been inserted into a host organism, are not normally found in the genetic complement of the host organism. As used herein, the term "heterologous" also refers to a nucleic acid fragment derived from the same organism, but which is located in a different, e.g., non-native, location within the genome of this organism. Thus, the organism can have more than the usual number of copy(ies) of such nucleic acid fragment located in its(their) normal position within the genome and in addition, in the case of plant cells, within different genomes within a cell, for example in the nuclear genome and within a plastid or mitochondrial genome as well. A nucleic acid fragment that is heterologous with respect to an organism into which it has been inserted or transferred is sometimes referred to as a "transgene", and the organism comprising the transgene is referred to as "transgenic." In some embodiments, transgenic refers to an organism, wherein the genome of the organism is modified through genome editing. Genome editing can include, but is not limited to, deletion or removal of genetic material via CRISPR-Cas9 or other genome editing nucleases.

[045] The term "homology" or "sequence homology" describes a mathematically based comparison of sequence similarities which is used to identify genes or proteins with similar functions or motifs. The nucleic acid and protein sequences of the present invention can be used as a "query sequence" to perform a search against public databases to, for example, identify other family members, related sequences, orthologs, or homologs. The term "homologous" refers to the relationship between two nucleic acid sequences and/or proteins that possess a "common evolutionary origin", including nucleic acids and/or proteins from superfamilies (e.g., the immunoglobulin superfamily) in the same species of animal, as well as homologous nucleic acids and/or proteins from different species of animal (for example, myosin light chain polypeptide, etc.; see Reeck et al., (1987) Cell, 50:667). Such proteins (and their encoding nucleic acids) may have sequence homology, as reflected by sequence similarity, whether in terms of percent identity or by the presence of specific residues or motifs and conserved positions. The methods disclosed herein contemplate the use of the presently disclosed nucleic acid and protein sequences, as well as sequences having sequence identity and/or similarity.

[046] By "host cell" it is meant a cell which contains a vector and supports the replication and/or expression of the vector. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Alternatively, the host cells are monocotyledonous or dicotyledonous plant cells.

[047] The term "introduced" means providing a nucleic acid (e.g., expression construct) or protein into a cell. Introduced includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell, and includes reference to the transient provision of a nucleic acid or protein to the cell. "Introduced" includes reference to stable or transient transformation methods, as well as sexually crossing. Thus, "introduced" in the context of inserting a nucleic acid fragment (e.g., a recombinant DNA construct/expression construct) into a cell, can mean "transfection" or "transformation" or "transduction", and includes reference to the incorporation of a nucleic acid fragment into a eukaryotic or prokaryotic cell where the nucleic acid fragment may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).

[048] As used herein the term "isolated" refers to a material such as a nucleic acid molecule, polypeptide, or small molecule that has been separated from the environment from which it was obtained. It can also mean altered from the natural state. For example, a polynucleotide or a polypeptide naturally present in a living animal is not "isolated" but the same polynucleotide or polypeptide separated from the coexisting materials of its natural state is "isolated", as the term is employed herein. Thus, a polypeptide or polynucleotide produced and/or contained within a recombinant host cell is considered isolated. Also intended as "isolated polypeptides" or "isolated nucleic acid molecules", etc., are polypeptides or nucleic acid molecules that have been purified, partially or substantially, from a recombinant host cell or from a native source.

[049] As used herein, "modulate" or "modulating" or "modulation" and the like are used interchangeably to denote either up-regulation or down-regulation of the expression or biosynthesis of a material such as a nucleic acid, protein or small molecule relative to its normal expression or biosynthetic level in a wild type or control organism. Modulation includes expression or biosynthesis that is increased or decreased by about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.5%, 99.9%, 100%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 155%, 160%, 165% or 170% or more, or any range therein, relative to the wild type or control expression or biosynthesis level. As described herein, accumulation of various materials can be increased or, in some embodiments, decreased relative to a control. One of ordinary skill will be able to identify or produce a relevant control.

[050] The terms "modify", "modifying", "modification", and the like as used herein refer to either an increase or enhancement, or a decrease or reduction, as the agricultural context dictates and which is desired, of a characteristic in a transgenic plant or method disclosed herein in order to improve the agricultural fitness, growth, yield, environmental adaptability, response to stress, etc., of such transgenic plant. Such desired increases or enhancements, or decreases or reductions, are by about 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.5%, 99.9%, 100%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 155%, 160%, 165% or 170% or more, or any range therein, compared to the same characteristic in a wild type or control plant. One of ordinary skill will easily be able to identify or produce a relevant control.

[051] Modifications to the transcription activator-like effector binding element nucleotide sequences having the nucleotide sequences shown in SEQ ID NOs: 11, 14, 15, 18, 21, 24, and 27 that reduce the binding of cognate effector proteins thereto comprise insertion of nucleotides and/or deletion of nucleotides and/or substitution of nucleotides and/or modification of nucleotides, including combinations thereof, within the transcription activator-like effector binding element nucleotide sequences.

[052] A person skilled in the art will recognize that insertion or deletion of one or more nucleotides in a transcription activator-like effector binding element nucleotide sequence can disrupt the binding of the cognate effector protein to the effector binding element nucleotide sequence. In one aspect, this can be achieved by complete deletion of the effector binding element nucleotide sequence. In other embodiments, this can also be achieved by deletion of any number of contiguous or non-contiguous individual nucleotides, any range(s) of contiguous or non-contiguous nucleotides, or combinations thereof, normally present in an effector binding element nucleotide sequence.

[053] Substitution of nucleotides that naturally occur within the transcription activator-like effector binding element nucleotide sequences disclosed herein can be achieved by replacing a nucleotide that naturally occurs at one or more contiguous or non-contiguous positions in the effector binding element nucleotide sequence with a different naturally occurring nucleotide, resulting in reduction of binding of the cognate binding effector protein. Thus, A (adenine), T (thymine), G (guanine), or C (cytosine) can be replaced with a different nucleotide within this group. T or C can also be replaced with a nucleotide normally present in RNA rather than DNA, for example U (uracil). Nucleotide substitution also encompasses replacing a nucleotide that naturally occurs at one or more contiguous or non-contiguous positions in the effector binding element nucleotide sequence with a modified nucleotide, resulting in reduction of binding of the cognate binding effector protein. Non-limiting examples of modified nucleotides contemplated for use in the plants, methods, nucleic acid constructs, etc., disclosed and claimed herein include, for example, those listed in Table 2: List of Modified Nucleotides, presented in Chapter 2400, Section 2422, Nucleotide and/or Amino Acid Sequence Disclosures in Patent Applications of the Manual of Patent Examining Procedure (MPEP) Ninth Edition, Revision 08.2017, Last Revised January 2018, the contents of which are herein incorporated by reference in their entirety, as well as those known in the art.
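Purely as an illustration of the kinds of sequence changes described above (insertion, deletion, and substitution within an effector binding element), the operations can be sketched on a string representation of a sequence as follows. The sequence shown is a hypothetical placeholder and is not a sequence from the Sequence Listing; the function names are likewise hypothetical. The sketch only produces the edited sequences and does not predict whether a given change reduces binding of a cognate effector protein, which is assessed experimentally.

```python
def delete_range(seq: str, start: int, end: int) -> str:
    """Delete the nucleotides at 0-based positions start up to, but not including, end."""
    return seq[:start] + seq[end:]


def insert_at(seq: str, position: int, bases: str) -> str:
    """Insert one or more nucleotides before the given 0-based position."""
    return seq[:position] + bases + seq[position:]


def substitute(seq: str, position: int, base: str) -> str:
    """Replace the nucleotide at a 0-based position with a different nucleotide."""
    if seq[position] == base:
        raise ValueError("replacement must differ from the original nucleotide")
    return seq[:position] + base + seq[position + 1:]


# Hypothetical placeholder element, written 5' to 3'; NOT taken from the Sequence Listing.
placeholder_ebe = "TATAAACCTCTTTACC"

complete_deletion = delete_range(placeholder_ebe, 0, len(placeholder_ebe))
partial_deletion = delete_range(placeholder_ebe, 2, 5)
single_substitution = substitute(placeholder_ebe, 0, "C")
```

Non-contiguous changes can be produced by applying these operations repeatedly, working from the 3' end toward the 5' end so that earlier positions are not shifted.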

[054] Modification of one or more contiguous or non-contiguous nucleotides within the transcription activator-like effector binding element nucleotide sequences can be performed by chemical or biochemical modification of naturally occurring nucleotides within these sequences to produce effector binding sequences comprising modified nucleotides such as those listed in Table 2 in MPEP Chapter 2400, Section 2422, noted above, or as are known in the art. For example, adenosine can be converted to N6-isopentenyl-adenosine; thymidine can be converted to dihydrothymidine or uracil; guanosine can be converted to 2,2-dimethylguanosine; cytidine can be converted to 4-acetylcytidine or uracil, etc.

[055] The naturally occurring effector binding element nucleotide sequences can be modified as described above by methods familiar to those having expertise in the arts of gene modification or nucleic acid chemistry or biochemistry, and the modified sequences can be introduced into cotton plants using methods familiar to those in the art of producing transgenic plants.

[056] The effect of any of the modifications described above in reducing infection by, damage from, or susceptibility of cotton plants to pathogens can be assessed by routine screening via the methods disclosed in the present Examples, as well as those known in the art, without undue experimentation.

[057] As used herein, "nucleic acid" means a polynucleotide (or oligonucleotide), including single or double-stranded polymers of deoxyribonucleotide or ribonucleotide bases, and unless otherwise indicated, encompasses naturally occurring and synthetic nucleotide analogues having the essential nature of natural nucleotides in that they hybridize to complementary single-stranded nucleic acids in a manner similar to naturally occurring nucleotides. Nucleic acids may also include fragments and modified nucleotide sequences. Nucleic acids disclosed herein can either be naturally occurring, for example genomic nucleic acids; or isolated, purified, non-genomic nucleic acids, including synthetically produced nucleic acid sequences such as those made by chemical oligonucleotide synthesis, enzymatic synthesis, or by recombinant methods, including for example, cDNA, codon-optimized sequences for efficient expression in different transgenic plants reflecting the pattern of codon usage in such plants, nucleotide sequences that differ from the nucleotide sequences disclosed herein due to the degeneracy of the genetic code but that still encode the protein(s) of interest disclosed herein, nucleotide sequences encoding the presently disclosed protein(s) comprising conservative (or non-conservative) amino acid substitutions that do not adversely affect their normal activity, PCR-amplified nucleotide sequences, and other non-genomic forms of nucleotide sequences familiar to those of ordinary skill in the art. Numerous methods and strategies for codon optimization in plants are well known in the art, being described in the literature and being publicly available via various computer programs.

[058] As used herein, "nucleic acid construct" or "construct" refers to an isolated polynucleotide which can be introduced into a host cell. This construct may comprise any combination of deoxyribonucleotides, ribonucleotides, and/or modified nucleotides. This construct may comprise an expression cassette that can be introduced into and expressed in a host cell.

[059] As used herein "operably linked" refers to a functional arrangement of elements. A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter effects the transcription or expression of the coding sequence. The control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter and the coding sequence and the promoter can still be considered "operably linked" to the coding sequence.

[060] As used herein, the terms "plant" or "plants" that can be used in the present methods broadly include the classes of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, and unicellular and multicellular algae. The term "plant" also includes plants which have been modified by breeding, mutagenesis or genetic engineering (transgenic and non-transgenic plants). It includes plants of a variety of ploidy levels, including aneuploid, polyploid, diploid, haploid and hemizygous. The plant may be in any form including suspension cultures, embryos, meristematic regions, callus tissue, gametophytes, sporophytes, pollen, microspores, whole plants, shoot vegetative organs/structures (e.g. leaves, stems and tubers), roots, flowers and floral organs/structures, seed (including embryo, endosperm, and seed coat) and fruit, plant tissue (e.g. vascular tissue, ground tissue, and the like) and cells, and progeny of same. The term "food crop plant" includes plants that are either directly edible, or which produce edible products, and that are customarily used to feed humans either directly, or indirectly through animals. 
Non-limiting examples of such plants include: Cereal crops: wheat, rice, maize (corn), barley, oats, sorghum, rye, and millet; Protein crops: peanuts, chickpeas, lentils, kidney beans, soybeans, lima beans; Roots and tubers: potatoes, sweet potatoes, and cassavas; Oil crops: corn, soybeans, canola (rapeseed), wheat, peanuts, palm, coconuts, safflower, sesame, cottonseed, sunflower, flax, and olive; Sugar crops: sugar cane and sugar beets; Fruit crops: bananas, oranges, apples, pears, breadfruit, pineapples, and cherries; Vegetable crops and tubers: tomatoes, lettuce, carrots, melons, asparagus, etc.; Nuts: cashews, peanuts, walnuts, pistachio nuts, almonds; Forage and turf grasses; Forage legumes: alfalfa, clover; Drug crops: coffee, cocoa, kola nut, poppy, tobacco; Spice and flavoring crops: vanilla, sage, thyme, anise, saffron, menthol, peppermint, spearmint, coriander.

[061] The terms "peptide", "polypeptide", and "protein" are used to refer to polymers of amino acid residues. These terms are specifically intended to cover naturally occurring biomolecules, as well as those that are recombinantly or synthetically produced.

[062] The term "promoter" or "regulatory element" refers to a region or nucleic acid sequence located upstream or downstream from the start of transcription and which is involved in recognition and binding of RNA polymerase and/or other proteins to initiate transcription of RNA. Promoters need not be of plant or algal origin, for example, promoters derived from plant viruses, such as the CaMV35S promoter, or from other organisms, can be used in variations of the embodiments discussed herein. Promoters useful in the present methods include constitutive, tissue-specific, cell-type specific, seed-specific, inducible, repressible, and developmentally regulated promoters.

[063] A skilled person appreciates that a promoter sequence can be modified to provide for a range of expression levels of an operably linked heterologous nucleic acid molecule. Less than the entire promoter region can be utilized and the ability to drive expression retained. However, it is recognized that expression levels of mRNA can be decreased with deletions of portions of the promoter sequence. Thus, the promoter can be modified to be a weak or strong promoter. A promoter is classified as strong or weak according to its affinity for RNA polymerase (and/or sigma factor); this is related to how closely the promoter sequence resembles the ideal consensus sequence for the polymerase. Generally, by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended levels of about 1/10,000 transcripts to about 1/100,000 transcripts to about 1/500,000 transcripts. Conversely, a strong promoter drives expression of a coding sequence at a high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1,000 transcripts. The promoter of choice is preferably excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers that carry appropriate terminal restriction sites. It should be understood that the foregoing groups of promoters are non-limiting, and that one skilled in the art could employ other promoters that are not explicitly cited herein.

[064] The term "purified" refers to material such as a nucleic acid, a protein, or a small molecule, which is substantially or essentially free from components which normally accompany or interact with the material as found in its naturally occurring environment, and/or which may optionally comprise material not found within the purified material's natural environment. The latter may occur when the material of interest is expressed or synthesized in a non-native environment. Nucleic acids and proteins that have been isolated include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.

[065] "Recombinant" refers to a nucleotide sequence, peptide, polypeptide, or protein, expression of which is engineered or manipulated using standard recombinant methodology. This term applies to both the methods and the resulting products. As used herein, a "recombinant construct", "expression construct", "chimeric construct", "construct" and "recombinant expression cassette" are used interchangeably herein.

[066] As used herein, the phrase "sequence identity" or "sequence similarity" is the similarity between two (or more) nucleic acid sequences, or two (or more) amino acid sequences. Sequence identity is frequently measured as the percent of identical nucleotide or amino acid residues at corresponding positions in two or more sequences when the sequences are aligned to maximize sequence matching, i.e., taking into account gaps and insertions.

[067] One of ordinary skill in the art will appreciate that sequence identity ranges are provided for guidance only. It is entirely possible that nucleic acid sequences that do not show a high degree of sequence identity can nevertheless encode amino acid sequences having similar functional activity. It is understood that changes in nucleic acid sequence can be made using the degeneracy of the genetic code to produce multiple nucleic acid molecules that all encode substantially the same protein. Means for making this adjustment are well-known to those of skill in the art. When percentage of sequence identity is used in reference to amino acid sequences it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences which differ by such conservative substitutions are said to have "sequence similarity" or "similarity". Means for making this adjustment are well-known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.

[068] "Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
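The calculation described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned and padded with a gap character, so that the comparison window is simply the full aligned length; the function name `percent_identity` and the `-` gap symbol are illustrative assumptions and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Gapped positions count toward the window length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Number of positions at which the identical residue occurs in both sequences.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Matched positions divided by total positions in the window, times 100.
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 10 positions, one gap and one mismatch, so 8 matches.
example = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
```

In this toy alignment, 8 of the 10 window positions match, giving 80% identity.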

[069] Sequence identity (or similarity) can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073 (1988). Methods to determine identity are designed to give the largest match between the sequences tested. Moreover, methods to determine identity are codified in publicly available computer programs. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith & Waterman, by the homology alignment algorithms, by the search for similarity method, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Package, available from Accelrys, Inc., San Diego, California, United States of America), or by visual inspection. See generally, Altschul, S. F. et al., J. Mol. Biol. 215: 403-410 (1990) and Altschul et al. Nucl. Acids Res. 25: 3389-3402 (1997).

[070] One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894; and Altschul, S., et al., J. Mol. Biol. 215: 403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M = 5, N = -4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89: 10915).
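The extension and halting rules described above can be sketched in Python as follows. This is a simplified illustration only and not the BLAST implementation: it extends an exact-match seed in one direction without gaps, scores nucleotides with the M and N parameters, and stops under the three halting conditions named in the text. The function name `extend_right`, the default drop-off `X=20`, and the assumption that the seed is an exact match are all illustrative choices.

```python
def extend_right(query, subject, q_start, s_start, seed_len, M=5, N=-4, X=20):
    """Ungapped rightward extension of a seed hit.

    Returns (best cumulative score, number of extended positions at that score).
    """
    score = seed_len * M              # assumption: the seed word matches exactly
    best, best_len = score, 0
    qi, si = q_start + seed_len, s_start + seed_len
    i = 0
    # Halting condition 3: the end of either sequence is reached.
    while qi + i < len(query) and si + i < len(subject):
        score += M if query[qi + i] == subject[si + i] else N
        i += 1
        if score > best:
            best, best_len = score, i
        # Halting condition 1: score has fallen off by X from its maximum.
        # Halting condition 2: cumulative score has gone to zero or below.
        if best - score >= X or score <= 0:
            break
    return best, best_len


q = "ACGTACGTAAAA"
s = "ACGTACGTCCCC"
result = extend_right(q, s, q_start=0, s_start=0, seed_len=4)
```

Here the four-base seed scores 20, the next four matching bases raise the score to 40, and the trailing mismatches never improve on that maximum, so the extension reported is four positions long.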

[071] In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90: 5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P (N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. BLAST searches assume that proteins can be modeled as random sequences. However, many real proteins comprise regions of nonrandom sequences which may be homopolymeric tracts, short-period repeats, or regions enriched in one or more amino acids. Such low-complexity regions may be aligned between unrelated proteins even though other regions of the protein are entirely dissimilar. A number of low-complexity filter programs can be employed to reduce such low-complexity alignments. For example, the SEG (Wooten and Federhen, Comput. Chem., 17: 149-163 (1993)) and XNU (Claverie and States, Comput. Chem., 17: 191-201 (1993)) low-complexity filters can be employed alone or in combination.

[072] The constructs and methods disclosed herein encompass nucleic acid and protein sequences having sequence identity/sequence similarity at least about 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% to those specifically disclosed.

[073] A "transgenic" organism, such as a transgenic plant, is a host organism that has been stably or transiently genetically engineered to contain one or more heterologous nucleic acid fragments, including nucleotide coding sequences, expression cassettes, vectors, etc.

Introduction of heterologous nucleic acids into a host cell to create a transgenic cell is not limited to any particular mode of delivery, and includes, for example, microinjection, adsorption, electroporation, particle gun bombardment, whiskers-mediated transformation, liposome-mediated delivery, Agrobacterium-mediated transfer, the use of viral and retroviral vectors, etc., as is well known to those skilled in the art.

[074] Conventional techniques of molecular biology, recombinant DNA technology, microbiology, and chemistry useful in practicing the methods of the present disclosure are described, for example, in Green and Sambrook (2012) Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press; Ausubel et al. (2003 and periodic supplements) Current Protocols in Molecular Biology, John Wiley & Sons, New York, N. Y.; Amberg et al. (2005) Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual, 2005 Edition, Cold Spring Harbor Laboratory Press; Roe et al. (1996) DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee (1990) In Situ Hybridization: Principles and Practice; Oxford University Press; M. J. Gait (Editor) (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press; D. M. J. Lilley and J. E. Dahlberg (1992) Methods in Enzymology: DNA Structure Part A: Synthesis and Physical Analysis of DNA, Academic Press; and Lab Ref: A Handbook of Recipes, Reagents, and Other Reference Tools for Use at the Bench, Edited by Jane Roskams and Linda Rodgers (2002) Cold Spring Harbor Laboratory Press; Burgess and Deutscher (2009) Guide to Protein Purification, Second Edition (Methods in Enzymology, Vol. 463), Academic Press. Note also U.S. Patent Nos. 8,178,339; 8,119,365; 8,043,842; 8,039,243; 7,303,906; 6,989,265; US20120219994A1; and EP1483367B1. The entire contents of each of these texts and patent documents are herein incorporated by reference.

[075] The materials and methods employed in the examples below, as well as the examples themselves, are for illustrative purposes only, and are not intended to limit the practice of the presently disclosed or claimed embodiments thereto. Any materials and methods similar or equivalent to those described herein as would be apparent to one of ordinary skill in the art can be used in the practice or testing of any of these embodiments.

Methods

[076] Details of the materials and methods employed in the following examples include:

[077] Xcm strain isolation and manipulation: New strains were isolated from infected cotton leaves by grinding tissue in 10 mM MgCl2 and culturing bacteria on NYGA media. The most abundant colony type was selected and single colony purified, and 16S sequencing was then used to confirm the bacterial genus as previously described in Weisburg WG, Barns SM, Pelletier DA, Lane DJ. 16S ribosomal DNA amplification for phylogenetic study. Journal of bacteriology. 1991. In addition, single colony purified strains were re-inoculated into cotton leaves and the appearance of water-soaked symptoms indicative of CBB infection was confirmed. Both newly isolated strains and strains received from collaborators were used to generate a rifampicin-resistant version of each strain. Wildtype strains were grown on NYGA, then transferred to NYGA containing 100 μg/ml rifampicin. After approximately 4-5 days, single colonies emerged. These were single colony purified and stored at -80°C. The rifampicin-resistant version of each Xcm strain was used in all subsequent experiments reported in this manuscript unless otherwise noted.

[078] Plant inoculations: Xcm strains were grown on NYGA plates containing 100 μg/ml rifampicin at 30°C for two days before inoculations were performed. Disease assays were conducted in a growth chamber set at 30°C and 80% humidity. Inoculations were conducted by infiltrating a fully expanded leaf with a bacterial solution in 10 mM MgCl2 (OD600 specified within each assay).

[079] Cotton Cultivar Statistics: Area of cotton planted per county in the United States in 2015 was obtained from the USDA National Agricultural Statistics Service. Estimated percentage of upland cotton planted for each variety was obtained from the Agricultural Marketing Service (AMS). CBB disease phenotyping data from 2009-2016 were determined via cotyledon scratch assays and/or field trials sprayed with virulent Xcm isolates and have previously been described in Wheeler TA, Woodward JE. Response of cotton varieties to bacterial blight race 18 in 2016. Texas AgriLife Research and Extension Service Center. 2016; Wheeler TA, Woodward JE. Response of cotton varieties to diseases on the Texas High Plains. Texas AgriLife Research and Extension Service Center. 2012; and Wheeler TA, Woodward JE. Response of cotton varieties to diseases on the Southern High Plains of Texas, 2010. Texas AgriLife Research and Extension Service Center. 2010.

[080] Bacterial Sequencing and Phylogenetics: Illumina based genomic datasets were generated as previously described in Bart R, Cohn M, Kassen A, McCallum EJ, Shybut M, Petriello A, et al. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance. Proceedings of the National Academy of Sciences. 2012. Paired-end Illumina reads were trimmed using Trimmomatic v0.32 (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36). Genome assemblies were generated using the SPAdes de novo genome assembler. Strain information is reported in Supplemental Table 1. Similar to our previously published methods, the program Prokka was used in conjunction with a T3E database to identify type three effector repertoires for each of the 12 Xcm isolates as well as four Xcm genomes previously deposited on NCBI (Table 2).

[081] Multi-locus sequence analysis was conducted by concatenating sequences of the gltA, lepA, lacF, gyrB, fusA and gap-1 loci obtained from the Plant-Associated Microbes Database (PAMDB) for each strain as previously described in Almeida NF, Yan S, Cai R, Clarke CR, Morris CE, Schaad NW, et al. PAMDB, a multilocus sequence typing and analysis database and website for plant-associated microbes. Phytopathology. 2010. A maximum-likelihood tree using these concatenated sequences was generated using CLC Genomics 7.5.
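The concatenation step can be illustrated with a short Python sketch. It assumes that each locus has already been aligned across strains, so that every strain contributes a sequence of the same length for a given locus; the function name `concatenate_loci`, the nested-dictionary input, and the toy sequences are hypothetical and are not taken from PAMDB or from CLC Genomics.

```python
# Loci named in the text, concatenated in a fixed order for every strain.
LOCI = ["gltA", "lepA", "lacF", "gyrB", "fusA", "gap-1"]


def concatenate_loci(per_strain):
    """Map {strain: {locus: aligned sequence}} to {strain: concatenated sequence}."""
    expected_len = {}
    concatenated = {}
    for strain, seqs in per_strain.items():
        parts = []
        for locus in LOCI:
            if locus not in seqs:
                raise KeyError(f"{strain} is missing locus {locus}")
            seq = seqs[locus]
            # Every strain must contribute the same aligned length per locus.
            if len(seq) != expected_len.setdefault(locus, len(seq)):
                raise ValueError(f"{locus} length differs for {strain}")
            parts.append(seq)
        concatenated[strain] = "".join(parts)
    return concatenated


# Toy data: two-base placeholders stand in for real locus alignments.
toy = {
    "strain_1": {"gltA": "AC", "lepA": "GT", "lacF": "AA", "gyrB": "CC", "fusA": "GG", "gap-1": "TT"},
    "strain_2": {"gltA": "AC", "lepA": "GA", "lacF": "AA", "gyrB": "CT", "fusA": "GG", "gap-1": "TT"},
}
supermatrix = concatenate_loci(toy)
```

The resulting equal-length strings can then be written out as an alignment and passed to a tree-building program of choice.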

[082] Variant Based Phylogeny: A variant based dendrogram was created by comparing 12 Illumina sequenced Xcm genomes to the complete Xanthomonas citri subsp. citri strain Aw12879 reference genome (565918 [RefSeq]) on NCBI. Read pairs were aligned to the reference genome using Bowtie2 v2.2.9 with default alignment parameters. From these alignments, single nucleotide polymorphisms (SNPs) were identified using samtools mpileup v1.3 and the bcftools call v1.3.1 multi-allelic caller. Using Python v2.7, the output from samtools mpileup was used to identify loci in the X. citri subsp. citri reference genome with a minimum coverage of 10 reads in each Xcm genome. Vcftools v0.1.14 and bedtools v2.25.0 were used in combination to remove sites marked as indel, low quality, or heterozygous in any of the genomes. Remaining loci were concatenated to create a FASTA alignment of confident loci. Reference loci were used where SNPs were not detected in a genome. The resulting FASTA alignment contained 17853 loci per strain. This alignment was loaded into the online Simple Phylogeny Tool from the ClustalW2 package to create a neighbor joining tree of the assessed strains. Trees were visualized using FigTree v1.4.2.
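The coverage filter and the concatenation with reference fallback can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the scripts used in the study: per-position read depths and SNP calls are assumed to have been parsed already into dictionaries, the removal of indel, low-quality, and heterozygous sites is omitted, and the names `confident_loci` and `build_alignment` are hypothetical.

```python
def confident_loci(depth, min_cov=10):
    """Return sorted reference positions covered by >= min_cov reads in every genome.

    depth maps {strain: {position: read depth}}.
    """
    strains = list(depth)
    if not strains:
        return []
    shared = set(depth[strains[0]])
    for strain in strains[1:]:
        shared &= set(depth[strain])
    return sorted(
        pos for pos in shared if all(depth[s][pos] >= min_cov for s in strains)
    )


def build_alignment(reference, snps, loci):
    """Concatenate one base per locus for each strain.

    reference maps {position: reference base}; snps maps {strain: {position: alt base}}.
    The reference base is used wherever no SNP was called for a strain.
    """
    return {
        strain: "".join(calls.get(pos, reference[pos]) for pos in loci)
        for strain, calls in snps.items()
    }


# Toy data: position 2 fails the coverage filter in strain A.
depth = {"A": {1: 12, 2: 9, 3: 30}, "B": {1: 10, 2: 40, 3: 11}}
loci = confident_loci(depth)
alignment = build_alignment({1: "A", 2: "C", 3: "G"}, {"A": {3: "T"}, "B": {}}, loci)
```

Each value in `alignment` has one character per retained locus, so the strings can be written directly as FASTA records for a neighbor-joining tool.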

[083] Genome Assembly: Single Molecule, Real Time (SMRT) sequencing of Xcm strains MS14003 and AR81009 was obtained from DNA prepped using a standard CTAB DNA preparation. Blue Pippin size selection and library preparation were done at the University of Delaware Sequencing Facility. The genomes were assembled using FALCON-Integrate. The following parameters were used: Assembly parameters for MS14003: length_cutoff = 7000; length_cutoff_pr = 7000; pa_HPCdaligner_option = -v -dal8 -t16 -e.70 -l2000 -s240 -M10; ovlp_HPCdaligner_option = -v -dal8 -t32 -h60 -e.96 -l2000 -s240 -M10; falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 5 --local_match_count_threshold 2 --max_n_read 300 --n_core 6; overlap_filtering_setting = --max_diff 80 --max_cov 160 --min_cov 5 --bestn 10; Assembly parameters for AR81009: length_cutoff = 8000; length_cutoff_pr = 8000; pa_HPCdaligner_option = -v -dal8 -t16 -e.72 -l2000 -s240 -M10; ovlp_HPCdaligner_option = -v -dal8 -t32 -h60 -e.96 -l2000 -s240 -M10; falcon_sense_option = --output_multi --min_idt 0.72 --min_cov 4 --local_match_count_threshold 2 --max_n_read 320 --n_core 6; overlap_filtering_setting = --max_diff 90 --max_cov 300 --min_cov 10 --bestn 10. Assemblies were polished using iterations of pbalign and quiver. Two iterations were run for Xcm strain MS14003 and 3 iterations for AR81009. Chromosomes were then reoriented to the DnaA gene and plasmids were reoriented to ParA. The assemblies were checked for overlap using BLAST, and trimmed to circularize the sequences. TAL effectors were annotated and grouped by RVD sequences using AnnoTALE. Homologous regions among plasmids that are greater than 1 kb were determined using progressiveMauve. Genomic comparisons between the MS14003 and AR81009 chromosomes were visualized using Circos. Single-copy genes on each of the chromosomes were identified and joined using their annotated IDs. Lines connecting the two chromosomes represent these common genes and their respective positions in each genome. A sliding window of 1 kb was used to determine the average GC content. Methylation was determined using the Base Modification and Motif Analysis workflow from pbsmrtpipe v0.42.0.
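The sliding-window GC calculation can be illustrated with a brief Python sketch. The text specifies only the 1 kb window size; the step between windows is not stated, so the non-overlapping default `step=1000` below is an assumption, as are the function name `gc_windows` and the toy sequences.

```python
def gc_windows(seq, window=1000, step=1000):
    """Return a list of (window start, GC fraction) pairs along a sequence.

    Only full-length windows are reported; a trailing partial window is skipped.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / window))
    return results


# Toy sequence: a GC-only kilobase followed by an AT-only kilobase.
profile = gc_windows("GC" * 500 + "AT" * 500)
# Smaller, overlapping windows on a short sequence.
small = gc_windows("GGCCAATT", window=4, step=2)
```

Windows whose GC fraction departs sharply from the genome-wide average are the kind of regions the text describes as candidates for horizontal gene transfer.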

[084] Western Blot Analysis: Western Blot analysis of Transcription Activator-Like (TAL) effectors was performed using a TAL specific antibody. Briefly, bacteria were suspended in pH 5.4 minimal media for 4.5 hours to induce effector production and secretion. The pellet was then suspended in Laemmli buffer at 95 degrees Celsius for three minutes to lyse the cells. Freshly boiled samples were then loaded onto a 4-6% gradient gel and run for several hours to ensure sufficient separation of the different sized TAL effectors. Polyclonal rbTallO antibody was used to visualize all TALs.

[085] Gene Expression Analysis: Susceptible cotton plants were inoculated with Xcm using a needleless syringe at an OD600 of 0.5. Infected and mock-treated tissues were collected and flash frozen at 24 and 48 hours post inoculation. RNA was extracted using the Sigma tRNA kit. RNA-sequencing libraries were generated as previously described in Bart R, Cohn M, Kassen A, McCallum EJ, Shybut M, Petriello A, et al. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance. Proceedings of the National Academy of Sciences. 2012.

[086] Raw reads were trimmed using Trimmomatic. The Tuxedo Suite was used for mapping reads to the TM-1 NBI Gossypium hirsutum genome, assembling transcripts, and quantifying differential expression.

[087] Homeologous pairs were identified based on syntenic regions with MCScan. A syntenic region is defined as a region with a minimum of five genes with an average intergenic distance of 2 and within extended distance of 40. All other values are set to the default.

[088] Bioinformatic prediction of TAL effector binding sites on the G. hirsutum promoterome was performed using the TAL Effector-Nucleotide Targeter (TALE-NT). In short, the regions of the genome that were within 300 basepairs of annotated genes were queried with the RVDs of MS14003 and AR81009 using a cutoff score of 4. Promiscuously binding TALs 16 from MS14003 and 16a from AR81009 were removed from analysis.

Example I: CBB Reemergence in the US

[089] In 2011, farmers, Extension specialists, and Certified Crop Advisers in Missouri, Mississippi, and Arkansas observed cotton plants exhibiting symptoms of CBB. While a major limiting factor for cotton production through the 1950s, this disease had been controlled by agricultural practices such as acid-delinting seed as well as planting resistant cultivars. Prior to the widespread observation of CBB in the mid-southern U.S., isolated, sporadic instances of the disease were generally detected on an annual basis. Reemergence of the disease occurred rapidly during 2011. Widespread infected plant material was observed throughout much of the production area, but appeared to be centered around Clarksdale, Mississippi. Much of the infestation in the Arkansas production system was reported to have originated from several infested seed lots. The disease has since spread through much of the cotton belt in the southern U.S. (Fig. 1 and Fig. 2).

[090] In 2014, diseased cotton leaves were collected from three sites across Mississippi and Koch's postulates were conducted to prove causality. PCR amplification of the 16S rRNA gene confirmed that the causal agent was a member of the Xanthomonas genus. Multi-locus sequence typing (MLST) analysis and maximum-likelihood analysis were performed using concatenated sections of the gltA, lepA, lacF, gyrB, fusA and gap-1 loci (Fig. 3a) for increased phylogenetic resolution. The newly sequenced strains were named MS14001, MS14002 and MS14003 and were compared to four previously published Xcm genomes and thirty-six additional Xanthomonas genomes representing thirteen species (Table 1, Table 2). MS14001, MS14002 and MS14003 grouped with the previously published Xcm strains as a single polytomy, further confirming that the current disease outbreak is CBB and is caused by Xcm. The species designation reported here is consistent with previous reports. To date, CBB has been reported from at least eight out of the sixteen states that grow cotton (Fig. 1).

[091] Table 1: Illumina and SMRT sequenced Xcm genomes described herein.

| Strain Name | Identifier | Latitude | Longitude | City/Country | Year | Provider | Collector | Platform | Contig # | Avg Contig Len | Total Bases | n50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MS14001 | | 34.15 | -90.52 | Clarksdale, MS | 2014 | R.Bart | R.Bart | Illumina | 3320 | 1330.38 | 4416865 | 2288 |
| MS14002 | | 34.12 | -90.52 | Clarksdale, MS | 2014 | R.Bart | R.Bart | Illumina | 545 | 9443.27 | 5146580 | 62542 |
| MS14003 | | 32.95 | -90.51 | Wilzone, MS | 2014 | R.Bart | R.Bart | Illumina | 2577 | 1511.35 | 3894744 | 2209 |
| Race 1 | | | | US | | S.Lu | P.Thaxton | Illumina | 523 | 10127.35 | 5296606 | 48599 |
| Race 2 | | | | US | | S.Lu | P.Thaxton | Illumina | 387 | 13402.57 | 5186796 | 54804 |
| Race 3 | | | | US | | S.Lu | P.Thaxton | Illumina | 725 | 7207.34 | 5225324 | 28344 |
| Race 12 | | | | US | | S.Lu | P.Thaxton | Illumina | 632 | 8134.35 | 5140911 | 21428 |
| Race 18 | | | | US | | S.Lu | P.Thaxton | Illumina | 369 | 13924.03 | 5137968 | 112543 |
| AR81009 | CFBP2035 | | | Argentina | 1981 | CIRM-CFBP | Frossard P | Illumina | 306 | 17182.59 | 5257872 | 86594 |
| MA81010 | CFBP2036 | | | Mali | 1981 | CIRM-CFBP | Frossard P | Illumina | 584 | 9033.09 | 5275326 | 23323 |
| SU58011 | CFBP2530 | | | Sudan | 1958 | CIRM-CFBP | Last F T. M2519 | Illumina | 1134 | 4563.33 | 5174819 | 9682 |
| SU9012 | CFBP5637 | | | Sudan | 1992 | CIRM-CFBP | Schmit J. 11781 | Illumina | 377 | 13919.54 | 5247665 | 88522 |
| Xcm013 | MSCT4 | | | US | | S.Lu | | Illumina | 2169 | 2151.5 | 4666607 | 3869 |
| Xcm014 | MSCT8 | | | US | | S.Lu | | Illumina | 580 | 8929.58 | 5179156 | 88255 |
| MS14003 | | 32.95 | -90.51 | Wilzone, MS | 2014 | R.Bart | R.Bart | SMRT | 4 | 1286177 | 5144706 | 5029617 |
| AR81009 | CFBP2035 | | | Argentina | 1981 | CIRM-CFBP | Frossard P | SMRT | 4 | 1352212 | 5408848 | 5267057 |
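The assembly summary columns of Table 1 (contig number, average contig length, total bases, and n50) are related in a simple way, which the following Python sketch illustrates. It uses the common definition of N50 as the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size; whether the assembler reports applied exactly this definition is an assumption, and the function name `assembly_stats` and the toy contig lengths are hypothetical.

```python
def assembly_stats(contig_lengths):
    """Summarize an assembly from a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = lengths[-1]
    for length in lengths:
        running += length
        # N50: contig at which the cumulative length reaches half the assembly.
        if running >= total / 2:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "avg_contig_len": total / len(lengths),
        "total_bases": total,
        "n50": n50,
    }


# Toy assembly of five contigs.
stats = assembly_stats([100, 80, 60, 40, 20])
```

By construction, the contig count multiplied by the average contig length equals the total number of bases, which provides a quick consistency check on each row of the table.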

[092] Table 2: Xanthomonas genomes previously deposited on NCBI that are referenced herein.


| Accession | Organism | Label |
|---|---|---|
| GCA_000007145 | Xanthomonas campestris pv. campestris str. ATCC 33913 | X. campestris_1 |
| GCA_000012105 | Xanthomonas campestris pv. campestris str. 8004 | X. campestris_2 |
| GCA_000403575 | Xanthomonas campestris pv. campestris str. CN15 | X. campestris_3 |
| GCA_000007165 | Xanthomonas axonopodis pv. citri str. 306 | X. citri_1 |
| GCA_000263335 | Xanthomonas citri pv. mangiferaeindicae LMG 941 | X. citri_2 |
| GCA_000349225 | Xanthomonas citri subsp. citri Aw12879 | X. citri_3 |
| GCA_000009165 | Xanthomonas campestris pv. vesicatoria | X. euvesicatoria_1 |
| GCA_000802325 | Xanthomonas euvesicatoria strain 66b | X. euvesicatoria_2 |
| GCA_000175135 | Xanthomonas fuscans subsp. aurantifolii str. ICPB 11122 | X. fuscans_1 |
| GCA_000741885 | Xanthomonas fuscans subsp. fuscans strain CFBP4884 | X. fuscans_2 |
| GCA_000817715 | Xanthomonas fuscans subsp. fuscans strain X621 | X. fuscans_3 |
| GCA_000192065 | Xanthomonas gardneri ATCC 19865 | X. gardneri_1 |
| GCA_000007385 | Xanthomonas oryzae pv. oryzae KACC 10331 | X. oryzae_1 |
| GCA_000010025 | Xanthomonas oryzae pv. oryzae MAFF 311018 | X. oryzae_2 |
| GCA_000019585 | Xanthomonas oryzae pv. oryzae PXO99A | X. oryzae_3 |
| GCA_000192045 | Xanthomonas perforans 91-118 | X. perforans_1 |
| GCA_000800665 | Xanthomonas perforans strain 4P1S2 | X. perforans_2 |
| GCA_000225975 | Xanthomonas sacchari NCPPB 4393 | X. sacchari_1 |
| GCA_000815185 | Xanthomonas sacchari strain R1 | X. sacchari_2 |
| GCA_000831625 | Xanthomonas sacchari strain LMG 476 | X. sacchari_3 |
| GCA_000159795 | Xanthomonas vasicola pv. vasculorum NCPPB 702 | X. vasicola_1 |
| GCA_000277995 | Xanthomonas vasicola pv. vasculorum NCPPB 1326 | X. vasicola_2 |
| GCA_000772695 | Xanthomonas vasicola strain NCPPB 1241 | X. vasicola_3 |
| GCA_000192025 | Xanthomonas vesicatoria ATCC 35937 | X. vesicatoria_1 |
| GCA_000803145 | Xanthomonas vesicatoria strain 53M | X. vesicatoria_2 |
| GCA_000803155 | Xanthomonas vesicatoria strain 15b | X. vesicatoria_3 |
| GCA_000454525 | Xanthomonas citri pv. malvacearum X20 | Xcm_BF_1 !Xcitri_malvX20 |
| GCA_000454505 | Xanthomonas citri pv. malvacearum X18 | Xcm_BF_2 !Xcitri_malvX18 |
| GCA_000309925 | Xanthomonas axonopodis pv. malvacearum str. GSPB2388 | Xcm_SU44 !Xaxon_malv2388 |
| GCA_000309905 | Xanthomonas axonopodis pv. malvacearum str. GSPB1386 | Xcm_NI86 !Xaxon_malv1386 |

Example II: Contemporary U.S. Xcm strains cluster phylogenetically with historical race 18 strains.

[093] Race groups have been described for Xcm strains by analyzing compatible (susceptible) and incompatible (resistant) interactions on a panel of seven cotton cultivars. In general, race groups tend to be geographically distinct. For example, as mentioned previously, race 18 is prevalent in the U.S. while race 20 is a highly virulent strain reported from several African countries. Consequently, one possible explanation for the recent outbreak of CBB would be the introduction of a new race of Xcm capable of overcoming existing genetic resistance. Unfortunately, only two cultivars of the original cotton panel, plus three related cultivars, were available, and these cultivars were not sufficient to determine whether a new race had established within the U.S. Consequently, twelve Xcm strains were sequenced using Illumina technology to determine the phylogenetic relationship between recent isolates of Xcm and historical isolates. Isolates designated as race 1, race 2, race 3, race 12 and race 18 have been maintained at Mississippi State University. Additional isolates were obtained from the Collection Francaise de Bacteries associees aux Plantes (CFBP) culture collection. Together, these isolates include eight strains from the US, three from Africa, and one from South America and span collection dates ranging from 1958 through 2014 (Fig. 1). Illumina reads were mapped to the Xanthomonas citri subsp. citri strain Aw12879 (565918 [RefSeq]) using Bowtie and single nucleotide polymorphisms (SNPs) were identified using Samtools. Only regions of the genome with at least 10x coverage for all genomes were considered. This approach identified 17,853 sites that were polymorphic in at least one genome. Nucleotides were concatenated and used to build a neighbor-joining tree (Fig. 3b). This analysis revealed that recent Xcm isolates grouped with the race 18 clade. Notably, the race 18 clade is phylogenetically distant from the other Xcm isolates.

Example III: Contemporary US Xcm strains have conserved virulence protein arsenals and disease phenotypes with historical race 18 strains.

[094] Type three effector (T3E) profiles from sixteen Xcm isolates were compared to determine whether a change in the virulence protein arsenal of the newly isolated strains could explain the re-emergence of CBB. Genomes from 12 Xcm isolates were de novo assembled with SPAdes and annotated with Prokka based on annotations from the X. euvesicatoria (aka. X. campestris pv. vesicatoria) 85-10 genome (NCBI accession: NC_007508.1). T3Es pose a particular challenge for reference based annotation as no bacterial genome contains all effectors. Consequently, an additional protein file containing known T3Es from our previous work was included within the Prokka annotation pipeline. This analysis revealed 24 conserved and 9 variable Xcm T3Es (Fig. 4a). Most race 18 isolates contain more effectors than other isolates that were sequenced. The recent Xcm isolates (MS14002 and MS14003) were not distinguishable from historical race 18 isolates, with the exception of XcmNI86 isolated from Nigeria in 1986, which contains mutations in XopE2 and XopP.

[095] Analysis of the genomic sequence of T3E revealed presence/absence differences, frameshifts and premature stop codons. However, this analysis does not preclude potential allelic or expression differences among the virulence proteins that could be contributing factors to the re-emergence of CBB. Therefore, newly isolated strains may harbor subtle genomic changes that have allowed them to overcome existing resistance phenotypes. Many commercial cultivars of cotton are reported to be resistant to CBB. Based on these previous reports, we selected commercial cultivars resistant and susceptible (6 of each) to CBB. In addition, we included 5 available varieties that are related to the historical panel as well as 2 parents from a nested association mapping (NAM) population currently under development. All varieties inoculated with the newly isolated Xcm strains exhibited inoculation phenotypes consistent with previous reports for these varieties (Fig. 4b, Fig. 4c). In these assays, brightfield and near infrared (NIR) imaging were used to distinguish water-soaked disease symptoms from rapid cell death (hypersensitive response) that is indicative of an immune response. These data confirm that existing resistance genes present within cotton germplasm are able to recognize the newly isolated Xcm strains and trigger a hypersensitive response. Together, the phylogenetic analysis, effector profile conservation and cotton inoculation phenotypes confirm that the recent outbreak of Xcm in the US represents a re-emergence of race 18 Xcm and is not the result of a dramatic shift in the pathogen.

[096] The USDA Agricultural Marketing Service (AMS) publishes the percentage of upland cotton cultivars planted in the U.S. each year. In 2016, only 25% of the total cotton acreage was planted with resistant cultivars (Fig. 4d), based on previously published CBB phenotypes for these cultivars. This is part of a larger downward trend in which the acreage of resistant cultivars has fallen each year since at least 2009.

Example IV: Comparative genome analysis for two Xcm strains

[097] Differences in virulence were observed among Xcm strains at the molecular and phenotypic level. In order to gain insight into these differences, we selected two strains from our collection that differed in T3E content, virulence level, geography of origin and isolation date. AR81009 was isolated in Argentina in 1981 and is one of the most virulent strains in this study; MS14003 was isolated in Mississippi in 2014 and causes comparatively slower and diminished leaf symptoms. Full genome sequences were generated with Single Molecule Real-Time (SMRT) sequencing. Genomes were assembled using the PacBio Falcon assembler which yielded circular 5 Mb genomes and associated plasmids. Genic synteny between the two strains was observed with the exception of two 1.05 Mb inversions (Fig. 5). Regions of high and low GC content, indicative of horizontal gene transfer, were identified in both genomes. In particular, a 120 kb insertion with low GC content was observed in AR81009. This region contains one T3E as well as two annotated type four secretion system related genes, two conjugal transfer proteins, and two multidrug resistance genes. MS14003 contained three plasmids of the sizes 52.4, 47.4, and 15.3 kb while AR81009 contained two plasmids of the sizes 92.6 and 22.9 kb. Analysis of homologous regions among the plasmids was performed using progressiveMauve. This identified four homologous regions greater than 1 kb that were shared among multiple plasmids (Fig. 5).

[098] The AR81009 genome encodes twelve TAL effectors that range in size from twelve to twenty-three repeat lengths, six of which reside on plasmids. The MS14003 genome encodes eight TAL effectors that range in size from fourteen to twenty-eight repeat lengths, seven of which reside on plasmids (Fig. 6a). Three incomplete TAL effectors were also identified within these genomes. A 1-repeat gene with reduced 5' and 3' regions was identified in both strains directly upstream of a complete TAL effector. In addition, a large 4 kb TAL effector was identified in AR81009 with a 1.5 kb insertion and 10 complete repeat sequences. The tool AnnoTALE was used to annotate and group TAL effectors based on the identities of the repeat variable diresidues (RVDs) in each gene. Little homology was identified among TAL effectors within and between strains; only two TAL effectors were determined to be within the same TAL class between strains (TAL19b of AR81009 and TAL19 of MS14003) and two within strain MS14003 (TAL14b and TAL16). Both strains express TAL effector proteins, as demonstrated through western blot analysis using a TAL effector specific antibody (Fig. 6b). However, the complexity of TAL effector repertoires within these strains prevented complete resolution of each individual TAL effector.

Example V: Transcriptome changes induced by Xcm in G. hirsutum.

[099] An RNA-sequencing experiment was designed to determine whether AR81009 and MS14003 incite different host responses during infection (Fig. 7a). Isolates were inoculated into the phylogenetically diverse G. hirsutum cultivars Acala Maxxa and DES 56 (Fig. 7b). Infected and mock-treated tissues were collected at 24 and 48 hours post inoculation. First, we considered global transcriptome patterns of gene expression. Fifty-two genes were determined to be induced in all Xcm-G. hirsutum interactions at 48 hours (Table 3). Of note among this list of genes is a homeologous pair of genes with homology to the known susceptibility target, MLO. Gene induction by a single strain was also observed; AR81009 and MS14003 uniquely induced 127 and 16 G. hirsutum genes, respectively (Fig. 7c). The increased number of genes induced by AR81009 correlates with the observed severe leaf symptoms caused by this strain. In contrast, the average magnitude of gene induction between the two strains was not significantly different (Fig. 9). Both Xcm strains caused more genes to be differentially expressed in DES 56 than in Acala Maxxa. Among the 52 genes significantly induced by both strains, sixteen conserved targets belong to homeologous pairs, whereas seventeen and fifteen genes are encoded by the A and D sub-genomes, respectively (Table 3 and Table 4). It has been previously reported that homeologous genes encoded on the G. hirsutum A and D sub-genomes are differentially regulated during abiotic stress. A set of approximately 10,000 homeologous gene pairs was selected and differential gene expression was assessed (Fig. 10). For each pairwise comparison of Xcm strain and G. hirsutum cultivar, a similar number of genes were differentially expressed in each sub-genome.
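The selection of conserved and strain-specific induced genes can be sketched as a threshold filter followed by set operations. The thresholds (p < 0.05 and Log2 fold change > 2) are those stated for Table 3; everything else is an assumption made for illustration, including the data layout, the gene names, the toy values, and the working definition of "uniquely induced" (induced by one strain in every cultivar and by no other strain in any cultivar).

```python
from typing import Dict, Set, Tuple

P_MAX = 0.05       # p-value cutoff stated for Table 3
LOG2FC_MIN = 2.0   # Log2(fold change in FPKM) cutoff stated for Table 3

# (strain, cultivar) -> gene -> (log2 fold change vs. mock, p value)
Results = Dict[Tuple[str, str], Dict[str, Tuple[float, float]]]


def induced(genes: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Genes passing both cutoffs in one strain-by-cultivar comparison."""
    return {g for g, (lfc, p) in genes.items() if p < P_MAX and lfc > LOG2FC_MIN}


def induced_in_all(results: Results) -> Set[str]:
    """Genes induced in every strain-by-cultivar comparison."""
    sets = [induced(g) for g in results.values()]
    return set.intersection(*sets) if sets else set()


def induced_only_by(results: Results, strain: str) -> Set[str]:
    """Genes induced by `strain` in all of its comparisons and by no other strain
    (an assumed definition of strain-specific induction)."""
    own = [induced(g) for (s, _), g in results.items() if s == strain]
    other = [induced(g) for (s, _), g in results.items() if s != strain]
    if not own:
        return set()
    return set.intersection(*own) - set().union(*other)


# Hypothetical values for three placeholder genes.
toy: Results = {
    ("AR81009", "Acala Maxxa"): {"geneA": (3.1, 0.001), "geneB": (2.5, 0.01), "geneC": (0.4, 0.6)},
    ("AR81009", "DES 56"):      {"geneA": (4.0, 0.002), "geneB": (2.2, 0.03), "geneC": (2.9, 0.2)},
    ("MS14003", "Acala Maxxa"): {"geneA": (2.6, 0.004), "geneB": (1.1, 0.04), "geneC": (0.1, 0.9)},
    ("MS14003", "DES 56"):      {"geneA": (3.3, 0.01),  "geneB": (0.9, 0.2),  "geneC": (0.2, 0.8)},
}
print(induced_in_all(toy), induced_only_by(toy, "AR81009"), induced_only_by(toy, "MS14003"))
```

In the toy data, geneC in DES 56 has a large fold change but fails the p-value cutoff, which shows why both criteria are applied together.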

[0100] Table 3: RNA-Seq analysis reveals that 52 genes are induced in all Xcm-G. hirsutum interactions at 48 hours (p < 0.05 with a Log2(fold change in FPKM) > 2).

Gene             Homeolog     Annotation
Gh_A10G0257      Gh_D10G0257  Protein E6
Gh_A10G1075      Gh_D10G1437  Pectin lyase-like superfamily protein
Gh_A13G1467      Gh_D13G1816  pathogenesis-related 4
Gh_A01G0779                   Predicted Protein
Gh_A01G1712                   terpene synthase 21
Gh_A02G0972                   glycosyl hydrolase 9B13
Gh_A03G0875                   Protein of unknown function (DUF1666)
Gh_A04G0364                   Cysteine proteinases superfamily protein
Gh_A04G0366                   Cysteine proteinases superfamily protein
Gh_A05G1967                   Predicted Protein
Gh_A07G0470                   malate synthase
Gh_A08G1167                   downstream target of AGL15 2
Gh_A09G0128                   EXS (ERD1/XPR1/SYG1) family protein
Gh_A09G1148                   Protein of unknown function, DUF642
Gh_A09G1803                   Pectin lyase-like superfamily protein
Gh_A12G2323                   PAR1 protein
Gh_A13G0185                   expansin A4
Gh_A13G0205                   Ypt/Rab-GAP domain of gyp1p superfamily protein
Gh_A13G0281                   Subtilase family protein
Gh_A13G1662                   Protein of unknown function (DUF1677)
Gh_D01G1158                   hydroxy methylglutaryl CoA reductase 1
Gh_D02G1352                   glutaredoxin-related
Gh_D02G1437                   Plant invertase/pectin methylesterase inhibitor superfamily protein
Gh_D03G1462                   osmotin 34
Gh_D05G2589                   laccase 14
Gh_D06G0662                   Nucleotide-diphospho-sugar transferases superfamily protein
Gh_D07G1960                   Uncharacterized membrane protein
Gh_D07G1997                   RAB GTPase homolog A5E
Gh_D08G0336                   WUSCHEL related homeobox 13
Gh_D08G2134                   Protein of unknown function (DUF1635)
Gh_D09G1130                   beta-1,3-glucanase 3
Gh_D10G1861                   expansin A8
Gh_D11G0279                   chloroplast beta-amylase
Gh_D11G1628                   reversibly glycosylated polypeptide 1
Gh_D12G2309                   glycosyl hydrolase 9C2
Gh_Sca005130G01               photosystem II reaction center protein B
Gh_Sca005423G01               Leucine-rich receptor-like protein kinase family protein
Gh_Sca006797G01               TBP-ASSOCIATED FACTOR 6B

[0101] Table 4: RNA-Seq analysis reveals that 8 homeologous pairs of G. hirsutum genes are upregulated in both Acala Maxxa and DES 56 cultivars 48 hours post inoculation with Xcm strains MS14003 and AR81009 (Log2(fold change in FPKM) > 2 and p value < 0.05). Homeologous pairs were identified using genic synteny.


Gene             Homeolog     Annotation
Gh_A05G2012      Gh_D05G2256  Protein of unknown function DUF688
Gh_A06G0439      Gh_D06G0479  basic chitinase
Gh_A07G1129      Gh_D07G1229  Protein of unknown function (DUF1278)
Gh_A10G0257      Gh_D10G0257  Protein E6
Gh_A10G1075      Gh_D10G1437  Pectin lyase-like superfamily protein
Gh_A13G1467      Gh_D13G1816  pathogenesis-related 4

Example VI: Different strains of Xcm target distinct SWEET transporters in G. hirsutum.

[0102] SWEET sugar transporter genes are commonly targeted and upregulated by TAL effectors in Xanthomonas-plant interactions. Surprisingly, no SWEET genes were detected in the above list of conserved targets. However, of the 54 SWEET sugar transporter genes encoded by the G. hirsutum genome, three were upregulated greater than 4-fold in response to inoculation by one of the two Xcm strains (Fig. 11). Potential TAL effector binding sites were identified using the program TALEnt. MS14003 significantly induces the homeologs Gh_A04G0861 and Gh_D04G1360 and contains three TAL effectors predicted to bind within the 300 bp promoter sequences of at least one of these genes (Fig. 11a). In contrast, AR81009 significantly induces Gh_D12G1898 but not its homeolog Gh_A12G1747 (Fig. 11b). TAL14a, TAL14c, and TAL16b from AR81009 are all predicted to bind to the Gh_D12G1898 promoter; however, the latter two are also predicted to bind to the promoter of the homeolog Gh_A12G1747. We note that while Gh_A12G1747 did not pass the four-fold cutoff for gene induction, this gene is slightly induced in DES 56 compared to mock inoculation.
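The idea behind predicting TAL effector binding sites can be illustrated with a deliberately simplified scan that applies the commonly cited RVD-to-nucleotide preferences (HD for C, NI for A, NG for T, NN for G or A). This is not the scoring scheme of the TALEnt program named above; the function name, the mismatch rule, the treatment of unlisted RVDs, the forward-strand-only scan, and the RVD string and promoter sequence are all assumptions for illustration.

```python
# Commonly cited RVD base preferences; RVDs not listed are treated as matching any base.
RVD_CODE = {
    "HD": {"C"},
    "NI": {"A"},
    "NG": {"T"},
    "NN": {"G", "A"},
}


def find_ebe_candidates(promoter, rvds, max_mismatches=0, require_t0=True):
    """Scan the forward strand of `promoter` for sites compatible with `rvds`.

    Returns (start, site, mismatches) tuples. If require_t0 is True, the base
    immediately 5' of the site must be T, as is typical of TAL effector targets.
    """
    promoter = promoter.upper()
    n = len(rvds)
    hits = []
    for i in range(len(promoter) - n + 1):
        if require_t0 and (i == 0 or promoter[i - 1] != "T"):
            continue
        site = promoter[i:i + n]
        mismatches = sum(
            base not in RVD_CODE.get(rvd, set("ACGT"))
            for rvd, base in zip(rvds, site)
        )
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits


# Hypothetical five-repeat effector and toy promoter (real effectors here have 12 to 28 repeats).
toy_rvds = ["HD", "NI", "NG", "NN", "NG"]
toy_promoter = "GGTCATGTAACCATATCC"
strict_hits = find_ebe_candidates(toy_promoter, toy_rvds)
relaxed_hits = find_ebe_candidates(toy_promoter, toy_rvds, require_t0=False)
print(strict_hits, relaxed_hits)
```

Dedicated tools weight each RVD-nucleotide pairing and position rather than counting mismatches, so this sketch should be read only as a picture of the sequence-matching principle.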

Example VII

[0103] Cotton Bacterial Blight was considered controlled in the U.S. until an outbreak was observed during the 2011 growing season in Missouri, Mississippi and Arkansas. Until 2011, seed sterilization, breeding for resistant varieties, and farming techniques such as crop rotation and sterilizing equipment prevented the disease from becoming an economic concern. The number of counties reporting incidence of CBB has increased from 17 counties in 2011 to 77 counties in 2015. Herein we investigate the root of the re-emergence and identify several routes towards control of the disease.

[0104] When the disease was first recognized as re-emerging, several possible explanations were proposed, including: (1) a highly virulent race of the pathogen that had been introduced to the U.S.; (2) historical strains of Xcm that had evolved to overcome existing resistance; and (3) environmental conditions over the last several years that had been particularly conducive to the disease. Here, we present evidence that the re-emergence of CBB is not due to a large genetic change or race shift in the pathogen, as has been previously suggested. Rather, the re-emergence of the disease is likely due to large areas of susceptible cultivars being planted. The presented data do not rule out potential environmental conditions that may also have contributed to the re-emergence. In this context, environmental conditions include disease-conducive temperature and humidity as well as potentially contaminated seed or other agronomic practices that may have perpetuated spread of the disease outbreaks. Importantly, the presented data confirm the presence of resistance loci that could be deployed to prevent further spread of this disease. However, since many of the most popular farmer-preferred varieties lack these resistance traits, additional breeding or biotechnology strategies will be needed to maximize utility. Notably, the current Xcm isolates characterized in this study all originate from Mississippi cotton fields in 2014. During the 2015 and 2016 growing seasons, resistant cotton cultivars were observed in Texas with symptoms indicative of bacterial infection yet distinct from CBB. Additional work is underway to identify and characterize the causal agent(s) of these disease symptoms.

[0105] While resistant cotton cultivars were identified for all strains in this study, variability in symptom severity was observed for different strains when inoculated into susceptible cultivars. Two strains in particular, MS14003 and AR81009, have different effector profiles as well as different disease phenotypes. Comparative genomic analysis of the two pathogens revealed many differences that may contribute to the relative disease severity phenotypes. Similarly, transcriptomic analysis of two cultivars of G. hirsutum inoculated with these strains confirms that the genomic differences between the two strains result in a divergence in their molecular targets in the host.

[0106] Over the past decade, susceptibility genes have become targets for developing disease tolerant plants. These genes are typically highly induced during infection. Therefore, RNA-Seq of infected plants has become a preferred way to identify candidate susceptibility genes. Once identified, genome editing can be used to block induction. We report a homeologous pair of genes that are homologs of the MLO gene as being targeted by both Xcm strains in both cotton cultivars. This conservation makes the pair an excellent candidate for future biotechnology efforts. However, because the potential importance of this gene in cotton biology is unknown, the effect of disrupting this gene on cotton physiology must first be explored. The dual purpose of host susceptibility genes has been observed previously. For example, the rice Xa13 (also known as Os8N3 and OsSWEET11) gene is required for pollen development but is also targeted by a rice pathogen during infection. Xa13 is a member of the SWEET sugar transporters implicated in many pathosystems. In this case, the induction of Xa13 for pathogen susceptibility is mediated by a TAL effector. Of the 54 SWEET genes in the G. hirsutum genome, three are significantly upregulated during Xcm infection. In contrast to MLO, no single SWEET gene was induced by both pathogen strains in both hosts. Analysis of SWEET gene expression after inoculation revealed a context for polyploidy in the G. hirsutum-Xcm pathosystem. We observed a difference in induction between the Gh_A12G1747 and Gh_D12G1898 SWEET genes. Future research may investigate the diploid ancestors of tetraploid cotton to further explore the evolution of host and pathogen in the context of ploidy events.

[0107] Multiple putative TAL effector binding sites were identified within each up-regulated SWEET promoter. These observations suggest that TAL14b, TAL28a and TAL28b from MS14003 may work independently or in concert to induce the homeologs Gh_A04G0861 and Gh_D04G1360. Further, TAL14a from AR81009 is likely responsible for the upregulation of Gh_D12G1898. Whether additional TAL effectors are involved in these responses is not clear. It is possible that not all the TAL effectors are expressed. Similarly, genome organization in the host, such as histone modifications or other epigenetic regulation, may be affecting these interactions. Future research will investigate these mechanisms further. However, these experiments will be difficult, as most Xcm strains are amenable to neither conjugation nor electroporation.

[0108] Collectively, the data presented here suggest that the widespread planting of CBB-susceptible cultivars has contributed to the re-emergence of CBB in the southern U.S. It is possible that a pathogen reservoir was maintained in cotton fields below the level of detection due to resistant cultivars planted in the 1990s and early 2000s. Alternatively, the pathogen may have persisted on an alternate host or been brought in on contaminated seed, as has previously been suggested. Regardless of the cause of the re-emergence, this work has identified several possible routes towards resistance, including use of existing and effective resistance loci and potential disruption of the induction of susceptibility through genome editing. The latter is an attractive strategy in part because of recent progress in genome editing. In summary, within a relatively short time frame, through the deployment of modern molecular and genomic techniques, we were able to identify the cause of the re-emergence of cotton bacterial blight and generate data that can now be rapidly translated to effective disease control strategies.

[0109] In order to disrupt pathogen-induced expression of susceptibility genes in cotton, CRISPR-Cas9 or other genome editing nucleases will be used to mutate the effector binding element (EBE) in the promoters of TAL effector induced susceptibility genes (e.g., SWEETs) or the gene coding regions of non-TAL effector induced susceptibility genes (e.g., MLO). Guide RNAs (gRNAs) will be designed to bind DNA sequences near the target sequences while avoiding other sites within the cotton genome. Upon binding, the nuclease cuts the DNA. Mutations are introduced either via the cotton plant's endogenous DNA repair mechanisms or by supplying a repair template containing the desired DNA sequence. The latter can be achieved through homologous recombination if homology arms are supplied on the repair template, or through non-homologous end joining if multiple gRNAs are included to cut on either side of the target DNA sequence. Plants will be screened for mutations that abolish pathogen-induced gene expression while maintaining normal cotton physiology and development.
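The first step of gRNA design described above, locating candidate target sites next to an EBE, can be sketched as follows. The sketch assumes a Cas9 with a 20-nt protospacer, an NGG PAM, and a cut three bases 5' of the PAM (the usual description of Streptococcus pyogenes Cas9); the disclosure does not limit itself to that nuclease. The function names, coordinates, and sequence are hypothetical, and the genome-wide off-target screen mentioned in the text is not implemented.

```python
import re


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_guides(promoter: str, ebe_start: int, ebe_end: int, guide_len: int = 20):
    """List (strand, protospacer, cut_position) for NGG-adjacent protospacers whose
    predicted cut falls strictly inside the EBE.

    ebe_start is inclusive and ebe_end exclusive, 0-based on the forward strand.
    cut_position is the forward-strand index of the base just 3' of the cut.
    """
    promoter = promoter.upper()
    n = len(promoter)
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)  # lookahead allows overlaps
    guides = []
    for strand, seq in (("+", promoter), ("-", revcomp(promoter))):
        for match in pattern.finditer(seq):
            cut = match.start() + guide_len - 3      # boundary 3 nt upstream of the PAM
            cut_fwd = cut if strand == "+" else n - cut
            if ebe_start < cut_fwd < ebe_end:
                guides.append((strand, match.group(1), cut_fwd))
    return guides


# Toy promoter with a single NGG site; the EBE coordinates are invented for the example.
toy_promoter = "A" * 10 + "ACGT" * 5 + "TGG" + "A" * 10
hits = candidate_guides(toy_promoter, ebe_start=24, ebe_end=30)
print(hits)
```

Candidates returned this way would still need to be checked against the rest of the cotton genome, including the homeologous copy on the other sub-genome, before use.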

Example VIII

[0110] Cultivation of upland cotton (Gossypium hirsutum) is economically hindered by infection from a wide range of microbial pathogens. These pathogens range from fungal pathogens such as Fusarium oxysporum f.sp. vasinfectum (Fusarium wilt), Verticillium dahliae (Verticillium wilt), and Leveillula taurica (Powdery Mildew) to bacterial pathogens such as Xanthomonas citri pv. malvacearum (Cotton Bacterial Blight) and Pseudomonas syringae to viral pathogens such as Cotton Leaf Curl Virus. These pathogens decrease yield through a number of effects including but not restricted to wilting, root rot, decreased photosynthetic activity, stunted growth and development, stunted boll formation, boll rot, decreased seed set, seedling pre-emergent damping-off, seedling post-emergent damping-off, black arm on stems and petioles, water-soaked lesions on above ground tissue, and systemic vein infection.

[0111] Diverse microbial pathogens (fungal, oomycete, bacterial and viral) target an overlapping set of susceptibility factors in the host to promote disease. Previously described susceptibility genes encompass a broad range of mechanisms including facilitating the entry of the pathogen into the host, down-regulating immune responses and promoting pathogen proliferation in the host (van Schie C. and Takken F., 2014). Mutations in susceptibility genes are often used to confer durable broad spectrum resistance to pathogens. This is in contrast to Resistance genes that usually only provide resistance to a particular race of a pathogen. While many classes of susceptibility genes have been previously described, it is currently not possible to predict, a priori, which specific genes will be targeted by a given pathogen within a specific plant host. Thus, experimental investigation is required.

[0112] A class of sugar transporters (SWEETs) has been identified as susceptibility genes in many plants in response to Xanthomonas sp. infection. The mechanism of induction of these genes has been identified. Transcription Activator-Like Effectors (TALEs) bind in a sequence-specific manner to the promoter of these genes and up-regulate their expression. This in turn promotes the proliferation of the pathogen. This effector-susceptibility gene interaction has been disrupted in rice by editing the promoter of SWEET genes so that the TAL effector can no longer bind to the promoter and up-regulate gene expression. This in turn conserves the normal expression of the gene while limiting the pathogen's ability to use it to promote susceptibility.

[0113] The G. hirsutum genome encodes 52 annotated SWEET genes. We describe 4 genes that are specifically upregulated by Xanthomonas citri pv. malvacearum. Mutations introduced into the promoters of these genes will provide disease resistance. These genes were identified from an RNA-Seq experiment in which these genes were up-regulated 48 hours after infection with Xcm and not by the mock treatment (Table).

[0114] In this experiment we also identified a pair of up-regulated genes with homology to known Mildew Locus O (MLO) genes. Mildew Locus O (MLO) was first identified in the 1940s as a susceptibility gene in barley, mutation of which conferred broad spectrum resistance to powdery mildew (Freisleben R and Lein A, 1942). To date, at least 16 mutants of MLO that confer resistance to powdery mildew have been identified. These were caused by natural mutations, chemical mutations, TILLING, or RNAi technologies (Kusch S and Panstruga R, 2017). However, this locus has also been found to confer resistance to oomycete and Xanthomonas sp. infection (Kim D. and Hwang B., 2012). Thus, our discovery suggests that mutation of cotton MLO will result in broad spectrum resistance.

[0115] The disclosure being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the disclosure, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.