Emergence and diversification of a highly invasive chestnut pathogen lineage across south-eastern Europe

Invasive microbial species constitute a major threat to biodiversity, agricultural production and human health. Invasions are often dominated by one or a small number of genotypes, yet the underlying factors driving invasions are poorly understood. A prime example of a successful global invasion is the recent outbreak of the chestnut blight fungus Cryphonectria parasitica. Native to East Asia, the pathogen colonized North America and Europe during the first half of the 20th century. After decimating the American chestnut, the pathogen now threatens European chestnut trees. To unravel the mechanisms underpinning the most recent invasion of south-eastern Europe over the past decades, we sequenced genomes of 188 predominantly European strains. Genotypes outside of the invasion zone showed high levels of diversity with evidence for frequent and ongoing recombination. The invasive lineage emerged from the highly diverse European genotype pool rather than a secondary introduction from Asia. The expansion across south-eastern Europe was mostly clonal and shows distinct signs of mutation accumulation. The lineage is also dominated by a single mating type, suggesting a fitness advantage to switch to asexual reproduction. However, we show experimentally that the lineage retained the ability to undergo mating, consistent with the low degree of recombination detected among strains within the lineage and possibly closely related strains. Our findings show how an intermediary, highly diverse bridgehead population gave rise to an invasive, largely clonally expanding pathogen.

[...] coloured hexagons represent sampling locations where EU-12 (mating type MAT-1) isolates were found (marked as "S12 lineage"). First observations of chestnut blight in the corresponding countries and regions are marked in a colour scheme according to decade. A tree with individual isolate labelling is shown in Supp. Fig. 3.

We performed a SplitsTree phylogenetic network analysis to account for reticulation caused by recombination. The network showed high diversification, with both long branches and reticulation (Figure 2, Supp. Fig. 4). The PHI test revealed significant evidence for recombination (p < 0.0001). Despite the high level of genetic diversity, we found no evidence for geographic structure. Moreover, we found no clustering of isolates belonging to the same vegetative compatibility type, with the exception of some EU-01, EU-02 and EU-12 (S12) genotypes from the Balkans. Nearly all C. parasitica isolates representing the S12 lineage showed almost identical genotypes and tight clustering. All of the most tightly clustered S12 genotypes were of mating type MAT-1 (n = 104). Consistent with analyses by Milgroom et al. (2008), this group represents the invasive S12 lineage at the origin of the expansion of C. parasitica across south-eastern Europe. Additionally, the phylogenetic network revealed closely related but not identical S12 genotypes of mating type MAT-2 (n = 7, Figure 2). Hence, S12 outbreak strains of MAT-2 connect the nearly uniform cluster of S12 MAT-1 strains with the remaining genetic diversity of the major European subgroup of C. parasitica. The S12 cluster was furthermore connected with the remaining genotypes of the major clade by two EU-12 isolates from Bosnia (M1808 with MAT-1) and Georgia (MAK23 with MAT-2).

[...] The highlighted branches represent the most abundant vegetative compatibility types (colour scheme matching Figure 1A). Isolates belonging to the S12 outbreak lineage (EU-12; mating type MAT-1, n = 104) are marked with a purple hexagon. S12 isolates of mating type MAT-2 are highlighted in light-purple. Additional EU-12 isolates not belonging to the S12 lineage are highlighted in blue with information on the country of origin.
Genetic donors of the S12 lineage as inferred by fineSTRUCTURE (Figure 3, Table 1, Supp. Fig. 5) are marked with red squares.

Potential S12 founder populations
The maximum likelihood phylogenetic tree and the SplitsTree network revealed that the invasive S12 lineage has closely related genotypes occurring in Europe. Thus, to dissect the genetic origin of S12, we performed a co-ancestry matrix analysis using fineSTRUCTURE considering all isolates of the major US/European subgroup, including the S12 lineage (n = 185; Figure 1A). The averaged co-ancestry matrix revealed no direct ancestors of the invasive S12 lineage among the major clade of different vegetative compatibility types. However, we found an association with a coefficient of 24.9-32.3 between the recipient S12 genotypes and European donors from different locations (Figure 3, [...]

To infer potential invasion routes of the S12 outbreak lineage, we investigated intra-lineage genetic diversity across south-eastern Europe. We focused only on S12 isolates of mating type MAT-1 to delimit the closest genotypes contributing to the outbreak (n = 104; Figure 2). The closely related genotypes segregated 468 high-confidence SNPs across the genome. The genetic structure assessed by a principal component analysis showed loose clustering of genotypes across south-eastern Europe (Figure 4A and B, Supp. Fig. 6). We assigned genotypes to five regions: Italy, Northern Balkans, Central Balkans, Greece/Turkey and Georgia (Figure 4A). Italy, Northern and Central Balkans, as well as Georgia harbored mainly genotypes of two dominant clusters. In contrast, the Greece/Turkey region contained genotypes of the two dominant clusters but also a broad diversity of further genotypes. We analyzed evidence for reticulation in the phylogenetic relationships among genotypes but found a star-like structure.
We found minor evidence for reticulation among Central Balkans genotypes (Figure 4C). Consistent with the phylogenetic network pattern, we found significant evidence for recombination within the S12 lineage (PHI test; p = 0.0035). We tested experimentally whether S12 mating type MAT-1 isolates were still able to reproduce sexually. We confirmed outcrossing of isolates of opposite mating type within the S12 lineage by pairing isolates from Molliq (Kosovo) and Nebrodi (southern Italy) (Fig. 4D). Mating pairs from Molliq and Nebrodi grew numerous perithecia, which are the fruiting bodies specific to sexual reproduction (Fig. 4E). Pairings of Bosnian isolates showed no perithecia formation. Using molecular mating type assays, we recovered both mating types among the ascospores produced from successful matings [...]

[...] Principal component analysis (PCA) and C) SplitsTree of the S12 outbreak isolates. Symbols and colors are as in A). D) Scheme of successful mating pairs of S12 mating type MAT-1 isolates crossed with isolates from the opposite mating type, of the same geographic origin. Symbols are as in C). E) Photographic images of sexual C. parasitica fruiting bodies (i.e. perithecia) emerging from crosses of S12 mating type MAT-1 isolates with isolates of the opposite mating type after five months of incubation under controlled conditions. Left: Perithecia embedded in a yellow-orange stromatic tissue. Right: Cross-section of perithecia and chestnut bark. Flask-shaped structures with a long cylindrical neck develop in yellow-orange stromatic tissue and are embedded in the bark (except for the upper part). The ascospores are formed in sac-like structures (asci) in the basal part of the perithecium. When mature, the ascospores are actively ejected into the air through a small opening (ostiole) at the end of the perithecial neck.

Polymorphism and allele frequency spectra within the outbreak lineage
To gain insights into evolutionary forces shaping polymorphism in the outbreak S12 mating type MAT-1 lineage versus non-S12 populations, we first analyzed allele frequencies across the genome in both groups (Figure 5A). The S12 lineage segregated virtually no intermediate allele frequencies in the range of 0.05-0.95. In contrast, the non-S12 genotypes showed overall a wide spectrum of allele frequencies across the genome. Second, we analyzed the predicted impact on protein functions in the S12 lineage and non-S12 populations (Figure 5B). We found 4 highly and 94 moderately deleterious SNPs within S12 in contrast to 29 highly and 773 moderately deleterious mutations in non-S12 groups (Figure 5B). Of the four high-impact variants in the S12 lineage, three were classified as stop-gain mutations and one as a splice acceptor variant (insertion variant). Two of these high-impact mutations affect proteins of the major facilitator superfamily, as well as a protein containing an LCCL domain and an ecdysteroid kinase. Non-S12 populations showed an over-representation of low-frequency high-impact mutations (Figure 5B). This is consistent with purifying selection reducing the frequency of these mutations due to fitness costs. Within the S12 lineage nearly all segregating mutations were at very low frequency. We found only modifier (i.e. nearly neutral) mutations rising to higher frequency within the lineage.

[...] analyses revealed frequent and ongoing in situ admixture in Europe. Thus, vegetative compatibility type diversity does not necessarily underpin population admixture frequency and genetic diversity in sexually recombining populations. Our findings show that in asexually reproducing populations, such as in the S12 lineage, genotypes tend to cluster according to vegetative compatibility types.
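
The allele frequency comparison reported for Figure 5A can be illustrated with a minimal Python sketch. It assumes haploid genotypes coded as 0 (reference), 1 (alternative) or None (missing), one list per isolate; the function names and the toy data are hypothetical and do not reproduce the authors' pipeline or their data. The sketch computes per-site alternative allele frequencies, the share of sites falling in the intermediate 0.05-0.95 range, and the number of singleton sites.

```python
def allele_frequencies(genotypes):
    """Alternative allele frequency per site for haploid 0/1 calls (None = missing)."""
    freqs = []
    for site in zip(*genotypes):  # iterate over sites (columns)
        calls = [c for c in site if c is not None]
        freqs.append(sum(calls) / len(calls) if calls else None)
    return freqs


def fraction_intermediate(freqs, low=0.05, high=0.95):
    """Share of called sites whose allele frequency lies within [low, high]."""
    called = [f for f in freqs if f is not None]
    if not called:
        return 0.0
    return sum(low <= f <= high for f in called) / len(called)


def count_singletons(genotypes):
    """Number of sites where the minor allele is carried by exactly one isolate."""
    n = 0
    for site in zip(*genotypes):
        calls = [c for c in site if c is not None]
        alt = sum(calls)
        if min(alt, len(calls) - alt) == 1:
            n += 1
    return n


# Toy data (hypothetical): a clonal group of 40 isolates carrying three
# private mutations, and a small recombining group of 4 isolates.
clonal = [[0, 0, 0] for _ in range(40)]
clonal[0][0] = 1
clonal[5][1] = 1
clonal[9][2] = 1
diverse = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]

for name, group in (("clonal", clonal), ("diverse", diverse)):
    freqs = allele_frequencies(group)
    print(name, fraction_intermediate(freqs), count_singletons(group))
```

In this toy case the clonal group has no sites at intermediate frequency and every site is a singleton, which is the qualitative pattern described for the S12 lineage; the recombining group shows the opposite.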

Emergence of an invasive lineage from a European bridgehead
The invasive lineage S12 most likely arose from existing genotypes established in Europe. We identified a series of genotypes closely related to the S12 lineage in Bosnia, Croatia, Georgia and southern Switzerland. Strikingly, the closest genotypes to the dominant S12 MAT-1 were S12 MAT-2 isolates found in Bosnia, Kosovo and southern Italy. Analyses based on a co-ancestry matrix identified a group of more distantly related genotypes from Bosnia, Croatia, Switzerland and Georgia as having made the strongest genetic contributions to the S12 lineage. This shows that introductions from outside of Europe are unlikely to explain the emergence of S12. Furthermore, the emergence of S12 was accompanied by a striking evolutionary transition from mixed mating type populations to single mating type outbreak populations. Human activity may have contributed to the shift towards single mating type populations. Shipments of infected chestnut seedlings from Northern Italy and other trading activities could have disseminated the invasive lineage further south. This would have exposed the pathogen to the geographically more fragmented chestnut forests typically found in south-eastern Europe where asexuality or selfing may be advantageous. Although C. parasitica is able to produce asexual conidia in large quantities, these spores are thought to be mainly splash dispersed by rain over short distances (Griffin, 1986). Even accounting for occasional dispersal by birds or insects (Heald & Studhalter, 1914), conidia dispersal is unlikely to contribute substantially to the colonization of new areas.

Despite the loss of a mating type in the S12 lineage, we found genome-wide evidence for reticulation indicating at least low levels of recombination. If mating in C. parasitica follows the canonical process found in many ascomycetes, isolates of opposite mating type are required. Hence, S12 isolates of mating type MAT-1 may sporadically mate with rare S12 isolates of mating type MAT-2, which are comparatively more diverse. The emergence of the opposite mating type at low frequency could be the result of recombination with other genotypes and subsequent backcrossing. Combined with experimental evidence, we show that the dominant S12 mating type MAT-1 has retained the ability for sexual reproduction. Furthermore, in Bosnia, Croatia, Italy (Sicily) and Turkey the S12 lineage co-exists with other genotypes (i.e. vegetative compatibility types EU-01 and EU-02) of both mating types, potentially enabling sexual recombination and diversification in situ. The invasive S12 lineage was likely pre-adapted to the south-eastern European niche as we traced the origins to a likely Italian bridgehead population. Niche availability and benefits associated with asexual reproduction to colonize new areas may have predisposed the European C. parasitica bridgehead population to produce a highly invasive lineage.

Expansion and mutation accumulation within the invasive S12 lineage
The MAT-1 S12 lineage diversified largely through mutation accumulation as nearly all high-confidence SNPs were identified as singletons. Mutation accumulation in the absence of substantial recombination resulted in star-like phylogenetic relationships. Analyses of allele frequency spectra suggested that the broader European C. parasitica populations efficiently removed the most deleterious mutations through purifying selection. In contrast, the S12 lineage shows strong skews towards very low minor allele frequencies of all mutation categories. Interestingly, we found a broader spread in allele frequencies for nearly neutral mutations in the S12 lineage. This suggests that despite the largely clonal population structure, deleterious mutations can still be removed through low levels of recombination and purifying selection. Using accumulated mutations as markers to retrace the spatial expansion of the invasive S12 lineage, we found no indication for a step-wise geographic expansion along potential invasion routes. A lack of genetic clustering across south-eastern Europe may be a consequence of high levels of gene flow frequently introducing new genotypes over large distances. However, the lack of geographic structure could also originate from substantial population bottlenecks during the spread of S12 across south-eastern Europe. Finally, the largely clonal lineage may also become exposed to processes such as Muller's Ratchet fixing deleterious mutations over time (Felsenstein, 1978).

[...] Macedonia, Serbia, Slovenia, Switzerland, and Turkey (Fig. 1B, Supp. Table 1, Supp. Fig. 7). The six other isolates were from South Korea (2) and North America (4) (Supp. [...] Fig. 2).

Inference of S12 donor populations
We generated an averaged co-ancestry matrix as inferred by fineSTRUCTURE v2.1.3 (Lawson et al., 2012). The software uses a Markov-Chain-Monte-Carlo (MCMC) based algorithm to infer ancestral contributions based on patterns of haplotype similarity. We ran the fineSTRUCTURE pipeline in 'automatic mode', with 500 Expectation-Maximization (EM) and 300'000 MCMC iterations, 400'000 maximization steps to infer the best tree and with ploidy set to 1.
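
As a hedged illustration of how donors might be ranked once a co-ancestry matrix is available, the Python sketch below averages the coefficients received by a set of recipient isolates from each non-recipient donor and sorts donors by that mean. This is a post-processing sketch only and not the fineSTRUCTURE algorithm; the nested-dictionary layout, the function name `rank_donors`, the isolate labels and all values are assumptions made for illustration.

```python
def rank_donors(coancestry, recipients, top=3):
    """Rank donors by their mean co-ancestry coefficient towards the recipients.

    coancestry[r][d] is assumed to hold the coefficient for recipient r
    receiving from donor d (hypothetical layout, not a fineSTRUCTURE format).
    """
    recipient_set = set(recipients)
    donors = set()
    for r in recipients:
        # Only isolates outside the recipient group count as donors.
        donors.update(d for d in coancestry[r] if d not in recipient_set)

    mean_by_donor = {}
    for d in donors:
        values = [coancestry[r][d] for r in recipients if d in coancestry[r]]
        mean_by_donor[d] = sum(values) / len(values)

    ranked = sorted(mean_by_donor.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]


# Toy matrix with made-up labels and values.
coancestry = {
    "s12_a": {"s12_b": 90.0, "donor_x": 30.0, "donor_y": 10.0, "donor_z": 2.0},
    "s12_b": {"s12_a": 90.0, "donor_x": 26.0, "donor_y": 14.0, "donor_z": 2.0},
}

top_donors = rank_donors(coancestry, ["s12_a", "s12_b"], top=2)
print(top_donors)
```

Excluding the recipients themselves from the donor set keeps the very high within-lineage coefficients from masking contributions from outside the lineage.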

Mating experiments
The ability of C. parasitica isolates belonging to S12 (of mating type MAT-1) to outcross with isolates of the opposite mating type was assessed in an inoculation experiment. For this, we randomly selected four S12 and one non-S12 isolate with mating type MAT-1, as well as five isolates of [...] Scientific, Waltham MA, USA). All single ascospore cultures were screened for mating types by performing a multiplex PCR following the protocol described in Cornejo [...]

Table 1: Donor isolates for S12 identified using fineSTRUCTURE. The geographic origin (country and population), vegetative compatibility (vc) type and mating type of donors contributing to genotypes of the invasive S12 lineage are given. See Figure 3 for the corresponding co-ancestry matrix.