A new PacBio genome sequence of an Australian Pyrenophora tritici-repentis race 1 isolate

Objectives The necrotrophic fungal pathogen Pyrenophora tritici-repentis (Ptr) is the causal agent of tan spot, a major disease of wheat. We have generated a new genome resource for an Australian Ptr race 1 isolate, V1, to support comparative ‘omics analyses. In particular, the V1 Pacific Biosciences (PacBio) long-read sequence assembly was generated to confirm the stability of large-scale genome rearrangements of the Australian race 1 isolate M4 when compared to the North American race 1 isolate Pt-1C-BFP. Results Over 1.3 million reads were sequenced on a PacBio Sequel single-molecule real-time (SMRT) cell to yield 11.4 Gb for the genome assembly of V1 (285X coverage), with median and maximum read lengths of 8959 bp and 72,292 bp respectively. The V1 genome was assembled into 33 contiguous sequences with a total length of 40.4 Mb and a GC content of 50.44%. A total of 14,050 protein coding genes were predicted and annotated for V1. Of these, 11,519 genes were orthologous to both Pt-1C-BFP and M4. Whole genome alignment of the Australian long-read assemblies (V1 to M4) confirmed previously identified large-scale genome rearrangements between M4 and Pt-1C-BFP and revealed small-scale variations, which included a sequence break within a race-specific region for ToxA, a well-known necrotrophic effector gene.


Introduction
The necrotrophic fungal pathogen Pyrenophora tritici-repentis (Ptr) is the causal agent of tan (or yellow) spot disease of wheat (Triticum aestivum), which has a significant economic impact on the grain industry worldwide [1]. Ptr is an ascomycete within the order Pleosporales, which also contains other important crop pathogens [2,3]. The production of necrotrophic host-specific effectors contributes to the pathogenicity of this fungus, and three Ptr effectors have been described to date. ToxA and ToxB are both well-characterised small effector proteins that produce necrosis and chlorosis symptoms respectively [4,5], while ToxC, which also causes chlorosis, remains to be identified and may be the product of a secondary metabolite gene cluster [6].
The ToxA gene, believed to have been horizontally transferred to Ptr from Parastagonospora nodorum [7], can occupy different loci in different isolates and races of Ptr [8]. It has been proposed that this type of translocation could be the result of gene proximity to a chromosomal break point [9]. Recently, we confirmed large-scale chromosomal rearrangements/fusions between two race 1 isolates sourced from North America (Pt-1C-BFP) [10] and Australia (M4) [11]. We therefore undertook long-read sequencing, on the Pacific Biosciences (PacBio) Sequel system, of a second Australian Ptr isolate (V1) collected from a different geographic state. The resulting whole genome comparison was needed to confirm these larger rearrangements and to examine any further genomic variations. Furthermore, a high quality PacBio genome assembly enables the characterization of predicted

Isolate collection and sequencing
The pathogenic isolate V1 was isolated from tan spot-infected leaves collected from Horsham, Victoria, Australia in 2015. V1 was cultured in vitro from a single spore [12].
V1 genomic DNA was extracted from 3-day-old mycelia grown in vitro in Fries 3 liquid medium, using a Bio-Sprint 15 DNA Plant Kit (Qiagen, Hilden, Germany) and automated workstation according to the manufacturer's instructions [11]. DNA was further treated with 50 μg/ml of RNase enzyme (Qiagen, Hilden, Germany) for 1 h, followed by phenol/chloroform extraction, precipitation with sodium acetate and ethanol, and finally resuspension in TE buffer [11].
The V1 genome was sequenced by Novogene Co., Ltd (Hong Kong) using PacBio Sequel single-molecule real-time (SMRT) Cell technology to yield 11.4 Gb (1,319,569 long reads) at 283X coverage. The V1 genome was also sequenced by Novogene Co., Ltd (Hong Kong) using Illumina HiSeq 150 bp paired-end (PE) reads to yield 3.2 Gb at 80X coverage. The Phred quality score distribution of the Illumina reads was greater than 30, and reads containing adaptors were removed.
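The depth-of-coverage figures follow from dividing sequencing yield by genome size. The sketch below illustrates that arithmetic using the rounded yields quoted above and an assumed genome size of about 40 Mb. The function and variable names are illustrative, and because the inputs are rounded, the PacBio result only approximates the reported depth.

```python
def depth_of_coverage(total_bases: float, genome_size: float) -> float:
    """Mean sequencing depth = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Rounded figures from the text; exact yields are not given.
pacbio_yield_bp = 11.4e9    # 11.4 Gb of PacBio long reads
illumina_yield_bp = 3.2e9   # 3.2 Gb of Illumina paired-end reads
genome_size_bp = 40e6       # ASSUMPTION: genome size of about 40 Mb

pacbio_depth = depth_of_coverage(pacbio_yield_bp, genome_size_bp)
illumina_depth = depth_of_coverage(illumina_yield_bp, genome_size_bp)

print(f"PacBio depth:   ~{pacbio_depth:.0f}X")
print(f"Illumina depth: ~{illumina_depth:.0f}X")
```

Using the assembled length instead of a rounded genome size estimate changes the PacBio depth by a few X, which is why depth values are best read as approximate.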

Plant materials and pathogenicity assays
To assess pathogenicity and race classification, differential wheat genotypes that differ in their effector sensitivities were used for inoculation [5,13]. The wheat lines used were Glenlea and BG261 (both ToxA-sensitive), 6B662 (ToxB-sensitive) and 6B365 (ToxC-sensitive). Two-week-old wheat (Triticum aestivum L.) seedlings were spore-inoculated by spraying the whole plants evenly with approximately 2000 conidia/ml and grown under controlled growth conditions [14]. The second leaves were harvested 7 days post-inoculation, visually inspected for symptoms and photographed. Infection experiments were repeated twice with four replicate plants per wheat line to demonstrate the pathogenicity and race classification.

Genome assembly of V1 and comparative analysis
PacBio sequence data were error-corrected and assembled using Canu 1.8 (linux-amd64) [15], guided by a genome size of 40 Mb. Illumina PE reads were quality trimmed for random hexamer primers at the 5′ read end using Trimmomatic v0.22 [16]. The high quality trimmed Illumina reads were aligned to the Canu genome assembly using BWA 0.7.14-r1138 [17] and filtered for concordant PE read alignments using samtools 0.1.19-96b5f2294a [18]. The Canu genome assembly was then corrected with the high quality Illumina alignments using Pilon 1.2 [19] to generate a final polished V1 sequence assembly with SNP and INDEL corrections.
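The concordant-pair filtering step can be illustrated with the SAM flag field, where bit 0x2 marks a read whose mates the aligner considers properly paired. The text does not state which samtools options were used, so the sketch below is an assumption about the logic rather than the authors' command. The function names and toy records are hypothetical.

```python
# SAM flag bits defined by the SAM format specification.
PAIRED = 0x1         # template has multiple segments (paired-end read)
PROPER_PAIR = 0x2    # each segment properly aligned according to the aligner
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def is_concordant(flag: int) -> bool:
    """True if the read is paired, properly paired, and both mates are mapped."""
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return False
    return bool(flag & PAIRED) and bool(flag & PROPER_PAIR)


def filter_concordant(sam_lines):
    """Yield header lines unchanged and only concordant alignment records."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line
            continue
        flag = int(line.split("\t")[1])  # column 2 of a SAM record is FLAG
        if is_concordant(flag):
            yield line


# Toy SAM records (hypothetical read and contig names; truncated columns).
sample = [
    "@SQ\tSN:contig1\tLN:1000",
    "read1\t99\tcontig1\t100\t60\t150M",   # paired, proper pair, first mate
    "read1\t147\tcontig1\t300\t60\t150M",  # paired, proper pair, second mate
    "read2\t65\tcontig1\t500\t60\t150M",   # paired but not a proper pair
    "read3\t77\t*\t0\t0\t*",               # read and mate both unmapped
]

kept = list(filter_concordant(sample))
print(f"kept {len(kept) - 1} of {len(sample) - 1} alignment records")
```

In practice a filter like this is usually applied with the aligner's or samtools' own flag options rather than custom code. The sketch only makes the selection criterion explicit.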

Results
Plant infection assays confirmed V1 as a race 1 isolate (producing ToxA and ToxC) by the presence of necrosis and chlorosis symptoms on the differential wheat lines Glenlea and 6B365 respectively, and the absence of tan spot symptoms on Auburn and 6B662 (Fig. 1a). Furthermore, the chlorosis symptoms induced on the tan spot-susceptible Australian commercial wheat cultivar Yitpi also confirmed V1 pathogenicity (Fig. 1b).
The genomic sequence of V1 was assembled into 33 contiguous sequences with a total length of 40,408,077 bp and an N50 of 3,421,861 bp. The mean and longest contig sizes were 1,224,487 bp and 9,664,470 bp respectively. V1 contig length statistics showed an improvement in comparison to M4 and Pt-1C-BFP (Table 1). A total of 14,050 protein coding genes were annotated for V1, which included the major effector ToxA (PtrV1_13859) positioned on contig12: 1,348,464-1,349,050. A total of 10,398 genes were orthologous to Pt-1C-BFP and M4. The V1 annotated genome has been deposited with the National Center for Biotechnology Information (NCBI) GenBank under the accession SAXQ00000000.
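For readers unfamiliar with the contig statistics, N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The sketch below shows the calculation on a toy list of lengths (the individual V1 contig lengths are not listed here) and checks the reported mean contig size from the total length and contig count.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum reaches half the total.
        if running * 2 >= total:
            return length


# Toy contig lengths, for illustration only (not the V1 contigs).
toy_contigs = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_contigs)

# Mean contig size from the totals reported in the text.
total_length_bp = 40_408_077
contig_count = 33
mean_contig_bp = total_length_bp / contig_count

print(f"toy N50: {toy_n50}")
print(f"mean contig size: {mean_contig_bp:,.0f} bp")
```

For the toy list the total is 30, so the running sum reaches half (15) at the second contig and the N50 is 8.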

Discussion
The V1 PacBio assembly showed improved contig length statistics when compared to the recent M4 PacBio RSII (six SMRT cells) assembly, possibly due to the higher depth of coverage obtained from the PacBio Sequel system for overlap assembly.
The sequence comparison of V1 to both M4 and Pt-1C-BFP confirmed the chromosomal rearrangements for M4 chr1, chr2, chr3 and chr7 in the Australian isolates, which appear stable despite the isolates being collected from different geographic states. Smaller scale differences between V1 and M4 were, however, detected in chr4, chr6, chr7 and chr8. V1 also had a sequence break upstream of ToxA, with significant sequence variation between Pt-1C-BFP and M4 in both length and complexity that was not resolved by assembly. These larger and smaller scale rearrangements can impact gene clusters, especially when they are proximal to complex sub-telomeric regions and breakpoints. This resource will therefore be useful for future 'omics experiments and comparative Ptr genomic analyses.

Limitations
Although all methods have been made as consistent as possible for comparative analyses, this analysis used the database, software and PacBio sequencing versions available at the time, which may be updated in the future. The comparison of two Australian long-read assemblies is only an indication of potential genome stability in Australia.