Escherichia coli is a facultative anaerobic bacterium that is widely distributed in the biosphere. E. coli normally colonizes the lower intestine of animals and humans (1). However, some pathotypes, such as enterohemorrhagic E. coli (EHEC), enterotoxigenic E. coli (ETEC), enteropathogenic E. coli (EPEC), and Shiga toxin-producing E. coli (STEC), can cause foodborne illness in humans.
E. coli K_EC180 was isolated from swine feces collected at a livestock farm in Haenam-gun, Jeollanam-do, Korea. The isolate was streaked onto Luria-Bertani (LB) agar and incubated at 37°C for 24 h. A suspected colony from the LB agar was inoculated into LB broth and incubated at 37°C for 24 h. To obtain the complete genome, E. coli K_EC180 was sequenced on the PacBio RS II platform (Pacific Biosciences, Menlo Park, CA, USA) at Insilicogen (Yongin, Korea) and on the Illumina NextSeq 500 platform (Illumina, San Diego, CA, USA) at LabGenomics (Seongnam, Korea). Genomic DNA for PacBio and Illumina sequencing was extracted using the MagAttract HMW DNA Kit (QIAGEN) and the NucleoSpin® Microbial DNA Kit (TAKARA) according to the manufacturers' instructions. Libraries were prepared using the SMRTbell™ Template Prep Kit 1.0 (Pacific Biosciences) for PacBio sequencing and the TruSeq DNA Sample Preparation Kit (Illumina) for Illumina sequencing according to the manufacturers' instructions. PacBio sequencing yielded 145,423 long reads totaling 1,131,537,370 bp after filtering, and Illumina sequencing yielded 9,199,306 paired-end reads totaling 1,389,095,206 bp. De novo assembly was performed using the hierarchical genome assembly process (HGAP v2.3.0) workflow (Chin et al., 2013), and the assembly was polished using Quiver. Subsequently, the Illumina NextSeq reads were aligned to the PacBio RS II assembly using Burrows-Wheeler Aligner (BWA)-MEM v0.7.17-r1188, and remaining errors were corrected using Pilon v1.23 (2, 3). The quality of the assembly and the validity of the final genome were assessed using the Quality Assessment Tool for Genome Assemblies (QUAST) v5.0.2 and Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.0.2 (4, 5). Open reading frames (ORFs) and RNA genes of E. coli K_EC180 were predicted and functionally annotated using Prokka v1.14.5 (6) and Rapid Annotation using Subsystem Technology (RAST) v2.0 (7).
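The short-read polishing step described above can be sketched as a sequence of command lines. The Python snippet below only builds the argument lists and does not run anything. The file names, the thread count, the use of samtools for sorting and indexing, and the Pilon jar name are assumptions made for illustration; the exact invocations used for E. coli K_EC180 are not given in the text.

```python
from typing import List


def polishing_commands(assembly: str, reads_1: str, reads_2: str,
                       prefix: str, threads: int = 8) -> List[List[str]]:
    """Build (but do not run) commands for one round of Pilon polishing.

    All file names and the samtools steps are hypothetical placeholders.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Index the long-read assembly so BWA can align against it.
        ["bwa", "index", assembly],
        # Align the paired-end Illumina reads with BWA-MEM.
        ["bwa", "mem", "-t", str(threads), "-o", sam,
         assembly, reads_1, reads_2],
        # Pilon expects a coordinate-sorted, indexed BAM file.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Correct the assembly using the aligned short reads.
        ["java", "-jar", "pilon-1.23.jar",
         "--genome", assembly, "--frags", bam,
         "--output", f"{prefix}.pilon"],
    ]


if __name__ == "__main__":
    commands = polishing_commands(
        "K_EC180.hgap.fasta",      # hypothetical assembly file
        "K_EC180_R1.fastq.gz",     # hypothetical forward reads
        "K_EC180_R2.fastq.gz",     # hypothetical reverse reads
        "K_EC180",
    )
    for command in commands:
        print(" ".join(command))
```

Keeping the commands as lists makes them easy to inspect or to pass to `subprocess.run` one at a time once the paths have been adapted to a real environment.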
Functional categorization and classification of all predicted ORFs were conducted using the RAST server-based SEED viewer and Clusters of Orthologous Groups (COG)-based eggNOG. Putative virulence factors and antimicrobial resistance genes were identified using BLAST against the Virulence Factor Database (VFDB) (8). The genome of E. coli K_EC180 consists of one circular chromosome (5,017,281 bp) with a G+C content of 50.4%, 4,935 coding sequences (CDSs), 88 tRNA genes, and 22 rRNA genes.
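As a quick consistency check on the sequencing yields and chromosome length reported above, the mean sequencing depth and mean read length can be derived with simple arithmetic. The sketch below assumes that the chromosome length is the full genome size and that the Illumina read count refers to individual reads; the function and constant names are illustrative only.

```python
# Values taken from the text above.
GENOME_SIZE_BP = 5_017_281
PACBIO_BASES = 1_131_537_370
PACBIO_READS = 145_423
ILLUMINA_BASES = 1_389_095_206
ILLUMINA_READS = 9_199_306


def mean_depth(total_bases: int, genome_size: int) -> float:
    """Average number of times each genome position is covered."""
    return total_bases / genome_size


def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average read length in base pairs."""
    return total_bases / read_count


if __name__ == "__main__":
    for name, bases, reads in [
        ("PacBio", PACBIO_BASES, PACBIO_READS),
        ("Illumina", ILLUMINA_BASES, ILLUMINA_READS),
    ]:
        depth = mean_depth(bases, GENOME_SIZE_BP)
        length = mean_read_length(bases, reads)
        print(f"{name}: {depth:.1f}x depth, {length:.0f} bp mean read length")
```

Under these assumptions, the PacBio data correspond to roughly 226-fold depth with a mean read length of about 7.8 kb, and the Illumina data to roughly 277-fold depth with 151 bp reads.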
The complete genome of E. coli K_EC180 contains toxin genes encoding Shiga-like toxin (stx2e subunit A and stx2e subunit B), which may cause disease in humans by damaging small blood vessels in organs such as the digestive tract, kidneys, and central nervous system (9, 10). E. coli K_EC180 also possesses the essC, escV, escR, escS, and escJ genes, which are involved in a type III secretion system. In addition, the genome carries the fim genes (fimA to fimH) encoding type 1 fimbriae. The general properties of the complete genome of E. coli K_EC180 are summarized in Fig. 1 and Table 1.