Application of genomic big data to analyze the genetic diversity and population structure of Korean domestic chickens

Eunjin Cho1, Minjun Kim2, Jae-Hwan Kim3, Hee-Jong Roh3, Seung Chang Kim3, Dae-Hyeok Jin3, Dae Cheol Kim4, Jun Heon Lee1,2,*
1Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
2Division of Animal & Dairy Science, Chungnam National University, Daejeon 34134, Korea
3Animal Genetic Resources Research Center, National Institute of Animal Science, Rural Development Administration, Hamyang 50000, Korea
4Jeju Special Self-Governing Province Livestock Promotion Agency, Jeju 63078, Korea
*Corresponding author: Jun Heon Lee, Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea. Tel: +82-42-821-5779, E-mail:

© Copyright 2023 Korean Society of Animal Science and Technology. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Dec 27, 2022; Revised: Jan 16, 2023; Accepted: Jan 16, 2023

Published Online: Sep 30, 2023


Genetic diversity analysis is crucial for maintaining and managing genetic resources. Several studies have examined the genetic diversity of Korean domestic chicken (KDC) populations using microsatellite markers, but it is difficult to capture the characteristics of the whole genome in this manner. Hence, this study analyzed the genetic diversity of several KDC populations using high-density single nucleotide polymorphism (SNP) genotype data. We examined 935 birds from 21 KDC populations, including indigenous and adapted Korean native chicken (KNC), Hyunin and Jeju KDC, and Hanhyup commercial KDC populations. A total of 212,420 SNPs from the 21 KDC populations were used to calculate genetic distances and the fixation index, and for ADMIXTURE analysis. The analysis showed that the indigenous KNC groups were genetically closer to one another and more fixed than the other groups. Furthermore, Hyunin and Jeju KDC were similar to the indigenous KNC. In comparison, adapted KNC and Hanhyup KDC populations derived from the same original species were genetically close to each other, but had different genetic structures from the others. In conclusion, this study suggests that continuous evaluation and management are required to prevent a loss of genetic diversity in each group. It also provides basic genetic information that can be used to improve breeds quickly by utilizing the various characteristics of native chickens.

Keywords: Genetic diversity; Population structure; Korean domestic chicken; Single nucleotide polymorphism


Genetic diversity depends on the rates of allele loss and fixation, and reflects the balance in emergent genetic variants within populations [1]. It allows animals to survive and adjust to the environmental changes they will face. Genetic diversity is an important aspect of disease prevention and trait enhancement research for a sustainable livestock industry. Commercial breeds subjected to intensive selection have limited genetic diversity compared with indigenous breeds, which are frequently bred for conservation without a structured selection procedure [2]. The livestock industry selectively produces commercial animals with high economic benefits, which reduces genetic diversity and could undermine the conservation of indigenous breeds with small populations. Hence, research on genetic diversity is required to maintain and manage their genetic resources.

Various genetic markers have been developed to obtain genetic information. Several studies of genetic diversity have used polymorphic microsatellite (MS) markers throughout the genome [3–5]. Due to their unique properties, however, MS markers do not always accurately reflect the characteristics of the whole genome, and some have high rates of genotyping errors [6]. Furthermore, research using MS markers requires considerable effort, and interpretation of the results is highly subjective. The use of single nucleotide polymorphism (SNP) markers could overcome these limitations of MS markers [7]. SNPs are the most common genetic molecular markers throughout the genome and are ideal for large-scale analysis platforms [8]. Various genotyping methods based on SNP assays have recently been developed, and analysis costs are dropping gradually. Therefore, SNP markers are much more effective than MS markers for studying genetic diversity.

Korean domestic chicken (KDC) populations are generally classified into native and commercial breeds. Korean native chicken (KNC) populations are subdivided into five breeds and 12 lines, and the purebred KNC has been preserved by the National Institute of Animal Science (NIAS) in Korea. Six lines of two breeds are indigenous KNCs, including the Gray-brown KNC (NG), Black KNC (NL), Red-brown KNC (NR), White KNC (NW), Yellow-brown KNC (NY), and Yeonsan Ogye (YO). The remaining six lines of the other three breeds are adapted KNCs, which were imported in the 1960s and have since been adapted in Korea for more than seven generations; they include the Rhode Island Red (NC and ND), Cornish (NH and NS), and Leghorn (NF and NK) lines. In addition to the KNCs preserved by NIAS, two local chicken breeds classified as KDCs are managed in Korea: Hyunin KDC (HI) and Jeju KDC (J). Although they are preserved in a private institution, their populations are small and they are not managed under an efficient selection system. As well as the native KDCs, Korean poultry breeding companies produce commercial KDCs that have been improved to suit the taste of Koreans. Hanhyup is a representative breeding company that produces several breed lines by improving Rhode Island Red (HS and HW), Cornish (HA, HF, and HH), Plymouth Rock (HG, HV, and HZ), and New Hampshire (HY) lines.

Several studies using MS markers have reported the genetic diversity of indigenous KNCs [9–13]. However, there have been relatively few diversity analyses using large-scale SNP data. Therefore, this study aims to analyze genetic diversity using high-density SNP genotype data from several KDC populations in Korea.


Samples and genotypes

Data on three purebred populations were used in this study (Table 1). The first population consisted of 694 KNC birds separated into five breeds and 12 lines, including YO, indigenous and adapted KNC lines. The second population consisted of 47 Korean local chickens from two breeds: Hyunin and Jeju. The third population consisted of 194 Hanhyup commercial KDCs. The first and third populations were genotyped using a 600K chicken SNP array (Affymetrix, Santa Clara, CA, USA) [14], whereas the second population was genotyped using a custom 60K chicken SNP array created by our team.

Table 1. Summary of the samples used in this study
| Class | Population | Origin | Description | No. Animals |
|---|---|---|---|---|
| Korean native chicken | NG | Gray-brown KNC | Indigenous KNC | 89 |
| | NL | Black KNC | | 74 |
| | NR | Red-brown KNC | | 127 |
| | NW | White KNC | | 94 |
| | NY | Yellow-brown KNC | | 97 |
| | YO | Yeonsan Ogye | | 189 |
| | NC | Rhode Island Red | Adapted KNC (imported in 1960s and locally adapted) | 6 |
| | ND | | | 6 |
| | NH | Black Cornish | | 6 |
| | NS | Brown Cornish | | 6 |
| Korean local chicken | HI | Hyunin KDC | Maintained population in Hyunin Farm | 23 |
| | J | Jeju KDC | Maintained population in Jeju | 24 |
| Commercial KDC | HA | White Cornish | Maintained population in Hanhyup Farm | 20 |
| | HF | Black Cornish | | 23 |
| | HG | White Plymouth Rock | | 23 |
| | HH | Brown Cornish | | 23 |
| | HS | Rhode Island Red | | 23 |
| | HV | White Plymouth Rock | | 23 |
| | HW | Rhode Island Red | | 23 |
| | HY | New Hampshire | | 21 |
| | HZ | Partridge Plymouth Rock | | 15 |
| Total | | | | 935 |

KNC, Korean native chicken; KDC, Korean domestic chicken.

Data pre-processing and quality control for the genotype data

A total of 542,717 SNPs and 66,852 SNPs were derived from the 600K and 60K arrays, respectively. Genotype data from the 60K SNP array were imputed using Minimac3 and Minimac4 software [15]. After imputation, 468,584 SNPs common to the two SNP arrays were retained. For genotype quality control (QC), PLINK 1.9 software [16] was used to exclude SNPs meeting any of the following cut-offs: genotyping rate ≤ 95%, minor allele frequency (MAF) ≤ 0.01, or deviation from Hardy-Weinberg equilibrium (HWE) at p ≤ 0.000001. Following QC, 212,420 SNPs were subjected to further analysis.
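The call-rate and MAF cut-offs above can be illustrated with a minimal Python sketch. This is not the PLINK implementation used in the study: the genotype coding (0, 1, or 2 copies of one allele, `None` for a missing call) and the function names are assumptions made for illustration, and the HWE test is omitted.

```python
from typing import List, Optional

# Assumed coding: 0, 1 or 2 copies of one allele; None marks a missing call.
Genotype = Optional[int]


def call_rate(snp: List[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at this SNP."""
    if not snp:
        return 0.0
    called = [g for g in snp if g is not None]
    return len(called) / len(snp)


def minor_allele_frequency(snp: List[Genotype]) -> float:
    """Frequency of the rarer allele among the called genotypes."""
    called = [g for g in snp if g is not None]
    if not called:
        return 0.0
    allele_freq = sum(called) / (2 * len(called))
    return min(allele_freq, 1.0 - allele_freq)


def passes_qc(snp: List[Genotype],
              call_rate_cutoff: float = 0.95,
              maf_cutoff: float = 0.01) -> bool:
    """Keep a SNP only if it exceeds both cut-offs (values at or below are excluded)."""
    return (call_rate(snp) > call_rate_cutoff
            and minor_allele_frequency(snp) > maf_cutoff)
```

For example, a monomorphic SNP (all genotypes 0) has MAF 0 and is excluded, as is a SNP with 10% missing calls.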

Analysis of genetic diversity

The genetic distances (GD) among the chicken populations were calculated using Reynolds’ equation and the fixation index (FST) was estimated. The formulas used for these calculations are as follows:

$$\text{Reynolds GD} = \frac{\sum_{l}\sum_{u}\left(pop_{1u} - pop_{2u}\right)^{2}}{2\sum_{l}\left(1 - \sum_{u} pop_{1u}\,pop_{2u}\right)}$$

where u is the total number of alleles, l is the total number of loci, and pop1u and pop2u are the respective allele frequencies of populations 1 and 2 [17]. The GD were calculated using the “poppr” R package [18].
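As a hedged illustration of the Reynolds calculation, the sketch below sums the squared allele-frequency differences over all loci and alleles and divides by twice the sum, over loci, of one minus the cross-product of the two populations' allele frequencies. The function name and input layout (one list of allele frequencies per locus) are assumptions for this example; the "poppr" implementation may differ in detail, for instance by applying a further transformation to this ratio.

```python
from typing import List, Sequence


def reynolds_ratio(pop1: List[Sequence[float]],
                   pop2: List[Sequence[float]]) -> float:
    """Reynolds-type distance between two populations.

    pop1[l][u] and pop2[l][u] are the frequencies of allele u at locus l.
    """
    if len(pop1) != len(pop2):
        raise ValueError("both populations need the same number of loci")
    numerator = 0.0
    denominator = 0.0
    for freqs1, freqs2 in zip(pop1, pop2):
        if len(freqs1) != len(freqs2):
            raise ValueError("allele counts differ at a locus")
        # Squared allele-frequency differences at this locus.
        numerator += sum((a - b) ** 2 for a, b in zip(freqs1, freqs2))
        # One minus the cross-product of allele frequencies at this locus.
        denominator += 1.0 - sum(a * b for a, b in zip(freqs1, freqs2))
    if denominator == 0.0:
        raise ValueError("distance undefined: both populations fixed for the same alleles")
    return numerator / (2.0 * denominator)


# One biallelic locus: population 1 at 0.5/0.5, population 2 fixed for allele 1.
example = reynolds_ratio([(0.5, 0.5)], [(1.0, 0.0)])
```

In this toy case the numerator is 0.25 + 0.25 = 0.5 and the denominator is 2 × (1 − 0.5) = 1, so the ratio is 0.5; identical, non-fixed populations give 0.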

$$F_{ST} = \frac{H_{T} - H_{S}}{H_{T}}$$

where HT is the expected heterozygosity of the total population and HS is the average heterozygosity of the subpopulation. The FST values were calculated with the method of Weir and Cockerham [19] using the “SNPRelate” R package [20].
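The heterozygosity-based definition of FST can be made concrete with a small worked sketch. It assumes a single biallelic locus and equally sized subpopulations, and it is the classical definition rather than the Weir and Cockerham estimator used in this study, which additionally corrects for sample size; the function name is hypothetical.

```python
from typing import Sequence


def fst_from_frequencies(subpop_freqs: Sequence[float]) -> float:
    """FST = (HT - HS) / HT for one biallelic locus.

    subpop_freqs holds the frequency of one allele in each subpopulation;
    subpopulations are assumed to be of equal size.
    """
    n = len(subpop_freqs)
    if n == 0:
        raise ValueError("at least one subpopulation is required")
    # HS: average expected heterozygosity within subpopulations.
    hs = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / n
    # HT: expected heterozygosity of the pooled (total) population.
    p_bar = sum(subpop_freqs) / n
    ht = 2.0 * p_bar * (1.0 - p_bar)
    if ht == 0.0:
        return 0.0  # locus is monomorphic overall, no differentiation to measure
    return (ht - hs) / ht


# Two subpopulations with allele frequencies 0.2 and 0.8.
example_fst = fst_from_frequencies([0.2, 0.8])
```

Here HS = 0.32 and HT = 0.5, so FST = 0.18 / 0.5 = 0.36; subpopulations with identical allele frequencies give FST = 0.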

GD and FST values were visualized as heatmaps using the “pheatmap” R package [21]; the GD values were then used to plot a phylogenetic tree using the “adegenet” R package [22].

Analysis of population structure

Principal component analysis (PCA) was performed using PLINK to confirm the genetic clustering of each population based on PC1 to PC3, which have the highest explanatory power. The population structure analysis was conducted using ADMIXTURE software, which compares the distribution of the genetic components of each population under various assumed numbers of ancestral populations (K) [23]. Both analyses were run on two sample sets: either all samples in each population were used, or ≤ 25 samples were selected randomly from each population. The results of the two analyses were visualized using R software.
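The random selection of at most 25 samples per population could be done along the following lines. This is a sketch under assumptions: the sample identifiers, the fixed seed, and the function name are illustrative, and the paper does not state how its random draw was implemented.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def subsample_per_population(samples: List[Tuple[str, str]],
                             max_per_pop: int = 25,
                             seed: int = 0) -> Dict[str, List[str]]:
    """Randomly keep at most max_per_pop sample IDs from each population.

    samples is a list of (sample_id, population) pairs. Populations that
    already have max_per_pop samples or fewer are kept whole.
    """
    by_pop: Dict[str, List[str]] = defaultdict(list)
    for sample_id, population in samples:
        by_pop[population].append(sample_id)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    selected: Dict[str, List[str]] = {}
    for population in sorted(by_pop):
        ids = by_pop[population]
        if len(ids) <= max_per_pop:
            selected[population] = list(ids)
        else:
            selected[population] = rng.sample(ids, max_per_pop)
    return selected
```

With the sample sizes in Table 1, a large population such as YO (189 birds) would be reduced to 25, whereas NC (6 birds) would be kept in full, which evens out the population sizes entering PCA and ADMIXTURE.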


Population structure from principal component analysis

PCA was performed on the 600K SNP genotype data for the entire population. Fig. 1 shows the genetic clusters for each population. Fig. 1A shows the population clusters obtained using all samples. PC1 and PC2 explained 23.65% of the total variance. Indigenous KNC populations, except for the Black KNC (NL), were separated from the other groups, while the adapted KNC populations and Hanhyup commercial KDC populations clustered together. The Hyunin and Jeju KDC populations also tended to cluster individually; however, this was less clear since the sample sizes of each population differed.

Fig. 1. Results of principal component analysis (PCA) using 600K single nucleotide polymorphism genotype data. (a) The result of PCA using total samples, (b) the result of PCA using randomly selected samples, (c) and (d) the result of PCA without adapted KNC and Hanhyup populations. NR, Red-brown KNC; NY, Yellow-brown KNC; NL, Black KNC; NW, White KNC; NG, Gray-brown KNC; KNC, Korean native chicken.

Fig. 1B shows the PCA result obtained using the ≤ 25 randomly selected samples. Compared with the adapted KNC and Hanhyup populations, the indigenous KNC populations, YO, and two local chicken populations (Hyunin and Jeju) clustered together. Figs. 1C and 1D show the clustering result for the KDC population, excluding the adapted KNC and Hanhyup populations derived from imported chicken breeds. Compared with the total population, PC1 and PC2 better explained the genetic distribution of the KDC populations. Fig. 1D indicates that all eight populations could be distinguished on the basis of PC1 and PC3.

Except for indigenous KNC populations, few samples were used for the populations studied. Allelic polymorphism is an important parameter often used to estimate genetic diversity; it is highly reliant on the effective population size [24]. However, obtaining large sample sizes and standardizing unequal sample sizes are often difficult. Therefore, this study was limited to confirming genetic differences between genetically close groups, as shown in Figs. 1C and 1D.

Genetic diversity from genetic distances and fixation index

The results of the GD and FST analyses are shown in Fig. 2, and were similar to those of the PCA (Fig. 1). The HI and J KDC populations were genetically close to the indigenous KNC group (GD, 0.33–0.42; FST, 0.09–0.16). The HG, HV, and HZ groups, which belong to the same Plymouth Rock chicken breed, were also close to each other, and the HG and HV groups were especially close.

Fig. 2. Results of genetic diversity analysis. (a) The heatmap plot using genetic distance values, (b) the heatmap plot using fixation index values. NG, Gray-brown KNC; NL, Black KNC; NR, Red-brown KNC; NY, Yellow-brown KNC; NC, Rhode Island Red C; ND, Rhode Island Red D; NH, Cornish H; NS, Cornish S; YO, Yeonsan Ogye; HI, Hyunin KDC; J, Jeju KDC; HA, Hanhyup A; HF, Hanhyup F; HG, Hanhyup G; HH, Hanhyup H; HS, Hanhyup S; HV, Hanhyup V; HW, Hanhyup W; HY, Hanhyup Y; HZ, Hanhyup Z; KNC, Korean native chicken; KDC, Korean domestic chicken.

Although the Hanhyup and NIAS groups included populations originating from the same breed, there was considerable GD between them. The populations derived from the Rhode Island Red breed (NC and ND from NIAS; HS and HW from Hanhyup) were genetically close, although the Hanhyup and NIAS groups still differed from each other. In addition, the populations derived from Cornish breeds (HA, HF, and HH from Hanhyup; NH and NS from NIAS) were also distinct from each other. In particular, NH and NS belonged to the same NIAS group but were genetically distant. The same result was seen in the phylogenetic tree based on GD (Fig. 3), in which branches formed according to the origins of each population.

Fig. 3. Results of phylogenetic tree using genetic distance values. NG, Gray-brown KNC; NL, Black KNC; NR, Red-brown KNC; NY, Yellow-brown KNC; NC, Rhode Island Red C; ND, Rhode Island Red D; NH, Cornish H; NS, Cornish S; YO, Yeonsan Ogye; HI, Hyunin KDC; J, Jeju KDC; HA, Hanhyup A; HF, Hanhyup F; HG, Hanhyup G; HH, Hanhyup H; HS, Hanhyup S; HV, Hanhyup V; HW, Hanhyup W; HY, Hanhyup Y; HZ, Hanhyup Z; KNC, Korean native chicken; KDC, Korean domestic chicken.

Seo et al. [25] also found genetic differences between the NIAS and Hanhyup populations, which originate from the same species. They found a relatively high Fis value in the adapted KNC group compared to the Hanhyup group, which means that the correlation between individuals in the NIAS group was high. These results were attributed to the different breeding selection goals of the two groups. For the NIAS-adapted KNC groups, a limited number of individuals imported into Korea were genetically fixed through indigenization. For the Hanhyup group, on the other hand, genetic fixation resulted from specific mating combinations aiming to produce practical systems.

Population structure from ADMIXTURE

The ADMIXTURE results for the 21 populations revealed the genetic components and population structures across entire groups. In the two analyses using different sample sizes, the optimal cross-validation (CV) error was 0.495 when K = 8 using the entire population and 0.508 when K = 17 using the smaller random subpopulations (Fig. 4).
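Selecting the K with the lowest CV error can be sketched as below. The parser assumes the `CV error (K=...): ...` line that ADMIXTURE prints when cross-validation is enabled; in the example log, only the K = 8 value (0.495) is taken from the text, and the neighbouring values are made-up placeholders for illustration.

```python
import re
from typing import Dict

# Pattern for lines such as "CV error (K=8): 0.49500".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def parse_cv_errors(log_text: str) -> Dict[int, float]:
    """Collect the cross-validation error reported for each K."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors: Dict[int, float]) -> int:
    """Return the K with the lowest cross-validation error."""
    if not cv_errors:
        raise ValueError("no CV errors found")
    return min(cv_errors, key=cv_errors.get)


# Placeholder log: only the K=8 value comes from the text above.
example_log = """\
CV error (K=7): 0.50100
CV error (K=8): 0.49500
CV error (K=9): 0.49900
"""
chosen_k = best_k(parse_cv_errors(example_log))
```

In practice the logs from each ADMIXTURE run (one per K) would be concatenated and passed to `parse_cv_errors`, and the resulting errors plotted against K before choosing a value.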

Fig. 4. Results of ADMIXTURE analysis. Left plot is the result of ADMIXTURE using total samples, and right plot is the result of ADMIXTURE using randomly selected samples. NG, Gray-brown KNC; NL, Black KNC; NR, Red-brown KNC; NY, Yellow-brown KNC; NC, Rhode Island Red C; ND, Rhode Island Red D; NH, Cornish H; NS, Cornish S; YO, Yeonsan Ogye; HI, Hyunin KDC; J, Jeju KDC; HA, Hanhyup A; HF, Hanhyup F; HG, Hanhyup G; HH, Hanhyup H; HS, Hanhyup S; HV, Hanhyup V; HW, Hanhyup W; HY, Hanhyup Y; HZ, Hanhyup Z; KDC, Korean domestic chicken; KNC, Korean native chicken.

The ADMIXTURE analysis confirmed the results of the phylogenetic tree; K = 8 using the total sample (Fig. 4A) indicated that all six indigenous KNC populations were distinct. Furthermore, the HI and J KDCs shared common ancestors, comparable to the results of the FST analysis. Unlike the other groups, it was difficult to classify these two populations as independent groups because of possible hybridization with other breeds, or a lack of individual identification and a breeding plan. The Hanhyup and NIAS groups that have the same origin share a common ancestor, consistent with the results of the phylogenetic tree. Despite the limited number of individuals, the adapted KNC populations (NC, ND, NH, and NS) were clearly divided into groups.

The results at K = 5 using the selected samples (Fig. 4B) showed that the indigenous KNC populations, except NG and NW, share common ancestors with the HI and J KDCs. Similar results were obtained for other Hanhyup populations in the analysis of all samples. Except for the NG and YO populations, all the chicken populations had a dominant single ancestor when K = 20. The ADMIXTURE analysis produced results similar to those of a diversity study using 25 MS markers; using 18 KDC populations, the groups were separated optimally at K = 15, and populations from the same ancestral species were classified together [13].


This study performed genetic diversity and population structure analyses using high-density SNP genotype data of various KDC populations. The results of the diversity analysis suggest the existence of genetic diversity among different breeds within the large domestic chicken population in Korea. Furthermore, the results suggest genetic fixation and high population uniformity of the KNC populations and emphasize the need for a systematic selection strategy for the Hyunin and Jeju KDC populations.

In summary, the diversity study conducted on the KDC groups indicates that continuous evaluation and management are required to prevent a decline of genetic diversity in each group. This study provides basic genetic information that can improve breeds quickly by selecting for various characteristics of native chickens.

Competing interests

No potential conflict of interest relevant to this article was reported.

Funding sources

This study was supported by the University Innovation Support Project of Chungnam National University (2022-2023).


The Korean domestic chicken samples were kindly provided by the National Institute of Animal Science, Yeonsan Ogye Foundation, Hyunin-nongwon, Jeju Province Livestock Promotion Agency, and Hanhyup-wonjong, Korea.

Availability of data and material

The datasets of this study are available from the corresponding author upon reasonable request.

Authors’ contributions

Conceptualization: Cho E, Kim M, Lee JH.

Data curation: Cho E, Kim M, Kim JH, Roh HJ, Kim SC, Jin DH, Kim DC, Lee JH.

Formal analysis: Cho E, Kim M.

Methodology: Cho E, Kim M.

Software: Cho E, Kim M.

Validation: Cho E, Kim M, Lee JH.

Investigation: Cho E, Kim M, Kim JH, Roh HJ, Kim SC, Jin DH, Kim DC, Lee JH.

Writing - original draft: Cho E.

Writing - review & editing: Cho E, Kim M, Kim JH, Roh HJ, Kim SC, Jin DH, Kim DC, Lee JH.

Ethics approval and consent to participate

This research has been approved by the Institutional Animal Care and Use Committee (IACUC) of Chungnam National University (202212A-CNU-213).



1. Ellegren H, Galtier N. Determinants of genetic diversity. Nat Rev Genet. 2016; 17:422-33
2. Nxumalo N, Ceccobelli S, Cardinali I, Lancioni H, Lasagna E, Kunene NW. Genetic diversity, population structure and ancestral origin of KwaZulu-Natal native chicken ecotypes using microsatellite and mitochondrial DNA markers. Ital J Anim Sci. 2020; 19:1275-88
3. Serrano M, Calvo JH, Martínez M, Marcos-Carcavilla A, Cuevas J, González C, et al. Microsatellite based genetic diversity and population structure of the endangered Spanish Guadarrama goat breed. BMC Genet. 2009; 10:61
4. Oh JD, Song KD, Seo JH, Kim DK, Kim SH, Seo KS, et al. Genetic traceability of black pig meats using microsatellite markers. Asian-Australas J Anim Sci. 2014; 27:926-31
5. Choi NR, Seo DW, Jemaa SB, Sultana H, Heo KN, Jo C, et al. Discrimination of the commercial Korean native chicken population using microsatellite markers. J Anim Sci Technol. 2015; 57:5
6. Fischer MC, Rellstab C, Leuzinger M, Roumet M, Gugerli F, Shimizu KK, et al. Estimating genomic diversity and population differentiation – an empirical comparison of microsatellite and SNP variation in Arabidopsis halleri. BMC Genomics. 2017; 18:69
7. Karniol B, Shirak A, Baruch E, Singrün C, Tal A, Cahana A, et al. Development of a 25-plex SNP assay for traceability in cattle. Anim Genet. 2009; 40:353-6
8. Duran C, Appleby N, Edwards D, Batley J. Molecular genetic markers: discovery, applications, data storage and visualisation. Curr Bioinform. 2009; 4:16-27
9. Kong HS, Oh JD, Lee JH, Jo KJ, Sang BD, Choi CH, et al. Genetic variation and relationships of Korean native chickens and foreign breeds using 15 microsatellite markers. Asian-Australas J Anim Sci. 2006; 19:1546-50
10. Lee HK, Oh JD, Park CH, Lee KW, Lee JH, Jeon GJ, et al. Comparison for genetic diversity between Korean native commercial chicken brand groups using microsatellite markers. Korean J Poult Sci. 2010; 37:355-60
11. Choi NR, Hoque MR, Seo DW, Sultana H, Park HB, Lim HT, et al. ISAG-recommended microsatellite marker analysis among five Korean native chicken lines. J Anim Sci Technol. 2012; 54:401-9
12. Seo JH, Lee JH, Kong HS. Assessment of genetic diversity and phylogenetic relationships of Korean native chicken breeds using microsatellite markers. Asian-Australas J Anim Sci. 2017; 30:1365-71
13. Roh HJ, Kim KW, Lee J, Jeon D, Kim SC, Ko YG, et al. Genetic diversity of Korean native chicken populations in DAD-IS database using 25 microsatellite markers. Korean J Poult Sci. 2019; 46:65-75
14. Kranis A, Gheyas AA, Boschiero C, Turner F, Yu L, Smith S, et al. Development of a high density 600K SNP genotyping array for chicken. BMC Genomics. 2013; 14:59
15. Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, et al. Next-generation genotype imputation service and methods. Nat Genet. 2016; 48:1284-7
16. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007; 81:559-75
17. Reynolds J, Weir BS, Cockerham CC. Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics. 1983; 105:767-79
18. Kamvar ZN, Tabima JF, Grünwald NJ. Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ. 2014; 2:e281
19. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984; 38:1358-70
20. Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012; 28:3326-8
21. Kolde R. Pheatmap: pretty heatmaps: R package version [Internet]. 2012. [cited 2023 Jan 3]
22. Jombart T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics. 2008; 24:1403-5
23. Alexander DH, Lange K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics. 2011; 12:246
24. Petit RJ, El Mousadik A, Pons O. Identifying populations for conservation on the basis of genetic markers. Conserv Biol. 1998; 12:844-55
25. Seo JH, Oh JD, Lee JH, Seo D, Kong HS. Studies on genetic diversity and phylogenetic relationships of Korean native chicken using the microsatellite marker. Korean J Poult Sci. 2015; 42:15-26