Peanut (Arachis hypogaea L.) is widely cultivated from tropical to temperate regions of Africa, the Americas, and Asia, with an annual production of about 42.4 million tons (FAO, 2020). It is also one of the main sources of cooking oil in the world and contains 25% protein and 50% oil. Peanut plays a significant role in sustainable agriculture in terms of global food security and nutrition, fuel and energy, sustainable fertilization, and enhanced agricultural productivity as a rotation crop (Feng et al., 2012).
Cultivated peanut is an allotetraploid (2n = 4× = 40, AABB) with a genome size of 2800 Mb/1C. Its genome is known to derive from a recent hybridization between A. duranensis (A subgenome) and A. ipaensis (B subgenome) (Smartt et al., 1978; Seijo et al., 2007; Robledo et al., 2009; Bertioli et al., 2016). The genetic diversity of cultivated peanut is extremely low because of the single recent polyploidization during domestication (Kim et al., 2017). The peanut subgenomes are highly similar (Kottapalli et al., 2007; Khera et al., 2013) and have an estimated repeat content of 64%, which makes assembly of the peanut genome sequence extremely difficult (Dhillon et al., 1980; Temsch & Greilhuber, 2000; Bertioli et al., 2016). The genome sequences of the diploid ancestors of cultivated peanut (A. duranensis and A. ipaensis) were reported in 2016 and became the basis for understanding the genome of cultivated peanut (Ren et al., 2011). The reference sequence of the allotetraploid A. hypogaea genome was reported in 2019 and compared with the genomes of the related diploids A. duranensis and A. ipaensis. A total of 39,888 A subgenome genes and 41,526 B subgenome genes were annotated in the allotetraploid subgenomes (Chen et al., 2019).
Next-generation sequencing (NGS) technology has made significant progress in recent years, and sequencing costs have dropped sharply (Kim et al., 2017). In addition, the accuracy and yield of sequencing data have improved markedly. In particular, NGS approaches such as de novo assembly and resequencing, combined with a variety of bioinformatics methods, have enabled the discovery of large numbers of single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) in complex genomes (Yang et al., 2012; Lee et al., 2015; Bertioli et al., 2016; Kang et al., 2016). Using double-digest restriction-site associated DNA sequencing (ddRADseq) for high-throughput genotyping, a total of 14,663 SNPs were developed and used to construct a genetic linkage map in peanut cultivars (Zhou et al., 2014). Numerous SNP and CAPS markers have also been developed from the resequencing of two Korean peanut cultivars, “K-OL” and “Pungan”, and this molecular marker information can provide valuable guidance for peanut breeding programs (Kim et al., 2017).
The cleaved amplified polymorphic sequence (CAPS) method combines PCR amplification with restriction enzyme analysis and detects an SNP that occurs within the recognition site of a restriction enzyme. The PCR products can be digested in the laboratory and the fragments separated on an agarose gel. Because the analysis is convenient, SNP-based markers such as CAPS have been widely developed following NGS analysis, and the resulting markers have been used to assess genetic diversity and population structure in crops (Rasheed et al., 2017; Wang et al., 2017).
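The principle of a CAPS marker can be illustrated with a minimal sketch. The amplicon sequence and SNP position below are invented for illustration and are not markers from this study; the only fixed fact used is that MspI recognizes CCGG and cuts after the first C. One allele carries the intact site and is cut into two fragments, while the SNP allele loses the site and stays uncut.

```python
def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest_fragments(seq, site, cut_offset):
    """Return fragment lengths after cutting `cut_offset` bases into each site."""
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# MspI: recognition site CCGG, cut after the first C (C^CGG).
SITE, CUT_OFFSET = "CCGG", 1

# Hypothetical 33-bp amplicon with one MspI site starting at index 14.
allele_a = "ATGCTAGCTAGGATCCGGTTAGCTAGCTAACGT"
# Hypothetical SNP (C>T at index 15) that abolishes the site.
allele_b = allele_a[:15] + "T" + allele_a[16:]

print(digest_fragments(allele_a, SITE, CUT_OFFSET))  # cut allele: [15, 18]
print(digest_fragments(allele_b, SITE, CUT_OFFSET))  # uncut allele: [33]
```

On a gel, the first allele would appear as two bands and the second as a single undigested band; a heterozygote would show all three.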
In this study, we aimed to evaluate genetic diversity and population structure in 96 peanut accessions from five different origins using CAPS markers developed from the resequencing of two Korean peanut cultivars.
MATERIALS AND METHODS
Plant material and DNA extraction
A total of 96 peanut accessions obtained from the National Agrobiodiversity Center, Jeonju, Republic of Korea, were used in the present study (Table 1). The accessions were originally donated from five countries: 2 from Peru (PER), 13 from China (CHN), 15 from Argentina (ARG), 17 from Brazil (BRA), and 49 from Korea (KOR). In 2017, the accessions were planted in a greenhouse at Pusan National University, Miryang, Republic of Korea. A young leaf was collected from each accession, and genomic DNA was extracted using the cetyltrimethylammonium bromide (CTAB) protocol (Saghai-Maroof et al., 1984) with minor modifications. The quality and quantity of the extracted DNA were assessed with a NanoDrop ND-1000 (Thermo Fisher Scientific Inc., USA) and by electrophoresis on a 1% agarose gel. The final concentration of each DNA sample was adjusted to 30 ng/µl.
|No|IT no.(a)|Origin(b)|Seed color(c)|Variety(d)|Growth habit(e)|100 SW(f)|Seed size(g)|
|20|IT030957|KOR|Tan|Breeding line|Half erect|53.90|Middle|
|22|IT110214|KOR|Tan|Breeding line|Half erect|70.00|Middle|
|31|IT172547|KOR|Light brown|Breeding line|Erect|103.98|Big|
|34|IT181768|KOR|Light brown|Breeding line|Erect|94.03|Big|
|39|IT184896|BRA|Red|Unknown|Spreading and Bunch/Erect|31.60|Small|
(c) Seed color of each accession was double-checked, and off-types were discarded. Tan indicates the tannin color.
Analysis of the CAPS markers
A total of 30 CAPS markers were used to evaluate the genetic diversity and population structure of the peanut accessions in the present study (Supplementary Table S1). The CAPS markers were derived from thirteen different chromosomes (A01, A03, A05, A06, A07, A06, A08; B01, B03, B04, B06, B07, B08). Of these markers, 28 were confirmed to lie in intergenic regions and two in coding regions of the peanut genome (Kim et al., 2017).
Polymerase chain reaction (PCR) amplifications were conducted in 20 µL reactions containing 60 ng of template DNA, 5 nM mixed primers, 1X reaction buffer, 10 mM dNTPs, and 1.0 unit of Taq DNA polymerase (GenScript USA Inc., Piscataway, NJ, USA). PCR products were digested with the appropriate restriction enzyme (AseI, DraI, HpaII, MseI, MspI, PstI, or TaqI) (New England Biolabs, USA; Enzynomics, Republic of Korea) by incubation in a T100 thermal cycler (Bio-Rad, USA) at the optimum temperature for each enzyme for 1 h. The undigested and restriction enzyme-digested PCR products were resolved on 1.5% agarose gels (Promega, USA) to detect polymorphism.
A UPGMA (Unweighted Pair Group Method with Arithmetic Mean) dendrogram was constructed using MEGA 4 (Tamura et al., 2007). The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the accessions analyzed, and branches corresponding to partitions reproduced in fewer than 42% of bootstrap replicates were collapsed (Felsenstein, 1985). Evolutionary distances among the 96 accessions were computed using the Maximum Composite Likelihood method (Tamura et al., 2004).
The population structure of the 96 peanut accessions was evaluated with Structure v2.3.4 (https://web.stanford.edu/group/pritchardlab/structure_software/release_versions/v2.3.4/html/structure.html) under the admixture model. Models were tested for K values ranging from 1 to 15, with 3 independent runs per K value. To determine the optimum K, the delta K (ΔK) method (Evanno et al., 2005) was applied using the online software Structure Harvester. Population structure and relationships were also analyzed by principal coordinate analysis (PCoA) using GenAlEx v6.503 (Peakall & Smouse, 2006).
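The ΔK criterion can be sketched as follows. This is a minimal illustration, not the implementation used in the study: it assumes the common simplification in which the second-order difference is taken on the mean log-likelihood of each K and divided by the standard deviation of the runs at that K, and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map K -> Delta K, given {K: [ln P(D) of each independent run]}.

    Delta K is defined only where both K-1 and K+1 were also tested.
    Assumes the runs at each K are not all identical (non-zero SD).
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in avg and k + 1 in avg:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical ln P(D) values, three runs per K as in the text.
runs = {
    1: [-3000.0, -3001.0, -2999.0],
    2: [-2500.0, -2502.0, -2498.0],
    3: [-2450.0, -2455.0, -2445.0],
    4: [-2420.0, -2430.0, -2410.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs both neighbours, it cannot be evaluated at the smallest or largest K tested, which is one reason to scan a range wider than the expected number of subpopulations.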
To interpret genetic differentiation between subpopulations, FST values > 0.25 were taken to indicate great differentiation, values from 0.15 to 0.25 high differentiation, and values from 0.05 to 0.15 moderate differentiation, while differentiation was considered negligible if FST < 0.05. Genetic diversity indices were calculated for each locus using GenAlEx 6.503. Genetic differentiation between accessions was calculated using FST to evaluate the reduction in genotypic heterozygosity (Grasso et al., 2014).
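The interpretation scale above can be written as a small helper. This is only a sketch: the text does not state how values falling exactly on a boundary are assigned, so the handling of 0.05, 0.15, and 0.25 below is an assumption.

```python
def classify_fst(fst):
    """Return the differentiation class for a pairwise FST value.

    Boundary handling is an assumption: 0.05 and 0.15 go to the higher
    class, and exactly 0.25 is treated as "high".
    """
    if fst < 0.05:
        return "negligible"
    if fst < 0.15:
        return "moderate"
    if fst <= 0.25:
        return "high"
    return "great"


# 0.311 and 0.264 are the KOR-BRA and KOR-ARG values reported in the text;
# the remaining values are hypothetical.
for value in (0.311, 0.264, 0.20, 0.10, 0.01):
    print(value, classify_fst(value))
```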
The UPGMA tree placed the peanut accessions into two major clusters (Fig. 1), with most accessions from Korea grouped in one cluster. The other cluster contained the accessions from the other four origins, together with eight accessions collected from Korea that formed a small group within this cluster.
Compared with the KOR accessions, the accessions from BRA and ARG showed FST values greater than 0.25 (0.311 and 0.264, respectively), indicating the greatest differentiation between accessions (Table 2). The pairs KOR and CHN, KOR and PER, and PER and BRA had FST values ranging from 0.15 to 0.25, showing high differentiation. The pairs BRA and CHN, ARG and CHN, ARG and PER, and PER and CHN had FST values ranging from 0.05 to 0.15, showing moderate differentiation. The accessions from BRA and ARG had an FST value below 0.05, indicating negligible differentiation.
Nei's genetic distance ranged from 0.012 to 0.437 (Table 3). The accessions from BRA and ARG showed the lowest genetic dissimilarity (0.012), and the accessions from KOR and BRA showed the highest (0.437). The highest genetic dissimilarities were observed between the accessions from KOR and those from the other four origins, while the genetic dissimilarities among the accessions from the other four origins were less than 0.1.
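For orientation, a distance of this kind can be computed from allele frequencies as sketched below. The sketch assumes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)); the text does not state which variant of Nei's distance was used, and the allele frequencies are hypothetical.

```python
import math


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two populations.

    Each argument is a list with one dict per locus mapping
    allele -> frequency; both lists must cover the same loci in order.
    """
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        for allele in set(locus_x) | set(locus_y):
            px = locus_x.get(allele, 0.0)
            py = locus_y.get(allele, 0.0)
            jx += px * px
            jy += py * py
            jxy += px * py
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical frequencies of the "cut" and "uncut" alleles at two CAPS loci.
pop_a = [{"cut": 0.9, "uncut": 0.1}, {"cut": 0.5, "uncut": 0.5}]
pop_b = [{"cut": 0.2, "uncut": 0.8}, {"cut": 0.6, "uncut": 0.4}]

print(nei_distance(pop_a, pop_a))  # identical populations give a distance near 0
print(nei_distance(pop_a, pop_b))
```

Populations with similar allele frequencies have a genetic identity near 1 and therefore a distance near 0, which is the pattern reported for BRA and ARG.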
The PCoA pattern (Fig. 2) was similar to the results of the UPGMA tree. The first two axes accounted for 65.03% of the total variation, and the 96 accessions were divided into three broad groups across these axes. The first axis separated the KOR accessions into two parts. However, the KOR accessions formed a less cohesive block than the others. The intermixing of colors across the coordinates further supports the UPGMA tree based on the SNP markers in showing no location-specific grouping (Singh et al., 2013) between the accessions from BRA and ARG.
Genetic diversity indices, including observed heterozygosity (HO), expected heterozygosity (HE), and the fixation index (F), were calculated (Table 4). For each group, HO and HE were calculated from the observed genotype frequencies across all SNPs. F values range from -1 to +1: negative values indicate an excess of heterozygosity, values close to zero are expected under random mating, and large positive values indicate inbreeding or undetected null alleles. Across the 96 peanut accessions, mean HO ranged from 0.070 (KOR) to 0.084 (ARG). The lowest mean HE was found in the population from PER (0.121), whereas the highest was in the population from KOR (0.320). Across the origins, HO (0.079) was significantly lower than HE (0.205).
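The relationship among these indices can be made concrete for a single biallelic, codominant locus. The sketch below assumes F = 1 - HO/HE and HE = 1 - p² - q² with no small-sample correction; the genotype counts are hypothetical and are not taken from Table 4.

```python
def diversity_indices(n_hom_a, n_het, n_hom_b):
    """Return (HO, HE, F) for one biallelic codominant locus.

    F is None for a monomorphic locus, where HE is zero and the
    fixation index is undefined.
    """
    n = n_hom_a + n_het + n_hom_b
    if n == 0:
        raise ValueError("no genotyped individuals")
    h_obs = n_het / n
    p = (2 * n_hom_a + n_het) / (2 * n)  # frequency of allele A
    h_exp = 1.0 - p * p - (1.0 - p) ** 2
    f = None if h_exp == 0 else 1.0 - h_obs / h_exp
    return h_obs, h_exp, f


# Hypothetical counts: few heterozygotes, as expected in a selfing crop.
print(diversity_indices(40, 4, 6))
# Hardy-Weinberg proportions: F is zero.
print(diversity_indices(25, 50, 25))
# Only heterozygotes: F reaches its lower bound of -1.
print(diversity_indices(0, 10, 0))
```

An HO well below HE, as reported here across origins, therefore corresponds to a strongly positive F.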
ΔK values were plotted against K to determine the number of populations, and the maximum ΔK was found at K = 2 (Fig. 3A). A second, lower peak appeared at K = 6 (Fig. 3A). When the accessions were divided into two subpopulations (K = 2, Fig. 3B), a large proportion of the accessions from Korea belonged to one subgroup (red), while the other subgroup (green) comprised accessions from the other four origins. As K increased, the subgroups were divided further. The subgroups diverged most by origin at K = 6, although all subgroups remained mixed in origin. In the red, green, and yellow subpopulations, most accessions derived from KOR, with only four accessions from CHN. However, the subpopulation labeled in dark blue consisted mainly of accessions from ARG and BRA, and the pink and light blue subpopulations contained accessions of mixed origin, which coincided with the results of the UPGMA tree and PCoA.
Most crops, including peanut, have undergone a significant loss of genetic diversity during both evolution and cultivation (Smýkal et al., 2018). Rich genetic diversity in genetic resources is a precondition for improving productivity and achieving other goals in crop breeding programs. Genetic heterozygosity (H), a measure of genetic diversity, reflects the degree of genetic uniformity in a population (Tambasco-Talhari et al., 2005). Lower H values indicate higher genetic uniformity and lower genetic variation in the population. In the present study, HO was significantly lower than HE regardless of origin. This result might reflect the low heterozygosity of the tested peanut accessions or a general characteristic of peanut as a species with low genetic diversity. In addition, the low genetic diversity in the peanut accessions might result from long-term germplasm collection conducted with little understanding of the genetic background or with too much emphasis on phenotypic variation.
The genetic diversity of the 96 peanut accessions was analyzed by cluster analysis. According to the SNP marker data, the 96 accessions were divided into two subpopulations. Most of the germplasm from KOR was separated from the other germplasm. Genotypes from the other origins were mixed in the same group, within which some peanut accessions from Korea formed a small separate group. Although low genetic diversity in peanut germplasm has been reported, the population structure and clustering analyses indicated clear genetic differentiation between germplasm from KOR and that from the other four countries. The STRUCTURE analysis revealed two subpopulations, consistent with the clustering results. Our study identified genetic differentiation among the peanut accessions from the five origins, and this result could provide fundamental information for genetic diversity studies and for finding novel traits in peanut breeding programs.
Based on geographical origin, the peanut accessions can be divided into East Asian and South American groups. The East Asian group includes the accessions from China and South Korea, while the South American group includes the accessions from Argentina, Brazil, and Peru. Geographically, China and South Korea are close to each other, as are Argentina, Brazil, and Peru. The FST values among ARG, BRA, CHN, and PER were small, indicating little genetic differentiation among these four populations. In contrast, the FST values between KOR and each of ARG, BRA, CHN, and PER were large, indicating high genetic differentiation between the accessions from Korea and those from the other origins. Moreover, the FST value between KOR and CHN was relatively small compared with those between KOR and ARG, BRA, and PER, indicating that the genetic differentiation between the KOR and CHN accessions was smaller. Geographically, this corresponds to the distribution of East Asia and South America. The long-distance geographical separation of East Asia and South America, the latter generally regarded as the origin of peanut, may explain the high genetic differentiation between the KOR and South American peanut populations. According to the population structure results, there are abundant genetic differences between the peanut accessions from Korea and those from other sources, which may be the result of human selection or of adaptation of the Korean accessions to the local ecological environment. It will also be necessary to introduce more peanut germplasm resources into the Korean peanut collection to expand its genetic diversity in future breeding programs.
Recent research has made significant progress in whole-genome sequencing of peanut, providing a large amount of molecular marker information and gene annotations despite the complexity of the cultivated peanut genome (Bertioli et al., 2016). In particular, SNP markers are well suited to genetic analysis and can directly reflect genetic diversity among accessions at the DNA sequence level (Ren et al., 2013). In this study, CAPS markers developed from SNPs clearly separated the accessions into distinct subpopulations. Therefore, molecular markers including CAPS might be a useful tool for determining genetic diversity and population structure in peanut.
In summary, information on genetic diversity is helpful for developing appropriate breeding strategies for peanut (Landjeva et al., 2006) and can be a valuable tool for genotype selection in a breeding program. Because of the large genome of peanut, molecular markers are necessary, and breeders can use molecular marker data to select material in the absence of pedigree information. The results of this study demonstrate the successful application of SNP information derived from NGS-based resequencing and show that CAPS markers can be used to assess genetic diversity and population structure in 96 peanut accessions.