Abstract
Improved capacity of genomics and biotechnology has greatly enhanced genetic studies in different areas. Genomic selection exploits the genotype-to-phenotype relationship at the whole-genome level and is being implemented in many crops. Here we show that design-thinking and data-mining techniques can be leveraged to optimize genomic prediction of hybrid performance. We phenotyped a set of 276 maize hybrids generated by crossing founder inbreds of nested association mapping populations for flowering time, ear height, and grain yield. With 10 296 310 SNPs available from the parental inbreds, we explored the patterns of genomic relationships and phenotypic variation to establish training samples based on clustering, graphic network analysis, and genetic mating scheme. Our analysis showed that training set designs outperformed random sampling and earlier methods that either minimize the mean of prediction error variance or maximize the mean of generalized coefficient of determination. Additional analyses of 2556 wheat hybrids from an early-stage hybrid breeding system and 1439 rice hybrids from an established hybrid breeding system validated the approaches. Together, we demonstrated that effective genomic prediction models can be established with a training set 2%-13% of the size of the whole set, enabling an efficient exploration of enormous inference space of genetic combinations.