Variety trials are an essential step in crop breeding and production. These trials are a significant investment in time and resources and inform numerous decisions from cultivar development to end-use. Crop Variety Trials: Data Management and Analysis is a practical volume that provides valuable theoretical foundations as well as a step-by-step guide to implementing effective trial methods and analysis for determining the best varieties and cultivars. Crop Variety Trials is divided into two sections. The first section provides the reader with a sound theoretical framework of variety evaluation and trial analysis; its chapters provide insights into the theories of quantitative genetics and the principles of data analysis. The second section gives the reader a practical step-by-step guide to accurately analyzing crop variety trial data. Combined, these sections provide the reader with a fuller understanding of the nature of variety trials and their objectives, together with user-friendly database and statistical tools that will enable them to produce accurate analyses of data.
Page count: 525
Year of publication: 2014
To my Mom, and in memory of my Dad,
Professor Hongzhang Zhao, and Professor Donald H. Wallace
WEIKAI YAN
Agriculture and Agri-Food Canada, Ottawa, Canada
This edition first published 2014 © 2014 by John Wiley & Sons, Inc.
Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial offices: 9600 Garsington Road, Oxford, OX4 2DQ, UK; The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK; 350 Main Street, Malden, MA 02148-5020, USA
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.
The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author(s) have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Yan, Weikai. Crop variety trials : data management and analysis / Weikai Yan. 1 online resource. Includes bibliographical references and index. Description based on print version record and CIP data provided by publisher; resource not viewed. ISBN 978-1-118-68855-7 (Adobe PDF) – ISBN 978-1-118-68856-4 (ePub) – ISBN 978-1-118-68864-9 (cloth : alk. paper) 1. Plant varieties–Testing–Databases. 2. Crops–Testing–Databases. I. Title. SB123.45 631.5′2–dc23 2014002859
A catalogue record for this book is available from the British Library.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Cover image: © iStock.com/Jasmina007 Cover design by Wiley
Preface
Chapter 1: Theoretical Framework for Crop Variety Trials
1.1 Heritability under the genotype–location–year framework
1.2 Possible approaches to improve the variety trial efficiency
1.3 Heritability under various scenarios and their interpretations
1.4 Heritability estimated in the genotype–environment framework
1.5 Heritability and target region subdivision
1.6 Genotype-specific heritability as a shrinkage factor
1.7 Estimation of variance components and heritability
1.8 Summary
Chapter 2: An Overview of Variety Trial Data and Analyses
2.1 Levels of variety trial data
2.2 Single-trial data and analyses
2.3 Single-year data and analyses
2.4 Multiyear data and analyses
2.5 Decision-making based on multiple traits
Chapter 3: Introduction to Biplot Analysis
3.1 Biplot and matrix multiplication
3.2 Visualizing a biplot based on the inner-product property
3.3 Constructing a biplot based on research data in the form of a two-way table
3.4 Implementation of biplot analysis
Chapter 4: Data Centering for Biplot Analysis
4.1 Five possible types of data centering
4.2 Suitability of various biplots for genotype evaluation
4.3 Suitability of various biplots for test environment evaluation
4.4 Unique properties of the GGE biplot
4.5 Utilities of other types of biplots
4.6 How to generate biplots based on different data centering
Chapter 5: Data Scaling and Weighting for GGE Biplot Analysis
5.1 The link between the theory of indirect selection in quantitative genetics and test environment evaluation in GGE biplot analysis
5.2 Statistical parameters characterizing a variety trial
5.3 Data scaling methods in GGE biplot analysis
5.4 Factor analytic-based GGE biplot
5.5 Preferred data scaling in GGE biplot analysis
5.6 How to implement data scaling in biplot analysis
Chapter 6: Frequently Asked Questions About Biplot Analysis
6.1 Frequently asked questions
6.2 Frequently seen mistakes in biplot interpretation
Chapter 7: Single-Trial Data Analysis
7.1 Objectives and steps in single-trial data analysis
7.2 The discrimination and precision of a variety trial
7.3 Detecting and correcting any human errors
7.4 Spatial analysis to correct any field trend and variation
7.5 A road map for single-trial analysis
7.6 How to implement single-trial data analysis
Chapter 8: Genotype-by-Location Two-Way Data Analysis
8.1 Objectives of single-year genotype-by-location data analysis
8.2 Analysis of highly heritable traits to reveal any human errors
8.3 Summary statistics for individual trials
8.4 Joint analysis of variance across locations
8.5 Mega-environment analysis
8.6 Genotype evaluation
8.7 Test location evaluation
8.8 Number of test locations: too many or too few?
8.9 How to implement the biplot and conventional analyses
Chapter 9: Genotype-by-Trait Data Analysis and Decision-Making
9.1 Model for the genotype-by-trait biplot
9.2 Biplot analysis of genotype-by-trait data from single trials
9.3 Genotype-by-trait data across all trials
9.4 Genotype-by-trait biplot across trials within mega-environments
9.5 Genotype evaluation based on multiple traits
9.6 Formulating new crosses based on genotype-by-trait data
9.7 How to implement the genotype-by-trait data analyses
Chapter 10: Trait Association-by-Environment Two-Way Table Analysis
10.1 ABE biplot to study associations among traits in different environments
10.2 The ABE biplot to study target trait-by-explanatory trait associations in different environments
10.3 The QQE biplot to study molecular marker-by-trait associations in different environments
10.4 How to generate the ABE biplots
Chapter 11: Location-by-Trait Two-Way Data Analysis
11.1 Location-by-trait data across genotypes
11.2 Location-by-trait data for individual genotypes
11.3 Repeatable environmental correlations among traits
11.4 How to implement the analyses?
Chapter 12: Mega-environment Analysis Based on Multiyear Data
12.1 What is a mega-environment?
12.2 Strategies of mega-environment analysis based on multiyear data
12.3 Strategy 1: analyze yearly and summarize across years
12.4 Strategy 2: one-step analysis of multiyear data
12.5 Mega-environment analysis and classification of a target region
12.6 Frequently asked questions related to mega-environment analysis
12.7 How to implement mega-environment analysis
12.8 Mega-environment analysis based on the GGL+GGE biplot
Chapter 13: Test Location Evaluation Based on Multiyear Data
13.1 Concepts, definitions, and terminologies related to test location evaluation
13.2 Test location evaluation based on the GGE biplot
13.3 Evaluation of individual test locations in the northern mega-environment
13.4 Evaluation of individual test locations in the southern mega-environment
13.5 How to implement test location evaluation
Chapter 14: Genotype Evaluation Based on Multiyear Data
14.1 Genotype evaluation based on the current year data
14.2 Genotype evaluation based on the balanced subset from the multiyear data
14.3 Genotype evaluation based on all data from the multiyear trials
14.4 Comment on “stability analysis” in genotype evaluation
14.5 Comment on fixed versus mixed models in genotype evaluation
14.6 How to implement genotype evaluation based on multiyear data
Chapter 15: Building and Utilizing a Relational Database for Crop Variety Trial Data
15.1 Extract data from the database
15.2 The structure of the COOL database
15.3 Convert raw data into a COOL database
15.4 Editing a database table
Chapter 16: Experimental Design for Variety Trials and Breeding Nurseries
16.1 Experimental design for multilocation variety trials
16.2 Experimental design for trials with uneven number of replicates
16.3 Experimental design for unrandomized nurseries
16.4 Modules for early generation handling
Chapter 17: Modules and Functions in GGEbiplot
17.1 Three main groups of functions
17.2 Data preparation for analysis using GGEbiplot
17.3 Two-way data manipulation
17.4 Four-way data manipulation
17.5 Model selection for generating a biplot
17.6 Options for visualizing a biplot
17.7 Change the appearance of the biplot
17.8 Change the view of the entries and testers
17.9 Image output
17.10 Automatic numerical output
17.11 User-requested numerical output
17.12 Conventional statistical analyses
17.13 Genotype evaluation based on multiple traits
17.14 The 3-D biplot module
17.15 Data plotting
Chapter 18: Conclusions
18.1 How to determine the effectiveness of crop variety trials
18.2 Key points on multi-environment trial data analysis
18.3 Key points on single-trial data analysis
18.4 Key points on multitrait data analysis
18.5 Tools for data management and analysis
References
Index
Preface
Crop variety trials are the most valued and best-funded form of applied agricultural research. Regardless of the level of economic development and the budget situation, crop variety trials are conducted every year in every region for every major crop of the region. Breeders rely on variety trials to select superior breeding lines to release as new cultivars; farmers rely on them to choose suitable cultivars to grow on their farms; and processors rely on them to decide where, and from which cultivars, to source their grain or other crop products.
The direct outcome of crop variety trials is data; the ultimate outcome is information on the target region, the test locations, and the genotypes, so that correct decisions can be made on the genotypes for the target region. Data analysis is the process of extracting useful information and drawing conclusions from the data.
Data analyses performed by most researchers conducting variety trials are quite simple, in spite of the numerous new and advanced methods advocated by statisticians. In most variety trial systems, the annual report of variety trials is limited to the following aspects:
(1) Genotype-by-trait two-way tables for each trial (location), with summary statistics for each trait, such as trial mean, standard error, and least significant difference.
(2) Genotype-by-location two-way tables for each trait in absolute values.
(3) Genotype-by-location two-way tables for each trait in values relative to the trial mean or to a check. Presenting relative values is one step forward: it removes the environmental main effects and facilitates data summary across trials.
(4) Genotypic means across all locations and/or across locations within subregions. This is another step forward, as it gives genotypic values for the region or subregions from which any genotype-by-location interactions across the whole region or within a subregion are removed. Genotypic values for a trait can then be used to rank the genotypes, and this ranking becomes the basis for selecting genotypes and recommending cultivars.
(5) In addition to genotypic means for the current year, some reports also include genotypic means across the most recent 2–5 years, when applicable. Genotypic ranking based on data from multiple years is certainly more credible, as any genotype-by-year and genotype-by-location-by-year interactions would be removed.
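Items (3) and (4) above can be sketched in a few lines of Python. This is a minimal illustration only: the genotype names, location names, and yield values are invented, and expressing values as a percentage of the trial mean is just one of the two options mentioned (the other being values relative to a check).

```python
# Hypothetical yield data (t/ha); genotype and location names are invented.
yields = {
    "G1": {"Loc1": 4.0, "Loc2": 6.0},
    "G2": {"Loc1": 5.0, "Loc2": 5.0},
    "G3": {"Loc1": 6.0, "Loc2": 7.0},
}
locations = ["Loc1", "Loc2"]

# Trial mean for each location, taken over all genotypes tested there.
trial_means = {
    loc: sum(row[loc] for row in yields.values()) / len(yields)
    for loc in locations
}

# Item (3): each value expressed as a percentage of its trial mean,
# which removes the environmental main effect.
relative = {
    g: {loc: 100.0 * row[loc] / trial_means[loc] for loc in locations}
    for g, row in yields.items()
}

# Item (4): genotypic mean of the relative values across locations.
genotypic_means = {
    g: sum(row.values()) / len(locations) for g, row in relative.items()
}

# Rank genotypes from highest to lowest genotypic mean.
ranking = sorted(genotypic_means, key=genotypic_means.get, reverse=True)
print(ranking)
```

A ranking produced this way rests on the same assumptions as the summary it imitates: every location is given equal weight, and no repeatable genotype-by-location interaction is taken into account.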
Primitive as they may appear, these simple data summaries and analyses are quite effective, as evidenced by the continuous progress in cultivar development and crop production in various crops worldwide. However, the analyses may be improved by asking a few questions. First, when summarizing across all test locations, it is assumed that there are no repeatable genotype-by-location interactions (GL) within the target region represented by these locations. Is this true? When summarizing across locations within subregions, it is assumed that there are repeatable genotype-by-subregion interactions and that there are no repeatable GL within subregions. Are these true? If the answer to any of these questions is “no” or “not sure,” then the data summary system may be suboptimal and should be improved. The process of answering these questions is “mega-environment analysis.” Second, the genotypic means across locations and years are calculated under the assumption that all test locations are equally representative of the target mega-environment and equally informative about the genotypes. Are these true? If the answer to either of these is “no” or “not sure,” then the system may also be suboptimal and need to be improved. The process of answering these questions is “test location evaluation.” Third, two genotypes ranked the same based on genotypic means may be quite different in their specific adaptations or stability across the target region. This is the issue of “stability analysis,” which has been a buzzword in variety trial data analysis. Many stability indices have been proposed during the last 50 years, but none is widely used by practical researchers, because the researchers are more confused than enlightened by these indices. Clear guidance is needed here to improve the precision and accuracy of genotype evaluation and cultivar recommendation.
Fourth, decisions on genotypes have to be based not on a single trait such as yield alone but also on quality and other traits; unfortunately, desirable traits are often undesirably associated. Genotype evaluation based on undesirably associated traits is so difficult a task that most variety trial reports leave it untouched. However, this is a decision that must be made, and tools and guidelines are needed. Finally, variety trial data analysis and decision-making have been hindered not only by a lack of knowledge but also by a lack of relevant, intuitive, verifiable, and user-friendly software. Although many comprehensive, powerful software packages are available, they are designed for use more by professional statisticians than by hands-on plant breeders and agronomists; and although statisticians and breeders try to work closely, there is always a large gap between them owing to their different knowledge bases and research interests.
This book is written to fill these gaps. It is written to help researchers conducting crop variety trials to answer various questions and to provide solutions in variety trial design, conduct, data management, data analysis, and decision-making. It starts with the definition of heritability in the framework of multiyear, multilocation variety trials, which is the theoretical foundation of variety trials and crop improvement. Heritability is the measure of the usefulness of variety trials in variety evaluation. All practical measures in variety trials, from design and conduct to data analysis, have a single purpose: to improve the heritability of the trials so that superior genotypes can be effectively identified for the target environment (Chapter 1). There are three levels of variety trial data: single trial, multilocation trials in a single year, and multilocation trials in multiple years. The analytical techniques needed include conventional methods such as analysis of variance, variance component analysis, linear correlation, and multiple regression, as well as graphical methods, particularly biplot analysis (Chapter 2).
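For orientation, heritability on a genotype-mean basis in a genotype–location–year framework is commonly written in quantitative genetics as shown below. The notation is an assumption made here for illustration and may differ from that used in Chapter 1: the sigma-squared terms are the variance components for genotype (G), genotype-by-location (GL), genotype-by-year (GY), genotype-by-location-by-year (GLY), and experimental error, and l, y, and r are the numbers of locations, years, and replicates.

```latex
H = \frac{\sigma_G^2}
         {\sigma_G^2
          + \dfrac{\sigma_{GL}^2}{l}
          + \dfrac{\sigma_{GY}^2}{y}
          + \dfrac{\sigma_{GLY}^2}{l\,y}
          + \dfrac{\sigma_{\varepsilon}^2}{l\,y\,r}}
```

Read this way, testing in more locations, years, or replicates shrinks the corresponding terms in the denominator, which is one sense in which practical measures in variety trials act on heritability.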
Biplot analysis was first developed by Gabriel (1971) and has become a popular method in variety trial data analysis under the name “GGE biplot,” following some of our work (Yan et al., 2000; Yan, 2001; Yan and Kang, 2003). Biplot analysis is a powerful data visualization tool and can be used to graphically address many research questions, including those listed above. However, biplot analysis has not been used properly and adequately in many publications. This is understandable, as it is still a new technique to most agricultural researchers and its properties and utilities are still being discovered and developed. The principles of biplot analysis, frequently asked questions, and frequently seen mistakes related to biplot analysis constitute a fair portion of this book (Chapters 3–6). Biplot analysis and conventional statistical analyses are jointly used in the analysis of different levels of data for a single trait (e.g., yield) to address the following issues: spatial or field trend analysis for single-trial data (Chapter 7), mega-environment analysis based on data from single and multiple years (Chapters 8 and 12), test location evaluation based on data from single and multiple years (Chapters 8 and 13), and genotype evaluation based on data from single and multiple years (Chapters 8 and 14). Genotype evaluation and decision-making based on multiple traits are addressed in Chapter 9. In addition, Chapter 10 illustrates the use of biplot analysis in studying trait associations in different environments, which can be extended to quantitative trait loci (QTL) identification based on phenotypic data from multiple environments. Chapter 11 illustrates the use of biplot analysis in studying location-by-trait patterns; this is a new application of biplot analysis and can be useful for processors in identifying locations or regions for sourcing grain with a desirable quality profile.
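As a rough illustration of the idea behind a GGE biplot, the sketch below centers a genotype-by-environment table by environment and takes a singular value decomposition with NumPy. It is not the GGEbiplot software's implementation: the data are invented, and splitting each singular value equally between genotype and environment scores is one possible choice assumed here for simplicity.

```python
import numpy as np

# Hypothetical genotype-by-environment yield table
# (rows: 4 genotypes, columns: 3 environments); values are invented.
Y = np.array([
    [4.0, 6.0, 5.0],
    [5.0, 5.0, 6.0],
    [6.0, 7.0, 4.0],
    [3.0, 6.0, 7.0],
])

# Environment centering: subtracting each column mean removes the
# environmental main effect, leaving genotype (G) plus
# genotype-by-environment (GE) variation.
centered = Y - Y.mean(axis=0)

# Singular value decomposition of the centered table.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Assumed scaling: split each singular value equally between
# genotype and environment scores for the first two components.
genotype_scores = U[:, :2] * np.sqrt(s[:2])
environment_scores = Vt[:2, :].T * np.sqrt(s[:2])

# Inner-product property: the score products reproduce the
# rank-2 approximation of the centered table.
rank2 = genotype_scores @ environment_scores.T

# Proportion of G + GE variation captured by the two components.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(round(float(explained), 3))
```

Plotting `genotype_scores` and `environment_scores` on the same two axes would give the biplot itself; the proportion in `explained` indicates how much of the centered table that two-dimensional picture can represent.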
Chapter 15 describes a relational database system for storing, managing, and utilizing multilocation, multiyear variety trial data. Chapter 16 describes experimental designs for crop variety trials and breeding nurseries. Most of the biplot analyses and conventional analyses, plus data management and experimental design, are conducted using the GGEbiplot software (www.ggebiplot.com). The modules and functions of GGEbiplot are therefore systematically but succinctly introduced in the penultimate chapter (Chapter 17). As a crop breeder and the developer of the software, I use it for almost all aspects of my breeding work, from experimental design to data management, data analysis, and decision-making. Its high efficiency and user-friendliness have allowed me time to write research papers and to edit and review manuscripts for many scientific journals, in addition to running a productive oat breeding program.
I write this book as a hands-on plant breeder. All issues addressed in this book are real problems identified from my own breeding work. In fact, since my target region is Eastern Canada, the sample data used in this book are mostly real data from the oat variety trials conducted in Eastern Canada or across Canada. This is out of convenience, but also ensures that the topics are relevant, methods are valid, and conclusions are meaningful and verifiable. This book is written for plant breeders/agronomists who conduct and analyze variety trials and statisticians who work with them. I hope that breeders/agronomists will find the book useful in providing the theoretical framework and a holistic picture about crop variety trials, in providing solutions to experimental design, data management, and data analysis at various levels and aspects, and in clarifying some long-standing confusions related to genotype-by-environment data analysis and stability analysis. I also hope that statisticians will find this book useful in understanding the problems facing plant breeders/agronomists to make their assistance more relevant and efficient. This book is also written for graduate students in the areas of plant breeding, genetics, agronomy, and applied statistics so that they are better prepared as future plant breeders, agronomists, and agricultural statisticians.
