Evaluation of training practice essay
The following post is an essay researched and written in anticipation of an exam question for the module Training and Development in pursuit of the Organizational Psychology Master's qualification within the University of London's International Programmes (Birkbeck College). Excerpts may be used with the citation:
Aylsworth, J. (2011). The Kirkpatrick model, alternatives and their value to practitioners. (url) Accessed: (Month year).

"Critically evaluate the Kirkpatrick model of training evaluation in terms of its value to the practitioner. Does any alternative model of training evaluation offer better guidance?"

The Kirkpatrick Model, Alternatives and their Value To Practitioners

Kirkpatrick’s (1959) framework is the model of evaluation most commonly used (Goldstein & Ford, 2002) because it is “simple, practical” and because trainers (practitioners) “aren’t much interested in a scholarly approach” (Kirkpatrick, 1996). However, it has been widely criticized (Aguinis & Kraiger, 2009) by those seeking to address its shortcomings (e.g. Holton, 1996).
The plan

After briefly discussing key terms, we’ll look at the Kirkpatrick model, comparing and contrasting it with some of the alternatives. We will conclude that, despite its limitations, the Kirkpatrick model remains the most adaptable choice for trainers, particularly if they can blend its summative nature with its formative potential.

Key Terms

Along with facilitation of transfer, “evaluation” is the third and final phase of the systematic training cycle (Arnold, Silvester et al., 2004). Evaluation can be defined rather vaguely as “any attempt to obtain feedback” as to the effectiveness and value of the training (Hamblin, 1974) or, more rigorously, as a “systematic collection of descriptive and judgmental information” for more far-reaching purposes (e.g. training modification) (Goldstein & Ford, 2002). The term “training” is not clear-cut either: it is unsettled whether outcomes should focus on the trainees as individuals (DOE, 1971) or relate more to organizational effectiveness (Hinrichs, 1976). “Learning” goes beyond “training” to address cognitive changes that become integrated into trainees’ existing cognitive frameworks (Goldstein & Ford, 2002).

Returning to evaluation, we need to consider its purpose. Should it be summative (outcome-focused and designed to prove and control), or should it be formative (process-focused and designed to foster improvement and learning) (Easterby-Smith, 1994)? Instead of arguing either/or, we point to Iqbal et al.’s (2011) work, which suggests that the distinction need not be clear-cut. It is also worth remembering that evaluation should reflect the initial needs analysis.

Kirkpatrick's model

Kirkpatrick's model consists of four levels, or “steps,” as he initially called them (Kirkpatrick, 1996): reaction, learning, behavior and results. It was most likely based on Raymond Katzell’s (1956) “hierarchy of steps” (Smith, 2008).
The Kirkpatrick model is widely used to exemplify a summative approach (Goldstein & Ford, 2002). Kraiger et al. (1993) rightly point out that Kirkpatrick’s model says nothing about specific outcome criteria for any of the levels or how they will be assessed. Alliger et al. (1997), in their meta-analysis of the model, found little support for it. The strongest correlation between levels was an r of .26 between the utility (usefulness) component of reaction and immediate learning. Within the learning level, immediate knowledge and knowledge retention correlated at .35, but correlations with behavior were quite modest. The division of “reaction” into “affective” and “utility” may be a move in the right direction, but the datasets are still too small to justify a firm recommendation (Goldstein & Ford, 2002). These data do not suggest much of a need for practitioners to invest resources in further evaluation, and they certainly argue against adding more levels (e.g. Phillips’ (1994) return on investment), though the addition of Kaufman & Keller’s (1994) fifth level (societal outcomes) may be a different issue (Aguinis & Kraiger, 2009). However, we can do no more than allude to it here.

Holton & Bramley models

Yet another framework for considering evaluation is linear versus cyclical versus integrated (Thompson, Eriksen-Brown et al., 2009). Kirkpatrick’s four steps fit the linear model. Bramley’s (1999) Improving Organizational Effectiveness Model is a cyclical model, which allows the contribution of other factors (e.g. performance equals some function of ability x motivation x opportunity) to be considered during the course of the evaluation. Such a model would give practitioners the flexibility to adapt their evaluation to changing resources, but it offers little further guidance on how to do so.
Holton’s (1996) Evaluation and Research Model represents an integrated approach, one that is more conceptually grounded and acknowledges secondary influences such as motivation and environmental elements. Holton has criticized Kirkpatrick’s model for its lack of correlations among levels and, interestingly, proposes dropping the first level, reaction, which is the only one (Alliger et al., 1997) with a correlation worth acknowledging. The complexity of Holton’s model and the fact that some aspects of it are only beginning to be tested (Holton, 1996) do not recommend it as a practical choice for trainers, particularly those who are hardest-pressed to justify their positions within the organization in today’s climate of outsourcing and downsizing.

Kirkpatrick revisited

Kirkpatrick defends his model, writing that he doesn’t care whether it’s a model or a taxonomy “as long as trainers find it useful.” He acknowledges that it doesn’t provide details on how to implement the four levels, writing that its “chief purpose is to clarify the meaning of evaluation and offer guidelines on how to get started and proceed.” We suggest that the most helpful guidance for practitioners comes from Easterby-Smith (1994), who proposes that Kirkpatrick’s first two levels (reaction and learning) can be formative, while the third and fourth (behavior and results) can remain summative. We think this makes a great deal of sense because the expense of evaluation increases with each level, and Easterby-Smith’s suggestion would give practitioners a resource-considered alternative to proceeding blindly beyond the common practice of “ritual evaluation” (e.g. happy sheets with no real purpose). However, it also assumes that practitioners have some familiarity with models other than Kirkpatrick’s and that they have realized the importance of associating the training criteria with the outcomes being measured.
Conclusions

An important question is “what happens in practice?” and the answer is something of a disconnect. Practitioners seek performance and improvement in motivation (and, of course, increased profitability) (Sparrow & Kent, 2005), but evaluations typically don’t measure those outcomes. Saari, Johnson et al. (1998) found that 92% of U.S. companies conducted some form of training evaluation, while others (UK Industrial Society, 2000) found that 83% assessed only trainees' immediate reactions.

Finally, we turn to Sloman and his reflections on the 2008 CIPD learning and development survey. Recalling Sloman’s (1999) description of Senge’s (1990) “learning organization” as underdefined and “woolly,” his CIPD article takes us back to the barnyard. He notes that practitioners want a “shift away from menu-led, sheep-dip learning.” Yet the areas where practitioners want training to go (e.g. less focused on return on investment and more focused on return on expectation) do not seem to represent outcomes that are easily defined or matched with training design. Certainly, it seems that these “wish-list” outcomes cannot be assessed solely through summative measures that do not consider process. Nor should we expect that the resources for more formative approaches will be forthcoming. And so we will keep Kirkpatrick’s model a little while longer, looking for opportunities to transition from summative to formative as resources allow.

Exam performance: The essay was not used under exam conditions, so how it would have been marked is unknown.