Evaluation of training compares the post-training results to the objectives expected by managers, trainers, and trainees. Too often, training is done without any thought of measuring and evaluating it later to see how well it worked. Because training is both time-consuming and costly, evaluation should be done. The management axiom that “nothing will improve until it is measured” may apply to training assessment. In fact, at some firms, what employees learn is directly related to what they earn, which puts this principle of measurement into practice.
One way to evaluate training is to examine the costs associated with the training and the benefits received through cost/benefit analysis. As mentioned earlier, comparing costs and benefits is easy until one has to assign an actual dollar value to some of the benefits. The best way is to measure the value of the output before and after training. Any increase represents the benefit resulting from training.
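The before-and-after comparison just described can be worked through with a few lines of arithmetic. The following is only a minimal sketch: the function name and the dollar figures are hypothetical, and it assumes that the value of output and the cost of training have already been expressed in dollars, which is often the difficult part.

```python
def training_cost_benefit(output_before, output_after, training_cost):
    """Compare the dollar value of output before and after training.

    All three arguments are hypothetical dollar amounts supplied by the user.
    Returns (benefit, net_benefit, benefit_cost_ratio).
    """
    if training_cost <= 0:
        raise ValueError("training_cost must be a positive dollar amount")
    # The benefit is any increase in the value of output after training.
    benefit = output_after - output_before
    # Net benefit is what remains once the cost of training is subtracted.
    net_benefit = benefit - training_cost
    # A ratio above 1.0 means the benefit exceeded the cost.
    ratio = benefit / training_cost
    return benefit, net_benefit, ratio


# Hypothetical example: output valued at $500,000 before training and
# $560,000 after, with training that cost $40,000.
benefit, net_benefit, ratio = training_cost_benefit(500_000, 560_000, 40_000)
print(benefit, net_benefit, ratio)
```

In this illustration the benefit is $60,000, the net benefit is $20,000, and the benefit/cost ratio is 1.5. The calculation treats the entire increase in output as a result of training, which is an assumption that should be checked against other possible causes.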
However, careful measurement of both the costs and the benefits may be difficult in some situations. Therefore, benchmarking training has grown in usage.

Rather than doing training evaluation internally, some organizations are using benchmark measures of training that are compared from one organization to others.
To do benchmarking, HR professionals in an organization gather data on training and compare it to data on training at other organizations in the same industry and of similar size. Comparison data is available through the American Society for Training and Development (ASTD) and its Benchmarking Service. This service has training-related data from over 1,000 participating employers who complete detailed questionnaires annually. Training also can be benchmarked against data from the American Productivity and Quality Center and the Saratoga Institute.
In both instances, data is available on training expenditures per employee, among other measures.
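A simple benchmark comparison on training expenditures per employee might look like the sketch below. The function names and all of the figures are hypothetical; they are not drawn from ASTD or any other benchmarking service, and a real comparison would use peer data from organizations in the same industry and of similar size.

```python
from statistics import median


def expenditure_per_employee(total_training_spend, employee_count):
    """Return training dollars spent per employee."""
    if employee_count <= 0:
        raise ValueError("employee_count must be positive")
    return total_training_spend / employee_count


def compare_to_benchmark(own_value, peer_values):
    """Compare one organization's figure with the median of its peers."""
    peer_median = median(peer_values)
    return {
        "own": own_value,
        "peer_median": peer_median,
        # A negative gap means the organization spends less than its peers.
        "gap": own_value - peer_median,
    }


# Hypothetical figures: $450,000 spent on training for 600 employees,
# compared with five peer organizations' per-employee expenditures.
own = expenditure_per_employee(450_000, 600)
print(compare_to_benchmark(own, [600, 800, 900, 700, 1000]))
```

Here the organization spends $750 per employee against a peer median of $800. The median is used rather than the average so that one unusually high or low peer does not distort the comparison.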

It is best to consider how training is to be evaluated before it begins. Donald L. Kirkpatrick identified four levels at which training can be evaluated: reaction, learning, behavior, and results. Evaluation becomes increasingly difficult as it moves from reaction to learning to behavior to results measures.
But the value of the training increases as it can be shown to affect behavior and results rather than only reaction and learning. Later research has examined Kirkpatrick’s schematic and raised questions about how independent each level is from the others, but the four levels are widely used to focus on the importance of evaluating training.

Organizations evaluate the reaction level of trainees by conducting interviews or by administering questionnaires to the trainees. Assume that 30 managers attended a two-day workshop on effective interviewing skills. A reaction-level measure could be gathered by having the managers complete a survey that asked them to rate the value of the training, the style of the instructors, and the usefulness of the training to them. However, the immediate reaction may measure only how much the people liked the training rather than how it benefited them.

Learning levels can be evaluated by measuring how well trainees have learned facts, ideas, concepts, theories, and attitudes. Tests on the training material are commonly used for evaluating learning and can be given both before and after training to compare scores. To evaluate training courses at some firms, test results are used to determine how well the courses have provided employees with the desired content. If test scores indicate learning problems, instructors get feedback, and the courses are redesigned so that the content can be delivered more effectively.
To continue the example, giving managers attending the interviewing workshop a test at the end of the session to quiz them on types of interviews, legal and illegal questions, and questioning types could indicate that they learned important material on interviewing. Of course, learning enough to pass a test does not guarantee that the trainee can do anything with what was learned or behave differently.
One study of training programs on hazardous waste operations and emergency response for chemical workers found that the multiple-choice test given at the end of the course did not indicate that those trained had actually mastered the relevant material. Also, as students will attest, what is remembered and answered on learning content immediately after the training is different from what may be remembered if the “test” is given several months later.

Evaluating training at the behavioral level involves (1) measuring the effect of training on job performance through interviews of trainees and their coworkers and (2) observing job performance. For instance, a behavioral evaluation of the managers who participated in the interviewing workshop might be done by observing them conducting actual interviews of applicants for jobs in their departments. If the managers asked questions as they were trained and they used appropriate follow-up questions, then a behavioral indication of the interviewing training could be obtained. However, behavior is more difficult to measure than reaction and learning. Even if behaviors do change, the results that management desires may not be obtained.

Employers evaluate results by measuring the effect of training on the achievement of organizational objectives. Because results such as productivity, turnover, quality, time, sales, and costs are relatively concrete, this type of evaluation can be done by comparing records before and after training. For the interviewing training, records of the ratio of individuals hired to offers of employment made before and after the training could be gathered.
The difficulty with measuring results is pinpointing whether it actually was training that caused the changes in results. Other factors may have had a major impact as well. For example, managers who completed the interviewing training program can be measured on employee turnover before and after the training.
But turnover is also dependent on the current economic situation, the demand for products, and the quality of employees being hired. Therefore, when evaluating results, managers should be aware of all issues involved in determining the exact effect of the training.

If evaluation is done internally because benchmarking data are not available, there are many ways to design the evaluation of training programs to measure improvements.
The rigor of the three designs discussed next increases with each level.

The most obvious way to evaluate training effectiveness is to determine after the training whether the individuals can perform the way management wants them to perform. Assume that a manager has 20 typists who need to improve their typing speeds. They are given a one-day training session and then given a typing test to measure their speeds. If the typists can all type at the required speed after training, was the training beneficial? It is difficult to say; without a measure taken before training, there is no way to know whether the typing speed is a result of the training or could have been achieved without it.

By designing the typing speed evaluation differently, the issue of pretest skill levels could have been considered. If the manager had measured the typing speed before and after training, the manager could have known whether the training made any difference. However, a question remains. If there was a change in typing speed, was the training responsible for the change, or did these people simply type faster because they knew they were being tested? People often perform better when they know they are being tested on the results.
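The pre-/post-measure comparison can be sketched as a short calculation. The function name and the typing speeds below are hypothetical, and the sketch assumes the same typists are tested in the same order before and after training.

```python
from statistics import mean


def pre_post_gain(pre_scores, post_scores):
    """Return the average change in score from pretest to posttest.

    Each position in the two lists must refer to the same trainee.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("each trainee needs both a pre and a post score")
    # Gain for each trainee is the posttest score minus the pretest score.
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(gains)


# Hypothetical typing speeds (words per minute) for three typists.
pre = [50, 55, 60]
post = [58, 60, 68]
print(pre_post_gain(pre, post))
```

In this illustration the typists gained an average of 7 words per minute. The calculation shows that speeds changed, but by itself it cannot show whether training or the testing situation caused the change.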

Another evaluation design can address this problem. In addition to the 20 typists who will be trained, a manager can test another group of typists who will not be trained to see if they do as well as those who are to be trained. This second group is called a control group. If, after training, the trained typists can type significantly faster than those who were not trained, the manager can be reasonably sure that the training was effective.
There are some difficulties associated with using this design. First, having enough employees doing similar jobs to be able to create two groups may not be feasible in many situations, even in larger companies. Second, because one group is excluded from training, there may be resentment or increased motivation among those in the control group, which could lead to distorted results, either positive or negative. Additionally, this design assumes that performance measurement can be done accurately in both groups, so that any performance changes in the experimental group can be attributed to the training.
Other designs also can be used, but these three are the most common ones. When possible, the pre-/post-measure or pre-/post-measure with control group design should be used, because each provides a much stronger measurement than the post-measure design alone.
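For the pre-/post-measure with control group design, one common way to summarize the data is to subtract the control group's average gain from the trained group's average gain. The sketch below takes that approach; the function names and typing speeds are hypothetical, and it assumes both groups are tested at the same two points in time.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Return the average posttest-minus-pretest change for one group."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each person needs both a pre and a post score")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Estimate the gain attributable to training.

    The control group's gain reflects influences other than training,
    such as knowing that one is being tested, so it is subtracted out.
    """
    trained_gain = mean_gain(trained_pre, trained_post)
    control_gain = mean_gain(control_pre, control_post)
    return trained_gain - control_gain


# Hypothetical typing speeds (words per minute) for two small groups.
effect = training_effect(
    trained_pre=[50, 55, 60], trained_post=[60, 66, 69],
    control_pre=[52, 54, 58], control_post=[54, 57, 59],
)
print(effect)
```

In this illustration the trained typists gained an average of 10 words per minute and the untrained typists gained 2, which suggests that about 8 words per minute of the improvement can be attributed to training. With groups this small, a difference could easily arise by chance, so a real evaluation would need larger groups and a test of statistical significance.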

