4

Tools of the Trade

Eight Processes for Gathering Data

Knowing what you are looking for is one thing; working out how best to see it is another. For this, you need the tools of the trade: the measurement methods, tests, and instruments that enable you to assess both the standard signs of talent, such as intelligence and personality, and the different types of fit. You will already be familiar with many of these tools. Yet just as with what to measure, there are some common misperceptions about how to measure and which tools are best to use.

In this chapter, we look at eight major measurement methods, from everyday methods, such as interviews, to deeply technical ones, such as psychometrics and situational judgment tests. For each method, we review the latest research, reveal key points, and explore what it can and cannot do. In chapter 5, we then consider some guidelines for how to decide which method or methods to use.

First, though, we address a more basic question: Why do we need formal measurement methods at all? Why not just ask candidates and those who know them what their talents are, and let things go at that? The answer, which comes down to problems of bias, may seem obvious. But considering it more closely can help us to understand the fundamental challenges facing all measurement methods and thus guide decisions about which tools to use.

Two Basic Ways of Measuring Talent

If you want to know how talented someone is at something, in theory your two best options are testing performance or measuring his or her actual results. Yet in today's work environment, there are few aspects of performance that you can test directly without asking someone's opinion, especially in complex roles. And as we described in chapter 2, measuring results is often not straightforward. It can be hard to know whether outcomes are due to genuine talent or other factors like opportunity or plain luck.

This leaves you with two options: ask people themselves how good they are, or ask someone else (like their manager). In fact, these are the two most common ways of obtaining information about people's talent, and most measurement methods involve at least one of them.

Asking People Themselves

So can people accurately rate how good they are at something? The consensus from research is that most people do have a reasonable sense of whether they have some skill in something. For example, they know whether they can drive a truck and whether they are closer to being a novice or an expert. What they are less able to do is to make specific, accurate ratings of how good they are compared to others.

For instance, if you look at everyone overall, you find that people tend to slightly overrate their abilities.1 This is particularly true when they believe that the rating will have some consequence for them, as when it is used in appraisals or to inform selection decisions. Yet this general tendency hides some big individual differences. People vary significantly in the degree to which they over- or underrate themselves.2 Some people overrate greatly, and others underrate, so a big problem with self-ratings is their reliability: their accuracy can vary greatly from one person to the next. And adding spice to this is the Dunning-Kruger effect: the finding that people of low ability tend to overrate themselves, while people of high ability tend to underrate themselves.3

These tendencies to over- or underrate are examples of what are formally called ratings bias, judgment errors, or response distortion. They are the enemy of accurate measurement, and almost every evaluation you make is affected by them to some degree. Some people, for instance, always tend to respond positively to “yes or no” and “true or false” questions.4 Others tend to respond either more moderately or more extremely. Women, for example, tend to give more extreme ratings than men do.5 And research has shown that demographic and geographical cultural differences can affect how people rate themselves too.6 Just think of the stereotypical British way of describing something they like as “not bad” versus the American “great.”

Together these issues make it difficult to obtain an accurate picture of how good someone is just by asking him or her. And this assumes that the person answers honestly; it does not even touch on the issues of faking and cheating.

So what about asking others?

Asking Others

Unfortunately, similar issues exist when asking others how good someone is. Indeed, if the literature showing difficulties with self-ratings is substantial, it is dwarfed by the research into the challenges of rating others.

Rating others does seem to be something people are capable of: there is evidence that the ratings we make of others can accurately reflect their performance levels. Yet almost all of the rating biases that affect our ability to rate ourselves also apply to rating others, and there are more besides. You may have heard of some of these, such as the halo effect.7 This is where our ratings of individuals' specific qualities are influenced by our overall impression of them. It is why we tend to rate the performance of people we like or find attractive more highly than that of people we do not. It is also why almost everyone—even the professional assessor—tends to rate the performance of senior leaders more highly than that of lower-level managers.8

Other biases may be less familiar. For instance, it is obvious that we need to know someone or at least have had a chance to observe a person to rate his or her ability at something. What is less clear is that knowing people can hinder the accuracy of our judgments too. Studies differ on how long we need to know someone before this bias comes into effect. But after somewhere between eighteen months and five years, we appear to lose our ability to rate others' performance objectively.9

Many biases seem to be culturally based and unconscious: we are simply not aware of them. We appear to rate those similar to ourselves more highly even when we are not aware that they are similar to us (e.g., if they have the same personality profile).10 Then there is the fact that we are overly influenced by first impressions, preconceptions, and stereotypes: we make up our minds too quickly and tend to hear only the information that reinforces our initial impressions. And topping this off is the irony that despite all of this, we are generally overconfident in our judgments about others.11

So as with self-ratings, the ratings of others can be accurate, but they are notoriously unreliable. No matter how objective and free of bias we like to think we are, the sober fact is that we usually are not. Indeed, given all the issues, you could be forgiven for thinking that it is amazing that anyone ever rates anything accurately. Yet we do. We often just need a bit of help.

Eight Less Biased Methods for Measuring Talent or Fit

This is where more formal and structured measurement methods come into play. They are designed to try to minimize the impact of the biases and limitations that plague all our core sources of information about talent. We will look at eight of the most common methods:

  • Tools for sifting candidates
  • Interviews
  • Psychometric tests
  • Assessment centers
  • Situational judgment tests
  • Individual psychological assessment
  • 360-degree feedback
  • Work sample tests, simulations, and games

To understand how effective each method is, we will look not only at its predictive validities, but also at how it helps us overcome the basic challenges of bias. And as for where to begin, let us start with what for many talent measurement processes is the beginning: tools for sifting candidates.

Sifting Tools

Almost all selection processes involve a kind of funnel. You start with a long list of applicants, potentially thousands of them, and then gradually narrow the list. At this stage, the purpose of talent measurement is to identify the candidates who stand the best chance of ultimately being selected.

This sifting process is often a collection of different methods. To minimize expense and speed things up, companies typically begin with low-cost methods that require little human interaction. Reviewing résumés is almost always part of this process, and telephone interviews and psychometrics are common.12

The automated résumé checkers and application form scoring systems that have emerged over the past decade or so are grabbing the headlines in this area. As their names suggest, checkers look directly at people's résumés, while application scorers review information that candidates enter into online forms. Both involve candidates providing information, which is then rated according to a preprogrammed algorithm or scoring system. These algorithms often include so-called killer items—factors such as work visa status that can result in the immediate rejection of the candidate. Other items or questions can be weighted according to the demands of the role.
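To make the mechanics concrete, here is a minimal sketch in Python of how such a scoring system might work. The field names, weights, and killer items below are hypothetical illustrations, not taken from any real vendor's product; in practice the scoring formula would be configured around the demands of the specific role.

```python
# A minimal, hypothetical sketch of automated application scoring.
# Field names, weights, and thresholds are invented for illustration.

KILLER_ITEMS = {
    # Any failed check results in immediate rejection.
    "has_work_visa": lambda answer: answer is True,
}

WEIGHTED_ITEMS = {
    # question id: (weight, scoring function returning a value from 0.0 to 1.0)
    "years_relevant_experience": (3.0, lambda years: min(years, 10) / 10),
    "degree_level": (1.5, lambda level: {"none": 0.0, "bachelor": 0.7, "master": 1.0}.get(level, 0.0)),
    "willing_to_relocate": (0.5, lambda answer: 1.0 if answer else 0.0),
}


def score_application(answers):
    """Return a weighted score, or None if a killer item rejects the candidate."""
    for item, passes in KILLER_ITEMS.items():
        if not passes(answers.get(item)):
            return None
    return sum(weight * scorer(answers.get(item))
               for item, (weight, scorer) in WEIGHTED_ITEMS.items())


applicants = [
    {"id": 1, "has_work_visa": True, "years_relevant_experience": 4,
     "degree_level": "bachelor", "willing_to_relocate": True},
    {"id": 2, "has_work_visa": False, "years_relevant_experience": 9,
     "degree_level": "master", "willing_to_relocate": True},
]

# Reject on killer items, then rank the remaining applicants by score (highest first).
scored = [(score_application(a), a["id"]) for a in applicants]
ranked = sorted((s, i) for s, i in scored if s is not None)[::-1]
print(ranked)  # applicant 2 is rejected outright by the visa check
```

The point of the sketch is simply that the quality of the sift depends entirely on how sensible the weights and scoring rules are, which is exactly the challenge we return to below.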

In our experience, these systems are popular because they are fast and efficient and because the use of mechanical algorithms tends to be seen as fair. There are some limitations and downsides, of course. Initial setup costs can be high. And these systems do not provide the level of candidate experience that most companies wish to offer their senior-level applicants. In fact, candidates at every level hate long application forms, yet a common mistake we see is for firms to collect more information than they need. There is also the issue that automated systems can reduce diversity because they can exclude candidates with nontraditional backgrounds.

Three developments in sifting are of note. First, companies are increasingly using self-selection. They are open and clear about selection criteria so that those who know that they cannot pass may elect not to apply. Second is the growth of online reference checkers, which are effectively 360-degree feedback tools. They make checking references quick and easy, to the extent that some firms are including checkers in their sifting process. These tools are undeniably attractive and impressive. Yet it remains to be seen whether they will help firms overcome the perennial issue of references: the poor quality of information they provide.

Finally, there is a tendency among some organizations to use information from Facebook, LinkedIn, and other online social networks. It appears to be quite common: a 2012 survey found that 29 percent of firms thought that candidate information on social media sites can be useful.13 Our views on this are straightforward: checking public sites designed for corporate use, such as LinkedIn, seems entirely natural and no more useful or harmful than looking at a résumé. However, some firms have gone further and are now asking candidates to supply their Facebook passwords so that recruiters can review the site content. If you are thinking of doing this, our advice is simple: do not. It has become a legal and political issue in the United States and other countries, as complaints have been made that it breaches personal privacy. Laws to prevent this practice have already been introduced by some US states, and others are likely to follow suit. Even putting aside the legal and moral issues, there is no evidence that this is an effective way to predict performance in the workplace.

Purely on the basis of their efficiency, sifting tools will continue to be a prominent component of many firms' hiring processes. Their automated aspect reduces the impact of rating biases and thereby improves accuracy. Yet these tools are only as good as the scoring systems they use, so the challenge is how to get this scoring right. Unfortunately, the importance of this is sometimes overlooked, or it is masked by the sleek interfaces of these shiny new systems. Moreover, for the moment, independent research into whether they can indeed deliver the benefits that they promise is lacking. Caution and a careful eye to results are thus recommended.


Sifting Tools
When to use: In the early stages of a recruitment process. Automated tools may be useful when you have a large number of candidates and the budget for initial setup costs.
Potential benefits: Can enable you effectively and efficiently to narrow down a large pool of applicants to only those most likely to succeed.
Caveats and concerns: The scoring formula needs to be effective in order for the system to work, and this can take time and effort to ensure. Research is lacking into the predictive accuracy of these systems, and they may have a negative impact on the diversity of successful candidates.

Interviews

Today interviews are the most widely used measurement method, probably because of the natural desire to meet people before making a decision about them.14 Research has shown that interviews done well can lead to accurate judgments. They tend to be used to measure competencies and seem to measure different things from intelligence and personality tests, since they can offer incremental validity over them.15 Yet interviews are based on social interaction and the rating of others, so they are open to all sorts of biases. This means that they can be unreliable as a source of information since interviewers can vary considerably in how they rate the same candidate.16

For example, an intriguing fact about all types of interviews is that whatever their stated purpose, they usually do not in fact appear to measure competencies. Interviewers may think that they do and make competency ratings based on them. Yet in most cases, what they seem to be judging candidates on is really a mixture of how socially skilled the candidates are and the degree to which they have done the type of job before.17 Moreover, from our own research, we know that this seems to be the case regardless of how much training the interviewer has had. Even professional assessors have the same issue. It appears that we have a natural tendency to home in on particular things, such as how intelligent or likable people are. We then base our judgments on these factors rather than on the competencies that we are supposed to be measuring.

To counter the biases inherent in interviews, researchers have found that the best solution is to add structure. In fact, one study has described fifteen separate components of structure. The most common are things like providing scripted questions and banning spontaneous follow-up questions.18 Adding structure does not seem to do much about the issue of not measuring competencies, but it does seem to make interviewers more able to predict performance. In fact, some studies have suggested that structure can double the validity of interviews, from the 0.14 found for unstructured formats up to 0.35.19 This is better than most personality tests. Moreover, structure seems to reduce the opportunity for faking or impression management by candidates, while also reducing the likelihood of claims of unfairness and subsequent litigation.20

As a result of this research, the past twenty years have witnessed a sea change in interviewing as companies have moved to increasingly structured formats. There are many different types, but two main formats have emerged. There are competency-based interviews, which involve a series of questions about specific job-related competencies. You will recognize these from their trademark questions: “Describe a time when you …” or, “Give me an example of when you …” Then there are situational interviews, which ask questions about hypothetical situations, along the lines of, “What would you do if …”

There has been much debate about which type is better, and the evidence is mixed. Generally the finding has been that questions about what people have done enable you to make more accurate judgments than hypothetical questions about what they might do.21 There is also some suggestion that situational interviews may measure problem solving as much as anything else and thus offer lower incremental validity over intelligence tests.22

Yet despite near evangelical enthusiasm for structure from some proponents, the approach does have critics. For starters, competency-based interviews may be fine for lower-level jobs, but for higher-level roles, they do not offer a good enough candidate experience. Even for more junior jobs, if such interviews are too strictly structured, they can amount to little more than an oral examination. Indeed, interviews that discourage the use of unscripted questions risk ruining the key benefit of interviews: the opportunity to engage with the candidate as an individual.

Researchers often complain about many managers' reluctance to add structure to their interviews. This betrays a lack of understanding among researchers about how managers use interviews. Researchers have largely focused on making interviews more able to predict performance. But business interviewers also tend to use them to get a sense of the degree of fit or chemistry that exists between the candidate and the business. And, interestingly, there is evidence that skilled interviewers can gauge this sort of thing pretty well.23

Rounding all this off is emerging evidence that a bias among researchers toward publishing studies that are positive about structured interviews may have led to an overestimation of how good they are.24 For this reason, many companies prefer what are called semistructured interviews. These try to impose some kind of structure on proceedings, while also allowing interviewers to ask spontaneous questions. One example is biographical interviews, which explore candidates' past, usually by following a résumé and asking questions about each job.

Although semistructured interviews may not minimize the impact of bias quite as well as a fully structured interview, these formats can still improve predictive accuracy.25 And, critically, they do not rob interviews of the other benefits that they can provide. However, because semistructured interviews are more open to bias than fully structured ones, they require greater skill to conduct reliably well. This means that businesses need to be able and willing to train their staff in interviewing skills. Unfortunately, in our experience, this is something that is often overlooked.

As a result, firms can fail to make the most of the opportunity that interviews present. Indeed, we regard interviews as one of the biggest lost opportunities in talent measurement at present. We hinted at this in an earlier chapter when we argued that businesses should be more explicit about measuring aspects of fit. We have seen other examples of lost opportunities in interviews, too, such as the use of panel interviews. Each interviewer is often provided with training or guidance about his or her role, but there is less input on how interviewers should interact with one another to make the best judgments. This is especially important given the evidence that groups of people can be worse at making decisions than individuals.26 Multiplying the number of interviewers does not guarantee a better outcome.

In chapter 7, therefore, we look further into what businesses can do to make the most of the opportunities that interviews present and enhance the accuracy and reliability of this most fundamental of measurement methods.


Interviews
When to use: In all talent measurement processes. In fact, we believe that no one should ever be hired, fired, or promoted without an interview.
Potential benefits: At best, can offer predictive validities of over 0.35. Can also provide incremental validity over intelligence and personality tests. Can be used to predict person-organization, person-team, and person-manager fit, as well as job performance.
Caveats and concerns: Open to many biases and unreliable as a source of data. The best way of reducing bias—adding structure—can also reduce many of the potential benefits of interviews if too much is added. The best compromise solution—the semistructured interview—requires training if it is to be done well.

Psychometrics

For many people, talent measurement is synonymous with psychometrics. Strictly speaking, this includes all forms of psychological measurement. To most people, though, psychometrics refers to those questionnaire-based tests of things like intelligence, personality, and integrity. Some of them are tests of performance (such as intelligence measures), but most involve people rating themselves in response to questions. This opens the tests to rating biases. Yet by carefully selecting which questions are asked and asking them in certain ways, well-constructed tests can reduce the impact of these biases and achieve good levels of accuracy.

In spite of this, suspicion persists. Studies show that managers place more value on information about candidates when it is gained from interviews than when it is measured by psychometrics.27 They seem to trust their instincts more than the science. Some of this suspicion stems from lingering concerns about faking and cheating. Another issue is concern about the quality of some psychometric tests: some people are just not convinced that they work. And, frankly, who can blame them for their doubts?

Many people can remember the computer-generated reports of a decade ago that attempted to provide some interpretation of scores. Users initially liked the reports because they helped them understand the results. Yet at times, these reports appeared to be not much better than a horoscope. Moreover, over the past ten years, there has been a massive surge in the number of tests available on the market. The challenge for businesses—and what threatens to undermine the psychometric industry—is that although some tests are excellent, many are very poor, and organizations can have a hard time telling the difference. As we reported in chapter 1, only a small proportion of test publishers engage in proper validity studies.28

In chapter 5, we consider how businesses can navigate these issues when choosing tools. For now, what is important is that although there are doubts and concerns, the use of psychometric tests appears to be growing. The bottom line is that they are relatively quick, cheap, and effective. They are seen as an easy win.

Psychometrics has changed a lot over the years. One of the biggest developments was the introduction of first computerized and then online testing. Some people initially questioned whether these tests were equivalent to the old pencil-and-paper versions, but subsequent research has shown that this is not really a problem.29 Others also expressed concern that online tests are more prone to disruptions that can affect people's performance.30 Yet studies on the impact of this differ in their view of how significant it is.31

The reality is that computerized and online testing have made testing easier and better by allowing remote testing and the collection of bigger databases. They also allow vendors to do some interesting new things with their tests. They can, for instance, now measure aspects of how people are taking the test, such as how long each question takes, which can help identify cheating. And the very latest tests are beginning to include high-fidelity item presentation—that is, the use of video clips and animations.

A second major development has been the rise of new ways of scoring tests, such as computerized adaptive testing. Although they differ in approach, these new scoring methods all attempt to do the same thing: get more information from fewer questions. With traditional psychometrics, test takers answer a set of predetermined questions. But with computerized adaptive testing, the test adapts to individual test takers. If they get a question wrong, it gives them an easier one the next time; if they get a question right, it gives them a harder one the next time. This allows the test to home in on someone's level of ability quickly. Organizations are generally enthusiastic about these innovations, too, as they allow briefer tests. Indeed, the days of the hour-long personality test seem numbered in the face of new tests claiming to offer similar validities in just twenty minutes.
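For illustration, the sketch below captures the basic adaptive loop in a deliberately simplified form: step the difficulty up after a correct answer and down after an incorrect one, then treat the level reached as a rough estimate of ability. Real adaptive tests are typically built on item response theory and use far more sophisticated item selection; the item bank, step rule, and simulated answers here are all invented.

```python
import random

# Hypothetical item bank: difficulty runs from 1 (easiest) to 10 (hardest).
ITEM_BANK = {d: [f"question_{d}_{i}" for i in range(5)] for d in range(1, 11)}


def simulated_answer(true_ability, difficulty):
    """Stand-in for a real test taker: more likely to be correct on easier items."""
    return random.random() < 1 / (1 + 2 ** (difficulty - true_ability))


def adaptive_test(true_ability, num_items=10):
    """Simplified adaptive test: harder item after a correct answer,
    easier item after an incorrect one; report the level reached."""
    difficulty = 5  # start in the middle of the scale
    for _ in range(num_items):
        question = random.choice(ITEM_BANK[difficulty])  # shown to the test taker in a real test
        if simulated_answer(true_ability, difficulty):
            difficulty = min(difficulty + 1, 10)  # harder next time
        else:
            difficulty = max(difficulty - 1, 1)   # easier next time
    return difficulty  # crude estimate of the test taker's level


print(adaptive_test(true_ability=7))
```

Even this toy version shows why adaptive tests can be shorter: each question is chosen to be maximally informative about the person being tested rather than drawn from a fixed script.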

Indeed, one easy conclusion would be that no other method offers the same level of accuracy and reliability in such a short time and at so little cost. Yet there is a sting in the tail here because psychometrics has a hidden weak spot, and it is a big one. Although some vendors claim that the usefulness of a test lies in its validity levels, we heartily disagree. The utility of a test is determined by what is done with it. And the harsh reality is that the way many businesses use psychometrics leaves much to be desired. The results are often either ignored or overly revered as definitive measurements that capture everything worth knowing about people.

Some of this misuse stems from the suspicions we mentioned earlier about quality and reliability. But some of it results from a simple lack of understanding about how best to use the data. In chapter 7, we return to look at this issue and what you can and cannot do with measurement results. Psychometric tests may be the epitome of talent measurement and a key part of its future, but they need to be handled with care.


Psychometrics
When to use: In all talent measurement processes. Tests of ability and intelligence tend to be used earlier in selection processes. Tests of personality, values, and attitudes tend to be used later in the assessment, typically in combination with an interview.
Potential benefits: Relatively quick, cheap, and accurate.
Caveats and concerns: Many of the psychometrics on the market lack proper research to validate their accuracy, and it can be difficult to know which tests work and which do not. In addition, psychometric results need more work and more careful handling than the outputs of many other measurement processes.

Assessment Centers

Assessment centers have long been held up as the gold standard of talent measurement. In their classic format, they involve multiple assessors who are observing a group of participants over a series of exercises and tests and rating them on a number of competencies. Effectively, then, they are a collection of other methods, yet they are also more than the sum of their parts. By involving both test scores and multiple assessor ratings, they seem able to minimize the impact of rater biases and produce reliable judgments.32 They thus appear to enable good predictive judgments, with reported validities of around 0.40.33 This is higher than interviews and personality tests but lower than intelligence tests.

The German military of the 1920s is widely credited with starting the assessment center concept as a way of selecting soldiers, although the term was coined in the 1930s by a Harvard professor. It was not until the 1970s that assessment centers really caught on in businesses, though. Best practice in how to develop and run them is now well documented, and there are numerous books on the subject. But over the years, three particularly interesting issues have emerged.

First, assessment centers are designed to measure competencies. Yet there is now a substantial body of evidence that shows that, like interviews, they may often not actually do so. Evidence suggests, then, that although assessors may appear to measure different competencies, what they are often seeing and actually rating is overall performance in each exercise.34 For example, an assessor may be tasked when observing a group discussion exercise with rating the participants on a number of competencies, such as assertiveness and communication skills. Yet statistical analyses show that more often than not, even though they may give separate ratings to these two competencies, what they are really assessing is the ability to do well in a discussion exercise.

The practical implication is that when you choose or develop assessment centers, you should focus not so much on how they measure particular competencies, but on what exercises they use and their relevance to what you are trying to measure.

Second, assessment centers do not seem to work as well as they used to. Validity levels seem to be falling, with a recent meta-analysis reporting levels of around 0.27.35 If this is true, then assessment centers have lost roughly a third of their predictive power in a decade. A number of different theories as to why this is happening have been suggested, but the most convincing is that the quality of assessment centers seems to be declining. Some researchers are concerned about the increasing use of off-the-shelf centers that are not sufficiently tailored to organizations' needs. Others cite the disturbingly high number of competencies that many centers seem to try to measure. A recent study, for instance, found that nearly a quarter rate eleven or more competencies, and over three-quarters use five or fewer exercises to do so.36 Finally, a number of concerns have been raised about the lack of training provided to assessors.

As with any other product, if its development is distributed across a range of unregulated suppliers, the quality of each supplier's work will vary, and the overall quality of the product will inevitably fall. In our view, this is precisely what has happened with assessment centers. As with psychometrics, it can be difficult to know when you are getting a quality product. Yet it is not only the market that is at fault. Cost pressures in implementing assessment centers frequently mean that businesses compromise on their development and deployment. These compromises are understandable, but each one chips away at the validity of the assessment center.

The third issue is a very recent one: the use of assessment centers appears to be declining rapidly, particularly in the United States. The forces driving this seem to be the increasing availability of virtual methods such as psychometric testing and online interviews and a desire to reduce costs. The travel costs associated with bringing all the participants to one place seem to be a particular concern for many businesses, especially in a tough economic climate.

One thing to be careful of here is that some of the new online measurement processes that are emerging as alternatives are calling themselves assessment centers even though they are not (because they do not include group exercises). They are simply trying to cash in on assessment centers' good name. This is not to say that all the tools being used to replace assessment centers are poor, of course, and at the end of this chapter, a small case study describes one such effective alternative.

So assessment centers, once held up as an aspirational pinnacle, appear to be in decline. Given the falling standards associated with them, this may not be a bad thing. Yet we would be sad to see them go because, done well, they present a unique opportunity to evaluate talent in a rigorous setting.


Assessment Centers
When to use: In selection and development processes. Tend to be used at more junior levels as an assessment process and at more senior levels as a developmental process.
Potential benefits: Rigorous process that provides multiple perspectives on individuals and an opportunity to see them interacting with colleagues. Can be accurate and reliable when done properly.
Caveats and concerns: To deliver the validities that they are capable of, they need to be done properly, without too many compromises.

Situational Judgment Tests

Although they have been around since the 1920s, situational judgment tests (SJTs) feel relatively new. They were little used for many years, until computerized testing arrived in the 1990s and opened up new possibilities for them. Sometimes called low-fidelity simulations, they entail presenting people with realistic work scenarios and then asking questions about them. The scenarios can be presented as written descriptions, videos, or animations, and answers are selected from a multiple-choice list. Some situational judgment tests can adapt to the test taker too, in that responses to one scenario determine which situation is presented next. Many firms like this approach since it means that test takers are presented with the consequences of their choices.

Situational judgment tests have traditionally been built around critical incidents and ask a range of questions about how to respond to them. This type of SJT is still often used for assessing job knowledge, like safety procedures or customer service behaviors. More recently, though, we have also seen the rise of SJTs that try to measure things such as problem solving or aspects of personality.

There has been a fair amount of debate about which type of SJT is superior and how best to develop these tests. For example, two main types of questions can be asked. Some tests ask people about behavioral tendencies—how they tend to respond to situations. Others, known as knowledge-based tests, ask people to select the correct or best response to a situation.37 The jury is still out on which is more effective. Behavioral tendency SJTs seem to offer greater incremental validity over intelligence tests and appear to measure what a candidate can do when performing at her or his best. Knowledge-based SJTs appear less easy to fake and cheat on and seem to measure typical performance levels.38

Situational judgment tests are increasingly popular for two main reasons. First, they look valid: they appear fair and relevant and tend to evoke few complaints. Second, they are fairly valid in practice. A review of ninety-five studies found validities for SJTs of around 0.34—about the same as a good personality test.39 However, the review also found a lot of variability between tests, and in our experience this is certainly the case. Good tests can be really good, but poor tests can be really poor. The key to quality is simple: how well constructed the scenarios and multiple-choice responses are. A poor test will present scenarios that are not that relevant to what is being measured and offer multiple-choice answers that are obvious. A good SJT poses complex, relevant situations and a choice of responses—a number of which could be correct.

Situational judgment tests have also been shown to provide some incremental validity over both intelligence and personality tests. The extra validity is not large, since SJTs tend to measure aspects of both intelligence and personality, but it is there. So if you already use a combination of intelligence and personality tests, SJTs will not typically provide anything extra. However, if you currently use only an intelligence test, adding an SJT will give you roughly the same extra validity as adding a personality test (around 0.07).40
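The logic of incremental validity can be made concrete with a small calculation: compare the variance in performance explained by an intelligence score alone with the variance explained once an SJT score is added alongside it. The sketch below does this on synthetic data; the correlations are invented for illustration and are not taken from the studies cited.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data: the SJT overlaps with intelligence but carries a little
# unique signal about performance. All relationships here are invented.
intelligence = rng.normal(size=n)
sjt = 0.6 * intelligence + 0.8 * rng.normal(size=n)
performance = 0.5 * intelligence + 0.15 * sjt + rng.normal(size=n)


def r_squared(predictors, outcome):
    """Proportion of variance in the outcome explained by the predictors."""
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coef
    return 1 - residuals.var() / outcome.var()


r2_iq = r_squared([intelligence], performance)
r2_both = r_squared([intelligence, sjt], performance)

print(f"intelligence alone:        {r2_iq:.1%} of variance explained")
print(f"intelligence plus the SJT: {r2_both:.1%} of variance explained")
print(f"incremental validity:      {r2_both - r2_iq:.1%} extra variance explained")
```

The extra slice of explained variance is modest, which is exactly the pattern described above: an SJT adds something over an intelligence test, but not a great deal once the overlap between the two is accounted for.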

In light of this, you might well wonder why SJTs are not more popular. In our experience, the answer is simple: time and money. Developing a good SJT requires both, since developing the scenarios and questions requires the input of subject matter experts. There are also issues with shelf-life, as some tests may remain useful only for a few years. And for global companies, the need to localize the tests for different geographies can add considerable cost. Given all this, for many companies that already use an intelligence test, adding a personality tool seems quicker, easier, and cheaper.

The use of situational judgment tests has thus largely remained confined to two instances. The first is large-scale recruitment drives, for which the numbers of candidates can justify the costs. We have recently seen some interesting examples of this in retail and customer-service businesses using video-based SJTs. The second is where the testing of specific skills is deemed important. In training needs analysis, for example, a company may have to identify whether employees require training in safety-related issues.

So situational judgment tests are a mixed bag. Sometimes they are not worth the effort. But if you have the right need and the commitment to develop them well, they can be hard to beat.


Situational Judgment Tests
When to use: Training needs analyses or large-scale recruitment drives, preferably confined to particular geographies.
Potential benefits: Easy to run, since the automated process can be completed virtually. Viewed as fair and can offer good levels of validity.
Caveats and concerns: Need to be well designed to deliver the described benefits and may need more maintenance and ongoing development than some other methods.

Individual Psychological Assessment

The dark horse of talent measurement is individual psychological assessment. For starters, it is rarely called that, since every vendor calls its proprietary version of it something different. It is also the least researched method of all, and any independent studies that do exist are generally not that positive. In fact, if you ask most researchers, they would probably dismiss individual psychological assessment as being of dubious validity and value. Yet it has been used in organizations for over thirty years and is a significant part of the measurement market. And despite the lack of positive research, it is also one of the methods that businesses most often choose to use with their senior executives.

Individual psychological assessment can vary in what specifically it entails, but it is a bit like an individual assessment center and is usually used to assess competencies. It typically involves an interview and some psychometric tests, and some vendors add a simulation exercise or presentation. It always results in a report written by the assessor about the individual assessed. This tends to contain a description of personal characteristics—strengths, weaknesses, and ratings of abilities with regard to certain competencies.

The popularity of individual psychological assessment is probably explained by two things. First, it can offer a good experience for the participant. It is very personal, can leave the individual feeling heard and understood, and can have a developmental feel to it. Second, it can offer a good experience for the business. Unlike psychometrics reports, the results are interpreted, and the business often has a chance to speak with the assessor and ask questions about the findings. This personal service does not come cheap. Yet given the cost of poor hiring decisions, many businesses see individual psychological assessment as a good investment and effective risk management, especially for senior leaders.

So what does the research say? One review of validity studies found that the average validity of individual psychological assessments in predicting performance was around 0.26 (just below personality psychometrics). However, validities were higher when individual psychological assessments were used for assessing managers (0.47) than for both technical experts (0.24) and all other groups (0.16).41

Unfortunately, there is not much independent research on the matter. In fact, a major review identified only twenty studies, of which eighteen were conducted during the 1950s and 1960s.42 The reason for this is that the number of people going through individual psychological assessment in each business tends to be fairly small, so obtaining enough results to enable a good analysis of validity is tough.43 Complicating matters is the fact that each vendor tends to have its own approach to doing individual psychological assessment. And, of course, for a vendor even to conduct a study, it needs to access the performance scores of the people assessed, which can be difficult.

Yet what concerns researchers most about individual psychological assessment is not so much whether it can be used to predict performance, but its reliability in doing so. For example, one classic study asked three assessors to each independently assess the same three candidates. It found a high level of disagreement over both specific attributes and overall job suitability. And when fifty other assessors were then asked to review the assessment results, only one-third of them agreed with the original psychologists' judgments.44

Given that the outputs of individual psychological assessment rely solely on the assessor's expertise, it is not really a surprise that great variation is found. After all, professional assessors are open to the same rating biases and judgment errors as everyone else. So the conclusion seems to be that with a good assessor and a good process, individual psychological assessment can be very effective in accurately describing people and predicting performance. But as methods go, it is unreliable. Assessors vary considerably in their ability, and working out which are the good ones is not easy.

Despite this, our experience is that once organizations start using individual assessments, they keep doing so. They like them and believe that they are adding value. They can be used to provide clear and objective comparisons of individuals. They can be used to identify potential risks in hiring or promoting someone. And they can provide a guide for managing and developing individuals. Moreover, of all the measurement methods, individual psychological assessment is the one that probably has the best potential for helping evaluate individuals' level of fit with a business, team, or manager. Of course, whether it currently and typically achieves all these things is another matter altogether.

Indeed, in our view, to minimize the downsides and deliver the potential benefits, individual psychological assessment probably needs more careful management by businesses than any other method. Yet what we tend to find is altogether different. All too often, businesses find a vendor they trust and then hand the process over to it. In chapter 8, we return to this issue when we look at how to manage vendors, and in the appendix we provide a brief guide to choosing a vendor for individual psychological assessments.


Individual Psychological Assessment
When to use: Mid- to senior-level selection or development processes.
Potential benefits: Personal service for participants, excellent opportunity to assess aspects of fit, and results that are tailored to and interpreted for businesses.
Caveats and concerns: Validity of the process is dependent on the quality of the assessor. It thus needs closer ongoing management by the business to ensure effectiveness than any other measurement method.

360-Degree Feedback

As its name suggests, 360-degree feedback involves asking a range of people who interact with an individual to rate or answer questions about him or her. These questions are usually designed to measure either performance levels or certain competencies. The method has its roots in military selection practices from World War II, when selectors found that the inclusion of peer evaluations improved predictions of future performance. The practice moved to the US corporate sector in the 1950s, grew in the 1960s and 1970s, and became popular internationally in the 1980s. It is now an established practice within many, if not most, medium to large organizations. It is mostly used with management populations and for either personal development or as part of performance management or talent identification processes.

This feedback is popular because it is seen as fair and valid and because both businesses and participants tend to like it. It is often viewed as a way for managers and peers to give direct feedback in a safe, nonconfrontational way. Moreover, after a study found that subordinates' appraisals of managers could be as predictive of performance as assessment center ratings, 360s have been touted as a cheap and effective alternative.45 How good are they really, though? After all, 360s are open to the whole gamut of rating biases, and it is not obvious what they do to mitigate the impact of them.

In terms of personal development, the picture does not initially look good. For example, a review of other studies found that the actual impact of 360-degree feedback in improving performance was limited.46 In our experience, though, this is not because 360s cannot aid performance improvement. Instead, it is usually the result of one of three things: a poor tool, poor motivation on the part of the participant, or poor implementation by the business. As with psychometrics, some tools are better than others, and it can be difficult to work out which are the good ones. In addition, 360s depend on participants to make something of them and develop themselves, and firms often do not do enough to support this. Yet 360-degree feedback can succeed as a tool for development only when all three of these issues are addressed.

As for using 360s to assist in performance management and talent spotting, the advice typically given is to avoid doing so. What tends to happen is that either 360-degree feedback reports are given to managers to help inform appraisal ratings, or a mathematical formula is used to help generate performance ratings from the 360s. However, 360s have to overcome all the normal rating biases, which can undermine their effectiveness in two ways.

First, the ratings can fail to distinguish high performers from poor performers, since almost everyone scores high. When we look at how people rate, we find that they generally overrate others' effectiveness and do not use the whole of the rating scale.47 And when feedback givers think their ratings will be shared with a manager or used for performance management, their ratings go up even further and become less predictive of performance.48 For example, when given a scale of 1 to 5, they may primarily use ratings of 3, 4, and 5. In one business we worked with, the result of this was that the average rating was 4.74 out of 5.

Second, 360-degree ratings can be unreliable at distinguishing what each individual's strengths and weaknesses are. We know this because when we analyze 360-degree data, we find that it is extremely susceptible to the halo effect, whereby people are rated high or low on all competencies. In other words, 360s often measure just one thing: “How much I like you” or “How much I rate you.”

In our experience, the vast majority of organizations do not have a culture of giving open and direct feedback. As a result, the two factors we just described tend to significantly distort 360-degree ratings. Many vendors have focused on developing techniques to try to reduce the impact of these biases, for example, by introducing special ways of asking questions and different types of rating scales. Some of these can help, but only to a limited extent. We do not mean to say that 360s cannot be useful in performance appraisals; we are just saying that the conditions need to be right, and often they are not.

To try to circumvent these issues, a few vendors are trying to reinvent 360s and have come up with fundamentally different designs for them. One of these approaches involves rating people not on a traditional low-to-high type scale, but on a scale that runs from “underdeveloped competency” to “overused strength.”49 This is an interesting approach since it does not assume that the more you display a certain behavior, the better it is. A similar approach entails categorizing competencies, putting them in boxes such as “underused behavior” and “overused behavior.” Finally, an approach that we have personally developed involves ranking behaviors or competencies rather than rating them. This forces people to distinguish between strengths and weaknesses and prevents them from overrating.

Moreover, one interesting stream of research shows that the biggest predictor of performance is not how highly people are rated, but the degree of alignment between self-ratings and others' ratings. People who overestimate their abilities tend to have the highest development needs and the lowest performance levels, whereas people who are rated highly by both themselves and others tend to be better performers.50 Thus 360-degree systems that focus on this alignment, rather than only on how highly people are rated, are likely to be far more effective in both predicting success and helping people identify potential performance derailers. The one caveat is that most of the research on this to date has come out of the United States and Europe, and further research needs to be done on how cultural factors may affect the likelihood of ratings by self and others being aligned.
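As a rough sketch of what an alignment-focused 360 might look at, assume a simple data layout in which each person has a self-rating and the average of others' ratings: people are then categorized by the gap between the two rather than by the raw scores alone. The names, scores, and threshold below are invented for illustration.

```python
# Hypothetical 360 data: one self-rating and the average of others' ratings
# per person, both on a 1-to-5 scale. Names, scores, and the threshold are illustrative.
people = [
    {"name": "A", "self": 4.8, "others": 3.1},
    {"name": "B", "self": 4.3, "others": 4.4},
    {"name": "C", "self": 2.9, "others": 4.2},
]

GAP = 0.5  # how far apart self and others must be to count as misaligned


def alignment_category(person):
    gap = person["self"] - person["others"]
    if gap > GAP:
        return "overrater: likely development need"
    if gap < -GAP:
        return "underrater"
    if person["others"] >= 4.0:
        return "aligned and rated highly: likely strong performer"
    return "aligned"


for person in people:
    print(person["name"], "->", alignment_category(person))
```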


360-Degree Feedback
When to use: In personal development processes and as part of performance appraisal or talent management.
Potential benefits: Enables constructive feedback directly focused on performance improvement issues. Generally liked and valued by participants.
Caveats and concerns: Effectiveness as a development tool depends on the quality of the tool, motivation of the user, and support from the business. Utility for performance appraisal is likely to be strongly undermined when businesses do not already have a culture of open and honest feedback.

So 360s can assist both personal development and performance management, but as with all the other methods, only when they are implemented properly and with a careful eye on the factors that can undermine their effectiveness. Indeed, 360s are particularly vulnerable to rating biases, and we are concerned that businesses often seem unaware of just how much these can distort results. New tools are being developed in an effort to sidestep some of these issues, but only time will tell if they are effective solutions.

Work Sample Tests, Simulations, and Games

Work sample tests have been around for a long time. They involve, quite simply, giving people a sample of work and then seeing how they do. They are generally seen as fair and very capable of predicting performance. They are also relatively free from rating biases since actual performance can be measured. Validities can vary a lot between tests, but figures from 0.33 to 0.54 have been reported.51 This means that they account for anywhere between 10 and 30 percent of the causes of success and can even be as predictive as intelligence tests. They are typically used for skilled and semiskilled jobs, and each test tends to be specific to certain roles and tasks.
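The step from a validity coefficient to a percentage of the causes of success is simply the square of the correlation, as the quick calculation below shows for the figures just cited.

```python
# Variance in performance explained by a predictor is the squared validity coefficient.
for validity in (0.33, 0.54):
    print(f"r = {validity:.2f}  ->  about {validity ** 2:.0%} of performance explained")

# r = 0.33  ->  about 11% of performance explained
# r = 0.54  ->  about 29% of performance explained
```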

We include four main types of measurement method in this group. First, there are work sample tests, such as key stroke tests for data entry clerks or physical fitness tests in the military. We also include the increasing use, in lower-level roles, of job probationary periods as tests of whether someone can perform well.

Second are the simulation exercises often used as part of a bigger measurement process, such as an assessment center. These typically involve role plays in which participants take part in a simulated meeting. Alternatively, in what are called in-tray or e-tray exercises, participants are asked to respond to reports, e-mails, or phone calls about a particular project or series of issues. Role plays done well can involve professional actors, and some of the e-tray products available are very elaborate and include video and audio.

Simulations can be effective, but they need to be well designed and to mimic the work environment accurately. It has been suggested that role plays favor extroverts, and many people report a strong dislike of role plays, even when they may be for personal development purposes. And e-tray exercises may be more of an alternative measure of intelligence than anything else (and so offer little incremental validity over intelligence tests).

Third, there are internships, which are increasingly being used as a way of testing graduate and entry-level new hires. Companies are thus intentionally hiring more people than they ultimately need, with an eye to selecting the best based on actual job performance. This may seem harsh from a young job hunter's perspective, but it involves real work, and it is hard to argue that it is not fair.

Finally, there are the games that are garnering all the headlines these days. Google, Facebook, L'Oréal, Microsoft, hotel and resort chain Harrah's, and the US Air Force have all run online contests to identify potential new employees. One company we recently worked with ran an online contest to identify potential graduate traders, which immersed the would-be traders in an intensive, simulated trading environment. These types of methods are popular because they can provide publicity, boost a brand, and deliver an accurate measure of talent all at the same time. It is not hard to see their attractiveness. However, they may require regular updating, can often be applied only to specific roles, and lack solid evidence on how effective they really are.

In general, then, work sample tests are to be applauded. However, they do need to be well designed and can be quite costly given their relatively limited applicability.


Work Sample Tests
When to use: In recruitment processes, often for skilled or semiskilled roles.
Potential benefits: When well designed, cannot be beaten for perceived fairness and legal defensibility and may also be good predictors of performance.
Caveats and concerns: Effectiveness depends on design and the relevance of the test to the skills required for an individual to succeed in the role. Simply adding a generic role-play exercise is not likely to deliver much value.

Summary

Almost all of the eight methods we have looked at have to contend with the same rating biases that are present when we just ask individuals or their managers how good they are. Yet all of them are capable of helping us to deal with these biases and arriving at more accurate judgments.

We could have added a ninth category: integrated tools. These are mostly online, are called different things by different vendors, and combine various combinations of the eight core methods to produce bigger tools. They may, for example, contain a combination of psychometrics and an online simulation exercise. These are not bespoke measurement processes specifically designed for individual businesses, but ready-made, off-the-shelf, Web-based solutions. They have the advantage of being relatively cost-efficient and tend to have good validities because they combine measures. However, many of them have only a limited ability to be tailored to the needs of individual businesses and offer little, if any, assessment of the different types of fit. So although they are undoubtedly an interesting development, we are concerned by their focus on benchmarking and one-size-fits-all approach. Nonetheless, these integrated tools may still suit some situations, and it is worth keeping an eye on them. If they can be developed to measure all four types of fit, they could form a large part of measurement's future.

In the meantime, one major question remains. How do you choose which of the eight core methods to use? It is to this issue that we turn in the next chapter.


Case Study
Replacing Assessment Centers with Virtual Measurement
There is a rapidly increasing trend to replace assessment centers with cheaper alternatives. Yet cheaper is not always better, and this is particularly concerning when it comes to measurement. This is because less expensive measurement processes invariably involve compromises that entail a reduction in validity. And if you reduce validity too much, a measurement process can easily become next to worthless.
Compromise can be done well, though. Consider the recent experience of a US-based multinational. It had historically used assessment centers to assess and develop its leaders. However, as the economic downturn began to be felt, this was no longer economically viable given the significant travel costs that could be incurred. So it looked to develop an online alternative and approached the vendor CEB-Valtera for a solution.
The company had a fairly large competency framework, so the first thing that CEB-Valtera did was to work with the firm to narrow the list of competencies to a more manageable number: only the most relevant were chosen. This made sure that the accuracy of the results would not be diluted by trying to measure too many things, while also keeping the time required to complete the process to a minimum.
CEB-Valtera then identified four measures that would allow it to assess each of the six chosen competencies at multiple points: an intelligence test, a personality psychometric, an online structured interview with an assessor, and a new measure that it developed especially for the process—a multirater situational judgment test. It was like a 360-degree feedback tool in that it asked people who worked with the individuals being assessed to provide information about how they typically acted. But rather than asking feedback givers to provide ratings, it asked them to answer multiple-choice questions about how individuals tended to respond in certain situations. As an output, it provided scores against each of the competencies being measured.
It is too early to tell how predictive this process will be or whether it will match the validity that an assessment center might have provided. Yet there is every reason to hope it will be effective. For starters, it combines measurement methods, each carefully chosen in part for the incremental validity it could offer over the others. In addition, although it contains generic intelligence and personality measures, it also contains measures designed to evaluate the level of fit between individuals and the needs of the company. It might not have quite the validity of a great assessment center, but it is a rigorous process with the potential to offer good validities, and it comes with substantial cost savings.
One final thing. Ask almost any assessor, and she will tell you that she prefers interviewing people face-to-face and that “something is lost” when interviewing online. For that reason, many prefer to avoid it. Yet as Nancy Tippins, senior vice president at CEB-Valtera, notes, people in multinational businesses often have to work virtually. So in this respect, an online interview can provide a realistic simulation of the working environment.

Notes

1. Dunning, D., Heath, C., & Suls, J. M. (2003). Flawed self-assessment: Implications for health, education, and the workplace. Psychological Science in the Public Interest, 5, 69–106.

2. John, O. P., & Robins, R. W. (1994). Accuracy and bias in self-perception: Individual differences in self-enhancement and the role of narcissism. Journal of Personality and Social Psychology, 66(1), 206–219.

3. Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121–1134.

4. Paulhus, D. L., & Reid, D. B. (1991). Enhancement and denial in socially desirable responding. Journal of Personality and Social Psychology, 60, 307–317.

5. Crandall, J. E. (1973). Sex differences in extreme response style: Differences in frequency of extreme positive and negative ratings. Journal of Social Psychology, 89, 281–293.

6. Gentry, W. A., Yip, J., & Hannum, K. M. (2010). Self-observer rating discrepancies of managers in Asia: A study of derailment characteristics and behaviors in southern and Confucian Asia. International Journal of Selection and Assessment, 18(3), 237–250; Eckert, R., Ekelund, B. Z., Gentry, W. A., & Dawson, J. F. (2010). “I don't see me like you see me, but is that a problem?” Cultural influences on rating discrepancy in 360-degree feedback instruments. European Journal of Work and Organizational Psychology, 19(3), 259–278.

7. Rosenzweig, P. (2007). The halo effect: … and the eight other business delusions that deceive managers. New York, NY: Simon & Schuster.

8. Bernthal, P. R., & Wellins, R. S. (2005). Leadership forecast 2005/2006: Best practices for tomorrow's global leaders. Pittsburgh, PA: Development Dimensions International.

9. Eichinger, R. W., & Lombardo, M. M. (2004). Patterns of rater accuracy in 360-degree feedback. Perspectives, 27, 23–25.

10. Antonioni, D., & Park, J. (2001). The effects of personality similarity on peer ratings of contextual work behaviors. Personnel Psychology, 54, 331–360.

11. Kahneman, D. (2011). Thinking, fast and slow. London: Penguin Books.

12. MacKinnon, R. A. (2010). Assessment and Talent Management Survey 2010. Thame, Oxon: TalentQ.

13. Fallaw, S. S., Kantrowitz, T. M., & Dawson, C. R. (2012). Global assessment trends report. Thames Ditton, Surrey: SHL Group.

14. Chartered Institute of Personnel and Development. (2011). Resourcing and talent planning. Annual survey report. London: Author.

15. Cortina, J. M., Goldstein, N. B., Payne, S. C., Kristl-Davison, H., & Gilliland, S. W. (2000). The incremental validity of interview scores over and above cognitive ability and conscientiousness. Personnel Psychology, 53(2), 325–351.

16. Ryan, A. M., & Sackett, P. R. (1989). Exploratory study of individual assessment practices: Interrater reliability and judgments of assessor effectiveness. Journal of Applied Psychology, 74(4), 568–579.

17. Salgado, J. F., & Moscoso, S. (2002). Comprehensive meta-analysis of the construct validity of the employment interview. European Journal of Work and Organizational Psychology, 11(3), 299–324.

18. Campion, M. A., Palmer, D. K., & Campion, J. E. (1997). A review of structure in the selection interview. Personnel Psychology, 50, 655–702.

19. Huffcutt, A. I., & Arthur, W. Jr. (1994). Hunter & Hunter (1984) revisited: Interview validity for entry-level jobs. Journal of Applied Psychology, 79, 184–190; Wiesner, W. H., & Cronshaw, S. F. (1988). The moderating impact of interview format and degree of structure on interview validity. Journal of Occupational Psychology, 61, 275–290.

20. Cook, M. (2009). Personnel selection: Adding value through people. Chichester, West Sussex: Wiley; Terpstra, D. E., Mohamed, A. A., & Kethley, R. B. (1999). An analysis of federal court cases involving nine selection devices. International Journal of Selection and Assessment, 7, 26–34.

21. Campion, M. A., & Campion, J. E. (1994). Structured interviewing: A note on incremental validity and alternative question types. Journal of Applied Psychology, 79(6), 998–1002.

22. Salgado, J. F., & Moscoso, S. (2002). Comprehensive meta-analysis of the construct validity of the employment interview. European Journal of Work and Organizational Psychology, 11(3), 299–324.

23. Rynes, S., & Gerhart, B. (1990). Interview assessments of applicant “fit”: An exploratory investigation. Personnel Psychology, 43, 13–35.

24. Oh, I. S., Postlethwaite, B. E., Schmidt, F. L., McDaniel, M. A., & Whetzel, D. L. (2007). Do structured and unstructured interviews have near equal validity? Implications of recent developments in meta-analysis. Paper presented at the 22nd Annual Conference of the Society for Industrial and Organizational Psychology, New York, NY.

25. Silzer, R., & Jeanneret, R. (2011). Individual psychological assessment: A practice and science in search of a common ground. Industrial and Organizational Psychology, 4, 270–296.

26. Ben-Hur, S., & Kinley, N. (2012). Coaching executive teams to better decisions. Journal of Management Development, 31(7), 711–723.

27. Lievens, F., Highhouse, S., & De Corte, W. (2005). The importance of traits and abilities in supervisors' hirability decisions as a function of method of assessment. Journal of Occupational and Organizational Psychology, 78, 453–470.

28. Hogan, R. (2005). In defense of personality measurement: New wine for old whiners. Human Performance, 18(4), 331–341.

29. Mead, A. D., & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114, 449–458.

30. Hughes, D., & Tate, L. (2007). To cheat or not to cheat: Candidates' perceptions and experiences of unsupervised computer-based testing. Selection and Development Review, 23, 13–18.

31. Burke, E. (2006). Better practice for unsupervised online assessment. London: SHL Group.

32. Connelly, B. S., & Ones, D. S. (2008). Inter-rater reliability in assessment centre ratings: A meta-analysis. Paper presented at 23rd Annual Conference of the Society for Industrial and Organizational Psychology, San Francisco.

33. Gaugler, B. B., Rosenthal, D. B., Thornton, G. C., & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72(3), 493–511.

34. Lance, C. (2008). Why assessment centers do not work the way they are supposed to. Industrial and Organizational Psychology, 1(1), 84–97.

35. Thornton, G. C., & Gibbons, A. M. (2009). Validity of assessment centers for personnel selection. Human Resource Management Review, 19, 169–187.

36. Eurich, T. L., Krause, D. E., Cigularov, K., & Thornton III, G. C. (2009). Assessment centers: Current practices in the United States. Journal of Business Psychology, 24, 387–407.

37. McDaniel, M. A., & Nguyen, N. T. (2001). Situational judgment tests: A review of practice and constructs assessed. International Journal of Selection and Assessment, 9, 103–113.

38. McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grubb, W. L. (2007). Situational judgment tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60, 63–91.

39. McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86, 730–740.

40. McDaniel et al. (2007).

41. Roller, R. L., & Morris, S. B. (2008). Individual assessment: Meta-analysis. In I. L. Kwaske (Chair), Individual assessment: Does the research support the practice? Symposium presented at the 23rd Annual Conference of the Society for Industrial and Organizational Psychology, San Francisco, CA.

42. Prien, E. P., Schippmann, J. S., & Prien, K. O. (2003). Individual assessment: As practiced in industry and consulting. Mahwah, NJ: Erlbaum.

43. Fletcher, C. (2011). Individual psychological assessments in organisations: Big in practice, short on evidence? Assessment and Development Matters, 3(2), 23–26.

44. Ryan, A. M., & Sackett, P. R. (1998). Individual assessment: The research base. In R. Jeanneret & R. Silzer (Eds.), Individual psychological assessment: Predicting behavior in organizational settings. San Francisco, CA: Jossey-Bass.

45. McEvoy, G. M., & Beatty, R. W. (1989). Assessment centers and subordinate appraisals of managers: A seven year examination of predictive validity. Personnel Psychology, 42(1), 37–52.

46. Rehbine, N. (2007). The impact of 360-degree feedback on leadership development. Unpublished doctoral dissertation, Capella University, Minneapolis.

47. Jawahar, I. M., & Williams, C. R. (1997). Where all the children are above average: The performance appraisal purpose effect. Personnel Psychology, 50(4), 905–925.

48. Eichinger, R. W., & Lombardo, M. M. (2004). Patterns of rater accuracy in 360-degree feedback. Perspectives, 27, 23–25.

49. Kaiser, R. B., & Kaplan, R. E. (2005). Overlooking overkill? Beyond the 1-to-5 rating scale. Human Resources Planning, 28(3), 7–11.

50. Yammarino, F. J., & Atwater, L. E. (1997). Do managers see themselves as others see them? Implications of self-other rating agreement for human resources management. Organizational Dynamics, 25(4), 35–44.

51. Roth, P. L., Bobko, P., & McFarland, L. A. (2005). A meta-analysis of work sample test validity. Personnel Psychology, 58, 1009–1037.
