5

Choosing the Best Methods and Tools for Your Business

The eight methods we described in chapter 4 address the challenges of talent measurement in eight different ways. But how does a business choose among them? Chapter 4 provided insights into the various strengths and weaknesses of the methods themselves. In this chapter, we look at how your organization's specific concerns and needs should guide your choice of which method to use.

Key Issues in Choosing Tools

For researchers, the critical factor in selecting a method or tool is what works best or has the highest validity. But for businesses, there is always a trade-off between best practice and practicality, and implementation issues tend to lead their choice. In fact, there are seven key issues you should consider when choosing which method or tool to use:

  • Your location
  • What you want to measure, and why
  • Whom you want to assess
  • Legal and diversity concerns
  • How well a method is likely to work
  • How much it costs
  • How long it takes

We will look at each of these in turn and present them in a rough logical order, but we do not mean to suggest that any one of them is always more important than the others.

Your Location

The issue of location consists of five smaller, interrelated issues: local traditions, availability, regulatory constraints, portability, and language differences.

First, different countries and regions have different traditions of measurement and preferences for tools, and in a few countries, such as South Africa, there are legal restrictions on which tools you can use. So at a basic level, where you are determines which measurement methods you tend to consider.

References, for example, are one of the most used methods in the United States, yet they are far less popular in parts of Europe, including Spain, Portugal, and the Netherlands. Indeed, Europe is a patchwork of varying practices. Intelligence tests are very popular in Spain and the Netherlands, but far less so in Italy, Norway, and Turkey.1 Graphology is used in France and by over 15 percent of companies in the German-speaking part of Switzerland, but hardly used anywhere else.2 Asia is also diverse, though slightly less so than Europe. Interviews are less common in China and South Korea, while selection in Japan tends to emphasize the ability to get on well with others.3 The Middle East is a little more consistent; there, assessment centers and psychometrics are common. In some South American countries, measurement in general is relatively rare.

A related issue is the tools that are available. In South America, for instance, psychometric tests are generally viewed with caution, largely due to the lack of appropriate, locally developed tests. Similarly, in Brazil, the tests available are mostly international ones, written in international Portuguese rather than Brazilian Portuguese. But availability problems are not only about language. With personality tests, for example, although the questions may be translated, they may not make sense in the local culture. For this reason, there has been a move to develop personality tests that are designed by locals for locals. As an idea, it sounds good, but these local tools often lack the rigor and validity levels of the big internationally developed tests.4 A global balancing act is therefore required to accommodate both efficacy and local needs.

The third issue is regulatory differences in how tools are used. In the Netherlands, for example, assessment reports about a job applicant need to be shown to the candidate before the business can use them. And in the United Kingdom and South Africa, there are regulatory or legal constraints on who can use psychometrics.

Next is the issue of the portability of particular methods and tools. In our experience, organizations often start with a locally designed measurement process and then try to apply it globally. This can work, but it has risks. A large global bank, for example, recently struggled when it tried to implement a 360-degree feedback process globally. It worked fine in most countries, but in India, employees had a culture of giving managers the feedback that they thought the managers wanted to hear. This drastically reduced the utility of the measure and prevented the firm from comparing the ratings of people in different countries.

Another example comes from a large US-based multinational that committed to using a well-known and highly regarded intelligence test as part of a restructuring process. However, when it used the test internationally, it found that the US and UK versions of the test were not equivalent. In fact, they were different tests, with different types of questions and different levels of difficulty. In the end, the multinational had to use different tools in different countries, which limited its ability to compare people across locations.

Finally, there is the perhaps more hidden issue of needing to accommodate different languages, even within a single country. A few years ago, we helped a UK-based organization choose a test of intelligence for its graduate recruitment process. In doing so, we discovered that people from nine different nationalities and cultural backgrounds were applying.

Since you need to test people in their first language for a test to be accurate, the company had to find a test that could measure intelligence in all nine languages. So just because your business is not global does not mean that you do not need to worry about global issues.

As a result of these five related issues, then, where you are and where you intend to use your measurement methods can have a big impact on which tools you choose. Indeed, for multinationals, the growing globalization of the workforce is making measurement increasingly difficult.


Location: Recommendations
1. Be aware of local measurement traditions, but try not to be constrained by them. To find out what the local traditions are, ask a local HR leader or vendor. To gain a sense of other approaches, a useful question might be, “How would you measure this in other countries?”
2. Always begin by considering the different locations where you might need to use your tools. You will almost certainly not be able to find one method that suits all locations. But you will at least be able to minimize difficulties and approach them proactively when they do occur.
3. Do not assume that any tool will be portable to other locations. Make sure to ask vendors for specific information about the portability of their tools.
4. Check what languages and nationalities you may need to accommodate. Even if you are operating in only one location, your employees may have diverse backgrounds.

What You Want to Measure, and Why

In terms of what sign of talent you want to measure, some methods are obviously better suited to certain tasks than others, but there are fewer hard-and-fast rules than you might imagine. If you want to measure prior experience, sifting tools and biodata are probably the best options, although interviews and individual psychological assessment are often used as well. If you wish to measure competencies, assessment centers, interviews, and individual psychological assessments are the usual options (though each has limits, as discussed in chapter 4). If you want to measure intelligence, psychometrics is hard to beat. And if you want to measure personality, psychometrics is once again probably the best option, although most methods can be used. Do not assume there is only one way to measure each factor. And if you are not happy with one method, look at some others.

Regarding why you want to measure—your business purpose—there are six common applications for measurement:

  • Selection—including both hiring and promotion
  • Capability review and benchmarking—assessing a group of people in a function or section of a business as part of a restructuring or transformation project
  • Due diligence—assessing executives and other staff as part of a merger or acquisition
  • Talent fishing—identifying potential future leaders
  • Development—assessing people as part of a development process in order to identify their learning needs
  • Performance appraisal—evaluating people as part of a formal or informal appraisal process

Again, there are few hard-and-fast rules here about which methods best suit each of the six. However, sifting tools are really required only for hiring processes, and only work sample tests, interviews, and 360-degree feedback are generally appropriate for performance appraisals. Beyond this, there is a lot of flexibility in what you can do.

In our experience, there are at least two reasons to think ahead about your purpose for using measurement. First is the type of outputs you require and what you need to do with this information. At a basic level, do you need a detailed report, or are some scores sufficient? You may, for instance, require a method that can provide detailed developmental suggestions. In this case, individual psychological assessment, 360-degree feedback, or assessment centers might be appropriate. Alternatively, you may need a method that will provide simple scores, such as when testing knowledge or skill. In these situations, situational judgment tests (SJTs) or work sample tests can help.

Second, and critically, the purpose of measurement methods is usually not just to measure things. Measurements tend to fulfill broader functions too. Hiring people is sometimes just about employing the best candidate, but it can also be about increasing the diversity of a firm's talent pool and enhancing the firm's reputation. And to make sure that you obtain a tool that can fulfill these broader needs, you need to be clear and explicit about what they are.

Companies tend to approach vendors and say something like, “We need a selection process,” or, “We want to measure intelligence.” They then ask for evidence of validity. Ideally, though, they should go to the market with a broader and more specific list of requirements. To become discriminating about which method to use, businesses first need to be clear about what they need the tools to do.


What You Want to Measure, and Why: Recommendations
1. Be clear on the types of output you need, what you need to do with the information, and what the measurement process needs to achieve for the business. If you are unsure what you need a measure to do, ask your stakeholders what they want from it.
2. Check whether a tool can do what you need it to do. The easiest way to do this is to ask vendors for evidence that it can.

Whom You Want to Assess

Just as you need to ensure that your chosen tools can do what you require of them, you also need to ensure that they are appropriate for the audience. Broadly speaking, at junior levels, the challenge for measurement is usually to sift candidates and spot the most talented in a group. With more senior-level and technical people, however, the challenge shifts to discriminating among a small number of highly qualified individuals.

Psychometric tests of intelligence and work sample tests can be excellent as sifting tools. Yet they tend not to add so much value when used with executives.5 A related issue here is the tenure of individuals. As we noted in chapter 2, intelligence tests can be very predictive of performance in new employees, yet assessments of personality or character are often better predictors of performance in longer-serving employees.6

The question of "whom" also relates to an increasing trend for companies to view the people they are measuring as "customers" and therefore to want to use methods that are appealing to them (or at least inoffensive). This appears to be a bigger concern in the United States than elsewhere, but it is a growing issue everywhere.7 It is important because people's perceptions of measurement methods can have a significant impact. In recruitment, for instance, they can influence how inclined people are to accept an offer from a company and how likely they are to recommend the business to others.8 People's perceptions of the methods you use are thus important for your brand. They are also particularly pertinent for companies recruiting to fill roles for which it is hard to attract candidates.

Perhaps surprisingly, there appear to be few cultural differences in people's typical views of different methods. In fact, age, gender, and ethnic background seem to make no difference at all.9 The methods most liked tend to be either traditional ones, such as interviews, or ones that are clearly job relevant, such as work sample tests.10 Least favored of all are psychometric measures of integrity and tests based on biodata.

There are a couple of caveats here. First, not all interviews are equally liked. People appear to dislike overly structured ones or being interviewed multiple times and asked the same questions by different interviewers.11 So the quality of the measure matters. Second, how individuals perceive both their own status and the status of the organization seems important. For instance, people who view themselves as having high status tend to dislike tests of their ability or intelligence more than people who do not see themselves this way. People applying for roles at what they perceive to be a high-status organization, however, appear to be less critical of its selection procedures.12

Worth remembering, too, is that how methods are explained, combined, and administered can be just as important as the nature of the test. For example, job applicants asked to take psychometric tests tend to like them more when the reasons for using the tests are clearly explained, there is a balance between the tests and other measurement methods, and the tests do not take more than an hour to complete.13

One final thing: people who "fail," do not do well on, or are anxious about a particular type of measure tend not to like it.14 Some people hate interviews, and most people seem to dislike intelligence tests. Yet this does not mean that you should avoid using them if everything else suggests that they are the most appropriate for your needs. When it comes to talent measurement, the customer should definitely not always be king.


Whom You Want to Assess: Recommendations
1. Make sure the methods you use can discriminate among different levels of the type of talent you wish to measure. If you are unsure, ask a vendor about this and what your options are.
2. Think about the participants' experience of the assessment and what methods can do to make it better. Do not assume it will be good. When seeking a tool from vendors, explicitly ask about this aspect. A lot has to do with communication with participants—letting them know why they are taking the test and what it adds to the process—but some methods are just generally more liked.

Legal and Diversity Concerns

Depending on where you are in the world, legal compliance can be a critical issue in deciding which tools to use. For the most part, it involves making sure that you do not unfairly discriminate against any subgroup in the population—for example, particular ethnic groups, genders, or disabled people. You will often hear this issue referred to as adverse impact.

Avoiding adverse impact is an admirable goal for any measurement method. Yet what makes it so important in some countries is the specificity of the laws regulating it and the type of litigation environment. In some territories, the laws are quite vague about what counts as discrimination, and there is not a strong culture of legally challenging decisions. In countries such as the United States and South Africa, however, the situation is very different. The laws that prohibit discrimination are specific about what counts as unfair and place the burden of being able to prove fairness firmly on businesses. Moreover, when a company gets it wrong, the consequences can prove expensive and damaging, especially in environments as litigious as the United States. As a result, US firms introducing new measurement processes typically conduct a study beforehand to prove that measures are fair and predictive of performance.

At a basic level, then, legal compliance is about knowing what you can and cannot do. This may sound simple, but for global businesses, it can be complex because countries differ as to which subgroups are protected and what is illegal. In most territories, treating people differently or selecting a disproportionate number of people from certain groups is enough to count as evidence of discrimination. In a few, however, such as Taiwan and Turkey, evidence of the intent to discriminate is required. In addition, the preferential treatment of subgroups is prohibited by some countries, such as Turkey and the United Kingdom, as well as in some US states. Yet it is permitted in others, such as Belgium and Chile.

Moreover, even if you are well informed about legal requirements, avoiding adverse impact entirely is often not possible. The issue that most typifies this concerns intelligence tests and ethnicity.

Culture, Diversity, and Intelligence Psychometrics.

As we showed in chapter 2, intelligence measures are far and away the most predictive of success. Yet no other factor or test has proved quite as big a cause of adverse impact.

The issue is often presented as simply a matter of “whites do better than blacks.” Since the differences between these two groups are widely viewed as the result of social disadvantage, they are generally seen as unfair. Yet it is not this simple.

First, it is not just a matter of “whites versus blacks.” Research shows a similar pattern with many disadvantaged ethnic groups. Aboriginal Australians tend to score lower than all other Australians.15 Turkish and Surinamese immigrants in the Netherlands tend to score lower than Dutch test takers.16 And Moroccan immigrants in Belgium tend to score lower than other Belgians.17 Moreover, whites are not the racial group that scores highest of all. Rather, it is East Asians.

What complicates the situation and makes it contentious is not that intelligence tests do not work for some ethnic subgroups. They work fine: in fact, the ability of these tests to predict performance is roughly the same for all ethnic groups.18 So even among ethnic groups that tend on average to do poorly on intelligence tests, individuals' test scores are still good predictors of whether they will succeed. And in this respect, for businesses that just want the people most likely to perform well—regardless of race—they appear fair tests of talent.

Looming over this issue are some huge ethical and societal concerns. Discrimination, social equality, and meritocracy are all mentioned in the debate over intelligence tests. Some argue that intelligence tests perpetuate societal inequalities. Others think that not using them or engaging in positive discrimination is not meritocratic. Whole books have been written on the subject. Yet for many businesses, the debate is not so much about ethics as about the practical issue of diversity: the desire to have a diverse talent pool and recruit from all population groups.

There is no easy solution. You may see some vendors suggesting that they have “culture-free” intelligence tests that do not result in any adverse impact. Do not believe them, because there is no such thing. We often, for example, hear an otherwise excellent intelligence test called Raven's Matrices referred to as “culture free.” But the research is clear: like all other intelligence tests, it can result in adverse impact.19 Indeed, we know of no measurement method that has been independently proven to be as powerful and yet free from adverse impact. In fact, we know of no other method at all that is free from it. Intelligence tests have made all the headlines because of the amount of adverse impact involved and the sheer volume of research into the matter. But interviews, situational judgment tests, assessment centers, résumé reviewing, and work sample tests have all been shown to be vulnerable to creating adverse impact.20

Businesses thus seem to be faced with two stark choices.21 They can either sacrifice validity by using measures that are less valid than intelligence tests but do not result in as much adverse impact. Or they can sacrifice diversity by ignoring the potential adverse impact of intelligence tests. So what should you do?


Legality and Diversity: Recommendations
1. Be clear on your legal obligations. In foreign locations, this is best achieved through local HR leaders or local vendors.
2. Check the potential adverse impact of tools. The easiest way to do this is to ask vendors for evidence of the adverse impact associated with their tools. And be specific: ask about national differences and the average scores for each ethnic group. Do not merely be satisfied with average scores for whites and nonwhites. These broad comparisons can hide differences that exist for more specific groups. Finally, beware of statements such as, “The test does not show any ethnic differences beyond those already reported in research.” This just means that the test has all the adverse impact of any other test.22
3. To reduce adverse impact, use a range of methods. For example, combining an SJT or personality test with an intelligence measure can reduce adverse impact while adding incremental validity.23
4. If diversity is important for your business, conduct regular adverse impact studies. These are quick and easy and help you to ensure that the measurement methods you are using are enabling you to identify a diverse group of talent. Once a year, compare the demographics of everyone you assess with those of the people you eventually select or identify as desired talent (see the sketch after these recommendations). If the two are broadly the same, then you do not have significant amounts of adverse impact.
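To make recommendation 4 concrete, here is a minimal sketch, in Python, of what such an annual check might look like. The group labels and counts are invented, and the 0.8 threshold is the familiar US "four-fifths" benchmark, offered as an illustration rather than a legal standard for every jurisdiction.

```python
# A minimal sketch of an annual adverse impact check: for each group,
# compare the proportion of those assessed who were eventually selected.
# Groups, counts, and the 0.8 threshold are illustrative assumptions.

from collections import Counter

def selection_rates(assessed, selected):
    """Both arguments are lists of group labels, one entry per person."""
    n_assessed, n_selected = Counter(assessed), Counter(selected)
    return {g: n_selected[g] / n_assessed[g] for g in n_assessed}

def flagged_groups(rates, threshold=0.8):
    """Groups selected at under `threshold` times the top group's rate."""
    best = max(rates.values())
    return [g for g, rate in rates.items() if rate < threshold * best]

assessed = ["A"] * 200 + ["B"] * 100  # everyone put through the process
selected = ["A"] * 60 + ["B"] * 18    # those eventually chosen

rates = selection_rates(assessed, selected)
print(rates)                 # {'A': 0.3, 'B': 0.18}
print(flagged_groups(rates)) # ['B'] -- 0.18 is below 0.8 * 0.30
```

If the flagged list is empty year after year, your process is probably not producing significant adverse impact; if the same group keeps appearing, it is time to look harder at the methods you are using.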

Finding a Way Forward.

Organizations in countries like the United States and South Africa have had to act: the law gives them little option. Many businesses outside these countries, though, simply ignore the issue. This certainly is an easy route, and for some businesses operating in ethnically homogeneous areas, it is fine. But for businesses operating in more racially mixed areas or across national boundaries, such a policy cannot be recommended.

Awareness of this has led some organizations to stop using intelligence tests altogether. Yet a number of studies have highlighted how dropping them can lead to a decrease in the quality of those hired. Moreover, by no longer measuring intelligence and assessing only other factors, you can inadvertently replace one form of adverse impact with another. For example, research has shown that selecting people only on the basis of personality scores can lead to adverse impact against women.24

Another commonly suggested solution is banding: grouping scores into broad bands, such as high, medium, and low. The idea is that it can reduce adverse impact by making the bands so broad that they include people from ethnic groups who generally score lower. The efficacy of banding, however, has been hotly debated. In our opinion, it can help, but you are still left with the question of how to select people from within bands, so it is not a complete solution.

An additional possible solution has come from research showing that confidence can improve test performance, whereas anxiety can reduce it. The idea is that minority groups know that they tend to do less well on these tests, which makes them anxious and so reduces how well they do. And research has indeed shown that when the evaluation element of a test is downplayed, the performance of minority groups increases and adverse impact is reduced.25 Yet the research is also clear that although doing this may reduce adverse impact a little, substantial group differences remain.26

Of course, so far all that we have really touched on is the example of intelligence and ethnicity. There are other forms of adverse impact as well. In many countries, discrimination on the basis of disability, gender, or age is illegal. Avoiding adverse impact related to disability tends to be more of an implementation issue, and we return to it in chapter 6. The impact of various measurement methods on gender is well researched. Differences between the performance of men and women on both intelligence and personality tests do exist, but they vary considerably among tests. In terms of age, there is evidence that performance on intelligence tests declines with age, in particular on tests that require many questions to be answered in a short time.27 There is also research showing that people become less extraverted and agreeable with age.28 The impact of age on talent measures, however, is an evolving field, and businesses have generally not yet shown much interest in it.

So legal issues can mean a whole heap of trouble for businesses, but with forethought and care, you can avoid them.

How Well a Method Is Likely to Work

No business wants to spend time and money on a measurement method that does not work. This is why most businesses know to ask this basic question: “How valid is this method or test?” The challenge only begins here, though, because you then need to be able to understand and evaluate the answer. To help you, try following these seven tactics.

Ask for Evidence.

We were recently looking at the validity of a popular US interviewing system that described itself as being accurate and valid. On a Web page entitled "Validity," the vendor described a wide variety of research showing that interviews can be valid predictors of success. Yet there was not a single mention of any research that the vendor had conducted into the validity of its own system. So rule number one is that you need to get specific and ask vendors for the evidence that their particular method or tool is valid. And beware of statements such as, "The test is predictive," that do not come with any specific validity figures or evidence.

Ask What Is Meant by Validity.

Validity figures are not always what they appear to be. For starters, there is no one way for vendors to measure or report validity. When you are told that a measurement method has 80 percent validity, it could mean many different things. Classically, validity refers to whether the ratings and scores that people achieve on particular measures can predict their performance in a business. And by and large, this is what you should expect to hear. Yet we have seen some vendors define validity as being whether individuals agree with the results, so when a vendor tells you that a particular measure is valid, you need to ask, “In what way?”

In response to this question, you may sometimes hear phrases such as "content validity," "criterion validity," and "construct validity." If you want to know more about what each of these means, check out the validity section in the appendix at the end of this book. For many people, though, this kind of technical jargon can be confusing and can put them off delving more deeply into the subject. But it need not do so. All you need to remember is that you are essentially trying to find out two things: "How do you know that the method or tool measures what it is supposed to?" and, "What business outcomes do results with this method predict, and to what degree?"

It is worth noting here that “performance” can mean different things. It can mean actual results (such as sales figures), managers' appraisal ratings of individuals, and even self-ratings of performance. Beyond task performance, it can mean contribution to team performance or organizational citizenship behavior. Furthermore, just because a measure can predict performance in skilled and semiskilled workers does not mean that it can also predict performance in managers. There are additional questions that you need to ask when told that a measure can predict performance: “What types of performance?” and, “In what types of people?” Moreover, with measures of potential, extra questions to ask are, “How far ahead can it predict performance?” and, “After how long?”


Beyond Validity
In this book we mainly focus on validity as a key indicator of whether a particular measure or method is effective. There is actually a lot more to determining whether measures are effective than just validity—things like alpha coefficients and scale intervals. We have deliberately chosen not to go into these because they are for the most part deeply technical issues that cannot be lightly touched on, and in our experience, most business users of measurement have neither the time nor need to commit to a full understanding of them. For most businesspeople, then, focusing on issues such as validity and reliability is enough. That does not mean that you shouldn't delve deeper, however. Should you want to, you will find in the appendix further details about validity and information on where you can learn about some of the more technical issues.

Beware of Very High Validity Figures.

When looking at the degree to which methods or tools can predict outcomes, remember that the single best predictor of performance, intelligence, can achieve maximum validities of only 0.5 to 0.6. If you hear anything more than that, start asking questions.

Check How Many People the Tool Has Been Validated With.

One essential question to ask is, “How many people?” For instance, if you are told that a measure can predict, say, absenteeism in semiskilled workers, you need to ask how many people were tested. If the answer comes back with anything fewer than one hundred, then the results may not be reliable. For psychometric tests, ideally you should be looking for two thousand or more people to have been tested.

If the Method or Tool Uses Norm Groups, Check the Quality and Relevance of Them.

Not all methods and tools use norm groups, but some rely on them. Norm groups are comparison groups, a kind of benchmark. They enable you to compare the score of a particular individual on a certain test or measurement method with the scores of other people who have also done the test. This is particularly useful with ability tests, such as measures of intelligence and physical fitness, as it can help you understand what scores mean. For example, an individual may get a score of 25 out of 30 on an intelligence test, which sounds good. But if you then find out that the average score is 27, that score of 25 does not look so good after all. We need to know how well others usually perform to understand precisely how good a score is.
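To make this concrete, here is a minimal sketch of the conversion a norm group enables. The norm mean and standard deviation are illustrative assumptions, not figures from any particular test, and the percentile conversion assumes roughly normally distributed scores.

```python
# A minimal sketch of how a norm group makes a raw score interpretable.
# Norm mean/SD are illustrative; assumes roughly normal score distribution.

from statistics import NormalDist

def percentile_vs_norm(raw_score, norm_mean, norm_sd):
    """Approximate percentile standing of a raw score in the norm group."""
    return NormalDist(mu=norm_mean, sigma=norm_sd).cdf(raw_score) * 100

# A 25 out of 30 sounds strong in isolation, but against a norm group
# averaging 27 it sits well below the middle of the pack.
print(f"{percentile_vs_norm(25, norm_mean=27, norm_sd=2):.0f}th percentile")  # 16th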

As useful as norm groups may sound, the science of developing them and where they should and should not be used are much-debated issues. What is important for our purposes here is simply that if you are going to use norm groups, then it is critical that they be good ones: if they are not, they may be misleading.

So what counts as a “good” norm group? You need to look for two qualities.29 The first is size—the number of people in the group. Simply put, the bigger, the better. With competency ratings from individual psychological assessments, the norm group may be very small—under one hundred. For psychometrics, however, it will ideally be in the thousands.

The second quality you should look for is relevance. Having a norm group of two thousand white males from Scandinavia is impressive, but if you are trying to interpret the scores of Singaporean women, it is of no use. To be effective, then, a norm group needs to be representative of the people you are assessing. This can be in terms of gender, age, ethnicity, and education level. It can also be in terms of industry, function, and type of role. The more relevant, the better. For job applicants being tested with an intelligence test, for example, the best norm group is not the scores of people already employed, but other applicants for the same type of roles.

One quick way to evaluate the quality of a norm group you are already using is to look at how many of the people you are assessing score above the average for the norm group. If the norm group is perfect, then 50 percent of your people will score above the norm average and 50 percent will score below it. If almost everyone is scoring above or below the norm average, then you know that the norm group may not be relevant enough.
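As a minimal sketch of this sanity check, assuming you have your assessees' raw scores and the vendor's published norm average:

```python
# A minimal sketch of the 50/50 norm group check described above. The
# scores and the published norm average are made-up illustrative values.

def fraction_above_norm(scores, norm_mean):
    """Fraction of your assessees scoring above the norm group's average."""
    return sum(score > norm_mean for score in scores) / len(scores)

candidate_scores = [21, 24, 26, 27, 28, 28, 29, 29, 30, 30]
share = fraction_above_norm(candidate_scores, norm_mean=23.5)

print(f"{share:.0%} score above the norm average")  # 90%
# Near 50% suggests a well-matched norm group; a figure like 90% (or 10%)
# suggests the comparison group may not be relevant enough.
```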

Moreover, for larger organizations it may be worthwhile trying to create your own norm groups specific to your business. The absolute minimum you need for competency and individual psychological assessment ratings is around 50 people. This is low, though, and you would need to be a little cautious about comparisons. For psychometrics, the minimum is around 150 people, although once again this is low. A number you could be completely confident in would be around 2,000, so our suggestions are absolute minimums. Some vendors will try to charge you for creating a specific norm group for your business. Others do not charge. Obviously, we recommend the latter.

Remember Reliability.

For relatively objective methods such as psychometric tests and SJTs, you do not need to ask about reliability. A test cannot be valid without also being reliable, so asking about validity is enough. However, for more subjective methods such as assessment centers and individual psychological assessment, it is important to ask about interrater reliability. This is the degree to which two assessors agree (or disagree) in their ratings and judgments about people. The less reliability and agreement there is between assessors, the less likely results are to be accurate.
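As a minimal illustration of what interrater reliability quantifies, here is a sketch correlating two assessors' ratings of the same candidates. The ratings are invented, and real vendor studies typically use larger samples and more formal coefficients, but the idea is the same.

```python
# A minimal sketch of one simple interrater reliability statistic: the
# correlation between two assessors' ratings of the same eight candidates.
# The ratings are invented for illustration.

from statistics import correlation  # Python 3.10+

assessor_a = [3, 4, 2, 5, 4, 3, 2, 4]
assessor_b = [3, 5, 2, 4, 4, 3, 3, 4]

r = correlation(assessor_a, assessor_b)
print(f"Interrater correlation: {r:.2f}")  # ~0.80; higher = closer agreement
```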

Look for Independent Reviews.

This final step is an important one: always look for independent evidence of whether measures work. An easy place to start here is to ask the vendor if any such research exists. You can also do a Web search for the name of the tool. Moreover, with psychometric tests, probably the best thing you can do is to check one of the independent, nonprofit bodies that publish test reviews. The national psychology associations or societies of many countries provide this kind of service. By far our favorite is provided by the University of Nebraska's Buros Institute. Its reviews can contain some deeply technical information, but they also contain some clear and no-nonsense recommendations on whether to use tests.

These, of course, are just questions about validity. However, as we argued in chapter 3, businesses need to think more broadly about the issue of whether measures work. We have discussed, for example, the need to ask about incremental validity. Yet businesses also need to think about what measures need to do over and above merely predicting performance. This could include things like helping managers engage potential new employees, identifying areas new employees may need support with, and helping plan for individuals' development. Validity, then, is not the be-all and end-all, and the most valid test is sometimes not the one that will work best for your business. Nevertheless, it is a good place to start: a test that is not valid will not be able to do much for your business.


Whether the Method Works: Recommendations
1. Think broadly about what measurement methods need to do. Think about what they need to achieve over and above measuring particular things and predicting performance.
2. Do not just accept vendors' validity statements. Question them intensely.
3. Check for independent reviews of tests. An example is the Buros Institute.
4. For more subjective methods, ask about interrater reliability. This particularly applies to assessment centers and individual psychological assessments. Ask to see the details of studies that vendors have undertaken on this.

How Much a Method Costs and How Long It Takes

These two issues are both fairly straightforward, and they prompt without doubt the two most common questions we hear managers ask when choosing measurement methods.

With cost, the challenge is to identify what the total cost will be. This is not always simple because of the need to factor in not only immediate defined costs but also potential future costs. Some of these can be obvious, such as initial setup fees, facilities required, and ongoing project management. Others can be less apparent, like the need to develop alternative versions of recruitment situational judgment tests (so that leaked information about how to “pass” them does not enable people to cheat) or the need to develop translations of psychometrics for all the nationalities you need to cover. We return to some of these less obvious issues when we look at contracting with vendors in chapter 8.

For now, two principles are important. First, businesses need to have a clear and definitive costing. And second, when it comes to measurement, lower cost rarely means better. With online psychometrics, for example, if you find a tool priced below the market average, it is likely to be below average quality too. Similarly, driving down the cost of individual psychological assessment can be self-defeating if taken too far, as vendors tend to respond by using less experienced, cheaper assessors. And as we have seen, the quality of assessors is all-important.

As for how long a method will take, it is important to know the development time: how long it will take to develop and set up the measurement method. In addition, a bigger worry for frontline managers is the fear that adding measurement to a recruitment process could cause them to lose candidates by making the process take too long. We understand this concern and know that slow processes can lead to lost candidates. In fact, a recent study found that one in three UK firms reports that the length of its recruitment process has led to the loss of potential recruits.30 Yet many methods are very time efficient. Indeed, in our experience, the addition of a measurement process is rarely the reason that companies lose candidates. Instead, where time is an issue, it is often the time that other elements of the recruitment process take.

Of course, time and money often boil down to the same thing. As we reported in chapter 4, developmental assessment centers are becoming shorter in an effort to save time-out-of-business costs. One method where the overlap of time and money can be less obvious, though, is 360-degree feedback. In our experience, firms often do not take into account the cost of people taking time to give feedback. As a guide, each rating question takes about six seconds on average to answer, and each word of written feedback takes about eight seconds. These seconds can add up. Indeed, it is common for 360s to take fifteen to twenty minutes or more to complete. Moreover, each participant typically nominates between six and eight feedback givers. So it is easy for the total time taken by all of them to give their feedback to be over two and a half hours. This may not sound like a lot for one individual, but once you start to multiply it by the total number of 360s you are doing, it can add up to a surprisingly large amount. Undoubtedly, 360-degree feedback can be a highly cost-effective method. But it needs to be done right.
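Here is a rough worked version of that arithmetic, using the per-item timings from the text; the question count, feedback length, rater count, and number of participants are illustrative assumptions.

```python
# Worked example of 360-degree feedback time costs: ~6 seconds per rating
# question, ~8 seconds per word of written feedback (figures from the text).
# Counts of questions, words, raters, and participants are assumptions.

RATING_SECONDS = 6   # per rating question answered
WORD_SECONDS = 8     # per word of written feedback

def minutes_per_rater(questions, feedback_words):
    """Time one feedback giver spends completing a single 360 form."""
    return (questions * RATING_SECONDS + feedback_words * WORD_SECONDS) / 60

per_rater = minutes_per_rater(questions=60, feedback_words=100)
per_participant = per_rater * 8            # eight feedback givers each
total_hours = per_participant * 200 / 60   # two hundred 360s this year

print(f"{per_rater:.0f} min per rater")           # 19 min
print(f"{per_participant / 60:.1f} h per 360")    # 2.6 hours
print(f"{total_hours:.0f} h of rater time total") # 516 hours
```

Even with these modest assumptions, a single 360 consumes over two and a half hours of other people's time, and a modest annual program runs into hundreds of hours.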


Cost and Time: Recommendations
1. Make sure you are clear on potential costs as well as immediate actual costs. If you are asking multiple vendors to submit proposals (which you should be), cross-referencing what each vendor has itemized can help reveal hidden or optional costs.
2. Calculate time-out-of-business costs. These can involve more than just the test taker, and although the costs may not be that significant for each case, the total can mount up.
3. Do not just buy cheap. It rarely pays.

Making the Choice

In this chapter, we have discussed various situational and company-specific factors to weigh as you choose methods for measuring talent. We have looked at location; what you want to measure, and why; whom you want to assess; legality and diversity; likelihood of success; and costs in time and money. For individual businesses, there may well be other issues, but for most, answering these questions will enable you to choose which tool to use.

These, of course, are just the broad issues, and they often need to be followed by more specific questions, such as which vendor to use, which personality test to use, or the capabilities of a particular 360-degree feedback tool. Unfortunately, we cannot answer this more specific type of question for you because it depends on your circumstances. But in the appendix at the back of this book, we do offer some pointers to help you make these more specific decisions.

Up to now, we have looked at how to identify talent—at the indicators of it that can be measured, the methods of measurement, and how to choose among them. Yet choosing the right measures and tools is only the start. If they are not implemented and used effectively, even the best tools have little impact. And in our experience, implementation—actually doing ongoing talent measurement in your company—is where the real challenges lie.

Over the next three chapters, we therefore look at how to implement talent measurement. To be clear here, we are not talking about operational issues such as how to run assessment centers or project-manage a large-scale assessment process, but broader implementation issues. In chapter 6, we explore the foundations that need to be in place for talent measurement to work effectively, such as competency frameworks, databases, and policy issues. In chapter 7, we look at what you need to do to ensure that measurement methods and tools are used to best effect and how to make the most of the data they provide. And finally, in chapter 8, we look at how to source the expertise to do these things and how to choose and manage measurement vendors.


Case Study
Using New Technology to Do More Than Just Measure
Online technologies are enabling measurement vendors to do some interesting new things, including fulfilling some of the nonassessment aspects of measurement. Consider the collaboration between one of the UK's best-known pharmacy-led retailers and the global measurement vendor Cubiks.
To support its ambition of providing excellent customer care, the retailer decided to develop a new recruitment process specifically designed to identify the candidates best able to deliver this customer experience. As part of the process, it asked Cubiks to develop a new online measurement tool that could help it efficiently and effectively short-list the best candidates from an estimated 1 million applications per year.
The tool obviously had to involve a robust assessment and be a strong predictor of performance. But it also needed to do more: it had to provide a highly engaging experience for candidates, support the company's brand, and present a detailed preview of the role so that candidates would have realistic expectations of it.
The tool took months to build, but it was worth the wait. Applicants are first shown around a store and presented with typical workplace scenarios. They are then "interviewed" by a virtual store manager, who asks them questions about how they typically behave in work settings. This interview is in fact a cleverly disguised personality test, which produces both detailed results about candidates' personalities and an overall "fit" score to assist the short-listing process.
Initial feedback from both candidates and hiring managers has been excellent, and the tool has won a number of industry awards. Candidates like it because it gives them a good sense of what working for the company is like. The business likes it because it communicates the company's brand values while making the hiring process both quicker and more effective.
It is too early to tell how predictive the test is. But the time taken to hire people has been significantly reduced, since fewer candidates are short-listed, and those who are appear to be of a higher caliber than in the past.

Notes

1. Ryan, A., McFarland, L., Baron, H., & Page, R. (1999). An international look at selection practices. Personnel Psychology, 52, 359–362.

2. König, C. J., Klehe, U. C., Berchtold, M., & Kleinmann, M. (2010). Reasons for being selective when choosing personnel selection procedures. International Journal of Selection and Assessment, 18, 17–27.

3. Huo, Y. P., Huang, H. J., & Napier, N. K. (2002). Divergence or convergence: A cross-national comparison of personnel selection practices. Human Resource Management, 41, 31–44.

4. Tyler, G., Newcombe, P., & Barrett, P. (2005). The Chinese challenge to the Big-5. Selection and Development Review, 21(6), 10–14.

5. Stagner, R. (1957). Some problems in contemporary industrial psychology. Bulletin of the Menninger Clinic, 21, 238–247.

6. Tracey, J. B., Sturman, M. C., & Tews, M. J. (2007). Ability versus personality: Factors that predict employee job performance. Cornell Quarterly, 48, 313–322.

7. Fallaw, S. W., & Kantrowitz, T. M. (2011). Global assessment trends report. Thames Ditton, Surrey: SHL.

8. Hausknecht, J. P., Day, D. V., & Thomas, S. C. (2004). Applicant reactions to selection procedures: An updated model and meta-analysis. Personnel Psychology, 57, 639–683; Smither, J. W., Reilly, R. R., Millsap, R. E., Pearlman, K., & Stoffey, R. (1993). Applicant reactions to selection procedures. Personnel Psychology, 46, 49–76.

9. Hausknecht et al. (2004); Walsh, B. M., Tuller, M. D., Barnes-Farrell, J. L., & Matthews, R. A. (2010). Investigating the moderating role of cultural practices on the effect of selection fairness perceptions. International Journal of Selection and Assessment, 18(4), 366–379; Anderson, N., Ahmed, S., & Costa, A. C. (2012). Applicant reactions in Saudi Arabia: Organizational attractiveness and core-self-evaluation. International Journal of Selection and Assessment, 20, 197–208.

10. Gilliland, S. W. (1995). Fairness from the applicant's perspective: Reactions to employee selection procedures. International Journal of Selection and Assessment, 3, 11–19.

11. Erker, S., & Buczynski, K. (2009). Are you failing the interview: Survey of global interviewing practices and perceptions. Pittsburgh, PA: Developmental Dimensions International.

12. Sumanth, J. J., & Cable, D. M. (2011). Status and organizational entry: How organizational and individual status affect justice perceptions of hiring systems. Personnel Psychology, 64, 963–1000.

13. Jones, S. (2011). The good, the bad and the ugly: A review of job candidate experiences of psychological testing. Assessment and Development Matters, 3(1), 5–6.

14. Bauer, T. N., Maertz, C. P., Dolen, M. R., & Campion, M. A. (1998). Longitudinal assessment of applicant reactions to employment testing and testing outcome feedback. Journal of Applied Psychology, 83, 892–903.

15. Sackett, P. R., Shen, W., et al. (2011). Perspectives from twenty-two countries on the legal environment for selection. In J. L. Farr & N. T. Tippins (Eds.), Handbook of employee selection. New York, NY: Routledge.

16. te Nijenhuis, J., de Jong, M., Evers, A., & van der Flier, H. (2004). Are cognitive differences between immigrants and majority groups diminishing? European Journal of Personality, 18, 405–434.

17. Fontaine, J.R.J., Schittekatte, M., Groenvynck, H., & De Clercq, S. (2006). Acculturation and intelligence among Turkish and Moroccan adolescents in Belgium. Unpublished manuscript, Ghent University.

18. Chan, D., & Hough, L. (2011). Categories of individual difference constructs for employee selection. In J. L. Farr & N. T. Tippins (Eds.), Handbook of employee selection. New York, NY: Routledge.

19. Hausdorf, P. A., LeBlanc, M. M., & Chawla, A. (2003). Cognitive ability testing and employment selection: Does test content relate to adverse impact? Applied HRM Research, 7(2), 41–48.

20. Huffcutt, A. I., & Roth, P. L. (1998). Racial group differences in employment interview evaluations. Journal of Applied Psychology, 83, 179–189; Bobko, P., Roth, P. L., & Potosky, D. (1999). Derivation and implications of a meta-analysis matrix incorporating cognitive ability, alternative predictors, and job performance. Personnel Psychology, 52, 561–589; Goldstein, H. W., Yusko, K. P., & Nicolopoulos, V. (2001). Exploring black-white subgroup differences of managerial competencies. Personnel Psychology, 54, 783–808.

21. Pyburn, K. M. Jr., Ployhart, R. E., & Kravitz, D. A. (2008). The diversity-validity dilemma: Overview and legal context. Personnel Psychology, 61, 143–151.

22. Thanks to Steve O'Dell of TalentQ for this tip.

23. McDaniel, M. A., Hartman, N. S., Whetzel, D. S., & Grubb, W. L. (2007). Situational judgment tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60, 63–91.

24. Ryan, A. M., Ployhart, R. E., & Friedel, L. A. (1998). Using personality testing to reduce adverse impact: A cautionary tale. Journal of Applied Psychology, 83(2), 298–307.

25. Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69(5), 797–811.

26. Sackett, P. R., Hardison, C. M., & Cullen, M. J. (2004). On the value of correcting mischaracterizations of stereotype threat research. American Psychologist, 59(1), 48–49.

27. Avolio, B. J., & Waldman, D. A. (1994). Variation in cognitive, perceptual, and psychomotor abilities across the working life span: Examining the effects of race, sex, experience, education, and occupational type. Psychology and Aging, 9, 430–442.

28. Specht, J., Egloff, B., & Schmukle, S. C. (2011). Stability and change of personality across the life course: The impact of age and major life events on mean-level and rank-order stability of the Big Five. Journal of Personality and Social Psychology, 101(4), 862–882.

29. Tett, R. P., Fitzke, J. R., Wadlington, P. L., Davies, S. A., Anderson, M. G., & Foster, J. (2009). The use of personality test norms in work settings: Effects of sample size and relevance. Journal of Occupational and Organizational Psychology, 82, 639–659.

30. Chartered Institute of Personnel and Development. (2011). Resourcing and talent planning. Annual survey report. London: Author.
