3
Conception of a Measurement Scale

3.1. Introduction

Before embarking on the long process of constructing a scale, it is essential to consider the type of phenomenon to be conceptualized. A precise definition of the construct, delimiting its boundaries, makes it possible to identify the nature of the concept more judiciously and to select an appropriate measure. So how should constructs be conceived? Should they be considered as causes or consequences of other phenomena? Are they by nature very abstract and difficult to identify, requiring more sustained efforts to measure them?

In this reflection, two fundamental aspects contribute to the conceptualization of a phenomenon. First, it is necessary to understand the relationships that may exist between the latent variable(s) representing the phenomenon and the observable indicators measuring it. After presenting the different variants at this stage, we will review the main selection criteria for each of them and the particularities of their operationalization. Second, we will discuss the scope of a large, multi-item scale versus a small, single-item scale for reflective measurements. Distinguishing the application framework of each option will later help inform some of the decisions to be made when developing a scale.

3.2. Conception according to the nature of indicator-construct relationships

Often latent, the concepts studied in marketing (example: satisfaction, quality, involvement) require a set of observable indicators in order to be operationalized. It is therefore legitimate to reflect on the nature of the relationships that may exist between each construct and its indicators. In other words, we must attempt to understand how the indicators inform the content of the phenomenon. Indeed, this specification has consequences not only for the conception of the measure but also for the overall approach and tools to be used. In this section, we will first present the different types of construct-indicator relationships, also called measurement models. Second, we will identify the main criteria defining the nature of the specification of a construct, as well as the general approach to be followed when conceiving the resulting measure.

3.2.1. The types of construct specification

Generally, two main conceptions shed light on the relationships between a latent variable (construct) and its indicators, giving rise to two main measurement models: the first, called reflective, focuses on the consequences of the latent construct, and the second, called formative, focuses on its causes. However, these two conceptions of the links between the construct and its indicators are not always mutually exclusive (Jarvis et al. 2003, 20041; Ellwart and Konradt 2011).

In the context of a study, it can happen that a construct may be specified as reflective but also as formative. For example, Christophersen and Konradt (2012) specified both types of models for the construct of online store usability, each demonstrating good predictive validity. Similarly, Gineikiene (2013) argued that the construct of nostalgia, often seen as reflective, can also be formative. These conceptions (formative or reflective) are sometimes even complementary, with both forms of relationship appearing at the same time, giving rise to a third type of specification called MIMIC (multiple indicators/multiple causes). Three specifications of the connection between a latent construct and its observable indicators are therefore possible. It should be noted that these relationships can be organized in structures at different levels, with relationship patterns of varying complexity2.

3.2.1.1. Reflective measurement scale

According to the first, reflective, specification, the direction of causality goes from the latent variable to its measures (indicators). In this case, the construct is assumed to be reflected in (that is, to influence) its indicators: the indicators are manifestations of the latent variable, and any change in the construct then affects the indicators. Several indicators are possible. They can be interchangeable, carry factor contributions (loadings, noted λ) and are associated with measurement error terms (noted e)3.

A reflective model, with three indicators, can be schematized according to Figure 3.1.


Figure 3.1. Reflective measurement model. For a color version of this figure, see www.iste.co.uk/frikha/marketing.zip
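In equation form, the model in Figure 3.1 can be written as follows (a standard rendering; λ and e follow the chapter's notation, while η is our label for the latent construct):

\[ x_i = \lambda_i \eta + e_i, \qquad i = 1, 2, 3 \]

Each indicator \( x_i \) is thus a manifestation of the construct, weighted by its loading \( \lambda_i \) and affected by its own measurement error \( e_i \).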

This first format has generated a great deal of interest in marketing research, as demonstrated by the large number of reflective concepts (attitudes, intentions, etc.). For example, Guillemot and Urien (2010) proposed a reflective measurement scale for the construct of motivations for the life story.

3.2.1.2. Formative measurement scale

For the second, formative, specification, also called an index, the link goes in the opposite direction, from the indicators to the latent variable, the latter being formed by its measures. Indicators are the sources or causes of the latent construct. The indicators are not interchangeable and are specific to the formation of the construct. Each brings its own contribution (outer weight, noted γ) to the formation of the construct. The omission of an indicator represents an omission of part of the phenomenon. Unlike reflective indicators, formative indicators are not associated with error terms. However, the latent variable (construct) is associated with an estimation error (ε)4. Among the constructs conceptualized as formative, we find, for example, the concept of social class.

A formative model, with three indicators, can be schematized according to Figure 3.2.


Figure 3.2. Formative measurement model. For a color version of this figure, see www.iste.co.uk/frikha/marketing.zip
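In equation form, the model in Figure 3.2 reverses this direction (again a standard rendering; γ and ε follow the chapter's notation, η is our label for the latent construct):

\[ \eta = \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \varepsilon \]

Each indicator contributes its outer weight \( \gamma_i \) to the formation of the construct, and the estimation error \( \varepsilon \) is attached to the construct itself rather than to the indicators.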

For example, Sanchez-Pérez and Iniesta-Bonillo (2004) conceptualized the consumer’s felt commitment towards the retailer as formative. The relative rarity of formative operationalizations in marketing, compared to reflective specifications, can be partly explained by the difficulty of conceptualizing the distinction between the two approaches, and by the statistical prevalence of factor analyses and structural equation methods, which make validation of the reflective approach easier.

3.2.1.3. MIMIC measurement scale

For the third specification, known as MIMIC (multiple indicators/multiple causes), the construct can be conceptualized in both a formative and reflective way, resulting in a model with both types of indicators, in this case a “hybrid” model. The phenomenon studied is not only formed by certain indicators (formative conception) but also reflects other indicators (reflective conception). This type of model can be the result of a thorough theoretical conceptualization and can sometimes be retained in order to allow the identification of a formative model by adding some reflective indicators. A MIMIC measurement model has two types of errors: estimation error(s) (ε) associated with the latent variable(s) and measurement errors (e) associated with reflective indicators.

A MIMIC model, with three reflective indicators (item 1, item 2, item 3) and two formative indicators (item 4, item 5), can be schematized according to Figure 3.3.


Figure 3.3. MIMIC measurement model. For a color version of this figure, see www.iste.co.uk/frikha/marketing.zip
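In equation form, the model in Figure 3.3 combines the two directions (same notational assumptions as above):

\[ \eta = \gamma_4 x_4 + \gamma_5 x_5 + \varepsilon, \qquad x_j = \lambda_j \eta + e_j, \quad j = 1, 2, 3 \]

so that items 4 and 5 form the construct while items 1 to 3 reflect it.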

This type of configuration is much less common, but a few constructs seem suited to it. For example, Uhrich and Benkenstein (2010) developed a MIMIC model to measure the construct of sport stadium atmosphere, with 15 formative indicators and 7 reflective indicators.

3.2.1.4. Specification levels

It is relevant to note that indicator-construct relationships can be specified at several levels. Indeed, a construct can be multidimensional, representing links between latent and observable variables at several levels. A measurement model can therefore be first-order, second-order or even multilevel. More explicitly, a first-order model specifies a single level of latent constructs explaining the covariances between indicators, with the observed items directly reflecting (or causing) the latent variables. However, latent constructs may themselves reflect (or cause) other latent constructs. Thus, a higher-order model, particularly a second-order model, refers to a higher level of abstraction than a first-order model (Chin 1998, p. 10). As with first-order measurements, higher-order measurements can represent relationships between indicators and constructs, with reflective or formative relationships at each level (Diamantopoulos et al. 2008). A second-order measurement model can thus take one of four configurations: reflective-reflective, reflective-formative, formative-reflective and formative-formative (Jarvis et al. 2003).

For example, a construct reflecting two latent components (factor A and factor B), which in turn reflect observable indicators, results in a second-order reflective-reflective model that can be schematized in Figure 3.4.


Figure 3.4. Second-order reflective-reflective measurement model. For a color version of this figure, see www.iste.co.uk/frikha/marketing.zip
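In equation form (a standard rendering; the labels ξ for the second-order construct, η_A and η_B for factors A and B, and β and ζ for the second-order loadings and disturbances are ours), the reflective-reflective structure of Figure 3.4 can be written:

\[ \eta_A = \beta_A \xi + \zeta_A, \qquad \eta_B = \beta_B \xi + \zeta_B \]

with each first-order factor in turn reflected in its own indicators, \( x_i = \lambda_i \eta_A + e_i \) or \( x_i = \lambda_i \eta_B + e_i \).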

Although constructs specified at several levels seem less frequent in marketing research, it is important to remember that the specification is not a result of improvisation: the study of the literature on the subject plays the key role in determining the constructs and their indicators. For example, Gurviez and Korchia (2002) conceptualized trust in the brand as a second-order construct (reflective-formative). Another example is the second-order (reflective-formative) scale of Guillard and Pinson (2012) measuring the tendency to keep everything.

3.2.2. The production protocols of a scale according to its nature

It is crucial, from the early stages of developing a scale, to define the nature of the relationships between the latent construct and its indicators. However, it is not uncommon to observe that users, or even scale developers, attach little importance to such an understanding of the phenomenon to be operationalized and, in most cases, assume that the measures they obtain have a reflective specification. This is a serious mistake.

Indeed, if a measure has been incorrectly specified, the resulting scale carries biases in the understanding of the phenomenon (Jarvis et al. 2004; Crié 2005; Petter et al. 2012), thereby disrupting the significance and quality of the data obtained. As an illustration, Collier and Bienstock (2009) found, for the construct of e-service quality, that the two specifications (reflective and formative) do not give rise to the same managerial conclusions. For example, they found that, in some cases, the most important indicator of a formative construct becomes the weakest in a reflective model. Coltman et al. (2008) discussed the specification of two constructs, integration-responsiveness and market orientation, often seen as reflective; according to them, however, theoretical and empirical evidence shows the opposite.

Diamantopoulos et al. (2008), having identified the errors induced by an incorrect specification of the construct, observed that the various evaluation tests of the model are either overestimated or underestimated. Is it then possible to trust the conclusions drawn from such a model? MacKenzie et al. (2011, p. 296) have pointed out that the distinction between formative and reflective models is crucial for several reasons:

  • – estimates of structural parameters can be biased when indicators forming a construct are modeled as being reflected by the construct;
  • – most of the scale development procedures recommended in studies apply only to latent constructs with reflective indicators; if applied to latent constructs with formative indicators, they may undermine conceptual validity.

Taking these observations into account, we will first examine some criteria for selecting types of specifications, then the particularities in the construction and validation procedures of the various specifications (reflective and formative) and finally, we will focus on the use of specifications in marketing research.

3.2.2.1. Criteria for selecting a specification

Specifying the nature of the relationship between indicators and the latent construct should not be done arbitrarily or intuitively, especially since several practical guides exist to facilitate the distinction between types of measurement. Four questions that researchers must answer have often been recommended to ensure the appropriate model for specifying a construct (Jarvis et al. 2004; Crié 2005). These questions concern:

  • – the direction of the causal link “construct-indicators”;
  • – the interchangeability of indicators;
  • – the relationship that may exist between the indicators;
  • – the antecedents and consequences of the measures.

More specifically, Jarvis et al. (2004, pp. 78–79) have argued that a construct is specified as formative when the following seven conditions are met:

  • – indicators are seen as characteristics that define the construct;
  • – changes in indicators are supposed to cause changes in the construct;
  • – changes in the construct are not expected to cause changes in indicators;
  • – indicators do not necessarily share a common theme;
  • – eliminating an indicator can alter the conceptual domain of the construct;
  • – a change in value of one of the indicators is not necessarily assumed to be associated with a change in all the other indicators;
  • – indicators are not supposed to have the same antecedents and consequences.

If these criteria are not met, the construct is then conceptualized as reflective.

It has been established that the most frequent specification errors, where they exist, relate to constructs being specified as reflective when they should be considered formative (Jarvis et al. 2004; Diamantopoulos and Siguaw 2006; Diamantopoulos et al. 2008). In addition, for many constructs, both types of specifications are possible (Diamantopoulos and Siguaw 2006; Bagozzi 2011; MacKenzie et al. 2011). However, this does not give the researcher the right to choose the nature of the specification of a construct intuitively, or even to retain both at the same time. It is only from a theoretical basis, one that allows a clear and precise definition of the conceptual expectations to be drawn, that the nature of the scale takes shape. Edwards and Bagozzi (2000) emphasized that the specification of measures should be based on a priori conceptual criteria and not on a posteriori empirical evidence, particularly when this evidence is a low level of reliability.

It is understandable that a researcher may at times be interested in the attributes that form a construct or in those it reflects, but the effort of conceptualizing the phenomenon remains the crucial element of such an undertaking. The selection of a measurement model (reflective, formative or even MIMIC) is a problem that must be given great attention when operationalizing a construct.

3.2.2.2. Main mechanisms for developing a scale according to its specification

The nature of the specification of a measure can only be fully established once the theoretical framework is sufficiently precise to examine whether the identified attributes form or reflect the construct. This implies that the first steps of scale construction and selection (definition of the construct, determination of indicators, content validation) are the same for any type of measurement model. Diamantopoulos and Siguaw (2006) believe that generating an item would not differ from one perspective to the other. Thus, the content validity of a set of items is very important to examine in all cases (formative and reflective) in order to capture the conceptually expected relationships between the indicators and the construct (or its dimensions). To guide this specification, the recommendations of Jarvis et al. (2004), highlighted above, can be of great help in clarifying the nature of the measure.

However, it should be recalled that for formative measures, many researchers agree that traditional procedures for developing and validating reflective constructs are not applicable (Diamantopoulos and Winklhofer 2001; Jarvis et al. 2004; Diamantopoulos and Siguaw 2006; Roy 2008; Bagozzi 2011; Malhotra et al. 2012). As such, Bagozzi (2011, p. 268) notes that “traditional procedures for detecting and controlling for random and systematic error rely on internal consistency measures of reliability and classic ideas of construct validity”. Bagozzi (2011) added that “reflective approaches to measurement lend themselves to such procedures as Cronbach alpha and multitrait-multimethod matrices. Similar procedures do not exist for formative approaches to measurement at this time”. This still seems to be the case today.

For reflective models, the indicators are assumed to reflect the latent variable, encouraging redundancy between items. In a way, the more correlated the indicators are, the better they represent the phenomenon under study. Purification procedures such as exploratory factor analysis and Cronbach’s alpha reliability tests are then relevant. Formative indicators, by contrast, are not supposed to be correlated and are therefore not redundant. This means that, from the beginning of the scale production process, it is necessary to identify all the indicators, because any omission entails the loss of part of the construct or may even change its meaning. Applying a classical statistical purification protocol (usually intended for reflective models) to formative items may then lead to the deletion of part of the measurement of the phenomenon in question, disrupting the content validity of the instrument obtained. Diamantopoulos and Siguaw (2006), who examined the consequences of a reflective conceptualization instead of a formative one, observed this type of error, since the set of formative items retained can be very different from the set of reflective items.
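As an illustration of this kind of internal-consistency check for reflective items, here is a minimal Python sketch (the function and the respondent scores are hypothetical, not taken from the chapter):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items matrix of scores."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents rating 3 reflective items on a 7-point scale
scores = np.array([
    [6, 7, 6],
    [3, 2, 3],
    [5, 5, 6],
    [2, 3, 2],
    [7, 6, 7],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")  # high alpha: redundant, correlated items
```

The higher the correlations between the reflective items, the higher the alpha; this is precisely the redundancy that a formative index does not require.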

It is judicious, at this point, to insist that the traditional procedure for the construction of a measurement scale (definition of the construct, generation of the items, content validity, data collection, exploratory factor analyses, reliability tests, construct validity, nomological validity) recommended by Churchill (1979), and the various adaptations associated with it, concern reflective measurements. For formative measures, some steps and/or protocols are not suitable (e.g. alpha reliability tests, the multitrait-multimethod matrix). Several researchers have examined the methods for evaluating a construct with formative indicators compared to a construct with reflective indicators (MacKenzie et al. 2011).

With regard to the conceptualization and analysis of a formative model, recommendations structured in four main steps were suggested by Diamantopoulos and Winklhofer (2001):

1) content specification: this consists of determining the scope of the latent variable and the content it is intended to capture. This step is crucial insofar as the latent variable is determined by its indicators rather than the other way around. We note that a formative construct seems more abstract than a reflective construct. In the absence of a clear definition, there is a great risk of losing sight of some of the indicators important to the formation of the scale, resulting in the failure of the resulting measure;
2) indicator specification: the items selected must cover the content of the construct studied (the specified latent variable). While, for reflective models, a sample of indicators is supposed to represent the construct, for formative models a census of all indicators is required. Failure to take an indicator into account changes the latent variable;
3) indicator collinearity: the formative indicators must not present collinearity problems. In other words, the indicators are not expected to co-vary with one another. With high correlations between indicators, it is indeed difficult to quantify the effect of each observed variable, taken individually, on the latent variable. To test multicollinearity, Diamantopoulos and Winklhofer (2001) specify that the VIF (variance inflation factor), which should be low, can be used. Hair et al. (2011) suggest that this index should be less than 5 (see the sketch after this list);
4) external validity: this involves studying how the index (scale) representing a construct is related to other constructs with which it has theoretical relationships. It is a matter of establishing predictive or nomological validity. Diamantopoulos and Winklhofer (2001) propose to do this by testing either MIMIC models, which contain multiple indicators and multiple causes, or the relationship between a formative model and a reflective model that are supposed to be theoretically linked.
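To make step 3 concrete, here is a minimal sketch of a VIF check in Python (an illustration assuming the statsmodels package; the item names and simulated data are hypothetical):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical candidate indicators for a formative index (roughly centered,
# so no intercept column is added before computing the VIFs)
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "item1": rng.normal(size=200),
    "item2": rng.normal(size=200),
})
X["item3"] = 0.9 * X["item1"] + 0.1 * rng.normal(size=200)  # near-collinear with item1

for i, col in enumerate(X.columns):
    vif = variance_inflation_factor(X.values, i)
    flag = "collinearity concern" if vif >= 5 else "acceptable"  # Hair et al. (2011): VIF < 5
    print(f"{col}: VIF = {vif:.2f} ({flag})")
```

Here item3 is deliberately constructed to be almost redundant with item1, so both show inflated VIFs, whereas item2 stays close to 1.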

We invite scale developers to ascertain the nature of the construct-indicator relationships conceptually recommended in the literature before progressing with the construction of their measurement tools, in order to follow the appropriate protocol. More specifically, for a formative model, after identifying the observable indicators, it is necessary to establish the content validity of all the generated indicators. This phase is essential for all types of models (reflective and formative), and the definition and generation procedures are the same. To this end, the literature review and the use of experts are very useful.

However, it would not be relevant to use exploratory factor analysis at this level (formative model). Some scale developers have used such an analysis5 to avoid problems of multicollinearity between indicators, calculating factor scores that become manifest variables replacing the initial items in their scale estimation (example: Christophersen and Konradt 2012). For Diamantopoulos et al. (2008, p. 1212), however, this practice seems open to criticism on two grounds:

  • – first, the interpretation of the common index of the combined indicators is not clear;
  • – second, it is not possible to estimate the importance of each of the initial items in the formation of the construct.

Let us add, for formative models, that it is not possible to compute internal consistency reliability via Cronbach’s alpha, but that test-retest reliability can be considered (Diamantopoulos 2005, p. 8). In addition, the convergent and discriminant validity verification protocols applicable to reflective measurements cannot be used (Roy 2008). Beyond the construct validation protocols to be put in place for formative models, Wong (2013, p. 28) clearly states that internal consistency reliability and discriminant validity are not criteria for evaluating this type of model because the indicators are not correlated with one another. For Hair et al. (2011, p. 146), since the “formative indicators are assumed to be error-free […] the concepts of internal consistency reliability and convergent validity are not meaningful […] theoretical rationale and expert opinion play a more important role in the evaluation of formative indexes”.
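By way of illustration, test-retest reliability can be assessed as the correlation between the index scores obtained from the same respondents at two points in time (a minimal sketch with hypothetical scores):

```python
import numpy as np

# Hypothetical formative index scores for the same 6 respondents, two administrations
t1 = np.array([3.2, 4.1, 2.8, 5.0, 3.7, 4.4])
t2 = np.array([3.0, 4.3, 2.9, 4.8, 3.9, 4.2])

r = np.corrcoef(t1, t2)[0, 1]  # stability coefficient: close to 1 = stable measure
print(f"test-retest r = {r:.2f}")
```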

To judge the adequacy of formative items for representing the latent construct, Rossiter’s (2002) proposal on the importance of content validity makes sense within the framework of formative specifications. Thus, inter-judge reliability indices may be relevant, but not sufficient, to evaluate such a model. In addition to theoretical foundations and expert views, empirical justifications are needed. A few additional statistical indicators then make it possible to see more clearly, in particular the VIF index for examining the collinearity of the indicators, and the weights of the indicators and their significance (Hair et al. 2012). MacKenzie et al. (2011), supported by a body of research, suggest that, where this does not alter content validity, one should remove from a formative scale:

  • – indicators that do not have a significant weight, as they are considered irrelevant to the formation of the construct;
  • – indicators with a high VIF value, because they are seen as lacking validity.

Diamantopoulos (2006) focused on understanding the error term associated with the formative model and pointed out that the variance of the error term associated with the latent variable provides information on the validity of a formative construct: when it is low, the set of indicators is complete (no omission of items), ensuring good validity.

A formative model is statistically unidentified (or under-identified) unless it is related to other variables. Ellwart and Konradt (2011) point out, in this respect, that a formative model is evaluated by estimating its relationships with reflective constructs. Jarvis et al. (2003) recommend that, to validate latent constructs composed of indicators (formative constructs), one should seek to establish relationships with other constructs, hence the attention given to nomological validity. More explicitly, in order to test and analyze a formative model, two options are generally possible according to Bagozzi (2011, p. 270):

  • – either add reflective indicators to the model, resulting in a MIMIC model (multiple indicators/multiple causes);
  • – or relate the model to another latent variable composed of reflective indicators.
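As an illustration of the first of these options, here is a sketch of a MIMIC specification using the semopy package (an assumption on our part: semopy accepts lavaan-style syntax, where =~ declares reflective indicators and ~ regresses the construct on its causes; the data file and item names are hypothetical):

```python
import pandas as pd
import semopy

# MIMIC specification mirroring Figure 3.3: eta is reflected by items 1-3
# and formed (caused) by items 4-5
desc = """
eta =~ item1 + item2 + item3
eta ~ item4 + item5
"""

data = pd.read_csv("survey.csv")  # hypothetical respondents x items data file
model = semopy.Model(desc)
model.fit(data)
print(model.inspect())  # estimated loadings, weights and their significance
```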

Depending on the option chosen, the model estimation methods may be different (Roy 2008; Cenfetelli and Bassellier 2009). Thus, several indices can be used to evaluate the conceptualized measurement model. It therefore appears that not only are the two models (reflective and formative) conceptually different, but their tests and the results they provide are different too. Table 3.1 summarizes some of the specificities of each:

Table 3.1. Particularities of reflective and formative models

Reflective model

Characteristics:
– The indicators are assumed to be the reflection (manifestations or consequences) of the latent variable.
– Items can be redundant and interchangeable.
– Items can be correlated.
– Several indicators are possible.
– A sample of items may be sufficient to cover the construct.
– Adding or deleting an item does not necessarily change the construct.

Protocol development:
– Traditional scale development protocols, such as those suggested by Churchill (1979) and the various improvements recommended, particularly for studying the validity of the construct, are applicable.

Formative model

Characteristics:
– The indicators are assumed to be the cause (source) of the latent variable.
– The items are not redundant.
– The items are not interchangeable.
– Items are assumed not to be correlated.
– A census of all indicators is essential to cover the construct.
– Adding or deleting an item changes the construct.

Protocol development:
– Some traditional scale development protocols, such as exploratory factor analysis (EFA), alpha reliability testing and the multitrait-multimethod matrix (MTMM) for construct validity testing, do not apply.
– The four-step procedure proposed by Diamantopoulos and Winklhofer (2001) is more appropriate for the development and evaluation of this type of scale.

3.2.2.3. Use of specifications: findings

Without downplaying the potential of formative measures, it should be noted that reflective scales are in widespread use in marketing research. Moreover, even if some insights have been provided for setting up and evaluating a formative model, notably by Diamantopoulos et al. (2008), the latter (p. 1211) have pointed out that empirical applications of formative models are still rare and that studies do not provide sufficient information about them. Hardin (2017) adds that formative models require a theory to guide their use; according to him, several questions still need to be resolved before causal-formative indicators can be supported. Bagozzi (2007) and Howell et al. (2007) go even further, suggesting that reflective models are better measurement alternatives and recommending the use of reflective measurements whenever possible. Wilcox et al. (2008) also agree, suggesting that, for testing theories, formative measures are not as good as reflective measurement models. In addition, the error of specifying a model as reflective when it is in fact formative seems to affect marketing and psychological constructs less (Jarvis et al. 2004). Chang et al. (2016) also support reflective specifications and even believe that reflective specification errors are less harmful and generate less bias than formative specification errors. This seems to underline that several marketing phenomena have been modeled, without specification errors, in reflective formats.

3.3. Conception of a single-item or multi-item measurement scale

Among the many discussions associated with the number of items in reflective measurement scales6, two opposing logics are easy to observe. Churchill (1979) emphasizes the importance of using multi-item scales. This thesis was challenged by Rossiter (2002), who defends the use of short or even single-item scales. Between supporters of multi-item scales and supporters of short scales, the debate remains open. In this section, we will present the arguments in favor of each option.

3.3.1. The C-OAR-SE procedure: controversial contributions

Supporting the relevance of short measurement, the “C-OAR-SE”7 procedure attempted to provide a framework for analyzing constructs from an operational perspective. This approach, however, has been the subject of much contentious debate. In this section, we will first summarize the main foundations of the C-OAR-SE procedure, then the main discussions it has generated.

3.3.1.1. The C-OAR-SE paradigm: basic principles

Rossiter (2002) proposed the “C-OAR-SE” scale development procedure. He assumes that short or even single-item scales are able to capture latent constructs. Without going back over the details of the various proposed steps, we note that Rossiter stipulates that the definition of the construct must specify the object (example: a company, a product, a brand), the attribute on which the evaluation must be made (example: attitude, quality, satisfaction) and the evaluator (example: respondents to a survey) in order to indicate how the construct must be measured. Rossiter proposes a categorization of objects into three classes: concrete singular, abstract collective and abstract formed. Similarly, he proposes a classification of the attributes of the construct, that is, the dimensions on which the object will be evaluated, into three types: concrete, formed and eliciting. Rossiter believes that the evaluating entity is an integral part of a construct, because the latter depends on the perspective formulated by it (the respondents). Rossiter argues that if the construct has a concrete definition, a long multi-item scale can be replaced by a shorter scale with only a few items or even a single item. According to him, if the construct is easily and uniformly imagined by the evaluator, a single indicator (item) is sufficient for its measurement.

Among the other basic aspects of the C-OAR-SE paradigm, the content validity of the scale is important to consider. According to Rossiter, this validity may be sufficient to authorize its use; a traditional statistical purification of items is not necessary. He also stressed that the validity of a construct must be established independently of other measures. For him, content validity is the validity of the construct and must be certified by experts.

3.3.1.2. Discussion of the C-OAR-SE approach

This perspective of using single-item scales is strongly supported by some researchers. For example, Bergkvist and Rossiter (2007; 2008)8, supporting the C-OAR-SE procedure, strongly criticized the use of multi-item scales and demonstrated, for the two constructs they studied (attitude towards the ad and brand attitude), that the predictive validity of a single-item scale can be comparable to that of a multiple-indicator scale. These results were confirmed again by Bergkvist and Rossiter (2009) for three constructs (attitude towards the ad, brand attitude and brand purchase intention). This means, according to them (2008), that theoretical and empirical tests using both types of measurement (single-item and multi-item) yield similar results. In response to criticism of his research findings, Bergkvist (2015) again supported the relevance of scales with a single indicator. Consequently, it becomes legitimate to ask the following question: are all constructs well established, concrete and therefore understandable from the respondents’ point of view, so as to allow simplified measurements with a single indicator?

Bergkvist and Rossiter (2008) attempted to provide an answer, noting that many constructs, such as materialism and job satisfaction, are abstract and multifaceted, making the use of a single-item scale obsolete and meaningless. They add, however, that it is possible to divide these abstract constructs into concrete components and associate an item with each of them. Spörrle and Bekk (2014) adopted this logic for a personality measure (Big Five), establishing that a single indicator per dimension can provide a reliable and valid instrument. Konstabel et al. (2017) have also developed a short version of a personality measurement scale in which each dimension is represented by one item instead of several. It remains to be seen whether this is at all easy to achieve. The debate around the results of Bergkvist and Rossiter (2008; 2009) gave rise to divergent exchanges between Sarstedt et al. (2016a; 2016b), who believe that single-item scales for some constructs can show too much variability, and Bergkvist (2016), who is not convinced.

Diamantopoulos (2005) also discussed the C-OAR-SE approach and found that its definition of the construct is ambiguous and not in line with standard definitions: for example, the inclusion of the assessing entity lacks a theoretical basis. In addition, he noted that the different classifications of objects and attributes are confusing. In short, although Diamantopoulos (2005) acknowledged some undeniable contributions of the C-OAR-SE approach, including the importance of content validity and the use of experts to attest to it, he pointed out that complementary methodologies are needed to establish measures that researchers can refer to with confidence. A few years after this study, Diamantopoulos et al. (2012) argued that single-item scales should be used with great caution because, even if they have some advantages, such as ease of administration and reduced cost (especially in data collection and processing), they often do not seem to have good predictive validity compared to multi-item scales. In addition, Diamantopoulos et al. (2012) noted that even when the construct is concrete and understandable for respondents, allowing single-item measurement, choosing the indicator in question is not an easy task because, for example, an item may be good for one expert but not for another. Moreover, even if some items are judged equivalent for capturing the same construct according to experts, respondents may have different points of view. In this regard, Ahlawat (1985), after verifying that the content validity established by experts for different item formulations was similar, found that these items measured different constructs according to the respondents (the correlations between them were low). Malhotra et al. (2012) noted that, for single-item measurements, reliability cannot be examined. In addition, Malhotra et al. (2012, p. 840) pointed out that even if content validity (established through experts) “is a necessary and useful criterion for scale development […] it cannot be accepted as a sufficient condition for good measurement because the data collection process is always vulnerable to various response biases […] such as social desirability, etc.”. Similarly, Rigdon et al. (2011) disagree that only expert opinion matters: they prefer an approach that takes into account expert opinion as well as the conceptual and empirical domains. Rigdon et al. (2011) believe that the C-OAR-SE procedure can provide a valuable guide for scale development, but that it is dogmatic to consider it appropriate in all cases. A single item to capture the meaning of a construct seems insufficient and a source of bias. In fact, the C-OAR-SE scale construction procedure has not been very successful, as Rossiter (2008) himself acknowledges, attributing this to its difficulty and the time required for its implementation.

3.3.2. The choice of scale conception: large or small?

According to the preceding debates, in case of doubt about the level of abstraction of a latent construct, it would seem wiser to capture it through a multi-item scale. Moreover, when a researcher assumes that a single-item scale is capable of grasping the construct of interest, it is appropriate to undertake the necessary tests to verify the meaning of the item. If the item refers to a single meaning and is not interpreted differently by respondents, then the single-item instrument can be used. In other words, a single-item measurement is possible when it is not subject to ambiguity of interpretation. For example, the construct of purchase intention has often revealed a clear understanding in the minds of respondents, and a single item has seemed potentially sufficient to measure it.

Questioning the relevance of single-item measures in the context of management studies, Fuchs and Diamantopoulos (2009) argue that they should not be ruled out. However, they suggest that their use is subject to a set of criteria: the nature of the construct (the more concrete it is, the more feasible a single-item measure is), the nature of the instruments (the more redundant the items are, the more feasible a single-item measure is), the research objective (the more it refers to an imprecise general knowledge of the construct, the more feasible a single-indicator measure is), etc. Malhotra et al. (2012) note, among other things, that single-item scales make it easier to obtain answers and are acceptable during preliminary explorations. Sarstedt et al. (2016a) argue that single-item scales should only be used when there are constraints associated with the study, such as budgetary constraints, difficulty in recruiting respondents or a limited population size. They add, however, that such scales have low predictive power, which can easily give rise to errors.

In order to make the choice between a single-item and a multi-item scale clearer and more operational, Diamantopoulos et al. (2012) have proposed a decision guide for the first steps of a research project. According to this guide, multi-item measurements are clearly the most common form; single-item scales are of more limited use, given the conditions of application suggested (small sample size, high correlations between items, exploratory research). For her part, Bassi (2011) recalled that several concepts, particularly in marketing, are multidimensional. She added that it is often “unrealistic to measure attitudes towards complex objects (phenomena) with single-item scales”. This fact seems well established through the work carried out over several decades by many researchers (Churchill 1979; Peter 1979; Diamantopoulos et al. 2012; Malhotra et al. 2012). Nor is this a particularity of marketing research; it seems to extend to other areas. As such, Rattray and Jones (2007), based on a set of findings from other studies, pointed out that it is not usual to develop a scale based on a single indicator, and that multi-item scales are generally preferred to avoid interpretation bias and to reduce measurement errors.

Thus, a single indicator does not seem able to fully capture a latent construct. Certainly, a high number of items can generate other problems, but it can also have many advantages. In this regard, Malhotra et al. (2012) have summarized, on the basis of several studies, some advantages of multi-item scales: they make it possible to capture all the facets of a construct, they are useful when the objective of the research is to determine the impact of the different facets of a construct, and they suit confirmatory research requiring high reliability. Bergkvist and Rossiter (2008) also pointed out that a multi-item scale conveys more information and is more reliable than a single-item scale. Indeed, the meta-analysis undertaken by Churchill and Peter (1984) found that the number of items on a scale has important effects on the reliability estimate, which is higher when the number of indicators increases (a worked illustration of this effect follows Table 3.2). Peter and Churchill (1986), also through a meta-analysis, noted that reliability has a strong impact on some aspects of the validity of a measure. Table 3.2 summarizes the possibilities of use of each format (single-item and multi-item):

Table 3.2. Wide or short conception: potential for use

Single-item scale

Potential uses:
– Applicable for concrete, understandable constructs that are not tainted by ambiguity of interpretation.
– Applicable for preliminary examinations or exploratory studies.
– Allows cost advantages (data collection and processing) and ease of administration.

Multi-item scale

Potential uses:
– Applicable for abstract, complex constructs.
– Applicable for multidimensional constructs.
– Applicable for confirmatory research.
– Allows better coverage of the construct by producing more information.
– Avoids interpretation bias.
– Reduces measurement errors.
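As the worked illustration announced above, the classical Spearman-Brown prophecy formula (a standard psychometric result, not drawn from this chapter) shows why reliability rises with scale length: a scale lengthened to \( k \) parallel items from a single item of reliability \( \rho_1 \) has predicted reliability

\[ \rho_k = \frac{k\,\rho_1}{1 + (k - 1)\,\rho_1} \]

For example, a single item with \( \rho_1 = 0.50 \) reaches \( \rho_3 = (3 \times 0.50)/(1 + 2 \times 0.50) = 0.75 \) when extended to three parallel items, consistent with the meta-analytic finding of Churchill and Peter (1984).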

3.4. Conclusion

It is clear that the preliminary steps involved in the conception of a measurement scale are of great importance in delineating the boundaries of a construct and giving it a precise definition. Such efforts make it possible to determine the nature of the conceptualization to be considered and the overall methodological procedure to be implemented in order to develop a scale that can capture the significance of the phenomenon studied. No solution fits all situations by default; several measurement conception schemes are possible. Scale developers must consider different solutions, on the one hand depending on the nature of the construct-indicator relationships (formative or reflective) and, on the other hand, according to a broad or short conception (multi-item or single-item) for reflective measurement models.

3.5. Knowledge tests

1) What are the different conceptualizations of “construct-indicator” relationships?
2) How does one distinguish a reflective measure from a formative measure?
3) Can a construct be specified in different ways? Can it be specified at different levels? Why?
4) Do the different categories of specifications use the same tools for constructing and validating a measurement scale?
5) What are the main features of the test and validation protocols for formative measures?
6) What are the limits of a single-item scale?
7) What are the advantages of a multi-item scale?
8) What is the basis for choosing a conceptualization for a construct?
9) What types of conceptualization are more commonly used in marketing research?
1. This is the same article by Jarvis et al., originally published in 2003.
2. By graphical convention, latent constructs and errors are represented by circles or ovals, and observable indicators take the form of rectangles or squares.
3. The factor contributions (λ) and error terms (e) associated with the reflective model are obtained within the statistical framework of confirmatory factor analysis (CFA), used to validate (convergent validity and discriminant validity) the measurement model. For more information on CFA, see Chapter 7.
4. The factor weights (γ) and error term (ε) associated with the formative model are obtained as part of the statistics used to validate the measurement model.
5. In this case, principal component analysis.
6. It should be recalled that, for formative measures, it is necessary to identify all relevant indicators. The debates in this section, relating to the choice between a single-item scale and a multi-item scale, therefore do not concern them.
7. C-OAR-SE: abbreviation for Construct, Object, Attribute, Rater, Scale and Enumeration.
8. This is the same article by Bergkvist and Rossiter, originally published in 2007.