Research methodology assignment 1
Assessment name: Quantitative research design
Description: You will be provided with a quantitative research article to critique. The critique will involve the identification of the strengths and limitations in the design of the study based upon the principles and assumptions of good research design consistent with realising the article’s stated research purpose(s) from a quantitative perspective. On this basis individuals will then develop an alternative research design consistent with realising the research purpose(s), again based upon the principles of good research design, from a quantitative perspective. Opportunities to practise article critique and designing quantitative research based upon the principles of good research design will be provided in class. Conceptual material to support research design decisions is provided on Blackboard. Note that this assessment does not involve data collection.
Length: 2500 words excluding references, Figures and Tables
Formative or Summative: Formative and Summative
A TEMPLATE FOR READING AND EVALUATING RESEARCH
Article: Lukas, B.A., Whitwell, G.J., & Heide, J.B. (2013). Why Do Customers Get More Than They Need? How Organizational Culture Shapes Product Capability Decisions. Journal of Marketing, 77, 1-12.
Methodology / research issue | Description | Evaluation – strengths and limitations | Redesign options to address negative evaluations (where appropriate)
PART A. PURPOSE OF THE STUDY.
The aims, research questions and/or hypotheses
In your own words, explain the purpose of the study, and the RQs/hypotheses of the study.
a. independent and dependent variables (more commonly associated with experimental designs) or;
b. predictor and outcome variables (more commonly associated with non-experimental designs);
c. mediators or intervening variables;
d. control variables;
e. any other variables.
The aim was to explain why firms provide products to customers with greater capabilities (e.g., functions) than the customers require or want. The authors hypothesised that two particular dimensions of organizational culture, specifically the degree of adhocracy culture and market culture, lead to increased over-provision of product capabilities. They argue this is the case because these two culture types reflect a commitment to firm differentiation via being ‘leading edge and competitive’, which can potentially lead to over-provisioned products. These relationships are then hypothesised to be moderated by customer orientation, the degree to which firms place customer interests before firm interests. Higher customer orientation is hypothesised to reduce the effect of culture on over-provision compared to the effect of lower customer orientation.
Predictors or IVs: 2 dimensions of organisational culture, adhocracy culture and market culture.
Outcome or DV: product capability provision.
Moderators: customer orientation.
Control variables: four: supplier reputation; supplier product experience (see page 5, col 1, para 6); and two other culture dimensions, bureaucracy and clan culture.
Other variables: none.
How clear, specific and understandable are the aims, RQs, and/or hypotheses?
The aims and hypotheses seem specific.
To evaluate if they are sound hypotheses requires understanding the literature in this area. It is difficult to evaluate how sound hypotheses and RQs are without good content knowledge.
We can, however, identify the types of variables in the study.
Are the RQs, and/or hypotheses able to be written more clearly?
I think they are fine.
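To make the moderation hypotheses concrete, assume (as a simplification, not a statement of the authors' exact model) a regression of the form: over-provision = b0 + b1(adhocracy) + b2(customer orientation) + b3(adhocracy × customer orientation) + controls. The effect of adhocracy culture on over-provision is then b1 + b3 × customer orientation, so a negative b3 means the effect weakens as customer orientation rises. The sketch below uses hypothetical coefficients chosen only for illustration; they are not the article's estimates.

```python
def conditional_slope(b_culture: float, b_interaction: float,
                      customer_orientation: float) -> float:
    """Slope of over-provision on a culture score at a given level of
    (mean-centred) customer orientation, under the assumed model form."""
    return b_culture + b_interaction * customer_orientation


# Hypothetical coefficients (NOT taken from the article).
b_adhocracy = 0.40        # assumed main effect of adhocracy culture
b_adhocracy_x_co = -0.15  # assumed interaction with customer orientation

# Evaluate the slope one unit below and one unit above the mean
# of customer orientation.
low_co = conditional_slope(b_adhocracy, b_adhocracy_x_co, -1.0)
high_co = conditional_slope(b_adhocracy, b_adhocracy_x_co, 1.0)

print(f"slope at low customer orientation:  {low_co:.2f}")
print(f"slope at high customer orientation: {high_co:.2f}")
```

With a negative interaction coefficient, the slope at high customer orientation is smaller than the slope at low customer orientation, which is the pattern the moderation hypotheses describe.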
PART B. THE RESEARCH DESIGN
Categorise and briefly describe the research design.
1. Categorise the design: Note that more than one of the above may apply.
• Experimental design?
• Quasi-experimental research?
• Quasi-experimental design
• Written questionnaire survey research?
• Interview design
• Case study design
• Combination of above?
2. Briefly describe the research design. Include any significant design elements present?
1. A cross-sectional questionnaire design was used. The unit of analysis was the dyad, and the focus of the data collection concerned one product (from within one ‘strategic business unit’ within a firm). More detail is provided about the product selected by the supplier on page 5, col 2.
2. Matched supplier-customer dyads were used. Suppliers provided data on the IVs and moderator (dimensions of culture and customer orientation), and the customer provided data about the DV. See p. 4 and p. 10. Thus different sources of data were used for IVs and DVs, reducing the likelihood of common source bias in the data.
Weakness: A cross-sectional design cannot demonstrate causality. The detailed reasons for this are taken up in Part E (lack of temporal separation, inability to control other plausible causes).
Weakness: Common method bias could exist in that the same method of data collection was used for all data (all written questionnaires).
Strength: The use of matched dyadic data collection. Several positive aspects to this:
1. Increases validity of results as we know customers know the product;
2. It prevents inflation of the correlations (and thus regression coefficients) between the IVs and DV by using different sources of data (i.e., it reduces one form of common method bias; but see the method weakness noted above).
Strength: Use of control variables to control other plausible explanations.
How well is the research design aligned to the stated research purpose? Would a different /adapted design be better? If so, what changes do you suggest and why?
There is a strong match. The RQ could be answered, but only as associations among variables, not as causal links.
To remove all other plausible explanations, an experiment would be needed. This would be impossible to achieve.
Temporally separating the IV and DV might help to demonstrate which came first, but this depends on how long it takes for culture to affect (cause) capability provision. It may take a number of years for changes in culture to influence capability provision. This would require a long-term panel design in which all variables are measured repeatedly, probably over many years. This will not in itself exclude other plausible causes.
To reduce common method bias further, other forms of measures apart from questionnaires might be investigated. It is left to you to think what these might be.
For each of the variables identified in part A above, provide:
1. The Conceptual definition;
2. The Operational definitions
For measured variables:
3. Was the measure an existing scale, an existing scale with some adaptation; a new scale? Provide the name and reference for the scale if not original.
4. Briefly describe the measurement properties of the scale, e.g., type of scale (rating scale, semantic differential scale, Thurstone scale); number of items; number of rating points; anchor labels, etc.
5. Was evidence of Reliability of the measure provided?
6. Was evidence of the construct Validity of the scale provided?
7. Was the measure provided, or information provided as to its availability?
For experiments: (this was not an experiment so not relevant)
1. How were the IVs manipulated?
2. What evidence of the construct validity of the manipulations was provided? (e.g., manipulation checks; other research).
Coding strategy specified? (No coding took place)
Below is an example of a critique of the measurement of the culture variables. A similar style of critique can occur for all variables.
Predictors or IVs: 2 dimensions of organisational culture.
Adhocracy culture and market culture.
1. Adhocracy: An external focus on firm differentiation through being leading edge and providing unique products/services.
2. Summed (or averaged) scores on items developed by Cameron and Quinn, 2006.
3. An adaptation of the Competing Values Framework scale by Cameron and Quinn (2006). The scale was modified by asking participants to rate each culture on a 1-7 scale. (To see what this really means, and to evaluate it, one would need to find the original scale in which the items were developed and compare the scales).
4. Each culture dimension was assessed by rating 4 items on a 7 point scale, anchored at the ends by “completely inaccurate description” and “completely accurate description”.
5. Reliability: Yes, on page 6 and table 1. They reported composite reliability as their reliability measure (common in marketing studies). You would have to find out what ‘composite reliability’ means to evaluate it.
6. Construct validity: Authors removed items which did not correlate with total scale. They do not report which items. None removed from culture measures. Provided a factor analysis of (composite/parcelled) items. Discriminant and divergent validity of (composite) items established. Evidence of construct validity not provided. But culture framework well established and the original measure widely used. Cameron and Quinn (2006) provide reliability and validity data in Appendix A of their book. Other studies have been conducted on the validity of the scale. To make a full assessment, these would need to be examined.
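As background for evaluating item 5, composite reliability is commonly computed from standardised factor loadings as (sum of loadings)² / [(sum of loadings)² + sum of error variances], where each error variance is 1 minus the squared loading. The sketch below applies that usual formula. The loadings are hypothetical values for a four-item scale, chosen only for illustration; they are not the loadings reported in the article, and the function name is our own.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite reliability from standardised factor loadings.

    Assumes uncorrelated item errors and standardised loadings, so each
    item's error variance is 1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    sum_loadings_sq = sum(loadings) ** 2
    sum_error_var = sum(1.0 - lam ** 2 for lam in loadings)
    return sum_loadings_sq / (sum_loadings_sq + sum_error_var)


# Hypothetical standardised loadings for a four-item culture scale.
example_loadings = [0.70, 0.80, 0.75, 0.65]
cr = composite_reliability(example_loadings)
print(f"composite reliability: {cr:.3f}")
```

Unlike a simple average of item correlations, this index weights items by their loadings, which is why it is usually reported alongside a confirmatory factor analysis.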
Outcome or DV: product capability provision.
Moderators: customer orientation.
Control variables: two: supplier reputation; and supplier product experience (see page 5, col 1, para 6).
1 and 2. In particular, how well do the conceptual and operational definitions match?
Inspection of the items suggests a reasonable match. The last item of adhocracy culture does seem to be very similar to product capability provision: “My business unit emphasizes developing new products, features, and services. Trying new things and prospecting for opportunities are valued.” (Underlining added).
3. In particular, how did the adaptation or development occur?
Little information provided about this. Mention is made of using steps outlined by Churchill (1979) in terms of ensuring items match the construct. Some of the wording of the items is different from the original Cameron and Quinn (2006) measure.
5 and 6. In particular, how well does the evidence provided support the reliability and construct validity of the measures?
Evidence of adequate reliability of the items is provided. Evidence of construct validity is not provided, but the culture framework is well established and the original measure widely used. Cameron and Quinn (2006) provide reliability and validity data in Appendix A of their book. Other studies have been conducted on the validity of the scale. To make a full assessment, these would need to be examined. Strictly speaking, scale adaptations should be re-validated. Construct validation of the revised scales would be advisable. The form of the original scales was significantly different from the rating scales used in the current study. (Seeing this requires looking at the original format as presented by Cameron and Quinn (2006).)
Knowledge of the statistical procedures used to evaluate the reliability of the scales is required to be able to critique that aspect. It is NOT expected that such critiques will be provided in BSn502. However, in short, the use of what is called ‘item parcelling’ within the confirmatory factor analysis (CFA) may obscure issues with individual items.
PART D. SAMPLING STRATEGY
1. Was a target population specified? Can a target population be inferred?
2. Describe the overall strategy. How was the sample chosen? If possible, label the strategy (e.g., simple random sampling, cluster sampling, convenience sampling etc)
For probability-based methods:
a. What sampling frame was used or developed? How good was this? How well does this frame match the population?
b. How was a probability sample drawn from the frame? How good was this process? How well does the sample drawn match the sampling frame?
3. What was the final sample size? Was a response rate provided? How can it be calculated from the information provided? Show how, or what information is lacking. Is this size sufficient for the research?
4. To be placed in the critique. How representative of target population/sampling frame/drawn sample was the final sample? How statistically generalizable are the findings? To whom are the findings statistically generalizable?
1. No target population as such was specified. One can infer from the introduction and discussion that the target population appears to be business-to-business firms that supply products (as opposed to services). For the distinction from services, see p. 9, col 2, para 3, and p. 5, col 2.
2. A bought list of IT firms was used as a sampling frame, from which a random sample of 1024 firms was drawn. These were then screened for eligibility (that is, they supplied products), leaving 317. One might make an argument for this being a convenience sample, or a random sample. See the critique.
2a. A sampling frame was used. This was a bought list of IT companies. See Critique for 2.
2b. It is not stated how the sample was drawn: we can assume it was random as stated; that is, every element in the frame had an equal probability of being included.
3. Final sample size was 100 matched supplier-customer dyads. From the sample of 317, only 105 firms were prepared to nominate a customer (response rate = 105/317 = 33%). From these 105, only 100 customers agreed to participate. The remaining 5 firms were dropped. Final sample size = 100; final response rate = 100/317 = 32%.
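The response-rate arithmetic can be checked stage by stage. The short sketch below uses only the counts given in the sampling description (1024 drawn, 317 eligible, 105 nominating suppliers, 100 matched dyads); the variable and function names are our own labels for those stages.

```python
# Counts reported in the sampling description.
drawn = 1024      # firms randomly drawn from the bought list
eligible = 317    # firms left after screening for product suppliers
nominated = 105   # suppliers prepared to nominate a customer
matched = 100     # dyads in which the customer also participated


def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a proportion, rounded to 3 places."""
    return round(numerator / denominator, 3)


eligibility_rate = rate(eligible, drawn)   # share of drawn firms that were eligible
supplier_rate = rate(nominated, eligible)  # supplier-stage response rate
customer_rate = rate(matched, nominated)   # customer-stage response rate
final_rate = rate(matched, eligible)       # overall dyad response rate

for label, value in [
    ("eligible / drawn", eligibility_rate),
    ("nominated / eligible", supplier_rate),
    ("matched / nominated", customer_rate),
    ("matched / eligible", final_rate),
]:
    print(f"{label}: {value:.1%}")
```

Separating the stages shows that most of the loss occurred when suppliers were asked to nominate a customer, not when customers were asked to participate.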
1. This is not unusual. Theoretically informed work often has ill-defined populations. The ‘operational population’ became IT firms, then IT firms on the database, then those sampled that supplied products.
2 and 2a. It depends on how one defines the population. If it is B2B firms in the IT industry, then the sampling strategy can only truly be a probability based sample if ALL IT firms have a known and equal chance of being sampled. All that the authors had access to was an existing database. It is not known if the database was complete, or if all IT firms had a known and equal probability of being included in that database. It is most plausible that neither was the case; although without further information one cannot be certain. Under these circumstances, one might claim a convenience sample was used.
If we assume the sampling frame is high quality (that is, has high coverage of the population with no biasing in its make up), then a random sample was drawn.
Nevertheless, the list may be better than anything the researchers could have built themselves. If this is the case, then sampling from this list is a reasonable process.
2b. The screening of eligible firms AFTER sampling could be a concern. If selection was completely random, then the 317 should still be a random sample of eligible firms from the database. I am prepared to argue that the sample of 317 is a random sample of the database.
3. Many studies have a smaller response rate. The final sample is small. It was sufficient for the research, but the study would have benefited from a larger sample.
Q: How representative of target population/sampling frame/drawn sample was the final sample? How statistically generalizable are the findings? To whom are the findings statistically generalizable?
The large number of firms that did not provide customer details (non-response) undermines any claim as to the statistical representativeness of the final sample to the sampling frame or the population. Therefore, strictly speaking, findings cannot be claimed to be statistically generalizable. Given sound theorising, a (tentative) claim might be made of analytic generalizability.
1. I think this is reasonable. It is not possible to have well defined populations of this nature. All that can be defined is a relevant sub-population, such as they have done.
2. More information regarding the sampling frame is required. To assess this, we need to know its coverage of the population. This is a large limitation to evaluation. Theoretically it is possible to develop a sampling frame of IT firms. The issue will be whether or not this will be better (more complete) than the list used. Developing a sampling frame is expensive. It would have been better if the sampling frame only contained eligible members of the population. Given that frame, random sampling was the best process to undertake.
Overcoming non-response in this study would have been difficult. It is not known whether the non-response was ONLY because firms would not provide customer details: it may be that other firms did not respond at all. Reducing non-response would be useful.
Overall, this is a common sampling strategy, but flawed as outlined. Ideally a higher response rate would have been achieved.
1. What do the authors claim to be true as a result of the application of their method?
2. In what ways does the author generalise the conclusions? On what basis are these generalisations justified? (Think about analytic and statistical generalisability.)
Arguably, there are several knowledge claims (abbreviated here as KC).
1. Page 7 has the main finding.
“…both adhocracy culture … and market culture … in themselves promote overprovision, consistent with H1 and H2. However, a customer orientation only attenuates the overprovision tendency of an adhocracy culture…. Thus, H3 is supported, and H4 is not supported.”
This is a statement of the statistical findings. They imply that culture affects the DV.
2. Generalization is attempted through both statistical and analytic generalizability.
a. Statistical generalizability. The use of statistical tests BY DEFINITION is based on claims of statistical generalizability.
b. Analytic generalization. There are several claims of generalizability in the Discussion, all based on analytic generalization.
(i) “One contribution is to show that certain CVF cultures are associated with dark sides, in that their values perpetuate product-management practices at the expense of the customer. Specifically, our results point to the dark sides of adhocracy and market cultures in that both CVF cultures have the potential to engender systematic mismatches between a firm’s decisions on product capability and customer …”
(ii) “We show that a customer orientation is a distinct form of culture that coexists with the four CVF cultures and contributes toward a firm’s overall culture by adding values that relate specifically to customers.”
(iii) “The finding that a customer orientation attenuates only an adhocracy’s overprovision tendency highlights an important distinction between adhocracy and market cultures. Specifically, Cameron and Quinn (2006) note that although both adhocracy and market cultures share an external focus on differentiation, they differ to the extent that an adhocracy culture accepts flexibility and discretion, while a market culture is more stringent and controlling. This particular difference in cultural characteristics raises the possibility that the provision tendencies of more stringent and controlling cultures might be inherently more difficult to attenuate through any means, not just difficult to attenuate with a customer orientation.”
Evaluation of use of Statistical generalization: Whether or not this is accepted depends on whether one views the sample as a random sample. The large non-response undermines this claim; and it may be that the sampling frame (the list from which they sampled) has biased coverage of the population: but we do not know.
Evaluation of use of analytic generalization.
(i). Most closely based upon the results of the statistical testing, this claim does seem to go beyond the sample and the population.
(ii) is based upon the confirmatory factor analysis and the correlations. The claim does seem to go beyond the sample and the population. Despite being based on the statistical analyses, it is not a statistically generalizable claim. This is a complex issue: the issue is ‘what is culture’. The author claims customer orientation is a distinct dimension of culture.
(iii) This goes the furthest beyond the study. It makes claims about “stringent and controlling cultures”, which are different from those measured. The claim is that market cultures are more difficult to manipulate/influence. This starts to present an intervening explanatory variable beyond that of culture specifically.
This is fine as far as it goes; a more extensive set of analytic generalisations would invoke a theory of culture and capability provisioning that goes beyond just the variables in their operational form. The reasoning in the introduction seems logical, and if those mechanisms do operate, and this study found evidence they do, then a claim of a generalizable theory (or at least, hypotheses) might be defensible. One needs to be tentative however until further studies replicate the general findings.
There is little to support the implicit claims of the findings applying to other firms not in the population.
Conditions for causal claims.
Were causal claims made or inferred? How and how well met are the conditions required for making causal claims (that is, internal validity)?
Author makes no explicit claims of causality. But one could infer that causal relationships are being investigated. Conditions for causality not met:
1. Covariation between IVs and DV:
Yes – seen in correlations and regression coefficients. No change per se reported.
2. Temporal precedence of cause:
No – DV measured first, in fact. “We administered the supplier and customer questionnaires in two steps. The customer questionnaire was administered first.” (p. 5). The two questionnaires were presumably administered at a time very close together.
3. Control all other plausible explanations: Because all variables were measured, and there was no manipulation by the researcher, then rival explanations were not fully controlled. There was an attempt to control other explanations by the use of control variables. This helps to some extent, but not all influences ruled out. See threats below.
Alternative explanations for the results (internal validity and threats to internal validity)
What procedures were undertaken to ensure/increase the internal validity of the study? Are other plausible explanations possible?
These internal validity threats all relate to other plausible explanations for the results.
1. Selection. Clearly a potential threat. Selection in this design (cross-sectional) relates to the membership of firms in the levels of the IVs, that is, the culture types (or levels): their degree of market and adhocracy culture, and customer orientation. We cannot rule out that inclusion (selection) into particular levels of those variables has been influenced by other factors. These other factors would be selection effects. Cultures probably do not emerge from nothing; rather, systematic differences in other variables, such as the historical environment of the firm, its origins, its founder, etc., may all lead systematically to different cultures. Thus culture is not THE ONLY difference between the firms. These other influences might not only affect culture, but also systematically affect the DV. Despite the use of control variables, we cannot be sure that all other causes were controlled.
Self-selection bias is also a main threat. Non-response was high, thus firms self-selected to participate.
2. History (changes in firms during study due to external influences). Given the (presumably) short time frame of study, there are unlikely to be history effects. Given all firms were in the one industry, industry level changes in the environment more likely to have influenced all firms. Given all firms in one geographical region, history effects more likely to influence all firms equally. Little reason to expect firms with particular culture types to be systematically affected differently. At a reach, those with a more external focus might be more open to history effects as they are more likely to see external influences early. Short time frame makes this unlikely. In any case, the three main culture types are all ‘externally focused’.
3. Maturation (changes in firms during study due to internal influences). Given the (presumably) short time frame of the study, maturation effects are unlikely to operate systematically differently for different types of cultures (e.g., more rapid growth in one culture type).
4. Regression. No repeated testing, so regression effects cannot occur.
5. Attrition. Non-response is an issue, but this becomes a selection bias. No ongoing testing, so attrition per se not possible.
6. Testing. No repeated testing, so testing effects cannot occur.
7. Instrumentation. No repeated testing, so instrumentation effects cannot occur.