Stems and Scales

A familiar method for assessing attitudes is the Likert item. A Likert item consists of two parts: a stem, which is simply a statement of an attitude, and a scale on which people express their agreement with that statement. For example:

 Stem:  I believe that capital punishment is cruel.
 Scale: Disagree strongly
        Disagree somewhat
        Can't say
        Agree somewhat
        Agree strongly

Typically you would be instructed to check the point on the scale which best expresses your degree of agreement (you might also be given the option of reporting that the statement is not applicable). A five-point scale of agreement, like the one above, is probably the most common, but longer or shorter scales can be used. It is harder to get useful information from shorter scales, though.

Likert items are easy to use and they can give you useful information. A single item, however, will rarely give you any useful information, so you are better off using sets of them. Ideally you would use 40 items or more, but useful information can be obtained with fewer.

A single item rarely provides useful information simply because responses to it are affected by many factors in addition to the one you're interested in. When several items are used, the consistency of responding produced by an attitude can be detected.

To do that, Likert items are scaled. First, a value is assigned to each point on the scale. In the example above, Disagree strongly could be assigned the value 1 (one), Disagree somewhat could be assigned 2, and so on. A reliability analysis would then be performed. For the reliability analysis you would calculate a split-half reliability coefficient (a measure of how consistently the items are answered, obtained by correlating scores on two halves of the item set) and the correlation of each item with the total score. Items which did not correlate with the total would be omitted from the scale and the reliability analysis performed again. The split-half coefficient must be at least .71 if the scale is to be useful with samples of reasonable size. Further refinements can increase the scale's discriminatory power.
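The scaling and reliability analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed procedure: the odd/even split of the items, the Spearman-Brown correction, the function names and the ratings are all choices made for the example, and the item-total correlations here leave each item in its own total.

```python
from statistics import mean

# Assumed coding of the five-point scale from the example above.
SCALE_VALUES = {
    "Disagree strongly": 1,
    "Disagree somewhat": 2,
    "Can't say": 3,
    "Agree somewhat": 4,
    "Agree strongly": 5,
}

def code(responses):
    """Turn one respondent's checked labels into numeric ratings."""
    return [SCALE_VALUES[label] for label in responses]

def pearson(x, y):
    """Pearson correlation of two equal-length lists.
    Raises ZeroDivisionError if either list has no variation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half(ratings):
    """Correlate totals on the odd-numbered items with totals on the
    even-numbered items, then apply the Spearman-Brown correction
    (an assumption of this sketch) to estimate full-length reliability."""
    odd = [sum(row[0::2]) for row in ratings]
    even = [sum(row[1::2]) for row in ratings]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

def item_total_correlations(ratings):
    """Correlation of each item with the total score (item included)."""
    totals = [sum(row) for row in ratings]
    n_items = len(ratings[0])
    return [pearson([row[i] for row in ratings], totals)
            for i in range(n_items)]

# Hypothetical ratings: one row per respondent, one column per item.
ratings = [
    [5, 4, 5, 2],
    [4, 4, 5, 4],
    [2, 1, 2, 3],
    [1, 2, 1, 5],
    [3, 3, 4, 1],
    [4, 5, 4, 3],
]
print(code(["Agree strongly", "Can't say"]))
print(split_half(ratings))
print(item_total_correlations(ratings))
```

An item whose correlation with the total is near zero or negative would be a candidate for omission, after which you would run the two functions again on the remaining items.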

If you manage to put together a reliable scale, you can get a score for the attitude it measures simply by adding the ratings on individual items. If the responses to the items are heavily skewed, you might consider combining the less frequent points on the scale into a single value, but you should only do that if it improves the reliability of the scale.
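As a rough sketch of the scoring just described, the following adds up each respondent's ratings and then shows one way of combining the less frequent scale points. The ratings and the recoding table are invented for illustration; which points to combine is a judgement you would make from your own data.

```python
from collections import Counter

def total_scores(ratings):
    """Score each respondent by adding their ratings on the items."""
    return [sum(row) for row in ratings]

def collapse(ratings, recode):
    """Replace each rating with its recoded value."""
    return [[recode[r] for r in row] for row in ratings]

# Hypothetical, heavily skewed ratings: most responses are 4 or 5.
ratings = [
    [5, 4, 5],
    [4, 5, 5],
    [5, 5, 4],
    [1, 2, 4],
]
print(total_scores(ratings))

# How often each scale point was used, to see which are rare.
print(Counter(r for row in ratings for r in row))

# Assumed recoding: the rarely used points 1, 2 and 3 become one value.
recode = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3}
collapsed = collapse(ratings, recode)
print(total_scores(collapsed))
```

You would then repeat the reliability analysis on the collapsed ratings and keep the recoding only if the reliability improved.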

If you have a lot of missing data (items which respondents did not rate), you can use the mean rating rather than the total. People who rated only a few items can be omitted. The mean also has the advantage of making the average position on the rating scale obvious. There are other useful ways of calculating a score, so you can usually find one appropriate for your audience. If you don't like assuming the mean represents the missing data, there are other methods for estimating the missing data, but usually they will not appreciably alter the mean score.
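A minimal sketch of mean scoring with missing data follows. Unrated items are represented by None, and the cut-off of three rated items is an arbitrary assumption for the example; you would pick a minimum that suits the length of your scale.

```python
def mean_score(row, min_rated=3):
    """Mean of the items a respondent actually rated.
    Returns None when fewer than min_rated items were rated,
    so that the respondent can be omitted."""
    rated = [r for r in row if r is not None]
    if len(rated) < min_rated:
        return None
    return sum(rated) / len(rated)

# Hypothetical respondents; None marks an item that was not rated.
respondents = [
    [4, None, 5, 3],
    [1, 2, 3, 4],
    [None, None, 2, None],
]
scores = [mean_score(row) for row in respondents]
print(scores)
```

Dividing by the number of items rated gives the same result as filling in each unrated item with that respondent's own mean rating, which is the assumption mentioned above.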

You will have the most success with Likert items if you keep the stems short and simple. Avoid the dreaded double-barrelled stem, which asks for two or more judgements. For example, the statement "I believe capital and corporal punishment are cruel" asks for opinions about two forms of punishment, and is therefore ambiguous. Do not be surprised if an item like that does not survive the reliability analysis.

Stems and Scales © 2001, John FitzGerald