The value of Likert scales in measuring attitudes of online learners

Hilary Page-Bucci - February 2003


Attitude is an important concept that is often used to understand and predict people's reaction to an object or change and how behaviour can be influenced
(Fishbein and Ajzen, 1975)

Introduction

Online learning has grown alongside the progress of digital technology over the last 15 years. Even so, the reasons why students become absorbed in, practise and achieve a variety of tasks and exercises, or why they avoid others, are always of interest to the effectors and evaluators of the learning process.

Establishing the characteristics of distance and online learners, how they become motivated and how they feel about learning online, will yield useful information that could strengthen teaching practice and thus ultimately enhance student retention and achievement.

A review of some of the available literature has revealed research already undertaken in various areas of learning online, such as 'training effectiveness and user attitudes' (Torkzadeh et al, 1999). Torkzadeh et al suggest, 'to achieve successful training we need to be cognizant of the user's attitudes towards computers'. Further investigation revealed other factors that should be taken into consideration; Miltiadou (1999) suggests that 'it is important to identify motivational characteristics of online students'. Investigating and defining their motivation would lead to an understanding of 'self-efficacy beliefs about their own abilities to engage, persist and accomplish specific tasks' (Bandura, 1986; Stipek, 1988 cited by Miltiadou).

The concept of measuring attitude is found in many areas, including social psychology and the social sciences. Attitudes can be complex and difficult to measure, and a number of different measuring instruments have been developed to assess them.

'Scaling is the science of determining measuring instruments for human judgment' (McIver 1981). One needs to make use of appropriate scaling methods to aid in improving the accuracy of subjective estimation and voting procedures (Turoff & Hiltz 1997). Torgerson (1958) pointed out that scaling, as a science of measuring human judgment, is as fundamental as collecting data on well-developed natural sciences. Nobody would refute the fact that all science advances by the development of its measurement instruments. Researchers are constantly attempting to obtain more effective scaling methods that could be applied to the less well developed yet more complicated social sciences. Scaling models can be distinguished according to whether they are intended to scale persons, stimuli, or both (McIver 1981). For example, Likert scale is a subject-centered approach since only subjects receive scale scores. Thurstone scaling is considered a method to evaluate the stimuli with respect to some designated attributes. It is the stimuli rather than the persons that are scaled (Torgerson 1958). Guttman scaling is an approach in which both subjects and stimuli can be assigned scale values (McIver 1981). (Li et al, 2001)


The purpose of this study is to explore the particular method of measuring attitude known as Likert Scales (Likert, 1932), and to determine their effectiveness and value in researching the attitudes, views and experiences of online learners. These scales, according to Taylor and Heath (1996), have become one of the dominant methods of measuring social and political attitudes.

Methodology and Measurement

The methodology used for this research is a critique of previous research methodologies. To establish the methodology of this research, it is first necessary to clarify the term 'attitude'.

Attitude is an important concept that is often used to understand and predict people's reaction to an object or change and how behaviour can be influenced (Fishbein and Ajzen, 1975)

An attitude is a mental and neural state of readiness, organised through experience, exerting a directive or dynamic influence upon the individual's response to all objects and situations to which it is related (Allport, 1935 cited by Gross)

A learned orientation, or disposition, toward an object or situation, which provides a tendency to respond favourably or unfavourably to the object or situation (Rokeach, 1968 cited by Gross)

Three of the generally accepted components of the term 'attitude' (Triandis, 1971) appear in some of the above definitions. These are the affective (feelings), cognitive (beliefs) and behavioural (tendency to act) components.

Given these components, and given that attitude is, as Gross (1968) suggests, a 'hypothetical construct', it becomes apparent that it cannot be measured directly and that the use of only a single statement or question to assess it will not be effective in gaining reliable responses.

Attitude scales attempt to determine what an individual believes, perceives or feels. Attitudes can be measured toward self, others, and a variety of other activities, institutions, and situations (Gay, 1996)

There are several types of scales that have been developed to measure attitude:

Thurstone Scales

This is described by Thurstone & Chave (1929) as a method of equal-appearing intervals. Thurstone scaling is 'based on the law of comparative judgment' (Neuman, 2000). It requires the individual to either agree or disagree with a large number of statements about an issue or object. Thurstone scales typically present the reader with a number of statements to which they have to respond, usually by ticking a true/false or agree/disagree box, i.e. a choice of two possible responses. Although this was one of the first scaling methods to be developed, the questionnaires are mostly generated by face-to-face interviews and are rarely used in attitude measurement today; thus the example below (figure 1) is irrelevant to online learners.

An example of a Thurstone Scale (figure 1)

ATTITUDE TOWARD WAR

An individual is asked to check those items which represent his views.

1. A country cannot amount to much without a national honor, and war is the only means of preserving it.
2. When war is declared, we must enlist.
3. Wars are justifiable only when waged in defense of weaker nations.
4. Peace and war are both essential to progress.
5. The most that we can hope to accomplish is the partial elimination of war.
6. The disrespect for human life and rights involved in a war is a cause of crime waves.
7. All nations should disarm immediately.

(Droba, 1930)

figure 1

Source: http://online.sfsu.edu/~psych200/unit8/84.htm

 

Advantages

Items are weighted or valued rather than subjects
Easier to construct than a Guttman scale

Disadvantages

More difficult to construct than a Likert scale
No more reliable than a Likert scale
Measures only agreement or disagreement


Guttman Scales (Cumulative scales)

Guttman developed this scale in the 1940s in order to determine whether a relationship existed within a group of items. The items are ordered from low to high according to difficulty, so that to approve or correctly answer the last item implies approval or success of all prior ones (e.g. a self-efficacy scale). The respondent selects the item that best applies. Because the items are cumulative, a respondent who agrees with one probably also agrees with the previous statements. Arguably this scale does not give enough variation of feelings and perceptions; the author therefore suggests that it would not be appropriate for measuring the attitude of online learners.
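
The cumulative property described above can be checked mechanically. The following is a minimal Python sketch under stated assumptions, not Guttman's scalogram procedure itself: the function names and the sample response patterns are hypothetical, items are assumed to be ordered from easiest to hardest to endorse, and answers are coded 1 for agree and 0 for disagree.

```python
def is_cumulative(responses):
    """Return True if no agreement (1) follows a disagreement (0).

    Assumes items are ordered from easiest to hardest to endorse,
    so a consistent respondent agrees up to some point and then stops.
    """
    seen_disagree = False
    for answer in responses:
        if answer == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreed with a harder item after rejecting an easier one.
            return False
    return True


def scale_score(responses):
    """Number of items endorsed; meaningful when the pattern is cumulative."""
    return sum(responses)


# Hypothetical response patterns for a six-item scale.
consistent = [1, 1, 1, 0, 0, 0]
inconsistent = [1, 0, 1, 0, 0, 0]

print(is_cumulative(consistent), scale_score(consistent))
print(is_cumulative(inconsistent))
```

A pattern that fails the check does not fit the cumulative model, which illustrates why such scales are described as difficult to construct.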

An example of a Guttman Scale (figure 2):

 

Please indicate what you think about new information technology (IT) by ticking ONE box to identify the statement that most closely matches your opinion (Wilson, 1997)

Agree   Statement
[ ]     IT has no place in the office.
[ ]     IT needs experts to use it in the office.
[ ]     IT can be used in the office by those with training.
[ ]     I'd be happy to have someone use IT to do things for me in the office.
[ ]     I'd be happy to use IT if I was trained.
[ ]     I'd be happy to teach myself to use IT.

figure 2

Source: http://www.hb.se/bhs/nyutb/kurswebb/c-kurser/applirm/qdes4.htm

 

Advantages

Reproducibility
More one-dimensional than Likert scaling

Disadvantages

Difficult to construct
Scalogram analysis may be too restrictive, only a narrow universe of content can be used
Cornell technique questionable
Results no better than summated Likert scales

Semantic Differential Scaling

This is concerned with the 'measurement of meaning', the idea or association that individuals attach to words or objects. The respondent is required to mark on a scale between two opposing opinions (bipolar adjectives) the position they feel the object holds on that scale for them. It is often used in market research to determine how consumers feel about certain products.

Three main factors emerge from the ratings:

The evaluative factor (good-bad, pleasant-unpleasant, kind-cruel); the potency factor (strong-weak, thick-thin, hard-soft); the activity factor (active-passive, slow-fast, hot-cold) (Osgood et al, 1957).

Although this scale is comparatively easy for the respondent to complete, the author argues that this would not be suitable for measuring attitude of online learners as it tends to relate more to material associations than cognizance of feelings.

An example of a Semantic Differential Scale (figure 3):

figure 3

 

Advantages

Simple to construct
Easy for subjects to answer
Allows for several types of analyses to take place

Disadvantages

Analyses can be complex

 

Likert Scale (Summated scale)

This was developed by Rensis Likert in 1932. It requires individuals to indicate their level of agreement with a statement, generally on a five-point scale (i.e. Strongly Agree, Agree, Neither, Disagree, Strongly Disagree). The number beside each response becomes the value for that response and the total score is obtained by adding the values for each response, hence the reason why they are also called 'summated scales' (the respondent's score is found by summing the values of the responses). Dumas (1999) suggests, 'this is the most commonly used question format for assessing participants' opinions of usability'.
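
The summation described above can be illustrated with a short Python sketch. The numeric coding (1 to 5), the response labels and the sample answers are assumptions made for illustration only; a real questionnaire may code its responses differently.

```python
# Assumed coding for a five-point scale; labels and values are illustrative.
RESPONSE_VALUES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def summated_score(answers):
    """Add up the numeric value of each response to get the total score."""
    return sum(RESPONSE_VALUES[answer] for answer in answers)


# Hypothetical answers from one respondent to a four-item scale.
answers = ["Agree", "Strongly Agree", "Neither", "Agree"]
print(summated_score(answers))  # 4 + 5 + 3 + 4 = 16
```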

Two examples of Likert Scales (figures 4 & 5):

figure 4

figure 5

Advantages

Simple to construct
Each item of equal value so that respondents are scored rather than items
Likely to produce a highly reliable scale
Easy to read and complete

Disadvantages

Lack of reproducibility
Absence of one-dimensionality or homogeneity
Validity may be difficult to demonstrate

Reliability and Validity

Likert scale measures are fundamentally at the ordinal level of measurement because responses indicate a ranking only.

As the number of scale steps is increased from 2 up through 20, the increase in reliability is very rapid at first. It tends to level off at about 7, and after 11 steps, there is little gain in reliability from increasing the number of steps (Nunnally, 1978, cited by Neuman)

Interestingly, Dyer (1995) states,

'attitude scales do not need to be factually accurate - they simply need to reflect one possible perception of the truth ... [respondents] will not be assessing the factual accuracy of each item, but will be responding to the feelings which the statement triggers in them'

In line with the above statement, when constructing a Likert scale a pool of statements needs to be generated that are relevant to the attitude (not necessarily fact) (figure 6). The number of choices on the scale should be evenly balanced to retain a continuum of positive and negative statements with which the respondent is likely to agree or disagree, although the actual number of choices can be increased. This will help avoid the problem of bias (figure 7) and improve reliability, as anyone who answers 'agree' all the time will appear to answer inconsistently.

 

figure 6

 

figure 7
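
One common way of obtaining the effect described above, where a respondent who always answers 'agree' appears inconsistent, is to mix positively and negatively worded statements and reverse-score the negative ones. The Python sketch below illustrates this under stated assumptions: a five-point scale coded 1 to 5, hypothetical function names, and invented responses. It is not taken from any of the studies reviewed here.

```python
SCALE_MAX = 5  # assumed five-point scale coded 1..5


def reverse(value, scale_max=SCALE_MAX):
    """Flip a response so that 1 becomes scale_max, 2 becomes scale_max - 1, etc."""
    return scale_max + 1 - value


def score_balanced(responses, negative_items):
    """Sum responses after reverse-scoring the negatively worded items.

    responses: list of ints, one per statement.
    negative_items: set of zero-based indexes of negatively worded statements.
    """
    return sum(
        reverse(value) if index in negative_items else value
        for index, value in enumerate(responses)
    )


# Hypothetical four-item scale; statements 1 and 3 are negatively worded.
negative_items = {1, 3}

yea_sayer = [4, 4, 4, 4]    # answers 'agree' to everything
favourable = [4, 2, 4, 2]   # agrees with positive, disagrees with negative

print(score_balanced(yea_sayer, negative_items))   # 4 + 2 + 4 + 2 = 12
print(score_balanced(favourable, negative_items))  # 4 + 4 + 4 + 4 = 16
```

In this example the respondent who agrees with everything ends at 12, the neutral midpoint of a four-item scale, rather than at a falsely favourable total.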

As early as 1967, Tittle et al suggested,

The Likert Scale is the most widely used method of scaling in the social sciences today. Perhaps this is because they are much easier to construct and because they tend to be more reliable than other scales with the same number of items (Tittle et al, 1967)

But there still seems to be some contention within research as to whether Likert Scales are a good instrument for measuring attitude; Gal et al (1994) suggest 'Likert-type scales reveal little about the causes for answers ... it appears they have limited usefulness'. Helgeson (1993) states that major reviews 'repeatedly point to two problems: lack of conceptual clarity in defining attitudes ... technical limitations of the instrument used to assess attitude' (Helgeson, 1993 cited by Gal et al 1994). The author suggests that some of these 'major' reviews took place prior to 1993 and that, along with the progress in technology, the reasons for measuring attitude may also have changed. It should also be taken into account that this type of scale was not developed to provide any kind of diagnostic information that shows underlying issues of concern to the individual respondents. As students are asked to complete so many questionnaires in the course of their studies, the interface and usability should also be taken into consideration. There are now also researchers who are in favour of using Likert Scales: Robson (1993) suggests that Likert Scales 'can look interesting to respondents and people often enjoy completing a scale of this kind. This means that answers are more likely to be considered rather than perfunctory', and Neuman (2000) states, 'the simplicity and ease of use of the Likert scale is its real strength'.

Reservations on the use of a central Neutral Point


Arguments exist both for and against including a neutral point, and it is reasonable to ask what effect adding a neutral point has on the responses received. It is possible that some respondents are genuinely neutral, in which case it could be argued that omitting a neutral point from the scale compels them to make a decision. Kline (cited by Eysenck, 1998) argues for a middle point, 'even though some participants will very often opt out by remaining indecisive'.

In contrast to this opinion, it has been suggested,

the traditional idea suggests that the qualitative results between the two scales are unaffected since if the respondents are truly neutral, then they will randomly choose one or the other, so forcing them to choose should not bias the overall results (Kahn et al, 2000)

It is also suggested that the exclusion of a neutral point will draw the respondent to make a decision one way or the other. This, states Dumas (1999), 'means that by eliminating a neutral level it is providing a better measure of the intensity of participants' attitudes or opinions'. The author suggests that preventing respondents from remaining neutral, and so causing them to either 'agree' or 'disagree', could reduce the reliability of the scale, as the results will not necessarily be true.

 

Review of Literature

 

Torkzadeh et al (2001) describe the construction of a scale to measure an individual's self-perception and self-competency in interacting with the Internet. They consulted five practitioners and four academics and developed a five-point Likert-type scale (from 1, strongly disagree, to 5, strongly agree) using a list of 24 items, with the objective of exploring responses relating to 'unidimensionality, reliability, brevity and simplicity of the factor structure'. The survey was administered at a university in the Southwest region of the United States to a total of 227 students, with an age range from 17 to 57 years.

They used two main criteria for eliminating items that were not considered valid and reliable. Firstly, an item was eliminated if its correlation with the sum of the other items in its category was less than 0.50. This rested on the assumption that 'if all items in a measure are drawn from the domain of a single construct, responses to those items should be highly intercorrelated'. The second criterion concerned reliability; Cronbach's alpha was used to examine each dimension to see if 'additional items could be eliminated without substantially lowering the reliability'. 'Items were eliminated if the reliability of the remaining items would be at least 0.90.'
The resulting figures showed evidence of reliability and construct validity; the scale overall had a coefficient alpha reliability score of 0.96.
The final recommendation, taking into consideration that this was their first exploratory model, stated that 'the instrument should also be validated across other variables such as age, education level and profession in order to assess the generalisability of the scale to a more heterogeneous population', but this was not a reflection on the instrument itself. In conclusion, they stated the 'instrument is useful in its present form', although one must always be aware of the ever changing technologies on the World Wide Web and the need to keep up to date with progress.
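
The two statistics used in these criteria, the correlation of each item with the sum of the other items and Cronbach's alpha, can be sketched in Python as follows. This is an illustration under stated assumptions and not the analysis performed in the study: the function names are hypothetical, the data are invented, and only the 0.50 threshold is taken from the description above.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def corrected_item_total(data, item):
    """Correlate one item with the sum of the *other* items.

    data: list of rows, one row of item scores per respondent.
    """
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]
    return pearson(item_scores, rest_scores)


def cronbach_alpha(data):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(data[0])
    item_variances = [variance([row[i] for row in data]) for i in range(k)]
    total_variance = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses (rows = respondents, columns = items), for illustration only.
data = [
    [5, 4, 5, 2],
    [4, 4, 4, 3],
    [2, 1, 2, 4],
    [3, 3, 3, 1],
    [1, 2, 1, 5],
]

for item in range(len(data[0])):
    r = corrected_item_total(data, item)
    flag = "candidate for removal" if r < 0.50 else "keep"
    print(f"item {item}: r = {r:.2f} ({flag})")

print(f"alpha = {cronbach_alpha(data):.2f}")
```

In the invented data the last item runs against the others, so its item-total correlation falls below 0.50; recomputing alpha without such an item is how its effect on reliability would be examined.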

"this instrument is short, easy to use, reliable and appropriate for use by academics and practitioners to measure Internet-related self-efficacy." (Torkzadeh et al, 2001)

Shaw et al (2000) used a questionnaire arranged in a Likert format to determine attitudes, views and experiences of a group of nutrition students using an asynchronous learning network. The data was obtained through an online 'IT Appreciation' questionnaire completed in class during week 12 of the course. 'The text match questions allowed students to express opinions in their own words and the multiple choice format consisted of 5 possible responses (some reversed to counteract response sets) to the given statement arranged in a Likert format' (Shaw et al, 2000).

It was concluded that the ALN paradigm could be considered a success as the majority of the respondents agreed with the statements that they had become more independent learners. But it was also noted that the largely positive responses to the Likert questions were contradicted by the student responses to the open ended questions.

From this it was decided that further study should investigate the discrepancy between the responses to the Likert questions and the open-ended questions; it was also considered a possibility that this could be due to the Hawthorne effect (behaviour may be altered because the respondents know they are being studied). The author suggests, therefore, that although the questionnaire was considered a success, its initial construction, along with how it is presented (i.e. online in the classroom with other students or away from the class situation), needs to be considered carefully. The apparent acquiescence could be because some of the questions were single-sided (although it was stated otherwise), or perhaps there was a large number of 'don't knows' or 'non-responses'; the results do not include any information on this.

Rovai (2002) used a Likert-type scale, referred to as the 'Classroom Community Scale', in his study of 314 distance learners using Blackboard as the mode of delivery. The research was 'to determine if a significant relationship exists between sense of community and cognitive learning in an online educational environment', with the premise that if online learners feel an 'emotional connectedness' to a community, their learning and motivation will be increased. Twenty statements were used (some reverse scored), with a five-point scale of responses: strongly agree, agree, neutral, disagree and strongly disagree. Cronbach's coefficient alpha was used to calculate the reliability, which was .93. Content validity was examined by a panel of experts comprising three university professors of educational psychology. Although there is an in-depth discussion with regard to further research, and assumptions that the respondents were typical of students who participate in online distance education, the overall conclusion showed that the Classroom Community Scale 'allowed for the hypothesised relationships between the sense of community and cognitive learning'. The author suggests that this Likert-type scale, which has been adapted and renamed, shows there is considerable scope for the use of Likert scales in an e-learning environment.

 

Conclusions

Moving questionnaires with Likert scales onto the World Wide Web brings a whole new meaning to questionnaires. They could almost be another source of activity for the online learner. A form of scale that is frequently used is the 'graphic scale', in which the respondent indicates his/her rating by placing a mark at the appropriate point on a line that runs from one extreme of the attribute to the other. To be a true Likert scale, after the series of items has been developed using a graphic rating scale, it is then necessary to determine which items have the highest correlation with a specific criterion measure; only these will be included in the scale.

Although not a graphic scale, figure 8 shows an example of how a Likert scale could be presented in a web page. The use of radio buttons makes it easy to complete, and as there is only one choice, difficult to invalidate by ticking two boxes.

 

It was easy for me to remember how to perform tasks using spreadsheets

( ) Strongly Disagree   ( ) Disagree   ( ) Neither   ( ) Agree   ( ) Strongly Agree

figure 8
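
As an illustration of how such a radio-button item might be produced for a web page, the Python sketch below generates the HTML for one statement. It is a minimal example under stated assumptions: the function name, the field name `q1` and the numeric values 1 to 5 are hypothetical, and the markup is one plausible layout rather than the one used in figure 8.

```python
from html import escape


def likert_radio_group(name, statement, labels):
    """Build HTML for one statement with a radio button per response option.

    Radio inputs that share a `name` are mutually exclusive in a browser,
    so only one option can be selected for the statement.
    """
    lines = [f"<fieldset><legend>{escape(statement)}</legend>"]
    for value, label in enumerate(labels, start=1):
        lines.append(
            f'<label><input type="radio" name="{escape(name)}" '
            f'value="{value}"> {escape(label)}</label>'
        )
    lines.append("</fieldset>")
    return "\n".join(lines)


labels = ["Strongly Disagree", "Disagree", "Neither", "Agree", "Strongly Agree"]
markup = likert_radio_group(
    "q1",  # hypothetical field name
    "It was easy for me to remember how to perform tasks using spreadsheets",
    labels,
)
print(markup)
```

Because all five inputs share one name, the browser enforces the single choice, which is what makes the item difficult to invalidate.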

 

Other methods of presenting Likert scales in a web page are by using slider controls (figures 9, 10, 11 & 12). A “slider control” (also known as a trackbar) is a window containing a slider and optional tick marks. Slider controls are useful when you want the respondent to select a discrete value or a set of consecutive values in a range. When the user moves the slider, using either the mouse or the direction keys, the control sends notification messages to indicate the change.

 

figure 9
Source: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/shellcc/platform/commctls/trackbar/trackbar.asp

figure 10
Source: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/shellcc/platform/commctls/trackbar/trackbar.asp

figure 11
Source: http://archive.devx.com/dhtml/articles/nm061102/slider.html

figure 12
Source: http://archive.devx.com/dhtml/articles/nm061102/Hand.html

 

The slider moves in increments that you specify when you create it. For example, if you specify that the slider should have a range of five, the slider can only occupy six positions: a position at the left side of the slider control and one position for each increment in the range.
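
The arithmetic above (a range of five gives six positions) can be made concrete with a small Python sketch. The function names are hypothetical, and the snapping rule, rounding a pointer position to the nearest increment, is an assumption about how a slider might behave rather than a description of any particular control.

```python
def slider_positions(range_size):
    """A slider with a given range has one position per increment plus the start."""
    return list(range(range_size + 1))


def snap(fraction, range_size):
    """Snap a pointer position (0.0 = far left, 1.0 = far right) to the nearest increment."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the track
    return round(fraction * range_size)


print(slider_positions(5))  # [0, 1, 2, 3, 4, 5]
print(snap(0.48, 5))        # 0.48 * 5 = 2.4, nearest increment is 2
```

By the same arithmetic, a five-point Likert item presented as a slider would need a range of four.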

From a technical aspect, a basic knowledge of programming is useful if the designer of the survey or questionnaire wishes to include slider controls in a web page. Radio buttons (figure 8) require only a knowledge of HTML, making them an easier option for the less technically minded.

Although there is some question over the reliability of Likert scales and their analytical capacity, the general consensus is in favour of using them; this is reinforced by the majority of the more recent literature reviewed.

Maurer and Pierce (cited by Maurer and Andrews, 2000) investigated the effectiveness of a Likert scale measure of self-efficacy for academic performance. They suggested the Likert scale can be considered a measure of both magnitude and confidence, and they concluded, based on reliability, predictive validity, and factor analysis data, that a Likert scale measure of self-efficacy is an acceptable alternative to the traditional measure.


Bibliography

Bell, J (1999) Doing Your Research Project. Buckingham: Open University Press

Bouma, G & Atkinson, G (1995) A Handbook of Social Science Research. New York: Oxford University Press

Cohen, L; Manion, L; & Morrison, K (2000) Research Methods in Education. London: RoutledgeFalmer

Coolican, H (1995) Introduction to Research Methods and Statistics in Psychology. London: Hodder and Stoughton

Coolican, H (1990) Research Methods and Statistics in Psychology. London: Hodder and Stoughton

Dumas, J (1999) Usability Testing Methods: Subjective Measures, Part II - Measuring Attitudes and Opinions. American Institutes for Research. Available online: http://www.upassoc.org/html/1999_archive/usability_testing_methods.html [20.02.03]

Dyer, C (1995) Beginning Research in Psychology. Oxford: Blackwell

Dwyer, E (1993) Attitude Scale Construction: A Review of the Literature. Abstract Available online: http://www.askeric.org Document. no. ED359201 [16.01.03]

Eysenck, M (1998) Psychology: An Integrated Approach. Essex: Addison Wesley Longman

Fishbein, M & Ajzen, I (1975) Belief, Attitude, Intention and Behaviour: An Introduction to Theory and Research. London: Addison-Wesley

Gal, I & Ginsburg, L (1994) The Role of Beliefs and Attitudes in Learning Statistics: Towards an Assessment Framework. Journal of Statistics Education, Vol. 2, No. 2 Available online: http://www.amstat.org/publications/jse/v2n2/gal.html [16.02.03]

Gall, M; Borg, W & Gall, J (1996) Educational Research, An Introduction. (6th Edition). USA: Longman

Gay, L (1996) Educational Research: Competencies for Analysis and Application. New Jersey: Prentice Hall

Gross, R (2001) Psychology: The Science of Mind and Behaviour. London: Hodder and Stoughton

Kahn, B; Nowlis, S; & Dhar, R (2000) Indifference versus Ambivalence: The Effect of a Neutral Point on Consumer Attitude and Preference Measurement. Available online: http://hops.wharton.upenn.edu/ideas/pdf/00-022.pdf [22.02.03]

Likert, R (1932) A Technique for the Measurement of Attitudes. Archives of Psychology; No.140

Li, Z; Cheng, K; Wang, Y; Hiltz, S; & Turoff, M (2001) Thurstone's Law of Comparative Judgment for Group Support. Available online: http://web.njit.edu/~zxl8078/Publication/BB004.PDF [16.02.03]

Likert, R & Hayes, S (1957) Some Applications of Behavioural Research. Paris: Unesco

Maurer, T & Andrews, K (2000) Traditional, Likert and Simplified Measures of Self-efficacy. Educational and Psychological Measurement; Vol. 60, No. 6

Miltiadou, M (1999) Motivational Constructs as Predictors of success in the Online Classroom. Available online: http://seamonkey.ed.asu.edu/~mcisaac/emc703/mariosf.html [7.02.03]

Miltiadou, M (1998?) Validation of the Online Technologies Self-efficacy Scale. Available online: http://seamonkey.ed.asu.edu/~alex/pub/efficacy.pdf [7.02.03]

Moser, C & Kalton G (1989) Survey Methods in Social Investigation. Hampshire, UK: Gower Publishing

Neuman, W.L (2000) Social Research Methods: Qualitative and Quantitative Approaches. USA: Allyn & Bacon

Newell, S & Goldsmith, R (2001) The Development of a scale to measure Perceived Corporate Credibility. Journal of Business Research; Vol. 52, Issue 3

Osgood, C; Suci, G & Tannenbaum, P (1957) The Measurement of Meaning. Urbana: University of Illinois

Robson, C (1993) Real World Research: A Resource for Social Scientists and Practitioner-Researchers. Oxford: Blackwell Publishers

Rovai, A (2002) Sense of Community, perceived cognitive learning, and persistence in asynchronous learning networks. The Internet and Higher Education; Vol. 5; Issue 4

Shaw, G & Pieter, W (2000) The Use of Asynchronous Learning Networks in Nutrition Education: Student Attitude, Experiences and Performance. Available online: http://www.aln.org/publications/jaln/v4n1/pdf/v4n1_shawpieter.pdf [24.01.03]

Taylor, B & Heath, A (1996) The Use of Double-sided Items in Scale Construction. Centre for Research into Elections and Social Trends; Working Paper no. 37. Abstract available online: http://www.crest.ox.ac.uk/p37.htm [11.01.03]

Tittle, C & Hill, R (1967) Attitude Measurement and Prediction of Behaviour: An Evaluation of Conditions and Measurement Techniques. Sociometry, Vol. 30

Torkzadeh, R; Pflughoeft, K & Hall, L (1999) Computer self-efficacy training effectiveness and user attitudes: an empirical study. Behaviour & Information Technology, vol. 18, no 4

Torkzadeh, G & Van Dyke, T (2001) Development and validation of an Internet self-efficacy scale. Behaviour and Information Technology, vol. 20, no 4

Wilson, T (1997) Online Course on Questionnaire Design. Available online: http://www.hb.se/bhs/nyutb/kurswebb/c-kurser/applirm/qdes4.htm [20.02.03]