
Appropriate Analysis and Presentation of Ordered Categorical Data

Physicians often use ordered categorical (ordinal) scales to approximate an objective evaluation of outcome variables for which precise measurements on continuous scales are not available. One advantage of assigning numerical values is that the severity of conditions can be ranked. However, these numerical values cannot be analysed as continuous data, because the values assigned in ordered categorical scales describe a ranking, not a measurement.

Clinical outcomes measured with ordinal scales are often presented and analysed with inappropriate statistical methods in the medical literature. In the fields of anaesthesia, rheumatology, and nursing, several studies indicated that ordinal data were presented appropriately in only 39–49% and analysed appropriately in 57–63% of journal articles [1–3]. The most common presentation error was reporting a mean value for ordinal data; the most common analysis error was using ANOVA. Other problems included graphs of ordinal data in which data points were connected by lines, and failure to report the raw data required to reanalyse the data appropriately.

Ordinal data may be graded with scales incorporating groups such as 'extremely satisfied', 'satisfied', 'neutral', 'unsatisfied', and 'extremely unsatisfied'. The main limitation of these data is that the interval or distance between the groups is unknown. It is inappropriate to calculate the 'mean' satisfaction in such a group. Although the grading scores are mutually exclusive and encompass all possible outcomes, they do not represent equal spacing between adjacent ranks, as occurs in interval or ratio data. Accordingly, calculating the sum, product, mean, or standard deviation of ordinal data is not appropriate because these functions assume that there is equal spacing between adjacent values [6]. Ranking ordinal data into alphabetical categories (e.g. A through E) makes these limitations more intuitive than ranking the same data numerically (e.g. 1 through 5); adding or multiplying letters together does not make sense.

Descriptive Statistics

Both a measure of "central tendency" and one of variation need to be given. When data are ordinal and skewed, medians and interquartile ranges are appropriate. The median represents the middle value of an ordered data set (i.e. half of the values will be lower and half of the values will be higher than the median) [7].
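As a minimal sketch of the summary described above, the following Python snippet computes a median, an interquartile range, and the percentage of responses within each rank. The scores are made-up illustrative data, not results from any survey, and the choice of `median_low` (which always returns an observed category instead of averaging two middle values) is an assumption made here to avoid arithmetic on ordinal codes.

```python
from collections import Counter
from statistics import median_low, quantiles

# Hypothetical satisfaction scores on a 1-5 ordinal scale (illustrative only).
scores = [1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# median_low returns an actual observed category, never an average of two.
med = median_low(scores)

# Quartiles; "inclusive" treats the data as the whole population of interest.
q1, _, q3 = quantiles(scores, n=4, method="inclusive")

# Percentage of responses within each rank of the scale.
counts = Counter(scores)
percent_per_rank = {
    rank: 100 * counts.get(rank, 0) / len(scores) for rank in range(1, 6)
}

print(f"median = {med}, interquartile range = {q1} to {q3}")
for rank, pct in percent_per_rank.items():
    print(f"score {rank}: {pct:.1f}%")
```

Reporting the full percentage breakdown alongside the median lets readers reanalyse the data, which a mean and standard deviation would not allow.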

For example, assume a satisfaction score with five possible values, 1–5, signifying satisfied through dissatisfied. If 100 patients were surveyed during a quality improvement project and the median satisfaction score was 4 before the project and 2 after it, then at least half of the questionnaires after the project were scored 1 or 2, while the remainder were scored 2, 3, 4 and/or 5. It would not, however, be justified to state that a score of 2 after the project represents a 50% reduction from the score of 4 before the project, as this would require multiplication or division of ordinal data.

Parametric statistical methods (e.g. t-tests and ANOVA) that are used to analyse interval or ratio data assume normality of the data [6]. However, ordinal data do not follow a normal (Gaussian) distribution and should not be analysed with these methods. Presentation and analysis of ordered categorical data with methods that are inconsistent with the structure of the data may lead to unjustified implications and conclusions. In simulated data sets, inappropriate use of the t-test led to an inflated type I error (false positive) rate, indicating that the t-test rejected the null hypothesis more often than it should [7]. An increased type I error rate is cause for concern because incorrect conclusions about treatments may be made: researchers would too often conclude that two treatment groups were significantly different when in fact there was no difference.

Appropriate methods, by category:

Analysis
  • Wilcoxon signed rank
  • Wilcoxon rank sum
  • Mann–Whitney U
  • Kruskal–Wallis
  • Spearman rank correlation
  • Kendall's rank correlation
  • Logistic regression
  • Cohen's kappa

Presentation
  • Median
  • Range or interquartile range
  • Percentage within each rank of a numerical rating scale

Two-group comparisons [8]
  • Paired design: Wilcoxon signed rank test
  • Unpaired design: Wilcoxon rank sum test
  • Unpaired design: Mann–Whitney U-test [less usual]
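To illustrate how a rank-based two-group comparison works, here is a minimal pure-Python sketch of the Mann–Whitney U statistic for an unpaired design. The function names and the two groups of scores are hypothetical, chosen only for illustration; the sketch computes the U statistic alone, and a p-value would normally come from a statistics package (for example `scipy.stats.mannwhitneyu`).

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups of ordinal scores."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical scores (1 = satisfied ... 5 = dissatisfied) from two separate wards.
ward_a = [4, 4, 5, 3, 4, 5]
ward_b = [2, 1, 2, 3, 2, 1]

u_a, u_b = mann_whitney_u(ward_a, ward_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```

Only the ordering of the scores enters the calculation, so the result would be unchanged if the categories were relabelled with any other increasing codes.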

Single Attribute Correlation for Nonparametric Distributions

Linear regression assumes Normality and constant standard deviation of the outcome variable for given values of the explanatory variable. The Pearson correlation coefficient is based on a Normal distribution of both variables and is heavily influenced by outliers. Data should always be plotted first, as only if the relation is at least approximately linear is it sensible to use either linear regression or Pearson's correlation. Nonparametric correlation coefficients, Spearman's or Kendall's, should be used when the assumptions are violated.
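As a rough sketch of a nonparametric alternative, Spearman's coefficient can be computed as the Pearson correlation of the ranks. The helper names and the data below are assumptions for illustration only; the example pairs a waiting time containing one extreme value with an ordinal dissatisfaction score.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation applied to the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: waiting time in minutes (one extreme value) and a
# dissatisfaction score from 1 (satisfied) to 5 (dissatisfied).
waiting_minutes = [5, 10, 15, 20, 240]
dissatisfaction = [1, 2, 3, 4, 5]

print(spearman_rho(waiting_minutes, dissatisfaction))  # 1.0: perfectly monotone
```

Because only ranks are used, the extreme waiting time of 240 minutes carries no more weight than any other largest value would.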

Importance

Satisfaction surveys contain a series of "attributes", which are rating scales for a series of specific statements or questions (e.g. courtesy, accuracy, timeliness). Whether some attributes are more important than others can be assessed in two ways:

  • "stated importance": by asking customers how important an item is
  • "derived importance": by calculating the relationship between attributes and overall satisfaction

Stated importance: Asking about importance adds unnecessary questionnaire length because, for each attribute, you would need to ask not only "were you satisfied (on a scale of 1 to 5)?" but also "how important is this attribute (on a scale of 1 to 5)?", doubling the THIS questionnaire from approximately 50 attribute questions to more than 100. This may make the person filling in the questionnaire irritable or unwilling to complete the form. The results may also lead to erroneous improvement strategies, because what customers say is important may not be one of the drivers of whether they will be satisfied.

Derived importance: uncovers items which are most important to the satisfaction of customers. These attributes will not always be the same attributes that a customer would identify as being most important, but they would be the ones which, if improved upon, will result in higher levels of satisfaction. It is not difficult to calculate the relative importance of a series of attributes, provided that the questionnaire also includes a satisfaction measure of some sort. The basic process is to conduct a correlation analysis, eliminate attributes which have high correlation coefficients with each other and are saying "much the same thing", and then to run a nonparametric regression analysis.

Quadrant Analysis: Importance vs Perception


Prioritization

The Priority Index is an ordered list of survey items that shows the areas needing the most improvement. Survey items are arranged from the "first item to work on" to the "last item to work on". The index reflects service issues that the hospital is performing relatively poorly on but that are important to the patients. Survey items that have low average scores and high correlation scores will have high priority index scores.

It is calculated as follows. Questions are first rank ordered according to their top-box score, and questions with the lowest score are given the highest point value. The questions are then ranked again by their rank correlation (Kendall's τ) with "overall satisfaction". Summing the two ranks gives each question's overall position in the priority index. Items with the highest totals point to the specific questions where there is the most room for improvement and which are most likely to have the greatest impact on overall satisfaction.
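The calculation can be sketched in Python as follows. Several details are assumptions made for this illustration and are not specified in the text: the top-box score is taken to be the share of responses in the most favourable category (coded 5 here), Kendall's τ is computed in its tie-corrected τ-b form, ties between questions are broken by list order, and the survey responses are invented.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, which corrects for tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from all counts
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


def priority_index(responses, overall, top_box=5):
    """Return (questions ordered first-to-work-on first, total points)."""
    names = list(responses)
    top_box_score = {
        q: sum(1 for v in responses[q] if v == top_box) / len(responses[q])
        for q in names
    }
    tau = {q: kendall_tau_b(responses[q], overall) for q in names}
    # Lowest top-box score earns the most points.
    by_score = sorted(names, key=lambda q: top_box_score[q], reverse=True)
    score_points = {q: pos + 1 for pos, q in enumerate(by_score)}
    # Highest correlation with overall satisfaction earns the most points.
    by_tau = sorted(names, key=lambda q: tau[q])
    tau_points = {q: pos + 1 for pos, q in enumerate(by_tau)}
    totals = {q: score_points[q] + tau_points[q] for q in names}
    return sorted(names, key=lambda q: totals[q], reverse=True), totals


# Hypothetical responses from eight patients (5 = most favourable).
overall = [5, 4, 4, 3, 2, 5, 3, 1]
responses = {
    "courtesy": [5, 5, 5, 4, 4, 5, 5, 3],
    "timeliness": [4, 3, 3, 2, 1, 5, 2, 1],
    "accuracy": [5, 5, 5, 5, 4, 4, 5, 5],
}

order, totals = priority_index(responses, overall)
print(order)   # ['timeliness', 'courtesy', 'accuracy']
print(totals)  # {'courtesy': 4, 'timeliness': 6, 'accuracy': 2}
```

In this made-up example "timeliness" has both the lowest top-box score and the strongest rank correlation with overall satisfaction, so it heads the list as the first item to work on.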

Multiple Attributes & Ordinal Logistic Regression


Article Information
Article URL: http://www.qi.org.tw/Quality/ref/likertords.aspx
Created: 2011-06-06 08:56
Updated: 2011-05-19 19:26