
A convention to identify and show extreme values in a set of data using a box-and-whisker plot

An article in the J. Empirical Legal Studies, June 2012 at 233, relies on box-and-whisker plots to describe large amounts of its data.  Because the whiskers ordinarily run to the minimum and maximum values of a variable, the authors needed a convention for handling “outside values.”  “An outside value is defined as a value that is larger than the upper quartile plus 1.5 times the interquartile range, or smaller than the lower quartile minus 1.5 times the interquartile range.”  That convention is a sensible way to define and depict extreme or odd values.  The authors displayed outside values as separate points.
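
The quoted definition is simple to compute.  What follows is a minimal sketch in Python, not the authors' own method: the function names and the sample numbers are made up for illustration, and quartiles are taken with the standard library's "inclusive" method, which is only one of several quartile conventions in use.

```python
from statistics import quantiles


def outside_fences(values):
    """Return (lower_fence, upper_fence) under the 1.5 x IQR convention."""
    # Assumption: "inclusive" quartiles; other quartile methods shift the fences a bit.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outside_values(values):
    """Return the values that fall below the lower fence or above the upper fence."""
    lower, upper = outside_fences(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample, invented for this illustration.
sample = [2, 4, 5, 7, 8, 9, 11, 40]
print(outside_fences(sample))
print(outside_values(sample))
```

With this sample only the value 40 lies beyond the fences, so it alone would be drawn as a separate point.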

 

For example, a box-and-whisker chart would nicely convey much about the revenue of an industry in a benchmark report.  Some companies, however, would have reported outside values.  Someone might have entered too many zeros, used a currency other than dollars without saying so, or simply been wrong.  The convention gives a way to collar outliers.
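
To make the revenue example concrete, here is a rough sketch of the numbers such a chart would need.  Everything in it is an assumption for illustration: the revenue figures are invented (in millions of dollars, with one entry mistakenly keyed in thousands), the function name is hypothetical, and quartiles use the standard library's "inclusive" method.  It also assumes the whiskers stop at the most extreme values that are not outside values.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Summarize values for a box-and-whisker plot with outside values split off."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    reach = 1.5 * (q3 - q1)  # 1.5 times the interquartile range
    low_fence, high_fence = q1 - reach, q3 + reach
    inside = [v for v in values if low_fence <= v <= high_fence]
    outside = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "whisker_low": min(inside),    # lowest value that is not an outside value
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": max(inside),   # highest value that is not an outside value
        "outside": outside,            # plotted as separate points
    }


# Invented revenue figures in $ millions; the last entry was keyed in thousands.
revenue_musd = [120, 135, 150, 160, 175, 190, 210, 240, 150000]
summary = box_plot_summary(revenue_musd)
print(summary)
```

Here the mistaken entry is set apart as an outside value, and the whiskers span only the remaining companies, so the box is not squashed flat by one bad number.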

 

This blog has explained box-and-whisker plots (see my posts of May 3, 2008: sometimes called Tukey boxes; and Jan. 15, 2009: box-and-whisker plots to display data).  It has also explained interquartile ranges (see my posts of Nov. 30, 2005: the average of the middle; June 30, 2006: inter-quartile mean; and Dec. 22, 2010: inter-quartile differences).  Even the handling of outliers has appeared in these pages (see my posts of April 27, 2010: information theory, power laws, and odd data; July 1, 2010: standardized scores highlight outliers; and Nov. 28, 2011: Winsorize data).