Tools and Techniques for Business Improvement

 


Scatter Diagram (part 2: calculations)


 

In this article, we will look at three calculations associated with Scatter Diagrams: the correlation coefficient, the regression line and the standard error.

 

The degree of correlation in a Scatter Diagram can be calculated, giving a number known as the correlation coefficient or the coefficient of correlation. This can take any value from -1 to +1. A correlation coefficient of +1 indicates a perfect positive correlation, with all points in a perfect line going from the bottom left to the top right. A correlation coefficient of -1 indicates a perfect negative correlation, with all points in a perfect line going from the top left to the bottom right. A coefficient of zero indicates no linear correlation at all, typically with points scattered randomly across the measurement space.
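For reference, the coefficient described here is usually taken to be the Pearson product-moment correlation coefficient. That is an assumption, as the article does not name it. For n points with coordinates x_i and y_i, and with means written as x-bar and y-bar, it can be written as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

When every point lies exactly on a rising straight line, the numerator equals the denominator and r = +1. On a falling straight line the numerator changes sign and r = -1.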

 

A line of ‘best fit’, or regression line, can be drawn through the points to indicate their central locus, as in Figure 1. One way of calculating this line is known as the method of ‘least squares’.

 

The standard error is, effectively, the standard deviation of the points within a single slice across the diagram. If the points are Normally distributed across the slice (as they may well be if there is a central tendency), then the standard error can be used to predict the probable positions of points.

 

 

Fig. 1. Variation across the Scatter Diagram

 

Doing the calculations

Calculating these values may seem daunting, but it is in fact quite straightforward. The figures below show the steps you can take to work out the correlation coefficient and the standard error, and to draw the line of best fit.

The first step is to draw up columns containing the pairs of numbers (x and y) that make up each point on the Scatter Diagram. Add further columns for the product of each pair (xy) and for the square of each value (x squared and y squared), and then sum each of the columns.
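For readers working outside a spreadsheet, the first stage can be sketched in Python. The data points below are hypothetical, and the function name and dictionary keys are illustrative choices that do not come from the article's figure.

```python
# Hypothetical sample data: each (x, y) pair is one point on the Scatter Diagram.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]


def first_stage(points):
    """Build the x, y, xy, x^2 and y^2 columns and return them with their sums."""
    # One row per point, holding: x, y, x*y, x squared, y squared.
    rows = [(x, y, x * y, x * x, y * y) for x, y in points]
    sums = {
        "n": len(rows),
        "sum_x": sum(row[0] for row in rows),
        "sum_y": sum(row[1] for row in rows),
        "sum_xy": sum(row[2] for row in rows),
        "sum_x2": sum(row[3] for row in rows),
        "sum_y2": sum(row[4] for row in rows),
    }
    return rows, sums


rows, sums = first_stage(points)
for row in rows:
    print(row)
print(sums)
```

Each row corresponds to one line of the table, and the dictionary of sums corresponds to the totals row at the foot of the columns.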

 

 

Fig. 2.  First stage calculation

 

The next step is to do some fairly fiddly calculations for the values a, b and c, as in Figure 3. With a spreadsheet, this is again a simple step once you have set up the formulae (check carefully that they are correct!).
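As a rough illustration, the slope and intercept of a least-squares line can be computed from the column sums as below. This sketch uses the textbook least-squares formulas and names the results a and c to match the line equation y = ax + c used later in the article. It does not attempt to reproduce the value b, since the figure defining it is not available here. The data points are hypothetical.

```python
def least_squares_line(points):
    """Return (a, c) for the least-squares line y = a*x + c."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every x value is the same: the line would be vertical.
        raise ValueError("all x values are identical; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    c = (sum_y - a * sum_x) / n                     # intercept
    return a, c


# Hypothetical sample data, not taken from the article.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
a, c = least_squares_line(points)
print(f"y = {a:.2f}x + {c:.2f}")
```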

 

 

Fig. 3.  Second stage calculation

 

In the third stage, the correlation coefficient, r, and the standard error, s, can now be calculated, as in Figure 4.
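A minimal sketch of this stage in Python follows, assuming the Pearson formula for r and the standard error of estimate about the least-squares line for s. The divisor n - 2 is an assumption: some texts divide by n instead, and the article's figure may differ. The function name and data are hypothetical.

```python
import math


def correlation_and_standard_error(points):
    """Return (r, s) for a list of (x, y) points.

    r is the Pearson correlation coefficient.
    s is the standard error of estimate about the least-squares line,
    using an n - 2 divisor (an assumption, see the text above).
    """
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are needed")

    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_y2 = sum(y * y for _, y in points)

    sxy = n * sum_xy - sum_x * sum_y
    sxx = n * sum_x2 - sum_x ** 2
    syy = n * sum_y2 - sum_y ** 2
    if sxx == 0 or syy == 0:
        raise ValueError("x or y values are all identical; r is undefined")

    r = sxy / math.sqrt(sxx * syy)

    # Least-squares line y = a*x + c, needed to measure the scatter around it.
    a = sxy / sxx
    c = (sum_y - a * sum_x) / n
    squared_residuals = sum((y - (a * x + c)) ** 2 for x, y in points)
    s = math.sqrt(squared_residuals / (n - 2))
    return r, s


# Hypothetical sample data, not taken from the article.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
r, s = correlation_and_standard_error(points)
print(f"r = {r:.3f}, s = {s:.3f}")
```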

 

 

Fig. 4.  Third stage calculation

 

Finally, the points for the regression line can be calculated, using the values of a and c from the second stage calculations. Only two points are needed to draw a line, so simply select a low and a high value of x and work out the corresponding values of y, using the standard formula for a straight line, y = ax + c, as in Figure 5.
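The two-point calculation can be sketched as follows. The values of a and c here are illustrative placeholders rather than results from the article's figures, and the function name is hypothetical.

```python
def line_endpoints(a, c, x_low, x_high):
    """Return the two (x, y) points needed to draw the line y = a*x + c."""
    return (x_low, a * x_low + c), (x_high, a * x_high + c)


# Illustrative slope and intercept; substitute your own second-stage values.
a, c = 0.9, 1.3
start, end = line_endpoints(a, c, x_low=1, x_high=5)
print(start, end)
```

Choosing x values near the edges of the data keeps the drawn line within the range that the points actually cover.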

 

Fig. 5.  Fourth stage calculation

 

 

Next time: Control chart (part 1: interpretation)

 

This article first appeared in Quality World, the journal of the Chartered Quality Institute

 


 

 

  © Syque 2002-2011
