How to Find the Correlation Coefficient 'R' for a Scatter Plot
- 1). Calculate the average of the x values and the average of the y values by summing each set and dividing by the number of data points. As an example, consider a scatter plot with three (x,y) data points: (0,1), (2,3) and (5,7). The x values 0, 2 and 5 average to (0 + 2 + 5) / 3 = 2.33. The y values 1, 3 and 7 average to (1 + 3 + 7) / 3 = 3.67.
- 2). Calculate the standard deviation of the x and y data points, Sx and Sy, by first calculating the absolute value of the difference between each data point and the average, then squaring these values, averaging the squared values (dividing by the number of data points), and finally taking the square root. (See References 3). Continuing the example from step 1, the x values of 0, 2 and 5 give deviations of |0 - 2.33|, |2 - 2.33| and |5 - 2.33|, or 2.33, 0.33 and 2.67. Squaring the unrounded deviations gives 5.44, 0.11 and 7.11. The squared values average to 4.22, and taking the square root of this number gives 2.055. Hence, the standard deviation for x, or Sx, is 2.055. Applying the same procedure to the y values 1, 3 and 7 gives Sy = 2.494.
- 3). Find the slope of the linear regression or "best fit" line drawn through the data. Many graphing programs will perform the linear regression and display the equation on the graph in the form y = mx + b, where m represents the slope and b represents the y-intercept. If the equation of the best-fit line is not available, choose any two points on the line (not necessarily data points) and label them x1,y1 and x2,y2. Then calculate the slope, m, by m = (y2 - y1) / (x2 - x1). In the case of the sample data from step 1, the slope of the least-squares line is 1.2105.
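The arithmetic in steps 1 to 3 can be checked with a short script. The following is a minimal Python sketch, not a definitive implementation. It assumes the sample points are (0,1), (2,3) and (5,7), which are the y values used to compute the averages in step 1, and it assumes the best-fit line is the ordinary least-squares line. The function names are illustrative only. The last line applies the standard identity r = m * Sx / Sy, which the steps above do not state explicitly but which is the usual way to combine the slope and the two standard deviations into a correlation coefficient.

```python
from math import sqrt

# Sample data from step 1 (assumed points: (0,1), (2,3), (5,7)).
xs = [0, 2, 5]
ys = [1, 3, 7]


def mean(values):
    """Step 1: sum the values and divide by the number of data points."""
    return sum(values) / len(values)


def population_std(values):
    """Step 2: square root of the average squared deviation (divides by n)."""
    avg = mean(values)
    return sqrt(sum((v - avg) ** 2 for v in values) / len(values))


def least_squares_slope(x_values, y_values):
    """Step 3: slope m of the least-squares line y = m*x + b."""
    x_bar, y_bar = mean(x_values), mean(y_values)
    numerator = sum((x - x_bar) * (y - y_bar)
                    for x, y in zip(x_values, y_values))
    denominator = sum((x - x_bar) ** 2 for x in x_values)
    return numerator / denominator


x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = population_std(xs), population_std(ys)
m = least_squares_slope(xs, ys)

# Standard identity (an addition to the steps above): r = m * Sx / Sy.
r = m * s_x / s_y

print(f"mean x = {x_bar:.2f}, mean y = {y_bar:.2f}")
print(f"Sx = {s_x:.3f}, Sy = {s_y:.3f}")
print(f"slope m = {m:.4f}")
print(f"r = {r:.3f}")
```

For the sample data this reproduces Sx = 2.055, Sy = 2.494 and m = 1.2105, and gives an r of about 0.997, indicating a strong positive linear relationship. Because the ratio Sx / Sy is the same whether the standard deviations divide by n or by n - 1, r does not depend on that choice as long as both use the same convention.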