Once the degree of relationship between variables has been established using correlation analysis, it is natural to examine the nature of that relationship. Regression analysis helps in determining the cause-and-effect relationship between variables. If the values of the independent variables are known, the value of the other variable (called the dependent variable) can be predicted using either a graphical method or an algebraic method.

__Graphical Method__

It involves drawing a scatter diagram with the independent variable on the X-axis and the dependent variable on the Y-axis. A line is then drawn so that it passes as close as possible to most of the points, with the remaining points distributed almost evenly on either side of the line.

A regression line is known as the line of best fit; it summarizes the general movement of the data. It shows the best mean values of one variable corresponding to given values of the other. The regression line is based on the criterion that it is the straight line that minimizes the sum of squared deviations between the predicted and observed values of the dependent variable.
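The least-squares criterion can be written out explicitly. As a sketch, label the observations $(X_i, Y_i)$ for $i = 1, \dots, N$ (the index $i$ and the symbol $S$ are notation assumed here for illustration). The line $Y = a + bX$ is chosen to minimize

$$S(a, b) = \sum_{i=1}^{N} \left( Y_i - a - bX_i \right)^2$$

Setting both partial derivatives to zero gives two linear equations in $a$ and $b$:

$$\frac{\partial S}{\partial a} = -2 \sum \left( Y_i - a - bX_i \right) = 0 \;\Rightarrow\; \sum Y = Na + b \sum X$$

$$\frac{\partial S}{\partial b} = -2 \sum X_i \left( Y_i - a - bX_i \right) = 0 \;\Rightarrow\; \sum XY = a \sum X + b \sum X^2$$

These are the normal equations used by the algebraic method.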

__Algebraic Method__

The algebraic method develops two regression equations: Y on X, and X on Y.

**Regression equation of Y on X**

$$Y = a + bX$$

Where:

· Y = Dependent variable

· X = Independent variable

· a = Constant showing Y-intercept

· b = Constant showing slope of line

Values of a and b are obtained from the following normal equations:

$$\sum Y = Na + b \sum X$$

$$\sum XY = a \sum X + b \sum X^2$$

Where:

· N = Number of observations
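The two normal equations are a pair of linear equations in a and b, so they can be solved by elimination once the sums are known. The following is a minimal sketch; the function name `solve_normal_equations`, its parameter names, and the small data set are illustrative assumptions, not part of the original text.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xx, sum_xy):
    """Solve the normal equations for Y on X:
         sum_y  = n * a     + b * sum_x
         sum_xy = a * sum_x + b * sum_xx
    and return (a, b)."""
    # Eliminate a: multiply the first equation by sum_x and the
    # second by n, then subtract one from the other.
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    # Back-substitute b into the first normal equation.
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = solve_normal_equations(
    n=len(xs),
    sum_x=sum(xs),
    sum_y=sum(ys),
    sum_xx=sum(x * x for x in xs),
    sum_xy=sum(x * y for x, y in zip(xs, ys)),
)
print(a, b)  # 1.0 2.0
```

Swapping the roles of the two variables (passing the Y sums where the X sums are expected, and vice versa) gives the coefficients of the regression equation of X on Y.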

**Regression equation of X on Y**

$$X = a + bY$$

Where:

· X = Dependent variable

· Y = Independent variable

· a = Constant showing X-intercept (the value of X when Y = 0)

· b = Constant showing slope of line (the change in X per unit change in Y)

Values of a and b are obtained from the following normal equations:

$$\sum X = Na + b \sum Y$$

$$\sum XY = a \sum Y + b \sum Y^2$$

Where:

· N = Number of observations

**Example**

**Problem Statement:**

A researcher has found that there is a correlation between the weights of fathers and their sons. He is now interested in developing regression equations for the two variables from the given data, where X denotes the weight of the father and Y the weight of the son:

| Weight of father (in kg) | 69 | 63 | 66 | 64 | 67 | 64 | 70 | 66 | 68 | 67 | 65 | 71 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Weight of son (in kg) | 70 | 65 | 68 | 65 | 69 | 66 | 68 | 65 | 71 | 67 | 64 | 72 |

Develop

1. Regression equation of Y on X.

2. Regression equation of X on Y.

**Solution:**

| $X$ | $X^2$ | $Y$ | $Y^2$ | $XY$ |
| --- | --- | --- | --- | --- |
| 69 | 4761 | 70 | 4900 | 4830 |
| 63 | 3969 | 65 | 4225 | 4095 |
| 66 | 4356 | 68 | 4624 | 4488 |
| 64 | 4096 | 65 | 4225 | 4160 |
| 67 | 4489 | 69 | 4761 | 4623 |
| 64 | 4096 | 66 | 4356 | 4224 |
| 70 | 4900 | 68 | 4624 | 4760 |
| 66 | 4356 | 65 | 4225 | 4290 |
| 68 | 4624 | 71 | 5041 | 4828 |
| 67 | 4489 | 67 | 4489 | 4489 |
| 65 | 4225 | 64 | 4096 | 4160 |
| 71 | 5041 | 72 | 5184 | 5112 |
| $\sum X = 800$ | $\sum X^2 = 53{,}402$ | $\sum Y = 810$ | $\sum Y^2 = 54{,}750$ | $\sum XY = 54{,}059$ |
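The column totals are easy to get wrong by hand, so it can help to recompute them. The short sketch below does this from the raw data; the variable names (`father`, `son`, `sum_xx`, and so on) are chosen here for illustration.

```python
# Raw data from the problem statement
father = [69, 63, 66, 64, 67, 64, 70, 66, 68, 67, 65, 71]  # X
son = [70, 65, 68, 65, 69, 66, 68, 65, 71, 67, 64, 72]     # Y

n = len(father)
sum_x = sum(father)                                # total of X
sum_xx = sum(x * x for x in father)                # total of X squared
sum_y = sum(son)                                   # total of Y
sum_yy = sum(y * y for y in son)                   # total of Y squared
sum_xy = sum(x * y for x, y in zip(father, son))   # total of X times Y

print(n, sum_x, sum_xx, sum_y, sum_yy, sum_xy)
# 12 800 53402 810 54750 54059
```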

__Regression equation of Y on X__

$$Y = a + bX$$

where a and b are obtained from the normal equations

$$\sum Y = Na + b \sum X$$

$$\sum XY = a \sum X + b \sum X^2$$

with $\sum Y = 810$, $\sum X = 800$, $\sum X^2 = 53{,}402$, $\sum XY = 54{,}059$ and $N = 12$.

⇒ 810 = 12a + 800b … (i)

⇒ 54059 = 800a + 53402b … (ii)

Multiplying equation (i) by 800 and equation (ii) by 12, we get:

9600a + 640000b = 648000 … (iii)

9600a + 640824b = 648708 … (iv)

Subtracting equation (iv) from (iii):

−824b = −708

⇒ b = 0.8592

Substituting the value of b in equation (i):

810 = 12a + 800(0.8592)

810 = 12a + 687.4

12a = 122.6

⇒ a = 10.22

Hence the regression equation of Y on X can be written as

$$Y = 10.22 + 0.859X$$

__Regression equation of X on Y__

$$X = a + bY$$

where a and b are obtained from the normal equations

$$\sum X = Na + b \sum Y$$

$$\sum XY = a \sum Y + b \sum Y^2$$

with $\sum X = 800$, $\sum Y = 810$, $\sum Y^2 = 54{,}750$, $\sum XY = 54{,}059$ and $N = 12$.

⇒ 800 = 12a + 810b … (v)

⇒ 54059 = 810a + 54750b … (vi)

Multiplying equation (v) by 810 and equation (vi) by 12, we get:

9720a + 656100b = 648000 … (vii)

9720a + 657000b = 648708 … (viii)

Subtracting equation (vii) from (viii):

900b = 708

⇒ b = 0.7867

Substituting the value of b in equation (v):

800 = 12a + 810(0.7867)

800 = 12a + 637.2

12a = 162.8

⇒ a = 13.57

Hence the regression equation of X on Y is

$$X = 13.57 + 0.787Y$$
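As a cross-check on the hand calculation, both sets of coefficients can be recomputed directly from the raw data. The sketch below uses the deviation-from-the-mean form of the least-squares solution (slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², intercept = ȳ − slope · x̄), which is algebraically equivalent to solving the normal equations but is not the elimination procedure used above. The function name `regression_line` is an illustrative choice.

```python
father = [69, 63, 66, 64, 67, 64, 70, 66, 68, 67, 65, 71]  # X
son = [70, 65, 68, 65, 69, 66, 68, 65, 71, 67, 64, 72]     # Y


def regression_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


a_yx, b_yx = regression_line(father, son)  # Y on X: son's weight from father's
a_xy, b_xy = regression_line(son, father)  # X on Y: father's weight from son's

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.3f} X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.3f} Y")
```

Note that the two lines are not algebraic inverses of each other: each minimizes squared deviations in its own dependent variable, so both must be fitted separately.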