
# BECC-110 Solved Assignment Solution by Gyaniversity

Assignment Solution

Assignment Code: BECC-110/TMA/2022-23

Course Code: BECC-110

Assignment Name: Introductory Econometrics

Year: 2022-2023

Verification Status: Verified by Professor

Total Marks: 100

Section I: Long answer questions (word limit - 500 words). Each question carries 20 marks. Word limit does not apply in the case of numerical questions. 2 × 20 = 40

1. (a) Distinguish between the Population Regression Function and Sample Regression Function in detail. Use appropriate diagram to substantiate your response. (10)

Ans)

Population Regression Function

A population regression function (PRF) hypothesises a theoretical relationship between a dependent variable and a set of independent or explanatory variables. In its simplest form it is a straight line. The function describes how the conditional expectation of a variable Y responds to changes in the independent variable X.

𝑌i = 𝐸(𝑌i | 𝑋i) + 𝑢i … (i)

Equation (i) shows that the function is composed of a deterministic component E(Y|X) and a nondeterministic or "stochastic" component u.

We are concerned with examining the factors that influence the dependent variable (Y), given the values of the independent variables (X).

Deterministic Component

The conditional expectation of Y is the deterministic component of the regression model. Plotted against X, it gives a line known as the Population Regression Line (PRL). The random error term, denoted by u, is the stochastic or non-deterministic component. As an example, suppose we want to investigate the relationship between weekly personal disposable income (PDI) and weekly expenditure for a group of people, with weekly PDI as the independent variable (X) and weekly expenditure as the dependent variable (Y).

For each given value of weekly PDI, the average weekly expenditure is plotted on the vertical axis. Since individuals with higher incomes tend to spend more, we expect a positive relationship between weekly PDI and weekly expenditure. The population regression line plotted on such a graph is:

𝐸(𝑌i |𝑋i) = 𝛽1 + 𝛽2𝑋i

Note that 𝛽1 and 𝛽2 in the equation above are the parameters. 𝛽1 is the intercept of the population regression function: it shows the expected value of the dependent variable when the explanatory variable is zero. 𝛽2 is the slope of the population regression function: it indicates the amount by which the expected value of the dependent variable changes when the independent variable changes by one unit. These population parameters describe the relationship between the dependent variable and the independent variable in the population.

Stochastic Component

We do not assume a deterministic relationship between X and Y when we gather data. For instance, two people may have different expenses even though they earn the same amount of money. Suppose two people each have a monthly income of Rs. 20000. One may spend Rs. 15000 per month while the other spends Rs. 19000. The second person's higher spending may be due to health or lifestyle. The stochastic error term accounts for such differences in the dependent variable. In the figure, the value of the Y variable for a specific value of X is represented by a vertical dotted line.

Sample Regression Function

We rarely have access to information about the entire population; a sample is usually all we have. We must therefore use the sample to estimate the population parameters. Because of sampling fluctuations (sampling error), we may not be able to recover the population regression line (PRL) exactly. Suppose two samples are drawn from the given population. Fitting a regression line to each sample separately gives two sample regression lines (SRLs), each of which is an estimate of the PRL. The two lines, SRL1 and SRL2, are displayed in the figure.

Two Sample Regression Lines

Both sample regression lines represent the population regression line. However, the slope and intercept of each SRL differ because of sampling variability. Just as the population regression function (PRF) underlies the PRL, we define the Sample Regression Function (SRF), which consists of the Sample Regression Line (SRL) and the residual term, the sample counterpart of the error term ui.
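The difference between the PRL and the SRLs can be illustrated with a small simulation. The sketch below is only an illustration: the population parameters (intercept 20, slope 0.6), the income range and the error standard deviation are assumed values, not figures from the assignment. Two samples drawn from the same population give two different sample regression lines, both close to the population line.

```python
import random

def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Assumed population regression function: E(Y|X) = 20 + 0.6 X
BETA1, BETA2 = 20.0, 0.6

def draw_sample(rng, n=50):
    """Draw n (X, Y) pairs where Y = PRF value + random disturbance u."""
    xs = [rng.uniform(100, 300) for _ in range(n)]        # weekly PDI
    ys = [BETA1 + BETA2 * x + rng.gauss(0, 10) for x in xs]  # weekly expenditure
    return xs, ys

rng = random.Random(1)
srl1 = ols_line(*draw_sample(rng))   # (intercept, slope) from sample 1
srl2 = ols_line(*draw_sample(rng))   # (intercept, slope) from sample 2

print("PRL :", (BETA1, BETA2))
print("SRL1:", srl1)
print("SRL2:", srl2)
```

Neither SRL coincides with the PRL, and the two SRLs differ from each other, which is the sampling variability described above.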

(b) What are the assumptions of a classical regression model? (10)

Ans) A linear regression model rests on a set of assumptions. A regression model that satisfies them is referred to as the classical linear regression model (CLRM). The assumptions of the CLRM listed here are:

1. The regression model is linear in parameters. It may or may not be linear in variables. For example, the model 𝑌i = 𝛽1 + 𝛽2𝑋i + 𝑢i is linear in parameters as well as in variables.

2. The explanatory variable is not correlated with the disturbance term u. This assumption requires that Σ 𝑢i𝑋i = 0. In other words, the covariance between the error term and the explanatory variable is zero. The assumption is automatically fulfilled if X is non-stochastic, that is, if the 𝑋i values are kept fixed in repeated samples.

3. The error term u has a zero expected value (mean). In symbols, E(ui | Xi) = 0. This does not mean that every error term is zero; it means that positive and negative errors cancel one another out on average.

4. Each ui has the same fixed variance. In symbols, var(ui) = σ². The figure shows the conditional distribution of the error term, with the corresponding error variance at each value of X. You can see from the figure that the error variance is constant across all levels of the X variable. This situation is known as "homoscedasticity."
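Assumptions 2 and 3 concern the unobserved population disturbances, so they cannot be checked directly. Their sample counterparts, however, hold by construction for OLS residuals: the residuals sum to zero and are uncorrelated with X. The sketch below verifies this on a small set of illustrative income and expenditure figures (assumed data, not taken from the assignment).

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Illustrative (assumed) data: weekly income X and weekly expenditure Y
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
spending = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

b1, b2 = ols_line(income, spending)
residuals = [y - (b1 + b2 * x) for x, y in zip(income, spending)]

# Sample analogues of E(u) = 0 and cov(u, X) = 0
sum_resid = sum(residuals)
sum_resid_x = sum(e * x for e, x in zip(residuals, income))

print(f"intercept = {b1:.4f}, slope = {b2:.4f}")
print(f"sum of residuals     = {sum_resid:.2e}")
print(f"sum of residuals * X = {sum_resid_x:.2e}")
```

Both sums are zero up to floating-point rounding. This is a property of the OLS fit itself and does not prove that the assumptions hold in the population.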

2. (a) Measurement error in variables is a serious problem in econometric studies. Find out the consequences of measurement errors in i) dependent variable and ii) independent variables.

Ans) So far, we have assumed that the variables in the econometric model under study are measured correctly, that is, there are no measurement errors in either the explained or the explanatory variables. In practice, we sometimes do not have accurate data on the variables we want to use in the model. This can happen for various reasons, such as non-response error, reporting error and computing error. A classic example of measurement error is the variable permanent income used in Milton Friedman's model. Measurement error in variables is therefore a serious problem in econometric studies.

Dependent Variable

Let us consider the following model:

$Y_i^* = \alpha + \beta X_i + u_i$ … (i)

where $Y_i^*$ is permanent consumption expenditure,

$X_i$ is current income, and

$u_i$ is the stochastic disturbance term.

(We place a star mark (*) on the variable that is measured with errors.)

Since $Y_i^*$ is not directly measurable, we may use an observable expenditure variable $Y_i$ such that

$Y_i = Y_i^* + e_i$ … (ii)

where $e_i$ denotes measurement error in $Y_i^*$. Therefore, instead of estimating $Y_i^* = \alpha + \beta X_i + u_i$, we estimate

$Y_i = \alpha + \beta X_i + u_i + e_i$

$= \alpha + \beta X_i + (u_i + e_i)$
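The consequence of the composite error term (u + e) can be illustrated by simulation. The standard result is that OLS stays unbiased but its variance increases, provided the measurement error e is uncorrelated with X and with u. The sketch below assumes exactly that, and all parameter values are hypothetical.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

ALPHA, BETA = 10.0, 0.8       # assumed true parameters
SIGMA_U, SIGMA_E = 2.0, 4.0   # assumed sd of u (equation error) and e (measurement error)

rng = random.Random(42)
xs = list(range(1, 51))       # X is measured without error
slopes_true, slopes_observed = [], []
for _ in range(500):          # 500 repeated samples
    y_star = [ALPHA + BETA * x + rng.gauss(0, SIGMA_U) for x in xs]
    y_obs = [y + rng.gauss(0, SIGMA_E) for y in y_star]   # Y = Y* + e
    slopes_true.append(ols_slope(xs, y_star))
    slopes_observed.append(ols_slope(xs, y_obs))

mean_true = statistics.fmean(slopes_true)
mean_obs = statistics.fmean(slopes_observed)
var_true = statistics.variance(slopes_true)
var_obs = statistics.variance(slopes_observed)

print(f"mean slope using Y*: {mean_true:.4f}, variance {var_true:.6f}")
print(f"mean slope using Y : {mean_obs:.4f}, variance {var_obs:.6f}")
```

Both averages stay close to the true slope, while the spread of the estimates is clearly larger when the mismeasured Y is used. Standard errors and confidence intervals are therefore wider.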

Independent Variable

There could be measurement error in explanatory variables. Let us assume the true regression model to be estimated is

$Y_i = \alpha + \beta X_i^* + u_i$ … (i)

Suppose we do not have data on the variable $X_i^*$. On the other hand, suppose we have data on $X_i$. In that case, instead of observing $X_i^*$, we observe

$X_i = X_i^* + w_i$ … (ii)

where $w_i$ represents error of measurement in $X_i^*$.

In the permanent income hypothesis model, for example,

$Y_i = \alpha + \beta X_i^* + u_i$

where $Y_i$ is current consumption expenditure,

$X_i^*$ is permanent income, and

$u_i$ is the stochastic disturbance term (equation error).

From equations (i) and (ii) we find that

$Y_i = \alpha + \beta (X_i - w_i) + u_i$

$= \alpha + \beta X_i + (u_i - \beta w_i)$

$= \alpha + \beta X_i + z_i$

where $z_i = u_i - \beta w_i$.
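Here the composite error z contains the measurement error w, and so does the observed X. The regressor is therefore correlated with the error term, and OLS is biased and inconsistent. Under the classical errors-in-variables assumptions (w independent of the true X and of u), the standard result is that the slope is pulled towards zero by the factor var(X*) / (var(X*) + var(w)). The sketch below checks this numerically; every parameter value in it is an assumption chosen for illustration.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

ALPHA, BETA = 10.0, 0.8                   # assumed true parameters
SD_XSTAR, SD_W, SD_U = 10.0, 5.0, 2.0     # assumed standard deviations

rng = random.Random(7)
n = 20000
x_star = [rng.gauss(50, SD_XSTAR) for _ in range(n)]           # true regressor X*
y = [ALPHA + BETA * xt + rng.gauss(0, SD_U) for xt in x_star]  # Y depends on X*
x_obs = [xt + rng.gauss(0, SD_W) for xt in x_star]             # X = X* + w

slope_true_x = ols_slope(x_star, y)   # regression on the true regressor
slope_obs_x = ols_slope(x_obs, y)     # regression on the mismeasured regressor

attenuation = SD_XSTAR ** 2 / (SD_XSTAR ** 2 + SD_W ** 2)
expected_slope = BETA * attenuation

print(f"slope using X* : {slope_true_x:.3f}")
print(f"slope using X  : {slope_obs_x:.3f}")
print(f"beta * attenuation factor = {expected_slope:.3f}")
```

Unlike measurement error in the dependent variable, the bias here does not disappear as the sample grows.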

Section II: Medium answer questions (word limit - 250 words). Each question carries 10 marks. Word limit does not apply in the case of numerical questions. 3 × 10 = 30

3. Differentiate between Chi-square distribution and t-distribution.

Ans) A chi-squared test checks whether there is a relationship between two categorical variables, whereas a t-test checks whether the means of two sets of data differ significantly from one another. In both cases a null hypothesis is tested. The null hypothesis states that there is no relationship between the variables, or no difference between the groups.

A t-test can be a one-sample or a two-sample test. When the observations are independent and normally distributed, a one-sample t-test is used to test the null hypothesis that the population mean equals a specified value. When two sets of data are gathered, the two-sample t-test tests the null hypothesis that the two population means are equal. For the results to be reliable, the two samples should be independent of each other; they need not be of the same size.

Before performing a chi-squared test, the data must first be classified into categories. The test then compares the observed frequencies in the categories with the frequencies expected under the null hypothesis, in order to determine whether there is an association. A t-test or chi-squared test can be carried out with a computer programme or a graphing calculator. Even though these tools perform the calculations, accurate results depend on supplying the right information.

To determine whether the frequency distribution of a categorical variable deviates from your expectations, use the chi-square goodness of fit test. In order to determine whether two categorical variables are related to one another, the chi-square test of independence is used.
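The tests described above rest on two different probability distributions. As a supplement to the answer, their standard textbook definitions are sketched below, where k denotes the degrees of freedom and Z, Z_1, ..., Z_k are independent standard normal variables (notation introduced here for illustration).

```latex
% Chi-square distribution with k degrees of freedom:
% the sum of squares of k independent standard normal variables
\chi^2_k = \sum_{j=1}^{k} Z_j^2, \qquad
E(\chi^2_k) = k, \qquad \operatorname{Var}(\chi^2_k) = 2k

% t-distribution with k degrees of freedom:
% a standard normal variable divided by the square root of an
% independent chi-square variable scaled by its degrees of freedom
t_k = \frac{Z}{\sqrt{\chi^2_k / k}}, \qquad
E(t_k) = 0 \ (k > 1), \qquad \operatorname{Var}(t_k) = \frac{k}{k-2} \ (k > 2)
```

The chi-square distribution takes only non-negative values and is skewed to the right, whereas the t-distribution is symmetric about zero and approaches the standard normal distribution as k increases.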

4. What is an estimator? Explain all the properties of an estimator with reference to BLUE.

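Ans) An estimator is a rule or formula that tells us how to compute an estimate of a population parameter from sample data. An estimator is considered the best linear unbiased estimator (BLUE) if it is linear, unbiased and efficient (it has minimum variance). A further desirable property is consistency, which implies that the value of the estimator converges to the true population value as the sample size increases.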

Unbiasedness

The value of a statistic varies from sample to sample because of sampling fluctuation. Individual values of the statistic may differ from the unknown population parameter, but its average value should equal the population parameter. In other words, the sampling distribution of the estimator (for example, the sample mean x̄) should be centred on the parameter (the population mean μx). This property is referred to as unbiasedness. It means that although a single estimate may be greater or smaller than the unknown value of the population parameter, the estimator is not biased toward values that are systematically higher or lower than the population parameter.

Minimum Variance

An estimator of μx is said to be the minimum variance estimator if its variance is lower than that of any other estimator of μx. For example, if there are three estimators of μx and the third has the smallest variance of the three, then the third is the minimum variance estimator.

Best Linear Unbiased Estimator

Suppose we consider a class of estimators. Among these estimators, an estimator fulfils three properties, viz., (i) it is linear, (ii) it is unbiased, and (iii) it has minimum variance. In that case, it is called a ‘best linear unbiased estimator’ (BLUE).

Consistency

Consistency is a large-sample property. As the sample size increases, the estimator should get closer to the true parameter value.
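Unbiasedness and consistency can both be seen in a small simulation of the sample mean. The sketch below assumes a normal population with mean 50 and standard deviation 10 (hypothetical values) and draws 400 samples at each of three sample sizes.

```python
import random
import statistics

MU, SIGMA = 50.0, 10.0     # assumed population mean and standard deviation
rng = random.Random(0)

def sample_mean(n):
    """Mean of one random sample of size n from the assumed population."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))

results = {}
for n in (10, 100, 1000):
    means = [sample_mean(n) for _ in range(400)]   # 400 repeated samples
    # (average of the estimates, spread of the estimates)
    results[n] = (statistics.fmean(means), statistics.pstdev(means))

for n, (avg, spread) in results.items():
    print(f"n = {n:4d}: average of sample means = {avg:.3f}, spread = {spread:.3f}")
```

At every sample size the estimates average out close to the population mean (unbiasedness), and their spread shrinks as n grows (consistency).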

5. Two variable regression model could have three functional forms as given below:

𝑌i = 𝛽1 + 𝛽2𝑋i + 𝑢i

𝑙𝑛𝑌i = 𝛽1 + 𝛽2𝑋i + 𝑢i

ln𝑌i = 𝛽1 + 𝛽2lnXi + 𝑢i

How will you decide which is the best model for a given econometric problem?

Ans) The choice depends first on the objective of the study. If the goal is to measure the absolute change in the dependent variable for a unit change in the independent variable, Model I (the linear model) can be used. If the goal is to estimate the growth rate of the dependent variable in response to a change in the independent variable, we should choose the semi-log model (Model II). If the goal is to measure the elasticity between the two variables, we select the log-linear model (Model III). The parameter estimates from the three regression models will differ, and so will the standard errors of the estimators. The coefficient of determination, R², will also vary among the three models.

The values of R² obtained from regression models with different dependent variables cannot be compared. We can, however, compare the R² of regression models that have the same dependent variable and are estimated by the same method.

Thus, the R² values of Models I and II cannot be compared, because their dependent variables differ (Yi and ln Yi). Models II and III have the same dependent variable, so we can compare them on goodness of fit. If the coefficient of determination, the statistical significance of the estimators and the diagnostic checks of two regression models are nearly identical, we favour the simpler model, which is easier to interpret and generally preferred. The log-linear regression model has a few benefits: (i) the slope parameter is scale-invariant because it measures percentage changes; (ii) the model gives the elasticity directly; and (iii) the model moderates the heteroscedasticity problem to some extent.
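The different meanings of the slope in the three models can be checked numerically. The sketch below fits each functional form to noise-free data generated from a matching relationship. The three data-generating equations are assumptions chosen for illustration, so the fitted slope recovers the known marginal effect, growth rate or elasticity.

```python
import math

def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

xs = [float(x) for x in range(1, 21)]

# Model I (assumed data): Y = 4 + 2X, so Y changes by 2 units per unit of X
y_linear = [4.0 + 2.0 * x for x in xs]
_, marginal = ols_line(xs, y_linear)

# Model II (assumed data): Y = 100 * exp(0.05 X), so ln Y rises by 0.05 per unit of X
y_growth = [100.0 * math.exp(0.05 * x) for x in xs]
_, growth = ols_line(xs, [math.log(y) for y in y_growth])

# Model III (assumed data): Y = 3 * X^0.5, a constant elasticity of 0.5
y_power = [3.0 * x ** 0.5 for x in xs]
_, elasticity = ols_line([math.log(x) for x in xs], [math.log(y) for y in y_power])

print(f"Model I   slope (unit change in Y per unit of X)   : {marginal:.4f}")
print(f"Model II  slope (proportional change per unit of X): {growth:.4f}")
print(f"Model III slope (elasticity)                       : {elasticity:.4f}")
```

In Model II, multiplying the slope by 100 gives the approximate percentage change in Y for a one-unit change in X. In Model III the slope is the percentage change in Y for a one per cent change in X.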

Section III: Short answer questions (word limit - 100 words). Each question carries 6 marks. Word limit does not apply in the case of numerical questions. 5 ×6 = 30

6. Discuss the remedial measures of multicollinearity.

Ans) Multicollinearity is not always a problem if the objective of the study is to predict the mean value of the dependent variable. If the collinearity between the explanatory variables is expected to persist in the future, the estimated regression function can still be used to forecast the dependent variable Y from the collinear explanatory variables.

The forecast based on the given regression is not very useful, though, if the degree of collinearity between the variables is not as high in another sample. Serious collinearity is also harmful if the goal of the study is not just prediction but also reliable estimation of the individual parameters of the chosen model. This is because multicollinearity causes large standard errors of the estimators, which widens the confidence intervals.
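The effect of collinearity on standard errors can be quantified. In a regression with two explanatory variables whose correlation is r, the variance of each slope estimator is proportional to 1 / (1 − r²). This ratio is commonly called the variance inflation factor (VIF); the term is not used in the answer above and is introduced here only for illustration. A minimal sketch:

```python
def variance_inflation_factor(r):
    """Factor by which the variance of a slope estimator is inflated when
    the two explanatory variables have correlation r (two-regressor case)."""
    return 1.0 / (1.0 - r ** 2)

for r in (0.0, 0.5, 0.9, 0.99):
    vif = variance_inflation_factor(r)
    # the standard error is inflated by the square root of the VIF
    print(f"r = {r:4.2f}: VIF = {vif:7.2f}, standard error multiplied by {vif ** 0.5:.2f}")
```

With r = 0.99 the standard error is roughly seven times what it would be with uncorrelated regressors, which is why the confidence intervals become so wide.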

7. What do we mean by Normal Distribution? Explain with the help of a diagram.

Ans) The normal distribution is a continuous probability distribution; its standardised form, with mean 0 and standard deviation 1, is called the z-distribution. It is very useful because of the Central Limit Theorem, which implies that, under mild conditions, averages of samples of independently drawn observations converge in distribution to the normal. That is, they become approximately normally distributed when the number of observations is sufficiently large. The normal distribution is also called the bell curve.

Some of the important properties of normal distribution are:

1. The normal distribution curve is bell-shaped.

2. The normal curve is symmetrical about the mean μ.

3. The total area under the curve is equal to 1.

4. The curve is completely described by its mean and standard deviation.
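The properties listed above can be verified numerically with Python's standard `statistics.NormalDist` class. The mean of 100 and standard deviation of 15 used below are arbitrary illustrative values.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)   # assumed mean and standard deviation

# Symmetry about the mean: equal density at equal distances on either side
symmetric = abs(dist.pdf(100 - 10) - dist.pdf(100 + 10)) < 1e-12

# Area under the curve within 1, 2 and 3 standard deviations of the mean
within_1sd = dist.cdf(115) - dist.cdf(85)
within_2sd = dist.cdf(130) - dist.cdf(70)
within_3sd = dist.cdf(145) - dist.cdf(55)

print("symmetric about the mean:", symmetric)
print(f"area within 1 sd: {within_1sd:.4f}")
print(f"area within 2 sd: {within_2sd:.4f}")
print(f"area within 3 sd: {within_3sd:.4f}")
```

The same three areas are obtained for any choice of mean and standard deviation, which reflects the fact that these two parameters describe the curve completely.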

8. There are two types of estimation of parameters: Point Estimation and Interval Estimation. Explain the interval estimation method briefly.

Ans) Assume for the moment that the random variable X has a normal distribution. As you are aware, a normal distribution is described by two parameters, the mean and the standard deviation. Because we have data only for a sample and not for the entire population, we must estimate the mean E(X) = μx and the variance σx² on the basis of the sample.

When we estimate a parameter by a single value, typically the corresponding sample statistic, we are using point estimation. The point estimate may not be realistic because the parameter value is unlikely to match it exactly. An alternative procedure is to give an interval that contains the parameter with a certain probability. Here, we specify lower and upper limits within which the parameter value is likely to lie, and we state the probability that the interval contains the parameter. The interval is called the "confidence interval" and the probability is called the "confidence level" or "confidence coefficient."
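As a worked illustration, the sketch below computes a confidence interval for a population mean in the simplest case, where the population standard deviation is assumed to be known, so that the interval is the sample mean plus or minus z times σ/√n. The function name and the numbers (sample mean 50, σ = 12, n = 36) are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for a 95% level
    margin = z * sigma / math.sqrt(n)               # z times the standard error
    return sample_mean - margin, sample_mean + margin

lower, upper = z_interval(sample_mean=50.0, sigma=12.0, n=36, level=0.95)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

Here the standard error is 12/√36 = 2, so the interval extends about 3.92 on either side of the sample mean. When σ is unknown and the sample is small, the z value is replaced by the corresponding t value for n − 1 degrees of freedom.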

9. What are the three methods of estimation? Discuss.

Ans) The three methods of estimation are as follows:

Least Squares: The word "ordinary" in ordinary least squares (OLS) indicates that it is the most straightforward least squares method, and that the approach can be made more elaborate. There are indeed other types of least squares, such as generalised least squares (GLS), two-stage least squares (2SLS) and three-stage least squares (3SLS). So, when reading about the least squares method, pay attention to which method is being discussed.

Maximum Likelihood: The maximum likelihood (ML) method presupposes that the variables follow a specified probability distribution. The normal distribution is the one most frequently used in maximum likelihood estimation. The likelihood function used in the ML approach is derived from the probability distribution function, and the parameter estimates are the values that maximise it.

Moments Method: The method of moments (MOM) uses the moments of a probability distribution, which can be obtained from its moment generating function. The parameters are estimated by equating the sample moments to the corresponding moments of the specified probability distribution.
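The three methods can be compared on the simplest possible problem: estimating the mean and variance of a single variable. The sketch below uses a small made-up sample and the closed-form solution of each method. The ML formulas assume a normal distribution.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
n = len(data)

def sum_sq_dev(m):
    """Sum of squared deviations of the data from a candidate value m."""
    return sum((x - m) ** 2 for x in data)

# Least squares: the value of m that minimises sum_sq_dev(m) is the sample mean
ls_mean = sum(data) / n

# Maximum likelihood (normal distribution assumed): sample mean, variance with divisor n
ml_mean = sum(data) / n
ml_var = sum_sq_dev(ml_mean) / n

# Method of moments: equate the first two sample moments to E(X) and E(X^2)
m1 = sum(data) / n
m2 = sum(x ** 2 for x in data) / n
mom_mean, mom_var = m1, m2 - m1 ** 2

print("least squares mean :", ls_mean)
print("ML mean, variance  :", ml_mean, ml_var)
print("MOM mean, variance :", mom_mean, mom_var)
```

In this case all three methods give the same estimate of the mean, and ML and MOM give the same variance estimate (with divisor n rather than the unbiased n − 1). In more complicated models the methods need not coincide.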

10. Explain the rejection regions for small samples and large samples.

Ans) Rejection Region for Small Samples: If the sample is small and the population standard deviation is known, the z-statistic is used to test the hypothesis. If the population standard deviation is unknown, we use the t-statistic. The same decision criteria apply as in other hypothesis tests. However, in the case of the t-distribution, the probability given by the area under the curve varies with the number of degrees of freedom. Thus, two factors must be kept in mind when determining the critical value of t: the level of significance and the degrees of freedom.

Rejection Region for Large Samples: For large samples, the test statistic is compared with the standard normal distribution. In a two-tail test, the rejection region lies on both sides of the standard normal curve. If the test has only one tail, the rejection region lies on one side of the curve, either the left or the right. The critical value therefore differs for one-tail and two-tail tests. The phrasing of the alternative hypothesis determines whether to use a one-tail or two-tail test.
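The large-sample rejection regions can be sketched with the standard normal distribution from Python's standard library. The function names below are hypothetical. Critical values of t for small samples depend on the degrees of freedom and are usually read from t-tables, so only the z case is shown.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Critical value of z for significance level alpha (tails = 1 or 2)."""
    std_normal = NormalDist()
    if tails == 2:
        return std_normal.inv_cdf(1 - alpha / 2)   # alpha split between both tails
    return std_normal.inv_cdf(1 - alpha)           # all of alpha in one tail

def reject_null(z_stat, alpha=0.05, alternative="two-sided"):
    """Return True if z_stat falls in the rejection region."""
    if alternative == "two-sided":
        return abs(z_stat) > z_critical(alpha, 2)
    if alternative == "greater":                   # right-tail test
        return z_stat > z_critical(alpha, 1)
    if alternative == "less":                      # left-tail test
        return z_stat < -z_critical(alpha, 1)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

print(f"two-tail critical value at 5%: {z_critical(0.05, 2):.3f}")
print(f"one-tail critical value at 5%: {z_critical(0.05, 1):.3f}")
print("z = 1.8, two-tail test  :", reject_null(1.8))
print("z = 1.8, right-tail test:", reject_null(1.8, alternative="greater"))
```

The same test statistic of 1.8 leads to different decisions in the two cases, which shows why the wording of the alternative hypothesis matters.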
