
# MPC-006 Solved Assignment Solution by Gyaniversity

Assignment Solution

Assignment Code: MPC-006/AST/TMA/2022-23

Course Code: MPC-006

Assignment Name: Statistics in Psychology

Year: 2022-2023

Verification Status: Verified by Professor

Marks: 100

NOTE: All Questions Are Compulsory.

The answers are to be written in your own words. Do not copy from the course material or any other source.

### SECTION A

Answer the following questions in about 1000 words each (wherever applicable). 15x3=45 Marks

1. Explain the meaning of descriptive statistics and describe organisation of data.

Ans) Descriptive statistics is the branch of statistics that describes and summarises the data that have been collected. On the basis of these descriptions, a group of people in the population can be characterised by their shared traits. Classification, tabulation, diagrammatic and graphical presentation of data, and measures of central tendency and variability are all part of descriptive statistics. These measures tell researchers how the scores cluster around a central value and how widely they spread, which makes the phenomenon easier to describe. Parameters of the distribution are single estimates that summarise a whole set of data and thereby characterise the distribution.

There are four main ways in which statistics organises data:

Classification

Classification is the process of arranging things into groups according to their similarities. A classification summarises how often each score, or range of scores, of a variable occurs. In its simplest form, a distribution lists each value of a variable together with the number of persons who obtained that value. Once the data are collected, they should be organised in a way that allows conclusions to be drawn, so by classifying the data the investigators take a step toward a decision. When the raw data are arranged as a frequency distribution, the information in the scores becomes much clearer. A frequency distribution shows the number of cases falling within each class interval or range of scores; it is a table listing each score and the number of individuals in a group who obtained it.
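The idea of a simple frequency distribution can be sketched in a few lines of Python; the scores below are hypothetical, not taken from the assignment:

```python
from collections import Counter

# Hypothetical raw test scores for 20 students (illustrative only).
scores = [12, 15, 12, 18, 15, 12, 20, 18, 15, 12,
          18, 20, 15, 12, 18, 15, 20, 12, 18, 15]

# A simple frequency distribution: each distinct score and the
# number of students who obtained it.
freq = Counter(scores)
for score in sorted(freq):
    print(score, freq[score])
```

Each printed row corresponds to one line of the frequency table described above.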

Tabulation

Frequency distribution can be shown in a table or a graph. Tabulation is the process of putting data that has already been sorted into a table. When data is put into tables, it is easier to understand and can be used for more statistical analysis. A table is a set of rows and columns with headings and subheadings that organises data in a way that makes sense. A table is made up of these main parts:

1. Table number: If an analysis contains more than one table, each table should be numbered so that it can be identified and referred to again. The number should be written centred at the top of the table.

2. Title: Every table should have a suitable title describing its contents. The title should be simple, short, and easy to understand, and should be placed centred at the top of the table, just below or after the table number.

3. Captions: Captions are brief, self-explanatory headings for the columns. A caption may include headings and subheadings, which should be centred over their columns.

4. Stub: Brief, self-explanatory row headings are called "stubs."

5. Body of the table: This is the table proper. It contains the cells holding the numerical data, arranged in accordance with the captions and stubs.

6. Head note: This is written to the right of the title and states the units in which the measurements in the body of the table are expressed.

7. Footnote: This is a qualifying statement written below the table, explaining aspects of the data that the title, captions, and stubs do not cover.

8. Source of data: At the end of the table, the source from which the data were taken should be stated.

Graphical Presentation of Data

The goal of preparing a frequency distribution is to provide a systematic way of "looking at" and understanding data. To aid this understanding, the information in a frequency distribution is often presented pictorially. When frequencies are plotted on a figure built from horizontal and vertical axes, the result is called a graph.

A graph is constructed from two lines drawn at right angles to each other, called the X-axis and the Y-axis, each carrying a scale. The horizontal line is called the abscissa and the vertical line the ordinate. There are many kinds of graphs, just as there are many kinds of frequency distributions, and these graphs help the reader grasp the data more easily.

The histogram is one of the most common ways to graph a continuous frequency distribution, in which the upper limit of one class interval is the lower limit of the next. The histogram consists of a series of rectangles: the width of each rectangle corresponds to the class interval of the variable on the horizontal axis, and its height corresponds to the frequency of that interval on the vertical axis.
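Grouping scores into the class intervals that a histogram plots can be sketched as follows; the scores and the interval width of 10 are hypothetical:

```python
# Hypothetical scores grouped into class intervals of width 10,
# as one would do before drawing a histogram (illustrative only).
scores = [42, 55, 47, 61, 38, 52, 58, 45, 63, 49, 51, 57]

width = 10
bins = {}
for s in scores:
    lower = (s // width) * width          # lower limit of the interval
    bins[lower] = bins.get(lower, 0) + 1  # frequency for that interval

for lower in sorted(bins):
    print(f"{lower}-{lower + width - 1}: {bins[lower]}")
```

The resulting interval-frequency pairs are exactly the rectangle positions and heights a histogram would draw.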

Frequency polygon: Draw the horizontal axis OX (the abscissa) and the vertical axis OY (the ordinate). Mark the exact limits or midpoints of the class intervals along the abscissa; an extra interval with zero frequency may be added at each end of the range. The scale chosen for the small squares on the graph paper depends on the number of classes to be plotted. Next, choose a convenient scale for the frequencies on the ordinate, based on the range of the whole distribution. To draw the frequency polygon, plot each frequency at the height of its own ordinate against the midpoint of its class interval, and then join the plotted points with straight lines.

A frequency curve is a smooth curve that is drawn by hand through a frequency polygon. The goal of smoothing the frequency polygon is to get rid of as many of the random or unpredictable changes in the data as possible.

Diagrammatic Presentation of Data

Statistics can be shown visually in the form of a diagram. They show the information in a way that is easy to understand. Diagrammatic presentation is only used to show the data visually. Graphic presentation, on the other hand, can be used to analyse the data further.

Bar diagram: Bar diagram is most useful for categorical data. A thick line is what a bar is. From the frequency distribution table, a bar diagram is made that shows the variable on the left and the frequency on the right. The height of each bar will depend on how often or how much the variable changes.

Sub-divided bar diagram: A sub-divided bar diagram can be used to study how the different parts of a phenomenon relate to the whole. The bar is split and shaded to show each subcategory of the data, with as many shades as there are subparts in the group. Each subclass's share of the total is shown by the portion of the bar it occupies.

Multiple bar diagram: This type of diagram is used to compare two or more sets of phenomena or variables that are linked. A set of bars for people, places, or things that are related are drawn next to each other with no space between them. Different colours or shades are used to tell the difference between the bars in a set.

Pie diagram: This can also be called an angular diagram. A pie chart or diagram is a circle divided into sectors that show the frequency of each category in the distribution; the size of each sector is proportional to how often that category occurs. A circle represents 360°, so the angles are apportioned by percentage: the angle for a component = (component value ÷ total of all components) × 360°.
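The sector-angle rule for a pie diagram can be sketched as follows; the category labels and counts are hypothetical:

```python
# Converting category frequencies to pie-chart sector angles:
# degrees = (component value / total) * 360.
# The labels and counts below are hypothetical.
data = {"Agree": 45, "Neutral": 30, "Disagree": 15}

total = sum(data.values())  # 90
angles = {label: value / total * 360 for label, value in data.items()}

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles always sum to 360°, one full circle.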

2. Explain the concept of normal curve with help of a diagram. Explain the characteristics of normal probability curve.

Ans) The normal curve shows the shape of an important family of probability distributions. It is used to describe complex systems shaped by many random influences, and a normal distribution has been found to hold for a great many natural phenomena. Human characteristics such as height, weight, intelligence, and even social skills can be said to be normally distributed. If we draw a frequency polygon for such a distribution, we obtain a curve like the one in the figure below:

The curve in the figure above has the shape of a "Bell" and is the same on both sides.

If you figure out the values of Mean, Median, and Mode, you will find that they are all about the same (M = 52, Md = 52, and Mo = 52).

Technically, this bell-shaped curve is called a Normal Probability Curve or just a Normal Curve. Its corresponding frequency distribution of scores, in which the Mean, Median, and Mode all have the same value, is called a Normal Distribution.

Many variables in the physical (e.g., height, weight, temperature), biological (e.g., age, longevity, blood sugar level), and behavioural (e.g., intelligence, achievement, adjustment, anxiety, socioeconomic status) sciences are normally distributed in nature. The normal curve is therefore very important for psychological measurement, and the Normal Probability Curve, or "Normal Curve" for short, is used as a reference curve. Its unit of measurement is σ (sigma).

Theoretical Base of the Normal Probability Curve

The normal probability curve is based on the law of probability, which was found by the French mathematician Abraham de Moivre when he looked at different games of chance. In the eighteenth century, he came up with both a mathematical equation for it and a picture of it.

The normal curve, which shows the law of probability, is based on the law of chance, or how likely it is that certain things will happen. When a set of observations fits this mathematical shape, it can be shown as a bell-shaped curve with certain features.

The characteristics of normal probability curve:

1. The Normal Curve is the same on both sides: The normal probability curve is the same on both sides of its vertical ordinate. The fact that the ordinate in the middle of the curve is symmetrical means that the size, shape, and slope of the curve on each side of the curve are the same. In other words, the left and right halves are mirror images of each other up to the middle point.

2. The Normal Curve is Unimodal: The normal probability curve is unimodal, meaning it has only one mode, because the curve has only one maximum point.

3. The Maximum Ordinate Occurs at the Centre: The ordinate is always tallest at the midpoint of the curve. On the unit normal curve its height is 0.3989.

4. The Normal Curve is Asymptotic to the X-Axis: The normal probability curve approaches the horizontal axis asymptotically. As you move away from the central point (the point of maximum ordinate), the height of the curve keeps decreasing at both ends but never touches the horizontal axis, so its tails extend from −∞ (negative infinity) to +∞ (positive infinity).

5. The Height of the Curve Goes Down in Both Directions: The height of a normal probability curve goes down in both directions from its highest point.

6. The Points of Inflection Occur at ±1 Standard Deviation (±1σ): At a point of inflection the normal curve changes its direction of bending. If we drop perpendiculars from these two points of inflection to the horizontal X-axis, they meet it one standard deviation unit above and below the mean (the central point).

7. The Proportion of the Normal Curve's Area Between the Two Points of Inflection is Constant: About 68.26% of the curve's area lies within ±1 standard deviation (±1σ) of the mean.

8. The total area under the normal curve can also be thought of as the probability of 100%: In terms of standard deviations, the total area under the normal curve can be thought of as being close to 100% probability.

9. The Normal Curve goes in two directions: Half of the curve's area is on the left side of the central ordinate that is the largest, and the other half is on the right side. So, the curve has two sides.

10. The Normal Curve is a Mathematical Model Used in the Behavioural Sciences, Especially in Mental Measurement: The curve serves as a measuring scale whose unit of measurement is σ (the unit standard deviation).
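The 68.26% figure in point 7 can be checked numerically: the area of the unit normal curve between z = −1 and z = +1 equals erf(1/√2), available in Python's standard library.

```python
import math

# Proportion of the normal curve between z = -1 and z = +1,
# computed from the error function: P = erf(1 / sqrt(2)).
area_1sd = math.erf(1 / math.sqrt(2))
print(round(area_1sd * 100, 2))  # ≈ 68.27
```

The small difference from the commonly quoted 68.26% is just rounding.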

3. The scores obtained by four groups of employees on occupational stress are given below. Compute ANOVA for the same.

(2) Rejection Region

Based on the information provided, the significance level is alpha = 0.05, and the degrees of freedom are df1 = 3 and df2 = 3 , therefore, the rejection region for this F-test is R = {F: F > 2.866}

(4) Decision about the null hypothesis

Since from the sample information we get that F = 3.054 > Fc = 2.866, it is concluded that the null hypothesis is rejected.

(5) Conclusion

It is concluded that the null hypothesis Ho is rejected. Therefore, there is enough evidence to claim that not all 4 population means are equal, at the alpha = 0.05 significance level.
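Since the score table for this question is not reproduced here, the one-way ANOVA computation can only be sketched on hypothetical data for four groups; the steps (sums of squares, degrees of freedom, F ratio) are the standard ones:

```python
# One-way ANOVA computed step by step on hypothetical data
# (the assignment's actual score table is not reproduced here).
groups = [
    [23, 25, 21, 27],
    [30, 28, 32, 26],
    [20, 22, 19, 23],
    [25, 27, 24, 28],
]

all_scores = [x for g in groups for x in g]
N = len(all_scores)
k = len(groups)
grand_mean = sum(all_scores) / N

# Between-groups and within-groups sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = k - 1   # 3
df_within = N - k    # 12
F = (ss_between / df_between) / (ss_within / df_within)
print(round(F, 3))
```

The computed F is then compared with the critical F for (df_between, df_within) at the chosen alpha, exactly as in the decision step above.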

### SECTION B

Answer the following questions in about 400 words (wherever applicable) each 5x5=25 Marks

4. Discuss the assumptions of parametric and nonparametric statistics.

Ans) Assumptions of Parametric Statistics:

1) Parametric tests, such as the "t" and "F" tests, can be used to analyse data that meet the following conditions:

a) The population from which the sample is drawn should be normally distributed.

b) Frequency distributions that follow a normal curve, which is infinite at both ends, are called normal distributions.

c) The variables must have been measured on a ratio or interval scale.

d) Variables are things that can have different values. There are different kinds of variables.

2) Different kinds of variables

a) Dependent variable: a variable that is regarded as an effect; it is usually the variable that is measured.

b) Independent variable: a variable that is regarded as a cause; it is the variable that is manipulated or selected.

3) The observation needs to stand on its own. Whether or not a case is in the sample shouldn't change the results of the study too much.

4) The variance of these groups must be the same or, in some cases, they must have a known ratio of variance. The word for this is homoscedasticity.

5) The differences between the samples are equal or close to equal. This is called equality or homogeneity of variances, and it's important to figure out when the samples are small.

6) The observations don't depend on each other. The choice of one case in the sample has nothing to do with the choice of any other case.

Assumptions of Non-parametric Statistics:

1) We often can't meet the conditions and meet the assumptions, so we can't use parametric statistical procedures. In this case, we have no choice but to use non-parametric statistics.

2) If our sample is based on a nominal or ordinal scale, the distribution of the sample is not normally distributed, and the sample size is very small, it is always best to use non-parametric tests to compare samples, make inferences, or test the significance or trustworthiness of the computed statistics. In other words, the following situations are good times to use non-parametric tests:

a) Where the sample size isn't very big. If the sample size is only N=5 or N=6, the only other option is to use non-parametric tests.

b) We use non-parametric tests when it's hard to believe that things like the distribution of scores in the population is normal.

3) Non-parametric statistics are used when the data can be measured on ordinal or nominal scales, or when the data can be expressed in the form of ranks, + and – signs, and classifications like "good" and "bad."

4) It is not known if the people in the population from which samples are taken are normal.

5) The variables are expressed in nominal form, i.e., as categories.

6) The data are measures that are ranked or given as numbers with the strength of ranks.

5. Using Spearman’s rank order correlation for the following data:
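The data table for this question is not reproduced here, so the computation can only be sketched on hypothetical paired ranks, using Spearman's formula rho = 1 − 6Σd² / (n(n² − 1)) (no ties assumed):

```python
# Spearman's rank-order correlation on hypothetical paired ranks
# (the question's actual data table is not reproduced here).
rank_x = [1, 2, 3, 4, 5, 6]
rank_y = [2, 1, 4, 3, 6, 5]

n = len(rank_x)
d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho = 1 - (6 * d_squared) / (n * (n ** 2 - 1))
print(round(rho, 3))
```

With real data the raw scores would first be converted to ranks, and tied scores would each receive the mean of the ranks they occupy.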

6. Describe various levels of measurement with suitable examples.

Ans) Levels of measurement, also called scales of measurement, tell you how precisely variables are recorded. In scientific research, a variable is anything that can take on different values across your data set (e.g., height or test scores).

The various levels of measurement are:

1. Nominal: the data can only be categorized

2. Ordinal: the data can be categorized and ranked

3. Interval: the data can be categorized, ranked, and evenly spaced

4. Ratio: the data can be categorized, ranked, evenly spaced, and has a natural zero.

Nominal Level

We can categorize our data by labelling them in mutually exclusive groups, but there is no order between the categories.

Examples:

1. City of birth

2. Gender

3. Ethnicity

4. Car brands

5. Marital status

Ordinal Level

We can categorize and rank our data in an order, but we cannot say anything about the intervals between the rankings. Although we can rank the top 5 Olympic medallists, this scale does not tell us how close or far apart they are in number of wins.

Examples:

1. Top 5 Olympic medallists

2. Language ability (e.g., beginner, intermediate, fluent)

3. Likert-type questions  (e.g., very dissatisfied to very satisfied)

Interval Level

We can categorize, rank, and infer equal intervals between neighbouring data points, but there is no true zero point. The difference between any two adjacent temperatures is the same: one degree. But  zero degrees is defined differently depending on the scale – it doesn’t mean an absolute absence of temperature. The same is true for test scores and personality inventories. A zero on a test is arbitrary; it does not mean that the test-taker has an absolute lack of the trait being measured.

Examples:

1. Test scores (e.g., IQ or exams)

2. Personality inventories

3. Temperature in Fahrenheit or Celsius

Ratio Level

We can categorize, rank, and infer equal intervals between neighbouring data points, and there is a true zero point. A true zero means there is an absence of the variable of interest. In ratio scales, zero does mean an absolute lack of the variable. For example, in the Kelvin temperature scale, there are no negative degrees of temperature – zero means an absolute lack of thermal energy.

Examples:

1. Height

2. Age

3. Weight

4. Temperature in Kelvin

7. Explain Kruskal - Wallis ANOVA test and compare it with ANOVA.

Ans) ANOVA is used to compare more than two groups or k groups. However, since ANOVA is a parametric statistic that assumes normality as a key assumption, we also need to know about its non-parametric counterpart. With the Kruskal-Wallis test, the medians of more than two groups are compared to see if they are all the same. The Kruskal-Wallis test is like ANOVA, but it does not use parameters. It can be thought of as ANOVA with data that has been turned into ranks.

That is, the initial data are converted to their ranks before the ANOVA-like computation is carried out. In other words, it is like ANOVA but is built around medians rather than means, and it can also be thought of as a test of medians.

We can say the following about the null and alternative hypotheses:

1. H0: the population medians are equal

2. H1: the population medians differ

Comparison of ANOVA and Kruskal Wallis ANOVA Test

The Kruskal-Wallis (KW) ANOVA is the same thing as a one-way ANOVA, but it does not use parameters. Since it doesn't assume that things are normal, the KW ANOVA compares the null hypothesis, which says that there is no difference between the medians of three or more groups, to the alternative hypothesis, which says that there is a significant difference between the medians.

The KW ANOVA extends the Wilcoxon-Mann-Whitney (WMW) two-sample test to more than two groups, and it carries over the same assumptions:

1. The spreads (dispersions) of the groups are the same.

2. The distributions of the groups have the same shape.

ANOVA compares the means of the different groups to see how similar they are, whereas KW ANOVA compares their medians. ANOVA works on the data themselves, while KW ANOVA converts the data into ranks before computing. Kruskal-Wallis Analysis of Variance by Ranks is another name for the Kruskal-Wallis test.

Let's look at the Example to see how their calculations are different:

We gave a task to three groups, 1, 2, and 3, and we want to see if they are the same or not.

Kruskal Wallis H test:

H = [12 / (18(18+1))] × [(29.5²/6) + (68²/7) + (73.5²/5)] − 3(18+1)

H= 66.177 – 57 = 9.177

The critical chi-square value for df = 2 (k − 1 = 3 − 1) at α = 0.05 is 5.99. Since H = 9.177 > 5.99, reject H0.

In both cases, ANOVA or Kruskal-Wallis ANOVA, we reject the null hypothesis and conclude that the three groups differ.

F ratio as a function of H:

Fisher's F (the F ratio from a one-way ANOVA) is related to the Kruskal-Wallis H: the H test, also called the Kruskal-Wallis ANOVA or ANOVA by rank order, can be converted to an equivalent F. This rank-transform relation is given in the book by Iman and Conover.

The rank-transform statistic states:

F = [{(k − 1)/(N − k)} × {((N − 1)/H) − 1}]⁻¹

Substituting the values from the example above (N = 18, k = 3, H = 9.177; the raw-score ANOVA gave F = 8.213):

F = [{(3 − 1)/(18 − 3)} × {((18 − 1)/9.177) − 1}]⁻¹

F = [(2/15) × {(17/9.177) − 1}]⁻¹

F = [0.1333 × (1.8525 − 1)]⁻¹ = [0.1137]⁻¹

F ≈ 8.80

This is the F that a one-way ANOVA produces when run on the ranks themselves; it is close to, but need not exactly equal, the F obtained from the raw scores.
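The H statistic can be verified directly from the rank sums given in the example, together with the rank-transform F implied by the Iman-Conover relation:

```python
# Kruskal-Wallis H from the rank sums in the example above
# (rank sums 29.5, 68, 73.5 for groups of size 6, 7, 5),
# followed by the equivalent rank-transform F.
rank_sums = [29.5, 68.0, 73.5]
sizes = [6, 7, 5]
N = sum(sizes)   # 18
k = len(sizes)   # 3

H = (12 / (N * (N + 1))) * sum(r ** 2 / n for r, n in zip(rank_sums, sizes)) - 3 * (N + 1)

# Rank-transform F, algebraically equivalent to the bracketed formula above.
F = ((N - k) / (k - 1)) * H / (N - 1 - H)

print(round(H, 3), round(F, 3))
```

H matches the hand computation (≈ 9.18), and F is the value a one-way ANOVA on the ranks would produce.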

8. Compute Chi-square for the following data:

Ans) Actual (observed) values:

23, 22

12, 18

Expected Values:

21, 24

14, 16

Chi-square contributions, (O − E)²/E:

0.190476, 0.166667

0.285714, 0.25

Chi-Square = 0.892857

Degrees of Freedom = 1

p = 0.344704
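These figures can be reproduced with a short calculation; for df = 1 the p-value equals erfc(√(χ²/2)), which is available in the standard library:

```python
import math

# Chi-square for the 2x2 table above: sum of (O - E)^2 / E,
# with the p-value for df = 1 obtained from the error function.
observed = [23, 22, 12, 18]
expected = [21, 24, 14, 16]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p = math.erfc(math.sqrt(chi_sq / 2))  # chi-square survival function, df = 1

print(round(chi_sq, 6), round(p, 6))
```

Since p > 0.05, the observed frequencies do not differ significantly from the expected ones.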

### SECTION C

9. Type I and type II errors.

Ans) A type I error (false-positive) occurs if an investigator rejects a null hypothesis that is actually true in the population; a type II error (false-negative) occurs if the investigator fails to reject a null hypothesis that is actually false in the population. Although type I and type II errors can never be avoided entirely, the investigator can reduce their likelihood by increasing the sample size (the larger the sample, the lesser is the likelihood that it will differ substantially from the population).

10. Skewness and kurtosis.

Ans) Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.
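Moment-based skewness and excess kurtosis can be sketched on a small hypothetical data set (values near 0 indicate a roughly normal shape):

```python
import statistics

# Moment-based skewness and excess kurtosis for a small
# hypothetical data set (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = statistics.fmean(data)
sd = statistics.pstdev(data)  # population standard deviation

skewness = sum((x - mean) ** 3 for x in data) / (n * sd ** 3)
kurtosis = sum((x - mean) ** 4 for x in data) / (n * sd ** 4) - 3  # excess

print(round(skewness, 3), round(kurtosis, 3))
```

Here the positive skewness reflects the long right tail (the score 9), and the negative excess kurtosis indicates slightly lighter tails than a normal distribution.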

11. Point and interval estimations.

Ans) A point estimate is a single value estimate of a parameter. For instance, a sample mean is a point estimate of a population mean. An interval estimate gives you a range of values where the parameter is expected to lie. A confidence interval is the most common type of interval estimate.
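A minimal sketch of a point estimate and a 95% confidence interval around it, using the normal critical value z = 1.96 on a hypothetical sample:

```python
import statistics

# Point estimate (sample mean) and a 95% confidence interval
# around it; the sample values are hypothetical.
sample = [48, 52, 50, 47, 53, 51, 49, 50]

mean = statistics.fmean(sample)                      # point estimate
se = statistics.stdev(sample) / len(sample) ** 0.5   # standard error
lower, upper = mean - 1.96 * se, mean + 1.96 * se    # interval estimate

print(round(mean, 2), round(lower, 2), round(upper, 2))
```

For small samples a t critical value would normally replace 1.96; the z value is used here only to keep the sketch simple.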

12. Null hypothesis

Ans) A null hypothesis is a type of statistical hypothesis that proposes that no statistical significance exists in a set of given observations. Hypothesis testing is used to assess the credibility of a hypothesis by using sample data. Sometimes referred to simply as the "null," it is represented as H0.

13. Scatter diagram

Ans) A scatter diagram is a graphical representation of a set of data in which the values of pairs of variables are plotted on a coordinate system. The tool is widely used in statistics and other fields of science and engineering to represent relationships in data. The scatter diagram plots pairs of numerical data, with one variable on each axis, to look for a relationship between them. If the variables are correlated, the points will fall along a line or curve, and the better the correlation, the more tightly the points will hug that line. The scatter diagram is one of the seven basic quality tools used in root cause analysis.

14. Outliers

Ans) Outliers are extreme scores on one or both of the variables. The presence of outliers has a distorting impact on the correlation value: both the strength and the degree of the correlation are affected. Suppose you want to compute the correlation between height and weight, which are known to correlate positively. Look at the figure below: one of the cases has a low score on weight and a high score on height (probably an anorexia patient).
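The distorting effect of a single outlier can be demonstrated by computing Pearson's r with and without one extreme case; the height-weight pairs below are hypothetical:

```python
# Pearson's r computed with and without a single outlier,
# on hypothetical height-weight data.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

heights = [150, 155, 160, 165, 170, 185]
weights = [50, 54, 58, 62, 66, 40]   # last pair: very tall but very light

r_with = pearson_r(heights, weights)
r_without = pearson_r(heights[:-1], weights[:-1])
print(round(r_with, 3), round(r_without, 3))
```

Without the outlier the five remaining pairs are perfectly linear (r = 1.0); the single extreme case flips the computed correlation to a negative value.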

15. Biserial correlation

Ans) The biserial correlation coefficient (rb) is a measure of correlation, similar to the point-biserial correlation. The point-biserial correlation, however, is computed when one of the variables is a true dichotomy with no underlying continuity. If a variable has underlying continuity but is measured dichotomously, the biserial correlation should be calculated instead.

16. Variance

Ans) The term variance refers to a statistical measurement of the spread between numbers in a data set. More specifically, variance measures how far each number in the set is from the mean (average), and thus from every other number in the set. Variance is often depicted by this symbol: σ2. It is used by both analysts and traders to determine volatility and market security. The square root of the variance is the standard deviation (SD or σ), which helps determine the consistency of an investment’s returns over a period of time.

17. Interactional effect

Ans) Interaction effects are the simultaneous effects of two or more variables on the process output or response. An interaction occurs when the effect of one independent variable changes depending on the level of another independent variable. Put another way, the effect of one independent variable is not the same at all levels of the other independent variable.

18. Wilcoxon matched pair signed rank test.

Ans) The Wilcoxon matched-pairs signed-rank test is a nonparametric method to compare before-after, or matched subjects. It is sometimes called simply the Wilcoxon matched-pairs test. The Wilcoxon signed rank test is a nonparametric test that compares the median of a set of numbers against a hypothetical median.
