
Gaussian normal distribution

 

Preliminary remark:

The term normal distribution, also called Gaussian distribution, is commonly understood to be the bell-shaped curve.

Strictly speaking, however, the bell curve is the (probability) density function of the normal distribution.

The normal distribution itself is the integral of the density function, i.e. an S-shaped curve.

The same applies analogously to all distribution functions.

In this section, as an exception, the common term "normal distribution" is also used for the density function.
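The distinction between density function and distribution function can be made concrete with a small sketch. The function names, the integration range of 10 standard deviations and the step count are assumptions chosen for illustration; the symbol s stands for the standard deviation.

```python
import math

def normal_pdf(x, mu=0.0, s=1.0):
    """Density function f(x): the bell-shaped curve."""
    return math.exp(-((x - mu) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def normal_cdf_numeric(x, mu=0.0, s=1.0, steps=20000):
    """Distribution function F(x): the integral of f up to x.

    Approximated with the trapezoidal rule, starting at mu - 10*s,
    below which the density is negligible.
    """
    lower = mu - 10 * s
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mu, s) + normal_pdf(x, mu, s))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h, mu, s)
    return total * h

# The density rises and falls (bell), the distribution function only rises (S-shape).
for x in (-2, -1, 0, 1, 2):
    print(x, round(normal_pdf(x), 4), round(normal_cdf_numeric(x), 4))
```

The printed density values are symmetric around 0, while the integrated values increase monotonically from near 0 to near 1.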

 

The normal distribution is the most important distribution function in statistics.

Justifications follow at the end of this section.

The normal distribution has two parameters, µ (mean) and s (standard deviation, often written σ).

 

The normal distribution describes "additive noise":

If infinitely many small influences act additively on a random variable, then this variable is normally distributed.

(See also the law of propagation of errors)
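The additive-noise idea can be tried out with a small simulation. This is only a sketch: the number of influences (50), their uniform shape and the seed are arbitrary assumptions, and a finite sum is only approximately normal.

```python
import math
import random

def additive_noise_sample(n_influences, rng):
    """One observation: the sum of many small, independent disturbances,
    each uniform on [-0.5, 0.5] (an arbitrary choice for illustration)."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n_influences))

def share_within_one_sd(n_samples=20000, n_influences=50, seed=1):
    """Fraction of simulated sums that lie within one standard deviation of 0."""
    rng = random.Random(seed)
    # A uniform variable on [-0.5, 0.5] has variance 1/12; variances add up.
    sd = math.sqrt(n_influences / 12.0)
    hits = sum(abs(additive_noise_sample(n_influences, rng)) <= sd
               for _ in range(n_samples))
    return hits / n_samples

# For a normal distribution about 68.3 % of the values lie within mu +/- s.
print(share_within_one_sd())
```

Although each single disturbance is uniformly distributed, the share of sums within one standard deviation comes out close to the value expected for a normal distribution.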

 

Tests for normality include, among others (the effort required increases down the list):

The standardized normal distribution has the mean 0 and the standard deviation 1.
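Any normally distributed variable can be converted to the standardized form by subtracting the mean and dividing by the standard deviation. In the following relation, the symbols X, Z and Φ (the distribution function of the standardized normal distribution) are notation introduced here for illustration:

```latex
Z = \frac{X - \mu}{s}, \qquad F(x) = \Phi\!\left(\frac{x - \mu}{s}\right)
```

This is why tables of the standardized normal distribution are sufficient for every choice of µ and s.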

The general (non-standardized) form of the density function is

$$f(x) = \frac{1}{\sqrt{2\pi s^2}} \exp\left(-\frac{(x-\mu)^2}{2s^2}\right)$$

µ: expected value (mean); s: standard deviation

The prefactor $(2\pi s^2)^{-1/2}$ ensures that the area under the curve (the integral of the density function over all x, i.e. the limiting value of the distribution function) equals 1. This is called normalization.

This is necessary because the total probability must be exactly 1; no probability can be greater than 1.
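That the prefactor really produces an area of 1 can be checked with the substitution t = (x − µ)/(s√2) and the known Gaussian integral. The substitution variable t is introduced here only for this sketch:

```latex
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi s^2}}
  \exp\!\left(-\frac{(x-\mu)^2}{2s^2}\right) dx
= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-t^2}\, dt
= \frac{\sqrt{\pi}}{\sqrt{\pi}}
= 1,
\qquad t = \frac{x-\mu}{s\sqrt{2}}
```

The result does not depend on µ or s, so the normalization holds for every normal distribution.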

 

F: distribution function; it has no closed-form expression in elementary functions.

f: density function
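Although F has no elementary closed form, it can be evaluated numerically through the error function, which is itself a non-elementary special function available in most numerical libraries. A minimal sketch using Python's standard library (the function name `normal_cdf` is chosen here for illustration):

```python
import math

def normal_cdf(x, mu=0.0, s=1.0):
    """Distribution function F(x) of a normal distribution with mean mu
    and standard deviation s, expressed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (s * math.sqrt(2.0))))

# Probability mass within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, normal_cdf(k) - normal_cdf(-k))
```

The three printed probabilities correspond to the familiar rule of thumb of roughly 68 %, 95 % and 99.7 %.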

 

The normal distribution is symmetric about µ, so its skewness is 0.

The mean, mode and median therefore coincide.

The inflection points are at µ + s and µ − s.

The larger s, the "wider" or "flatter" the curve.
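Both statements, the position of the inflection points and the flattening with growing s, can be checked numerically. The following sketch uses a finite-difference second derivative; the parameter values and the step size h are arbitrary assumptions.

```python
import math

def normal_pdf(x, mu, s):
    """Density function of the normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def second_derivative(x, mu, s, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (normal_pdf(x + h, mu, s) - 2 * normal_pdf(x, mu, s)
            + normal_pdf(x - h, mu, s)) / (h * h)

mu, s = 3.0, 2.0
inside = second_derivative(mu + 0.9 * s, mu, s)   # just inside mu + s
outside = second_derivative(mu + 1.1 * s, mu, s)  # just outside mu + s
# The curvature changes sign at mu + s, which marks an inflection point.
print(inside < 0 < outside)

# The peak height is 1 / (s * sqrt(2*pi)), so a larger s gives a flatter curve.
for width in (0.5, 1.0, 2.0):
    print(width, normal_pdf(mu, mu, width))
```

By symmetry the same sign change occurs at µ − s.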

An Excel file for "trying out" different µ and s can be found here under normal distribution.xls.

 

Since the distribution function, i.e. the cumulative normal distribution, has no closed-form expression, many practical applications use other distribution functions that "look similar" to the normal distribution in that they are S-shaped (most common distribution functions are roughly S-shaped):

See e.g. Logit model.
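How close such a substitute can come is shown by the logistic function used in the logit model. The sketch below compares it with the standardized normal distribution function; the scaling constant 1.702 is a commonly quoted choice and is an assumption here, not something taken from this page.

```python
import math

def normal_cdf(z):
    """Distribution function of the standardized normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z, scale=1.702):
    """Logistic distribution function, which has a simple closed form."""
    return 1.0 / (1.0 + math.exp(-scale * z))

# Largest absolute difference on a grid from -5 to 5 in steps of 0.01.
max_gap = max(abs(normal_cdf(i / 100) - logistic_cdf(i / 100))
              for i in range(-500, 501))
print(max_gap)
```

On this grid the two S-shaped curves differ by less than one percentage point, which explains why the logistic function is a popular stand-in.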

 

The normal distribution occupies a particularly prominent position among the distribution functions in statistics, which is why numerous tests have been designed to check whether data are normally distributed. See also goodness-of-fit test.

Reasons:

  • Additive noise, i.e. the additive influence of many small, different disturbances on a random variable, makes this variable normally distributed. This is of great importance in the theory of measurement errors. (Multiplicative noise: -> log-normal distribution)

  • Almost all classical parametric tests are based on the assumption of normally distributed data.

  • Some other distributions converge to the normal distribution in certain limiting cases. This applies in particular to the binomial distribution and the Poisson distribution.

  • The greatest importance of the normal distribution in statistics, however, only comes to light through the central limit theorem: sample means are approximately normally distributed, regardless of what the underlying distribution looks like (this is a practitioner's formulation; mathematically there are a few restrictions that are insignificant in practice).
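The convergence of the binomial distribution to the normal distribution mentioned in the list above can be illustrated with a short comparison. The parameters n = 100, p = 0.5 and the continuity correction of 0.5 are assumptions made for this sketch.

```python
import math

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial distribution."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))

def normal_cdf(x, mu, s):
    """Distribution function of the normal distribution via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (s * math.sqrt(2.0))))

def normal_approx_binomial_cdf(k, n, p):
    """Normal approximation with mu = n*p, s = sqrt(n*p*(1-p))
    and a continuity correction of 0.5."""
    mu = n * p
    s = math.sqrt(n * p * (1 - p))
    return normal_cdf(k + 0.5, mu, s)

n, p = 100, 0.5
for k in (40, 50, 55, 60):
    print(k, binomial_cdf(k, n, p), normal_approx_binomial_cdf(k, n, p))
```

For these parameters the exact and approximate probabilities agree closely; for small n or p near 0 or 1 the agreement is noticeably worse.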

The following statistically important "auxiliary distributions" are derived from the normal distribution:

These 3 distribution functions are not "natural" probability distributions, but technical constructs that are extremely helpful in statistical calculations (-> confidence intervals).

 

For a graphical representation of the normal distribution in Excel see here.

For an illustration in Excel of the relationships between the hypergeometric distribution, binomial distribution, Poisson distribution and normal distribution, see here.

 

For the graphical determination of confidence intervals of the normal distribution, see Wilrich nomogram.

For the graphical determination of exceedance proportions of the normal distribution, see Durrant nomogram.

 

For a coherent mathematical representation of the same relationships see here.

For the determination of the "optimal" parameters s and µ of a normally distributed sample, see the example under Maximum Likelihood Estimation (MLE).
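For the normal distribution the maximum likelihood estimates have a simple form: the arithmetic mean for µ and the root of the mean squared deviation (divisor n, not n − 1) for s. The sketch below uses made-up measurement values and hypothetical function names.

```python
import math
import statistics

def normal_mle(data):
    """Maximum likelihood estimates (mu_hat, s_hat) for a normal sample."""
    mu_hat = statistics.fmean(data)
    s_hat = statistics.pstdev(data, mu_hat)  # population form: divisor n
    return mu_hat, s_hat

def log_likelihood(data, mu, s):
    """Log-likelihood of the sample under a normal distribution."""
    return sum(-0.5 * math.log(2 * math.pi * s * s) - (x - mu) ** 2 / (2 * s * s)
               for x in data)

data = [9.8, 10.1, 10.4, 9.9, 10.3, 10.0, 9.7, 10.2]  # made-up measurements
mu_hat, s_hat = normal_mle(data)
print(mu_hat, s_hat)

# Moving either parameter away from the estimate lowers the log-likelihood.
print(log_likelihood(data, mu_hat, s_hat))
print(log_likelihood(data, mu_hat + 0.1, s_hat))
print(log_likelihood(data, mu_hat, s_hat * 1.2))
```

Note that the usual sample standard deviation with divisor n − 1 is slightly larger than the maximum likelihood estimate.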

See also Chebyshev's inequality.

 

For the extension of the normal distribution to several dimensions, see multidimensional normal distribution.

12.09.2005

 

Reasons for the importance of the normal distribution

 

1. Although the cumulative normal distribution has no closed-form expression, it still has good mathematical properties that make it easy to handle computationally.

 

2. The normal distribution describes "additive noise": an infinite number of independent influences, combined additively, result in an exactly normal distribution.

-> The normal distribution is THE distribution for measurement uncertainty. (See also the law of propagation of errors.)

 

3. Central limit theorem:

Mean values of random samples are approximately normally distributed, NO MATTER what the distribution of the individual values looks like (!)

"Approximately" means: the larger the sample, the better the approximation.

Picture what this means:

One repeatedly draws samples from a distribution, however "impossible" it may look, and finds that the sample means are normally distributed. Mathematically there are restrictions (for example, the variance must be finite), but they are hardly relevant in practice.
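This can be tried out with a small simulation. The sketch draws samples from a strongly skewed exponential distribution and measures the skewness of the resulting sample means; the sample sizes, the number of repetitions and the seed are arbitrary assumptions.

```python
import random

def sample_means(n, n_means=5000, seed=0):
    """Means of n_means samples of size n from an exponential distribution."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(n_means)]

def skewness(values):
    """Moment-based skewness; 0 for a symmetric distribution."""
    count = len(values)
    mean = sum(values) / count
    m2 = sum((v - mean) ** 2 for v in values) / count
    m3 = sum((v - mean) ** 3 for v in values) / count
    return m3 / m2 ** 1.5

# Individual values (n = 1) are heavily skewed; means of larger samples
# become more and more symmetric, as the central limit theorem predicts.
for n in (1, 5, 50):
    print(n, skewness(sample_means(n)))
```

For the exponential distribution the theoretical skewness of the mean is 2 divided by the square root of n, so it shrinks towards 0, the value of the normal distribution, as n grows.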

 

4. The principle of maximum entropy.

The state of knowledge "We know the mean and the standard deviation and nothing else (not even from subject-matter knowledge)" is best described by the normal distribution. That is, the assumption of a normal distribution leaves the greatest possible uncertainty here and does not pretend to more knowledge than one has, so it corresponds exactly to the state of knowledge just described.

For the state of knowledge "We know only the mean of a non-negative quantity and nothing else (not even from subject-matter knowledge)", the exponential distribution is the "correct" distribution function.
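The maximum-entropy property of the normal distribution can be illustrated by comparing the differential entropy of a few distributions that share the same standard deviation s. The uniform and Laplace distributions are chosen here only as comparison cases; the entropy formulas are the standard textbook expressions (in natural units).

```python
import math

def entropy_normal(s):
    """Differential entropy of a normal distribution with standard deviation s."""
    return 0.5 * math.log(2 * math.pi * math.e * s * s)

def entropy_uniform(s):
    """Uniform distribution with standard deviation s: interval width s*sqrt(12)."""
    width = s * math.sqrt(12)
    return math.log(width)

def entropy_laplace(s):
    """Laplace distribution with standard deviation s: scale b = s/sqrt(2)."""
    b = s / math.sqrt(2)
    return 1 + math.log(2 * b)

s = 1.5
for name, h in (("normal", entropy_normal(s)),
                ("Laplace", entropy_laplace(s)),
                ("uniform", entropy_uniform(s))):
    print(name, h)
```

For every s the normal distribution yields the largest value of the three, in line with the statement that it assumes the least beyond mean and standard deviation.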