Climate of the Earth system

Prof. Dr. Markus Meier
Leibniz Institute for Baltic Sea Research Warnemünde (IOW)
E-Mail: markus.meier@io-warnemuende.de

Probability density and distribution#

  1. Probability density function and important parameters

  2. Different probability distributions

Probability density function and important parameters#

Probability density function#

  • let x be a continuous (not discrete!) variable that takes values in Ω, for example temperature. The probability density function fX(x) of an event X (e.g. T = 10 °C) is defined as a continuous function on ℝ with the following three properties:

$$
1.\; f_X(x) \ge 0 \;\text{ for all }\; x \in \Omega, \qquad
2.\; \int_\Omega f_X(x)\, \mathrm{d}x = 1, \qquad
3.\; P(X \in (a,b)) = \int_a^b f_X(x)\, \mathrm{d}x \;\text{ for all }\; (a,b) \subset \Omega
\tag{24}
$$
  • Question: What is the unit of the pdf?
    Answer: $[f_X(x)] = [x]^{-1}$.

  • Question: What is the integral of the pdf? Answer: The cumulative distribution function.

Cumulative distribution function#

  • the cumulative distribution function for an event X is a monotonically increasing, non-dimensional function $F_X(x)$ on ℝ defined as:

$$
F_X(x) = \int_{-\infty}^{x} f_X(r)\, \mathrm{d}r,
\tag{25}
$$
  • which is equivalent to:

$$
\lim_{x \to -\infty} F_X(x) = 0, \qquad
\lim_{x \to \infty} F_X(x) = 1, \qquad
\frac{\mathrm{d}}{\mathrm{d}x} F_X(x) = f_X(x)
\tag{26}
$$
  • consequently, the probability of the event X lying inside the range (a,b) is:

$$
P(X \in (a,b)) = F_X(b) - F_X(a)
$$
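This relation can be checked numerically; a minimal sketch with scipy, where the N(10, 3²) temperature distribution is an assumed example:

```python
import numpy as np
from scipy import stats

# Assumed example: daily mean temperature modelled as N(10, 3^2) in degC
T = stats.norm(loc=10.0, scale=3.0)

# Probability that the temperature falls between 8 and 12 degC,
# evaluated as the difference of the cumulative distribution function
p = T.cdf(12.0) - T.cdf(8.0)

# Cross-check by numerically integrating the pdf over (8, 12)
x = np.linspace(8.0, 12.0, 10001)
p_num = np.trapz(T.pdf(x), x)

print(round(p, 4), round(p_num, 4))
```

Both values agree to within the accuracy of the numerical integration.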

Expectation ε#

  • the expectation weights a given pdf with x in the integral:

$$
\varepsilon(X) = \int_\Omega x\, f_X(x)\, \mathrm{d}x, \qquad
\varepsilon(g(X)) = \int_\Omega g(x)\, f_X(x)\, \mathrm{d}x
$$
  • two properties (linearity) of the expectation are:

$$
\varepsilon(g_1(X) + g_2(X)) = \varepsilon(g_1(X)) + \varepsilon(g_2(X)), \qquad
\varepsilon(a\, g(X) + b) = a\, \varepsilon(g(X)) + b
$$
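The integral definition and the linearity property can be verified numerically; a minimal sketch with a normal pdf (the parameters μ = 2, σ = 1 are an assumed example):

```python
import numpy as np
from scipy import stats

# Assumed example distribution: N(2, 1)
X = stats.norm(loc=2.0, scale=1.0)
x = np.linspace(-6, 10, 200001)
f = X.pdf(x)

def expect(g):
    """Numerical expectation E[g(X)] via the trapezoidal rule."""
    return np.trapz(g(x) * f, x)

# Linearity: E[a*g(X) + b] = a*E[g(X)] + b
a, b = 3.0, 1.0
lhs = expect(lambda u: a * u**2 + b)
rhs = a * expect(lambda u: u**2) + b
print(round(lhs, 6), round(rhs, 6))
```

Both sides agree; the same `expect` helper also reproduces $\varepsilon(X) = \mu = 2$.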

Central moments μ#

  • k-th moment of a continuous random variable X:

$$
\mu^{(k)} = \varepsilon(X^k) = \int_\Omega x^k f(x)\, \mathrm{d}x
$$

  • k-th central moment of a continuous random variable X:

$$
\mu_{(k)} = \int_\Omega (x - \mu)^k f(x)\, \mathrm{d}x
$$
  • example: anomalies, with μ being the mean seasonal cycle

  • mean μ: location parameter $\mu = \mu^{(1)}$

  • variance:

$$
\mathrm{Var}(X) = \mu_{(2)} = \int_\Omega (x - \mu)^2 f(x)\, \mathrm{d}x
$$
  • standard deviation:

$$
\sigma_X = \sqrt{\mathrm{Var}(X)}
$$
  • Chebyshev’s inequality:

$$
P(|X - \mu| \ge \lambda \sigma) \le \frac{1}{\lambda^2}
$$
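Chebyshev's inequality holds for any distribution; a minimal sketch that checks the bound on samples from an exponential distribution (an assumed example, scale 2):

```python
import numpy as np

# Assumed example: exponential samples (mean = std = 2)
rng = np.random.default_rng(42)
x = rng.exponential(scale=2.0, size=100_000)
mu, sigma = x.mean(), x.std()

for lam in (1.5, 2.0, 3.0):
    # empirical fraction of samples at least lam*sigma away from the mean
    frac = np.mean(np.abs(x - mu) >= lam * sigma)
    assert frac <= 1.0 / lam**2  # Chebyshev bound holds
    print(lam, round(frac, 4), round(1.0 / lam**2, 4))
```

The empirical fractions stay well below the bound $1/\lambda^2$, which is deliberately loose because it makes no assumption about the shape of the pdf.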

Skewness γ1#

  • a measure of the asymmetry of a distribution: symmetric for $\gamma_1 = 0$; a scaled version of the third central moment; a non-dimensional shape parameter

$$
\gamma_1 = \int_\Omega \left( \frac{x - \mu}{\sigma} \right)^3 f_X(x)\, \mathrm{d}x
$$
../_images/L10_1_skewness.PNG

Kurtosis γ2#

  • a measure of the peakedness of a distribution: a normal distribution (explained later in this lecture) has $\gamma_2 = 0$; a scaled and shifted version of the fourth central moment; a non-dimensional shape parameter

$$
\gamma_2 = \int_\Omega \left( \frac{x - \mu}{\sigma} \right)^4 f_X(x)\, \mathrm{d}x - 3
$$
../_images/L10_2_kurtosis.PNG
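Both shape parameters are easy to estimate from samples; a minimal sketch with scipy (the normal and log-normal samples are assumed examples):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Symmetric sample (normal) vs. right-skewed sample (log-normal)
x_sym = rng.normal(size=200_000)
x_skew = rng.lognormal(mean=0.0, sigma=0.5, size=200_000)

# scipy's skew/kurtosis correspond to gamma_1 and gamma_2 above
# (fisher=True subtracts 3, so a normal distribution gives 0)
g1_sym, g2_sym = stats.skew(x_sym), stats.kurtosis(x_sym, fisher=True)
g1_skew = stats.skew(x_skew)

print(round(g1_sym, 2), round(g2_sym, 2), round(g1_skew, 2))
```

For the normal sample both parameters are close to 0, while the log-normal sample shows a clearly positive skewness.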

Examples#

  • summer sea level at Kieler Förde, μ=0.06, σ=0.19, γ1=0.6, γ2=4.07

../_images/L10_3_example.PNG
  • probability densities of some measured variables

../_images/L10_4_example.PNG

P-quantiles#

  • mean and variance are affected by the tail ends of the pdf (likelihood of extreme values), but p-quantiles xp are insensitive to extreme values.

  • a p-quantile with p = 0.3 means that 30% of the x values lie below this threshold $x_p$:

$$
F_X(x_p) = p \quad \text{with} \quad
P(X \in (-\infty, x_p)) = p, \qquad
P(X \in (x_p, \infty)) = 1 - p
$$
  • the median $m_{50}$ is the 50%-quantile: half of the distribution lies above and the other half below $m_{50}$.

$$
F_X(m_{50}) = 0.5 \quad \Leftrightarrow \quad P(x \le m_{50}) = P(x \ge m_{50}) = 0.5
$$
  • let’s look at the p-quantiles of the log-normal distribution in Figure 5 to get an idea. Note the difference between mean and median!

../_images/L10_5_lognormal.PNG
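Quantiles, median, and mean can be compared directly on log-normal samples; a minimal sketch (the parameters are an assumed example, not those of Figure 5):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed example: log-normal sample with ln(X) ~ N(0, 1)
x = rng.lognormal(mean=0.0, sigma=1.0, size=500_000)

# p-quantile: 30% of the values lie below the 0.3-quantile
x30 = np.quantile(x, 0.3)
assert np.mean(x < x30) <= 0.3 <= np.mean(x <= x30)

median = np.quantile(x, 0.5)  # theoretical value: e^0 = 1
mean = x.mean()               # theoretical value: e^{1/2} ~ 1.65
print(round(median, 2), round(mean, 2))
```

The mean is pulled above the median by the long right tail, illustrating why quantiles are the more robust location measure for skewed distributions.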

Different probability distributions#

Uniform distribution#

  • symmetric and less peaked than the normal distribution:

$$
f_X(x) = U(a,b) =
\begin{cases}
1/(b-a) & \text{for all } x \in [a,b] \\
0 & \text{elsewhere}
\end{cases}
$$
  • with the cumulative distribution function:

$$
F_X(x) =
\begin{cases}
0 & \text{for } x \le a \\
(x-a)/(b-a) & \text{for all } x \in [a,b] \\
1 & \text{for } x \ge b
\end{cases}
$$
../_images/L10_6_uniform.PNG
  • exercise: calculate $\mu$, $\mathrm{Var}$, $\sigma$, $\gamma_1$, $\gamma_2$ of the uniform distribution U(a,b)
    solutions: $\mu(U(a,b)) = \frac{1}{2}(a+b)$, $\mathrm{Var}(U(a,b)) = \frac{1}{12}(b-a)^2$, $\sigma(U(a,b)) = \frac{1}{\sqrt{12}}(b-a)$, $\gamma_1(U(a,b)) = 0$, $\gamma_2(U(a,b)) = -1.2$

Normal (Gaussian) distribution#

  • most physical quantities are nearly normally distributed

$$
f_N(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
\quad \text{with} \quad X \sim N(\mu, \sigma^2)
$$
  • no skewness or kurtosis: γ1=γ2=0

  • no analytical form of the cdf exists; an approximation is:

$$
F_N(x) \approx \frac{1}{2} \left( 1 + \mathrm{sign}\!\left( \frac{x-\mu}{\sigma} \right) \sqrt{1 - e^{-\frac{2}{\pi} \left( \frac{x-\mu}{\sigma} \right)^2}} \right)
\tag{27}
$$
../_images/L10_7_gauss.PNG
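How good is approximation (27)? A minimal sketch comparing it against scipy's numerically exact normal cdf (μ = 0, σ = 1 assumed for simplicity):

```python
import numpy as np
from scipy import stats

# Standard normal case; z = (x - mu) / sigma
z = np.linspace(-4, 4, 801)

# Approximation (27)
approx = 0.5 * (1.0 + np.sign(z) * np.sqrt(1.0 - np.exp(-(2.0 / np.pi) * z**2)))
# Numerically exact cdf
exact = stats.norm.cdf(z)

max_err = np.max(np.abs(approx - exact))
print(round(max_err, 4))
```

The maximum absolute error over this range stays below about 0.005, which is sufficient for many practical applications.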
  • the central limit theorem states: if $X_k,\, k = 1, 2, \ldots$ is an infinite series of independent and identically distributed random variables with $\varepsilon(X_k) = \mu$ and $\mathrm{Var}(X_k) = \sigma^2$, then the average $\frac{1}{n} \sum_{k=1}^{n} X_k$ is asymptotically normally distributed. That is:

$$
\lim_{n \to \infty} \frac{\frac{1}{n} \sum_{k=1}^{n} (X_k - \mu)}{\sigma / \sqrt{n}} \sim N(0, 1)
$$
  • a larger sample size reduces the standard deviation as:

$$
\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} (X_k - \mu) \sim N\!\left(0, \frac{\sigma^2}{n}\right)
\quad \Rightarrow \quad \sigma_\Sigma = \frac{\sigma}{\sqrt{n}}
$$
../_images/L10_8_clt1.PNG
../_images/L10_9_clt2.PNG
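The $\sigma/\sqrt{n}$ shrinkage of the sample mean can be demonstrated directly; a minimal sketch using uniform variables (an assumed example, since the CLT holds for any i.i.d. distribution with finite variance):

```python
import numpy as np

rng = np.random.default_rng(7)

# Uniform U(0, 1) variables: mu = 1/2, sigma = 1/sqrt(12)
a, b = 0.0, 1.0
mu, sigma = (a + b) / 2, (b - a) / np.sqrt(12)

for n in (10, 100, 1000):
    # 20,000 realizations of the average of n variables
    means = rng.uniform(a, b, size=(20_000, n)).mean(axis=1)
    # standard deviation of the sample mean shrinks like sigma/sqrt(n)
    print(n, round(means.std(), 4), round(sigma / np.sqrt(n), 4))
```

The empirical spread of the averages tracks the theoretical $\sigma/\sqrt{n}$ closely for every n.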

Log-normal distribution#

  • distribution of positive definite quantities such as rainfall or wind speed

$$
f_X(x) = \frac{1}{\sigma \sqrt{2\pi}} \frac{1}{x}\, e^{-\frac{(\ln(x) - \ln(\theta))^2}{2\sigma^2}} \quad \text{for} \quad x > 0
$$
  • with the median value θ and

$$
\ln(X) \sim N(\ln(\theta), \sigma^2)
$$
../_images/L10_10_lognormal.PNG
  • exercise: derive a general formula for the k-th moment of the distribution
    solution: $\varepsilon(X^k) = \theta^k e^{k^2 \sigma^2 / 2}$

χ2-distribution#

  • sum of k independent squared N(0,1) random variables; k is the number of degrees of freedom; applied for the pdfs of variance estimates:

$$
f_\chi(x) = \frac{x^{(k-2)/2}\, e^{-x/2}}{\Gamma(k/2)\, 2^{k/2}} \quad \text{if} \quad x > 0
$$
  • with

$$
\Gamma(x) = \int_0^\infty e^{-t}\, t^{x-1}\, \mathrm{d}t \quad \text{for} \quad x > 0
\tag{28}
$$
  • it has the handy properties:

$$
\varepsilon(X) = k, \qquad \mathrm{Var}(X) = 2k
$$
../_images/L10_11_chisquared.PNG
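Both the moment formulas and the sum-of-squares construction can be verified; a minimal sketch (k = 5 is an arbitrary assumed choice for the sampling part):

```python
import numpy as np
from scipy import stats

# Mean k and variance 2k, checked for several degrees of freedom
for k in (1, 5, 10):
    mean, var = stats.chi2.stats(k, moments='mv')
    assert np.isclose(mean, k) and np.isclose(var, 2 * k)

# The sum of k squared N(0,1) variables is chi^2(k)-distributed
rng = np.random.default_rng(3)
x = (rng.normal(size=(100_000, 5))**2).sum(axis=1)
print(round(x.mean(), 2), round(x.var(), 2))
```

The sampled sums reproduce mean 5 and variance 10, as expected for k = 5.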

Student’s t-distribution#

  • applied for testing the significance of differences in means. Let t(k) be a test variable with k > 0; if A and B are independent random variables such that

$$
B \sim \chi^2(k) \quad \text{and} \quad A \sim N(0,1)
$$
  • the t-distribution can be written as:

$$
t(k) \sim \frac{A}{\sqrt{B/k}}
$$
  • using the Γ-function (28) the probability density can also be written as:

$$
f_T(t) = \frac{\Gamma((k+1)/2)}{\sqrt{k\pi}\, \Gamma(k/2)} \left( 1 + \frac{t^2}{k} \right)^{-\frac{k+1}{2}}
$$
../_images/L10_12_t.PNG
  • t-test?
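The construction $A/\sqrt{B/k}$ can be checked by sampling; a minimal sketch comparing empirical quantiles with scipy's t-distribution (k = 5 is an arbitrary assumed choice):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
k = 5  # degrees of freedom (assumed example)

# Construct t(k) samples as A / sqrt(B/k) with A ~ N(0,1), B ~ chi^2(k)
A = rng.normal(size=200_000)
B = rng.chisquare(k, size=200_000)
T = A / np.sqrt(B / k)

# Empirical quantiles should match the theoretical t(k) quantiles
for p in (0.9, 0.95, 0.975):
    print(p, round(np.quantile(T, p), 2), round(stats.t.ppf(p, k), 2))
```

These quantiles are exactly the critical values used in a t-test of a sample mean.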

Fisher-F-distribution#

  • applied for testing the significance of differences in variances. For χ²-distributed K and L:

$$
K \sim \chi^2(k) \quad \text{and} \quad L \sim \chi^2(l)
$$
  • the F-distribution is given by:

$$
F(k,l) = \frac{K/k}{L/l}
$$
  • alternatively, the probability density of the F-distribution is given by:

$$
f_F(x) = \frac{(k/l)^{k/2}\, \Gamma((k+l)/2)}{\Gamma(k/2)\, \Gamma(l/2)}\, x^{(k-2)/2} \left( 1 + \frac{k}{l} x \right)^{-\frac{k+l}{2}}
$$

Summary of theoretical distributions#

../_images/L10_13_summary.PNG

Continuous random vectors, multi-variate data#

  • example: vectors X temperature and Y sea level pressure:

$$
X \text{ and } Y \sim f_{X,Y}(x,y)
$$
../_images/L10_14_correlation.PNG