# Why `paramnormal`?

In `numpy`, in `scipy.stats`, and in the field of statistics in general,
you can refer to the *location* (`loc`) and *scale* (`scale`) parameters
of a distribution. Roughly speaking, they refer to the position and
spread of the distribution, respectively. For normal distributions,
`loc` refers to the mean (symbolized as \(\mu\)) and `scale` refers to
the standard deviation (a.k.a. \(\sigma\)).
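As a quick illustration of the location/scale convention for the normal distribution, here is a minimal sketch. The values of `mu` and `sigma` are arbitrary and chosen only for this example.

```python
import numpy as np
from scipy import stats

mu = 10.0     # location: the mean of the normal distribution
sigma = 2.5   # scale: the standard deviation of the normal distribution

# Freeze a normal distribution with the chosen location and scale.
dist = stats.norm(loc=mu, scale=sigma)
print(dist.mean(), dist.std())  # the frozen distribution reports mu and sigma

# numpy's generator uses the same names for the normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(loc=mu, scale=sigma, size=100_000)
print(sample.mean(), sample.std())  # close to mu and sigma for a large sample
```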

The main problem that `paramnormal` is trying to solve is that
sometimes, creating a probability distribution using these parameters
(and others) in `scipy.stats` can be confusing. Also, the parameters in
`numpy.random` can be inconsistently named (admittedly, just a minor
inconvenience).

```
%matplotlib inline
```

```
import numpy as np
from scipy import stats
```

Consider the lognormal distribution.

In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable \(X\) is log-normally distributed, then \(Y = \ln(X)\) has a normal distribution. Likewise, if \(Y\) has a normal distribution, then \(X = \exp(Y)\) has a log-normal distribution. (from Wikipedia)

In numpy, you specify the “mean” and “sigma” of the underlying normal
distribution. A lot of scientific programmers know what that would
mean. But a matched pair such as `mean` and `standard_deviation`, `loc`
and `scale`, or `mu` and `sigma` would have been a better choice.

Still, generating random numbers is pretty straightforward:

```
np.random.seed(0)
mu = 0
sigma = 1
N = 3
np.random.lognormal(mean=mu, sigma=sigma, size=N)
```

```
array([ 5.83603919, 1.49205924, 2.66109578])
```
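To see that `mean` and `sigma` really describe the underlying normal distribution rather than the lognormal values themselves, you can take the logarithm of a large sample and check its moments. This is a small sketch with arbitrary parameter values, using numpy's newer `default_rng` generator instead of the global seed.

```python
import numpy as np

mu, sigma = 1.5, 0.4  # parameters of the underlying normal distribution

rng = np.random.default_rng(0)
samples = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

# The logarithm of lognormal samples is normally distributed,
# so its mean and standard deviation should be close to mu and sigma.
logs = np.log(samples)
print(logs.mean(), logs.std())
```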

In scipy, you need an additional shape parameter (`s`), plus the usual
`loc` and `scale`. Aside from the mystery behind what `s` might be, that
seems straightforward enough.

Except it’s not.

That shape parameter is actually the standard deviation (\(\sigma\))
of the underlying normal distribution. The `scale` should be set to
the exponentiated location parameter of the underlying normal
distribution (\(e^\mu\)). Finally, `loc` actually refers to a sort of
offset that can be applied to the entire distribution. In other words,
you can translate the distribution up and down, e.g., into negative
values.
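The mapping can be checked against two textbook properties of the lognormal distribution: its median is \(e^\mu\) and its mean is \(e^{\mu + \sigma^2/2}\). The sketch below uses arbitrary values for `mu`, `sigma`, and `offset`, and also shows how `loc` simply shifts the whole distribution.

```python
import numpy as np
from scipy import stats

mu, sigma = 1.5, 0.4  # parameters of the underlying normal distribution

# s is sigma, scale is exp(mu), and loc is left at zero.
base = stats.lognorm(s=sigma, loc=0, scale=np.exp(mu))
print(base.median(), np.exp(mu))                 # median equals exp(mu)
print(base.mean(), np.exp(mu + sigma**2 / 2))    # mean equals exp(mu + sigma^2 / 2)

# A non-zero loc translates the distribution without changing its shape.
offset = -3.0
shifted = stats.lognorm(s=sigma, loc=offset, scale=np.exp(mu))
print(shifted.median() - base.median())  # the median moves by exactly the offset
print(shifted.cdf(offset))               # no probability mass at or below the offset
```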

In my field (civil/environmental engineering), variables that are often
assumed to be lognormally distributed (e.g., pollutant concentration)
can never have values less than or equal to zero. So in that sense, the
`loc` parameter in scipy’s lognormal distribution **should nearly always
be set to zero**.

With that out of the way, recreating the three numbers above in scipy is done as follows:

```
np.random.seed(0)
stats.lognorm(sigma, loc=0, scale=np.exp(mu)).rvs(size=N)
```

```
array([ 5.83603919, 1.49205924, 2.66109578])
```

## A new challenger appears

`paramnormal` really just hopes to take away some of this friction.
Consider the following:

```
import paramnormal
np.random.seed(0)
paramnormal.lognormal(mu=mu, sigma=sigma).rvs(size=N)
```

```
array([ 5.83603919, 1.49205924, 2.66109578])
```

Hopefully that’s much more readable and straightforward.
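To make the idea concrete, here is a rough sketch of the kind of translation such a wrapper performs. The function `lognormal_sketch` and its `offset` argument are hypothetical names invented for this example; this is not `paramnormal`'s actual implementation.

```python
import numpy as np
from scipy import stats


def lognormal_sketch(mu, sigma, offset=0.0):
    """Hypothetical helper: build a frozen scipy lognormal from mu and sigma.

    mu and sigma describe the underlying normal distribution;
    offset is passed through as scipy's loc and defaults to zero.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return stats.lognorm(s=sigma, loc=offset, scale=np.exp(mu))


dist = lognormal_sketch(mu=0, sigma=1)
print(dist.median())                      # exp(0), i.e. 1.0
print(dist.rvs(size=3, random_state=0))   # three reproducible random draws
```

The point of the design is that the caller only ever thinks in terms of \(\mu\) and \(\sigma\); the conversion to `s`, `loc`, and `scale` happens in one place.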

## Greek-letter support

Tom Augspurger added a lovely little decorator to let you use Greek letters in the function signature.

```
np.random.seed(0)
paramnormal.lognormal(μ=mu, σ=sigma).rvs(size=N)
```

```
array([ 5.83603919, 1.49205924, 2.66109578])
```
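Python 3 accepts Unicode identifiers, so a decorator can translate Greek keyword names into their ASCII equivalents before calling the wrapped function. The following is only a sketch of that general technique; `greek_aliases`, `GREEK_TO_ASCII`, and `describe` are hypothetical names for this example and are not taken from `paramnormal`'s source.

```python
import functools

# Hypothetical alias table for this sketch; not taken from paramnormal.
GREEK_TO_ASCII = {"μ": "mu", "σ": "sigma", "α": "alpha", "β": "beta"}


def greek_aliases(func):
    """Let callers pass Greek-letter keywords in place of ASCII names."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        translated = {}
        for name, value in kwargs.items():
            ascii_name = GREEK_TO_ASCII.get(name, name)
            # Reject calls that give both spellings of the same parameter.
            if ascii_name in translated:
                raise TypeError(f"got multiple values for argument {ascii_name!r}")
            translated[ascii_name] = value
        return func(*args, **translated)

    return wrapper


@greek_aliases
def describe(mu=0.0, sigma=1.0):
    """Toy function standing in for a distribution constructor."""
    return {"mu": mu, "sigma": sigma}


# Both spellings reach the function with the same ASCII argument names.
print(describe(μ=2.0, σ=0.5))
print(describe(mu=2.0, sigma=0.5))
```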

## Other distributions

As of now, we provide a convenient interface for the following
distributions in `scipy.stats`:

```
for d in paramnormal.dist.__all__:
    print(d)
```

```
normal
lognormal
weibull
alpha
beta
gamma
chi_squared
pareto
exponential
rice
```

Feel free to submit a pull request on GitHub to add new distributions.