Probability with TensorFlow#

While TensorFlow itself offers only limited support for statistical inference, TensorFlow Probability (TFP) excels at it, providing probability distributions, MCMC methods and more.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
import zfit
from zfit import z

Distributions#

TFP comes with a whole collection of different distributions. They share a minimal, well-designed interface that is similar to the one of the SciPy distributions.

tfd = tfp.distributions  # shortcut to the distributions module
cauchy = tfd.Cauchy(loc=1., scale=10.)
sample = cauchy.sample(10)  # draw 10 random values
cauchy.prob(sample)  # evaluate the density at the sampled points
<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([0.01464302, 0.00850127, 0.02412639, 0.02055683, 0.02073182,
       0.03043594, 0.03124683, 0.00549949, 0.00826702, 0.02922291],
      dtype=float32)>
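
Beyond sample and prob, the same minimal interface exposes log_prob, cdf and quantile; a short sketch using the cauchy distribution from above:

# log-probabilities are usually preferred for numerical stability
cauchy.log_prob(sample)

# cumulative distribution function and its inverse
cauchy.cdf(0.)
cauchy.quantile(0.5)  # the median, which equals loc for a Cauchy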

Mixtures of PDFs#

TensorFlow Probability also supports creating mixtures of different distributions.

mix = 0.3
mix_gauss_cauchy = tfd.Mixture(
    cat=tfd.Categorical(probs=[mix, 1. - mix]),
    components=[
        cauchy,
        tfd.Normal(loc=+1., scale=0.5),
    ],
)
sample_mixed = mix_gauss_cauchy.sample(10)
mix_gauss_cauchy.prob(sample_mixed)
<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([3.7576836e-01, 5.5732805e-01, 4.8493055e-01, 2.4973764e-01,
       5.4685426e-01, 7.1766470e-03, 3.8791910e-01, 9.9788621e-02,
       1.7146308e-04, 2.9484937e-03], dtype=float32)>
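
Because matplotlib and numpy are already imported, the mixture density can be inspected on a grid; a minimal sketch (the plotting range is an arbitrary choice):

x = np.linspace(-5., 5., 200, dtype=np.float32)  # float32 to match the distributions
plt.plot(x, mix_gauss_cauchy.prob(x))
plt.xlabel("x")
plt.ylabel("probability density")
plt.show()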

Joint distributions#

Furthermore, joint distributions of multiple variables are supported; a component may depend on the values of other components, as the distribution of m below depends on n and c.

joint = tfd.JointDistributionNamed(dict(
    c=tfd.Cauchy(loc=10., scale=1.),
    n=tfd.Normal(loc=0., scale=2.),
    m=lambda n, c: tfd.Normal(loc=n, scale=c),  # m depends on n and c
))
sample_joint = joint.sample(10)
sample_joint
{'n': <tf.Tensor: shape=(10,), dtype=float32, numpy=
 array([-0.5834539 , -0.14766367, -0.6320378 , -2.3588283 , -1.189753  ,
        -0.09000712,  1.9659712 , -0.8830064 , -1.8737171 , -1.6860838 ],
       dtype=float32)>,
 'c': <tf.Tensor: shape=(10,), dtype=float32, numpy=
 array([ 6.942772 , 11.169306 , 12.133305 , 10.727866 ,  8.620009 ,
         8.341726 , 10.108339 , 10.023032 ,  7.0252953, 10.317402 ],
       dtype=float32)>,
 'm': <tf.Tensor: shape=(10,), dtype=float32, numpy=
 array([ -4.6998634 , -10.244301  ,  -2.2449243 ,  -6.4879284 ,
          0.99081635,  -5.7633195 ,   9.00574   ,   7.6700816 ,
         -5.720553  ,  -3.9813764 ], dtype=float32)>}
joint.prob(sample_joint)
<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([0.00028346, 0.00063495, 0.00035462, 0.00071494, 0.000821  ,
       0.00064193, 0.00119881, 0.00159205, 0.00020318, 0.00152513],
      dtype=float32)>
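
The joint density factorizes as p(c) * p(n) * p(m | n, c); a small sketch checking this for the sample above (names as defined earlier):

# reconstruct the factorized density by hand
c, n, m = sample_joint["c"], sample_joint["n"], sample_joint["m"]
prob_manual = (tfd.Cauchy(loc=10., scale=1.).prob(c)
               * tfd.Normal(loc=0., scale=2.).prob(n)
               * tfd.Normal(loc=n, scale=c).prob(m))
np.allclose(prob_manual, joint.prob(sample_joint))  # should be True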

How TFP compares to zfit#

TensorFlow Probability offers a great choice of distributions to build a model, and its flexibility in terms of vectorization and parametrization is larger than zfit's. However, it only provides models with analytically known CDFs and lacks numerical normalization and sampling methods. This excludes more sophisticated models, convolutions and more.

Internally, zfit simply wraps TFP distributions for certain implementations, such as the Gauss. There is also a generic wrapper, WrapDistribution, that makes it easy to wrap any TFP distribution and use it in zfit.
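
As a minimal sketch of this wrapping, zfit's Gauss (which internally uses tfd.Normal) can be built and evaluated; the observable limits and parameter values here are arbitrary choices:

obs = zfit.Space("x", limits=(-5, 5))  # normalization range of the PDF
mu = zfit.Parameter("mu", 1.)
sigma = zfit.Parameter("sigma", 0.5)
gauss = zfit.pdf.Gauss(mu=mu, sigma=sigma, obs=obs)
gauss.pdf(tf.constant([0., 1., 2.]))  # density, normalized over the obs limits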