%%capture

from IPython import get_ipython

install_packages = "google.colab" in str(get_ipython())
if install_packages:
    !pip install zfit

Quickstart

This quick tutorial shows the basics of what you can do with zfit, without going into much detail or performing advanced tasks.

import numpy as np
import zfit
import zfit.z.numpy  # numpy-like backend
from zfit import z  # backend

Create observables

The observable space in which PDFs are defined is created with the Space class:

obs = zfit.Space('x', limits=(-10, 10))

Create data

We create some unbinned data using NumPy. Other constructors, e.g. for ROOT files, are also available.

mu_true = 0
sigma_true = 1

data_np = np.random.normal(mu_true, sigma_true, size=10000)
data = zfit.data.Data.from_numpy(obs=obs, array=data_np)

Create a PDF to fit

Let’s create a Gaussian PDF so we can fit the dataset. To do this, first we create the fit parameters, which follow a convention similar to RooFit:

zfit.Parameter(name, initial_value, lower_limit (optional), upper_limit (optional), other options)
mu = zfit.Parameter("mu", 2.4, -1., 5., step_size=0.001)  # step_size is not mandatory but can be helpful
sigma = zfit.Parameter("sigma", 1.3, 0, 5., step_size=0.001)  # it should be around the estimated uncertainty

Now we instantiate a Gaussian from the zfit PDF library (more on how to create your own PDFs later).

gauss = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

This PDF provides several useful methods, for example to calculate a probability, to compute its integral, or to draw samples.

Note: Several methods, such as integrate, return a tf.Tensor, which is essentially a wrapped NumPy array. Such an object can be used directly like an array or converted explicitly to a NumPy array by calling:

zfit.run(TensorFlow_object)
# Let's get some probabilities.
consts = [-1, 0, 1]
probs = gauss.pdf(consts)
# And now execute the tensorflow graph
result = zfit.run(probs)
print(f"x values: {consts}\nresult:   {result}")
x values: [-1, 0, 1]
result:   [0.01003756 0.05582994 0.17184119]
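
As a cross-check, the values printed above can be reproduced with the standard library alone. This is a minimal sketch, assuming that the PDF is normalized over the limits of obs, here (-10, 10); the helper names gauss_density and gauss_cdf are illustrative and not part of zfit.

```python
import math

MU, SIGMA = 2.4, 1.3          # initial parameter values used above
LOWER, UPPER = -10.0, 10.0    # limits of the observable space


def gauss_density(x, mu, sigma):
    """Gaussian density normalized over the whole real line."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def gauss_cdf(x, mu, sigma):
    """Cumulative distribution function of the same Gaussian."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


# Fraction of the Gaussian that lies inside the limits (very close to 1 here)
norm = gauss_cdf(UPPER, MU, SIGMA) - gauss_cdf(LOWER, MU, SIGMA)

# Density renormalized to the limits, evaluated at the same points as above
values = [gauss_density(x, MU, SIGMA) / norm for x in (-1, 0, 1)]
print(values)
```

Because the limits are many standard deviations away from mu, the renormalization barely changes the numbers; for a narrower observable range it would matter.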

Fitting

To fit, we need to take three steps: create the negative \(\log\mathcal{L}\), instantiate a minimizer and then minimize the likelihood.

# Create the negative log likelihood

nll = zfit.loss.UnbinnedNLL(model=gauss, data=data)  # loss

# Load and instantiate a minimizer
minimizer = zfit.minimize.Minuit()
minimum = minimizer.minimize(loss=nll)

params = minimum.params

print(minimum)
FitResult
 of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│  True   │    True     │      False       │ 1.1e-05 │             14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name      value  (rounded)    at limit
------  ------------------  ----------
mu             -0.00640465       False
sigma               1.0004       False
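
To make the minimized quantity concrete: for unbinned data, the negative log-likelihood is the sum of -log pdf(x_i) over all data points. The following is a minimal standard-library sketch, not zfit's implementation. It ignores the normalization over the observable limits (assumed negligible for a range this wide), generates its own toy sample, and the function name gauss_nll is hypothetical. For a Gaussian the minimum is known in closed form (sample mean and standard deviation), so no numerical minimizer is needed here.

```python
import math
import random


def gauss_nll(sample, mu, sigma):
    """Negative log-likelihood of an (untruncated) Gaussian for unbinned data."""
    log_norm = math.log(sigma * math.sqrt(2 * math.pi))
    return sum(0.5 * ((x - mu) / sigma) ** 2 + log_norm for x in sample)


random.seed(0)  # fixed seed so the sketch is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Closed-form maximum-likelihood estimates for a Gaussian
mu_hat = sum(sample) / len(sample)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / len(sample))

nll_start = gauss_nll(sample, 2.4, 1.3)          # initial values used in this tutorial
nll_best = gauss_nll(sample, mu_hat, sigma_hat)  # value at the analytic minimum
print(f"NLL at start: {nll_start:.1f}, NLL at minimum: {nll_best:.1f}")
```

A minimizer such as Minuit searches for the same minimum numerically, which is what makes the approach work for models without a closed-form solution.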

And we can plot the result to see how it went.

%matplotlib inline
import matplotlib.pyplot as plt

n_bins = 50
range_ = (-5,5)
_ = plt.hist(data_np, bins=n_bins, range=range_)
x = np.linspace(*range_, num=1000)
pdf = zfit.run(gauss.pdf(x))
_ = plt.plot(x, data_np.shape[0] / n_bins * (range_[1] - range_[0]) * pdf)
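
The scale factor in the last plotting line converts the normalized PDF into expected counts per bin: the number of events times the bin width. Here is a minimal standard-library sketch of that arithmetic, assuming a unit Gaussian for illustration (the variable names are chosen for this sketch only):

```python
import math

n_events = 10_000
n_bins = 50
lower, upper = -5.0, 5.0

bin_width = (upper - lower) / n_bins   # width of one histogram bin
scale = n_events * bin_width           # expected counts = scale * pdf(x)


def unit_gauss(x):
    """Standard normal density, used here as a stand-in for the fitted PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


# Expected counts evaluated at the bin centres; their sum recovers
# (almost exactly) the total number of events.
centres = [lower + (i + 0.5) * bin_width for i in range(n_bins)]
expected = [scale * unit_gauss(c) for c in centres]
print(f"scale = {scale:.0f}, total expected = {sum(expected):.1f}")
```

Without this factor the curve would integrate to one and appear as a flat line at the bottom of a histogram of raw counts.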

The FitResult that we obtained contains information about the minimization and can now be used to calculate the parameter uncertainties.

print(f"Function minimum: {minimum.fmin}")
print(f"Converged: {minimum.converged} and valid: {minimum.valid}")
print(minimum)
Function minimum: -7712.850874254467
Converged: True and valid: True

FitResult
 of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│  True   │    True     │      False       │ 1.1e-05 │             14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name      value  (rounded)    at limit
------  ------------------  ----------
mu             -0.00640465       False
sigma               1.0004       False

hesse_errors = minimum.hesse()
print(minimum)
FitResult
 of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│  True   │    True     │      False       │ 1.1e-05 │             14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name      value  (rounded)        hesse    at limit
------  ------------------  -----------  ----------
mu             -0.00640465  +/-    0.01       False
sigma               1.0004  +/-  0.0071       False
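
As a plausibility check of the Hesse uncertainties, the asymptotic errors of a Gaussian maximum-likelihood fit are known in closed form: sigma/sqrt(N) for the mean and sigma/sqrt(2N) for the width. The sketch below uses only the standard library and plugs in the fitted sigma and the sample size from this tutorial. It assumes that truncation by the observable limits is negligible, and it is not how hesse() computes its result.

```python
import math

n_events = 10_000
sigma_fit = 1.0004  # fitted value reported above

# Asymptotic uncertainties of the Gaussian maximum-likelihood estimates
err_mu = sigma_fit / math.sqrt(n_events)
err_sigma = sigma_fit / math.sqrt(2 * n_events)

print(f"expected error on mu:    {err_mu:.4f}")
print(f"expected error on sigma: {err_sigma:.4f}")
```

Both numbers agree with the hesse column in the table above (0.01 and 0.0071), which suggests the fit behaved as expected.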