%%capture

from IPython import get_ipython

install_packages = "google.colab" in str(get_ipython())
if install_packages:
    !pip install zfit

Quickstart

This quick tutorial shows the basics of what you can do with zfit, without going into much detail or covering advanced tasks.

import numpy as np
import zfit
import zfit.z.numpy  # numpy-like backend
from zfit import z  # backend

Create observables

The observable space in which PDFs are defined is created with the Space class.

obs = zfit.Space('x', -10, 10)

Create data

We create some unbinned data using numpy. Other constructors, e.g. for ROOT files, are also available.

mu_true = 0
sigma_true = 1

data_np = np.random.normal(mu_true, sigma_true, size=10000)
data = zfit.data.Data.from_numpy(obs=obs, array=data_np)

Create a PDF to fit

Let’s create a Gaussian PDF so we can fit the dataset. To do this, first we create the fit parameters, which follow a convention similar to RooFit:

# Signature: zfit.Parameter(name, initial_value, lower_limit (optional), upper_limit (optional), other options)
mu = zfit.Parameter("mu", 2.4, -1., 5., step_size=0.001)  # step_size is not mandatory but can be helpful
sigma = zfit.Parameter("sigma", 1.3, 0, 5., step_size=0.001)  # it should be around the estimated uncertainty

Now we instantiate a Gaussian from the zfit PDF library (more on how to create your own PDFs later).

gauss = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

This PDF provides several useful methods, such as calculating a probability, calculating its integral, and sampling.

# Let's get some probabilities.
consts = [-1, 0, 1]
probs = gauss.pdf(consts)
print(f"x values: {consts}\nresult:   {probs}")
x values: [-1, 0, 1]
result:   [0.01003756 0.05582995 0.17184121]
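
As a cross-check on the numbers above, the same values can be reproduced with plain NumPy. This is a minimal sketch, not zfit's implementation: it assumes the Gaussian is normalized to 1 over the observable range [-10, 10], and the helper name `gauss_pdf` is hypothetical.

```python
import math

import numpy as np


def gauss_pdf(x, mu, sigma, lower=-10.0, upper=10.0):
    """Gaussian density normalized to 1 on [lower, upper] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    unnormalized = np.exp(-0.5 * ((x - mu) / sigma) ** 2)

    # Integral of the unnormalized Gaussian over [lower, upper] via the error function
    def erf_term(edge):
        return math.erf((edge - mu) / (sigma * math.sqrt(2.0)))

    norm = sigma * math.sqrt(math.pi / 2.0) * (erf_term(upper) - erf_term(lower))
    return unnormalized / norm


# Same initial parameter values as mu and sigma above
probs_np = gauss_pdf([-1, 0, 1], mu=2.4, sigma=1.3)
print(probs_np)
```

The values are small because the PDF is still evaluated at the initial parameter values (mu = 2.4, sigma = 1.3), before any fit has been performed.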

Fitting

To fit, we need to take three steps: create the negative \(\log\mathcal{L}\), instantiate a minimizer and then minimize the likelihood.

# Create the negative log likelihood

nll = zfit.loss.UnbinnedNLL(model=gauss, data=data)  # loss

# Load and instantiate a minimizer
minimizer = zfit.minimize.Minuit()
minimum = minimizer.minimize(loss=nll)

params = minimum.params

print(minimum)
FitResult of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.Data: Data obs=('x',) shape=(10000, 1)>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | opt.) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════╡
│  True   │    True     │      False       │ 1.9e-06 │         14180.10 | -7503.273 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════╛
Parameters
name      value  (rounded)    at limit
------  ------------------  ----------
mu              0.00875873       False
sigma             0.999058       False
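
To make the three steps concrete without the library, here is a rough NumPy sketch of what an unbinned negative log-likelihood for a Gaussian looks like and where its minimum lies. It is an illustration under stated assumptions, not how zfit or Minuit work internally: the function name `gaussian_nll` is hypothetical, the normalization range is ignored (the tails beyond ±10 are negligible here), and the closed-form maximum-likelihood estimates stand in for the numerical minimizer.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seeded so the sketch is reproducible
sample = rng.normal(loc=0.0, scale=1.0, size=10_000)


def gaussian_nll(mu, sigma, x):
    """Unbinned negative log-likelihood: -sum(log pdf(x_i; mu, sigma))."""
    log_pdf = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
    return -np.sum(log_pdf)


# Closed-form maximum-likelihood estimates for a Gaussian
mu_hat = sample.mean()
sigma_hat = sample.std()  # ddof=0 is the maximum-likelihood estimator

nll_start = gaussian_nll(2.4, 1.3, sample)  # at the initial parameter values used above
nll_min = gaussian_nll(mu_hat, sigma_hat, sample)
print(f"NLL at start: {nll_start:.2f}, NLL at minimum: {nll_min:.2f}")
```

At the minimum the NLL of a Gaussian equals N(½ ln 2π + ln σ̂ + ½), which for N = 10000 and σ̂ close to 1 is roughly 1.4e4, the same order as the full fmin reported in the table above.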

And we can plot the result to see how it went.

%matplotlib inline
import matplotlib.pyplot as plt

n_bins = 50
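range_ = (-5, 5)
_ = plt.hist(data_np, bins=n_bins, range=range_)
x = np.linspace(*range_, num=1000)
probs = gauss.pdf(x)
# Scale the PDF by (number of events * bin width) so it overlays the histogram of counts
bin_width = (range_[1] - range_[0]) / n_bins
_ = plt.plot(x, data_np.shape[0] * bin_width * probs)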
[Figure: histogram of the data with the fitted Gaussian PDF overlaid]
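
The scaling factor in the plotting cell comes from comparing a density with counts: a histogram bin of width w filled from N events is expected to hold about N · w · pdf(x) entries. A small NumPy sketch of that relation follows, using a standard normal density as a stand-in for the fitted PDF; the names `density`, `expected` and `counts` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
events = rng.normal(0.0, 1.0, size=10_000)

n_bins, lo, hi = 50, -5.0, 5.0
counts, edges = np.histogram(events, bins=n_bins, range=(lo, hi))
centers = 0.5 * (edges[:-1] + edges[1:])
bin_width = (hi - lo) / n_bins


def density(x):
    """Standard normal density, standing in for the fitted PDF."""
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)


# Expected number of entries per bin: N * bin width * density at the bin center
expected = events.size * bin_width * density(centers)

print(f"total observed: {counts.sum()}, total expected: {expected.sum():.1f}")
print(f"largest bin: observed {counts.max()}, expected {expected.max():.1f}")
```

Without this factor the PDF curve, which integrates to 1, would be invisible next to bins holding hundreds of entries.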

The FitResult that we obtained contains information about the minimization and can now be used to calculate the parameter uncertainties.

print(f"Function minimum: {minimum.fmin}")
print(f"Converged: {minimum.converged} and valid: {minimum.valid}")
print(minimum)
Function minimum: 14180.095938377548
Converged: True and valid: True

FitResult of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.Data: Data obs=('x',) shape=(10000, 1)>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | opt.) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════╡
│  True   │    True     │      False       │ 1.9e-06 │         14180.10 | -7503.273 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════╛
Parameters
name      value  (rounded)    at limit
------  ------------------  ----------
mu              0.00875873       False
sigma             0.999058       False

hesse_errors = minimum.hesse()
print(minimum)
FitResult of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'>  params=[mu, sigma]] data=[<zfit.Data: Data obs=('x',) shape=(10000, 1)>] constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════╕
│  valid  │  converged  │  param at limit  │   edm   │   approx. fmin (full | opt.) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════╡
│  True   │    True     │      False       │ 1.9e-06 │         14180.10 | -7503.273 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════╛
Parameters
name      value  (rounded)        hesse    at limit
------  ------------------  -----------  ----------
mu              0.00875873  +/-    0.01       False
sigma             0.999058  +/-  0.0071       False
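
Hesse uncertainties come from the curvature of the NLL at its minimum: the covariance matrix is approximated by the inverse of the matrix of second derivatives. The sketch below illustrates that idea with NumPy and central finite differences. It is not the algorithm zfit or Minuit use, and `nll` and `numerical_hessian` are hypothetical helpers. For a Gaussian the result can be compared with the textbook values σ/√N for mu and σ/√(2N) for sigma, which for N = 10000 and σ ≈ 1 give about 0.01 and 0.0071, in line with the table above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.normal(0.0, 1.0, size=10_000)


def nll(theta):
    """Gaussian negative log-likelihood for theta = (mu, sigma)."""
    mu, sigma = theta
    return np.sum(0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma) + 0.5 * np.log(2 * np.pi))


def numerical_hessian(f, theta, h=1e-4):
    """Matrix of second derivatives of f at theta via central differences."""
    n = len(theta)
    hess = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (
                f(theta + ei + ej) - f(theta + ei - ej)
                - f(theta - ei + ej) + f(theta - ei - ej)
            ) / (4 * h * h)
    return hess


# Closed-form maximum-likelihood estimates play the role of the fitted minimum
theta_hat = np.array([x.mean(), x.std()])

covariance = np.linalg.inv(numerical_hessian(nll, theta_hat))
errors = np.sqrt(np.diag(covariance))

# Textbook expectations for a Gaussian: sigma/sqrt(N) and sigma/sqrt(2N)
analytic = np.array([theta_hat[1] / np.sqrt(x.size), theta_hat[1] / np.sqrt(2 * x.size)])

print("numerical errors:", errors)
print("analytic errors: ", analytic)
```

The factor of √2 between the two uncertainties explains why sigma is determined more precisely than mu in the fit result.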