%%capture
from IPython import get_ipython
install_packages = "google.colab" in str(get_ipython())
if install_packages:
    !pip install zfit
Quickstart
In this quick tutorial, we show the basic ideas of what you can do with zfit, without going into much detail or performing advanced tasks.
import numpy as np
import zfit
import zfit.z.numpy # numpy-like backend
from zfit import z # backend
Create observables
The observable space in which PDFs are defined is created with the Space class:
obs = zfit.Space('x', limits=(-10, 10))
Create data
We create some unbinned data using numpy. Other constructors, e.g. for ROOT files, are also available.
mu_true = 0
sigma_true = 1
data_np = np.random.normal(mu_true, sigma_true, size=10000)
data = zfit.data.Data.from_numpy(obs=obs, array=data_np)
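The toy sample above changes on every run because the generator is not seeded. A minimal sketch of a reproducible variant, using only NumPy, is shown below; the seed value and the names `rng`, `sample`, `lower` and `upper` are illustrative assumptions and not part of the tutorial.

```python
import numpy as np

# Seeded generator so the toy sample is the same on every run.
# The seed value itself is arbitrary (an assumption for this sketch).
rng = np.random.default_rng(seed=42)

mu_true, sigma_true = 0.0, 1.0
sample = rng.normal(mu_true, sigma_true, size=10_000)

# The observable above spans (-10, 10); check that the sample lies inside it.
lower, upper = -10.0, 10.0
inside = (sample >= lower) & (sample <= upper)

print(f"mean = {sample.mean():.3f}, std = {sample.std(ddof=1):.3f}")
print(f"fraction inside limits: {inside.mean():.1%}")
```

Such an array could then be passed to `zfit.data.Data.from_numpy` in place of `data_np`.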
Create a PDF to fit
Let’s create a Gaussian PDF so we can fit the dataset. To do this, first we create the fit parameters, which follow a convention similar to RooFit:
zfit.Parameter(name, initial_value, lower_limit (optional), upper_limit (optional), other options)
mu = zfit.Parameter("mu", 2.4, -1., 5., step_size=0.001) # step_size is not mandatory but can be helpful
sigma = zfit.Parameter("sigma", 1.3, 0, 5., step_size=0.001) # it should be around the estimated uncertainty
Now we instantiate a Gaussian from the zfit PDF library (more on how to create your own PDFs later)
gauss = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)
This PDF offers several useful methods, such as calculating a probability, calculating its integral, and sampling.
Note: Several methods, such as integrate, return a tf.Tensor, which behaves much like a wrapped NumPy array.
It can be used directly as such or converted explicitly to a NumPy array by calling:
zfit.run(TensorFlow_object)
# Let's get some probabilities.
consts = [-1, 0, 1]
probs = gauss.pdf(consts)
# And now execute the tensorflow graph
result = zfit.run(probs)
print(f"x values: {consts}\nresult: {result}")
x values: [-1, 0, 1]
result: [0.01003756 0.05582994 0.17184119]
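As a cross-check of the printed values, the sketch below evaluates a Gaussian density by hand with the initial parameter values (mu = 2.4, sigma = 1.3), normalized over the observable limits (-10, 10). It uses only the standard library; the helper name `truncated_gauss_pdf` is hypothetical, and the assumption that the PDF is normalized over the limits of the observable is stated here for illustration rather than taken from the text.

```python
import math


def truncated_gauss_pdf(x, mu, sigma, lower, upper):
    """Gaussian density normalized to unit integral over [lower, upper]."""
    unnormalized = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    # Integral of the unnormalized Gaussian over [lower, upper],
    # expressed through the error function.
    scale = sigma * math.sqrt(2.0)
    norm = sigma * math.sqrt(math.pi / 2.0) * (
        math.erf((upper - mu) / scale) - math.erf((lower - mu) / scale)
    )
    return unnormalized / norm


# Same points and initial parameter values as in the cell above.
values = [truncated_gauss_pdf(x, 2.4, 1.3, -10.0, 10.0) for x in (-1, 0, 1)]
print(values)
```

With these limits the truncation is negligible, so the numbers agree with the plain Gaussian formula and with the `result` printed above to the displayed precision.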
Fitting
To fit, we take three steps: create the negative \(\log\mathcal{L}\), instantiate a minimizer, and then minimize the likelihood.
# Create the negative log likelihood
nll = zfit.loss.UnbinnedNLL(model=gauss, data=data) # loss
# Load and instantiate a minimizer
minimizer = zfit.minimize.Minuit()
minimum = minimizer.minimize(loss=nll)
params = minimum.params
print(minimum)
FitResult
of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'> params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]>
with
<Minuit Minuit tol=0.001>
╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│ valid │ converged │ param at limit │ edm │ approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│ True │ True │ False │ 1.1e-05 │ 14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name value (rounded) at limit
------ ------------------ ----------
mu -0.00640465 False
sigma 1.0004 False
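To make the quantity being minimized concrete, here is a minimal NumPy sketch of the negative log-likelihood of a Gaussian for unbinned data. It is not how zfit computes the loss internally: it ignores the normalization over the observable limits, uses its own seeded toy sample, and the names `gauss_nll`, `mu_hat` and `sigma_hat` are assumptions made for this illustration.

```python
import numpy as np


def gauss_nll(x, mu, sigma):
    """Negative log-likelihood of a Gaussian for an unbinned sample x."""
    n = x.size
    return (
        0.5 * n * np.log(2.0 * np.pi)
        + n * np.log(sigma)
        + 0.5 * np.sum(((x - mu) / sigma) ** 2)
    )


rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility
sample = rng.normal(0.0, 1.0, size=10_000)

# For a Gaussian the maximum-likelihood estimates are known in closed form.
mu_hat = sample.mean()
sigma_hat = sample.std()  # ddof=0 is the maximum-likelihood estimate

nll_start = gauss_nll(sample, 2.4, 1.3)  # initial values used in the tutorial
nll_best = gauss_nll(sample, mu_hat, sigma_hat)
print(f"NLL at start values: {nll_start:.2f}")
print(f"NLL at the estimates: {nll_best:.2f}")
```

For 10000 events with sigma close to 1, the minimum is close to 10000 × (0.5·ln 2π + 0.5) ≈ 14189, which is comparable to the "full" fmin value in the table above. A numerical minimizer such as Minuit reaches the same point iteratively, which is what makes the approach usable for models without a closed-form solution.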
And we can plot the result to see how it went.
%matplotlib inline
import matplotlib.pyplot as plt
n_bins = 50
range_ = (-5,5)
_ = plt.hist(data_np, bins=n_bins, range=range_)
x = np.linspace(*range_, num=1000)
pdf = zfit.run(gauss.pdf(x))
_ = plt.plot(x, data_np.shape[0] / n_bins * (range_[1] - range_[0]) * pdf)
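The factor in front of `pdf` in the plotting line above equals the number of events times the bin width, which converts a density into expected counts per bin. The sketch below checks this with NumPy alone, on a seeded toy sample and a standard normal density written out by hand; `counts`, `expected` and `bin_width` are names introduced for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed for reproducibility
sample = rng.normal(0.0, 1.0, size=10_000)

n_bins = 50
range_ = (-5.0, 5.0)
counts, edges = np.histogram(sample, bins=n_bins, range=range_)
bin_width = (range_[1] - range_[0]) / n_bins

# Expected counts per bin: N * bin_width * pdf(bin center).
centers = 0.5 * (edges[:-1] + edges[1:])
pdf_at_centers = np.exp(-0.5 * centers**2) / np.sqrt(2.0 * np.pi)
expected = sample.size * bin_width * pdf_at_centers

print(f"bin width: {bin_width}")
print(f"total observed: {counts.sum()}, total expected: {expected.sum():.1f}")
```

Without this scaling the curve would integrate to one and appear flat next to a histogram whose bins hold hundreds of events.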
The FitResult that we obtained contains information about the minimization and can now be used to calculate the errors.
print(f"Function minimum: {minimum.fmin}")
print(f"Converged: {minimum.converged} and valid: {minimum.valid}")
print(minimum)
Function minimum: -7712.850874254467
Converged: True and valid: True
FitResult
of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'> params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]>
with
<Minuit Minuit tol=0.001>
╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│ valid │ converged │ param at limit │ edm │ approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│ True │ True │ False │ 1.1e-05 │ 14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name value (rounded) at limit
------ ------------------ ----------
mu -0.00640465 False
sigma 1.0004 False
hesse_errors = minimum.hesse()
print(minimum)
FitResult
of
<UnbinnedNLL model=[<zfit.<class 'zfit.models.dist_tfp.Gauss'> params=[mu, sigma]] data=[<zfit.core.data.Data object at 0x7f544c3a1150>] constraints=[]>
with
<Minuit Minuit tol=0.001>
╒═════════╤═════════════╤══════════════════╤═════════╤══════════════════════════════════╕
│ valid │ converged │ param at limit │ edm │ approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═════════╪══════════════════════════════════╡
│ True │ True │ False │ 1.1e-05 │ 14193.08 | -7712.851 │
╘═════════╧═════════════╧══════════════════╧═════════╧══════════════════════════════════╛
Parameters
name value (rounded) hesse at limit
------ ------------------ ----------- ----------
mu -0.00640465 +/- 0.01 False
sigma 1.0004 +/- 0.0071 False
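The Hesse errors above can be compared with the textbook large-sample expectations for a Gaussian, where the uncertainty on the mean is σ/√N and the uncertainty on the width is σ/√(2N). The sketch below is a plausibility check under that assumption, not a description of how `hesse()` works; the variable names are illustrative.

```python
import math

n_events = 10_000    # size of the toy sample
sigma_fit = 1.0004   # fitted value of sigma printed above

# Large-sample uncertainties of the maximum-likelihood estimates
# for a Gaussian with unknown mean and width.
err_mu = sigma_fit / math.sqrt(n_events)
err_sigma = sigma_fit / math.sqrt(2 * n_events)

print(f"expected error on mu:    {err_mu:.4f}")
print(f"expected error on sigma: {err_sigma:.4f}")
```

These come out at about 0.0100 and 0.0071, in line with the hesse column of the table above.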