DOC clarify the kernel gradient for GaussianProcesses (#18115)
rauwuckl committed Aug 15, 2020
1 parent bdf2ff5 commit eb7b158
Showing 2 changed files with 52 additions and 42 deletions.
14 changes: 8 additions & 6 deletions doc/modules/gaussian_process.rst
@@ -385,12 +385,14 @@ equivalent call to ``__call__``: ``np.diag(k(X, X)) == k.diag(X)``
Kernels are parameterized by a vector :math:`\theta` of hyperparameters. These
hyperparameters can for instance control length-scales or periodicity of a
kernel (see below). All kernels support computing analytic gradients
- of the kernel's auto-covariance with respect to :math:`\theta` via setting
- ``eval_gradient=True`` in the ``__call__`` method. This gradient is used by the
- Gaussian process (both regressor and classifier) in computing the gradient
- of the log-marginal-likelihood, which in turn is used to determine the
- value of :math:`\theta`, which maximizes the log-marginal-likelihood, via
- gradient ascent. For each hyperparameter, the initial value and the
+ of the kernel's auto-covariance with respect to :math:`log(\theta)` via setting
+ ``eval_gradient=True`` in the ``__call__`` method.
+ That is, a ``(len(X), len(X), len(theta))`` array is returned where the entry
+ ``[i, j, l]`` contains :math:`\frac{\partial k_\theta(x_i, x_j)}{\partial log(\theta_l)}`.
+ This gradient is used by the Gaussian process (both regressor and classifier)
+ in computing the gradient of the log-marginal-likelihood, which in turn is used
+ to determine the value of :math:`\theta`, which maximizes the log-marginal-likelihood,
+ via gradient ascent. For each hyperparameter, the initial value and the
bounds need to be specified when creating an instance of the kernel. The
current value of :math:`\theta` can be get and set via the property
``theta`` of the kernel object. Moreover, the bounds of the hyperparameters can be
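The distinction between a gradient with respect to :math:`\theta` and one with respect to :math:`log(\theta)` can be illustrated with a small standalone sketch. This is not scikit-learn's implementation: the helper names (``rbf``, ``rbf_grad_log_length_scale``, ``gradient_array``, ``finite_difference_in_log_space``) are hypothetical, the kernel is a scalar-input squared-exponential kernel with a single hyperparameter, and plain nested lists stand in for the ``(len(X), len(X), len(theta))`` array. The analytic value follows from the chain rule, and a finite difference taken in log space is used as a sanity check.

```python
import math


def rbf(x, y, length_scale):
    """Squared-exponential kernel for scalar inputs (illustrative only)."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def rbf_grad_log_length_scale(x, y, length_scale):
    """Analytic derivative of rbf with respect to log(length_scale).

    Chain rule: dk/dlog(l) = l * dk/dl = k * (x - y)^2 / l^2.
    """
    return rbf(x, y, length_scale) * (x - y) ** 2 / length_scale ** 2


def gradient_array(X, length_scale):
    """Nested list of shape (len(X), len(X), 1).

    Entry [i][j][0] holds dk(x_i, x_j) / dlog(length_scale).
    """
    return [
        [[rbf_grad_log_length_scale(xi, xj, length_scale)] for xj in X]
        for xi in X
    ]


def finite_difference_in_log_space(x, y, length_scale, eps=1e-6):
    """Central difference obtained by perturbing log(length_scale)."""
    log_l = math.log(length_scale)
    k_plus = rbf(x, y, math.exp(log_l + eps))
    k_minus = rbf(x, y, math.exp(log_l - eps))
    return (k_plus - k_minus) / (2.0 * eps)


# Hypothetical sample points and hyperparameter value.
X = [0.0, 0.5, 2.0]
length_scale = 1.5

analytic = gradient_array(X, length_scale)

# Largest disagreement between the analytic and numerical gradients.
max_abs_error = max(
    abs(analytic[i][j][0]
        - finite_difference_in_log_space(X[i], X[j], length_scale))
    for i in range(len(X))
    for j in range(len(X))
)
```

Dividing each entry by ``length_scale`` would recover the derivative with respect to the hyperparameter itself, which is the quantity the old wording appeared to describe.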
80 changes: 44 additions & 36 deletions sklearn/gaussian_process/kernels.py
@@ -572,8 +572,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined.
+ Determines whether the gradient with respect to the log of the
+ kernel hyperparameter is computed.
Returns
-------
@@ -582,7 +582,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape \
(n_samples_X, n_samples_X, n_dims, n_kernels), optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -796,8 +796,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
Returns
-------
@@ -806,7 +806,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -894,8 +894,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
Returns
-------
@@ -904,7 +904,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -1072,8 +1072,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
Returns
-------
@@ -1082,7 +1082,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -1200,8 +1200,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -1210,7 +1211,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when eval_gradient
is True.
"""
@@ -1319,8 +1320,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
is evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -1329,7 +1331,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when eval_gradient
is True.
"""
@@ -1466,8 +1468,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -1476,7 +1479,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -1620,8 +1623,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -1630,7 +1634,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -1809,16 +1813,17 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
K : ndarray of shape (n_samples_X, n_samples_Y)
Kernel k(X, Y)
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims)
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when eval_gradient
is True.
"""
@@ -1954,8 +1959,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -1964,7 +1970,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -2086,8 +2092,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -2096,7 +2103,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""
@@ -2240,8 +2247,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
if evaluated instead.
eval_gradient : bool, default=False
- Determines whether the gradient with respect to the kernel
- hyperparameter is determined. Only supported when Y is None.
+ Determines whether the gradient with respect to the log of
+ the kernel hyperparameter is computed.
+ Only supported when Y is None.
Returns
-------
@@ -2250,7 +2258,7 @@ def __call__(self, X, Y=None, eval_gradient=False):
K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
optional
- The gradient of the kernel k(X, X) with respect to the
+ The gradient of the kernel k(X, X) with respect to the log of the
hyperparameter of the kernel. Only returned when `eval_gradient`
is True.
"""