
[Bug] Inconsistent Gradient Values for N-Dimensional Integral of Multidimensional Normal Distribution #2411

Open · simplifier89 opened this issue Jul 4, 2024 · 1 comment · Labels: bug

@simplifier89

🐛 Bug

Description:
I am computing the N-dimensional integral of a multivariate normal distribution and its gradient with respect to the upper integration limits. Since the covariance matrix and the limits are symmetric across dimensions, the partial derivatives should all be equal, yet autograd returns different values along different dimensions, which seems strange. Below are the steps and code snippets demonstrating the issue.

Steps to Reproduce:

Compute the integral and gradients using the provided code.
Compare the results with finite difference approximations.

**Code snippet to reproduce**

import torch
from botorch.utils.probability import MVNXPB

torch.set_printoptions(precision=7)
dtype = torch.float64

N_dim = 6
# Upper integration limits, all equal to 1; gradients are taken w.r.t. these.
up_limit = torch.ones(N_dim, dtype=dtype)
up_limit.requires_grad_()
# Integrate each dimension from -inf to its upper limit.
integration_limits = torch.tensor([[-torch.inf, -torch.inf]] * N_dim, dtype=dtype)
integration_limits[..., -1] = up_limit
# Exchangeable covariance: 1 on the diagonal, 0.5 everywhere else.
cov_matrix = (0.5 * torch.ones((N_dim, N_dim), dtype=dtype)).fill_diagonal_(1.)
mvn_pb = MVNXPB(cov_matrix, integration_limits)
mvn_pb.solve()
pb_result = mvn_pb.log_prob.exp()
print(f'pb_result: {pb_result}')
delta_limit = torch.autograd.grad(pb_result, up_limit, create_graph=False, retain_graph=False)
print(f'delta_integral {delta_limit}')
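
For an independent reference value, the same integral can be evaluated with SciPy's multivariate normal CDF (a minimal sketch, assuming scipy is installed; multivariate_normal.cdf integrates each dimension from -inf to the given point, matching the limits above):

import numpy as np
from scipy.stats import multivariate_normal

# Reference value for P(X <= up_limit) under the same covariance.
ref = multivariate_normal.cdf(
    up_limit.detach().numpy(),
    mean=np.zeros(N_dim),
    cov=cov_matrix.numpy(),
)
print(f'scipy reference: {ref}')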

When I calculate the same derivatives with finite differences, all values are equal, as expected from the symmetry of the problem. Note that the finite-difference values are close only to the last two entries of delta_limit.

Additionally, if cov_matrix is the identity matrix, we do not observe this behavior: the gradient values are equal across dimensions and close to the finite-difference values (see the sketch after the finite-difference code below).

# Forward finite differences with step h = 1e-4, perturbing one upper limit at a time.
finite_differences = []
for i in range(integration_limits[..., -1].shape[-1]):
    integration_limits[..., -1][i] = 1.0001
    mvn_pb = MVNXPB(cov_matrix, integration_limits)
    mvn_pb.solve()
    pb_result_2 = mvn_pb.log_prob.exp()
    finite_differences.append((pb_result_2 - pb_result) / .0001)
    integration_limits[..., -1][i] = 1.  # restore the perturbed limit
print(f'finite_differences {finite_differences}')
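
A minimal sketch of the identity-covariance variant mentioned above (same setup as the snippet at the top, with cov_matrix replaced by the identity; all gradient entries come out equal):

# Identity-covariance variant: autograd gradients match across dimensions.
up_limit_eye = torch.ones(N_dim, dtype=dtype).requires_grad_()
limits_eye = torch.tensor([[-torch.inf, -torch.inf]] * N_dim, dtype=dtype)
limits_eye[..., -1] = up_limit_eye
mvn_pb_eye = MVNXPB(torch.eye(N_dim, dtype=dtype), limits_eye)
mvn_pb_eye.solve()
prob_eye = mvn_pb_eye.log_prob.exp()
grad_eye = torch.autograd.grad(prob_eye, up_limit_eye)
print(f'identity-cov gradients: {grad_eye}')  # entries equal, and close to finite differences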

**Stack trace/error message**

delta_integral (tensor([0.0930755, 0.0930755, 0.0848957, 0.0848957, 0.0782828, 0.0782828],
       dtype=torch.float64),)

Expected Behavior

finite_differences [tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>), tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>), tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>), tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>), tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>), tensor(0.0782873, dtype=torch.float64, grad_fn=<DivBackward0>)] 

System information

  • BoTorch Version 0.11.0
  • GPyTorch Version 1.11
  • PyTorch Version 2.3.0
  • Computer OS macOS Sonoma 14.4.1

Question:
Is this a bug or is there an explanation for this behavior?

simplifier89 added the bug label Jul 4, 2024
@Balandat (Contributor) commented Jul 5, 2024

Hmm, interesting, thanks for flagging this. I'm not entirely sure what causes this, but it's got to be something in the implementation weeds. One thing to note is that MVNXPB approximates the integral, so the gradients are also approximate; depending on the order in which the bivariate conditioning goes through the dimensions, I can see those errors being different for different dimensions.

I would assume that the errors here grow larger as the covariance gets closer to singular; I'm curious if you've observed that in practice. E.g., what happens to the size of the errors as you let diag_val in cov_matrix = (0.5 * torch.ones((N_dim, N_dim), dtype=dtype)).fill_diagonal_(diag_val) approach 0.5?
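
A quick sketch of that experiment (same setup as the report above, sweeping diag_val toward the singular limit of 0.5; the spread of the gradient entries should be zero by symmetry, so it directly measures the approximation error):

# Shrink diag_val toward 0.5, where the covariance becomes singular (rank 1),
# and track the spread of the autograd gradient across dimensions.
for diag_val in (2.0, 1.5, 1.0, 0.75, 0.6, 0.55):
    up = torch.ones(N_dim, dtype=dtype).requires_grad_()
    lims = torch.tensor([[-torch.inf, -torch.inf]] * N_dim, dtype=dtype)
    lims[..., -1] = up
    cov = (0.5 * torch.ones((N_dim, N_dim), dtype=dtype)).fill_diagonal_(diag_val)
    solver = MVNXPB(cov, lims)
    solver.solve()
    (grad,) = torch.autograd.grad(solver.log_prob.exp(), up)
    spread = (grad.max() - grad.min()).item()
    print(f'diag_val={diag_val}: gradient spread = {spread:.3e}')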
