More efficient method for check_parallel_jacobian #1395
base: main (changes from 28 commits)
@@ -28,7 +28,7 @@
 import numpy as np
 from scipy.linalg import svd
 from scipy.sparse.linalg import svds, norm
-from scipy.sparse import issparse, find
+from scipy.sparse import issparse, find, triu, diags

 from pyomo.environ import (
     Binary,
@@ -3712,8 +3712,9 @@ def ipopt_solve_halt_on_error(model, options=None):

 def check_parallel_jacobian(
     model,
-    tolerance: float = 1e-4,
+    tolerance: float = 1e-8,
     direction: str = "row",
+    zero_norm_tolerance: float = 1e-8,
     jac=None,
     nlp=None,
 ):
@@ -3746,8 +3747,6 @@ def check_parallel_jacobian(
     list of 2-tuples containing parallel Pyomo components

     """
-    # Thanks to Robby Parker for the sparse matrix implementation and
-    # significant performance improvements

     if direction not in ["row", "column"]:
         raise ValueError(
@@ -3762,52 +3761,64 @@
     # they correspond to.
     if direction == "row":
         components = nlp.get_pyomo_constraints()
-        csrjac = jac.tocsr()
-        # Make everything a column vector (CSC) for consistency
-        vectors = [csrjac[i, :].transpose().tocsc() for i in range(len(components))]
+        mat = jac.tocsr()

     elif direction == "column":
         components = nlp.get_pyomo_variables()
-        cscjac = jac.tocsc()
-        vectors = [cscjac[:, i] for i in range(len(components))]
-
-    # List to store pairs of parallel components
-    parallel = []
-
-    vectors_by_nz = {}
-    for vecidx, vec in enumerate(vectors):
-        maxval = max(np.abs(vec.data))
-        # Construct tuple of sorted col/row indices that participate
-        # in this vector (with non-negligible coefficient).
-        nz = tuple(
-            sorted(
-                idx
-                for idx, val in zip(vec.indices, vec.data)
-                if abs(val) > tolerance and abs(val) / maxval > tolerance
-            )
-        )
-        if nz in vectors_by_nz:
-            # Store the index as well so we know what component this
-            # corresponds to.
-            vectors_by_nz[nz].append((vec, vecidx))
-        else:
-            vectors_by_nz[nz] = [(vec, vecidx)]
-
-    for vecs in vectors_by_nz.values():
-        for idx, (u, uidx) in enumerate(vecs):
-            # idx is the "local index", uidx is the "global index"
-            # Frobenius norm of the matrix is the 2-norm of this column vector
-            unorm = norm(u, ord="fro")
-            for v, vidx in vecs[idx + 1 :]:
-                vnorm = norm(v, ord="fro")
-
-                # Explicitly multiply a row vector * column vector
-                prod = u.transpose().dot(v)
-                absprod = abs(prod[0, 0])
-                diff = abs(absprod - unorm * vnorm)
-                if diff <= tolerance or diff <= tolerance * max(unorm, vnorm):
-                    parallel.append((uidx, vidx))
-
-    parallel = [(components[uidx], components[vidx]) for uidx, vidx in parallel]
+        mat = jac.transpose().tocsr()
+
+    # Take the product of all rows/columns with all rows/columns by taking
+    # the outer product of the matrix with itself
+    outer = mat @ mat.transpose()
Review thread on the line outer = mat @ mat.transpose():

Comment: The previous implementation was designed to avoid taking every possible dot product by first sorting vectors by their nonzero structure. I would have expected this to be slow due to unnecessary dot products, but in some testing, it doesn't seem slower.

Reply: This is sparse matrix multiplication carried out by SciPy. The key here is that 1) sorting through which products to take should be carried out in the sparse matrix multiplication routine, not manually, and […]

Reply: The SciPy matrix product is taking every possible combination of nonzero dot products, while the previous implementation takes only dot products between vectors with the same sparsity structure. That is fewer dot products, although, in practice, probably not by a wide margin.

Reply: I'm afraid I don't entirely follow. SciPy's matrix multiply, as near as I can tell, does the same thing. It implements the SMMP algorithm, which occurs in two steps: the first to determine the nonzero structure of the result, the second to actually compute the values. That's basically what you're doing, except 1) it isn't filtering out entries as being "too small to contribute" and 2) it's doing it in C++ instead of Python, which is much faster.

Reply: For example, if two rows are […]

Reply: So you're ruling out rows being parallel based on one having an element for the third index (3) and the other having zero for the third index? That can fail if the first row was […]

Reply: We do this after filtering out small entries.
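To ground this thread, here is a minimal, self-contained sketch of the outer-product approach under discussion. It is illustrative rather than the PR's exact code: the function name find_parallel_rows and the toy matrix are made up, but the normalized Gram-matrix technique follows the diff above.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, find, triu

def find_parallel_rows(mat, tolerance=1e-8, zero_norm_tolerance=1e-8):
    """Return index pairs (i, j), i < j, of nearly parallel rows of mat."""
    mat = csr_matrix(mat)
    # Gram matrix: entry (i, j) is dot(row_i, row_j). SciPy's sparse matmul
    # only forms products where the sparsity structures overlap.
    gram = mat @ mat.transpose()
    # dot(u, u) == norm(u)**2, so the diagonal gives the row norms.
    norms = np.sqrt(gram.diagonal())
    # Zero out rows/columns with (near-)zero norm so they are ignored.
    inv_norms = np.zeros_like(norms)
    nonzero = norms > zero_norm_tolerance
    inv_norms[nonzero] = 1.0 / norms[nonzero]
    scaling = diags(inv_norms)
    cosine = scaling @ gram @ scaling  # entry (i, j) is cos(theta_ij)
    # k=1 drops the diagonal along with the duplicate lower triangle,
    # instead of subtracting diags(...) as the diff does.
    rows, cols, vals = find(triu(cosine, k=1))
    mask = (1 - np.abs(vals)) < tolerance
    return [(int(i), int(j)) for i, j in zip(rows[mask], cols[mask])]

A = csr_matrix(np.array([[1.0, 2.0, 0.0],
                         [2.0, 4.0, 0.0],   # parallel to row 0
                         [0.0, 1.0, 1.0]]))
print(find_parallel_rows(A))  # -> [(0, 1)]
```

Because the multiplication is done by SciPy's sparse routine, dot products between rows with disjoint sparsity patterns are never formed at all, which is the point raised in the thread.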
+
+    # Along the diagonal of the outer product you get the dot product of the
+    # row with itself, which is equal to the norm squared.
+    norms = np.sqrt(outer.diagonal())
+
+    # We want to ignore indices with zero norm. By zeroing out the
+    # corresponding entries of the scaling matrix, we set the corresponding
+    # rows and columns to zero, which will then be ignored.
+    zero_norm_indices = np.nonzero(np.abs(norms) <= zero_norm_tolerance)
+    inv_norms = 1 / norms
+    inv_norms[zero_norm_indices] = 0
+
+    # Divide each row and each column by the vector norm. This leaves
+    # the entries as dot(u, v) / (norm(u) * norm(v)). The exception is
+    # entries with "zero norm", whose corresponding rows and columns are
+    # set to zero.
+    scaling = diags(inv_norms)
+    outer = scaling * outer * scaling
+
+    # Get rid of duplicate values by taking only the upper triangular part
+    # of the resulting matrix
+    upper_tri = triu(outer)
+
+    # Set diagonal elements to zero.
+    # Subtracting diags(upper_tri.diagonal()) is a more reliable method of
+    # getting the entries to exactly zero than subtracting an identity
+    # matrix, where one can encounter values of 1e-16.
+    upper_tri = upper_tri - diags(upper_tri.diagonal())
+
+    # Get the nonzero entries of upper_tri in three arrays, corresponding
+    # to row indices, column indices, and values
+    rows, columns, values = find(upper_tri)
+
+    # We have that dot(u, v) == norm(u) * norm(v) * cos(theta), in which
+    # theta is the angle between u and v. If theta is approximately 0 or pi,
+    # sqrt(2 * (1 - abs(dot(u, v)) / (norm(u) * norm(v)))) approximately
+    # equals the number of radians from 0 or pi. A tolerance of 1e-8
+    # corresponds to a tolerance of sqrt(2) * 1e-4 radians.
+
+    # The expression (1 - abs(values) < tolerance) returns a boolean array
+    # indicating whether the condition is fulfilled for each entry. We then
+    # find the indices where it holds using np.nonzero.
+    parallel_1D = np.nonzero(1 - abs(values) < tolerance)[0]
Review thread on the tolerance check:

Comment: What is the motivation for checking […]?

Reply: Both the previous implementation and the Gurobi implementation checked against a semi-normalized epsilon: […]. Note that this formulation doesn't scale the tolerance with […]. We could implement the check like […]. It's addition and subtraction that can cause problems with catastrophic cancellation.

Reply: I would advocate for […].

Reply: What's the intention with the […]? I don't think avoiding vectorized operations is going to be a major improvement in performance, especially since we'll be adding two multiplications ([…]).
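For concreteness, here is a sketch of the semi-normalized check referenced in this thread, following the logic of the removed implementation shown earlier in the diff (dense NumPy vectors here, purely for illustration):

```python
import numpy as np

def is_parallel(u, v, tolerance=1e-4):
    """Semi-normalized parallel test from the previous implementation:
    compare abs(abs(dot(u, v)) - norm(u) * norm(v)) against a tolerance,
    both raw and scaled by the larger of the two norms."""
    unorm = np.linalg.norm(u)
    vnorm = np.linalg.norm(v)
    diff = abs(abs(np.dot(u, v)) - unorm * vnorm)
    return diff <= tolerance or diff <= tolerance * max(unorm, vnorm)

print(is_parallel(np.array([1.0, 2.0]), np.array([-2.0, -4.0])))  # True
print(is_parallel(np.array([1.0, 0.0]), np.array([0.0, 1.0])))    # False
```

Unlike the 1 - abs(values) < tolerance test in the new code, which fully normalizes by norm(u) * norm(v), this variant scales the comparison by max(norm(u), norm(v)) only.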
+
+    parallel = [
+        (components[rows[idx]], components[columns[idx]]) for idx in parallel_1D
+    ]
+
     return parallel
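A quick numerical check (not part of the diff) of the radians claim in the code comments above: a vector that is sqrt(2) * 1e-4 radians away from parallel yields 1 - |cos(theta)| of about 1e-8, right at the new default tolerance.

```python
import numpy as np

theta = np.sqrt(2) * 1e-4  # radians away from exactly parallel
u = np.array([1.0, 0.0])
v = np.array([np.cos(theta), np.sin(theta)])  # u rotated by theta
cos_sim = abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
# 1 - cos(theta) ~= theta**2 / 2 for small theta, so this prints ~1e-8
print(1 - cos_sim)
```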
Review comment on the removed attribution comment: This should not be deleted, but added to.
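To close, a hedged usage sketch of the function under review. The import path is an assumption based on the file this diff appears to touch (IDAES's model diagnostics utilities), and the toy model is invented for illustration; building the underlying NLP needs an active objective, hence the dummy one.

```python
from pyomo.environ import ConcreteModel, Constraint, Objective, Var
# Assumed import path for the function shown in this diff:
from idaes.core.util.model_diagnostics import check_parallel_jacobian

m = ConcreteModel()
m.x = Var(initialize=1.0)
m.y = Var(initialize=1.0)
m.c1 = Constraint(expr=m.x + m.y == 1)
m.c2 = Constraint(expr=2 * m.x + 2 * m.y == 5)  # same direction as c1
m.obj = Objective(expr=m.x)  # an objective so the NLP can be built

# With the new defaults (tolerance=1e-8), rows c1 and c2 should be
# reported as a parallel pair:
print(check_parallel_jacobian(m, direction="row"))  # expect [(c1, c2)]
```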