Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganize EquivalenceClasses to use more efficient algorithms #30082

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

frankmcsherry
Copy link
Contributor

@frankmcsherry frankmcsherry commented Oct 18, 2024

This PR updates EquivalenceClasses to form and use a map from expressions to expressions when it attempts to simplify its own expressions. Minimization starts with a self.tidy() to ensure the equivalence classes are equivalence relations, in fact as well as in name. It then repeatedly calls minimize_once, which does:

  1. Form a map from expression to the representative of its class.
  2. Locally update expressions in each class, using first the map, and then optionally column-type based reduction.
  3. Identify idioms and extend classes with equivalences we can derive from those that exist.
  4. Restore the equivalence class invariant.

For me, this is much simpler algorithmically, vs wobbling the stages around. It also has the advantage that we can compare the map from step 1. with the equivalence classes in step 4., and if they are identical we are done. No need to do the extra round of validation.

There isn't much improvement in running times on the incident 217 reproduction. The outlier times for minimize() calls go down by ~4x (in debug builds from ~320ms to ~80ms), but they seem to not be as impactful as the overall work done on the query. There are several additional optimizations that are available, though:

  1. We can use the same map for calls to reduce_expr that would be served (linearly) by the equivalence classes.
  2. We can actually save the map, halving the computation of the map (I tried, but had a bug, so stopped for now).

I'm not sure how much more we'd sneak out of this without some more careful thought. It's not much more than linear time at the moment, and we need to take linear time at least just to look at and record the classes. The larger problem seems to be around needing to do the work at all, and figuring how to not show up with 6k+ expressions that are variously equated (in the repro). The query optimizes in about 7s on a release build, but no more than a few milliseconds are taken in any call to minimize().

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@frankmcsherry frankmcsherry force-pushed the equivalence_improvements branch 2 times, most recently from 1503d49 to 636c143 Compare October 18, 2024 19:39
@frankmcsherry frankmcsherry marked this pull request as ready for review October 18, 2024 20:31
@frankmcsherry frankmcsherry requested a review from a team as a code owner October 18, 2024 20:31
@antiguru antiguru self-requested a review October 18, 2024 20:40
@ggevay ggevay self-requested a review October 19, 2024 07:52
Copy link
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, but I'll defer the final say to @ggevay. I left some minor comments inside, there seems to be some unused code.

Comment on lines 721 to 740

/// Returns a map that can be used to replace (sub-)expressions.
pub fn reducer(&self) -> BTreeMap<&MirScalarExpr, &MirScalarExpr> {
self.classes
.iter()
.flat_map(|c| c.iter().map(move |e| (e, &c[0])))
.collect()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reducer seems to be unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's meant to be used in the near future, but we can remove for this commit.

Comment on lines 528 to 531
let remap_ref = &remap;
self.classes
.iter()
.all(|c| c.iter().all(move |e| remap_ref.get(e) == Some(&c[0])))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let remap_ref = &remap;
self.classes
.iter()
.all(|c| c.iter().all(move |e| remap_ref.get(e) == Some(&c[0])))
self.classes
.iter()
.all(|c| c.iter().all(|e| remap.get(e) == Some(&c[0])))

pub fn reducer(&self) -> BTreeMap<&MirScalarExpr, &MirScalarExpr> {
self.classes
.iter()
.flat_map(|c| c.iter().map(move |e| (e, &c[0])))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.flat_map(|c| c.iter().map(move |e| (e, &c[0])))
.flat_map(|c| c.iter().map(|e| (e, &c[0])))

@ggevay
Copy link
Contributor

ggevay commented Oct 21, 2024

Started a Nightly: https://buildkite.com/materialize/nightly/builds/10061

Edit: The lint failure is unrelated to this PR.

Edit 2: Ok, I think that's a successful Nightly. It's showing a benchmark regression, but it's probably just a flake. I can't imagine that query to be significantly affected by this PR. The query is

SELECT f1 FROM v1 ORDER BY f1 DESC LIMIT 1000

Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I wrote some comments.

src/transform/src/analysis/equivalences.rs Show resolved Hide resolved
@@ -525,9 +520,18 @@ impl EquivalenceClasses {
self.classes.extend(to_add);

// Tidy up classes, restore representative.
// Specifically, we want to remove literal errors before restoring the equvialence class structure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in "equvialence"

simplified = simplified || self.replace(expr);
simplified
}
fn reduce_child(&self, expr: &mut MirScalarExpr) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a duplicate code fragment: there is an other, identical reduce_child in the same file. It's not on the trait, though, so self is different, so maybe this is intentional? I'm not sure.

Copy link
Contributor Author

@frankmcsherry frankmcsherry Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so it is different, and it is intentional. This method, and the others in the trait, are meant to sub in for those methods on Equivalences. However, we haven't pivoted the whole codebase over from Equivalences::replace to ExpressionReducer::replace. Internally, minimize_once uses the ExpressionReducer version, but externally (e.g. in EquivalencePropagation) the interface hasn't changed yet (the first goal is to improve minimize rather than uses of Equivalences).

// TODO: remove these measures once we are more confident about idempotence.
let prev = self.clone();
self.minimize_once(&columns);
mz_ore::soft_assert_eq_or_log!(self, &prev, "Equivalences::minimize() not idempotent");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm feeling a bit uneasy about removing this check just in the same PR where the stability check in minimize_once is also being changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this is fine after checking the new stability check in more detail, see #30082 (comment)

self.tidy();

stable
// The state is stable if every expression is present in `remap` with the same representative, and their sizes are the same.
Copy link
Contributor

@ggevay ggevay Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the advantage of this new stability check compared to the old code that simply set stable = false when a change happens? minimize_once does a lot of stuff, it seems non-trivial to me to verify that this new check achieves the same as the old code. (Also, the old check is, I guess, faster.)

Copy link
Contributor Author

@frankmcsherry frankmcsherry Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old check required, among other things, defensive copies of every expression, because MSE::reduce does not produce a bool result indicating whether a change happened. So we were already cloning everything and doing equality checks on them. The PR moves that cloning into the formation of remap, independently needed for efficient operation, but since we now have a copy of the ground truth we started with, we use it rather than make defensive copies and bespoke change tracking.

it seems non-trivial to me to verify that this new check achieves the same as the old code.

100%. The new check is more likely to achieve the correct answer, and the complexity is in verifying that the old code even did that in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, eliminating the defensive copies is a good point!

When you say that the new code's answer is more correct, you mean that the old code might have sometimes set stable = false even when we were actually already stable, due to making some change that was not actually a meaningful change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the new check seems correct, indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say that the new code's answer is more correct ...

Imo it's not about the answer as much as the process. The new code almost directly confirms that the contents of classes at the start and at the end are identical. The old code had bespoke change tracking that relied on some amount of care in the implementation. It will be harder for the new code to produce a wrong answer going forward, but the old code could pretty easily do that.

Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, modulo the very minor outstanding comments.

Edit: There are significant changes in the meantime. Will do another review round now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants