Implement new AST pattern-matching library #940

Baltoli · 2023-12-19T11:48:12Z

This PR implements a new library to improve the way we write code that deconstructs and examines sub-terms of large, nested AST patterns. For example, consider getLeftHandSide. This function is around 120 lines of code, not counting explanatory comments, and at one point closes no fewer than 10 nested scopes at once.

Reading, writing and reviewing code of this kind is difficult, tedious and prone to bugs; @Scott-Guest did a good job of untangling the intent of these functions previously, but it's not ideal to be doing a lot of that kind of forensics. There is enough code in the backend doing this kind of decomposition that I think it's worth finding a more reusable solution.¹

My proposed solution is a template metaprogramming library that defines a small DSL whose patterns closely mirror Scott's explanatory comments. For example, with a bit of setup, the comment:

// lhs(\rewrites(\and(\equals(_, _), X), _)) = [X]

becomes the following valid C++ code:

map(rewrites(and_(equals_(any, any), X), any), getSingleton)

which matches the structure of the explanation precisely, rather than losing the intent of the code behind a pile of conditional branches and imperative code.
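
To make the idea concrete, here is a minimal sketch of how a DSL of this shape could be assembled. It is not the code in pattern_matching.h: Term, MatchResult, Node, node and the choice of X = subject(any) are assumptions made purely for illustration, and only the surface names (map, rewrites, and_, equals_, any, subject, getSingleton) come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the backend's AST: a symbol applied to arguments.
struct Term {
  std::string symbol;
  std::vector<std::shared_ptr<Term>> args;
};
using TermPtr = std::shared_ptr<Term>;

template <typename... Args>
TermPtr term(std::string symbol, Args... args) {
  return std::make_shared<Term>(Term{std::move(symbol), {std::move(args)...}});
}

// Whether a match succeeded, plus the sub-term marked as its subject (if any).
struct MatchResult {
  bool matched = false;
  TermPtr subject;
};

// Wildcard: matches every term and records no subject.
struct Any { MatchResult match(TermPtr const &) const { return {true, nullptr}; } };
inline constexpr Any any{};

// Marks whatever the inner pattern matches as the subject to extract.
template <typename Inner>
struct Subject {
  Inner inner;
  MatchResult match(TermPtr const &t) const {
    MatchResult r = inner.match(t);
    if (r.matched) r.subject = t;
    return r;
  }
};
template <typename Inner>
Subject<Inner> subject(Inner inner) { return {std::move(inner)}; }

// Matches a term with this symbol whose arguments match the sub-patterns.
template <typename... Ps>
struct Node {
  std::string symbol;
  std::tuple<Ps...> subs;
  MatchResult match(TermPtr const &t) const {
    if (t->symbol != symbol || t->args.size() != sizeof...(Ps)) return {};
    return match_args(t, std::index_sequence_for<Ps...>{});
  }
  template <std::size_t... Is>
  MatchResult match_args(TermPtr const &t, std::index_sequence<Is...>) const {
    MatchResult out{true, nullptr};
    auto step = [&out](MatchResult r) {
      out.matched = out.matched && r.matched;
      if (r.subject) out.subject = r.subject;
    };
    (step(std::get<Is>(subs).match(t->args[Is])), ...);  // left to right
    return out;
  }
};
template <typename... Ps>
Node<Ps...> node(std::string symbol, Ps... ps) {
  return {std::move(symbol), std::tuple<Ps...>(std::move(ps)...)};
}

template <typename A, typename B> auto rewrites(A a, B b) { return node("\\rewrites", a, b); }
template <typename A, typename B> auto and_(A a, B b) { return node("\\and", a, b); }
template <typename A, typename B> auto equals_(A a, B b) { return node("\\equals", a, b); }

// Builds a function that applies f to the subject when the pattern matches.
template <typename P, typename F>
auto map(P pattern, F f) {
  using R = std::invoke_result_t<F const &, TermPtr const &>;
  return [=](TermPtr const &t) -> std::optional<R> {
    MatchResult r = pattern.match(t);
    if (!r.matched) return std::nullopt;
    return f(r.subject);
  };
}

std::vector<TermPtr> getSingleton(TermPtr const &x) { return {x}; }

inline void usage_example() {
  auto const X = subject(any);
  auto lhs = map(rewrites(and_(equals_(any, any), X), any), getSingleton);
  auto x = term("x");
  auto axiom = term("\\rewrites",
      term("\\and", term("\\equals", term("a"), term("b")), x), term("rhs"));
  auto result = lhs(axiom);
  assert(result && result->size() == 1 && result->front() == x);
}
```

In this sketch the pattern is an ordinary value whose type encodes its shape, so the nesting of the expression is the only place the structure is written down. A real implementation would likely need richer results and error reporting than a single optional subject.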

By composing several of these patterns together, we are able to shrink functions like getLeftHandSide down to a tiny fraction of their original size, while also improving their readability and correctness² and preserving error-checking behaviour.³
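
As a rough illustration of what composing patterns while keeping the error checks might look like, the sketch below tries a list of matchers in order and treats "nothing matched" as an error. The name match_first appears in this PR's commit titles, but the signature used here is an assumption, and Term, Result, guarded, unguarded and extract_guard are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified AST node.
struct Term {
  std::string symbol;
  std::vector<Term> args;
};

// A matcher returns an engaged optional on success, std::nullopt otherwise.
using Result = std::optional<Term const *>;

// Try each matcher in order and return the first successful result; it is an
// error if no matcher accepts the term.
template <typename First, typename... Rest>
auto match_first(Term const &t, First first, Rest... rest) {
  if (auto r = first(t)) return *r;
  if constexpr (sizeof...(Rest) == 0) {
    throw std::runtime_error("no pattern matched: " + t.symbol);
  } else {
    return match_first(t, rest...);
  }
}

// \implies(R, _) => R
Result guarded(Term const &t) {
  if (t.symbol == "\\implies" && t.args.size() == 2) return &t.args[0];
  return std::nullopt;
}

// \rewrites(_, _) => nullptr: an accepted pattern that yields no sub-term.
Result unguarded(Term const &t) {
  if (t.symbol == "\\rewrites" && t.args.size() == 2) return Result(nullptr);
  return std::nullopt;
}

Term const *extract_guard(Term const &axiom) {
  return match_first(axiom, guarded, unguarded);
}

inline void usage_example() {
  Term rule{"\\implies", {Term{"R", {}}, Term{"body", {}}}};
  assert(extract_guard(rule) == &rule.args[0]);
}
```

The point of returning an optional pointer is that "matched, but there is nothing to extract" (an engaged optional holding nullptr) stays distinct from "did not match" (std::nullopt), so unexpected shapes still surface as errors.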

The changes in this PR are as follows:

Extract the functions in AST.cpp that perform pattern matching into a new, separate translation unit.
Implement the pattern-matching DSL in pattern_matching.h; this implementation is well-documented internally and I hope will be easy to review as a standalone feature. Additionally, unit tests for this library are added.
Refactor the AST pattern-matching code to use the new abstractions.

Footnotes

¹ Additionally, clang-tidy complains heavily about these functions - simplifying them will allow more code quality and static analysis checks to be enabled.
² I found two cases where the specification comments didn't quite match the intent of the code. This is - to be clear - a very good thing for the quality of those comments given the complexity of the code.
³ One function maps a set of "good" patterns to nullptr and errors for "bad" patterns; we can express that behaviour using the new abstraction.

Scott-Guest

LGTM! This was a much-needed code quality improvement.

I have some ideas in terms of allowing multiple subjects, but let's get this merged first, and I'll do that in a follow-up PR.

(Also, big kudos for the cleanliness, documentation, and testing here! I'd love to see this level of quality in the whole codebase.)

Scott-Guest · 2023-12-22T18:48:42Z

unittests/compiler/pattern_matching.cpp

+  auto bar = term("bar");
+  auto big = term("baz", term("a", term("a1"), term("a2")), term("b"));
+
+  for (auto const &t : {foo, bar, big}) {


I didn't know you could use initializer lists in a range-based for loop like this 🤯

Yep - I probably wouldn't do this outside of a unit test, though. There are some footguns¹ around lifetimes and deduction when you use initializer lists.

Footnotes

¹ That I can't remember in enough detail for my personal rule for real code to sensibly be anything other than "don't do this"
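
For readers curious about the kind of pitfall being alluded to, here is a small sketch of two behaviours of braced lists in a range-based for loop that I'm confident about; they may or may not be the ones the comment had in mind. The function names are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Reports whether the loop variable ever aliases one of the original objects.
inline bool loop_sees_originals() {
  std::string foo = "foo", bar = "bar";
  bool aliased = false;
  // {foo, bar} deduces std::initializer_list<std::string>, whose backing
  // array holds copies, so t refers to a copy even though it is a reference.
  for (auto const &t : {foo, bar}) {
    aliased = aliased || (&t == &foo) || (&t == &bar);
  }
  // for (auto const &t : {foo, "baz"}) {}  // would not compile: mixed types
  return aliased;
}

// One way to reach the originals: iterate over reference wrappers instead.
inline std::string append_marks() {
  std::string foo = "foo", bar = "bar";
  for (std::string &s : {std::ref(foo), std::ref(bar)}) s += "!";
  return foo + bar;
}

inline void usage_example() {
  assert(!loop_sees_originals());
  assert(append_marks() == "foo!bar!");
}
```

In a unit test the extra copies are harmless, which fits the "fine in tests, avoid elsewhere" rule of thumb above.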

lib/ast/pattern_matching.cpp

Baltoli · 2024-01-08T10:33:45Z

some ideas in terms of allowing multiple subjects

Indeed - the subject(L) lens is a specific instance of a more general lens bind(L, std::string) that binds the result of a nested match to a name, and bubbles up a mapping from names to subterms on success. It so happens that the code we use lenses for currently doesn't need multiple subjects, but it would be a nice generalisation to add for the future!
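
A rough sketch of what that generalisation could look like, under heavy assumptions: patterns here are type-erased with std::function rather than built with templates, the helper is called bind_name (a hypothetical name, chosen so that unqualified calls cannot pick up std::bind through argument-dependent lookup), and Term, Bindings, Pattern and node are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified AST node.
struct Term {
  std::string symbol;
  std::vector<Term> args;
};

// A successful match yields a mapping from names to the sub-terms they bound.
using Bindings = std::map<std::string, Term const *>;
using Pattern = std::function<std::optional<Bindings>(Term const &)>;

// Wildcard: always matches and binds nothing.
inline Pattern any() {
  return [](Term const &) { return std::optional<Bindings>(Bindings{}); };
}

// Binds the term matched by the inner pattern to the given name.
inline Pattern bind_name(Pattern inner, std::string name) {
  return [=](Term const &t) -> std::optional<Bindings> {
    auto result = inner(t);
    if (result) (*result)[name] = &t;
    return result;
  };
}

// The single-subject lens expressed as one particular binding.
inline Pattern subject(Pattern inner) {
  return bind_name(std::move(inner), "subject");
}

// Matches symbol and arity, then merges the bindings of every sub-pattern.
inline Pattern node(std::string symbol, std::vector<Pattern> subs) {
  return [=](Term const &t) -> std::optional<Bindings> {
    if (t.symbol != symbol || t.args.size() != subs.size()) return std::nullopt;
    Bindings all;
    for (std::size_t i = 0; i < subs.size(); ++i) {
      auto sub = subs[i](t.args[i]);
      if (!sub) return std::nullopt;
      all.insert(sub->begin(), sub->end());  // bubble bindings upwards
    }
    return all;
  };
}

inline void usage_example() {
  Term rule{"\\rewrites", {Term{"l", {}}, Term{"r", {}}}};
  auto both = node("\\rewrites",
                   {bind_name(any(), "lhs"), bind_name(any(), "rhs")});
  auto found = both(rule);
  assert(found && found->at("lhs") == &rule.args[0]);
  assert(found->at("rhs") == &rule.args[1]);
}
```

With named bindings, a single match can hand back several sub-terms at once, which is the "multiple subjects" case raised in the review.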

Initial commit of library

82e231c

Baltoli force-pushed the patterns branch from 47a5103 to 82e231c Compare December 19, 2023 11:54

Baltoli added 21 commits December 19, 2023 12:11

Implement more unit tests

fac2e6a

Implement and test literals

06bde4d

Implement matcher

e83a5fc

Finish testing match_first

96f2456

Test match_first with transform

60ee6c1

Port over raw term stripping code

6551ca4

Port getRequires

495c0e0

Reorder to match docs

cf6835b

Simplify

0147fe0

Shrink code further

07e00b1

Golf the code down further

f24a5af

Validate _any_ pattern matched

2f6eaa1

Refactor getRightHandSide

ec88fc7

Pull pattern matching code out into its own file

6a1521b

Reorganise and comment

40386ce

Port getLeftHandSide

2232c98

Simplify make_matcher

d85ac6b

Merge branch 'master' into patterns

e86888d

Merge branch 'master' into patterns

5f5f7fb

Give matcher a better name

656333e

Update with comments in library

5eb6cb2

Baltoli marked this pull request as ready for review December 22, 2023 12:41

Baltoli mentioned this pull request Dec 22, 2023

Pattern matching improvements #938

Closed

Scott-Guest approved these changes Dec 22, 2023

Baltoli added 2 commits January 8, 2024 10:09

Fix missing close-paren

08d2e7b

Merge branch 'master' into patterns

0e2887a

Baltoli requested review from theo25 and dwightguth January 8, 2024 10:12

Baltoli requested a review from gtrepta January 8, 2024 10:12

Scott-Guest reviewed Jan 8, 2024

lib/ast/pattern_matching.cpp (outdated, resolved)

Update lib/ast/pattern_matching.cpp

2a3a5de

Co-authored-by: Scott Guest <scott.guest@runtimeverification.com>

dwightguth reviewed Jan 8, 2024

lib/ast/pattern_matching.cpp (resolved)

Baltoli added the automerge label Jan 10, 2024

rv-jenkins merged commit 276afab into master Jan 10, 2024
7 checks passed

rv-jenkins deleted the patterns branch January 10, 2024 16:28
