Skip to content

The unification algorithm

Tomer Libal edited this page May 18, 2017 · 4 revisions

Documentation of the Unification Algorithm

Discussion 18/5

Type inference process each top level definition in a file and adds the typed checked definition to the context.

Type checking top-level definitions

For each top level definition, type checking proceeds by applying inference rules which generate guards. Guards are discharged by applying unification and SMT in the following ways:

  1. discharge_guard (with SMT)

This method applies once unification and then SMT and is called when

  • check_inductive_well_typeness
  • tc_abs
  • tc_eqn
  • check_top_level_let_rec
  • building a let rec environment

This function is called at the end of tc_abs, tc_eqn and check_top_level_let_rec, but what about the other two? Does it indeed happen at the end of type checking of a top level definition?

  1. resolve_implicits

This method loop on unification, SMT, unification, .. but only on resolving implicits

  • tc_eqn
  • check_top_level_let
  • check_top+level_let_rec
  • build_let_rec_env

This method calls SMT and seems to be called in the middle of type checking of a top level construct. Is it correct? In any case, it is definitely calling SMT (why?) so would we like inferring implicits to happen at the end only?

  1. solve_deferred_constraints

Is applying only unification

  • tc_abs
  • check_application_args
  • check_top_level_let
  • check_top_level_let_rec
  • tc_tot_or_gtot_term
  • universe_of_aux

Calling only unification and happens in the middle of type checking

  1. force_trivial_guard

Tries both solve_deferred_constraints and resolve_implicits from above and occurs in many places. It asserts that the guard is solvable and happens when we want to confirm the correctness of an inferred type in the middle of type inference.

From the above, it seems that SMT happens at the end of type checking top level definitions (see questions above). Unification happens both at the end and eagerly.

Non-determinism and backtracking

Non determinism can be found both by applying Huet's pre-unification algorithm and when simplifying sub-types to equality relations. It seems it was agreed that we should try to support backtracking as much as possible and restrict it using flags, tactics, etc. There seems to be no other source of non-determinism in the algorithm.

Pre-unification and flex-flex pairs

Pre-unification is not enough as we are using SMT after unification has completed and therefore, we need to apply then a concrete unifier. Failure to eliminate all flex-flex pairs in the non-pattern case result in failure of type checking.

Target of the two phase design

Computations of WPs should happen after type inference and the application of SMT. SMT should be called at the end of accumulating guards (for each top-level definition). What about inferring implicits? Why does it require SMT and why does it happen in the middle of type inference? I tried to turn off SMT on implicit arguments but then the (now non SMT) discharge_guard function returns None and an exception is thrown in Rel.fs. Guards can be eagerly dealt with by unification.

Applying unification at the end of type inference

Not recommended as guards become very complex the longer we take to apply unifiers. An example was given of unification variables denoting implicit arguments, which are being applied to all variables in the scope. It seems that as a counter-measure for that, inferring implicits is being called eagerly (but again, why SMT?).

Applying unification on deterministic unification problems does not pose any problem since we eagerly compute the most general unifier. In case of non-deterministic problems, applying unification eagerly commits to a specific unifier and prevents backtracking in case we find later the unifiers is not the correct one.

Would applying unification eagerly only on deterministic problems (i.e. patterns) be enough to prevent complex terms? (Implicit arguments seems to be patterns, being a unification variable applied to the variables in the context).

Can we decide on that? It means that there would be a phase of eager unification on deterministic problems, then accumulation of constraints until the end of type inference and then applying unification/SMT + backtracking.

Next steps?

  • How to solve resolve_implicits using SMT and being called in the middle of type inference?
  • Can we optimize the solve method to not distinguish between computations and types? Their treatment is pretty similar after all.
  • Do we actually use Huet's procedure? Does dynamic pattern unification not suffice?
  • Should we try to make discharge_guard skip calling SMT and return instead an SMT guard which will be accumulated by type inference? We can then try to discharge all these guards at the end.
  • Aseesm's experiment was to apply Lax mode, which makes all SMT calls succeed. Then use it to annotate all typing information and then run it again with SMT, for type checking only. Since some type inference depend on SMT (sub typing, flex-flex, implicits?), doesnt it annotate wrong typing information?

To proceed, we need to make all calls to SMT return a guard instead of doing anything (in Rel). At the end, we apply SMT on the accumulated guard for each top level.

General Description

The algorithm is defined in the file typechecker/FStar.TypeChecker.Rel.fs

During type checking a guard containing all constraints is being built. When the guard is to be discharged, one calls discharge_guard. This method first tries to apply unification and if SMT is allowed, it then sends the updated obligation to SMT.

Before we try to apply unification, we translate the guard into a list of unification constraints (working list) in wl_of_guard. The unification problem might allow deferred problems or not (the working list defer_ok switch).

Then we can call the algorithm on the problem by recursing on the problems in the list.

Problems can be either computation constraints or typing constraints

Typing constraints

In case deferring constraints is not allowed and the problem is not rigid-rigid or flex-flex, we try one of solve_flex_rigid_join and solve_rigid_flex_meet. Otherwise, we apply the main solving function solve_t'.

Unification Cases (solve_t' in Rel, lines 1531 - 2231)

Unification applies different rules according to the two types we try to unify. In this section we describe the different possible cases.

There are three possible relations: EQ (syntactic unification) and the two directions of the sub type relation. Apparently, syntactic unification is the first option on some common sub typing relations. I could not see back tracking applied. What happens if EQ is not the right relation to use?

Identity

Checked first.

In case both sides are syntactically identical (using BU.physical_equality), we remove this problem and continue

From some reason, we check it twice! Remove one case.

Rigid-rigid cases

This cases are dealt with first.

Comparing two universes

We call solve_one_universe_eq

Arrows

Abstractions

Refinements

Flex-rigid cases

In these cases, we have a variable (or an application with a variable head - HO variable) on one side and a term on the other.

Flex-rigid cases are the most interesting ones. In FO, patterns and FCU, we just replace the variable with a version of the term on the other side.

In the other cases, a non-determinism exists in the problem since we might be able to unify the problem with two different substitutions. The imitate one tries to replace the variable with a term whose head is the same as the head of the other term (which is either a constant of a bvar). The project case tries to project one of the arguments into head position.

Equalities

This is the standard case where we compare them syntactically. We apply solve_t_flex_rigid.

Simple sub-typing

Covers when we have a variable on one side and a universe or an arrow type on the other. According to the comments:

this case is so common, that even though we could delay, it is almost always ok to solve it immediately as an equality besides, in the case of arrows, if we delay it, the arity of various terms built by the unifier goes awry so, don't delay!

These sub-typing problems are treated as syntactical equality instead. Why is it almost always ok to solve them as equality? What happens in the cases it is not ok?

General sub-typing

If we can defer these problems, we do so. Otherwise,

Flex-flex cases

Cases where we have variables on both sides. They have most general unifiers in patterns and FCU but might have inifinite minimal complete unifiers set otherwise.

Therefore, first-order, patterns and FCU should be applied eagerly, the rest should be deferred.

We call the flex_flex function

Clone this wiki locally