Skip to content
nikswamy edited this page Jun 27, 2023 · 22 revisions

This is a partial, evolving guide recommending various syntactic conventions.

Basic syntactic structure

  • Files should begin with the following copyright notice, where YYYY is replaced with a suitable year range (e.g., some long-lived files have the range 2008--2019
(*
   Copyright YYYY Microsoft Research
   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at
       http://www.apache.org/licenses/LICENSE-2.0
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
*)
  • Source files should not contain tabs
  • The unit of indentation is 2 spaces
  • Lines should not exceed 100 characters
  • JP: Avoid trailing whitespace

Modular structure

  • Module names should use CamelCase
  • Module filenames should follow the same case convention as the module names themselves, i.e., CamelCase.fst
  • Modules should have separate interface files, i.e., CamelCase.fsti
  • Module abbreviations are preferred over opened modules and namespaces, with well-chosen exceptions
  • Groups of related modules should be composed into a single module using include for easier use by clients

Naming conventions within a module

  • Data constructors must begin with an uppercase letter and, like module names, use upper CamelCase.

  • Identifiers are required by F* to begin with a lowercase character. All lowercase identifiers should delimit words in the identifier with underscores. i.e., partial_map rather than partialMap or partialmap.

    • This is more compatible with our existing conventions
    • It works better with abbreviations in identifiers: e.g., use_SMT_pat rather than the camelCase useSMTPat.
  • Use descriptive, multi-character names for top-level names, especially those exported by a module. (i.e., contrary to OCaml style, prefer giving names like PartialMap.partial_map rather than PartialMap.t, particularly as the module name may often be abbreviated or opened)

  • Redefining commonly used operators with different semantics is a bad idea. We should avoid doing this. Redefining them while preserving semantics is better, but even that should be done rarely. For an example of this, see FStar.Integers, which redefines all the arithmetic operators, extending their domains from just mathematical integers to machine integers also. That's quite special, but at least it does this while preserving their semantics on math integers.

  • Sometimes, the name of a definition may reflect its type, e.g., Seq.seq_of_list. More often, the name of a definition should provide some more descriptive name, reflecting the most meaningful part of its specification paraphrased in English. The name of the lemma below is a good example.

val sel_upd_distinct_key (#k:eqtype) (#v:Type) (m:t k v) (x1 x2:k) (y:v)
  : Lemma (requires x1 =!= x2)
          (ensures sel (upd m x1 y) x2 == sel m x2)
          [SMTPat (sel (upd m x1 y) x2)]
  • Note, although many lemmas in the library have names that include the fragment lemma_, this style is not preferred.

Formatting type annotations

  • Annotate all top-level terms with their full types. This may take one of the following forms:

    let f (x1:t1) ... (xn:tn) : C = e
    

    Or

    val f (x1:t1) ... (xn:tn) : C
    let f x1 ... xn = e
    

    Unless required by needs of an interface, prefer the first style (i.e, use only an annotated let rather than let + val).

    If using the val form, do not repeat the types provided in the val in the let also.

  • Format of a val

    F* accepts both of the following notations for val

    val f (x1:t1) ... (xn:tn) : C
    

    or

    val f : x1:t1 -> ... -> xn:tn -> C
    

    Prefer the first, as it is more compatible with the syntax of let.

  • Prefer writing x:t -> t' rather than (x:t) -> t'. This is particularly useful in conjunction with refinement types, e.g., x:int{x > 0} -> y:int{y > x} -> t, rather than (x:int{x > 0}) -> (y:int{y > x}) -> t, or x:(x:int{x > 0}) -> y:(y:int{y > x}) -> ...

  • Repeated arguments with the same type: Prefer writing (x y :int) -> t rather than x:int -> y:int -> t

  • Linebreaks in type annotations:

    • A type annotation should be on a single line, except if it would exceed 100 characters.
    • Otherwise, if all the arguments fit within 100 characters, annotate the result computation type on a separate line
    • Otherwise, every argument goes and the result type goes on a separate line.

    Here are some examples:

    Fits on a single line:

    val on_domain (a:Type) (#b:a -> Type) (f:arrow a b) : Tot (arrow a b)
    

    Arguments fit on a single line:

    val idempotence_on_domain (#a:Type) (#b:a -> Type) (f:arrow a b)
        : Lemma (on_domain a (on_domain a f) == on_domain a f)
    

    Too many arguments, layout on multiple lines, one type per line. The arguments are triple indented (6 chars in) The colon is double indented (i.e., 4 chars in).

val modifies_preserves_liveness
      (#aloc: aloc_t)
      (#c: cls aloc)
      (s1 s2: loc c)
      (h h': HS.mem)
      (#t: Type)
      (#pre: Preorder.preorder t)
      (r: HS.mreference t pre)
    : Lemma
      (requires 
        modifies (loc_union s1 s2) h h' /\
        loc_disjoint s1 (loc_mreference r) /\
        loc_includes (address_liveness_insensitive_locs c) s2 /\
        h `HS.contains` r)
      (ensures 
        h' `HS.contains` r)
  • Computation types:
    • Fit it on a single line
    • Otherwise effect label and result type (if any) on a single line and other arguments on one line each
    • In pre- and post-conditions, formulae have one conjunct per line (think of /\ as a line delimiter)
    • Alignment is to the start column of the effect label
    • See above for an example

JP: Note that we could make this example use even less horizontal space by removing the triple-indentation (in favor of single-indentation) but I don't know what is the popular style these days.

Formatting logical formulae

  • End the line with a formula connective (typically /\ or ==>)
  • For numeric ranges, prefer semi-open intervals (closed lower bound, open upper bound)
  • When writing numeric ranges, stick to the use of one form of inequality, i.e., prefer 0 <= i /\ i < len with the i sandwiched between its bounds. The alternative len > i /\ 0 <= i is much harder to read. The same applies to the Boolean variants of these ranges, i.e., prefer 0 <= i && i < len.
  • Prefer one conjunct per line, except when there's an obvious reason to prefer having more than one conjunct on the line, e.g., a numeric range like 0 <= i /\ i < len is good to keep on a single line.

Other things:

  • recommended style for if-then-else:
if e1 then
  ...
else
  ...
  • how to indent List.map

  • use (abuse?) of <| and |>

  • spaces around operators?

    • colon:
      • in parameters: (x y:list t) or (x y: list t) or (x y :list t) or (x y : list t)
      • in return types: let f (x:int):int or let f (x:int) : int or …
      • in variable types: let x:int = … or let x : int = … or …
      • in record types and datatype declarations
      • VD: this snippet captures most of the patterns around colon spacing, as currently implemented in the prettyprinter
val fun_set_equal_elim 
      (#t #t': Type)
      (#p: (t -> GSet.set t' -> Type))
      (f1 f2: i_restricted_g_t t (fun x -> g: GSet.set t' {p x g}))
    : Lemma (requires (fun_set_equal f1 f2)) (ensures (f1 == f2))

In most cases, the (x y: list t) style is used, except for computation types (e.g., let regions_of_loc (#al: aloc_t) (#c: cls al) (s: loc c) : GTot (Set.set HS.rid)).

Refinements are either separated by one space from the type, as shown above, if the type contains spaces, or is attached to the type if it doesn't (e.g., (r': HS.rid{Set.mem r' r})). (My motivation for this is that it seems to strike a good balance between making larger types more readable while not needlessly expanding smaller types.)

  • double colon: x::nil or x :: nil

    • Proposal: Use x::nil whenever x and nil do not contain spaces. And use SomeConstructor x y :: z when one of the sides contains spaces.
  • type declarations?

type foo =
| Foo

or

type foo =
  | Foo

VD: Inductives are currently formatted like this:

type pattern =
  | PAny : pattern
  | PVar : name: varname -> pattern
  | PQn : qn: qn -> pattern
  | PType : pattern
  | PApp : hd: pattern -> arg: pattern -> pattern
  • records -- where do the curly braces go?

  • pattern matching?

match x with
| Foo ->
  foo
| Bar ->
  bar

or

match x with
| Foo ->
    foo
| Bar ->
    bar

VD: where the pattern, arrow and branch don't all fit on one line, I'm currently using the first variant, which I think looks neater JP: indenting branch bodies at four characters allows putting long "when" clauses at two characters

  • nested pattern matching: use begin/end or parens?

  • long let definitions -- where does the = go?

let f (x:int)
    : Lemma
      (requires …)
      (ensures …) // does = go here?
  = // or here? with what indentation?
  5 // same line as =, or different line?
  • single-line comments
// Like this?
let x = 5 // like this?
let x = 5  // like this?

or

(* Like this? *)
let x = 5 (* like this? *)
let x = 5  (* like this? *)
  • multi-line comments

Documentation

There are two styles of writing documentations for F* programs: fsdoc and fslit. Both styles are intended for different purposes but can and should coexist.

Before writing documentation, it is strongly encouraged to create .fsti interface files with all your modules.

fsdoc

fsdoc is the basic documentation style, akin to ocamldoc. fsdoc comments are introduced by the (** ... *) construct. Here is when you should use them.

  • val documentation: each val should have a single-line short comment describing its use. Ideally, this is displayed in the interactive mode when hovering the symbol. You should also include, after the short single-line comment, examples of usage of the val:
(** Bla bla bla

    Example:
    {[
        let x : nat = 1 in
        foo y
    ]}
*)
  • Document sectioning: use fsdoc header comments to clearly mark sections and subsections inside your module. You can use big headers ((*** ... *)) or smaller headers ((***** ... *)).

fslit

fslit is a reStructuredText-based documentation tool whose goal is to create tutorials and walkthroughs in a literate programming style. You should use it in complement with fsdoc to provide a more verbose and linear way for a user to understand your module by reading it from top to bottom.

fslit comments are introduced by the /// syntax at the beginning of the line and use reStructuredText syntax for formatting. Here is what you should use them for.

  • Module statement of purpose: there should be an fslit comment at the beginning or the module describing what it aims to achieve and how, with mentions of relevant papers or external sources of documentation for concepts.

  • Code walkthrough: between two vals, you can insert an fslit comment to tell a story about how the module unrolls, patterns of usage between multiple vals, etc.

Clone this wiki locally