Updated format
joseandro committed Jul 17, 2023
1 parent 100a43b commit c283bcb
Showing 1 changed file with 17 additions and 17 deletions.
34 changes: 17 additions & 17 deletions content/posts/proofs/index.md
@@ -29,7 +29,7 @@ cover:
---
My goal with this post is to explain the theory behind zero knowledge proofs from first principles, introducing more context incrementally so that it all makes sense. I will try to describe things in a way that most people with a basic comfort level with mathematics can follow. I decided to write about it because of a personal struggle: most of the material I found online was either too shallow or too heavy for me to understand zero knowledge proofs. Despite all my efforts to break everything down into chunks that are easier to digest, I must admit that it is a difficult topic built on a lot of theory, but I hope you stick with it. See you on the other side!

-# Proof models
+## Proof models
Instead of jumping into the definition of zero knowledge proofs directly, we will start from the very beginning: let's think about how **proofs** are generally defined. There are a number of different ways to construct proofs (i.e., proof models), but one that easily comes to mind is the model used for mathematical proofs of theorems. [Mathematical proofs](https://en.wikipedia.org/wiki/Mathematical_proof) can take many forms and use multiple strategies. For example, one can prove a theorem using [induction](https://en.wikipedia.org/wiki/Mathematical_induction), [contradiction](https://en.wikipedia.org/wiki/Proof_by_contradiction), [contraposition](https://en.wikipedia.org/wiki/Contraposition), etc. In essence, once a mathematical proof is written down, anyone can follow it step-by-step to verify its validity.

> “What is intuitively required from a theorem-proving procedure? First, that it is possible to “prove” a true theorem. Second, that it is impossible to “prove” a false theorem. Third, that communicating the proof should be efficient, in the following sense. It does not matter how long must the prover compute during the proving process, but it is essential that the computation required from the verifier is easy.” [Goldwasser, Micali, Rackoff 1985](https://www.cs.princeton.edu/courses/archive/spr06/cos522/ip.pdf).
@@ -40,31 +40,31 @@ In the real world though, people prove to others that they know things by intera

This natural, albeit naïve, definition seems easier to implement as an algorithm than the broader definition of mathematical proofs. This model also gives us an idea of how efficient computation through interaction can take shape: the Prover can bear the load of heavy computation while the Verifier is tasked only with checking whether the proof is valid. Nevertheless, this real world model is still too loosely defined. It would be interesting to define it more formally using computer science theory. Then, perhaps, we could show how powerful and efficient it is. For example, with a formal definition, we would know whether it is possible to quickly prove to anyone that we know the answer to a game of chess! It would also be interesting to come up with a proof model that works well in untrusted environments, where any actor can cheat to gain personal advantage. Fortunately, this is an active area of research in computer science and the foundation on which zero-knowledge proofs are based. To understand it, let's start by digging into complexity classes and the Big O notation.

-# Complexity theory
+## Complexity theory
[According to Wikipedia](https://en.wikipedia.org/wiki/Complexity_class), complexity classes group computation problems (e.g., **decision, search, counting, optimization, or function problems**) solvable by a specific model of computation (e.g., **Turing machines, interactive proof systems, boolean circuits, and quantum computers**) according to their resource requirements (e.g., **time or memory/space**). I know, it is complex (pun not intended), so let's approach it step-by-step:

-## Computation problems
+### Computation problems
Decision problems are simply problems that expect a **yes** or **no** answer. The subset of inputs for which a decision problem returns a **yes** answer forms a formal **language**. We only make proofs for inputs that lead to a **yes** answer (i.e., inputs that are in the **language**). This fact is extremely important, since valid proofs only exist for true claims. Furthermore, we always represent the inputs of computation problems as binary strings.
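
To make the relation between a decision problem and its language concrete, here is a minimal Python sketch. The problem itself ("does the binary string contain an even number of 1s?") and the names `decide_even_ones` and `language_up_to` are my own illustrative choices, not something defined elsewhere in this post.

```python
from itertools import product


def decide_even_ones(x):
    """Hypothetical decision problem: yes iff the binary string x has an even number of 1s."""
    return x.count("1") % 2 == 0


def language_up_to(max_len):
    """Collect every binary string of length <= max_len on which the decider says yes."""
    accepted = set()
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            candidate = "".join(bits)
            if decide_even_ones(candidate):
                accepted.add(candidate)
    return accepted


# The language is infinite; we can only ever list a finite slice of it.
print(sorted(language_up_to(2), key=lambda s: (len(s), s)))
```

The strings printed are exactly the inputs for which a proof of membership could exist; for any other input the claim "this string is in the language" is simply false.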

We will not drill down into the other problems here, but feel free to continue exploring [them](https://en.wikipedia.org/wiki/Computational_problem) if you are interested.

-## Models of computation
+### Models of computation
Generally speaking, a model of computation describes the output of a computation from its inputs. Here I will describe a few models that will be important in our toolkit:

-### Turing machines
+#### Turing machines
The [Turing machine (\\(TM\\))](https://simple.wikipedia.org/wiki/Turing_machine) is one of the most popular models of computation, created by Alan Turing himself! The theory behind a \\(TM\\) can get quite abstract and complex. Honestly, if you've never heard of a \\(TM\\) before, don't worry. You can think of it as a very simple model of computation that can implement any algorithm, like a simple computer. A deterministic \\(TM\\) always outputs the same answer (e.g., \\(1\\), \\(0\\) or `halt`) for the same input. A probabilistic \\(TM\\) may output different answers for the same input. A deterministic \\(TM\\) can be turned into a probabilistic one by using a [random number generator (or RNG)](https://en.wikipedia.org/wiki/Random_number_generation) to make decisions. Note that, in complexity theory, the term *non-deterministic* \\(TM\\) refers to a different model, so whenever this post talks about machines that flip coins it means the probabilistic kind.
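
If you like to see things run, below is a minimal sketch of a deterministic \\(TM\\) simulator in Python. The machine, its state names (`even`, `odd`, `accept`, `reject`), and the `run_tm` helper are all hypothetical choices made for illustration; the machine decides whether a binary string contains an even number of 1s.

```python
BLANK = "_"

# Transition table: (state, symbol read) -> (next state, symbol to write, head move).
# Hypothetical machine: accept binary strings with an even number of 1s.
DELTA = {
    ("even", "0"): ("even", "0", +1),
    ("even", "1"): ("odd", "1", +1),
    ("odd", "0"): ("odd", "0", +1),
    ("odd", "1"): ("even", "1", +1),
    ("even", BLANK): ("accept", BLANK, 0),
    ("odd", BLANK): ("reject", BLANK, 0),
}


def run_tm(tape_input, max_steps=10_000):
    """Run the machine on tape_input and return True on accept, False on reject."""
    tape = dict(enumerate(tape_input))  # sparse tape: missing cells are blank
    state, head = "even", 0
    for _ in range(max_steps):
        if state in ("accept", "reject"):
            return state == "accept"
        symbol = tape.get(head, BLANK)
        state, write, move = DELTA[(state, symbol)]
        tape[head] = write
        head += move
    raise RuntimeError("step budget exhausted")


print(run_tm("101"), run_tm("1"))
```

Because every step is a lookup in a fixed table, running the machine twice on the same input always gives the same answer. A probabilistic variant would let some entries of the table depend on a coin flip.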

-### Boolean circuits
+#### Boolean circuits
Boolean circuits (or \\(BC\\)) are a model for computing Boolean functions (i.e., functions that output either \\(0\\) or \\(1\\), or `true` or `false`). Modern silicon chips used in computers started out implementing \\(BC\\), and they still do. A \\(BC\\) is defined as a composition of logic gates, such as `AND`, `OR`, and `NOT`. The \\(BC\\) model of computation is mathematically simpler than the \\(TM\\) model. We will not use \\(BC\\) here, but it is worth mentioning because it will be used in future posts.
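
As a small illustration of composing gates, the sketch below builds an `XOR` out of `AND`, `OR`, and `NOT`. The function names are mine, and bits are represented as the Python integers `0` and `1`.

```python
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def NOT(a):
    return 1 - a


def xor_circuit(a, b):
    """XOR as a circuit: (a OR b) AND NOT (a AND b)."""
    return AND(OR(a, b), NOT(AND(a, b)))


# Print the full truth table of the circuit.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_circuit(a, b))
```

Unlike a \\(TM\\), a circuit has a fixed number of inputs, which is part of what makes the model simpler to reason about.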

-### Interactive proof systems
+#### Interactive proof systems
Interactive proof systems are at the core of zero knowledge proofs. Their importance is so significant that I have dedicated an entire section a few paragraphs down to thoroughly explain what they are. Roughly speaking, they shape the verifiable proof model into a model of computation. For now, just keep in mind that they are another kind of computation model.

-## Big O notation
+### Big O notation
Before we talk about complexity classes in more detail, I need to describe the [**Big O notation**](https://en.wikipedia.org/wiki/Big_O_notation). It is a mathematical notation used in computer science to describe the **run time** behavior of a program as the length of its input grows. You can think of it as the number of instructions an algorithm executes on a problem, expressed as a function of the size of its input. The Big O notation is an upper bound estimate: it classifies algorithms according to their growth rates, which may be constant \\( O(1) \\), logarithmic \\( O(\log n) \\), quasilinear \\( O(n \log n) \\), polynomial \\( O(n^c) \\), factorial \\( O(n!) \\), etc., where \\( n \\) is the length of the binary input to the algorithm. In computer science, [**we say an algorithm is efficient if it has polynomial run time**](https://medium.com/probably-approximately-correct/polynomial-time-and-efficient-algorithms-16481666827b). This is a complex topic that I described too briefly, so if you are not familiar with it, I recommend watching [this well paced video](https://www.youtube.com/watch?v=Q_1M2JaijjQ), which will take you step-by-step through the whole story behind the Big O notation.
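
One way to get a feel for growth rates is to count steps instead of measuring seconds. The sketch below is only an illustration (the two helper functions are mine): it counts loop iterations of a linear search, which is \\( O(n) \\), and of a binary search over a sorted list, which is \\( O(\log n) \\).

```python
def linear_search_steps(items, target):
    """Count how many elements a left-to-right scan inspects before finding target."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count how many midpoints a binary search inspects in a sorted list."""
    steps, lo, hi = 0, 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1_000, 100_000):
    items = list(range(n))
    worst = n - 1  # the last element is the worst case for the linear scan
    print(n, linear_search_steps(items, worst), binary_search_steps(items, worst))
```

Multiplying the input size by 100 multiplies the work of the linear scan by 100, while the binary search only needs a handful of extra steps.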

-## Complexity classes
-### \\(P\\) and \\(NP\\)
+### Complexity classes
+#### \\(P\\) and \\(NP\\)
\\( P \\) and \\( NP \\) are two fundamental classes. Informally, they separate problems into *easy to solve* and *easy to verify*, respectively. More formally:
- \\( P \\) is the class of **decision problems** that can be solved in **polynomial time** (i.e., can be solved efficiently) by a deterministic \\( TM \\) relative to the size of its inputs.
- \\( NP \\) is the class of **decision problems** for which every input with a **yes** answer has a **certificate** that can be verified in **polynomial time** (i.e., can be verified efficiently) by a deterministic \\( TM \\) relative to the size of its inputs. The verification is done using the inputs to the problem (i.e., the binary string) and the **certificate** (or **witness**). As we will soon see, you can also think of the certificate as the proof! The witness certifies (i.e., proves) that the inputs to the problem are in the \\(NP\\)-language.
@@ -75,7 +75,7 @@ If this is your first time reading about complexity classes, I recommend you to
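
To see what "easy to verify" means in practice, here is a minimal sketch of a certificate check for Boolean satisfiability (SAT), a classic \\(NP\\) problem. The encoding is an assumption of mine: a formula is a list of clauses, and each clause is a list of integers where `i` stands for variable \\(x_i\\) and `-i` for its negation. The name `verify_sat` is illustrative.

```python
def verify_sat(cnf, assignment):
    """Return True iff the assignment (the certificate) satisfies every clause."""
    return all(
        any(assignment[abs(literal)] == (literal > 0) for literal in clause)
        for clause in cnf
    )


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]

good_certificate = {1: True, 2: True, 3: False}
bad_certificate = {1: True, 2: False, 3: True}

print(verify_sat(cnf, good_certificate), verify_sat(cnf, bad_certificate))
```

The check touches each literal once, so it runs in time linear in the size of the formula. Finding a satisfying assignment in the first place is the part nobody knows how to do efficiently in general.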

If you want to know more about \\( P \ vs \ NP \\) without going into crazy loads of math to understand it, I recommend these two great videos: [*\\(P\\) vs. \\(NP\\) and the Computational Complexity Zoo*](https://www.youtube.com/watch?v=YX40hbAHx3s) and [*\\(P\\) vs. \\(NP\\) - The Biggest Unsolved Problem in Computer Science*](https://www.youtube.com/watch?v=EHp4FPyajKQ).

-### Bounded-error probabilistic polynomial-time (BPP)
+#### Bounded-error probabilistic polynomial-time (BPP)
While both \\(P\\) and \\(NP\\) are defined using a deterministic \\(TM\\), the \\( BPP \\) complexity class contains the languages that can be decided by a **probabilistic \\(TM\\)** running in polynomial time with a bounded error probability. In other words, the result of an execution depends on the outcomes of fair coin tosses, and the machine may answer incorrectly with a small (but quantifiable) probability. By convention the error is bounded by \\( 1/3 \\), and it can be made much smaller by repeating the computation and taking the majority answer. Efficient probabilistic \\(TM\\) are also called \\( PPT \\), probabilistic polynomial-time machines.

In statistics, we use the "fair coin" metaphor to refer to any event that has only **two possible** outcomes with equal chances of happening (i.e., 50:50). If you count the results of **many** fair coin tosses, the proportion of heads to tails will converge to 50:50. Furthermore, we also use the fair coin flip as a good source of randomness, as it is a physical event with many variables that are difficult to control and repeat: air speed and drag, force and angle of the flip, time to catch the coin in the air, etc. In practice, the fair coin can be implemented as an algorithm using an **RNG**.
@@ -108,7 +108,7 @@ The complexity class \\( P \\), problems solvable in polynomial time using a det
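
The sketch below shows how coin flips and bounded error fit together. Everything in it is a toy assumption of mine: `fair_coin` draws one bit from Python's `random` module, `noisy_decider` stands in for a probabilistic machine that answers wrongly only when two coins both land on heads (probability \\( 1/4 \\)), and `amplified_decider` repeats it and takes a majority vote.

```python
import random


def fair_coin(rng):
    """One fair coin toss implemented with an RNG: returns 0 or 1."""
    return rng.getrandbits(1)


def noisy_decider(truth, rng):
    """Toy probabilistic decider: wrong only if two coins both come up 1 (probability 1/4)."""
    wrong = fair_coin(rng) == 1 and fair_coin(rng) == 1
    return (not truth) if wrong else truth


def amplified_decider(truth, rng, repetitions=51):
    """Run the noisy decider many times and return the majority answer."""
    yes_votes = sum(noisy_decider(truth, rng) for _ in range(repetitions))
    return yes_votes > repetitions // 2


rng = random.Random(2023)  # seeded only to make the demo repeatable
trials = 2_000
single_errors = sum(noisy_decider(True, rng) is not True for _ in range(trials))
amplified_errors = sum(amplified_decider(True, rng) is not True for _ in range(trials))
print(single_errors / trials, amplified_errors / trials)
```

A single run errs about a quarter of the time, while the majority over 51 runs is almost never wrong. This is why the exact error bound in the definition matters so little, as long as it stays strictly below one half.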

It is worth trying to connect the dots on your own before we continue. With all the new tools in our toolbox, how would you come up with a model of computation, based on the verifiable proof model, that allows us to define proofs in an efficient and practical way? Using \\(NP\\), maybe? I have introduced a lot of key ingredients that we will now use to define interactive proof systems, so let's dig into it.

-# Interactive proof systems
+## Interactive proof systems
The verifiable proof model I introduced at the beginning of this post was, intentionally, loosely defined. In the realm of computer science we need to define this interaction more precisely. It is useful to model new discoveries in a way that lets us tap into a larger pool of proven fundamental theory. Fortunately, we have all the theory we need under our belts to expand our proof model with new ingredients!

Until the mid-1980s there was no formal definition of computation through interaction. It was only in 1985 that two independent lines of research formally defined the concept using complexity theory. Babai defined the [**Arthur–Merlin (AM)** complexity class](https://dl.acm.org/doi/10.1145/22145.22192) and Goldwasser, Micali, and Rackoff defined the [**interactive proofs (IP)** complexity class](https://dl.acm.org/doi/10.1145/22145.22178). Many other flavours of proof systems unfolded from this seminal work.
@@ -131,7 +131,7 @@ Notice that these ingredients all refer to the behavior of actors within this co

Now that we know what interactive proof systems are all about, we will incrementally play with the ingredients above to analyze what class of problems they enable us to handle. We will move step by step until we can finally use it in zero-knowledge proofs!

-## \\(NP\\) (revisited)
+### \\(NP\\) (revisited)
With a fresh model of computation up our sleeves, we can now revisit the [\\(NP\\) complexity class](#complexity-classes). Before you continue reading though, I recommend you pause and think about it for a minute: how would you re-define the \\(NP\\) complexity class using interactive proof systems?

Well, it turns out that the \\(NP\\) complexity class may be viewed as the simplest interactive proof system we can think of! In this system, the actors are deterministic (i.e., they cannot act randomly) and the interaction is limited to a single message (i.e., the witness itself). The Prover computes a certificate of polynomial size in unbounded time (i.e., we don't care how long it takes). The Verifier is a deterministic \\(TM\\) that checks in polynomial time that the proof is valid.
@@ -144,7 +144,7 @@ This complexity class fills out the necessary characteristics of interactive pro
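
Here is a minimal sketch of that one-message system, using the subset sum problem (is there a subset of the given numbers that adds up to a target?) as the \\(NP\\) language. The problem choice and the names `prover_find_certificate` and `verifier_check` are assumptions made for illustration.

```python
from itertools import combinations


def prover_find_certificate(numbers, target):
    """Unbounded Prover: brute-force search over all subsets (exponential time)."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return list(indices)
    return None  # no certificate exists: the input is not in the language


def verifier_check(numbers, target, certificate):
    """Deterministic Verifier: a single polynomial-time pass over the certificate."""
    if certificate is None:
        return False
    distinct = len(set(certificate)) == len(certificate)
    in_range = all(0 <= i < len(numbers) for i in certificate)
    return distinct and in_range and sum(numbers[i] for i in certificate) == target


numbers = [3, 34, 4, 12, 5, 2]
certificate = prover_find_certificate(numbers, 9)  # the single message sent to the Verifier
print(certificate, verifier_check(numbers, 9, certificate))
```

The Verifier never trusts the Prover: a made-up list of indices is rejected, and for a target that no subset reaches there is simply no message that would make `verifier_check` return `True`.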

Next, we will see where we can get to by tweaking one of the ingredients of this model: more rounds of interactions!

-## Deterministic interactive proofs (dIP)
+### Deterministic interactive proofs (dIP)
The **deterministic interactive proofs (\\(dIP\\))** complexity class contains all languages with an interactive proof system with **\\(k(n)\\)** rounds of deterministic interactions, where **\\(k\\)** is a polynomial function in the input size determining the number of interactions (so that it interacts efficiently too). Here, the Verifier continues to be a **deterministic** \\(TM\\) as in the \\(NP\\) example just covered above.

![Deterministic interactive proof system](img/dIP.png#center)
@@ -155,7 +155,7 @@ We say that \\(dIP=NP\\) because they are the same proof system when \\(k = 1\\)

In the \\( BPP \\) class definition, we saw the power of probabilistic \\(TM\\) and how they let us classify many real life applications that tolerate a small error probability. It turns out that, **for interactions to provide any benefit in this model of computation, the Verifier has to act probabilistically**! In the next sections, we will explore the nice properties of probabilistic interactive proof systems.

-## Probabilistic interactive proofs (IP)
+### Probabilistic interactive proofs (IP)
The \\( dIP \\) proof system introduced interactivity. It expanded \\( NP \\) with \\( k(n) \\) rounds of interaction, but it is not really better than good old \\( NP \\). To make \\( dIP \\) probabilistic, we will allow the Verifier to ask random questions. The Verifier flips coins to decide what questions to ask, and the probability is defined over these flips. In \\(IP\\), all flips are private (also known as the private coin model): only the Verifier knows them. This makes it much harder for an untrusted Prover to cheat! Which model of computation introduced earlier could we bring in here to make this possible? Did those coin flips ring a bell? You guessed it, a probabilistic \\( TM \\)! This way, the Verifier's questions can be modeled probabilistically. This small change will take us much further than the added interactivity of the \\( dIP \\) proof system did, moving us in the same direction that the \\( BPP \\) complexity class did. Some of the literature refers to the \\( IP \\) class as \\( IP[k] \\) to show more explicitly that there are \\(k\\) rounds of interaction in it, but the two terms mean the same thing.

Before I introduce \\( IP \\) more formally, let's start with an intuitive example that demonstrates how powerful a system with a probabilistic Verifier can be in a more "real" situation:
@@ -182,7 +182,7 @@ All languages in \\( NP \\) have \\( IP \\) proofs, so we can say that they are

To understand why \\( \overline{ISO} \\) is an \\( IP \\) language, you need to understand graphs first. If you are unfamiliar with them or if you need a refresher on graph theory, I recommend checking out [this short video](https://www.youtube.com/watch?v=LFKZLXVO-Dg). Next, I also recommend [this chunk of the video](https://youtu.be/TSI3LR5WZmo?t=169) on why \\( \overline{ISO} \\) is not known to be in \\( NP \\) (no short certificate of non-isomorphism is known) and [this part of the video](https://youtu.be/TSI3LR5WZmo?t=1598) for the proof that \\( \overline{ISO} \in IP \\). These examples are important, so be sure to understand them, as I will use them later to contrast \\( IP \\) and \\( AM \\) and to illustrate zero-knowledge interactive proofs.
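
If you prefer code to videos, the standard private-coin protocol for graph non-isomorphism can be simulated in a few lines. This is a sketch under my own assumptions: graphs are tiny simple graphs stored as sets of edges over vertices `0..n-1`, the unbounded Prover is played by a brute-force isomorphism test, and all function names are illustrative.

```python
import random
from itertools import permutations


def normalize(edge_list):
    """Store an undirected simple graph as a frozenset of 2-element frozensets."""
    return frozenset(frozenset(edge) for edge in edge_list)


def relabel(graph, perm):
    """Rename every vertex v to perm[v]."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in graph)


def isomorphic(g, h, n):
    """Brute force over all n! relabelings: only an unbounded Prover can afford this."""
    return any(relabel(g, perm) == h for perm in permutations(range(n)))


def verifier_challenge(g0, g1, n, rng):
    """Verifier privately picks a graph and a random relabeling of it."""
    b = rng.randrange(2)  # the private coin
    perm = list(range(n))
    rng.shuffle(perm)
    return b, relabel((g0, g1)[b], perm)


def prover_answer(g0, h, n):
    """Prover says which of the two graphs h was derived from."""
    return 0 if isomorphic(g0, h, n) else 1


def run_protocol(g0, g1, n, rounds, rng):
    """Return the number of rounds in which the Verifier accepts."""
    accepted = 0
    for _ in range(rounds):
        b, h = verifier_challenge(g0, g1, n, rng)
        accepted += prover_answer(g0, h, n) == b
    return accepted


path = normalize([(0, 1), (1, 2), (2, 3)])
star = normalize([(0, 1), (0, 2), (0, 3)])            # not isomorphic to the path
relabeled_path = normalize([(1, 0), (0, 3), (3, 2)])  # isomorphic to the path

rng = random.Random(7)  # seeded only to make the demo repeatable
print(run_protocol(path, star, 4, 40, rng), run_protocol(path, relabeled_path, 4, 40, rng))
```

When the two graphs really are non-isomorphic, the Prover wins every round. When they are isomorphic, the permuted graph carries no information about the coin, so any Prover is right only about half of the time, and the chance of surviving \\(k\\) rounds drops to \\( 2^{-k} \\).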

-## Arthur Merlin (AM)
+### Arthur Merlin (AM)
As I briefly mentioned before, the Verifier in the \\( IP[k] \\) proof system uses private coins that aren't shared with the Prover. \\( AM \\), on the other hand, uses public coins, where the Prover can see all the coin flips. The \\( AM \\) class is also called \\( AM[2] \\). The most evident difference between \\( AM[2] \\) and \\( IP[k] \\) is that \\( AM \\) is only defined for 2 interactions and produces proofs with public coins.

In \\( AM \\), the Verifier is called Arthur and the Prover is named Merlin. Their interaction is defined as the following:
@@ -194,7 +194,7 @@ Interestingly, the literature adds more letters to the original \\( AM \\) class

It's important to highlight that in \\( AM \\), the coin flips are revealed to Merlin step by step, with each new message from Arthur. It's proven that for every \\(k, \ AM[k] \subseteq IP[k] \\). Can you see why? A public-coin Verifier is just a private-coin Verifier that chooses to reveal its flips, so every \\( AM[k] \\) protocol is also an \\( IP[k] \\) protocol. The other direction is less obvious. If you watched the video that shows the proof for \\( \overline{ISO} \in IP \\), you probably remember that for it to work, the Prover cannot see the coin flip. If the Prover knew the result of the flip, she could trivially tell which graph was permuted. Keeping the coins private therefore seems to add power to interactive proof systems with the same number of interactions. However, you can still make public coin proof systems as powerful as private ones by adding more rounds of interaction, but I will not dive into the details here. Next, we will finally get to the cherry on top of the private interactive proofs cake: zero knowledge proofs!

-## Zero knowledge interactive proofs (ZKP)
+### Zero knowledge interactive proofs (ZKP)
One of the key motivations for the creation of interactive proofs was to obtain zero knowledge proofs. To give them some color, let me illustrate some use cases. With \\( ZKP \\) one could prove that:
* They have a valid password for a safe without revealing the password to someone else.
* They are above the legal age limit to drink without revealing their age or any other information that could leak their identity details.