wasm/PE question #5556

Closed
srenatus opened this issue Jan 10, 2023 · 23 comments

@srenatus
Contributor

I tried your workaround (#3407 (comment)), but unfortunately, partial evaluation is still not retained.

test.rego is my test rego policy which contains partial evaluation:

package test

default doc_allow := false

doc_allow := true {
    input.op == "query"
    data.c.tent == data.data[input.doc.id]
}

Here are the commands I ran, as you suggested:

  1. opa build -e "test/doc_allow" -O2 -o bundle.tar.gz test.rego
    ->it did generate bundle.tar.gz
  2. opa build -t wasm -e "test/doc_allow" bundle.tar.gz
    error: entrypoint "test/doc_allow" does not refer to a rule or policy decision
    -> partial evaluation is not retained.

I dug a little deeper and saw that "policy.wasm" contains error-like strings such as "result op query doc id c tent data [binary bytes] test.rego test/doc_allow var assignment conflict object insert conflict internal: illegal entrypoint id [binary bytes] {"g0": {"test": {"doc_allow": 74}}}"

Here is how I got policy.wasm and found the error:

opa build -t wasm -e "test/doc_allow"  test.rego
tar -xzf bundle.tar.gz  /policy.wasm

Originally posted by @pzou19741 in #3407 (comment)

@srenatus
Contributor Author

The optimization build assumes that all that is known can be evaluated already. The optimizer tries to figure out the unknowns on its own, and uses this method:

  1. input is unknown
  2. any data ref that is not part of the bundle manifest's root is unknown

So if you build your optimized bundle from just the one rego file, the bundle root is "", which means all of data.
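
The two rules above can be sketched as a small predicate. This is only an illustration of the rule as described in this comment, not OPA's actual implementation; the function name `is_unknown` and the dotted-string representation of refs are assumptions made for the example. Bundle roots are written the way a manifest lists them (slash-separated, without the `data` prefix).

```python
def is_unknown(ref: str, bundle_roots: list[str]) -> bool:
    """Return True if the optimizer would treat `ref` as unknown.

    Hypothetical helper that mirrors the two rules described above.
    """
    parts = ref.split(".")
    if parts[0] == "input":
        return True  # rule 1: input is always unknown
    if parts[0] != "data":
        raise ValueError(f"expected an input or data ref, got {ref!r}")
    path = parts[1:]
    for root in bundle_roots:
        # "" splits to an empty path, which is a prefix of everything (all of data)
        root_path = [p for p in root.split("/") if p]
        if path[: len(root_path)] == root_path:
            return False  # covered by a bundle root, so assumed known
    return True  # rule 2: outside every root, so unknown
```

With the default root `""`, `data.c.tent` counts as known, which is why the rule body is evaluated away. With a root of `test`, the same ref counts as unknown and survives optimization.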

Now, doing partial evaluation with the assumption that only input is unknown, your policy evaluates to

package test

default doc_allow = false

because data.data and data.c are assumed to be part of the bundle (known), and since they are undefined, the rule body can never be true.

However, the outcome you get for building the wasm bundle is still a surprise to me, since it works locally here. What's your OPA version?

$ opa build -e "test/doc_allow" -O2 -o bundle2.tar.gz  test.rego 
$ opa build -twasm -e test/doc_allow -b bundle2.tar.gz          
$ opa eval -b bundle.tar.gz data.test.doc_allow
{
  "result": [
    {
      "expressions": [
        {
          "value": false,
          "text": "data.test.doc_allow",
          "location": {
            "row": 1,
            "col": 1
          }
        }
      ]
    }
  ]
}

The policy.wasm file is the compiled policy, and it needs to know how to display errors. I'm afraid you're digging in the wrong hole there. If you want to examine the output of PE, check /optimized/test.rego of the first bundle.

What you need to do is to declare your bundle's roots in a bundle manifest, and repeat the steps above. If the manifest is correct and declares a bundle root different from data.data and data.c, you should find a meaningful rule body in /optimized/test.rego of the first bundle.
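
As a rough sketch of the manifest step: a bundle manifest is a JSON file named `.manifest` at the top of the bundle directory, and its `roots` field lists the paths the bundle owns. The helper below writes one. The function name and the directory layout (a `bundle/` directory holding test.rego) are assumptions for illustration only.

```python
import json
from pathlib import Path


def write_manifest(bundle_dir: str, roots: list[str]) -> Path:
    """Write a minimal .manifest declaring the bundle's roots."""
    manifest = Path(bundle_dir) / ".manifest"
    manifest.write_text(json.dumps({"roots": roots}, indent=2) + "\n")
    return manifest
```

For the policy in this thread, something like `write_manifest("bundle", ["test"])` would declare that the bundle owns only `data.test`, so `data.c` and `data.data` fall outside the roots. The first build step would then point at the directory (for example with `-b bundle/`) instead of the single Rego file.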

@pzou19741

pzou19741 commented Jan 11, 2023

Thank you, Stephan, for the detailed explanation.
My OPA version is 0.45.0 (platform: windows/amd64). I'm able to get the same result that you posted above,
but it also confuses me.

When I open the policy file (policy.wasm) extracted from bundle.tar.gz, I see:
"/optimized/test.rego test/doc_allow var assignment conflict object insert conflict internal: illegal entrypoint id [binary bytes] {"g0": {"test": {"doc_allow": 74}}}"
When I opened optimized/test.rego, I found that it differs from the non-optimized version:

package test
import data.test.doc_allow

Do you mean that if bundle.tar.gz (Wasm) retains the "unknowns" (data.c.tent), partial evaluation will work? I'd like to test that before moving on to the second step (the dotnet-opa-wasm change).

Is there an easy way to test it?

Let me simplify test.rego further:

package test

default doc_allow := false

doc_allow := true {
    input.op == "query"
    data.c.tent == input.doc.id
}

For this test policy, the unknown is "data.c.tent" (partial evaluation):

{
    "query": "data.test.doc_allow == true",
    "input": {
        "doc": {
            "id": "2"
        }
    },
    "unknowns": [
        "data.c.tent"
    ]
}

Again, many thanks!

@srenatus
Contributor Author

What you find in the policy.wasm file is irrelevant. The strings need to be there in case that error happens in the evaluation of the wasm bundle.

Have you tried setting bundle roots for the first step?

@pzou19741

Thank you, Stephan! I'm able to get bundle.tar.gz based on your steps.

I'm just wondering whether I can test that partial evaluation is working before going to the next step (the dotnet-opa-wasm change).

@srenatus
Contributor Author

OK, taking a step back here -- before you run in the wrong direction because I've misled you: What is it you're trying to achieve with partial evaluation?

@pzou19741

pzou19741 commented Jan 12, 2023

I tried to leverage partial evaluation results to generate cosmos DB query's where clause.

Here is my example without Wasm, with OPA running as a sidecar:

test.rego

package test

default doc_allow := false

doc_allow := true {
    input.operation == "read"
    data.c.tent == data.data[input.doc.id]
}

This is data mapping (data.reports)

{
    "data": {
        "2": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa", 
        "3": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxb"
    }
}

Here is the POST request to the OPA Compile API:

http://xxxx:8181/v1/compile

{
    "query": "data.test.doc_allow == true",
    "input": {
        "operation": "read",
        "doc": {
            "id": "2"
        }
    },
    "unknowns": [
        "data.c.tent"
    ]
}

(input.json)
The result I got contains the expected queries:

{"result":{"queries":[[{"terms":[{"type":"ref","value":[{"type":"var","value":"eq"}]},{"type":"ref","value":[{"type":"var","value":"data"},{"type":"string","value":"c"},{"type":"string","value":"tent"}]},{"type":"string","value":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"}],"index":0}]]}}

Parsing the result queries gives:

c.tent = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"

so the Cosmos DB query will be:

select * from docs c where  c.tent = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"
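
A minimal sketch of that parsing step, assuming the response has exactly the shape shown above: each query is a list of expressions, and each expression is an `eq` between a `data` ref and a string. The function names are hypothetical, and real code would need to handle other operators and term types, and should prefer parameterized queries over string building.

```python
import json


def term_to_sql(term: dict) -> str:
    """Render one term of the Compile API result as a SQL fragment."""
    if term["type"] == "ref":
        # Drop the leading `data` var: data.c.tent -> c.tent
        return ".".join(part["value"] for part in term["value"][1:])
    if term["type"] == "string":
        return json.dumps(term["value"])  # double-quoted, with JSON escaping
    raise ValueError(f"unsupported term type: {term['type']}")


def queries_to_where(response: dict) -> str:
    """Join each query's expressions with AND, and the queries with OR."""
    disjuncts = []
    for query in response["result"]["queries"]:
        conjuncts = []
        for expr in query:
            op, left, right = expr["terms"]
            if op["value"][0]["value"] != "eq":
                raise ValueError("only `eq` is handled in this sketch")
            conjuncts.append(f"{term_to_sql(left)} = {term_to_sql(right)}")
        disjuncts.append(" AND ".join(conjuncts))
    return " OR ".join(disjuncts)
```

Calling `queries_to_where` on the parsed response above yields the `c.tent = "..."` condition that is then appended to `select * from docs c where`.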

The reason we use Wasm is that we do not want to run OPA as a sidecar.
So we built the policy as Wasm and evaluate it like this:

static void EvaluatePartialEval()
{
	using var module = OpaPolicyModule.Load("test.wasm");
	using var opaPolicy = module.CreatePolicyInstance();
	const string data =@"{
		""data"": {
			""2"": ""xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"", 
			""3"": ""xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxb""
		}
	}";
	opaPolicy.SetDataJson(data);
	string input = @"{
				""query"": ""data.test.doc_allow == true"",
				""input"": {
				   ""operation"":""read"",
		                     ""doc"": {
					            ""id"": ""2""
				                  }
				},
				""unknowns"": [
				""data.c.tent""
				]
		}";
	string output = opaPolicy.EvaluateJson(input); // use Evaluate<T>(...) for a higher-level API

	Console.WriteLine($"Cosmos evaluate output is : {output}");
}

Please reference:
https://github.com/christophwille/dotnet-opa-wasm/blob/master/src/Opa.Wasm.ConsoleSample/Program.cs#L25

@srenatus
Contributor Author

I tried to leverage partial evaluation results to generate cosmos DB query's where clause.

OK, so we've taken the wrong turn here:

If you want to use partial eval to generate SQL filters or something similar, there's currently no solution for that with Wasm.

Wasm bundles by their very nature are compiled policies. There's no way to partially evaluate them, because all the original Rego is gone. Hence there is nothing to be done in any of the Wasm SDKs to fix this.

The only thing you could try here, I suppose, is to compile the topdown interpreter, i.e. the thing that does the partial evaluation in OPA, into Wasm, using golang's Wasm support. However, that's not something anybody has tried before, so I don't know how bumpy that road is going to be.

@pzou19741

pzou19741 commented Jan 12, 2023

We adopted partial evaluation (for performance reasons) while running OPA as a sidecar. But some of our use cases do not support sidecars, so we have to run everything in one process. That's why we decided to build the policy as a Wasm bundle. We expected the same policy evaluation to give the same results.

@pzou19741

pzou19741 commented Jan 12, 2023

@srenatus Can you suggest any other approach that makes partial evaluation work while OPA and the app (dotnet) run in one process?

@srenatus
Contributor Author

You cannot put the interpreter that's doing the partial evaluation, a golang library, into any other program, I'm afraid. You could go the experimental topdown-via-wasm route outlined above.

That aside, there might be more to explore between embedding (which might be impossible) and having a side-car. Like, can you shell out an opa eval --partial command?

@srenatus srenatus changed the title from "build wasm from optimized bundle" to "wasm/PE question" Jan 12, 2023

@pzou19741

pzou19741 commented Jan 12, 2023

Do you mean dotnet app to invoke opa eval --partial command for partial eval policy only?

Golang should work, but it would take some effort.
Here is a code example I tried out earlier that worked, but unfortunately we decided to use dotnet:

	ctx := context.Background()

	module := `
	package example

	allow {
		input.subject.clearance_level >= data.reports[_].clearance_level
	}

	allow {
		data.break_glass = true
	}
	`

	var pre map[string]interface{}
	decoder := json.NewDecoder(bytes.NewBufferString(`{
		"subject": {
			"clearance_level": 4
		}
	}`))

	if err := decoder.Decode(&pre); err != nil {
		// Handle error.
	}

	r := rego.New(
		rego.Query("data.example.allow"),
		rego.Module("example.rego", module),
		rego.Input(pre),
		rego.Unknowns([]string{"data.reports"}),
	)

	pr, err := r.Partial(ctx)
	if err != nil {
		panic(err.Error())
	}
....

@pzou19741

pzou19741 commented Jan 12, 2023

I just want to make sure we're on the same page: #3407 (Support Partial Evaluation in WASM) is impossible, right? If so, I suggest documenting it.

@srenatus
Contributor Author

Do you mean dotnet app to invoke opa eval --partial command for partial eval policy only?

Yes. Your golang example would roughly translate to this (with input.json containing your pre, and example.rego containing the policy):

opa eval -d example.rego -u data.reports -i input.json --format source --partial data.example.allow
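
For an application that shells out instead of embedding OPA, the invocation could be wrapped roughly as follows. This is a sketch, not a recommended integration: the helper names are hypothetical, it assumes an `opa` binary on the PATH, and it passes one `-u` per unknown. The flags used are the ones appearing in this thread (`-d`, `-u`, `-i`) plus the long forms `--partial` and `--format`.

```python
import subprocess


def partial_eval_argv(query, policy_files, unknowns, input_file=None, fmt="json"):
    """Build the argv for an `opa eval --partial` call."""
    argv = ["opa", "eval", "--partial", "--format", fmt]
    for path in policy_files:
        argv += ["-d", path]  # policy and data files
    for ref in unknowns:
        argv += ["-u", ref]  # one flag per unknown ref
    if input_file is not None:
        argv += ["-i", input_file]
    argv.append(query)
    return argv


def run_partial_eval(*args, **kwargs) -> str:
    """Run the command and return stdout; raises if opa exits non-zero."""
    result = subprocess.run(
        partial_eval_argv(*args, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Building the argument list separately from running it keeps the command easy to log and to compare against what was typed by hand on the command line.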

@pzou19741

pzou19741 commented Jan 12, 2023

Thank you, Stephan.

I need help with the opa eval CLI.

Based on rego, data and input defined here : #5556 (comment)

opa eval -d test.rego -u data.reports -i input.json --format json --partial data.test.doc_allow

I always got

{
  "partial": {}
}

I also tried --format source, but it returned nothing:

opa eval -d test.rego -u data.reports -i input.json --format source --partial data.test.doc_allow

@srenatus
Contributor Author

That rego file needs different unknowns. I've come up with those parameters based on your Golang calls from #5556 (comment)

Try -u data.c,data.data.

@pzou19741

Thank you.
I tried both

opa eval -d test.rego -u data.reports -i input.json --format source --partial data.test.doc_allow -u data.c.tent

and

opa eval -d test.rego -u data.reports -i input.json --format source --partial data.test.doc_allow -u data.c

I did not get the expected partial eval result, but a result like the one below:

# Query 1
data.partial.test.doc_allow

# Module 1
package partial.test

default doc_allow = false

@srenatus
Contributor Author

data.data needs to be unknown, too.

@pzou19741

I got the same result after adding "-u data.data" (-u data.c,data.data gives an error):

opa eval -d test.rego -u data.reports -i input.json --format source --partial data.test.doc_allow -u data.c -u data.data
# Query 1
data.partial.test.doc_allow

# Module 1
package partial.test

default doc_allow = false

@srenatus
Contributor Author

OK I took another stab at it. I hadn't noticed that your reports are known. Try this:

test.rego

package test

default doc_allow := false

doc_allow := true {
    input.operation == "read"
    data.c.tent == data.reports[input.doc.id]
}

reports.json

{
    "reports": {
        "2": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa", 
        "3": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxb"
    }
}

input.json

{
  "operation": "read",
  "doc": {
    "id": "2"
  }
}

$ opa eval -fpretty -p -u data.c -i input.json -d test.rego -d reports.json data.test.doc_allow
+-----------+---------------------------------------------------------+
| Query 1   | data.partial.test.doc_allow                             |
+-----------+---------------------------------------------------------+
| Support 1 | package partial.test                                    |
|           |                                                         |
|           | default doc_allow = false                               |
|           |                                                         |
|           | doc_allow {                                             |
|           |   data.c.tent = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa" |
|           | }                                                       |
+-----------+---------------------------------------------------------+

@pzou19741

Thank you so much!
So I need to write a parser to retrieve data.c.tent = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa" from doc_allow {} if doc_allow {} exists; otherwise, the result will be false.
I can definitely test more.

@srenatus
Contributor Author

Please be aware that there are other output formats:

Source:

$ opa eval -fsource -p -u data.c -i input.json -d test.rego -d reports.json data.test.doc_allow
# Query 1
data.partial.test.doc_allow

# Module 1
package partial.test

default doc_allow = false

doc_allow {
	data.c.tent = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"
}

JSON (default):

$ opa eval -p -u data.c -i input.json -d test.rego -d reports.json data.test.doc_allow | jq -c
{"partial":{"queries":[[{"terms":{"type":"ref","value":[{"type":"var","value":"data"},{"type":"string","value":"partial"},{"type":"string","value":"test"},{"type":"string","value":"doc_allow"}]},"index":0}]],"modules":[{"package":{"path":[{"type":"var","value":"data"},{"type":"string","value":"partial"},{"type":"string","value":"test"}]},"rules":[{"default":true,"head":{"name":"doc_allow","value":{"type":"boolean","value":false},"ref":[{"type":"var","value":"doc_allow"}]},"body":[{"terms":{"type":"boolean","value":true},"index":0}]},{"head":{"name":"doc_allow","value":{"type":"boolean","value":true},"ref":[{"type":"var","value":"doc_allow"}]},"body":[{"terms":[{"type":"ref","value":[{"type":"var","value":"eq"}]},{"type":"ref","value":[{"type":"var","value":"data"},{"type":"string","value":"c"},{"type":"string","value":"tent"}]},{"type":"string","value":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxa"}],"index":0}]}]}]}}
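
The JSON form is likely the easiest one to consume from another program. As a hedged sketch, the helper below walks the support modules in that output, skips default rules, and returns one list of conditions per remaining rule. It assumes every condition is a three-term expression whose left side is a ref and whose right side is a scalar, as in the output above; the name `extract_conditions` is made up for this example.

```python
def extract_conditions(output: dict) -> list[list[tuple[str, str, object]]]:
    """Return, per non-default rule, its (operator, ref, value) conditions."""
    rules = []
    for module in output.get("partial", {}).get("modules", []):
        for rule in module.get("rules", []):
            if rule.get("default"):
                continue  # the default rule carries no conditions
            conditions = []
            for expr in rule["body"]:
                terms = expr["terms"]
                # A single-term expression is a dict rather than a list; skip it
                if not isinstance(terms, list) or len(terms) != 3:
                    continue
                op, left, right = terms
                ref = ".".join(str(part["value"]) for part in left["value"])
                conditions.append((op["value"][0]["value"], ref, right["value"]))
            rules.append(conditions)
    return rules
```

An empty result, as with `{"partial": {}}` earlier in the thread, means no non-default rule survived partial evaluation, so only the default value applies.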

@pzou19741

Thank you so much for your great help, Stephan! It is not an easy job to make partial evaluation CLI work with all these combinations of parameters.
I'm sure our thread can be a good example to benefit others as well.

@anderseknert
Member

Happy to see things resolved eventually. If there's anything left actionable here, let me know, and I'll reopen. Closing for now :)
