Mesh rc 0.4.2 #1573

Closed
wants to merge 78 commits into from

Conversation

nodiesBlade
Contributor

@nodiesBlade nodiesBlade commented Jul 15, 2023

Description

Summary generated by Reviewpad on 15 Jul 23 02:25 UTC

This pull request includes the following changes:

  • File "schema.go": Adds JSON schema definitions for different objects related to the mesh package, including fallback node file, node file, plain chains map, and rich chains map.

  • File "servicer.go": Adds a new file with package name "mesh". Adds imports, structs, and functions related to managing and reloading servicers in the mesh package.

  • File "server.go": Adds package imports, structs, functions, and modifications to existing code. These changes seem to be related to handling HTTP requests and proxying.

  • File "logger.go": Implements logging functionality for the mesh package, including functions for logging at different levels and a method to initialize the logger.

  • File "api.go": Adds a new function called "Health" to query the health status of the API version.

  • File "querier.go": Modifies a function call with an additional parameter in the x/pocketcore/keeper directory.

  • File "service_test.go": Makes changes to set up the HTTP client, configure gock mocks for testing, and make a request to a specific URL.

  • File "service_test.go": Adds imports, variable initialization, defer statements, and gock mocks for testing. Modifies test cases and adds a new test function.

  • File "auth.go": Adds a new file to the app/cmd/rpc/mesh package, which includes functions related to authentication in the mesh package.

  • File "evidence_worker.go": Adds various imports, types, and functions related to evidence workers and configuration.

  • File "mesh.go": Adds a new file called "mesh.go" with the implementation of the MeshConfig struct and related functions.

  • File "querier.go": Changes a function call with an additional argument in the x/pocketcore/keeper directory.

  • File "service_test.go": Adds imports, variable initialization, defer statements, gock mocks, test cases, and a new test function.

  • File "keys.go": Adds a new file to the app/cmd/rpc/mesh directory, which includes structs and functions related to loading and storing servicers and nodes.

  • File "app.go": Adds a new file to the app/cmd/rpc/mesh/ directory, which includes various packages, imports, constants, variables, and functions related to starting the Mesh RPC server.

  • File "rpc_test.go": Changes a variable name from simParams to SimRelayParams.

  • File "cli.md": Adds a new section for starting a Mesh Node in the command-line interface documentation, including instructions and available options.

  • File "rpc.go": Adds an additional parameter to the StartRPC function and appends additional routes if meshNode is true. Nothing is deleted or modified.

  • File "types/app.go": Introduces a new type called "EvidenceWorker" and adds new fields to the "PocketConfig" type. Adds a new function "DefaultConfig" for initializing default values.

If you need further assistance or details on specific changes, please let me know.

Fix high memory consumption that is also part of issue pokt-network#1457.
Under high request load (1000 rps or more), RAM usage climbed to 40 GB or close to that.
After the fix for pokt-network#1457 with the worker pool, the node stays under 14 GB of RAM in my local tests.
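
For readers unfamiliar with the pattern, here is a minimal sketch of how a bounded worker pool caps memory under load. It uses only the standard library, not the pool library the mesh code actually depends on, and the names `Pool`, `NewPool`, `Submit`, `StopAndWait` and `sumSquares` are hypothetical, chosen for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is a hypothetical fixed-size worker pool with a bounded task queue.
type Pool struct {
	tasks chan func()
	wg    sync.WaitGroup
}

// NewPool starts `workers` goroutines that drain a queue holding at most
// `capacity` pending tasks.
func NewPool(workers, capacity int) *Pool {
	p := &Pool{tasks: make(chan func(), capacity)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Submit blocks while the queue is full, so callers are slowed down
// instead of spawning an unbounded number of goroutines.
func (p *Pool) Submit(task func()) { p.tasks <- task }

// StopAndWait closes the queue and waits for all queued tasks to finish.
func (p *Pool) StopAndWait() {
	close(p.tasks)
	p.wg.Wait()
}

// sumSquares pushes n small tasks through the pool and returns 1^2 + ... + n^2.
func sumSquares(n, workers, capacity int) int {
	var mu sync.Mutex
	total := 0
	p := NewPool(workers, capacity)
	for i := 1; i <= n; i++ {
		i := i // capture the loop variable for the closure
		p.Submit(func() {
			mu.Lock()
			total += i * i
			mu.Unlock()
		})
	}
	p.StopAndWait()
	return total
}

func main() {
	fmt.Println(sumSquares(100, 4, 8)) // prints 338350
}
```

The point of the bounded queue is that at most `workers + capacity` tasks are alive at any moment, no matter how fast requests arrive.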
* Fixed RPC timeout being handled as seconds instead of milliseconds
* Updated mesh.md to cover the new cache configurations
* Updated mesh.md to list /v1/private/mesh/session as required on the whitelist endpoints/paths
* Fixed /v1/private/mesh/updatechains to properly update chains in memory and on disk
* Added hot reload for servicer private key files (add & remove)
  * on add, turn on the checks and start allowing the servicer
  * on remove, stop receiving new relays and consume all pending relays in the queue
* Version bump
* Enhanced log about missing sessions
* Version Bump
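
The RPC timeout fix listed above comes down to a unit conversion. The following is only a sketch of that conversion, assuming the configured value is an integer number of milliseconds; `timeoutFromMillis` and `newClient` are hypothetical helpers, not names taken from the PR.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// timeoutFromMillis interprets a configured integer as milliseconds.
// Multiplying by time.Second here instead would make every timeout
// 1000x longer than intended.
func timeoutFromMillis(ms int64) time.Duration {
	return time.Duration(ms) * time.Millisecond
}

// newClient builds an HTTP client whose timeout comes from a
// millisecond-based config value (hypothetical helper).
func newClient(rpcTimeoutMs int64) *http.Client {
	return &http.Client{Timeout: timeoutFromMillis(rpcTimeoutMs)}
}

func main() {
	c := newClient(30000)
	fmt.Println(c.Timeout) // prints 30s
}
```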
…rivate key is removed after it has been supported by the mesh node.

* Version Bump
…ral solution)

* Fixed an error that made the process panic when loading a servicer_url without an http/https scheme. It now reports the error properly.
* Added manual cron to compact relays database every hour.
* Removed a log2.Fatal that was crashing the process.
* relay_cache_background_sync_interval was not used
* relay_cache_background_compaction_interval was not used
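
As an illustration of the servicer_url fix in this list, a validation step can return an error instead of panicking. This is a sketch under assumptions: `validateServicerURL` is a hypothetical name, and the exact checks and messages in the PR may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateServicerURL returns an error (rather than panicking) when the
// URL cannot be parsed, lacks an http/https scheme, or has no host.
func validateServicerURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid servicer_url %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("invalid servicer_url %q: scheme must be http or https", raw)
	}
	if u.Host == "" {
		return fmt.Errorf("invalid servicer_url %q: missing host", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{"http://localhost:8081", "localhost:8081", "node.example.com"} {
		fmt.Printf("%-24s -> %v\n", raw, validateServicerURL(raw))
	}
}
```

Note that `url.Parse("localhost:8081")` succeeds and treats `localhost` as the scheme, which is why the explicit scheme check is needed.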

Added:
* hot_reload_interval: set to 0 to turn off hot reload of chains/servicers; otherwise it is the interval, in milliseconds, between checks of the files
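
A rough sketch of how a millisecond interval with a "0 disables" rule can drive a reload loop. The function names are hypothetical, and treating negative values as disabled is an assumption of this sketch, not something the PR states.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// reloadInterval converts a hot_reload_interval value (milliseconds) to a
// duration. ok is false when hot reload is disabled (0; negative values are
// also treated as disabled here, which is an assumption).
func reloadInterval(ms int64) (d time.Duration, ok bool) {
	if ms <= 0 {
		return 0, false
	}
	return time.Duration(ms) * time.Millisecond, true
}

// startHotReload calls reload on every tick until stop is closed.
// It returns false without starting anything when reload is disabled.
func startHotReload(ms int64, reload func(), stop <-chan struct{}) bool {
	d, ok := reloadInterval(ms)
	if !ok {
		return false
	}
	go func() {
		t := time.NewTicker(d)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				reload()
			case <-stop:
				return
			}
		}
	}()
	return true
}

func main() {
	var reloads int64
	stop := make(chan struct{})
	started := startHotReload(10, func() { atomic.AddInt64(&reloads, 1) }, stop)
	time.Sleep(100 * time.Millisecond)
	close(stop)
	fmt.Println("started:", started, "reloaded at least once:", atomic.LoadInt64(&reloads) > 0)
	fmt.Println("started with 0:", startHotReload(0, func() {}, stop))
}
```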

Updated:
* Servicer health checks now run every 60s (was 30s); this will be configurable through config.json in the future
* Old sessions are now evaluated for removal every 30m (was 30s); this will be configurable through config.json in the future
* The config.json example in the docs.

Removed:
* Removed the manual relays DB compaction job. We received reports that it corrupted the relays database when it ran at the same time as the background compaction configured by relay_cache_background_compaction_interval
… from storage in any case after they succeed or fail.

Fixed log that was printing node instead of app public key.
Added different key format.
Refactor connectivity checks.
Refactor node/servicer internal structure of mesh to reduce amount of worker/cron instances.
Refactor chains/keys reload.
Added FullNode worker dynamic resize on servicers change.
Updated servicers reload to only run the modification on maps when there is something new/removed.
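
The "only modify the maps when something was added or removed" idea can be sketched as a set diff. This is not the PR's code: `diffKeys` and `applyReload` are hypothetical names, and servicers are reduced to plain string keys for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// diffKeys compares the currently loaded keys with the freshly read ones and
// reports which keys are new and which have disappeared.
func diffKeys(current map[string]struct{}, next []string) (added, removed []string) {
	seen := make(map[string]struct{}, len(next))
	for _, k := range next {
		if _, dup := seen[k]; dup {
			continue // ignore duplicate entries in the file
		}
		seen[k] = struct{}{}
		if _, ok := current[k]; !ok {
			added = append(added, k)
		}
	}
	for k := range current {
		if _, ok := seen[k]; !ok {
			removed = append(removed, k)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

// applyReload mutates current only when the diff is non-empty and reports
// whether anything changed.
func applyReload(current map[string]struct{}, next []string) bool {
	added, removed := diffKeys(current, next)
	if len(added) == 0 && len(removed) == 0 {
		return false
	}
	for _, k := range added {
		current[k] = struct{}{}
	}
	for _, k := range removed {
		delete(current, k)
	}
	return true
}

func main() {
	loaded := map[string]struct{}{"a": {}, "b": {}}
	fmt.Println(applyReload(loaded, []string{"b", "c"})) // prints true
	fmt.Println(applyReload(loaded, []string{"c", "b"})) // prints false
}
```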
…e and better readability of the code without so many casts.

Refactor fullNode.Servicer to be a map instead of a slice.
Enhance a bit more the logs and bootstrap time information.
Added metrics config support.
Refactor code to split in files.
Bump pond version to 1.8.3 (patch).
Clean up the code.
Updated config to allow separate RPC timeout values for chain, client, and pocket node calls.
Ensure that the HTTP response body is read even on failed requests so that connections can be reused.
Enhanced chains reload logs.
Enhanced startup logs.
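
On the response-body point in this commit: Go's HTTP transport can only reuse a keep-alive connection once the body has been read to the end and closed. A minimal sketch of that pattern follows; `drainAndClose`, `statusOf` and `demoStatus` are hypothetical helper names, not the PR's.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// drainAndClose reads whatever is left of the body and closes it, so the
// transport can put the connection back into its keep-alive pool.
func drainAndClose(resp *http.Response) {
	if resp == nil || resp.Body == nil {
		return
	}
	_, _ = io.Copy(io.Discard, resp.Body)
	_ = resp.Body.Close()
}

// statusOf performs a GET and returns the status code. The body is drained
// even when the status signals an error.
func statusOf(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer drainAndClose(resp)
	return resp.StatusCode, nil
}

// demoStatus runs statusOf against a local test server that always fails.
func demoStatus() (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "upstream failure", http.StatusInternalServerError)
	}))
	defer srv.Close()
	return statusOf(srv.Client(), srv.URL)
}

func main() {
	code, err := demoStatus()
	fmt.Println(code, err) // prints 500 <nil>
}
```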
… so many edge cases and possible infinite goroutine spams.

Added name property to nodes as an optional key; if it is not set, the hostname of the node URL is used.
Added minWorker, maxWorker, maxCapacity to prometheus metrics collectors.
Refactor minWorker, maxWorker and maxCapacity option in config.
Bumped the defaults to more realistic values.
Updated docs.
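
The fallback for the optional node name described above could look roughly like this. It is a sketch only: `nodeName` is a hypothetical function, and the error behaviour for URLs without a host is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// nodeName returns the configured name, or the hostname of the node URL
// when no name is set.
func nodeName(name, nodeURL string) (string, error) {
	if name != "" {
		return name, nil
	}
	u, err := url.Parse(nodeURL)
	if err != nil {
		return "", fmt.Errorf("invalid node url %q: %w", nodeURL, err)
	}
	host := u.Hostname() // hostname without the port
	if host == "" {
		return "", fmt.Errorf("node url %q has no host", nodeURL)
	}
	return host, nil
}

func main() {
	n, _ := nodeName("", "https://node-1.example.com:8081")
	fmt.Println(n) // prints node-1.example.com
	n, _ = nodeName("primary", "https://node-1.example.com:8081")
	fmt.Println(n) // prints primary
}
```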
jorgecuesta and others added 27 commits July 10, 2023 15:36
…ockchain call.

Reduce default values for the retries.
Moved CodeRequestHash & CodeInvalidBlockHeightError from invalidate session codes to non retryable ones.
… address. Prevent the race condition on the evidence store and do not hold relay evidence because another address is using the queue.
…nse. Instead, return more generic information but log the real error for internal node runner tracking.

Enhance error status types for servicer and chain not found.
…uld be discarded.

Added session validation before notify to avoid calling a node if the session was already invalidated.
Bump version to 0.3.2
Enhance code based on @PoktBlade comments.
Added unit-test for ShouldAssumeOptimisticSession function.
Added missing session height validation on relays.
…reduce the latency on session changes.

Fixed an issue on /v1/private/mesh/check that threw an error asking for the latest height params.
Fixed the relay_time metric, which was not wrapping the "validate" method, the one that can take more time after calling the blockchain.
Added missing initialization of the metrics for session storage.
Added metric for call blockchain as a separated one.
Enhance few error logs.
Added additional metrics.
Fixed multiple issues detected around offloaded session validate tasks.
Fixed an issue that prevented a session from being validated when the session already existed for another servicer handled by the mesh.
Added more defensive checks to avoid panics due to nil pointers.
… conditions and reduce memory consumption.

Enhance logs for relay and session.
Enhance metrics for notify process.
…k session height 100589.

Removed line breaks (\n) from the errors provided by the pocketcore code. They make it harder to use tooling like Loki, which collects the text up to each line break as a separate entry.
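
To illustrate the log change above, a small helper can flatten a multi-line error message into a single line before it is logged. This is a sketch: `singleLine` is a hypothetical name and the sample message is made up, not real pocketcore output.

```go
package main

import (
	"fmt"
	"strings"
)

// newlines replaces Windows and Unix line breaks with a single space.
var newlines = strings.NewReplacer("\r\n", " ", "\n", " ")

// singleLine flattens a message so line-oriented log collectors see it as
// one entry.
func singleLine(msg string) string {
	return strings.TrimSpace(newlines.Replace(msg))
}

func main() {
	// Hypothetical multi-line error text.
	msg := "session mismatch\nheight: 10\n"
	fmt.Printf("%q\n", singleLine(msg)) // prints "session mismatch height: 10"
}
```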
… will be done by GetSession a few lines below.
@reviewpad reviewpad bot added the large Pull request is large label Jul 15, 2023
Labels
large Pull request is large waiting-for-review
2 participants