Feature: github bot to scan the new instance request. #70

Open
dalf opened this issue Jul 16, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@dalf
Contributor

dalf commented Jul 16, 2020

As soon as there is a new instance request, a bot can:

The bot can also scan for a comment like "@searx-bot add instance" (comment from the project maintainers) and add the instance automatically.

But once this is implemented, what is the value of a human review?

@dalf dalf added the enhancement New feature or request label Jul 16, 2020
@unixfox
Member

unixfox commented Jul 16, 2020

That's a good idea! The bot could also check some other components, like TLS and IPv6.

But I think there should always be a human review. Maybe we could implement something like a "/lgtm" command that is only available to collaborators, and require, for example, two LGTMs from collaborators before the instance is merged.
If the bot sees that there are two LGTMs, it automatically adds the instance.
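
A minimal sketch of that rule in Python, just to make it concrete. The `Comment` type, the function names and the threshold constant are all made up for illustration; a real bot would fill the comments and the collaborator list from the GitHub API.

```python
from dataclasses import dataclass

REQUIRED_APPROVALS = 2  # assumption: the "two LGTM" threshold suggested above


@dataclass(frozen=True)
class Comment:
    """Hypothetical, simplified view of an issue comment."""
    author: str
    body: str


def lgtm_approvers(comments, collaborators):
    """Return the collaborators who posted a line consisting only of /lgtm."""
    approvers = set()
    for comment in comments:
        if comment.author not in collaborators:
            continue  # commands from non-collaborators are ignored
        lines = [line.strip().lower() for line in comment.body.splitlines()]
        if "/lgtm" in lines:
            approvers.add(comment.author)  # a set, so repeats count once
    return approvers


def can_add_instance(comments, collaborators, required=REQUIRED_APPROVALS):
    """True once enough distinct collaborators have approved."""
    return len(lgtm_approvers(comments, collaborators)) >= required
```

Counting distinct authors rather than comments means one collaborator can't approve twice by repeating the command.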

@unixfox
Member

unixfox commented Sep 7, 2020

What's the programming language that you want to have for this bot?

@dalf
Contributor Author

dalf commented Sep 7, 2020

I've started to write https://github.com/dalf/botsandbox in Python (more experimental than anything else)

The idea is:

  • the bot runs on check.searx.space to avoid issues with filtron / antibot-proxy / whatever on the searx instances (if the bot can't scan the instance, then searx.space can't either).
  • if there is a new instance:
    • it is queued for a scan by searx-stats2, using Celery and Redis.
    • it is just a function call from the Python point of view.
    • one scan at a time (if for some reason there is some spam on the searx-instances issue tracker, the bot will be able to deal with it).
    • some additional tests (filtron, DNSSEC)
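
To illustrate the "one scan at a time" point, here is a rough stand-in that uses only the standard library (a `queue.Queue` drained by a single worker thread) instead of Celery and Redis. `scan_instance` and `ScanQueue` are hypothetical names, and the scan itself is a placeholder for the real searx-stats2 call. With Celery, I'd expect the same effect from running a single worker with `--concurrency=1`.

```python
import queue
import threading


def scan_instance(url):
    """Hypothetical placeholder for the real searx-stats2 scan."""
    return {"url": url, "ok": True}


class ScanQueue:
    """Runs queued scans strictly one at a time, in submission order."""

    def __init__(self, scan=scan_instance):
        self._scan = scan
        self._jobs = queue.Queue()
        self.results = []
        # A single worker thread is what serializes the scans.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, url):
        """Queue a scan; returns immediately even if a scan is running."""
        self._jobs.put(url)

    def wait(self):
        """Block until every queued scan has been processed."""
        self._jobs.join()

    def _run(self):
        while True:
            url = self._jobs.get()
            try:
                self.results.append(self._scan(url))
            except Exception as exc:
                # A failing scan is recorded and must not kill the worker.
                self.results.append({"url": url, "ok": False, "error": str(exc)})
            finally:
                self._jobs.task_done()
```

A burst of new issues then only makes the queue longer; it never starts parallel scans.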

probot for node seems cleaner, but

  • it adds a new language (golang would be okay, but I'm not sure this is the right language for that purpose).
  • as I understand it, the only way to call Python from Node is to spawn a new process:
    • to scan an instance (call searx-stats2)
    • to add / remove an instance from searx-instances (call searx-instances).

@unixfox
Member

unixfox commented Sep 7, 2020

I'm fine with Python, even though my preferred language is JavaScript.

A webhook is a good idea. A GitHub App is probably better, because a personal access token gives too much access to your account if it somehow gets leaked.
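
Either way, a webhook endpoint should check that a delivery really comes from GitHub before acting on it. A small sketch, assuming the webhook is configured with a secret: GitHub then sends an `X-Hub-Signature-256` header containing `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function name is only for illustration.

```python
import hashlib
import hmac


def is_valid_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Check a webhook body against its X-Hub-Signature-256 header value."""
    digest = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The check has to run on the raw bytes of the request body, before any JSON parsing.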

If the bot is open source, how are contributors going to test it if it needs to send requests from check.searx.space?

One idea I have: a temporary environment for each PR that runs using the IP of check.searx.space, like GitLab already does with Review Apps. See an example here for websites: https://youtu.be/h2pv_syqO24?t=110. Each commit gets a new temporary environment so that the developer can test each new change. You don't even need to run the Python app on your VPS; you could run it on a separate server (in Docker) that uses the check.searx.space server as a proxy.

I don't know how to do that. Are there apps that already do this kind of thing?

@dalf
Contributor Author

dalf commented Sep 8, 2020

A GitHub App could run on a runner, but self-hosted runners are discouraged on public repositories: https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories

Network connection from a GitHub runner:

  • Proxy: it won't work for the CSP and TLS grades, so I guess it is a no-go. Details:

  • A WireGuard connection to check.searx.space, but the private key can't be shared, otherwise it provides an open proxy to anyone: so only for the master branch, not for PRs.

So whatever solution I can think of, only the master branch of the bot could run the tests on check.searx.space; for PRs and forks, the code will run on a GitHub runner.

But even if it is a GitHub App, it requires a test environment:

  • either a repository,
  • or some tools to simulate GitHub's behavior.

@searx searx locked and limited conversation to collaborators Dec 29, 2023
@searx searx deleted a comment Dec 29, 2023

2 participants