podman compose up: psycopg2.OperationalError: connection to server at "127.0.0.1", port 5432 failed: Connection refused #395

Open
jwmatthews opened this issue Sep 27, 2024 · 3 comments
jwmatthews commented Sep 27, 2024

OS: MacOS

I am testing podman compose up and ran into the error below.
I believe this is intermittent and will not be easy to reproduce. I don't yet know the root cause.

kai_1               | psycopg2.OperationalError: connection to server at "127.0.0.1", port 5432 failed: Connection refused
kai_1               | 	Is the server running on that host and accepting TCP/IP connections?

I am guessing this may be caused by something not cleaning up.
I have been switching back and forth between podman compose up and make run-postgres & make run-server while testing workflows.

Now the podman compose flow reports that it can't connect to the DB.
I'm filing this issue because a few other folks I've spoken to have also hit it intermittently. This is the first time I've seen it with podman compose up.

The entire console output from $ TAG="local" podman compose up is below
https://gist.github.com/jwmatthews/d48abfccc368304c5a2fe088c4a0e12a

$ TAG="local" podman compose up   
>>>> Executing external compose provider "/usr/local/bin/docker-compose". Please see podman-compose(1) for how to disable this message. <<<<

Creating network "kai_default" with the default driver
Creating volume "kai_kai_db_data" with default driver
Creating kai_kai_db_1 ... done
Creating kai_kai_1    ... done
Attaching to kai_kai_db_1, kai_kai_1
kai_1               | Waiting until postgres is ready
kai_db_1            | Warning: Can't detect cpuset size from cgroups, will use nproc
kai_db_1            | The files belonging to this database system will be owned by user "postgres".
kai_db_1            | This user must also own the server process.
kai_db_1            | 
kai_db_1            | The database cluster will be initialized with locale "en_US.utf8".
kai_db_1            | The default database encoding has accordingly been set to "UTF8".
kai_db_1            | The default text search configuration will be set to "english".
kai_db_1            | 
kai_db_1            | Data page checksums are disabled.
kai_db_1            | 
kai_db_1            | fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
kai_db_1            | creating subdirectories ... ok
kai_db_1            | selecting dynamic shared memory implementation ... posix
kai_db_1            | selecting default max_connections ... 100
kai_db_1            | selecting default shared_buffers ... 128MB
kai_db_1            | selecting default time zone ... Etc/UTC
kai_db_1            | creating configuration files ... ok
kai_db_1            | running bootstrap script ... ok
kai_db_1            | performing post-bootstrap initialization ... ok
kai_db_1            | syncing data to disk ... ok
kai_db_1            | 
kai_db_1            | 
kai_db_1            | Success. You can now start the database server using:
kai_db_1            | 
kai_db_1            |     pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start
kai_db_1            | 
kai_db_1            | initdb: warning: enabling "trust" authentication for local connections
kai_db_1            | initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
kai_db_1            | waiting for server to start....2024-09-27 12:59:21.822 UTC [35] LOG:  redirecting log output to logging collector process
kai_db_1            | 2024-09-27 12:59:21.822 UTC [35] HINT:  Future log output will appear in directory "log".
kai_db_1            |  done
kai_db_1            | server started
kai_db_1            | /var/run/postgresql:5432 - accepting connections
kai_db_1            | => sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
kai_db_1            | ALTER ROLE
kai_db_1            | waiting for server to shut down.... done
kai_db_1            | server stopped
kai_db_1            | Starting server...
kai_db_1            | 2024-09-27 12:59:22.171 UTC [1] LOG:  redirecting log output to logging collector process
kai_db_1            | 2024-09-27 12:59:22.171 UTC [1] HINT:  Future log output will appear in directory "log".
kai_1               | 
kai_1               | Postgres is ready
kai_1               | ################################################
kai_1               | load-data has never been run.                  #
kai_1               | Please wait, this will take a few minutes.     #
kai_1               | ################################################
kai_1               | Using custom config file
kai_1               | Console logging for 'kai' is set to level 'INFO'
kai_1               | File logging for 'kai' is set to level 'DEBUG' writing to file: '/kai/logs/kai_psql.log'
kai_1               | config: log_level='info' file_log_level='debug' log_dir='$pwd/logs' demo_mode=False trace_enabled=True gunicorn_workers=8 gunicorn_timeout=3600 gunicorn_bind='0.0.0.0:8080' incident_store=KaiConfigIncidentStore(solution_detectors=<SolutionDetectorKind.NAIVE: 'naive'>, solution_producers=<SolutionProducerKind.TEXT_ONLY: 'text_only'>, args=KaiConfigIncidentStorePostgreSQLArgs(provider=<KaiConfigIncidentStoreProvider.POSTGRESQL: 'postgresql'>, host='127.0.0.1', database='kai', user='kai', password='dog8code', connection_string=None, solution_detection=<SolutionDetectorKind.NAIVE: 'naive'>)) models=KaiConfigModels(provider='ChatOpenAI', args={'model': 'gpt-4o'}, template=None, llama_header=None, llm_retries=5, llm_retry_delay=10.0) solution_consumers=[<SolutionConsumerKind.DIFF_ONLY: 'diff_only'>, <SolutionConsumerKind.LLM_SUMMARY: 'llm_summary'>]
kai_1               | INFO - 2024-09-27 12:59:33,339 - kai.service.kai_application.kai_application - [  kai_application.py:54   -             __init__()] - Tracing enabled.
kai_1               | /kai/kai/service/llm_interfacing/model_provider.py:144: LangChainDeprecationWarning: The class `ChatOpenAI` was deprecated in LangChain 0.0.10 and will be removed in 0.3.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import ChatOpenAI`.
kai_1               |   self.llm: BaseChatModel = model_class(**model_args)
kai_1               | INFO - 2024-09-27 12:59:33,557 - kai.service.kai_application.kai_application - [  kai_application.py:63   -             __init__()] - Selected provider: ChatOpenAI
kai_1               | INFO - 2024-09-27 12:59:33,557 - kai.service.kai_application.kai_application - [  kai_application.py:64   -             __init__()] - Selected model: gpt-4o
kai_1               | Traceback (most recent call last):
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 145, in __init__
kai_1               |     self._dbapi_connection = engine.raw_connection()
kai_1               |                              ^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 3292, in raw_connection
kai_1               |     return self.pool.connect()
kai_1               |            ^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 452, in connect
kai_1               |     return _ConnectionFairy._checkout(self)
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 1269, in _checkout
kai_1               |     fairy = _ConnectionRecord.checkout(pool)
kai_1               |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 716, in checkout
kai_1               |     rec = pool._do_get()
kai_1               |           ^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/impl.py", line 169, in _do_get
kai_1               |     with util.safe_reraise():
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
kai_1               |     raise exc_value.with_traceback(exc_tb)
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/impl.py", line 167, in _do_get
kai_1               |     return self._create_connection()
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 393, in _create_connection
kai_1               |     return _ConnectionRecord(self)
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 678, in __init__
kai_1               |     self.__connect()
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 902, in __connect
kai_1               |     with util.safe_reraise():
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
kai_1               |     raise exc_value.with_traceback(exc_tb)
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/pool/base.py", line 898, in __connect
kai_1               |     self.dbapi_connection = connection = pool._invoke_creator(self)
kai_1               |                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/engine/create.py", line 637, in connect
kai_1               |     return dialect.connect(*cargs, **cparams)
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/sqlalchemy/engine/default.py", line 616, in connect
kai_1               |     return self.loaded_dbapi.connect(*cargs, **cparams)
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               |   File "/opt/app-root/lib64/python3.12/site-packages/psycopg2/__init__.py", line 122, in connect
kai_1               |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
kai_1               |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kai_1               | psycopg2.OperationalError: connection to server at "127.0.0.1", port 5432 failed: Connection refused
kai_1               | 	Is the server running on that host and accepting TCP/IP connections?


$ podman version
Client:       Podman Engine
Version:      5.3.0-dev
API Version:  5.3.0-dev
Go Version:   go1.23.0
Git Commit:   ef905ef8d0c2dd9468029c6575ce4ab74442b725
Built:        Fri Aug 30 12:54:11 2024
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      5.3.0-dev-e8410b839
API Version:  5.3.0-dev-e8410b839
Go Version:   go1.22.6
Built:        Thu Aug 15 20:00:00 2024
OS/Arch:      linux/arm64

jwmatthews added the bug label Sep 27, 2024
jwmatthews self-assigned this Sep 27, 2024

@jwmatthews
Member Author

I've restarted the podman machine VM and still see the error above.

@jwmatthews
Member Author

The issues I saw were related to my config.toml.
I ended up with a database configuration in it that caused the problem.

Issue here:

[incident_store.args]
provider = "postgresql"
host = "127.0.0.1"

The problem is that when we run via podman compose up, we expect the database host to be kai_db, which is set via the line below:
https://github.com/konveyor/kai/blob/main/compose.yaml#L20
KAI__INCIDENT_STORE__ARGS__HOST: kai_db

When we set the below in config.toml, it takes precedence over the environment variable, which breaks us.

[incident_store.args]
host = "127.0.0.1"
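
The precedence problem described above can be sketched as follows. This is a minimal illustration, not Kai's actual config loader: the env_overrides and merge helpers are hypothetical, and the mapping of KAI__INCIDENT_STORE__ARGS__HOST onto the nested incident_store.args.host key is an assumption inferred from the variable name.

```python
from typing import Any, Mapping

PREFIX = "KAI__"  # assumed prefix, inferred from KAI__INCIDENT_STORE__ARGS__HOST


def env_overrides(environ: Mapping[str, str]) -> dict[str, Any]:
    """Turn KAI__A__B=value into {"a": {"b": value}} (hypothetical helper)."""
    result: dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        keys = name[len(PREFIX):].lower().split("__")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


def merge(low: Mapping[str, Any], high: Mapping[str, Any]) -> dict[str, Any]:
    """Recursively merge two nested dicts; values in `high` win."""
    merged = dict(low)
    for key, value in high.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values as they appear in this issue.
file_config = {"incident_store": {"args": {"provider": "postgresql", "host": "127.0.0.1"}}}
environ = {"KAI__INCIDENT_STORE__ARGS__HOST": "kai_db"}

# Behaviour reported in this issue: the file is applied last, so it wins.
file_wins = merge(env_overrides(environ), file_config)
print(file_wins["incident_store"]["args"]["host"])  # 127.0.0.1

# The opposite order would let the compose environment variable win.
env_wins = merge(file_config, env_overrides(environ))
print(env_wins["incident_store"]["args"]["host"])  # kai_db
```

Whichever source is merged last decides the host, which is why a stray host entry in config.toml is enough to defeat the value set in compose.yaml.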

What I did was:

  1. cp kai/config.toml build/config.toml
  2. Edited build/config.toml with minimal edits to the model; I left the incident_store config as it is in kai/config.toml

This is what I ran with when I saw the error

$ cat build/config.toml
# Default configuration file for Kai. For a better understanding of the
# configuration options, please refer to `build/example_config.toml`

log_level = "info"
file_log_level = "debug"
log_dir = "$pwd/logs"
demo_mode = false
trace_enabled = true

solution_consumers = ["diff_only", "llm_summary"]

[incident_store]
solution_detectors = "naive"
solution_producers = "text_only"

[incident_store.args]
provider = "postgresql"
host = "127.0.0.1"
database = "kai"
user = "kai"
password = "dog8code"

[models]
provider = "ChatOpenAI"

[models.args]
model = "gpt-4o"

Then I used the below and it worked:


$ cat build/config.toml
# Default configuration file for Kai. For a better understanding of the
# configuration options, please refer to `build/example_config.toml`

log_level = "info"
file_log_level = "debug"
log_dir = "$pwd/logs"
demo_mode = false
trace_enabled = true

solution_consumers = ["diff_only", "llm_summary"]

[incident_store]
solution_detectors = "naive"
solution_producers = "text_only"

#[incident_store.args]
#provider = "postgresql"
#host = "127.0.0.1"
#database = "kai"
#user = "kai"
#password = "dog8code"

[models]
#provider = "ChatIBMGenAI"
provider = "ChatOpenAI"

[models.args]
model = "gpt-4o"

@jwmatthews
Member Author

Recap of what I saw:

  1. Copy kai/config.toml to build/config.toml
     • When I did this, I ended up overriding the database host without realizing it
  2. Run via podman compose up
  3. The compose.yaml sets the postgres host as an environment variable
  4. Kai runs and gives the value from config.toml precedence over the environment variable
  5. Kai is unable to reach the database because it tries to reach 127.0.0.1:5432 instead of kai_db:5432
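
The last step can be confirmed independently of Kai with a plain TCP probe. This is a minimal sketch using only the Python standard library; the can_connect helper is hypothetical and is not part of Kai.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Run from inside the kai container, can_connect("kai_db", 5432) would be expected to succeed while can_connect("127.0.0.1", 5432) fails, assuming the two containers do not share a network namespace, since 127.0.0.1 then refers to the kai container itself rather than to the database container.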
