Handling OpenAI 429's gracefully #4153

Open
pascalwhoop opened this issue Jul 25, 2024 · 0 comments

Expected Behavior (Mandatory)

The ability to control the backoff strategy for OpenAI requests when making a large volume of embedding calls. This is standard practice in almost every library I've used, because we cannot assume that our API providers have infinite capacity.
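To illustrate the kind of control being asked for, here is a minimal client-side sketch of exponential backoff with full jitter. It is not APOC's implementation or API: `RateLimitError`, `call_with_backoff`, and all parameter names are hypothetical, and the `sleep` and `rng` arguments exist only so the behaviour can be checked without real waiting.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the embeddings API."""


def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Run call(), retrying on RateLimitError with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the error to the caller
            # The cap grows as base_delay * 2^attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random fraction of the cap so that
            # concurrent workers do not all retry at the same moment.
            sleep(cap * rng())
```

With something along these lines exposed as configuration (maximum retries, base delay, maximum delay), a batch that hits a 429 would be delayed and retried instead of being counted as failed.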

Actual Behavior (Mandatory)

] version=71, last transaction in previous log=5140, rotation took 51 millis, started after 7843 millis."}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"Error during iterate.commit:"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"1887 times: org.neo4j.graphdb.QueryExecutionException: Failed to invoke procedure `apoc.ml.openai.embedding`: Caused by: java.io.IOException: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/embeddings"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"332 times: org.neo4j.graphdb.QueryExecutionException: Failed to invoke procedure `apoc.ml.openai.embedding`: Caused by: java.net.SocketTimeoutException: Connect timed out"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"Error during iterate.execute:"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"332 times: Connect timed out"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"1887 times: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/embeddings"}

How to Reproduce the Problem

Try embedding 5M nodes with 2000 nodes batched per API request (to maximise throughput). You will end up hitting HTTP 429 responses for exceeding the tokens-per-minute limit.
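A rough back-of-the-envelope sketch of the load this produces. The node count, batch size, and concurrency come from this issue. The average tokens per name and the tokens-per-minute limit are made-up placeholder values, not measured figures or actual OpenAI limits, and the function names are hypothetical.

```python
import math


def estimate_load(total_nodes, batch_size, concurrency):
    """Return (number of API requests, inputs in flight when all workers are busy)."""
    requests = math.ceil(total_nodes / batch_size)
    in_flight_inputs = batch_size * concurrency
    return requests, in_flight_inputs


def min_minutes(total_nodes, avg_tokens_per_input, tokens_per_minute_limit):
    """Lower bound on runtime if the tokens-per-minute limit is the only constraint."""
    return total_nodes * avg_tokens_per_input / tokens_per_minute_limit


# Values from this issue: 5M nodes, 2000 per request, concurrency 50.
requests, in_flight = estimate_load(5_000_000, 2000, 50)
print(requests)   # 2500 requests in total
print(in_flight)  # 100000 names submitted at once when all 50 workers are active

# Placeholder assumptions: 5 tokens per name, 1,000,000 tokens per minute.
print(min_minutes(5_000_000, 5, 1_000_000))  # 25.0 minutes at best
```

Under these assumed numbers the job cannot finish faster than the rate limit allows, so without backoff the extra concurrency mostly turns into failed batches.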

Specifications (Mandatory)

CALL apoc.periodic.iterate(
    'MATCH (p:`Entity`) RETURN p', 
    'CALL apoc.ml.openai.embedding([item in $_batch | item.p.name], $apiKey, {endpoint: $endpoint, model: $model}) YIELD index, text, embedding CALL apoc.create.setProperty($_batch[index].p, $attribute, embedding) YIELD node RETURN node', 
    {`batchMode`: 'BATCH_SINGLE', `batchSize`: 2000, `concurrency`: 50, `parallel`: 'true', `params`: {`apiKey`: 'KEY', `attribute`: 'embedding', `endpoint`: 'https://api.openai.com/v1', `model`: 'text-embedding-3-small'}}
) YIELD batch, operations

Currently used versions

# pypher
python-cypher==0.20.1

# helm chart
- name: neo4j 
  version: 5.20.0
  repository: https://neo4j.github.io/helm-charts/

Versions

  • OS: GKE
  • Neo4j: 5.20.0
  • Neo4j-Apoc: 5.20.0