# feat: Add afterCompletion callback option for runTools to enable easily building multi-model / multi-agent flows #1064

**Open** · wants to merge 6 commits into `master`

**helpers.md** (+36 −0)

#### Use `afterCompletion` for multi-agent patterns

The `afterCompletion` callback enables some powerful multi-agent patterns. By passing `runner.messages` to another LLM chat inside `afterCompletion`, you can easily have a second model analyze the conversation so far and, for example, conditionally inject web research or other targeted guidance to help the first model overcome problems. A sketch of that fuller pattern follows the basic example below.

```ts
import OpenAI from 'openai';

const client = new OpenAI();

async function main() {
  let shouldInjectMessage = false; // You can do any kind of conditional logic you want here
  const runner = client.chat.completions.runTools(
    {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: "How's the weather this week in Los Angeles?" }],
      tools: [
        // A whole bunch of tools... perhaps so many that we need to offload some
        // cognitive overhead to another chat via afterCompletion...
      ],
    },
    {
      afterCompletion: async () => {
        if (!shouldInjectMessage) {
          runner._addMessage({
            role: 'system',
            content: `Here's some up-to-date information I've found from the web that can help you with your next response: 42.`,
          });
          shouldInjectMessage = true;
        }
      },
    },
  );

  console.log(await runner.finalContent());
}

main();
```

Review discussion on the `role: 'system'` line:

> **Reviewer:** I know this is just an example, but I think we'd want this to be an assistant message also.

> **Contributor Author:** Actually, I'm a bit unclear on this: I've been using `system` for these kinds of messages where I "inject" important context or attempt to give the model a strong nudge to go in a different direction or update its instructions partway through a chat, and that seems to work well for me. (I also display all assistant messages in my FE by default and wouldn't want to display this message, but I could find a way around that if needed.) Could I potentially get better results in such cases by using the `assistant` role and changing up how I word the message content accordingly? I'm consistently getting the model to respond intelligently to inline web research using my current approach, but I'm curious, and honestly I've probably neglected the `assistant` role a bit when it comes to manually inserted messages!
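
The hardcoded check above is just a stub. Here is a minimal sketch of the fuller multi-model pattern described at the start of this section, in which a second "supervisor" model reads `runner.messages` and decides whether to inject guidance. The supervisor prompt and model name are illustrative, and the sketch assumes `runner.messages` is readable from the callback, as the PR description suggests:

```ts
import OpenAI from 'openai';

const client = new OpenAI();

async function main() {
  const runner = client.chat.completions.runTools(
    {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: "How's the weather this week in Los Angeles?" }],
      tools: [
        // ...
      ],
    },
    {
      afterCompletion: async () => {
        // Hand the conversation so far to a second "supervisor" model.
        const review = await client.chat.completions.create({
          model: 'gpt-4',
          messages: [
            {
              role: 'system',
              content:
                'You are a supervisor. If the assistant in the following conversation seems stuck, ' +
                'reply with a short hint; otherwise reply with exactly "OK".',
            },
            { role: 'user', content: JSON.stringify(runner.messages) },
          ],
        });

        const hint = review.choices[0]?.message.content;
        if (hint && hint !== 'OK') {
          // Inject the supervisor's guidance so the first model sees it before its next turn.
          runner._addMessage({ role: 'system', content: hint });
        }
      },
    },
  );

  console.log(await runner.finalContent());
}

main();
```

Because the runner awaits `afterCompletion` between completions (see the diff below), any injected message is in place before the model's next turn.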

#### Integrate with `zod`

[`zod`](https://www.npmjs.com/package/zod) is a schema validation library which can help with validating the …

**src/lib/AbstractChatCompletionRunner.ts** (+19 −2)

```diff
@@ -29,6 +29,12 @@ const DEFAULT_MAX_CHAT_COMPLETIONS = 10;
 export interface RunnerOptions extends Core.RequestOptions {
   /** How many requests to make before canceling. Default 10. */
   maxChatCompletions?: number;
+  /** A callback to be run after each chat completion (and after any tools have been run for the completion).
+   * Can be used, for example, to make an LLM call to analyze the conversation thus far and provide guidance
+   * or supplemental information by injecting a message via runner._addMessage().
+   * Receives the chat completion that it was run after as an argument.
+   */
+  afterCompletion?: (completion: ChatCompletion) => Promise<void>;
 }

 export class AbstractChatCompletionRunner<
```
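
As a usage sketch (not part of the diff), the typed `completion` argument makes it easy to observe each step of a run. Here the callback accumulates token usage; the model, prompt, and `maxChatCompletions` value are placeholders:

```ts
import OpenAI from 'openai';

const client = new OpenAI();

async function main() {
  let totalTokens = 0;

  const runner = client.chat.completions.runTools(
    {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'What is 2 + 2?' }],
      tools: [],
    },
    {
      maxChatCompletions: 5,
      // Runs after each completion (and after its tool calls), receiving
      // the ChatCompletion it followed.
      afterCompletion: async (completion) => {
        totalTokens += completion.usage?.total_tokens ?? 0;
      },
    },
  );

  await runner.finalChatCompletion();
  console.log(`Total tokens used across the run: ${totalTokens}`);
}

main();
```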

```diff
@@ -274,7 +280,7 @@ export class AbstractChatCompletionRunner<
     const role = 'function' as const;
     const { function_call = 'auto', stream, ...restParams } = params;
     const singleFunctionToCall = typeof function_call !== 'string' && function_call?.name;
-    const { maxChatCompletions = DEFAULT_MAX_CHAT_COMPLETIONS } = options || {};
+    const { maxChatCompletions = DEFAULT_MAX_CHAT_COMPLETIONS, afterCompletion } = options || {};

     const functionsByName: Record<string, RunnableFunction<any>> = {};
     for (const f of params.functions) {
```

```diff
@@ -345,6 +351,10 @@ export class AbstractChatCompletionRunner<

       this._addMessage({ role, name, content });

+      if (afterCompletion) {
+        await afterCompletion(chatCompletion);
+      }
+
       if (singleFunctionToCall) return;
     }
   }
```

```diff
@@ -359,7 +369,7 @@
     const role = 'tool' as const;
     const { tool_choice = 'auto', stream, ...restParams } = params;
     const singleFunctionToCall = typeof tool_choice !== 'string' && tool_choice?.function?.name;
-    const { maxChatCompletions = DEFAULT_MAX_CHAT_COMPLETIONS } = options || {};
+    const { maxChatCompletions = DEFAULT_MAX_CHAT_COMPLETIONS, afterCompletion } = options || {};

     // TODO(someday): clean this logic up
     const inputTools = params.tools.map((tool): RunnableToolFunction<any> => {
```

```diff
@@ -470,9 +480,16 @@
         this._addMessage({ role, tool_call_id, content });

         if (singleFunctionToCall) {
+          if (afterCompletion) {
+            await afterCompletion(chatCompletion);
+          }
           return;
         }
       }
+
+      if (afterCompletion) {
+        await afterCompletion(chatCompletion);
+      }
     }

     return;
```

**src/resources/beta/chat/completions.ts** (+4 −3)

```diff
@@ -21,6 +21,7 @@ export {
   ParsingFunction,
   ParsingToolFunction,
 } from '../../../lib/RunnableFunction';
+import { RunnerOptions } from '../../../lib/AbstractChatCompletionRunner';
 import { ChatCompletionToolRunnerParams } from '../../../lib/ChatCompletionRunner';
 export { ChatCompletionToolRunnerParams } from '../../../lib/ChatCompletionRunner';
 import { ChatCompletionStreamingToolRunnerParams } from '../../../lib/ChatCompletionStreamingRunner';
```

```diff
@@ -119,19 +120,19 @@ export class Completions extends APIResource {
   runTools<
     Params extends ChatCompletionToolRunnerParams<any>,
     ParsedT = ExtractParsedContentFromParams<Params>,
-  >(body: Params, options?: Core.RequestOptions): ChatCompletionRunner<ParsedT>;
+  >(body: Params, options?: RunnerOptions): ChatCompletionRunner<ParsedT>;

   runTools<
     Params extends ChatCompletionStreamingToolRunnerParams<any>,
     ParsedT = ExtractParsedContentFromParams<Params>,
-  >(body: Params, options?: Core.RequestOptions): ChatCompletionStreamingRunner<ParsedT>;
+  >(body: Params, options?: RunnerOptions): ChatCompletionStreamingRunner<ParsedT>;

   runTools<
     Params extends ChatCompletionToolRunnerParams<any> | ChatCompletionStreamingToolRunnerParams<any>,
     ParsedT = ExtractParsedContentFromParams<Params>,
   >(
     body: Params,
-    options?: Core.RequestOptions,
+    options?: RunnerOptions,
   ): ChatCompletionRunner<ParsedT> | ChatCompletionStreamingRunner<ParsedT> {
     if (body.stream) {
       return ChatCompletionStreamingRunner.runTools(
```

Review discussion on the first `runTools` overload:

> **Contributor Author:** This type has been wrong such that it causes a TS error when you try to pass `maxChatCompletions`. Will Stainless auto-gen remove my change? If so, what's the right way to fix this? It's been driving me nuts 😅

> **Collaborator:** Ah, thanks for fixing! No, the generator will leave this change in :)
>
> Would you mind opening a separate PR with this change so we can get it merged faster?

> **Contributor Author:** Good call! Will update now...
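
As a quick end-to-end check of the widened types (not part of the diff), here is a sketch using the streaming overload with `RunnerOptions`; it assumes the streaming runner shares the abstract runner's tool loop, so `afterCompletion` fires there too:

```ts
import OpenAI from 'openai';

const client = new OpenAI();

async function main() {
  const runner = client.chat.completions.runTools(
    {
      stream: true,
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'Hello!' }],
      tools: [],
    },
    {
      // These runner-specific options previously failed to type-check against
      // Core.RequestOptions.
      maxChatCompletions: 3,
      afterCompletion: async (completion) => {
        console.log('finished completion', completion.id);
      },
    },
  );

  for await (const chunk of runner) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  }
}

main();
```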