adding inference trace injection #36890

Status: Open. Wants to merge 43 commits into base: main (changes shown are from 19 commits).

Commits (43):
- 06cef91: adding inference trace injection (Aug 14, 2024)
- 9dc2cf9: changing the interface based on feedback (Aug 16, 2024)
- 58a032b: updates (Aug 16, 2024)
- ec1cd16: changing name of environment variable (Aug 20, 2024)
- 3270076: changes based on review comments and some other changes (Sep 6, 2024)
- 7cbbc0b: file name change (Sep 6, 2024)
- 941a9ae: fixing exception handling (Sep 10, 2024)
- bcc6e74: relocating inference trace instrumentation (Sep 10, 2024)
- 709923c: reverting change in azure core tracing (Sep 10, 2024)
- baac83f: Merge branch 'main' into mhietala/inference_genai_tracing (Sep 16, 2024)
- a64d870: fixes (Sep 16, 2024)
- 198b9cd: changing span and model name for cases when model info not available (Sep 17, 2024)
- cd8bba2: some fixes (Sep 17, 2024)
- b28a3fe: adding sync trace tests (Sep 20, 2024)
- b549b38: fix and async trace test (Sep 23, 2024)
- 469d32c: updating readme and setup (Sep 23, 2024)
- f1424a1: adding tracing sample (Sep 23, 2024)
- 92da09a: changes based on review comments (Sep 25, 2024)
- d9652f5: changed to readme based on review comments (Sep 26, 2024)
- 6da2a7d: removed distributed_trace and some other updates (Sep 26, 2024)
- 521f7f0: fixing pre python v3.10 issue (Sep 26, 2024)
- 814f87f: Merge branch 'Azure:main' into mhietala/inference_genai_tracing (M-Hietala, Sep 26, 2024)
- 8c80099: test fixes (Sep 26, 2024)
- 514dea4: Fix some of the non-trace tests (dargilco, Sep 26, 2024)
- 83f85d6: fixing issues reported by tools (Sep 27, 2024)
- 79ea9b3: Merge branch 'mhietala/inference_genai_tracing' of https://github.com… (Sep 27, 2024)
- e8dd67d: adding uninstrumentation to the beginning of tracing tests (Sep 27, 2024)
- 0c286c3: updating readme and sample (Sep 27, 2024)
- 1aaf87c: adding ignore related to tool issue (Sep 27, 2024)
- a1b1f13: Merge branch 'Azure:main' into mhietala/inference_genai_tracing (M-Hietala, Sep 30, 2024)
- 510a6ca: updating code snippet in readme (Sep 30, 2024)
- 04da0e6: Merge branch 'mhietala/inference_genai_tracing' of https://github.com… (Sep 30, 2024)
- fa8e8b0: Add missing `@recorded_by_proxy` decorators to new tracing tests (dargilco, Oct 1, 2024)
- e410c31: Push new recordings (dargilco, Oct 1, 2024)
- 18b3d92: fixing issues reported by tools (Oct 2, 2024)
- 200ab61: Merge branch 'mhietala/inference_genai_tracing' of https://github.com… (Oct 2, 2024)
- 4a56354: adding inference to shared requirements (Oct 2, 2024)
- 3113e35: Merge branch 'Azure:main' into mhietala/inference_genai_tracing (M-Hietala, Oct 2, 2024)
- 58a754f: remove inference from setup (Oct 2, 2024)
- 4ed67dc: adding comma to setup (Oct 3, 2024)
- 5a0aa71: updating version requirement for core (Oct 3, 2024)
- 1214978: changes based on review comments (Oct 7, 2024)
- 1350293: Merge branch 'Azure:main' into mhietala/inference_genai_tracing (M-Hietala, Oct 10, 2024)
90 changes: 90 additions & 0 deletions sdk/ai/azure-ai-inference/README.md
@@ -57,6 +57,14 @@ To update an existing installation of the package, use:
pip install --upgrade azure-ai-inference
```

To install the Azure AI Inference package with support for OpenTelemetry-based tracing, use the following command:

```bash
pip install azure-ai-inference[trace]
```



## Key concepts

### Create and authenticate a client directly, using API key or GitHub token
@@ -534,6 +542,88 @@ To report issues with the client library, or request additional features, please

* Have a look at the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder, containing fully runnable Python code for doing inference using synchronous and asynchronous clients.

## Tracing

The Azure AI Inferencing API Tracing library provides tracing for the Azure AI Inference client library for Python. Refer to the Installation chapter above for installation instructions.

### Setup

The environment variable `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` controls whether the actual message contents are recorded in the traces. By default, message contents are not recorded as part of the trace. When message content recording is disabled, the function names, function parameter names, and function parameter values of any function call tools are also not recorded. Set the environment variable to "true" (case insensitive) to record message contents as part of the trace; any other value leaves message contents unrecorded.
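Content recording can also be toggled from code before the client is created. A minimal sketch using only the standard library (the variable can equally be exported in the shell):

```python
import os

# Enable recording of message contents in traces for this process.
# Any value other than "true" (case insensitive) leaves recording disabled.
os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "true"

content_recording = os.environ.get("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "false")
print(content_recording.lower() == "true")  # True when recording is enabled
```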

You also need to configure the tracing implementation, either by setting the environment variable `AZURE_SDK_TRACING_IMPLEMENTATION` to `opentelemetry` or by configuring it in code with the following snippet:

<!-- SNIPPET:sample_chat_completions_with_tracing.trace_setting -->

```python
from azure.core.settings import settings
settings.tracing_implementation = "opentelemetry"
```

<!-- END SNIPPET -->


Please refer to [azure-core-tracing-documentation](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme) for more information.

### Exporting Traces with OpenTelemetry

Azure AI Inference is instrumented with OpenTelemetry. To enable tracing, configure OpenTelemetry to export traces to your observability backend.
Refer to [Azure SDK tracing in Python](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme?view=azure-python-preview) for more details.

Refer to the [Azure Monitor OpenTelemetry documentation](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) for details on how to send Azure AI Inference traces to Azure Monitor and create an Azure Monitor resource.

### Instrumentation

Use the `AIInferenceInstrumentor` to instrument the Azure AI Inferencing API for LLM tracing. This causes LLM traces to be emitted from the Azure AI Inferencing API.

<!-- SNIPPET:sample_chat_completions_with_tracing.instrument_inferencing -->

```python
from azure.core.tracing.ai.inference import AIInferenceInstrumentor
# Instrument AI Inference API
AIInferenceInstrumentor().instrument()
```

<!-- END SNIPPET -->


It is also possible to uninstrument the Azure AI Inferencing API by using the uninstrument call. After this call, the LLM traces will no longer be emitted by the Azure AI Inferencing API until instrument is called again.

<!-- SNIPPET:sample_chat_completions_with_tracing.uninstrument_inferencing -->

```python
AIInferenceInstrumentor().uninstrument()
```

<!-- END SNIPPET -->
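Conceptually, `instrument()` and `uninstrument()` act as a process-wide toggle. The following is an illustrative sketch of that pattern (a hypothetical stand-in class, not the SDK's actual implementation):

```python
class SketchInstrumentor:
    """Hypothetical stand-in illustrating the instrument/uninstrument toggle."""

    _instrumented = False  # process-wide state, shared by all instances

    def instrument(self):
        # A real instrumentor patches client methods here so they emit spans.
        type(self)._instrumented = True

    def uninstrument(self):
        # A real instrumentor restores the original, un-traced methods here.
        type(self)._instrumented = False

    def is_instrumented(self) -> bool:
        return type(self)._instrumented


SketchInstrumentor().instrument()    # traces would be emitted from here on
assert SketchInstrumentor().is_instrumented()
SketchInstrumentor().uninstrument()  # traces stop until instrument() is called again
```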

### Tracing Your Own Functions
The `@tracer.start_as_current_span` decorator can be used to trace your own functions. This will trace the function parameters and their values. You can also add further attributes to the span in the function implementation, as demonstrated below. Note that you will have to set up the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/).

<!-- SNIPPET:sample_chat_completions_with_tracing.trace_function -->

```python
from opentelemetry import trace
from opentelemetry.trace import get_tracer

tracer = get_tracer(__name__)

# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes
# to the span in the function implementation. Note that this will trace the function parameters and their values.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:

    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"
```

<!-- END SNIPPET -->

## Contributing

This project welcomes contributions and suggestions. Most contributions require
Expand Down
4 changes: 3 additions & 1 deletion sdk/ai/azure-ai-inference/dev_requirements.txt
@@ -1,3 +1,5 @@
 -e ../../../tools/azure-sdk-tools
 ../../core/azure-core
-aiohttp
+../../core/azure-core-tracing-opentelemetry
+aiohttp
+opentelemetry-sdk
1 change: 1 addition & 0 deletions sdk/ai/azure-ai-inference/samples/README.md
@@ -105,6 +105,7 @@ similarly for the other samples.
|[sample_get_model_info.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_get_model_info.py) | Get AI model information using the chat completions client. Similarly can be done with all other clients. |
|[sample_chat_completions_with_model_extras.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_model_extras.py) | Chat completions with additional model-specific parameters. |
|[sample_chat_completions_azure_openai.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py) | Chat completions against Azure OpenAI endpoint. |
|[sample_chat_completions_with_tracing.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py) | Chat completions with tracing enabled. Includes streaming and non-streaming chat operations. The non-streaming chat uses a function call tool and also demonstrates how to add traces to client code so that they get included as part of the emitted traces. |

### Text embeddings

191 changes: 191 additions & 0 deletions sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py
@@ -0,0 +1,191 @@
# ------------------------------------
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
# ------------------------------------
"""
DESCRIPTION:
    This sample demonstrates how to get a chat completions response from
    the service using a synchronous client, with OpenTelemetry tracing
    enabled. It shows both streaming and non-streaming chat completions,
    including a function call tool in the non-streaming case, and how to
    trace your own functions.

    This sample assumes the AI model is hosted on a Serverless API or
    Managed Compute endpoint. For GitHub Models or Azure OpenAI endpoints,
    the client constructor needs to be modified. See package documentation:
    https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/README.md#key-concepts

USAGE:
    python sample_chat_completions_with_tracing.py

    Set these two environment variables before running the sample:
    1) AZURE_AI_CHAT_ENDPOINT - Your endpoint URL, in the form
        https://<your-deployment-name>.<your-azure-region>.models.ai.azure.com
        where `your-deployment-name` is your unique AI Model deployment name, and
        `your-azure-region` is the Azure region where your model is deployed.
    2) AZURE_AI_CHAT_KEY - Your model key (a 32-character string). Keep it secret.
"""


import os
from opentelemetry import trace
# opentelemetry-sdk is required for the opentelemetry.sdk imports.
# You can install it with the command "pip install opentelemetry-sdk".
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason
from azure.core.credentials import AzureKeyCredential

# [START trace_setting]
from azure.core.settings import settings
settings.tracing_implementation = "opentelemetry"
# [END trace_setting]

# Setup tracing to console
exporter = ConsoleSpanExporter()
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter))


def chat_completion_streaming(key, endpoint):
    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    response = client.complete(
        stream=True,
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="Tell me about software engineering in five sentences."),
        ],
    )
    for update in response:
        if update.choices:
            print(update.choices[0].delta.content or "", end="")
    client.close()

# [START trace_function]
from opentelemetry import trace
from opentelemetry.trace import get_tracer

tracer = get_tracer(__name__)

# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes
# to the span in the function implementation. Note that this will trace the function parameters and their values.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:

    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"
# [END trace_function]


def get_weather(city: str) -> str:
    if city == "Seattle":
        return "Nice weather"
    elif city == "New York City":
        return "Good weather"
    else:
        return "Unavailable"


def chat_completion_with_function_call(key, endpoint):
    import json
    from azure.ai.inference.models import (
        ToolMessage,
        AssistantMessage,
        ChatCompletionsToolCall,
        ChatCompletionsToolDefinition,
        FunctionDefinition,
    )

    weather_description = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_weather",
            description="Returns description of the weather in the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which weather info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    temperature_in_city = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_temperature",
            description="Returns the current temperature for the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which temperature info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the weather and temperature in Seattle?"),
    ]

    response = client.complete(messages=messages, tools=[weather_description, temperature_in_city])

    if response.choices[0].finish_reason == CompletionsFinishReason.TOOL_CALLS:
        # Append the previous model response to the chat history
        messages.append(AssistantMessage(tool_calls=response.choices[0].message.tool_calls))
        # The tool should be of type function call.
        if response.choices[0].message.tool_calls is not None and len(response.choices[0].message.tool_calls) > 0:
            for tool_call in response.choices[0].message.tool_calls:
                if type(tool_call) is ChatCompletionsToolCall:
                    function_args = json.loads(tool_call.function.arguments.replace("'", '"'))
                    print(f"Calling function `{tool_call.function.name}` with arguments {function_args}")
                    callable_func = globals()[tool_call.function.name]
                    function_response = callable_func(**function_args)
                    print(f"Function response = {function_response}")
                    # Provide the tool response to the model, by appending it to the chat history
                    messages.append(ToolMessage(tool_call_id=tool_call.id, content=function_response))
            # With the additional tools information on hand, get another response from the model
            response = client.complete(messages=messages, tools=[weather_description, temperature_in_city])

    print(f"Model response = {response.choices[0].message.content}")


def main():
    # [START instrument_inferencing]
    from azure.core.tracing.ai.inference import AIInferenceInstrumentor
    # Instrument AI Inference API
    AIInferenceInstrumentor().instrument()
    # [END instrument_inferencing]

    try:
        endpoint = os.environ["AZURE_AI_CHAT_ENDPOINT"]
        key = os.environ["AZURE_AI_CHAT_KEY"]
    except KeyError:
        print("Missing environment variable 'AZURE_AI_CHAT_ENDPOINT' or 'AZURE_AI_CHAT_KEY'")
        print("Set them before running this sample.")
        exit()

    print("===== starting chat_completion_streaming() =====")
    chat_completion_streaming(key, endpoint)
    print("===== chat_completion_streaming() done =====")

    print("===== starting chat_completion_with_function_call() =====")
    chat_completion_with_function_call(key, endpoint)
    print("===== chat_completion_with_function_call() done =====")

    # [START uninstrument_inferencing]
    AIInferenceInstrumentor().uninstrument()
    # [END uninstrument_inferencing]


if __name__ == "__main__":
main()
3 changes: 3 additions & 0 deletions sdk/ai/azure-ai-inference/setup.py
@@ -68,4 +68,7 @@
        "typing-extensions>=4.6.0",
    ],
    python_requires=">=3.8",
    extras_require={
        'trace': ['azure-core-tracing-opentelemetry']
    }
)