feat: Disable Sagemaker endpoint (or cross-encoder per workspace) #588

Merged
charles-marion merged 28 commits into aws-samples:main from disable_sagemaker
Oct 21, 2024

charles-marion
Collaborator

Issue #, if available:
#222

Description of changes:
This change is a follow up of #286 by @azaylamba

Currently, cross-encoder models are used to rank search results, but the available models must be hosted on SageMaker, which increases cost significantly. An option to disable cross-encoder models is helpful while exploring the chatbot, because it avoids the SageMaker costs.

Changes:

  • Disabling SageMaker in the CLI changes the list of available models in the config. Cross encoders are then available or not depending on that list. (Since the only cross-encoder model is hosted on SageMaker, disabling SageMaker also disables cross encoders.)
  • Make the cross-encoder parameter optional during workspace creation.
  • Add an integ test.
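
The behavior described in the list above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the catalog `ALL_CROSS_ENCODERS`, the model name `example-cross-encoder`, and the functions `available_cross_encoders` and `create_workspace` are hypothetical names chosen for the example.

```python
from typing import Dict, List, Optional

# Hypothetical catalog. The PR states that the only cross-encoder model
# is hosted on SageMaker, so the catalog has a single SageMaker entry.
ALL_CROSS_ENCODERS: List[Dict[str, str]] = [
    {"provider": "sagemaker", "name": "example-cross-encoder"},
]


def available_cross_encoders(sagemaker_enabled: bool) -> List[Dict[str, str]]:
    """Return the cross encoders usable under the current config."""
    return [
        model
        for model in ALL_CROSS_ENCODERS
        if model["provider"] != "sagemaker" or sagemaker_enabled
    ]


def create_workspace(
    name: str,
    sagemaker_enabled: bool,
    cross_encoder_model: Optional[str] = None,  # optional, as in this PR
) -> Dict[str, Optional[str]]:
    """Build a workspace record; the cross encoder may be omitted."""
    allowed = {m["name"] for m in available_cross_encoders(sagemaker_enabled)}
    if cross_encoder_model is not None and cross_encoder_model not in allowed:
        raise ValueError(f"Cross encoder not available: {cross_encoder_model}")
    return {"name": name, "cross_encoder_model": cross_encoder_model}
```

With SageMaker disabled, `available_cross_encoders(False)` is empty in this sketch, and a workspace can still be created by leaving `cross_encoder_model` unset.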

Thank you @azaylamba for your help with this change.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

azaylamba and others added 28 commits December 23, 2023 20:03
…t the models available need to be hosted on Sagemaker which increases cost significantly. Having an option to disable cross encoder models would be helpful while exploring the chatbot so that Sagemaker costs can be avoided.

Added a config to enable/disable cross encoder models.
Also added options to select embedding models, so that sagemakerModels are not created automatically.
Persisted enableSagemakerModels config so that it can be used directly instead of relying on sagemakerModels length.
Added a basic feedback mechanism for responses generated by the chatbot.
Feedback is stored in DynamoDB, where admin users can query it for analysis as required.
In the future we can add a UI page to display the feedback; for now it is only stored, and analysis has to be done manually.
The feedback does not contribute to the learning of the chatbot.
Hybrid Search won't be available if cross encoding is not enabled.
The default embeddings model was not being set correctly. Also, an error related to suppression rules was thrown if SageMaker models were not enabled. Used the props.config.llms.enableSagemakerModels config to add the NAG suppression rules.
Also removed duplicated config.
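
Two of the commits above (reading the persisted enableSagemakerModels flag rather than the sagemakerModels length, and gating Hybrid Search on cross encoding) can be illustrated with a short sketch. It is a hedged example, not the repository's code: the dict-shaped config, the fallback to list length for configs that lack the flag, and the function names `sagemaker_enabled` and `hybrid_search_available` are all assumptions made for illustration.

```python
from typing import Any, Dict


def sagemaker_enabled(config: Dict[str, Any]) -> bool:
    """Prefer the explicit persisted flag over inferring from list length."""
    llms = config.get("llms", {})
    if "enableSagemakerModels" in llms:
        return bool(llms["enableSagemakerModels"])
    # Assumption for this sketch: configs without the flag fall back to
    # the earlier heuristic of checking whether any models are listed.
    return len(llms.get("sagemakerModels", [])) > 0


def hybrid_search_available(cross_encoding_enabled: bool) -> bool:
    """Hybrid Search is offered only when cross encoding is enabled."""
    return cross_encoding_enabled
```

An explicit flag keeps the decision independent of the model list, so a leftover entry in `sagemakerModels` does not re-enable SageMaker in this sketch.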
charles-marion merged commit a1d2aa9 into aws-samples:main Oct 21, 2024
1 check passed
charles-marion deleted the disable_sagemaker branch October 21, 2024 14:07