Feature: replace API Gateway with AppSync to enable deployment in reg…

…ulated environments (#269) * feat: appsync * feat(appsync): API * feat(appsync): api * feat(appsync): poetry env * feat(appsync): ongoing checkpoint * feat(appsync): checkpoint * feat(appsync): final with removal of ApiGw * feat(appsync): working GraphQL * feat(appsync): CF configuration * feat(appsync): removal of CF in front of AppSync * feat(appsync): using AOI for CF * feat(appsync): fix Delete session removed unused API resources * feat(appsync) : delete session result * feat(appsync): moving handler function correctly replacing api_handler function for better code diff * feat(appsync): json decoder * feat(appsync): cleanup * feat(appsync): exclude autogenerated file from prettier linting * feat(appsync): remove unused imports * feat(appsync): remove unused imports * feat(appsync): fix appsync merge errors * feat(appsync): fixed semantic_search method based on graphql signature * fix(appsync): correct logic for adding form fields to POST * fix(appsync): merged api stability and cleanup * feat(appsync): schema improvements * fix(appsync): reverting changes * feat(appsync): single api * fix(appsync): remove double resolver creation * fix(appsync): start kendra data sync * fix(appsync): correct field type * doc(appsync): update READMEs and diagrams * fix(appsync): correct subnet filtering * feat(appsync): cleanup legacy apigw functionality
aws-samples · Jan 15, 2024 · 4e59d88 · 4e59d88
1 parent 47e178f
commit 4e59d88
Show file tree

Hide file tree

Showing 115 changed files with 9,297 additions and 2,690 deletions.
diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
@@ -0,0 +1,12 @@
+name: smoke-build
+on: push
+jobs:
+  build-cdk:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v3
+        with:
+          node-version: "18"
+      - run: npm install
+      - run: npm run build
diff --git a/.gitignore b/.gitignore
@@ -428,6 +428,7 @@ deploy-models
 cli/*.js
 bin/*.js
 lib/**/*.js
+!lib/chatbot-api/functions/resolvers/**/*.js
 !jest.config.js
 *.d.ts
 node_modules
@@ -445,3 +446,4 @@ bin/config.json
 # Docs
 docs/.vitepress/cache
 docs/.vitepress/dist
+lib/user-interface/react-app/.graphqlconfig.yml
diff --git a/.prettierignore b/.prettierignore
@@ -7,4 +7,8 @@ coverage
 build
 public
 out
-cdk.out
+cdk.out
+lib/user-interface/react-app/src/API.ts
+lib/user-interface/react-app/src/graphql/mutations.ts
+lib/user-interface/react-app/src/graphql/queries.ts
+lib/user-interface/react-app/src/graphql/subscriptions.ts
diff --git a/README.md b/README.md
diff --git a/assets/architecture.png b/assets/architecture.png
diff --git a/docs/appsync/appsync.md b/docs/appsync/appsync.md
@@ -0,0 +1,22 @@
+# Using AppSync
+
+Define or change the schema in `./lib/chatbot-api/schema`.
+
+At the moment we only use the `schema-ws.graphql` to define the real-time API. The REST API might be replaced by AppSync in the future.
+
+If you modified the definition for the schema, you can regenerate the client code using
+
+```bash
+cd lib/user-interface/react-app
+npx @npx @aws-amplify/cli codegen add --apiId <api_id> --region <region>
+```
+
+Accept all the defaults.
+
+If you use a None data source, you need to modify `src/API.ts` adding:
+
+```ts
+export type NoneQueryVariables = {
+  none?: string | null;
+};
+```
diff --git a/docs/document-retrieval/retriever.md b/docs/document-retrieval/retriever.md
@@ -0,0 +1,32 @@
+# Document stores
+
+This solution uses "pluggable" document stores to implement RAG, also called **engines**. A document store implementation must provide the following functions:
+
+- a **query** function to retrieve documents from the store. The function is invoked with the following parameters:
+  - `workspace_id`: str - the workspace id
+  - `workspace`: dict: a dictionary containing additional metadata related to your datastore
+  - `query`: str - the query to search the documents
+  - `full_response`: boolean - a flag indicating if the response should also include the retrieval scores
+- a **create** function that gets invoked when a new workspace using this document store is created. Perform any operations needed to create resources that needs to be exclusively associated with the workspace
+- a **delete** function that gets invoked when a workspace is removed. Cleanup any resources you have might have created that are to the exclusive use of the workspace
+
+To keep the current convention, create a new folder inside `layers/python-sdk/python/genai_core/` called after your engine in which you create the different functions. Export all the different functions via an `__init__.py` file in the same folder.
+
+You need to modify the `semantic_search.py` file to add the invocation for your query function based on the type of engine.
+
+You also need to add a specific `create_workspace_<engine>` in the `workspaces.py` file. This function is then invoked by the REST API workspace route handler `lib/chatbot-api/functions/api-handler/routes/workspaces.py`.
+
+[ ] Added API handler function
+
+[ ] Added create_workspace
+
+The call to the **delete** function for your workspace must be added to the `rag-engines/workspaces/functions/delete-workspace-workflow/delete/index.py` function.
+
+[ ] Added delete
+
+The support for your workspace type must also be added to the front-end.
+
+- `react-app/src/components/pages/rag/workspace` to implement the document store specific settings
+- `react-app/src/components/pages/rag/create-workspace` to implement components necessary to create a new workspace based on this document store
+
+[ ] Added UI
diff --git a/docs/sagemker-hosting/inference-script.md b/docs/sagemker-hosting/inference-script.md
@@ -0,0 +1,32 @@
+# Inference script
+
+We are using a multi-model enpoint hosted on Sagemaker and provide a inference script to process requests and send responses back.
+
+The inference script is currently hardcoded with the supported models (lib/rag-engines/sagemaker-rag-models/model/inference.py)
+
+```py
+embeddings_models = [
+    "intfloat/multilingual-e5-large",
+    "sentence-transformers/all-MiniLM-L6-v2",
+]
+cross_encoder_models = ["cross-encoder/ms-marco-MiniLM-L-12-v2"]
+```
+
+The API is JSON body based:
+
+```json
+{
+  "type": "embeddings",
+  "model": "intfloat/multilingual-e5-large",
+  "input": "I love Berlin"
+}
+```
+
+```json
+{
+  "type": "cross-encoder",
+  "model": "cross-encoder/ms-marco-MiniLM-L-12-v2",
+  "input": "I love Berlin",
+  "passages": ["I love Paris", "I love London"]
+}
+```
diff --git a/lib/authentication/index.ts b/lib/authentication/index.ts
@@ -52,6 +52,10 @@ export class Authentication extends Construct {
       value: userPool.userPoolId,
     });
 
+    new cdk.CfnOutput(this, "IdentityPoolId", {
+      value: identityPool.identityPoolId,
+    });
+
     new cdk.CfnOutput(this, "UserPoolWebClientId", {
       value: userPoolClient.userPoolClientId,
     });

diff --git a/lib/aws-genai-llm-chatbot-stack.ts b/lib/aws-genai-llm-chatbot-stack.ts
@@ -148,8 +148,7 @@ export class AwsGenAILLMChatbotStack extends cdk.Stack {
       userPoolId: authentication.userPool.userPoolId,
       userPoolClientId: authentication.userPoolClient.userPoolClientId,
       identityPool: authentication.identityPool,
-      restApi: chatBotApi.restApi,
-      webSocketApi: chatBotApi.webSocketApi,
+      api: chatBotApi,
       chatbotFilesBucket: chatBotApi.filesBucket,
       crossEncodersEnabled:
         typeof ragEngines?.sageMakerRagModels?.model !== "undefined",

diff --git a/lib/chatbot-api/appsync-ws.ts b/lib/chatbot-api/appsync-ws.ts
@@ -0,0 +1,104 @@
+import * as cdk from "aws-cdk-lib";
+import * as appsync from "aws-cdk-lib/aws-appsync";
+import { Code, Function, LayerVersion, Runtime } from "aws-cdk-lib/aws-lambda";
+import { SqsEventSource } from "aws-cdk-lib/aws-lambda-event-sources";
+import { Construct } from "constructs";
+import { Shared } from "../shared";
+import { IQueue } from "aws-cdk-lib/aws-sqs";
+import { ITopic } from "aws-cdk-lib/aws-sns";
+import { UserPool } from "aws-cdk-lib/aws-cognito";
+import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
+import * as path from "path";
+
+interface RealtimeResolversProps {
+  readonly queue: IQueue;
+  readonly topic: ITopic;
+  readonly userPool: UserPool;
+  readonly shared: Shared;
+  readonly api: appsync.GraphqlApi;
+}
+
+export class RealtimeResolvers extends Construct {
+  public readonly outgoingMessageHandler: Function;
+
+  constructor(scope: Construct, id: string, props: RealtimeResolversProps) {
+    super(scope, id);
+
+    const powertoolsLayerJS = LayerVersion.fromLayerVersionArn(
+      this,
+      "PowertoolsLayerJS",
+      `arn:aws:lambda:${
+        cdk.Stack.of(this).region
+      }:094274105915:layer:AWSLambdaPowertoolsTypeScript:22`
+    );
+
+    const resolverFunction = new Function(this, "lambda-resolver", {
+      code: Code.fromAsset(
+        "./lib/chatbot-api/functions/resolvers/send-query-lambda-resolver"
+      ),
+      handler: "index.handler",
+      runtime: Runtime.PYTHON_3_11,
+      environment: {
+        SNS_TOPIC_ARN: props.topic.topicArn,
+      },
+      layers: [props.shared.powerToolsLayer],
+    });
+
+    const outgoingMessageHandler = new NodejsFunction(
+      this,
+      "outgoing-message-handler",
+      {
+        entry: path.join(
+          __dirname,
+          "functions/outgoing-message-appsync/index.ts"
+        ),
+        layers: [powertoolsLayerJS],
+        handler: "index.handler",
+        runtime: Runtime.NODEJS_18_X,
+        environment: {
+          GRAPHQL_ENDPOINT: props.api.graphqlUrl,
+        },
+      }
+    );
+
+    outgoingMessageHandler.addEventSource(new SqsEventSource(props.queue));
+
+    props.topic.grantPublish(resolverFunction);
+
+    const functionDataSource = props.api.addLambdaDataSource(
+      "realtimeResolverFunction",
+      resolverFunction
+    );
+    const noneDataSource = props.api.addNoneDataSource("none", {
+      name: "relay-source",
+    });
+
+    props.api.createResolver("send-message-resolver", {
+      typeName: "Mutation",
+      fieldName: "sendQuery",
+      dataSource: functionDataSource,
+    });
+
+    props.api.createResolver("publish-response-resolver", {
+      typeName: "Mutation",
+      fieldName: "publishResponse",
+      code: appsync.Code.fromAsset(
+        "./lib/chatbot-api/functions/resolvers/publish-response-resolver.js"
+      ),
+      runtime: appsync.FunctionRuntime.JS_1_0_0,
+      dataSource: noneDataSource,
+    });
+
+    props.api.createResolver("subscription-resolver", {
+      typeName: "Subscription",
+      fieldName: "receiveMessages",
+      code: appsync.Code.fromAsset(
+        "./lib/chatbot-api/functions/resolvers/subscribe-resolver.js"
+      ),
+      runtime: appsync.FunctionRuntime.JS_1_0_0,
+      dataSource: noneDataSource,
+    });
+
+    this.outgoingMessageHandler = outgoingMessageHandler;
+  }
+}
diff --git a/lib/chatbot-api/functions/api-handler/index.py b/lib/chatbot-api/functions/api-handler/index.py
@@ -1,17 +1,8 @@
-import json
-import genai_core.types
-import genai_core.parameters
-import genai_core.utils.json
-from pydantic import ValidationError
-from botocore.exceptions import ClientError
 from aws_lambda_powertools import Logger, Tracer
 from aws_lambda_powertools.logging import correlation_paths
 from aws_lambda_powertools.utilities.typing import LambdaContext
-from aws_lambda_powertools.event_handler.api_gateway import Response
 from aws_lambda_powertools.event_handler import (
-    APIGatewayRestResolver,
-    CORSConfig,
-    content_types,
+    AppSyncResolver,
 )
 from routes.health import router as health_router
 from routes.embeddings import router as embeddings_router
@@ -27,13 +18,7 @@
 tracer = Tracer()
 logger = Logger()
 
-
-cors_config = CORSConfig(allow_origin="*", max_age=0)
-app = APIGatewayRestResolver(
-    cors=cors_config,
-    strip_prefixes=["/v1"],
-    serializer=lambda obj: json.dumps(obj, cls=genai_core.utils.json.CustomEncoder),
-)
+app = AppSyncResolver()
 
 app.include_router(health_router)
 app.include_router(rag_router)
@@ -47,55 +32,9 @@
 app.include_router(kendra_router)
 
 
-
-@app.exception_handler(genai_core.types.CommonError)
-def handle_value_error(e: genai_core.types.CommonError):
-    logger.exception(e)
-
-    return Response(
-        status_code=200,
-        content_type=content_types.APPLICATION_JSON,
-        body=json.dumps(
-            {"error": True, "message": str(e)}, cls=genai_core.utils.json.CustomEncoder
-        ),
-    )
-
-
-@app.exception_handler(ClientError)
-def handle_value_error(e: ClientError):
-    logger.exception(e)
-
-    return Response(
-        status_code=200,
-        content_type=content_types.APPLICATION_JSON,
-        body=json.dumps(
-            {"error": True, "message": str(e)},
-            cls=genai_core.utils.json.CustomEncoder,
-        ),
-    )
-
-
-@app.exception_handler(ValidationError)
-def handle_value_error(e: ValidationError):
-    logger.exception(e)
-
-    return Response(
-        status_code=200,
-        content_type=content_types.APPLICATION_JSON,
-        body=json.dumps(
-            {"error": True, "message": [str(error) for error in e.errors()]},
-            cls=genai_core.utils.json.CustomEncoder,
-        ),
-    )
-
-
 @logger.inject_lambda_context(
-    log_event=True, correlation_id_path=correlation_paths.API_GATEWAY_REST
+    log_event=True, correlation_id_path=correlation_paths.APPSYNC_RESOLVER
 )
 @tracer.capture_lambda_handler
 def handler(event: dict, context: LambdaContext) -> dict:
-    origin_verify_header_value = genai_core.parameters.get_origin_verify_header_value()
-    if event["headers"]["X-Origin-Verify"] == origin_verify_header_value:
-        return app.resolve(event, context)
-
-    return {"statusCode": 403, "body": "Forbidden"}
+    return app.resolve(event, context)
diff --git a/lib/chatbot-api/functions/api-handler/routes/cross_encoders.py b/lib/chatbot-api/functions/api-handler/routes/cross_encoders.py
@@ -4,7 +4,7 @@
 from typing import List
 from pydantic import BaseModel
 from aws_lambda_powertools import Logger, Tracer
-from aws_lambda_powertools.event_handler.api_gateway import Router
+from aws_lambda_powertools.event_handler.appsync import Router
 
 tracer = Tracer()
 router = Router()
@@ -14,23 +14,22 @@
 class CrossEncodersRequest(BaseModel):
     provider: str
     model: str
-    input: str
+    reference: str
     passages: List[str]
 
 
-@router.get("/cross-encoders/models")
+@router.resolver(field_name="listCrossEncoders")
 @tracer.capture_method
 def models():
     models = genai_core.cross_encoder.get_cross_encoder_models()
 
-    return {"ok": True, "data": models}
+    return models
 
 
-@router.post("/cross-encoders")
+@router.resolver(field_name="rankPassages")
 @tracer.capture_method
-def cross_encoders():
-    data: dict = router.current_event.json_body
-    request = CrossEncodersRequest(**data)
+def cross_encoders(input: dict):
+    request = CrossEncodersRequest(**input)
     selected_model = genai_core.cross_encoder.get_cross_encoder_model(
         request.provider, request.model
     )
@@ -39,6 +38,6 @@ def cross_encoders():
         raise genai_core.types.CommonError("Model not found")
 
     ret_value = genai_core.cross_encoder.rank_passages(
-        selected_model, request.input, request.passages
+        selected_model, request.reference, request.passages
     )
-    return {"ok": True, "data": ret_value}
+    return [{"score": v, "passage": p} for v, p in zip(ret_value, request.passages)]