Serverless Machine Learning Inference with OCI Functions
Serverless computing is a cloud computing execution model in which the cloud provider manages the infrastructure and dynamically allocates resources based on incoming requests. Oracle Cloud Infrastructure (OCI) Functions is a serverless platform that lets you run code without managing infrastructure. It is typically used to execute short-lived functions in response to triggers such as HTTP requests.
To set up serverless machine learning inference on OCI, you will use several resources:
- OCI Functions Application: A logical grouping of functions within OCI. This is akin to a project that holds related functions together.
- OCI Functions Function: Each function within the application is a piece of code that performs a specific task. This is where you'd deploy your machine learning model inference code.
- OCI API Gateway: The standard way to expose the function over HTTP so that it can be invoked from the internet.
Here's a Python program that automates the setup of these resources using Pulumi. This program does not deploy a specific machine learning model but sets up the infrastructure required to deploy such a model. You would need to provide the Docker image containing your serverless function code and any additional configurations specific to your use case.
```python
import json

import pulumi
import pulumi_oci as oci

# Configuration variables for the function application and function deployment
compartment_id = 'ocid1.compartment.oc1..your_compartment_id'  # Replace with your compartment OCID
image_uri = 'your_image_uri'  # URI of the container image in OCI Registry or another registry
function_memory_in_mbs = 128  # Amount of memory in megabytes allocated to your function

# Create an Application on the OCI Functions service
app = oci.functions.Application("app",
    compartment_id=compartment_id,
    display_name="my-functions-app",
    subnet_ids=["subnet_ocid1", "subnet_ocid2"],  # List of subnet OCIDs for the application
)

# Deploy a Function within the created Application
func = oci.functions.Function("func",
    application_id=app.id,
    display_name="my-model-inference-function",
    image=image_uri,
    memory_in_mbs=function_memory_in_mbs,
    timeout_in_seconds=30,  # Maximum function execution time
    config={
        # Environment variables can be provided here
        "MODEL_URL": "oci://bucket_name@namespace/path/to/model",
    },
)

# Expose the function via API Gateway
# First, create the API Gateway itself
api_gw = oci.apigateway.Gateway("api_gw",
    compartment_id=compartment_id,
    display_name="my-api-gateway",
    endpoint_type="PUBLIC",  # Required: PUBLIC or PRIVATE
    subnet_id="subnet_ocid3",  # Replace with your subnet OCID
)

# Define an API deployment that will route to our function
api_deployment = oci.apigateway.Deployment("api_deployment",
    compartment_id=compartment_id,
    display_name="my-api-deployment",
    gateway_id=api_gw.id,
    path_prefix="/infer",
    # API Gateway deployment specification (routes and backends),
    # loaded from a JSON file as a dict
    specification=json.load(open("api_specification.json")),
)

# Export the function OCID and the API Gateway endpoint
pulumi.export('function_ocid', func.id)  # OCID used to invoke the function via the OCI SDK or CLI
pulumi.export('api_endpoint', api_deployment.endpoint)  # URL to invoke the function via the API Gateway
```
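Once `pulumi up` completes, the exported `api_endpoint` can be exercised like any HTTP API. A minimal sketch, assuming the function accepts a JSON payload on a POST to the `/infer` route (the endpoint value and payload shape are illustrative, not part of the program above):

```python
import requests

# Hypothetical endpoint value; in practice, read it from
# `pulumi stack output api_endpoint`.
api_endpoint = "https://<gateway-hostname>/infer"

# Illustrative payload; the real schema depends on your inference handler.
payload = {"inputs": [[5.1, 3.5, 1.4, 0.2]]}

resp = requests.post(api_endpoint, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```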
In the above program:
- We create an application to hold our functions using `oci.functions.Application`.
- We define a function with a Docker image containing our ML model and deployment-specific configuration.
- We create an API Gateway and define a deployment with a path prefix `/infer` that forwards requests to our function.
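Besides the HTTP route, the exported `function_ocid` lets you invoke the function directly with the OCI Python SDK, bypassing the gateway. A minimal sketch, where the OCID and payload are placeholders and `oci` is the OCI SDK package, not the `pulumi_oci` alias used in the program:

```python
import json

import oci

config = oci.config.from_file()  # Reads ~/.oci/config by default

function_ocid = "ocid1.fnfunc.oc1..example"  # Placeholder: use the exported function_ocid

# Resolve the function's invoke endpoint, then call the function directly.
mgmt = oci.functions.FunctionsManagementClient(config)
fn = mgmt.get_function(function_ocid).data

invoker = oci.functions.FunctionsInvokeClient(config, service_endpoint=fn.invoke_endpoint)
resp = invoker.invoke_function(
    function_ocid,
    invoke_function_body=json.dumps({"inputs": [1, 2, 3]}),
)
print(resp.data.text)  # Raw response body returned by the function
```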
The function's configuration, including memory allocation, timeout, and environment variables like `MODEL_URL`, will depend on your specific model and use case. Note that this code assumes that you have already:
- Created the necessary compartment (`compartment_id`) in OCI.
- Set up the necessary networking, including subnets.
- Pushed a Docker image (`image_uri`) containing your machine learning inference code to OCI Registry or another container registry (a sketch of such an image's handler follows below).
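The handler baked into that image is ordinary OCI Functions code. A minimal sketch using the OCI Functions Python FDK, assuming a hypothetical `load_model` helper and the `MODEL_URL` environment variable set in the function config above; the payload shape and model API are illustrative:

```python
import io
import json
import os

from fdk import response


def load_model(model_url):
    # Hypothetical helper: download and deserialize the model referenced
    # by MODEL_URL (e.g. from OCI Object Storage). Stubbed out here.
    return None


# Load the model once per container so that warm invocations skip
# the expensive initialization step.
MODEL = load_model(os.environ.get("MODEL_URL"))


def handler(ctx, data: io.BytesIO = None):
    payload = json.loads(data.getvalue())
    # Replace this echo stub with a real call such as
    # MODEL.predict(payload["inputs"]).
    prediction = payload.get("inputs")
    return response.Response(
        ctx,
        response_data=json.dumps({"prediction": prediction}),
        headers={"Content-Type": "application/json"},
    )
```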
Remember to replace placeholder values like `ocid1.compartment.oc1..your_compartment_id`, `your_image_uri`, `subnet_ocid1`, etc., with actual values from your OCI setup. This program sets the foundation for running serverless machine learning inference on OCI, using Pulumi to automate resource provisioning and deployment.

The `api_specification.json` file should contain the API Gateway deployment specification, that is, the routes and function backends that define how the gateway forwards requests to your function.
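One wrinkle with a static JSON file is that the route's backend must reference the function's OCID, which only exists after Pulumi creates the function. A sketch of an inline alternative that references `func.id` directly; the route path, method, and `ORACLE_FUNCTIONS_BACKEND` type follow OCI's deployment specification schema, but verify the exact shape against the pulumi_oci documentation:

```python
# Inline deployment specification: Pulumi resolves func.id at deploy time,
# so the route's backend always points at the newly created function.
api_deployment = oci.apigateway.Deployment("api_deployment",
    compartment_id=compartment_id,
    display_name="my-api-deployment",
    gateway_id=api_gw.id,
    path_prefix="/infer",
    specification={
        "routes": [{
            "path": "/",                 # Exposed as <endpoint>/infer/
            "methods": ["POST"],
            "backend": {
                "type": "ORACLE_FUNCTIONS_BACKEND",
                "function_id": func.id,  # Output of the Function resource
            },
        }],
    },
)
```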