Amazon SageMaker HyperPod now supports data capture for inference workloads
Amazon SageMaker HyperPod now supports data capture for inference workloads, a new capability that records inference request and response payloads from production endpoints to Amazon S3. Customers deploying generative AI models on HyperPod need visibility into model inputs and outputs to detect drift, troubleshoot production issues, build evaluation datasets, and continuously improve their deployed models, but previously had to build custom logging pipelines outside of the service to obtain this visibility.
With data capture, customers can train speculative decoding draft models from their real production traffic for better performance than generic draft models, build evaluation pipelines from production data, feed fine-tuning jobs with real-world inputs, and maintain audit trails for compliance. Customers choose where to capture inference traffic on each endpoint: at the SageMaker endpoint, the load balancer, or the model pod. Captured data is delivered asynchronously to their Amazon S3 bucket without blocking inference. Data capture also supports configurable sampling and customer-managed AWS KMS encryption. You can enable data capture when deploying models through the HyperPod Inference Operator, and use the captured data with Amazon SageMaker Model Monitor and your existing evaluation, fine-tuning, and draft-model training workflows.
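To make the idea of sampled, non-blocking capture concrete, here is a minimal conceptual sketch in Python. It is not the HyperPod implementation or its API: the `CaptureBuffer` class, its `sampling_percentage` parameter, and the `sink` callable (a stand-in for an upload to Amazon S3) are all hypothetical names chosen for illustration. The sketch only shows the general pattern of deciding per request whether to capture, then handing the payload to a background writer so the inference path never waits on storage.

```python
import json
import queue
import random
import threading


class CaptureBuffer:
    """Hypothetical illustration: sample request/response pairs and pass them
    to a background writer so the inference path does not wait on storage."""

    def __init__(self, sampling_percentage, sink, max_pending=1000, seed=None):
        if not 0 <= sampling_percentage <= 100:
            raise ValueError("sampling_percentage must be between 0 and 100")
        self.sampling_percentage = sampling_percentage
        self.sink = sink  # callable standing in for an S3 upload (assumption)
        self.dropped = 0  # payloads discarded because the buffer was full
        self._rng = random.Random(seed)
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, request, response):
        """Called on the inference path; returns without waiting on the sink."""
        if self._rng.random() * 100 >= self.sampling_percentage:
            return False  # this request was not sampled
        try:
            self._pending.put_nowait({"request": request, "response": response})
            return True
        except queue.Full:
            self.dropped += 1  # shed capture load rather than block inference
            return False

    def _drain(self):
        """Background loop: serialize each payload and deliver it to the sink."""
        while True:
            item = self._pending.get()
            if item is None:  # sentinel placed by close()
                break
            self.sink(json.dumps(item))

    def close(self):
        """Flush everything already queued, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

For example, `CaptureBuffer(10, sink)` would capture roughly one request in ten. When the buffer is full, this sketch drops the capture record instead of delaying the response, which is one possible way to keep capture off the critical path; the actual service behavior under backpressure is not described in this announcement.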
This feature is available for SageMaker HyperPod clusters using the EKS orchestrator in all AWS Regions where Amazon SageMaker HyperPod is supported. To learn more, see Data capture for inference on HyperPod.