Thursday
May 14
Cloud Native May Meetup: Change-driven Architecture and Serving Inference at Scale
Reperio Health

Cloud Native PDX May: Change-driven Architecture and Serving Inference at Scale

This May is all about scaling: scaling your infrastructure management or scaling your LLM inference serving. Join us to learn about open source tools that make managing large modern stacks easier.

Date: Thursday, May 14
Time: 5:30–7:30 PM
Location: Reperio Health, 4784 SE 17th Ave Suite 120, Portland, OR

Recording: Talks are typically recorded (opt-in by speakers)

A big thank you to Microsoft for sponsoring food & beverage, and to Reperio Health, our venue host.

Drasi, a new take on Change Driven Architectures: Aman Singh, Microsoft

Modern cloud-native systems constantly generate data changes, and applications often need to react to them. Building change-driven solutions that respond to specific changes in distributed data is challenging. This talk introduces Drasi, a CNCF Sandbox project that simplifies the design and implementation of change-driven architectures using Graph Queries and pluggable components. For example, with Drasi you can declaratively write automation that detects and responds to running containers with newly identified vulnerabilities across pods and deployments in a Kubernetes cluster. Join us for a walkthrough of real-world use cases that show how Drasi’s approach brings structure and responsiveness to complex distributed environments, without writing custom code.

Dynamo: Large Scale Distributed Inference
David Zeir, Director, DL System Software, Nvidia
Neelay Shah, Distinguished Engineer, Nvidia

This talk introduces Dynamo, NVIDIA's open-source Kubernetes-native distributed inference platform. We'll cover the problem space, walk through Dynamo's architecture — disaggregated prefill/decode, KV-cache-aware routing, and a transport layer that moves KV blocks directly between GPUs — and dig into the Kubernetes integration for scheduling, autoscaling, and graceful failure handling. We'll close with a demo of Dynamo serving a real workload.

