observability

5 posts tagged “observability”

July 19, 2026

Running a Fleet of Firecracker microVMs for eve.dev Agents, Part 7: Fleet Operations

The final part of the series. We make the fleet operable: event-driven host autoscaling with lifecycle-hook draining, a reconciliation loop that reschedules agents off a dead host, per-agent logs and metrics rolled up across the fleet, and the FinOps view that turns packing density into cost per agent.

aws aws-cdk firecracker eve agents typescript autoscaling observability finops ai-infrastructure

July 18, 2026

A 101 Guide: Running an Eve Agent in a Firecracker microVM on AWS

A beginner's walkthrough of provisioning bare-metal AWS infrastructure with the AWS CDK in TypeScript, then booting a Firecracker microVM to self-host a Vercel eve agent.

aws aws-cdk firecracker eve agents typescript ai-development observability opentelemetry finops

June 7, 2026

Building a Hybrid LLM Platform on EKS, Part 7: Observability and Cost Telemetry

Part 7 of our hands-on EKS series. We instrument the TypeScript router with OpenTelemetry, upgrade Prometheus to kube-prometheus-stack for GPU and vLLM metrics, add Grafana Tempo for distributed traces, and wire Langfuse so every request shows its backend, token count, and dollar cost.

eks kubernetes aws-cdk opentelemetry prometheus grafana langfuse observability typescript ai-infrastructure

May 21, 2026

Observability for LLM Applications on Kubernetes: Tokens, Traces, and Cost per Request

How to instrument self-hosted and hybrid LLM workloads with OpenTelemetry, Prometheus, and Langfuse — tracking time-to-first-token, tokens per second, GPU utilization, and unit economics down to the individual request.

kubernetes llm observability opentelemetry finops ai-infrastructure

February 7, 2025

Using AI to Monitor Kubernetes Clusters and Make Dynamic Scaling Decisions

How to move beyond static thresholds and use AI-driven observability to detect anomalies, predict traffic patterns, and automate scaling decisions across your Kubernetes infrastructure.

kubernetes ai monitoring autoscaling observability