CLAUDE.md - OpenCost AI Assistant Guide

This document provides guidance for AI assistants working with the OpenCost codebase.

AI Assistant Behaviour

  • Never include claude.ai session links or URLs in commit messages or pull request bodies.

Project Overview

OpenCost is an open source Kubernetes cost monitoring tool maintained by the Cloud Native Computing Foundation (CNCF). It provides real-time cost allocation, asset tracking, and cloud cost monitoring for Kubernetes clusters across multiple cloud providers.

Key Features:

  • Real-time cost allocation by namespace, pod, controller, service, etc.
  • Multi-cloud cost monitoring (AWS, Azure, GCP, Alibaba, Oracle, OTC, DigitalOcean, Scaleway)
  • Dynamic on-demand pricing via cloud provider APIs
  • CSV-based custom pricing for on-prem clusters
  • MCP (Model Context Protocol) server for AI agent integration
  • Prometheus metrics export

Repository Structure

opencost/
├── cmd/costmodel/          # Main entry point (main.go)
├── core/                   # Core module (shared libraries)
│   └── pkg/
│       ├── clusters/       # Cluster management
│       ├── env/            # Environment variable utilities
│       ├── filter/         # Query filter implementations
│       ├── log/            # Structured logging
│       ├── model/          # Core data models
│       ├── opencost/       # OpenCost domain types (Allocation, Asset, CloudCost)
│       ├── storage/        # Storage abstractions
│       └── util/           # Utility packages
├── modules/
│   ├── collector-source/   # Custom metrics collector (alternative to Prometheus)
│   └── prometheus-source/  # Prometheus data source implementation
├── pkg/
│   ├── cloud/              # Cloud provider implementations
│   │   ├── aws/
│   │   ├── azure/
│   │   ├── gcp/
│   │   ├── alibaba/
│   │   ├── oracle/
│   │   ├── digitalocean/
│   │   ├── scaleway/
│   │   └── otc/            # Open Telekom Cloud
│   ├── cloudcost/          # Cloud cost processing pipeline
│   ├── clustercache/       # Kubernetes cluster caching
│   ├── cmd/costmodel/      # Cost model command implementation
│   ├── config/             # Configuration management
│   ├── costmodel/          # Core cost model logic and API handlers
│   ├── customcost/         # Custom cost plugin support
│   ├── env/                # Environment variable definitions
│   ├── mcp/                # MCP server implementation
│   └── metrics/            # Prometheus metrics
├── configs/                # Default pricing configurations
├── kubernetes/             # Kubernetes manifests (deprecated - use Helm)
├── protos/                 # Protocol buffer definitions
├── spec/                   # OpenCost specification
└── ui/                     # UI components (main UI in opencost/opencost-ui repo)

Development Setup

Prerequisites

  • Go 1.25+ (see go.mod for exact version)
  • Docker with buildx support
  • just - command runner
  • Tilt - for local Kubernetes development
  • Kubernetes cluster (local or remote)
  • Prometheus instance

Quick Start Commands

# Run all unit tests
just test

# Run tests for specific module
just test-core
just test-opencost
just test-prometheus-source
just test-collector-source

# Build local binary
just build-local

# Run locally (requires Prometheus and optionally Kubernetes access)
PROMETHEUS_SERVER_ENDPOINT="http://127.0.0.1:9080" go run ./cmd/costmodel/main.go

# Start development environment with Tilt
tilt up

Running Locally Without Kubernetes

Set PROMETHEUS_SERVER_ENDPOINT to your Prometheus URL:

# Port-forward to Prometheus in your cluster
kubectl port-forward svc/prometheus-server 9080:80

# Run OpenCost
PROMETHEUS_SERVER_ENDPOINT="http://127.0.0.1:9080" go run ./cmd/costmodel/main.go

Running Integration Tests

INTEGRATION=true just test-integration

Build Commands

# Build local binary
just build-local

# Build multi-arch binaries
just build-binary <version>

# Build and push Docker image
just build <image-tag> <release-version>

# Validate protobuf definitions
just validate-protobuf

Key Environment Variables

Core Configuration

Variable                    Default        Description
PROMETHEUS_SERVER_ENDPOINT  (required)     Prometheus server URL
API_PORT                    9003           OpenCost API port
CLUSTER_ID                  auto-detected  Cluster identifier
CONFIG_PATH                 /var/configs   Configuration directory

MCP Server

Variable            Default  Description
MCP_SERVER_ENABLED  false    Enable MCP server
MCP_HTTP_PORT       8081     MCP server HTTP port

Cloud Providers

Variable               Description
AWS_ACCESS_KEY_ID      AWS authentication
AWS_SECRET_ACCESS_KEY  AWS authentication
AZURE_OFFER_ID         Azure pricing offer ID
AZURE_BILLING_ACCOUNT  Azure billing account
CLOUD_PROVIDER         Force cloud provider (aws, azure, gcp, etc.)
USE_CSV_PROVIDER       Enable CSV-based custom pricing
CSV_PATH               Path to CSV pricing file

Prometheus Settings

Variable                             Default     Description
PROMETHEUS_QUERY_TIMEOUT             120s        Query timeout
PROMETHEUS_QUERY_RESOLUTION_SECONDS  300         Query resolution
MAX_QUERY_CONCURRENCY                5           Concurrent queries
PROM_CLUSTER_ID_LABEL                cluster_id  Cluster ID label name

Feature Flags

Variable                       Default  Description
CLOUD_COST_ENABLED             false    Enable cloud cost ingestion
CARBON_ESTIMATES_ENABLED       false    Enable carbon estimation
COLLECTOR_DATA_SOURCE_ENABLED  false    Use collector instead of Prometheus

API Endpoints

The main API runs on port 9003 by default (configurable via API_PORT):

Endpoint                 Description
GET /allocation          Cost allocation data
GET /allocation/summary  Summarized allocation
GET /assets              Asset cost data
GET /assets/carbon       Asset carbon estimates
GET /cloudCost           Cloud cost data
GET /customCost/status   Custom cost status
GET /metrics             Prometheus metrics
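
As a rough illustration of calling one of these endpoints, the sketch below builds a GET /allocation URL with net/url. The helper allocationURL is hypothetical, and the query parameters window and aggregate are assumptions that this guide does not document; check the API reference for the parameters your version accepts.

```go
package main

import (
	"fmt"
	"net/url"
)

// allocationURL builds a GET /allocation URL against base.
// Assumption: the endpoint accepts "window" and "aggregate" query
// parameters; neither is documented in this guide.
func allocationURL(base, window, aggregate string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("parsing base URL %q: %w", base, err)
	}
	u.Path = "/allocation"

	q := url.Values{}
	q.Set("window", window)
	if aggregate != "" {
		q.Set("aggregate", aggregate)
	}
	// Encode sorts parameters by key, so the output is deterministic.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := allocationURL("http://localhost:9003", "1d", "namespace")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

The resulting string can be passed to http.Get or to curl against a locally running instance.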

Code Conventions

Go Style

  • Use structured logging via github.com/opencost/opencost/core/pkg/log
  • Access environment variables through pkg/env or core/pkg/env
  • Wrap errors with context before returning them

Before committing, always run:

go fmt ./...
go vet ./...

Module Structure

OpenCost uses a Go workspace with multiple modules:

  • github.com/opencost/opencost - Main module
  • github.com/opencost/opencost/core - Core shared library
  • github.com/opencost/opencost/modules/prometheus-source - Prometheus integration
  • github.com/opencost/opencost/modules/collector-source - Metrics collector

When adding dependencies, ensure they're added to the correct module.

Testing

  • Unit tests use standard Go testing (*_test.go files)
  • Integration tests require the INTEGRATION=true environment variable
  • Use mocks for external dependencies
  • Co-locate test files with the implementation
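
One way to mock an external dependency is to put it behind a small interface and substitute an in-memory fake in tests. Everything in the sketch below (PriceSource, fakePriceSource, NodeCost and the price used) is hypothetical and chosen only to show the pattern; it is not taken from the OpenCost codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// PriceSource is a hypothetical interface standing in for an external
// dependency such as a cloud pricing API.
type PriceSource interface {
	NodeHourlyPrice(instanceType string) (float64, error)
}

// fakePriceSource is an in-memory mock that serves prices from a map.
type fakePriceSource struct {
	prices map[string]float64
}

func (f fakePriceSource) NodeHourlyPrice(instanceType string) (float64, error) {
	p, ok := f.prices[instanceType]
	if !ok {
		return 0, errors.New("unknown instance type: " + instanceType)
	}
	return p, nil
}

// NodeCost is the code under test: it depends only on the interface, so it
// can run against the fake without any network access.
func NodeCost(src PriceSource, instanceType string, hours float64) (float64, error) {
	price, err := src.NodeHourlyPrice(instanceType)
	if err != nil {
		return 0, fmt.Errorf("pricing %s: %w", instanceType, err)
	}
	return price * hours, nil
}

func main() {
	// The price is made up for the example.
	src := fakePriceSource{prices: map[string]float64{"m5.large": 0.5}}
	cost, err := NodeCost(src, "m5.large", 24)
	fmt.Println(cost, err)
}
```

In a real test, the fake and the assertions would live in a *_test.go file next to the implementation and use the testing package rather than main.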

Logging

import "github.com/opencost/opencost/core/pkg/log"

log.Infof("Processing allocation for window: %s", window)
log.Errorf("Failed to query Prometheus: %v", err)
log.Warnf("Missing pricing data, using defaults")
log.Debugf("Detailed debug information")

Pull Request Guidelines

  1. Link related issues using: Fixes #123, Closes #456
  2. Describe user-facing changes and breaking changes
  3. Include test coverage for new functionality
  4. Run just test before submitting
  5. Sign off every commit (git commit -s adds the required Signed-off-by trailer)

Architecture Notes

Data Flow

  1. Prometheus collects Kubernetes metrics (CPU, memory, etc.)
  2. OpenCost queries Prometheus for resource usage data
  3. Cloud Provider APIs provide pricing information
  4. Cost Model combines usage × pricing to compute costs
  5. API/MCP exposes cost data to users and AI agents

Key Types

  • Allocation - Cost allocation for a workload over a time window
  • Asset - Infrastructure asset (node, disk, load balancer)
  • CloudCost - Cloud service costs from billing APIs
  • Window - Time range for queries

Cloud Provider Detection

OpenCost auto-detects the cloud provider from:

  1. CLOUD_PROVIDER environment variable (explicit override)
  2. Kubernetes node labels
  3. Instance metadata services
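
The sketch below shows one way such a fallback chain could be structured, assuming the three sources are consulted in the order listed. The function detectProvider, the label keys and the metadata callback are all illustrative assumptions; they are not the labels or logic OpenCost actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// labelHints pairs node label keys with provider names. The keys are
// illustrative examples only, not the set OpenCost inspects.
var labelHints = []struct{ label, provider string }{
	{"eks.amazonaws.com/nodegroup", "aws"},
	{"cloud.google.com/gke-nodepool", "gcp"},
	{"kubernetes.azure.com/cluster", "azure"},
}

// detectProvider is a hypothetical function that resolves the provider name
// from, in order: an explicit override, node labels, and a metadata probe.
// It returns "" when nothing matches.
func detectProvider(override string, nodeLabels map[string]string, probeMetadata func() string) string {
	// 1. An explicit override (e.g. the CLOUD_PROVIDER value) wins.
	if p := strings.ToLower(strings.TrimSpace(override)); p != "" {
		return p
	}
	// 2. Look for a provider-specific node label.
	for _, h := range labelHints {
		if _, ok := nodeLabels[h.label]; ok {
			return h.provider
		}
	}
	// 3. Fall back to the instance metadata probe, if one was supplied.
	if probeMetadata != nil {
		return probeMetadata()
	}
	return ""
}

func main() {
	labels := map[string]string{"cloud.google.com/gke-nodepool": "default-pool"}
	fmt.Println(detectProvider("", labels, nil))
	fmt.Println(detectProvider("aws", labels, nil))
}
```

Passing the metadata probe as a function keeps the sketch free of network calls and makes each step easy to test in isolation.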

Common Tasks

Adding a New Cloud Provider

  1. Create a package under pkg/cloud/<provider>/
  2. Implement the models.Provider interface
  3. Add environment variables in pkg/env/costmodel.go
  4. Register the provider in pkg/cloud/provider/provider.go
  5. Add a default pricing config in configs/

Adding a New API Endpoint

  1. Add a handler method to pkg/costmodel/router.go or another appropriate file
  2. Register the route in pkg/cmd/costmodel/costmodel.go
  3. Add tests in the corresponding *_test.go file

Modifying Protobuf Definitions

  1. Edit .proto files in protos/
  2. Run ./generate.sh to regenerate Go code
  3. Run just validate-protobuf to verify

Cost Model Concepts

Core formulas from the OpenCost Specification (spec/opencost-specv01.md):

  • Total Cluster Costs = Cluster Asset Costs + Cluster Overhead Costs
  • Cluster Asset Costs = Resource Allocation Costs + Resource Usage Costs
  • Workload Costs = max(request, usage) × price, for CPU and memory resources
  • Idle Costs = Allocation costs not attributed to any workload
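
A small worked example may make the max(request, usage) rule concrete. The sketch assumes a workload is billed on the larger of its request and its usage, multiplied by an hourly rate and a duration. The function resourceCost, the rates and the quantities are all made up for illustration; the specification is the authoritative source for the exact definitions.

```go
package main

import (
	"fmt"
	"math"
)

// resourceCost prices a resource on the larger of what was requested and
// what was actually used. Hypothetical helper; rates are per unit per hour.
func resourceCost(request, usage, hourlyRate, hours float64) float64 {
	return math.Max(request, usage) * hourlyRate * hours
}

func main() {
	const hours = 10.0

	// CPU: 2 cores requested, 0.5 used, so the request drives the cost.
	cpu := resourceCost(2.0, 0.5, 0.25, hours)

	// Memory: 4 GiB requested, 6 GiB used, so the usage drives the cost.
	ram := resourceCost(4.0, 6.0, 0.125, hours)

	fmt.Printf("cpu=%.2f ram=%.2f total=%.2f\n", cpu, ram, cpu+ram)
}
```

With these numbers the CPU cost is 2 × 0.25 × 10 = 5.00 and the memory cost is 6 × 0.125 × 10 = 7.50, for a total of 12.50.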

Useful Links