117 changes: 117 additions & 0 deletions design/EP-476-enterprise-enablement.md
@@ -0,0 +1,117 @@
# EP-476: Enterprise Enablement

* Issue: [#476](https://github.com/kagent-dev/kagent/issues/476)

## Background

This EP addresses enterprise deployment requirements for kagent, specifically authentication, multi-tenancy, and audit logging. These features are prerequisites for production deployment in regulated environments.

## Motivation

Currently, kagent has limited support for enterprise deployment scenarios:

1. **Authentication**: Only `UnsecureAuthenticator` is implemented
2. **Multi-tenancy**: Controller operates cluster-wide with no namespace isolation
3. **Audit logging**: Basic HTTP logging without compliance-ready audit trail

### Goals

1. Implement OAuth2/OIDC authentication provider
2. Add namespace-scoped controller mode for multi-tenancy
3. Add structured audit logging for compliance

### Non-Goals

- RBAC authorization (future work)
- Multi-cluster support (separate EP)
- Air-gapped installation (documentation only)

## Implementation Details

### OAuth2/OIDC Authentication

New `OAuth2Authenticator` implementing `auth.AuthProvider`:

```go
// go/internal/httpserver/auth/oauth2.go
type OAuth2Config struct {
	IssuerURL      string
	ClientID       string
	Audience       string
	RequiredScopes []string
	UserIDClaim    string // default: "sub"
	RolesClaim     string // default: "roles"
}
```

Features:
- JWT validation with JWKS caching
- Configurable claims extraction
- Scope and audience validation
- Bearer token from header or query parameter
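
The claim checks listed above could look roughly like the following. This is a minimal sketch, not the PR's implementation: `validateClaims` and `hasAudience` are hypothetical helpers, the claims are assumed to come from a token whose signature was already verified against the JWKS, and scopes are assumed to arrive as a space-delimited `scope` claim (some identity providers use a different claim or an array).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// OAuth2Config mirrors the struct proposed in this EP.
type OAuth2Config struct {
	IssuerURL      string
	ClientID       string
	Audience       string
	RequiredScopes []string
	UserIDClaim    string // default: "sub"
	RolesClaim     string // default: "roles"
}

// validateClaims (hypothetical) checks issuer, audience, and scopes on an
// already signature-verified claim set and returns the user ID.
func validateClaims(cfg OAuth2Config, claims map[string]any) (string, error) {
	if iss, _ := claims["iss"].(string); iss != cfg.IssuerURL {
		return "", fmt.Errorf("unexpected issuer %q", iss)
	}
	if cfg.Audience != "" && !hasAudience(claims["aud"], cfg.Audience) {
		return "", errors.New("token audience does not match")
	}

	// Assumption: scopes are a space-delimited string in the "scope" claim.
	granted, _ := claims["scope"].(string)
	have := map[string]bool{}
	for _, s := range strings.Fields(granted) {
		have[s] = true
	}
	for _, want := range cfg.RequiredScopes {
		if !have[want] {
			return "", fmt.Errorf("missing required scope %q", want)
		}
	}

	// Fall back to "sub" when no user ID claim is configured.
	key := cfg.UserIDClaim
	if key == "" {
		key = "sub"
	}
	user, ok := claims[key].(string)
	if !ok || user == "" {
		return "", fmt.Errorf("claim %q is missing or not a string", key)
	}
	return user, nil
}

// hasAudience accepts both forms of "aud": a single string or a JSON array.
func hasAudience(aud any, want string) bool {
	switch v := aud.(type) {
	case string:
		return v == want
	case []any:
		for _, a := range v {
			if s, ok := a.(string); ok && s == want {
				return true
			}
		}
	}
	return false
}

func main() {
	cfg := OAuth2Config{
		IssuerURL:      "https://idp.example.com",
		Audience:       "kagent",
		RequiredScopes: []string{"kagent.read"},
	}
	claims := map[string]any{
		"iss":   "https://idp.example.com",
		"aud":   []any{"kagent"},
		"scope": "openid kagent.read",
		"sub":   "alice",
	}
	user, err := validateClaims(cfg, claims)
	fmt.Println(user, err)
}
```

Keeping the claim checks separate from signature verification makes them easy to cover with table-driven unit tests that do not need a key pair.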

### Namespace-Scoped Controller

Add `watchedNamespaces` parameter to reconciler:

```go
// go/internal/controller/reconciler/reconciler.go
type kagentReconciler struct {
	// If empty, cluster-wide. If set, only these namespaces.
	watchedNamespaces []string
}

func (a *kagentReconciler) validateNamespaceIsolation(namespace string) error
```

All reconcile methods call `validateNamespaceIsolation()` before processing.
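
To make the two modes concrete, here is a standalone sketch of the guard and of how a reconcile method might use it. The `reconciler` type and `reconcile` method are simplified stand-ins for illustration only; the real reconciler has more fields and talks to the Kubernetes API.

```go
package main

import (
	"fmt"
	"slices"
)

// reconciler is a simplified stand-in for kagentReconciler.
type reconciler struct {
	// Empty means cluster-wide; otherwise only these namespaces are processed.
	watchedNamespaces []string
}

// validateNamespaceIsolation returns an error when the namespace is outside
// the watched set. An empty set allows every namespace.
func (r *reconciler) validateNamespaceIsolation(namespace string) error {
	if len(r.watchedNamespaces) == 0 || slices.Contains(r.watchedNamespaces, namespace) {
		return nil
	}
	return fmt.Errorf("namespace %q is not in the list of watched namespaces %v",
		namespace, r.watchedNamespaces)
}

// reconcile stands in for any Reconcile* method. It reports whether the
// object was processed. Out-of-scope objects are skipped with a nil error,
// so the request is not retried.
func (r *reconciler) reconcile(namespace, name string) (bool, error) {
	if err := r.validateNamespaceIsolation(namespace); err != nil {
		fmt.Printf("skipping %s/%s: %v\n", namespace, name, err)
		return false, nil
	}
	// ... the actual reconciliation work would happen here ...
	return true, nil
}

func main() {
	scoped := &reconciler{watchedNamespaces: []string{"team-a", "team-b"}}
	clusterWide := &reconciler{}

	for _, ns := range []string{"team-a", "team-c"} {
		processed, _ := scoped.reconcile(ns, "my-agent")
		fmt.Printf("scoped: %s processed=%v\n", ns, processed)
	}
	processed, _ := clusterWide.reconcile("team-c", "my-agent")
	fmt.Printf("cluster-wide: team-c processed=%v\n", processed)
}
```

Returning `nil` for an out-of-scope object treats it as "not ours" rather than as a failure, which avoids repeated retries for resources the controller will never be allowed to handle.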

### Structured Audit Logging

Middleware that logs compliance-ready JSON:

```go
// go/internal/httpserver/middleware.go
type AuditLogConfig struct {
	Enabled        bool
	LogLevel       int
	IncludeHeaders []string
}
```

Logged fields: `request_id`, `timestamp`, `user`, `user_roles`, `namespace`, `action`, `status`, `duration_ms`

### Helm Configuration

```yaml
# values.yaml
controller:
  watchNamespaces: [] # empty = cluster-wide

pdb:
  enabled: false
  controller:
    minAvailable: 1

metrics:
  serviceMonitor:
    enabled: false
```

### Test Plan

- Unit tests for OAuth2 token validation (8 test cases)
- Unit tests for namespace isolation (15 test cases)
- Unit tests for audit middleware (11 test cases)
- Integration tests with mock OIDC server
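
For the integration tests, a mock OIDC server can be as small as an in-process `httptest` server that serves the discovery document and a JWKS endpoint. The sketch below is one possible shape; `newMockOIDCServer` and `fetchDiscovery` are hypothetical helpers, the `/keys` path is an arbitrary choice, and the JWKS is left empty (a real test would publish a test signing key there).

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMockOIDCServer starts an in-process server exposing the OIDC discovery
// document and an (empty) JWKS. The caller must Close it.
func newMockOIDCServer() *httptest.Server {
	mux := http.NewServeMux()
	srv := httptest.NewServer(mux)

	mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{
			"issuer":   srv.URL,
			"jwks_uri": srv.URL + "/keys",
		})
	})
	mux.HandleFunc("/keys", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write([]byte(`{"keys":[]}`))
	})
	return srv
}

// fetchDiscovery retrieves the discovery document the way a client would.
func fetchDiscovery(issuer string) (map[string]string, error) {
	resp, err := http.Get(issuer + "/.well-known/openid-configuration")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("discovery returned status %d", resp.StatusCode)
	}
	var doc map[string]string
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return nil, err
	}
	return doc, nil
}

func main() {
	srv := newMockOIDCServer()
	defer srv.Close()

	doc, err := fetchDiscovery(srv.URL)
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("issuer matches server URL:", doc["issuer"] == srv.URL)
	fmt.Println("jwks_uri:", doc["jwks_uri"])
}
```

Pointing `OAuth2Config.IssuerURL` at `srv.URL` would let the authenticator under test run its normal discovery flow without any external dependency.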

## Alternatives

1. **Use existing auth middleware**: Rejected - kagent needs session-based auth for A2A protocol
2. **Namespace isolation via RBAC only**: Rejected - controller still needs to enforce boundaries
3. **External audit logging**: Considered - middleware approach is simpler and integrates with existing logging

## Open Questions

1. Should OAuth2 config be a CRD or Helm values? (Currently Helm values)
2. Integration with OpenShift OAuth server? (Future work)
112 changes: 112 additions & 0 deletions docs/openshift-deployment-guide.md
@@ -0,0 +1,112 @@
# OpenShift Deployment Guide

This guide covers OpenShift-specific deployment considerations for kagent.

## Prerequisites

- OpenShift 4.12+
- `oc` CLI configured
- Helm 3.x

## Installation

```bash
# Create namespace
oc new-project kagent

# Install CRDs
helm install kagent-crds ./helm/kagent-crds/ -n kagent

# Install kagent with OpenShift Route
helm install kagent ./helm/kagent/ -n kagent \
  --set providers.openAI.apiKey=$OPENAI_API_KEY \
  --set openshift.enabled=true
```

## Security Context Constraints

kagent runs under the `restricted-v2` SCC by default; no additional SCCs are required.

For custom SCCs:

```bash
# Inspect the security context applied to the pods
oc get pod -n kagent -o yaml | grep -A5 securityContext

# Grant specific SCC (if needed)
oc adm policy add-scc-to-user anyuid -z kagent-controller -n kagent
```

## Routes

The Helm chart creates an OpenShift Route when `openshift.enabled=true`:

```yaml
# values.yaml
openshift:
  enabled: true
  route:
    host: kagent.apps.example.com # optional
    tls:
      termination: edge
```

Access the UI:

```bash
oc get route kagent-ui -n kagent -o jsonpath='{.spec.host}'
```

## Pod Security Standards

kagent is compatible with the Pod Security Standards (PSS) `restricted` profile:

| Setting | Value |
|---------|-------|
| `runAsNonRoot` | true |
| `allowPrivilegeEscalation` | false |
| `capabilities.drop` | ALL |
| `seccompProfile` | RuntimeDefault |

## High Availability

```yaml
# values-ha.yaml
controller:
  replicas: 2

ui:
  replicas: 2

pdb:
  enabled: true
  controller:
    minAvailable: 1
  ui:
    minAvailable: 1
```

```bash
helm upgrade kagent ./helm/kagent/ -n kagent -f values-ha.yaml
```

## Troubleshooting

```bash
# Check pod status
oc get pods -n kagent

# Check SCC violations
oc get events -n kagent | grep -i scc

# View controller logs
oc logs -l app.kubernetes.io/component=controller -n kagent

# Check Route status
oc describe route kagent-ui -n kagent
```

## See Also

- [Installation Guide](https://kagent.dev/docs/kagent/introduction/installation)
- [Helm Chart README](../helm/README.md)
63 changes: 63 additions & 0 deletions go/internal/controller/reconciler/reconciler.go
@@ -58,6 +58,11 @@ type kagentReconciler struct {

	defaultModelConfig types.NamespacedName

	// watchedNamespaces contains the list of namespaces the controller is allowed to operate on.
	// If empty, all namespaces are allowed (cluster-wide mode).
	// If non-empty, only operations within these namespaces are permitted (namespace-scoped mode).
	watchedNamespaces []string

	// TODO: Remove this lock since we have a DB which we can batch anyway
	upsertLock sync.Mutex
}
@@ -67,16 +72,50 @@ func NewKagentReconciler(
	kube client.Client,
	dbClient database.Client,
	defaultModelConfig types.NamespacedName,
	watchedNamespaces []string,
) KagentReconciler {
	return &kagentReconciler{
		adkTranslator:      translator,
		kube:               kube,
		dbClient:           dbClient,
		defaultModelConfig: defaultModelConfig,
		watchedNamespaces:  watchedNamespaces,
	}
}

// isNamespaceAllowed checks if the given namespace is allowed based on the watched namespaces configuration.
// Returns true if:
// - watchedNamespaces is empty (cluster-wide mode, all namespaces allowed)
// - namespace is in the watchedNamespaces list (namespace-scoped mode)
func (a *kagentReconciler) isNamespaceAllowed(namespace string) bool {
	if len(a.watchedNamespaces) == 0 {
		return true
	}
	return slices.Contains(a.watchedNamespaces, namespace)
}

// validateNamespaceIsolation checks if an operation in the given namespace is allowed.
// Returns an error if namespace isolation is violated.
func (a *kagentReconciler) validateNamespaceIsolation(namespace string) error {
	if !a.isNamespaceAllowed(namespace) {
		return fmt.Errorf("namespace %q is not in the list of watched namespaces; controller is in namespace-scoped mode watching: %v", namespace, a.watchedNamespaces)
	}
	return nil
}

// IsNamespaceScopedMode returns true if the controller is operating in namespace-scoped mode
// (i.e., watching specific namespaces rather than all namespaces).
func (a *kagentReconciler) IsNamespaceScopedMode() bool {
	return len(a.watchedNamespaces) > 0
}

func (a *kagentReconciler) ReconcileKagentAgent(ctx context.Context, req ctrl.Request) error {
	// Enforce namespace isolation in namespace-scoped mode
	if err := a.validateNamespaceIsolation(req.Namespace); err != nil {
		reconcileLog.Info("Skipping agent reconciliation due to namespace isolation", "agent", req.NamespacedName, "error", err)
		return nil
	}

	// TODO(sbx0r): missing finalizer logic
	agent := &v1alpha2.Agent{}
	if err := a.kube.Get(ctx, req.NamespacedName, agent); err != nil {
@@ -171,6 +210,12 @@ func (a *kagentReconciler) reconcileAgentStatus(ctx context.Context, agent *v1al
}

func (a *kagentReconciler) ReconcileKagentMCPService(ctx context.Context, req ctrl.Request) error {
	// Enforce namespace isolation in namespace-scoped mode
	if err := a.validateNamespaceIsolation(req.Namespace); err != nil {
		reconcileLog.Info("Skipping MCP service reconciliation due to namespace isolation", "service", req.NamespacedName, "error", err)
		return nil
	}

	service := &corev1.Service{}
	if err := a.kube.Get(ctx, req.NamespacedName, service); err != nil {
		if apierrors.IsNotFound(err) {
@@ -214,6 +259,12 @@ type secretRef struct {
}

func (a *kagentReconciler) ReconcileKagentModelConfig(ctx context.Context, req ctrl.Request) error {
	// Enforce namespace isolation in namespace-scoped mode
	if err := a.validateNamespaceIsolation(req.Namespace); err != nil {
		reconcileLog.Info("Skipping model config reconciliation due to namespace isolation", "modelConfig", req.NamespacedName, "error", err)
		return nil
	}

	modelConfig := &v1alpha2.ModelConfig{}
	if err := a.kube.Get(ctx, req.NamespacedName, modelConfig); err != nil {
		if apierrors.IsNotFound(err) {
@@ -337,6 +388,12 @@ func (a *kagentReconciler) reconcileModelConfigStatus(ctx context.Context, model
}

func (a *kagentReconciler) ReconcileKagentMCPServer(ctx context.Context, req ctrl.Request) error {
	// Enforce namespace isolation in namespace-scoped mode
	if err := a.validateNamespaceIsolation(req.Namespace); err != nil {
		reconcileLog.Info("Skipping MCP server reconciliation due to namespace isolation", "mcpServer", req.NamespacedName, "error", err)
		return nil
	}

	mcpServer := &v1alpha1.MCPServer{}
	if err := a.kube.Get(ctx, req.NamespacedName, mcpServer); err != nil {
		if apierrors.IsNotFound(err) {
@@ -375,6 +432,12 @@ func (a *kagentReconciler) ReconcileKagentMCPServer(ctx context.Context, req ctr
}

func (a *kagentReconciler) ReconcileKagentRemoteMCPServer(ctx context.Context, req ctrl.Request) error {
	// Enforce namespace isolation in namespace-scoped mode
	if err := a.validateNamespaceIsolation(req.Namespace); err != nil {
		reconcileLog.Info("Skipping remote MCP server reconciliation due to namespace isolation", "remoteMCPServer", req.NamespacedName, "error", err)
		return nil
	}

	nns := req.NamespacedName
	serverRef := nns.String()
	l := reconcileLog.WithValues("remoteMCPServer", serverRef)