UDL-TF/RestartController

RestartController

A Kubernetes controller that automatically restarts workloads (Deployments, StatefulSets, DaemonSets, ReplicaSets) on a schedule defined by a cron expression.

Overview

RestartController is a Go-based Kubernetes operator that, each time a configurable cron schedule fires, lists the pods matching a label selector and restarts their parent workloads. This is useful where periodic restarts are needed for maintenance, to mitigate memory leaks, or to force workloads to pick up configuration changes.

Architecture

graph TB
    subgraph "Kubernetes Cluster"
        subgraph "RestartController Pod"
            Main[Main Process]
            Config[Configuration]
            Cron[Cron Scheduler]
            RestartCtrl[Restart Controller]
        end

        subgraph "Kubernetes API"
            API[K8s API Server]
        end

        subgraph "Target Workloads"
            Deploy[Deployments]
            SSet[StatefulSets]
            DSet[DaemonSets]
            RSet[ReplicaSets]
            Pods[Pods with Labels]
        end
    end

    Main --> Config
    Main --> Cron
    Cron --> RestartCtrl
    RestartCtrl --> API
    API --> Pods
    Pods --> Deploy
    Pods --> SSet
    Pods --> DSet
    Pods --> RSet
    API --> Deploy
    API --> SSet
    API --> DSet
    API --> RSet

    style Main fill:#1e3a8a
    style Cron fill:#92400e
    style RestartCtrl fill:#065f46
    style API fill:#6b21a8

How It Works

Component Architecture

graph LR
    subgraph "Entry Point"
        A[main.go]
    end

    subgraph "Configuration"
        B[Config]
        B1[POD_SELECTOR]
        B2[CRON_EXPRESSION]
        B3[NAMESPACE]
    end

    subgraph "Core Logic"
        C[RestartController]
        D[Cron Scheduler]
    end

    subgraph "Kubernetes Client"
        E[K8s Client Wrapper]
        E1[ListPods]
        E2[GetPodOwner]
        E3[RestartWorkload]
    end

    A --> B
    B --> B1
    B --> B2
    B --> B3
    A --> C
    C --> D
    C --> E
    E --> E1
    E --> E2
    E --> E3

    style A fill:#1e3a8a
    style C fill:#065f46
    style E fill:#92400e

Execution Flow

flowchart TD
    Start([Start Controller]) --> Init[Initialize Configuration]
    Init --> K8sClient[Create K8s Client]
    K8sClient --> CreateCron[Create Cron Scheduler]
    CreateCron --> RegisterJob[Register Restart Job<br/>with Cron Expression]
    RegisterJob --> StartCron[Start Cron Scheduler]
    StartCron --> Wait[Wait for Signal/Trigger]

    Wait --> |Cron Triggers| Execute[Execute Restart Job]
    Wait --> |SIGTERM/SIGINT| Shutdown

    Execute --> ListPods[List Pods by Selector]
    ListPods --> CheckPods{Pods Found?}
    CheckPods --> |No| LogNoPods[Log: No Pods Found]
    CheckPods --> |Yes| LoopPods[Iterate Through Pods]

    LoopPods --> GetOwner[Get Pod Owner<br/>ReplicaSet → Deployment]
    GetOwner --> CheckDup{Already<br/>Restarted?}

    CheckDup --> |Yes| Skip[Skip - Already Restarting]
    CheckDup --> |No| CheckKind{Owner Kind?}

    CheckKind --> |Deployment| RestartDeploy[Restart Deployment<br/>Add Annotation]
    CheckKind --> |StatefulSet| RestartSS[Restart StatefulSet<br/>Add Annotation]
    CheckKind --> |DaemonSet| RestartDS[Restart DaemonSet<br/>Add Annotation]
    CheckKind --> |ReplicaSet| RestartRS[Restart ReplicaSet<br/>Scale 0 → Original]
    CheckKind --> |Unknown| LogError[Log Error:<br/>Unsupported Kind]

    RestartDeploy --> MarkRestarted[Mark as Restarted]
    RestartSS --> MarkRestarted
    RestartDS --> MarkRestarted
    RestartRS --> MarkRestarted

    MarkRestarted --> NextPod{More Pods?}
    Skip --> NextPod
    LogError --> NextPod
    LogNoPods --> Wait

    NextPod --> |Yes| LoopPods
    NextPod --> |No| Complete[Complete Restart Cycle]
    Complete --> Wait

    Shutdown([Graceful Shutdown]) --> StopCron[Stop Cron Scheduler]
    StopCron --> End([Exit])

    style Start fill:#1e3a8a
    style Execute fill:#92400e
    style CheckKind fill:#6b21a8
    style Complete fill:#065f46
    style Shutdown fill:#7f1d1d

Restart Mechanism Detail

sequenceDiagram
    participant Cron as Cron Scheduler
    participant RC as RestartController
    participant K8s as Kubernetes API
    participant Pod as Target Pods
    participant Workload as Parent Workload

    Cron->>RC: Trigger (Cron Expression)
    RC->>K8s: List Pods by Selector
    K8s-->>RC: Return Pod List

    loop For Each Pod
        RC->>K8s: Get Pod Owner Reference
        K8s-->>RC: Owner Kind & Name

        alt Owner is ReplicaSet
            RC->>K8s: Get ReplicaSet Details
            K8s-->>RC: ReplicaSet Info
            RC->>K8s: Get ReplicaSet Owner
            K8s-->>RC: Deployment (Parent)
        end

        alt Not Already Restarted
            alt Deployment/StatefulSet/DaemonSet
                RC->>K8s: Update Template Annotation<br/>"kubectl.kubernetes.io/restartedAt"
                K8s->>Workload: Trigger Rolling Update
                Workload->>Pod: Terminate & Recreate
            else ReplicaSet
                RC->>K8s: Scale to 0
                K8s->>Pod: Terminate All Pods
                RC->>K8s: Scale to Original
                K8s->>Pod: Create New Pods
            end
            K8s-->>RC: Update Success
            RC->>RC: Mark as Restarted
        else Already Restarted
            RC->>RC: Skip (Avoid Duplicate)
        end
    end

    RC-->>Cron: Restart Cycle Complete

Configuration

The controller is configured via environment variables:

| Variable | Default | Description |
|---|---|---|
| POD_SELECTOR | restart=tf2 | Label selector to match target pods |
| CRON_EXPRESSION | 0 4 * * * | Cron expression for restart schedule |
| NAMESPACE | default | Kubernetes namespace to operate in |

Cron Expression Examples

0 4 * * *        # Every day at 04:00 AM
*/30 * * * *     # Every 30 minutes
0 */6 * * *      # Every 6 hours
0 2 * * 0        # Every Sunday at 02:00 AM
0 0 1 * *        # First day of every month at midnight

Supported Workload Types

graph TD
    Controller[RestartController]

    Controller --> Deploy[Deployment]
    Controller --> SS[StatefulSet]
    Controller --> DS[DaemonSet]
    Controller --> RS[ReplicaSet]

    Deploy --> |Add Annotation| Deploy1[Rolling Restart]
    SS --> |Add Annotation| SS1[Rolling Restart]
    DS --> |Add Annotation| DS1[Rolling Restart]
    RS --> |Scale Down/Up| RS1[Full Restart]

    style Controller fill:#1e3a8a
    style Deploy1 fill:#065f46
    style SS1 fill:#065f46
    style DS1 fill:#065f46
    style RS1 fill:#92400e

Restart Methods

  1. Deployment, StatefulSet, DaemonSet: Adds or updates the kubectl.kubernetes.io/restartedAt annotation on the pod template with the current timestamp, which triggers a rolling restart
  2. ReplicaSet: Scales to 0 replicas, waits 2 seconds, then scales back to the original replica count. All pods are terminated at once, so this is a full restart rather than a rolling one

Installation

Using Helm

# Install from OCI registry
helm install restart-controller oci://registry.example.com/charts/restart-controller \
  --version 1.0.0

# Install with custom values from OCI
helm install restart-controller oci://registry.example.com/charts/restart-controller \
  --version 1.0.0 \
  --set config.podSelector="app=myapp" \
  --set config.cronExpression="0 3 * * *" \
  --set config.namespace="production"

# Upgrade from OCI registry
helm upgrade restart-controller oci://registry.example.com/charts/restart-controller \
  --version 1.1.0

Project Structure

RestartController/
├── cmd/
│   └── controller/
│       └── main.go              # Application entry point
├── internal/
│   └── controller/
│       ├── config.go            # Configuration management
│       └── restarter.go         # Core restart logic
├── pkg/
│   └── k8s/
│       └── client.go            # Kubernetes API client wrapper
├── helm/
│   ├── Chart.yaml               # Helm chart metadata
│   ├── values.yaml              # Default Helm values
│   └── templates/               # Kubernetes manifests
│       ├── deployment.yaml
│       ├── role.yaml
│       ├── rolebinding.yaml
│       └── serviceaccount.yaml
├── Dockerfile                   # Container image definition
├── go.mod                       # Go module dependencies
└── README.md                    # This file

RBAC Permissions

The controller requires the following Kubernetes permissions:

- apiGroups: ['']
  resources: ['pods']
  verbs: ['get', 'list', 'watch']
- apiGroups: ['apps']
  resources: ['deployments', 'statefulsets', 'daemonsets', 'replicasets']
  verbs: ['get', 'list', 'update', 'patch']

Example Use Case

graph LR
    subgraph "Scenario: Daily Game Server Restart"
        A[Game Servers] --> |Labeled with<br/>restart=tf2| B[Pods Running]
        B --> C[Memory Leaks Over Time]
        C --> D[RestartController]
        D --> |Cron: 0 4 * * *| E[Restart at 4 AM Daily]
        E --> F[Fresh Pods]
        F --> G[Optimal Performance]
    end

    style A fill:#1e3a8a
    style D fill:#065f46
    style E fill:#92400e
    style G fill:#065f46

Development

Prerequisites

  • Go 1.25+
  • Kubernetes cluster (for testing)
  • kubectl configured

Build

# Build binary
go build -o bin/controller ./cmd/controller

# Build Docker image
docker build -t restart-controller:dev .

Logging

The controller uses structured logging via klog:

  • Info Level: Startup, cron triggers, successful restarts
  • Error Level: Failed operations, API errors
  • V(2): Detailed operation logs (annotations, scaling)

Safety Features

  1. Deduplication: Tracks already-restarted workloads to avoid multiple restarts in the same cycle
  2. Owner Resolution: Automatically resolves ReplicaSets to their parent Deployments
  3. Graceful Shutdown: Waits for in-progress jobs to complete before exiting
  4. Error Handling: Continues processing remaining pods if one fails

License

See LICENSE file for details.

About

The restart controller ensures all servers restart within a specified time.
