# Cilium Task Service

## Overview

The Cilium task service provides a complete installation and configuration of [Cilium](https://cilium.io/), a cloud-native networking, observability, and security solution built on eBPF. Cilium provides advanced networking features including load balancing, network policies, and service mesh capabilities for Kubernetes environments.

## Features

### Core Networking

- **eBPF-based Networking** - High-performance, programmable networking using eBPF
- **Container Network Interface (CNI)** - Full CNI plugin for Kubernetes
- **Load Balancing** - Layer 3/4 and Layer 7 load balancing
- **Network Address Translation (NAT)** - Advanced NAT capabilities
- **IP Address Management (IPAM)** - Flexible IP address allocation

### Security & Policy

- **Network Policies** - Kubernetes NetworkPolicy and CiliumNetworkPolicy
- **Identity-based Security** - Application-aware security policies
- **Encryption** - Transparent encryption with IPSec and WireGuard
- **Runtime Security** - Real-time threat detection and prevention
- **Service Mesh Security** - mTLS and authentication for service mesh

### Observability

- **Hubble** - Built-in network observability platform
- **Flow Monitoring** - Real-time network flow visibility
- **Service Map** - Visual service dependency mapping
- **Metrics & Monitoring** - Prometheus metrics integration
- **Distributed Tracing** - Jaeger integration for request tracing

### Advanced Features

- **Service Mesh** - Layer 7 proxy and service mesh capabilities
- **Multi-Cluster** - Cross-cluster connectivity and policy
- **Gateway API** - Support for Kubernetes Gateway API
- **BGP Support** - Border Gateway Protocol for advanced routing
- **Bandwidth Management** - Traffic shaping and QoS

## Configuration

### Basic Configuration

```kcl
cilium: Cilium = {
    name: "cilium"
    version: "1.15.0"
    cluster_name: "kubernetes"
    mode: "standard"
}
```

### Production Configuration

```kcl
cilium: Cilium = {
    name: "cilium"
    version: "1.15.0"
    cluster_name: "production-cluster"
    mode: "production"
    networking: {
        ipam: {
            mode: "kubernetes"
            cluster_pool_ipv4_cidr: "10.0.0.0/8"
            cluster_pool_ipv4_mask_size: 24
        }
        tunnel: "vxlan"
        native_routing_cidr: "10.0.0.0/8"
    }
    security: {
        network_policy: true
        host_firewall: true
        encryption: {
            enabled: true
            type: "ipsec"
        }
    }
    hubble: {
        enabled: true
        relay: {
            enabled: true
            replicas: 2
        }
        ui: {
            enabled: true
            ingress: {
                enabled: true
                hosts: ["hubble.company.com"]
            }
        }
    }
    operator: {
        replicas: 2
        resources: {
            limits: {
                cpu: "1000m"
                memory: "1Gi"
            }
            requests: {
                cpu: "100m"
                memory: "128Mi"
            }
        }
    }
    agent: {
        resources: {
            limits: {
                cpu: "4000m"
                memory: "4Gi"
            }
            requests: {
                cpu: "100m"
                memory: "512Mi"
            }
        }
    }
}
```

### Service Mesh Configuration

```kcl
cilium: Cilium = {
    name: "cilium"
    version: "1.15.0"
    # ... base configuration
    service_mesh: {
        enabled: true
        envoy: {
            enabled: true
            log_level: "info"
        }
        ingress: {
            enabled: true
            load_balancer_class: "cilium"
        }
        gateway_api: {
            enabled: true
            secret_namespace: "cilium-secrets"
        }
    }
    l7_proxy: true
    enable_l7_proxy_stats: true
    proxy_prometheus_port: 9964
}
```

### Multi-Cluster Configuration

```kcl
cilium: Cilium = {
    name: "cilium"
    version: "1.15.0"
    # ... base configuration
    cluster: {
        name: "cluster-1"
        id: 1
    }
    clustermesh: {
        enabled: true
        use_apiserver: true
        apiserver: {
            replicas: 3
            tls: {
                auto: {
                    enabled: true
                }
            }
        }
        config: {
            enabled: true
        }
    }
    external_workloads: {
        enabled: true
    }
}
```

### Advanced Security Configuration

```kcl
cilium: Cilium = {
    name: "cilium"
    version: "1.15.0"
    # ... base configuration
    security: {
        network_policy: true
        host_firewall: true
        encryption: {
            enabled: true
            type: "wireguard"
        }
        policy_enforcement: "default"
        host_protection: {
            enabled: true
            enforce: true
        }
        auth: {
            mutual: {
                spire: {
                    enabled: true
                    install: true
                }
            }
        }
    }
    bpf: {
        masquerade: true
        host_routing: true
        tproxy: true
    }
    enable_runtime_device_id: true
    enable_bandwidth_manager: true
}
```

## Usage

### Deploy Cilium

```bash
./core/nulib/provisioning taskserv create cilium --infra
```

### List Available Task Services

```bash
./core/nulib/provisioning taskserv list
```

### SSH to Cilium Server

```bash
./core/nulib/provisioning server ssh
```

### Service Management

```bash
# Check Cilium status
cilium status

# Check connectivity
cilium connectivity test

# Check cluster mesh status
cilium clustermesh status

# View Cilium configuration
cilium config view
```

### Network Policy Management

```bash
# List network policies
kubectl get networkpolicies --all-namespaces
kubectl get ciliumnetworkpolicies --all-namespaces

# Check policy enforcement per endpoint (runs inside a Cilium agent pod)
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint list

# View flows dropped by policy or the datapath
hubble observe --verdict DROPPED
```

### Hubble Observability

```bash
# Enable Hubble
cilium hubble enable

# Port forward to Hubble UI
cilium hubble ui

# Observe network flows
hubble observe

# List flows with filters
hubble observe --from-pod default/frontend --to-service default/backend

# List the nodes known to Hubble
hubble list nodes
```

### Troubleshooting Commands

```bash
# Check agent status
cilium status --verbose

# Validate installation
cilium connectivity test

# Check eBPF maps (runs inside a Cilium agent pod)
kubectl -n kube-system exec ds/cilium -- cilium-dbg map list

# View agent logs
kubectl logs -n kube-system -l k8s-app=cilium

# Debug connectivity (runs inside a Cilium agent pod)
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint list
kubectl -n kube-system exec ds/cilium -- cilium-dbg policy trace
```

## Architecture

### System Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Applications    │────│ Cilium Agent     │────│ eBPF Kernel     │
│                 │    │                  │    │                 │
│ • Pods          │    │ • CNI Plugin     │    │ • Network       │
│ • Services      │────│ • Policy Engine  │────│ • Security      │
│ • Ingress       │    │ • Load Balancer  │    │ • Observability │
│ • Gateway API   │    │ • Service Mesh   │    │ • Performance   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

### Component Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                       Cilium Platform                       │
├─────────────────────────────────────────────────────────────┤
│ Hubble (Observability) │ Service Mesh     │ Security        │
│                        │                  │                 │
│ • Flow Monitoring      │ • L7 Proxy       │ • Network       │
│ • Service Map          │ • Ingress        │   Policies      │
│ • Metrics              │ • Gateway API    │ • Encryption    │
│ • Distributed Tracing  │ • Load Balancing │ • Identity      │
├─────────────────────────────────────────────────────────────┤
│                  Cilium Agent (DaemonSet)                   │
├─────────────────────────────────────────────────────────────┤
│                eBPF Programs (Kernel Space)                 │
└─────────────────────────────────────────────────────────────┘
```

### Network Topology

- **Pod-to-Pod Communication** - Direct eBPF-based forwarding
- **Service Load Balancing** - In-kernel load balancing without kube-proxy
- **Ingress/Egress** - Gateway API and Ingress controller
- **Network Policies** - Identity-based security enforcement
- **Cross-Cluster** - Cluster mesh for multi-cluster networking

## Supported Operating Systems

- Ubuntu 20.04+ / Debian 11+
- CentOS 8+ / RHEL 8+ / Fedora 35+
- Amazon Linux 2+

## System Requirements

### Minimum Requirements

- **Kernel Version**: Linux 4.19.57+ (5.4+ recommended)
- **CPU**: 2 cores (4 cores recommended)
- **RAM**: 2GB (4GB+ recommended)
- **Architecture**: x86_64, arm64

### Production Requirements

- **Kernel Version**: Linux 5.4+
- **CPU**: 4+ cores
- **RAM**: 8GB+ (depends on cluster size)
- **Network**: 10Gbps+ for high-throughput workloads

### Kernel Features

- **eBPF JIT compiler** - Required for optimal performance
- **CONFIG_BPF=y** - eBPF support
- **CONFIG_BPF_SYSCALL=y** - eBPF syscall support
- **CONFIG_NET_CLS_BPF=m** - BPF classifier
- **CONFIG_BPF_JIT=y** - eBPF JIT compiler

## Troubleshooting

### Installation Issues

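The kernel minimums listed under System Requirements can be checked on a node before installing. The following is a minimal sketch and not part of the task service: the names `version_ge`, `kernel_version`, `check_kernel`, `MIN_KERNEL`, and `REC_KERNEL` are illustrative, and it assumes a `sort` that supports `-V`. It compares version strings only, so a distribution kernel that backports eBPF features under an older version number may be reported as unsupported even when it works.

```bash
#!/bin/sh
# Sketch: classify a kernel version against the documented minimums.
# All names below are illustrative assumptions, not Cilium tooling.

MIN_KERNEL="4.19.57"   # from "Minimum Requirements"
REC_KERNEL="5.4"       # recommended / production minimum

# version_ge A B: succeeds when version A >= version B.
# Relies on version sort (`sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the running kernel version without distribution suffixes,
# e.g. "5.15.0-91-generic" becomes "5.15.0".
kernel_version() {
    uname -r | sed 's/[^0-9.].*$//'
}

# check_kernel VERSION: print unsupported, minimum, or recommended.
check_kernel() {
    if ! version_ge "$1" "$MIN_KERNEL"; then
        echo "unsupported"
    elif ! version_ge "$1" "$REC_KERNEL"; then
        echo "minimum"
    else
        echo "recommended"
    fi
}

check_kernel "$(kernel_version)"
```

A result of `minimum` means the node meets the 4.19.57 floor but not the 5.4 recommendation for production.
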
```bash
# Check the agent version (cilium-dbg runs inside a Cilium agent pod)
kubectl -n kube-system exec ds/cilium -- cilium-dbg version

# Verify eBPF support
kubectl -n kube-system exec ds/cilium -- cilium-dbg status --verbose

# Check the health of all agent components
kubectl -n kube-system exec ds/cilium -- cilium-dbg status --all-health

# Review the active configuration
cilium config view
```

### Networking Issues

```bash
# Test connectivity between pods
cilium connectivity test

# Check endpoint status
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint list

# Debug policy enforcement
kubectl -n kube-system exec ds/cilium -- cilium-dbg policy trace

# Check service load balancing
kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
```

### Performance Issues

```bash
# Check eBPF datapath metrics
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf metrics list

# Monitor CPU and memory usage
kubectl top pods -n kube-system -l k8s-app=cilium

# Check for packet drops
kubectl -n kube-system exec ds/cilium -- cilium-dbg metrics list | grep drop

# Stream live flows while reproducing the issue
hubble observe --follow
```

### Policy Issues

```bash
# Check policy status (the table lists each endpoint's identity and
# its ingress/egress policy enforcement)
kubectl -n kube-system exec ds/cilium -- cilium-dbg endpoint list

# Debug policy enforcement
kubectl -n kube-system exec ds/cilium -- cilium-dbg policy trace --src-identity --dst-identity

# View applied policies
kubectl get ciliumnetworkpolicies -A
```

### Hubble Issues

```bash
# Check Hubble status (needs access to Hubble Relay,
# for example through `cilium hubble port-forward`)
hubble status

# Restart Hubble relay
kubectl rollout restart deployment/hubble-relay -n kube-system

# Check Hubble UI
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
```

## Security Considerations

### Network Security

- **Zero Trust Networking** - Default deny with explicit allow policies
- **Identity-based Security** - Cryptographic identity for all workloads
- **Encryption** - Transparent encryption with IPSec or WireGuard
- **Runtime Protection** - Real-time threat detection and response

### Policy Management

- **Least Privilege** - Implement minimal required network access
- **Segmentation** - Use network policies for micro-segmentation
- **Compliance** - Built-in compliance reporting and auditing
- **Threat Detection** - Continuous monitoring for suspicious activity

### Operational Security

- **RBAC Integration** - Kubernetes RBAC for policy management
- **Audit Logging** - Comprehensive audit trail for all network events
- **Certificate Management** - Automatic certificate rotation
- **Secure Defaults** - Security-first default configuration

## Performance Optimization

### eBPF Optimization

- **JIT Compilation** - Enable eBPF JIT for optimal performance
- **CPU Affinity** - Pin Cilium agents to specific CPU cores
- **Kernel Bypass** - Use XDP for ultra-low latency applications
- **Memory Management** - Tune eBPF map sizes for workload

### Network Performance

- **Native Routing** - Use native routing when possible
- **Hardware Offload** - Leverage NIC hardware acceleration
- **Bandwidth Management** - Configure traffic shaping and QoS
- **Connection Pooling** - Optimize connection reuse

### Monitoring Optimization

- **Selective Monitoring** - Monitor only critical flows
- **Metric Filtering** - Reduce metric cardinality
- **Sampling** - Use flow sampling for high-traffic environments
- **Resource Limits** - Set appropriate resource limits

## Integration Examples

### Prometheus Monitoring

```yaml
# ServiceMonitor for Cilium metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cilium-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: cilium
  endpoints:
  - port: prometheus
    interval: 30s
    path: /metrics
```

### Grafana Dashboard

```yaml
# Grafana dashboard configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-dashboard
data:
  cilium-overview.json: |
    {
      "dashboard": {
        "title": "Cilium Overview",
        "panels": [
          {
            "title": "Network Policy Drops",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(cilium_drop_count_total[5m])"
              }
            ]
          }
        ]
      }
    }
```

### Network Policy Examples

```yaml
# Example CiliumNetworkPolicy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
  - toEndpoints:
    - matchLabels:
        app: backend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
```

## Resources

- **Official Documentation**: [docs.cilium.io](https://docs.cilium.io)
- **GitHub Repository**: [cilium/cilium](https://github.com/cilium/cilium)
- **Hubble Documentation**: [docs.cilium.io/en/stable/observability/hubble](https://docs.cilium.io/en/stable/observability/hubble)
- **eBPF Documentation**: [ebpf.org](https://ebpf.org)
- **Community**: [cilium.slack.com](https://cilium.slack.com)