Pod-Centric Load Balancing: Reducing Command Cancellations in Large-Scale Kubernetes Clusters via Health-Based Node Penalization

Anupam Ojha

doi:10.63282/3050-9262.IJAIDSML-V5I3P124

Authors

Anupam Ojha, Independent Researcher, Walnut Creek.

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V5I3P124

Keywords:

Kubernetes, Pod-Centric Load Balancing, Large-Scale Kubernetes Clusters, Health-Based Node Penalization, Command Cancellation Reduction, Cluster Scheduling, Node Health Monitoring, Pod Placement Optimization, Fault-Tolerant Scheduling, Load Distribution, Resource Allocation, Container Orchestration, Cluster Reliability, High Availability, Performance Optimization, Intelligent Scheduling, Node Penalization Strategy, Kubernetes Scheduler, Pod Scheduling Efficiency, Distributed Systems

Abstract

In large-scale Kubernetes environments, the default kube-proxy load-balancing mechanism, often based on random or round-robin distribution, fails to account for localized node degradation. This leads to a high frequency of "Command Cancellations" and 5xx errors when requests are routed to pods residing on "grey-failing" nodes. In this research, I propose a Pod-Centric Load Balancing framework integrated with the Istio service mesh. I introduce a novel Health-Based Node Penalization (HBNP) algorithm that dynamically adjusts traffic weights based on real-time node-level telemetry (CPU steal, I/O wait, and connection resets). My findings demonstrate that penalizing degraded nodes at the Envoy sidecar level can reduce command cancellations by 78% in clusters exceeding 1,000 nodes.
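The abstract names the inputs to HBNP (CPU steal, I/O wait, connection resets) but does not give the scoring formula. The following is only a minimal sketch of the general idea, assuming each signal is normalized to the range 0 to 1 and combined linearly. The names `NodeTelemetry`, `COEFFS`, `BASE_WEIGHT`, `MIN_WEIGHT`, `penalty`, and `pod_weights` are hypothetical, as are the coefficient values. How the resulting weights would be delivered to the Envoy sidecars is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class NodeTelemetry:
    """Node-level signals named in the abstract, each assumed normalized to [0, 1]."""
    cpu_steal: float        # fraction of CPU time stolen by the hypervisor
    io_wait: float          # fraction of CPU time spent waiting on I/O
    conn_reset_rate: float  # fraction of connections that were reset


# Hypothetical coefficients: the abstract does not say how the signals are combined.
COEFFS = {"cpu_steal": 0.4, "io_wait": 0.3, "conn_reset_rate": 0.3}
BASE_WEIGHT = 100  # weight given to a pod on a fully healthy node (assumption)
MIN_WEIGHT = 1     # keep a trickle of traffic so recovery stays observable (assumption)


def penalty(t: NodeTelemetry) -> float:
    """Combine the three signals into a single penalty clamped to [0, 1]."""
    score = (
        COEFFS["cpu_steal"] * t.cpu_steal
        + COEFFS["io_wait"] * t.io_wait
        + COEFFS["conn_reset_rate"] * t.conn_reset_rate
    )
    return min(1.0, max(0.0, score))


def pod_weights(
    pod_to_node: dict[str, str],
    telemetry: dict[str, NodeTelemetry],
) -> dict[str, int]:
    """Give each pod a traffic weight scaled down by the penalty of its host node."""
    weights = {}
    for pod, node in pod_to_node.items():
        p = penalty(telemetry[node])
        weights[pod] = max(MIN_WEIGHT, round(BASE_WEIGHT * (1.0 - p)))
    return weights


if __name__ == "__main__":
    telemetry = {
        "node-a": NodeTelemetry(cpu_steal=0.0, io_wait=0.0, conn_reset_rate=0.0),
        "node-b": NodeTelemetry(cpu_steal=0.5, io_wait=0.5, conn_reset_rate=0.5),
    }
    pods = {"pod-1": "node-a", "pod-2": "node-b"}
    print(pod_weights(pods, telemetry))
```

The floor at `MIN_WEIGHT` is a design choice in this sketch rather than something stated in the abstract: a degraded node that receives no traffic at all produces no fresh connection-reset data, so its recovery would be harder to detect.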

