DIGIT Core
PlatformDomainsAcademyDesign SystemFeedback
2.8
2.8
  • ☑️Introducing DIGIT Platform
    • DIGIT - Value Proposition
  • Platform
    • 🔎Overview
      • Principles
      • Architecture
        • Service Architecture
        • Infrastructure Architecture
        • Deployment Architecture
      • Technology
        • API Gateway
        • Open Source Tools
      • Checklists
        • API Checklist
        • Security Checklist
          • Security Guidelines Handbook
          • Security Flow - Exemplar
        • Performance Checklist
        • Deployment Checklist
      • UI Frameworks
        • React UI Framework
    • 🔧Core Services
      • Workflow Service
        • Setting Up Workflows
        • Configuring Workflows For An Entity
        • Workflow Auto Escalation
        • Migration To Workflow 2.0
      • Location Services
      • User Services
      • Access Control Services
      • PDF Generation Service
      • MDMS (Master Data Management Service)
        • Setting up Master Data
          • MDMS Overview
          • MDMS Rewritten
          • Configuring Tenants
          • Configuring Master Data
          • Adding New Master
          • State Level Vs City Level Master
      • Payment Gateway Service
      • User Session Management
      • Indexer Service
        • Indexer Configuration
      • URL Shortening Service
      • XState Core Chatbot
        • Xstate-Chatbot Message Localisation
        • XState-Chatbot Integration Document
      • NLP Engine Service
        • NLP Chatbot
      • SMS Template Approval Process
      • Telemetry Service
      • Document Uploader Service
      • Notification Enhancement Based On Different Channel
      • Report Service
        • Configuring New Reports
          • Impact Of Heavy Reports On Platform
          • Types Of Reports Used In Report Service
      • SMS Notification Service
        • Setting Up SMS Gateway
          • Using The Generic GET & POST SMS Gateway Interface
      • Survey Service
      • Persister Service
        • Persister Configuration
      • Encryption Service
        • Encryption Client Library
        • User Data Security Architecture
        • Guidelines for supporting User Privacy in a module
      • FileStore Service
      • ID Generation Service
      • Localization Service
        • Configuring Localization
          • Setup Base Product Localization
          • Configure SMS and Email
      • Email Notification Service
      • Searcher Service
      • Zuul Service
      • User OTP Service
      • OTP Service
      • Chatbot Service
      • National Dashboard Ingest
        • National Dashboard API Performance Testing Specs and Benchmark
        • National Dashboard: Steps for Index Creation
        • National Dashboard Adaptor Service
          • Deployment of Airflow DAG
          • Trigger Airflow DAG
          • Configure Airflow
          • Insert & Delete Data - Steps
          • Important Links & Credentials
          • Code Structure
          • KT Sessions
          • Pre-requisites For Enabling Adaptor
        • Revenue Maximisation
      • Audit Service
        • Signed Audit Performance Testing Results
      • Service Request
      • Self Contained Service Architecture (HLD)
      • Accelerators
        • Inbox Service
    • ✏️API Specifications
      • User
      • Access Control
      • Employee
      • Location
      • Localisation
      • Encryption
      • Indexer
      • File Store
      • Collection
      • DSS Ingest
      • HRMS
      • National Dashboard Ingest
      • WhatsApp Chatbot
      • Master Data Management
      • ID Generation
      • URL Shortner
      • Workflow Service
      • Workflow v2
      • Document Uploader Service
      • OTP Service
      • Reporting Service
      • PDF Generation Service
      • Payment Gateway Service
    • 🔐Data Protection & Privacy
      • Data Protection & Privacy Definitions
      • Legal Obligations For Privacy - eGov
      • Data Protection & Privacy - Global Best Practices
      • Guidelines
        • Platform Owner Guidelines
        • Implementing Agencies Guidelines
        • Admin Guidelines
        • Program Owner Guidelines
        • Data Security and Data Privacy
      • Data Privacy Policy Templates
        • eGov Data Privacy Policy
        • Implementing Agency Privacy Policy
        • Admin & Program Owner Privacy Policy
        • Supporting Agency Privacy Policy
      • Global Standards For All Roles
    • ▶️Get Started
      • Install DIGIT
      • Access DIGIT
      • Sandbox
      • Training and Certification
        • Training Resources
    • ⚒️Integrations
      • Payment
      • Notification
      • Transaction
      • Verification
      • View
      • Calculation
    • 🛣️Roadmap
    • 🎬Open Events
    • 👩‍💻Source Code
    • 👁️Project Plan
    • 📋Discussion Board
    • 🤝Contribute
  • Guides
    • 📓Installation Guide
      • DIGIT Deployment
      • Quick Setup
        • DIGIT Installation on Azure
        • DIGIT Installation on AWS
      • Production Setup
        • AWS
          • 1. Pre-requisites
          • 2. Understanding EKS
          • 3. Setup AWS Account
          • 4. Provisioning Infra Using Terraform
          • 5. Prepare Deployment Config
          • 6. Deploy DIGIT
          • 7. Bootstrap DIGIT
          • 8. Productionize DIGIT
          • FAQ
        • Azure
          • 1. Azure Pre-requisites
          • 2. Understanding AKS
          • 3. Infra-as-code (Terraform)
        • SDC
          • 1. SDC Pre-requisites
          • 2. Infra-as-code (Kubespray)
          • CI/CD Setup On SDC
        • CI/CD Set Up
          • CI/CD Build Job Pipeline Setup
        • Prepare Helm Release Chart
        • Deployment - Key Concepts
          • Security Practices
          • Readiness & Liveness
          • Resource Requests & Limits
          • Deploying DIGIT Services
          • Deployment Architecture
          • Routing Traffic
          • Backbone Deployment
    • 💽Data Setup Guide
      • User Module
      • Localisation Module
      • Location Module
    • 🚥Design Guide
      • Model Requirements
      • Design Services
      • Design User Interface
      • Checklists
    • ⚒️Developer Guide
      • Pre-requisites Training Resources
      • Backend Developer Guide
        • Section 0: Prep
          • Development Pre-requisites
          • Design Inputs
            • High Level Design
            • Low Level Design
          • Development Environment Setup
        • Section 1: Create Project
          • Generate Project Using API Specs
          • Create Database
          • Configure Application Properties
          • Import Core Models
          • Implement Repository Layer
          • Create Validation & Enrichment Layers
          • Implement Service Layer
          • Build The Web Layer
        • Section 2: Integrate Persister & Kafka
          • Add Kafka Configuration
          • Implement Kafka Producer & Consumer
          • Add Persister Configuration
          • Enable Signed Audit
          • Run Application
        • Section 3: Integrate Microservices
          • Integrate IDGen Service
          • Integrate User Service
          • Add MDMS Configuration
          • Integrate MDMS Service
          • Add Workflow Configuration
          • Integrate Workflow Service
          • Integrate URL Shortener Service
        • Section 4: Integrate Billing & Payment
          • Custom Calculator Service
          • Integrate Calculator Service
          • Payment Back Update
        • Section 5: Other Advanced Integrations
          • Add Indexer Configuration
          • Certificate Generation
        • Section 6: Run Final Application
        • Section 7: Build & Deploy Instructions
        • FAQs
      • Flutter UI Developer Guide
        • Introduction to Flutter
          • Flutter - Key Features
          • Flutter Architecture & Approach
          • Flutter Pre-Requisites
        • Setup Development Environment
          • Flutter Installation & Setup Guide
          • Setup Device Emulators/Simulators
          • Run Application
        • Build User Interfaces
          • Create Form Screen
        • Build Deploy & Publish
          • Build & Deploy Flutter Web Application
          • Generate Android APKs & App Bundles
          • Publishing App Bundle To Play Store
        • State Management With Provider & Bloc
          • Provider State Management
          • BloC State Management
        • Best Practices & Tips
        • Troubleshooting
      • UI Developer Guide
        • DIGIT-UI
        • Android Web View & How To Generate APK
        • DIGIT UI Development Pre-requisites
        • UI Configuration (DevOps)
        • Local Development Setup
        • Run Application
        • Create New Screen In DIGIT-UI
          • Create Screen (FormComposer)
          • Inbox/Search Screen
          • Workflow Component
        • Customisation
          • Integrate External Web Application/UI With DIGIT UI
          • Utility - Pre-Process MDMS Configuration
          • CSS Customisation
        • Citizen Module Setup
          • Sample screenshots
          • Project Structure
          • Install Dependency
          • Import Required Components
          • Write Citizen Module Code
          • Citizen Landing Screen
        • Employee Module Setup
          • Write Employee Module Code
        • Build & Deploy
        • Setup Monitoring Tools
        • FAQs
          • Troubleshoot Using Browser Network Tab
          • Debug Android App Using Chrome Browser
    • 🔄Operations Guide
      • DIGIT - Infra Overview
      • Setup Central Instance Infra
      • Central Monitoring Dashboard Setup
      • Kubernetes
        • RBAC Management
        • DB Dump - Playground
      • Setup Jenkins - Docker way
      • GitOps
        • Git Client installation
        • GitHub organization creation
        • Adding new SSH key to it
        • GitHub repo creation
        • GitHub Team creation
        • Enabling Branch protection:
        • CODEOWNER Reviewers
        • Adding Users to the Git
        • Setting up an OAuth with GitHub
        • Fork (Fork the mdms,config repo with a tenant-specific branch)
      • Working with Kubernetes
        • Installation of Kubectl
      • Containerizing application using Docker
        • Creation of Dockerhub account
      • Infra provisioning using Terraform
        • Installation of Terraform
      • Customization of existing tf templates
      • Cert-Manager
        • Obtaining SSL certificates with the help of cluster-issuer
      • Moving Docker Images
      • Pre and post deployment checklist
      • Multi-tenancy Setup
      • Availability
        • Infrastructure
        • Backbone services
          • Database
          • Kafka
          • Kafka Connect
          • Elastic search
            • ElasticSearch Direct Upgrade
            • Elastic Search Rolling Upgrade
        • Core services
        • DIGIT apps
        • DSS dashboard
      • Observability
        • ES-Curator to clear old logs/indices
        • Monitoring
        • Tracing
        • Jaeger Tracing Setup
        • Logging
        • eGov Monitoring & Alerting Setup
        • eGov Logging Setup
      • Performance
        • What to monitor?
          • Infrastructure
          • Backbone services
          • Core services
        • Identifying bottlenecks
        • Solutions
      • Handling errors
      • Security
      • Reliability and disaster recovery
      • Privacy
      • Skillsets/hiring
      • Incident management processes
      • Kafka Troubleshooting Guide
        • How to clean up Kafka logs
        • How to change or reset consumer offset in Kafka?
      • SRE Rituals
      • FAQs
        • I am unable to login to the citizen or employee portal. The UI shows a spinner.
        • My DSS dashboard is not reflecting accurate numbers? What can I do?
      • Deployment using helm
        • Helm installation:
        • Helm chart creation
        • Helm chart customization
      • How to Dump Elasticsearch Indexes
      • Deploy Nginx-Ingress-Controller
      • Deployment Job Pipeline Setup
      • OAuth2-Proxy Setup
      • Jira Ticket Creation
  • Reference
    • 👉Setup Basics
      • Setup Requirements
        • Tech Enablement Training - Essential Skills and Pre-requisites
        • Tech Enablement Training (eDCR) - Essential Skills and Prerequisites
          • Development Control Rules (Digit-DCR)
          • eDCR Approach Guide
        • DIGIT Rollout Program Governance
        • DevOps Skills Requirements
        • Infra Requirements
        • Team Composition for DIGIT Implementation
        • Infra Best Practices
        • Operational Best Practices
        • Why Kubernetes For DIGIT
      • Supported Clouds
        • Google Cloud
        • Azure
        • AWS
        • VSphere
        • SDC
      • Deployment - Key Concepts
        • Security Practices
        • CI/CD
        • Readiness & Liveness
        • Resource Requests & Limits
      • Understanding ERP Stack
        • ERP Monolithic Architecture
        • ERP Hybrid Architecture
        • ERP Coexistence Architecture
        • APMDP-HYBRID-INFRA ARCHITECTURE
        • eGov SmartCity eGovernance Suite
        • ERP Deployment Process
        • ERP Release Process
        • ERP User Guide
      • Deploying DIGIT Services
        • Deployment Architecture
        • Routing Traffic
        • Backbone Deployment
      • Troubleshooting
        • Distributed Tracing
        • Logging
        • Monitoring & Alerts
    • 📥Reference Reads
      • Analytics
      • DevSecOps
      • Low Code No Code
        • Application Specification
      • Beneficiary Eligibility
      • Government and Open Digital Platforms
      • Microservices and Low Code No Code
      • Registries
      • Platform Orientation - Overview
    • 🔏Data Security
      • Signed Data Audit
      • Encryption Techniques
      • Approaches to handle Encrypted Data
    • ❕Privacy
    • 🕹️DevOps
      • 1. How DNS works
      • 2. Load Balancer
      • 3. SSL/Cert-manager
      • 4.Ingress,WAF
      • 5.VPC
      • 6.Subnets
      • 7.EKS
      • 8.Worker Node Group
      • 9.RDS
      • 10.NAT
      • 11.Internet Gateway
      • 12.Block Storage (EBS Volumes)
      • 13.Object Storage (S3)
      • 14. Telemetry
Powered by GitBook

All content on this page by eGov Foundation is licensed under a Creative Commons Attribution 4.0 International License.

On this page
  • Probes Overview
  • Kubernetes Probes
  • Readiness Probes
  • Liveness Probe
  • Startup Probes
  • Configuring Probe Actions
  • HTTP
  • TCP
  • Command
  • Best Practices
  • Tools

Was this helpful?

  1. Reference
  2. Setup Basics
  3. Deployment - Key Concepts

Readiness & Liveness

Overview of various probes that we can setup to ensure the service deployment and the availability of the service is ensured automatically.

PreviousCI/CDNextResource Requests & Limits

Last updated 2 years ago

Was this helpful?

Probes Overview

Determining the state of a service based on readiness, liveness, and startup to detect and deal with unhealthy situations. It may happen that if the application needs to initialize some state, make database connections, or load data before handling application logic. This gap in time between when the application is actually ready versus when Kubernetes thinks is ready becomes an issue when the deployment begins to scale and unready applications receive traffic and send back 500 errors.

Many developers assume that when basic pod setup is adequate, especially when the application inside the pod is configured with daemon process managers (e.g. PM2 for Node.js). However, since Kubernetes deems a pod as healthy and ready for requests as soon as all the containers start, the application may receive traffic before it is actually ready.

Kubernetes Probes

Kubernetes supports readiness and liveness probes for versions ≤ 1.15. Startup probes were added in 1.16 as an alpha feature and graduated to beta in 1.18 (WARNING: 1.16 deprecated several Kubernetes APIs. Use this to check for compatibility).

All probes have the following parameters:

  • initialDelaySeconds : number of seconds to wait before initiating liveness or readiness probes

  • periodSeconds: how often to check the probe

  • timeoutSeconds: number of seconds before marking the probe as timing out (failing the health check)

  • successThreshold : minimum number of consecutive successful checks for the probe to pass

  • failureThreshold : number of retries before marking the probe as failed. For liveness probes, this will lead to the pod restarting. For readiness probes, this will mark the pod as unready.

Readiness Probes

Readiness probes are used to let kubelet know when the application is ready to accept new traffic. If the application needs some time to initialize the state after the process has started, configure the readiness probe to tell Kubernetes to wait before sending new traffic. A primary use case for readiness probes is directing traffic to deployments behind a service.

Kubernetes Readiness Probe

One important thing to note with readiness probes is that it runs during the pod’s entire lifecycle. This means that readiness probes will run not only at startup but repeatedly throughout as long as the pod is running. This is to deal with situations where the application is temporarily unavailable (i.e. loading large data, waiting on external connections). In this case, we don’t want to necessarily kill the application but wait for it to recover. Readiness probes are used to detect this scenario and not send traffic to these pods until it passes the readiness check again.

Liveness Probe

On the other hand, liveness probes are used to restart unhealthy containers. The kubelet periodically pings the liveness probe, determines the health, and kills the pod if it fails the liveness check. Liveness checks can help the application recover from a deadlock situation. Without liveness checks, Kubernetes deems a deadlocked pod healthy since the underlying process continues to run from Kubernetes’s perspective. By configuring the liveness probe, the kubelet can detect that the application is in a bad state and restarts the pod to restore availability.

Startup Probes

Startup probes are similar to readiness probes but only executed at startup. They are optimized for slow starting containers or applications with unpredictable initialization processes. With readiness probes, we can configure the initialDelaySeconds to determine how long to wait before probing for readiness. Now consider an application where it occasionally needs to download large amounts of data or do an expensive operation at the start of the process. Since initialDelaySeconds is a static number, we are forced to always take the worst-case scenario (or extend the failureThresholdthat may affect long-running behaviour) and wait for a long time even when that application does not need to carry out long-running initialization steps. With startup probes, we can instead configure failureThreshold and periodSeconds to model this uncertainty better. For example, setting failureThreshold to 15 and periodSeconds to 5 means the application will get 10 x 5 = 75s to startup before it fails.

Configuring Probe Actions

Now that we understand the different types of probes, we can examine the three different ways to configure each probe.

HTTP

The kubelet sends an HTTP GET request to an endpoint and checks for a 2xx or 3xx response. You can reuse an existing HTTP endpoint or set up a lightweight HTTP server for probing purposes (e.g. an Express server with /healthz endpoint).

HTTP probes take in additional parameters:

  • host : hostname to connect to (default: pod’s IP)

  • scheme : HTTP (default) or HTTPS

  • path : path on the HTTP/S server

  • httpHeaders : custom headers if you need header values for authentication, CORS settings, etc

  • port : name or number of the port to access the server

livenessProbe:
   httpGet:
     path: /healthz
     port: 8080

TCP

If you just need to check whether or not a TCP connection can be made, you can specify a TCP probe. The pod is marked healthy if can establish a TCP connection. Using a TCP probe may be useful for a gRPC or FTP server where HTTP calls may not be suitable.

readinessProbe:
   tcpSocket:
     port: 21

Command

Finally, a probe can be configured to run a shell command. The check passes if the command returns with exit code 0; otherwise, the pod is marked as unhealthy. This type of probe may be useful if it is not desirable to expose an HTTP server/port or if it is easier to check initialization steps via command (e.g. check if a configuration file has been created, run a CLI command).

readinessProbe:
   exec:
     command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]

Best Practices

The exact parameters for the probes depend on your application, but here are some general best practices to get started:

  • For older (≤ 1.15) Kubernetes clusters, use a readiness probe with an initial delay to deal with the container startup phase (use p99 times for this). But make this check lightweight, since the readiness probe will execute throughout the entire lifecycle of the pod. We don’t want the probe to time out because the readiness check takes a long time to compute.

  • For newer (≥ 1.16) Kubernetes clusters, use a startup probe for applications with unpredictable or variable startup times. The startup probe may share the same endpoint (e.g. /healthz ) as the readiness and liveness probes, but set the failureThreshold higher than the other probes to account for longer start times, but more reasonable time to failure for liveness and readiness checks.

  • Readiness and liveness probes may share the same endpoint if the readiness probes aren’t used for other signalling purposes. If there’s only one pod (i.e. using a Vertical Pod Autoscaler), set the readiness probe to address the startup behaviour and use the liveness probe to determine health. In this case, marking the pod unhealthy means downtime.

  • Readiness checks can be used in various ways to signal system degradation. For example, if the application loses connection to the database, readiness probes may be used to temporarily block new requests and allow the system to reconnect. It can also be used to load balance work to other pods by marking busy pods as not ready.

In short, well-defined probes generally lead to better resilience and availability. Be sure to observe the startup times and system behaviour to tweak the probe settings as the applications change.

Tools

Finally, given the importance of Kubernetes probes, you can use a Kubernetes resource analysis tool to detect missing probes. These tools can be run against existing clusters or be baked into the CI/CD process to automatically reject workloads without properly configured resources.

Kubernetes Livenss Probe

​: a resource analysis tool with a nice dashboard that can also be used as a validating webhook or CLI tool.

​: a static code analysis tool that works with Helm, Kustomize, and standard YAML files.

​: read-only utility tool that scans Kubernetes clusters and reports potential issues with configurations.

👉
Polaris
Kube-score
Popeye
migration guide