Optimizing Background Jobs in Ruby on Rails with KEDA Autoscaling
Introduction
In Ruby on Rails, background job processing with gems like Delayed Job or Sidekiq is essential for handling tasks such as sending emails, processing payments, and performing data-heavy operations in the background. However, these jobs can fluctuate in number, and managing the resources required to handle them efficiently can be challenging. If you over-provision resources, you waste money; if you under-provision, you risk performance bottlenecks. Autoscaling based on CPU or memory usage often falls short because job queues don’t directly correlate to these metrics. Enter KEDA (Kubernetes-based Event Driven Autoscaler), a tool that scales your Kubernetes pods based on event-driven metrics, such as job queue length, making it the perfect solution for handling delayed jobs in Rails.
Problem Statement
In most cloud environments, autoscaling is tied to CPU or memory usage. While this works well for some applications, it doesn’t align with how job queues behave in background processing systems like Delayed Job or Sidekiq. In these systems, job volumes can rise and fall unpredictably — sometimes hundreds of jobs can flood the queue, while at other times it sits almost empty. Scaling based on CPU or memory doesn’t account for these shifts, leading to inefficient resource allocation.
For example, if you have many jobs waiting but your CPU is underutilized, your system won’t scale up, causing delays in processing. On the other hand, during low job volumes, you may have too many worker pods running unnecessarily. This is where KEDA’s event-driven scaling, based on the actual job queue length, becomes invaluable. KEDA ensures that the number of worker pods dynamically adjusts in real time, directly aligned with the workload, preventing inefficiencies and ensuring timely job processing.
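The mismatch can be made concrete with a toy calculation. The numbers below (an 80% CPU target, 50 jobs per pod) are illustrative assumptions, not values from any real deployment:

```ruby
# Toy comparison: the same system snapshot as seen by a CPU-based autoscaler
# versus a queue-based one. I/O-bound workers can sit at low CPU while jobs
# pile up, so the CPU policy never reacts to the backlog.
def cpu_policy(current_pods, cpu_utilization, target: 0.8)
  # Standard HPA rule: scale proportionally to utilization vs target.
  [(current_pods * (cpu_utilization / target)).ceil, 1].max
end

def queue_policy(queue_length, jobs_per_pod: 50)
  # KEDA-style rule: roughly one pod per `jobs_per_pod` queued jobs.
  [(queue_length.to_f / jobs_per_pod).ceil, 1].max
end

# 400 jobs waiting, but workers are blocked on I/O at 20% CPU:
puts cpu_policy(2, 0.2) # => 1  (the CPU policy would even scale DOWN)
puts queue_policy(400)  # => 8  (the queue policy adds workers)
```

With identical conditions, the CPU-based rule shrinks the worker pool while the queue-based rule grows it, which is exactly the gap KEDA closes.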
Why KEDA?
KEDA addresses the limitations of traditional autoscaling by enabling scaling based on event-driven metrics rather than relying solely on CPU or memory usage. For workloads like background jobs in Ruby on Rails, which fluctuate based on queue length, traditional autoscaling isn’t always effective because job queues don’t directly impact CPU or memory metrics.
With KEDA, you can monitor the size of your job queue and dynamically adjust the number of worker pods in real time. When the number of jobs in the queue increases, KEDA automatically scales up worker pods to meet the demand. As the queue size shrinks, KEDA scales down the workers to prevent unnecessary resource usage.
This event-driven approach ensures that your infrastructure is responsive to actual workload demand, allowing for better performance and resource management, especially during traffic spikes or fluctuating job loads. KEDA seamlessly integrates with Kubernetes, making it an ideal solution for handling background jobs in applications like Ruby on Rails.
Step-by-Step Guide
Step 1: Create the Secret for the MySQL Connection String
To allow KEDA to access your MySQL database for scaling, you need to store the MySQL connection string securely in a Kubernetes Secret.
Create the Secret Manifest:
Here’s a sample manifest for the secret:
apiVersion: v1
kind: Secret
metadata:
  name: delayed-job-secret   # Name of the secret that stores the MySQL connection string
  namespace: worker          # The Kubernetes namespace where the secret is stored
  labels:
    app: delayed-job         # Label to identify the secret for the Delayed Job application
type: Opaque                 # Opaque type, used for storing sensitive data like connection strings
data:
  mysql_conn_str: <base64-encoded-mysql-connection-string>  # MySQL connection string, base64-encoded
Steps to Use:
1. Replace <base64-encoded-mysql-connection-string> with your actual MySQL connection string, base64-encoded. Example command to base64-encode your connection string:
echo -n 'username:password@tcp(mysql-host:3306)/database' | base64
2. Save the manifest as secret.yaml.
3. Apply the secret to your Kubernetes cluster with the following command:
kubectl apply -f secret.yaml
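If you'd rather generate the encoded value from Ruby (say, inside a Rake task) than from the shell, the standard library's Base64 module produces the same result. The connection string below is a placeholder, not a real credential:

```ruby
require "base64"

# Placeholder connection string; substitute your real credentials.
conn_str = "username:password@tcp(mysql-host:3306)/database"

# strict_encode64 emits no trailing newline, matching `echo -n ... | base64`.
encoded = Base64.strict_encode64(conn_str)
puts encoded
```

Paste the printed value into the mysql_conn_str field of the secret manifest.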
Step 2: Create the KEDA TriggerAuthentication
Next, create the TriggerAuthentication to reference the Secret for the MySQL connection string:
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: delayed-job-trigger-authentication
  namespace: worker
  labels:
    app: delayed-job
spec:
  secretTargetRef:
    - parameter: connectionString  # The scaler parameter this value is bound to
      name: delayed-job-secret     # The name of the secret created in Step 1
      key: mysql_conn_str          # The key in the secret that holds the MySQL connection string
Step 3: Create the KEDA ScaledObject
With the secret in place, create a ScaledObject that will use the MySQL connection string stored in the secret to scale the Delayed Job worker pods.
Here’s an example ScaledObject configuration:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: delayed-job-scaledobject
  namespace: worker
spec:
  scaleTargetRef:
    kind: Deployment
    name: delayed-job-deployment   # Your Delayed Job deployment
  minReplicaCount: 1
  maxReplicaCount: 30
  pollingInterval: 30              # Check the job queue every 30 seconds
  cooldownPeriod: 300              # Wait 5 minutes before scaling down
  triggers:
    - type: mysql
      metadata:
        query: "SELECT COUNT(*) AS count FROM `db-name`.`delayed_order_jobs` WHERE `queue` = 'some-queue-name';"
        queryValue: "50"           # Target number of queued jobs per worker pod
        activationQueryValue: "50" # Backlog required before the scaler activates
      authenticationRef:
        name: delayed-job-trigger-authentication  # Supplies connectionString from the secret in Step 1
Step 4: Apply the ScaledObject and TriggerAuthentication
Apply both the ScaledObject and TriggerAuthentication to your Kubernetes cluster:
kubectl apply -f triggerauthentication.yaml
kubectl apply -f scaledobject.yaml
This will configure KEDA to autoscale your Delayed Job workers based on the job queue length in the MySQL database.
Step 5: Monitor Scaling Behavior
Once everything is configured, monitor the scaling behavior to ensure that KEDA is properly scaling your Delayed Job worker pods:
- Kubernetes Dashboard: View the number of worker pods scaling up or down.
- kubectl: Use kubectl get pods to check the current status of your worker pods.
- Logs: Check logs for any potential issues in worker pod performance or scaling events.
Benefits of Using KEDA for Delayed Jobs
By implementing KEDA for autoscaling our Delayed Job workers, we gained several critical advantages:
- Performance Optimization: During high-demand periods, KEDA automatically scales our worker pods to handle the load, ensuring that jobs are processed quickly and efficiently, reducing the risk of bottlenecks in job execution.
- Improved Resource Utilization: KEDA ensures that worker pods are only scaled when necessary based on job queue metrics, leading to better resource allocation that aligns with real-time workload demands.
- Simplicity and Automation: With KEDA, there’s no need for manual intervention or complex autoscaling rules based on CPU or memory. Scaling is driven entirely by the job queue, making it an ideal fit for Rails applications with fluctuating background job volumes.
Best Practices for KEDA and Delayed Job
Set Realistic Min/Max Replicas:
Keep the minReplicaCount as low as possible to avoid idle resources, but ensure that the maxReplicaCount is high enough to handle peak loads. This ensures that your system is both cost-efficient and responsive during traffic spikes.
Tune Scaling Triggers (Adjust queryValue and activationQueryValue):
Fine-tune the queryValue (the threshold to scale up) and activationQueryValue (when scaling becomes active) based on your workload.
- queryValue: If your jobs are quick to process, consider raising the threshold (e.g., scale only when there are 100+ jobs in the queue). This prevents unnecessary scaling for small job volumes. Conversely, if your jobs take longer, lower the value (e.g., 50 jobs) to ensure that worker pods scale up promptly to handle the load.
- activationQueryValue: This sets the backlog required before the scaler is considered active at all. In KEDA, the activation threshold governs scaling between zero and minReplicaCount, so it only takes effect when minReplicaCount is 0. If you run workers scaled to zero and want them to stay there until a significant backlog builds up, raise this value (e.g., 100 jobs); if your jobs need to be processed more urgently, lower it to bring workers up earlier.
Finding the right balance between these values will help you optimize performance while avoiding over-provisioning.
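As a rough mental model (a simplification of the HPA math KEDA drives, not its exact algorithm), queryValue acts as a target number of jobs per pod, so the replica count the system converges toward is approximately the queue length divided by queryValue, rounded up and clamped to your min/max bounds:

```ruby
# Approximate steady-state replica count for a given backlog.
# queryValue behaves as a per-pod (AverageValue) target, so the result is
# roughly ceil(queue / query_value), clamped to the ScaledObject's
# minReplicaCount..maxReplicaCount range.
def desired_replicas(queue_length, query_value: 50, min_replicas: 1, max_replicas: 30)
  (queue_length.to_f / query_value).ceil.clamp(min_replicas, max_replicas)
end

puts desired_replicas(10)     # => 1   (small backlog stays at the minimum)
puts desired_replicas(500)    # => 10  (500 jobs / 50 jobs per pod)
puts desired_replicas(10_000) # => 30  (capped at maxReplicaCount)
```

Plugging candidate queryValue settings into a calculation like this makes it easy to sanity-check whether your maxReplicaCount can actually absorb your worst-case backlog.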
Monitor Regularly:
Even with KEDA in place, regularly monitor your job queue and pod scaling to ensure that you’re neither over- nor under-scaling. Tools like Prometheus and Grafana can help you track job queue metrics and scaling behavior to adjust your configuration as needed.
Conclusion
KEDA is a powerful tool for optimizing resource allocation in Ruby on Rails applications, especially those with fluctuating background job volumes. By scaling based on job queue length rather than CPU or memory usage, KEDA enables you to cut costs without compromising on performance. Whether you’re managing email delivery, payment processing, or other intensive tasks, KEDA ensures that your infrastructure scales precisely when needed — saving you money and improving efficiency.
Have you used KEDA for autoscaling in your Rails app? What challenges or successes have you had? Share your experience in the comments below! If you’re considering KEDA for your project and need help getting started, feel free to reach out.