Airflow at ASAPP: Enhancing AI-Powered Contact Centers (2024)

24 Oct, 2025

(As Airflow Summit 2025 wraps up, this is a repost of my presentation at last year's Airflow Summit 2024 in San Francisco)

ASAPP is the definitive generative AI company for contact centers. Named not only a leader in The Forrester Wave™: Digital Customer Interaction Solutions, Q2 2024 report, but also “this market’s undisputed leader in AI-led innovation,” ASAPP’s Generative Agent creates transformational automation outcomes for our customers, and our AI-powered tools ensure agents at contact centers are equipped to exceed customer expectations.

Throughout its history, ASAPP has been at the forefront of adopting and implementing cutting-edge machine learning algorithms to enhance agent performance. From classical models for classification and automated suggestions to large language models now powering speech-to-text transcription, generative summarization of customer calls, and agent augmentation through a fully conversational AI voice and chat agent, ASAPP has consistently leveraged state-of-the-art technology to deliver superior results.

Using tools like GenerativeAgent, AutoTranscribe, and AutoSummary, ASAPP’s investment in generative AI allows companies to handle complex, free-form interactions that couldn’t be automated with prior technology.

At ASAPP, we manage a wide array of machine learning models across our NLP and Speech teams, which require frequent training and fine-tuning based on various customer-specific and model-level metrics. Our approach to model management is characterized by:

Continuous Improvement:

  • Frequent training and fine-tuning based on customer-specific and model-level metrics using ML pipelines built on open source tooling
  • Integration into larger workflows with automated triggering mechanisms
  • Emphasis on reliability for mission-critical applications

Diverse data processing pipelines:

  • Different storage and analytics backends like S3, Spark, Flink, Trino, Snowflake, Cassandra, AWS Athena, and Redshift
  • Different sources of data, like raw customer data as well as application-specific data
  • Timely data preparation for engineering and research workflows

Reliable and scalable pipelines for constantly changing technologies:

  • Moving seamlessly from classical machine learning models to generative AI models
  • Handling real-time processing workflows while retaining the ability to shift a new pipeline to batch processing

Additionally, we strive to provide an easy-to-use, scalable, and reliable development experience, catering to our typical users: researchers, machine learning engineers, product engineers, and data engineers.

As one of the early adopters of Airflow, we iterated and improved consistently. This allowed us to support an increasing number of users leveraging Airflow to write automated pipelines for various use cases.

Initially, we faced challenges from a highly coupled interface between our DAG code and cluster management. Making our DAG code independent of core Airflow environment management vastly improved the developer experience: a code deployment that previously took an hour and several builds now takes a few minutes. We also integrated Airflow with dynamically provisioned compute on our underlying Kubernetes infrastructure. Our current setup uses the git-sync DAG deployment pattern, and our default execution framework now relies almost exclusively on the KubernetesExecutor for scheduling tasks.
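
To make this concrete, here is a minimal sketch of how a task can request its own resources under the KubernetesExecutor via executor_config and a pod override; the DAG, image, and resource values are illustrative, not our actual configuration.

```python
# A minimal sketch (not our actual configuration) of a task requesting its own
# resources under the KubernetesExecutor via executor_config / pod_override.
from datetime import datetime

from kubernetes.client import models as k8s

from airflow import DAG
from airflow.operators.python import PythonOperator


def train():
    print("This step runs in its own dynamically provisioned pod.")


with DAG(
    dag_id="executor_config_example",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="train",
        python_callable=train,
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(
                            name="base",  # "base" targets the task's main container
                            resources=k8s.V1ResourceRequirements(
                                requests={"cpu": "2", "memory": "4Gi"},
                            ),
                        )
                    ]
                )
            )
        },
    )
```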

This lets us use our underlying Kubernetes infrastructure to manage workload execution and resource allocation while keeping our development iteration layer thin and nimble. A frequent execution pattern is a KubernetesPodOperator running Python- or Scala-based processing, providing an isolated environment per workflow. Developers stay independent in how they design their pipelines while still benefiting from Airflow's mature and sophisticated orchestration capabilities.
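
The pattern looks roughly like the sketch below; the namespace, image, and module are hypothetical stand-ins for a team-owned workflow, and the import path matches recent releases of the cncf-kubernetes provider.

```python
# A hedged sketch of the isolated-pod pattern; the namespace, image, and module
# are hypothetical stand-ins for a team-owned workflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="isolated_processing",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    KubernetesPodOperator(
        task_id="process_batch",
        name="process-batch",
        namespace="data-pipelines",                      # hypothetical namespace
        image="registry.example.com/nlp/processor:1.4",  # per-workflow image
        cmds=["python", "-m", "processor.run"],
        arguments=["--date", "{{ ds }}"],                # Airflow-templated run date
        get_logs=True,
    )
```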

Since switching to Airflow, developers have seen processing times drop from a few months to less than a week. For massively parallel workloads, our custom Spark solution managed by Airflow has improved processing times by an order of magnitude.

At ASAPP, Airflow plays a pivotal role in the operational lifecycle of our machine learning models, underpinning the technology that drives our product. Here are some ways we use Airflow:

Data Ingestion and Data Preparation

All ASAPP data is first ingested into data lakes via real-time Spark applications and Golang applications. This data is then processed, computed, cleaned, and sorted before being distributed to various sinks and computation mechanisms, including Spark batch, Flink, SQL databases, AWS Redshift, Snowflake, MySQL, AWS Athena, S3, SFTP, and third-party APIs. We manage and process over a million Airflow tasks daily across more than 5,000 Airflow DAGs to orchestrate our ingestion and data processing pipelines in all our environments.
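
As a hedged illustration of how task counts multiply, a single DAG can fan one prepared dataset out to many sinks with dynamic task mapping, so each sink load becomes its own independently retryable task; the sink names and paths below are placeholders, not our actual ingestion code.

```python
# A hedged illustration (placeholder names throughout) of fanning one prepared
# dataset out to several sinks with dynamic task mapping.
from datetime import datetime

from airflow.decorators import dag, task

SINKS = ["redshift", "snowflake", "athena", "s3_archive"]  # illustrative sinks


@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def ingest_and_distribute():
    @task
    def prepare(ds=None) -> str:
        # Clean and sort the raw batch; return a pointer to the prepared data.
        return f"s3://example-bucket/prepared/{ds}/"

    @task
    def load(sink: str, prepared_path: str):
        print(f"loading {prepared_path} into {sink}")

    load.partial(prepared_path=prepare()).expand(sink=SINKS)


ingest_and_distribute()
```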

Data Retention Policy Enforcement

At ASAPP, we manage data from diverse sources, each governed by its own retention policy. Without Airflow's automation, we would have to monitor retention manually through a mix of tools, which would be cumbersome and error-prone. With Airflow, we periodically check and enforce these policies: data retention is systematically monitored, and appropriate actions are taken to adhere to the specified retention duration for each dataset.
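
A minimal sketch of this kind of enforcement DAG, assuming S3-backed datasets and the Amazon provider's S3Hook; the bucket, prefixes, and retention windows are hypothetical.

```python
# A minimal sketch of scheduled retention enforcement, assuming S3-backed
# datasets; the bucket, prefixes, and retention windows are hypothetical.
from datetime import datetime, timedelta, timezone

from airflow.decorators import dag, task

RETENTION_DAYS = {  # dataset prefix -> allowed age in days (illustrative)
    "raw/conversations/": 90,
    "derived/transcripts/": 30,
}


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def enforce_retention():
    @task
    def purge(prefix: str, max_age_days: int):
        # Requires the Amazon provider; get_conn() yields a boto3 S3 client.
        from airflow.providers.amazon.aws.hooks.s3 import S3Hook

        client = S3Hook().get_conn()
        cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
        paginator = client.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket="example-bucket", Prefix=prefix):
            for obj in page.get("Contents", []):
                if obj["LastModified"] < cutoff:
                    client.delete_object(Bucket="example-bucket", Key=obj["Key"])

    for prefix, days in RETENTION_DAYS.items():
        purge.override(task_id=f"purge_{prefix.strip('/').replace('/', '_')}")(prefix, days)


enforce_retention()
```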

Sampling Production Data

We run scheduled DAGs to collect and process production data for our various applications. This is useful for a variety of downstream pipelines like scoring, drift detection, and retraining and fine-tuning.
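
A simple sketch of such a sampling DAG might look like the following; the query, sample rate, and output path are hypothetical stand-ins.

```python
# A simple sketch of a daily sampling DAG; the query, sample rate, and output
# path are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def sample_production_traffic():
    @task
    def sample(ds=None) -> str:
        # e.g. pull a small sample of the day's interactions from the warehouse
        query = f"SELECT * FROM interactions TABLESAMPLE (1) WHERE dt = '{ds}'"
        output_path = f"s3://example-bucket/samples/{ds}/"
        print(query, "->", output_path)
        return output_path

    @task
    def publish(path: str):
        # hand the sample off to scoring, drift-detection, and retraining DAGs
        print(f"sample ready at {path}")

    publish(sample())


sample_production_traffic()
```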

Model Training/Fine-Tuning

We use Airflow to launch model training and fine-tuning jobs for our customer models using our internal model training framework. This covers a variety of use cases, such as launching a new model or periodically updating production models on newly sampled production traffic. It can also drive hyperparameter sweeps for more exploratory, green-field research models.
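
For the sweep case, Airflow's dynamic task mapping makes the fan-out natural. Below is a minimal sketch; the hyperparameter grid and training body are placeholders for our internal training framework.

```python
# A minimal sketch of a fine-tuning sweep with dynamic task mapping; the grid
# and training body are placeholders, not our internal training framework.
from datetime import datetime

from airflow.decorators import dag, task

GRID = [  # illustrative hyperparameter grid
    {"lr": 1e-5, "epochs": 3},
    {"lr": 3e-5, "epochs": 3},
    {"lr": 1e-4, "epochs": 2},
]


@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
def finetune_sweep():
    @task
    def train(params: dict) -> dict:
        # launch one fine-tuning run and return its eval metric (placeholder)
        print(f"training with {params}")
        return {"params": params, "eval_loss": 0.0}

    @task
    def select_best(results: list):
        print("best run:", min(results, key=lambda r: r["eval_loss"]))

    select_best(train.expand(params=GRID))


finetune_sweep()
```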

Model Performance Evaluation

At ASAPP, we are committed to the continuous monitoring and refinement of our models and prompts used across our services. Leveraging Airflow, we generate metrics for various ML models and prompts at regular intervals, enabling us to swiftly identify performance variations. These metrics are visualized through Grafana dashboards, ensuring quick and effective performance evaluations.
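
One way to wire this up, sketched below under the assumption of a Prometheus Pushgateway feeding Grafana, is to push gauge metrics from an evaluation task; the gateway address, metric, and model names are illustrative, not our actual observability stack.

```python
# A hedged sketch of exporting evaluation metrics for Grafana, assuming a
# Prometheus Pushgateway; all names and addresses here are illustrative.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def evaluate_models():
    @task
    def score_and_export(model_name: str):
        from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

        wer = 0.071  # placeholder: compute the real metric from sampled traffic
        registry = CollectorRegistry()
        gauge = Gauge(
            "model_word_error_rate",
            "Evaluation WER per model",
            ["model"],
            registry=registry,
        )
        gauge.labels(model=model_name).set(wer)
        push_to_gateway("pushgateway.example:9091", job="model_eval", registry=registry)

    score_and_export.expand(model_name=["asr_en", "summarizer_v2"])


evaluate_models()
```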

Application Specific DAGs

Most commonly used by our Speech and NLP teams, offline application inference is where Airflow shines. We’d like to highlight two use cases:

AutoTranscribe

Our Speech team runs DAGs to transcribe current and historical audio data for downstream pipelines. This workflow consists of sub-tasks like language identification, diarization, automatic speech recognition (ASR) using our state-of-the-art speech models, and transcript redaction. Here, Airflow allows us to process many hundreds of thousands of hours of audio at scale, with a very high workload-to-developer ratio.
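
Here is a hedged sketch of that sub-task chain, using dynamic task mapping so each audio file flows through the stages independently; every task body is a placeholder for our actual speech models.

```python
# A hedged sketch of the transcription sub-task chain; each task body is a
# placeholder for real speech models.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def autotranscribe_batch():
    @task
    def list_audio(ds=None) -> list:
        # enumerate the day's audio files (illustrative URIs)
        return [f"s3://example-bucket/audio/{ds}/call-{i:04d}.wav" for i in range(3)]

    @task
    def identify_language(audio_uri: str) -> dict:
        return {"audio": audio_uri, "lang": "en"}  # placeholder language ID

    @task
    def diarize(item: dict) -> dict:
        item["segments"] = [["spk0", 0.0, 4.2], ["spk1", 4.2, 9.8]]  # placeholder
        return item

    @task
    def transcribe(item: dict) -> dict:
        item["transcript"] = "placeholder ASR output"
        return item

    @task
    def redact(item: dict) -> dict:
        item["transcript"] = "[PII removed] " + item["transcript"]  # placeholder
        return item

    files = list_audio()
    langs = identify_language.expand(audio_uri=files)
    segmented = diarize.expand(item=langs)
    transcripts = transcribe.expand(item=segmented)
    redact.expand(item=transcripts)


autotranscribe_batch()
```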

AutoSummary

Our NLP teams run text summarization workflows against internal and third-party models. Beyond data processing and preparation, we also use Airflow to implement a strong evaluation harness that reports summarization quality for our larger machine learning teams. For some specific use cases, workloads are scheduled on task pods on GPU nodes. These pods can run models natively, primarily using PyTorch or PyTorch Lightning, run an efficient LLM server such as HuggingFace's text-generation-inference (TGI) or vLLM locally in a sidecar container, or interact with our internal LLM inference service, which is optimized for inference and scaling costs.
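
For the GPU-scheduled case, a sketch might look like the following. The namespace, image, and node selector are illustrative, the GPU resource name follows the standard NVIDIA device plugin, and a TGI or vLLM sidecar could be attached through a full pod spec.

```python
# A hedged sketch of a GPU-scheduled summarization task; names are illustrative.
from datetime import datetime

from kubernetes.client import models as k8s

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="autosummary_gpu",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    KubernetesPodOperator(
        task_id="summarize_batch",
        name="summarize-batch",
        namespace="ml-inference",                         # hypothetical namespace
        image="registry.example.com/nlp/summarizer:2.0",  # hypothetical image
        cmds=["python", "-m", "summarizer.batch"],
        node_selector={"gpu-type": "a10g"},               # illustrative node pool label
        container_resources=k8s.V1ResourceRequirements(
            limits={"nvidia.com/gpu": "1"},               # one GPU for native PyTorch
        ),
        get_logs=True,
    )
```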

Massively parallel workloads

At ASAPP, we process incredible amounts of audio and text data. Some of these tasks fall into the realm of embarrassingly parallel workloads, e.g. audio transcription or batch text summarization. We use Apache Spark managed by Airflow to make these kinds of inference workloads tractable.

Instead of managing a standalone, unwieldy Spark cluster for heterogeneous workloads, our custom solution launches an on-demand Spark cluster for each specified Airflow task. For an application DAG running a massive workload, we identify tasks that can benefit from parallelization and modify them to launch a Spark cluster using Airflow’s scheduling mechanism.

For running Spark-Airflow tasks, we use Airflow’s KubernetesPodOperator. The task pod coordinates with the Kubernetes cluster to launch the Spark application, handing off the actual work to the cluster. Each Spark application owner only needs to specify the application-specific code. The cluster has all the necessary infrastructure connections to work independently and signal the Airflow task pod when complete. This leverages Airflow’s strong integration with Kubernetes, allowing us to reliably scale Spark to process terabytes of data with hundreds of Spark executors.
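
In rough outline, the pattern looks like the sketch below: the task pod runs spark-submit against the Kubernetes API and waits for the application to finish. The registry, master URL, and sizing are illustrative, not our production values.

```python
# A rough sketch of the launch pattern: an Airflow task pod runs spark-submit
# against the Kubernetes API. All names and sizes here are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="parallel_transcription",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    KubernetesPodOperator(
        task_id="spark_transcribe",
        name="spark-transcribe",
        namespace="spark-jobs",
        image="registry.example.com/spark/driver:3.5",
        cmds=["spark-submit"],
        arguments=[
            "--master", "k8s://https://kubernetes.default.svc",
            "--deploy-mode", "cluster",
            "--conf", "spark.executor.instances=200",  # hundreds of executors
            "--conf", "spark.kubernetes.container.image="
                      "registry.example.com/spark/executor:3.5",
            "local:///opt/app/transcribe_job.py",      # application-specific code
        ],
        get_logs=True,
    )
```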

A single DAG can spin up as many Spark clusters as there are tasks. With Airflow’s operational support, we eliminate the complexity of launching Spark, enabling us to focus on our application tasks rather than tuning Spark for a heterogeneous workload. Other tasks in the Airflow DAG remain unchanged.

This integration of Spark with Airflow allows us to use Airflow’s interface to control cluster load, such as using Airflow pools to limit task execution parallelism. While this adds an abstraction layer to running large-scale workloads, it has been adopted well and substantially improved runtime durations, in some cases by an order of magnitude.
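
As a minimal sketch of that pool-based gating: with a pool created via the Airflow CLI (for example, airflow pools set spark_clusters 10 "cap on concurrent Spark clusters"), at most ten pooled tasks, and hence ten on-demand Spark clusters, run at once. The pool name and size here are illustrative.

```python
# A minimal sketch of pool-based gating; the pool name and size are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # stand-in for the Spark task

with DAG(
    dag_id="pooled_spark",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    # Airflow queues this task whenever the "spark_clusters" pool is full.
    EmptyOperator(task_id="spark_transcribe", pool="spark_clusters")
```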

At ASAPP, Airflow offers numerous advantages for AI workflow operations. It significantly simplifies operations, seamlessly handling the interdependencies between various data and model pipelines, and it slots in readily for frequent LLM inference and sampling workloads. One of its standout features is the ability to build and integrate heterogeneous workflows, each with its own isolated, container-like environment. Its Python-based architecture and extensibility make development straightforward, allowing for custom solutions and adaptations while keeping pipelines easy to use and onboard.

We also benefit from Airflow's resilience as a mature orchestration layer: automatic retries for failed jobs, easy access to logs, and strong integration with our underlying Kubernetes compute layer. Airflow scales efficiently to accommodate numerous heterogeneous pipelines, acting as a reliable intermediary and orchestrator across different systems. It also has a robust ecosystem of third-party integrations that let us add quality-of-life improvements such as alert notifications through Slack and exported observability metrics.

Looking to the future, we expect Airflow to play an important role in leveraging advances in AI, especially in the complex, heavily integrated pipelines needed for preference-based learning (Reinforcement Learning from Human Feedback, Direct Preference Optimization). Airflow's wide adoption by machine learning researchers, machine learning engineers, and data engineers makes it a versatile tool in the AI ecosystem for ASAPP.
