Asynchronous Tasks & Parallel Execution

What is Async Execution?

Asynchronous execution lets Ansible start a long-running task in the background and move on to later tasks without waiting for it. This keeps slow operations from hitting connection timeouts, lets several jobs run at once on the same host, and can shorten total playbook run time.

Understanding Async Execution

Why Use Async?

Asynchronous execution solves several challenges:

  • Long-Running Tasks: Avoid SSH timeout on operations that take minutes or hours
  • True Parallelism: Start multiple tasks simultaneously
  • Performance: Reduce total playbook execution time
  • Resource Optimization: Don't block waiting for slow operations
  • Batch Operations: Process many hosts concurrently

Synchronous vs Asynchronous

By default, Ansible runs tasks synchronously:

  • Synchronous: Wait for task completion before moving to next task
  • Asynchronous: Start task and continue immediately (optionally check status later)
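
A minimal sketch of the difference, assuming a hypothetical script at /opt/scripts/slow_job.sh. The first task blocks until the script exits; the second returns as soon as the job has been launched.

```yaml
---
- name: Synchronous versus asynchronous task (illustrative)
  hosts: all
  tasks:
    # Synchronous: Ansible waits here until the script exits
    - name: Run job and wait for it
      ansible.builtin.command: /opt/scripts/slow_job.sh  # hypothetical path

    # Asynchronous: Ansible launches the job and moves to the next task
    - name: Run job in the background
      ansible.builtin.command: /opt/scripts/slow_job.sh  # hypothetical path
      async: 600   # allow up to 10 minutes
      poll: 0      # do not wait for the result
      register: slow_job
```

The registered variable (slow_job, an illustrative name) holds the job ID needed to check on the background job later.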

Async Keywords

async Parameter

Sets the maximum time, in seconds, that the task may run before Ansible treats it as timed out:

---
- name: Async execution example
  hosts: all
  tasks:
    - name: Long-running operation
      ansible.builtin.command: /usr/bin/long_process.sh
      async: 300  # Maximum 300 seconds (5 minutes)
      poll: 0     # Don't wait (fire and forget)

poll Parameter

Controls how Ansible monitors task completion:

  • poll > 0: Check every N seconds until complete
  • poll = 0: Fire-and-forget mode; don't wait for completion
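
If poll is left out of an async task, Ansible falls back to the DEFAULT_POLL_INTERVAL setting, which is 15 seconds unless it has been overridden. The fragment below is a small sketch of both forms; the task names are illustrative.

```yaml
# Explicit interval: check the job every 5 seconds
- name: Long task with explicit polling
  ansible.builtin.command: /usr/bin/long_process.sh
  async: 300
  poll: 5

# No poll keyword: Ansible uses DEFAULT_POLL_INTERVAL (15 seconds by default)
- name: Long task with the default polling interval
  ansible.builtin.command: /usr/bin/long_process.sh
  async: 300
```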

Async with Polling

Long Task with Monitoring

Run a task asynchronously but still wait for it to complete:

---
- name: Update all packages
  hosts: servers
  tasks:
    - name: Run yum update (may take several minutes)
      ansible.builtin.yum:
        name: '*'
        state: latest
      async: 1800  # 30-minute timeout
      poll: 10     # Check status every 10 seconds
      register: yum_update

    - name: Display update results
      ansible.builtin.debug:
        var: yum_update

Database Backup with Async

---
- name: Backup large database
  hosts: db_servers
  tasks:
    - name: Perform full database backup
      ansible.builtin.command: >
        pg_dump -F c -f /backup/db_full_{{ ansible_date_time.date }}.dump production_db
      async: 3600  # 1-hour timeout
      poll: 30     # Check every 30 seconds
      register: backup_result

    - name: Verify backup completed
      ansible.builtin.debug:
        msg: "Backup completed successfully"
      when: backup_result.finished

Restart Service with Wait

---
- name: Restart services with async
  hosts: application_servers
  tasks:
    - name: Restart application (slow startup)
      ansible.builtin.service:
        name: myapp
        state: restarted
      async: 300  # 5-minute startup timeout
      poll: 5     # Check every 5 seconds

    - name: Wait for application port
      ansible.builtin.wait_for:
        port: 8080
        state: started
        timeout: 300

Fire-and-Forget Mode

Start Tasks Without Waiting

Use poll: 0 to start tasks and continue immediately:

---
- name: Parallel backups
  hosts: all_servers
  tasks:
    - name: Start backup on all servers
      ansible.builtin.command: /usr/local/bin/backup.sh
      async: 3600
      poll: 0  # Don't wait
      register: backup_job

    - name: Continue with other tasks
      ansible.builtin.debug:
        msg: "Backups started on all servers, continuing..."

    - name: Perform other operations
      ansible.builtin.yum:
        name: monitoring-agent
        state: latest

Check Status Later

Use the async_status module to check on fire-and-forget tasks:

---
- name: Start and check async tasks
  hosts: servers
  tasks:
    - name: Start long operation
      ansible.builtin.command: /opt/scripts/long_process.sh
      async: 1800
      poll: 0
      register: long_task

    - name: Do other work while task runs
      ansible.builtin.package:
        name: htop
        state: present

    - name: Configure monitoring
      ansible.builtin.template:
        src: monitoring.conf.j2
        dest: /etc/monitoring/config.conf

    - name: Check if long operation completed
      ansible.builtin.async_status:
        jid: "{{ long_task.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 30
      delay: 60  # Check every 60 seconds

    - name: Display result
      ansible.builtin.debug:
        msg: "Long operation finished: {{ job_result.stdout }}"
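
In the playbook above, retries multiplied by delay (30 x 60 seconds) matches the async limit of 1800 seconds, so the status check does not give up before the job is allowed to finish. One way to keep the two in step is to derive them from a single value. This is a sketch; job_timeout and check_delay are hypothetical variable names.

```yaml
---
- name: Derive the wait budget from one timeout value
  hosts: servers
  vars:
    job_timeout: 1800   # assumed upper bound for the job, in seconds
    check_delay: 60     # assumed pause between status checks, in seconds
  tasks:
    - name: Start long operation
      ansible.builtin.command: /opt/scripts/long_process.sh
      async: "{{ job_timeout }}"
      poll: 0
      register: long_task

    - name: Wait up to job_timeout seconds for the job
      ansible.builtin.async_status:
        jid: "{{ long_task.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      # Round up so the checks cover the whole timeout
      retries: "{{ (job_timeout / check_delay) | round(0, 'ceil') | int }}"
      delay: "{{ check_delay }}"
```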

Parallel Task Execution

Run Multiple Tasks Concurrently

---
- name: Parallel service updates
  hosts: app_servers
  tasks:
    # Start all updates in parallel
    - name: Update frontend service
      ansible.builtin.command: /opt/update_frontend.sh
      async: 600
      poll: 0
      register: frontend_update

    - name: Update backend service
      ansible.builtin.command: /opt/update_backend.sh
      async: 600
      poll: 0
      register: backend_update

    - name: Update database schema
      ansible.builtin.command: /opt/migrate_database.sh
      async: 600
      poll: 0
      register: db_update

    # Wait for all to complete
    - name: Wait for frontend update
      ansible.builtin.async_status:
        jid: "{{ frontend_update.ansible_job_id }}"
      register: frontend_result
      until: frontend_result.finished
      retries: 60
      delay: 10

    - name: Wait for backend update
      ansible.builtin.async_status:
        jid: "{{ backend_update.ansible_job_id }}"
      register: backend_result
      until: backend_result.finished
      retries: 60
      delay: 10

    - name: Wait for database migration
      ansible.builtin.async_status:
        jid: "{{ db_update.ansible_job_id }}"
      register: db_result
      until: db_result.finished
      retries: 60
      delay: 10

    - name: Verify all updates succeeded
      ansible.builtin.debug:
        msg: "All services updated successfully"
      when:
        - frontend_result.rc == 0
        - backend_result.rc == 0
        - db_result.rc == 0

Batch Processing Pattern

---
- name: Process large dataset across servers
  hosts: worker_nodes
  tasks:
    - name: Start data processing jobs
      ansible.builtin.command: >
        /opt/process_data.py --batch {{ item }}
      loop: "{{ range(1, 101) | list }}"  # 100 batches
      async: 7200  # 2-hour timeout per batch
      poll: 0
      register: processing_jobs

    - name: Wait for all jobs to complete
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ processing_jobs.results }}"
      register: job_results
      until: job_results.finished
      retries: 120
      delay: 60

    - name: Collect results
      ansible.builtin.debug:
        msg: "Processed {{ job_results.results | selectattr('rc', 'equalto', 0) | list | length }} batches successfully"

Controlling Parallelism

Forks Configuration

Control maximum concurrent host connections:

# ansible.cfg
[defaults]
# Default is 5 (a "#" comment must sit on its own line in ansible.cfg)
forks = 10

# Command line
ansible-playbook site.yml --forks 20

# forks caps how many hosts Ansible works on at the same time

Serial Execution

Process hosts in batches:

---
- name: Rolling update with batches
  hosts: webservers
  serial: 2  # Process 2 hosts at a time
  tasks:
    - name: Update application
      ansible.builtin.yum:
        name: myapp
        state: latest
      async: 300
      poll: 10

    - name: Restart service
      ansible.builtin.service:
        name: myapp
        state: restarted

# Can also use percentage
- name: Update 25% at a time
  hosts: all
  serial: "25%"
  tasks:
    - name: Update packages
      ansible.builtin.yum:
        name: '*'
        state: latest

# Dynamic serial values
- name: Staged rollout
  hosts: production
  serial:
    - 1    # First host
    - 5    # Then 5 hosts
    - 10   # Then 10 hosts
    - "100%"  # Rest all at once

Throttle Directive

Limit how many hosts run a given task at the same time:

---
- name: Resource-intensive operations
  hosts: all
  tasks:
    - name: CPU-intensive task (limit concurrent execution)
      ansible.builtin.command: /opt/intensive_process.sh
      async: 600
      poll: 10
      throttle: 3  # Only 3 hosts run this simultaneously

    - name: API calls (rate limiting)
      ansible.builtin.uri:
        url: https://api.example.com/register
        method: POST
      throttle: 1  # One at a time to avoid rate limits
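
throttle can only lower concurrency. The number of hosts running a task at once is bounded by the smallest of forks, the current serial batch, and throttle, so a throttle value above the batch size has no effect. A small sketch with illustrative numbers:

```yaml
---
- name: Throttle inside a serial batch
  hosts: all
  serial: 10          # 10 hosts per batch
  tasks:
    - name: Heavy task, at most 3 hosts of the batch at a time
      ansible.builtin.command: /opt/intensive_process.sh
      throttle: 3

    - name: Light task, limited only by the batch size and forks
      ansible.builtin.command: /bin/true
      changed_when: false
```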

Real-World Patterns

Zero-Downtime Deployment

---
- name: Zero-downtime rolling deployment
  hosts: app_servers
  serial: 1
  tasks:
    - name: Remove from load balancer
      ansible.builtin.command: lb-remove {{ inventory_hostname }}
      delegate_to: loadbalancer

    - name: Stop application
      ansible.builtin.service:
        name: myapp
        state: stopped

    - name: Deploy new version
      ansible.builtin.copy:
        src: /releases/app-v2.0.jar
        dest: /opt/app/app.jar

    - name: Start application (slow startup)
      ansible.builtin.service:
        name: myapp
        state: started
      async: 300
      poll: 5

    - name: Health check
      ansible.builtin.uri:
        url: "http://{{ ansible_host }}:8080/health"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 30
      delay: 5

    - name: Add back to load balancer
      ansible.builtin.command: lb-add {{ inventory_hostname }}
      delegate_to: loadbalancer

Parallel Data Synchronization

---
- name: Sync data to multiple destinations
  hosts: primary_server
  tasks:
    - name: Start rsync to all backup servers
      ansible.builtin.command: >
        rsync -avz /data/ {{ item }}:/backup/
      loop: "{{ groups['backup_servers'] }}"
      async: 7200  # 2 hours
      poll: 0
      register: sync_jobs

    - name: Monitor sync progress
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ sync_jobs.results }}"
      register: sync_status
      until: sync_status.finished
      retries: 240
      delay: 30

    - name: Report sync results
      ansible.builtin.debug:
        msg: "Synced to {{ sync_status.results | selectattr('rc', 'equalto', 0) | list | length }} servers"

Distributed Build System

---
- name: Parallel build across build servers
  hosts: build_servers
  tasks:
    - name: Start build jobs
      ansible.builtin.command: >
        /opt/build.sh --project {{ item }}
      loop:
        - frontend
        - backend
        - mobile-ios
        - mobile-android
        - desktop
      async: 3600
      poll: 0
      register: builds
      run_once: true
      delegate_to: "{{ groups['build_servers'][item_index % groups['build_servers']|length] }}"
      loop_control:
        index_var: item_index

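    - name: Wait for all builds
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ builds.results }}"
      loop_control:
        index_var: build_index
      # Poll each job on the server that started it (same round-robin as above)
      delegate_to: "{{ groups['build_servers'][build_index % groups['build_servers']|length] }}"
      register: build_results
      until: build_results.finished
      retries: 60
      delay: 60
      run_once: true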

Cleanup and Management

Clean Up Async Jobs

---
- name: Cleanup async job data
  hosts: all
  tasks:
    - name: Start long task
      ansible.builtin.command: /opt/process.sh
      async: 600
      poll: 0
      register: long_job

    - name: Check status
      ansible.builtin.async_status:
        jid: "{{ long_job.ansible_job_id }}"
      register: job_status
      until: job_status.finished
      retries: 20
      delay: 30

    - name: Cleanup job data
      ansible.builtin.async_status:
        jid: "{{ long_job.ansible_job_id }}"
        mode: cleanup
      when: job_status.finished
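
The same cleanup works for jobs started in a loop by looping over the registered results a second time. A sketch, assuming a hypothetical --part flag on /opt/process.sh; the variable names are illustrative.

```yaml
---
- name: Clean up several async jobs
  hosts: all
  tasks:
    - name: Start jobs
      ansible.builtin.command: "/opt/process.sh --part {{ item }}"  # --part is a hypothetical flag
      loop: [1, 2, 3]
      async: 600
      poll: 0
      register: part_jobs

    - name: Wait for every job
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ part_jobs.results }}"
      register: part_status
      until: part_status.finished
      retries: 20
      delay: 30

    - name: Remove the job status files
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
        mode: cleanup
      loop: "{{ part_jobs.results }}"
```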

Handle Failures

---
- name: Async with error handling
  hosts: all
  tasks:
    - name: Start risky operation
      ansible.builtin.command: /opt/risky_operation.sh
      async: 600
      poll: 0
      register: risky_job
      ignore_errors: yes

    - name: Check operation result
      ansible.builtin.async_status:
        jid: "{{ risky_job.ansible_job_id }}"
      register: operation_result
      until: operation_result.finished
      retries: 30
      delay: 20
      ignore_errors: yes

    # rc is undefined if the job never finished, so test the task state instead
    - name: Handle failure
      ansible.builtin.debug:
        msg: "Operation failed on {{ inventory_hostname }}: {{ operation_result.stderr | default('no stderr captured') }}"
      when: operation_result is failed

    - name: Rollback on failure
      ansible.builtin.command: /opt/rollback.sh
      when: operation_result is failed
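
An alternative to ignore_errors is to group the job and its status check in a block and put the rollback in rescue. The sketch below reuses the script paths from the example above and adds an always section that removes the job status file.

```yaml
---
- name: Async job with block, rescue and always
  hosts: all
  tasks:
    - name: Run the operation and roll back if it fails
      block:
        - name: Start risky operation
          ansible.builtin.command: /opt/risky_operation.sh
          async: 600
          poll: 0
          register: risky_job

        - name: Wait for the operation
          ansible.builtin.async_status:
            jid: "{{ risky_job.ansible_job_id }}"
          register: operation_result
          until: operation_result.finished
          retries: 30
          delay: 20
      rescue:
        # Runs if the job fails or the wait runs out of retries
        - name: Roll back
          ansible.builtin.command: /opt/rollback.sh
      always:
        - name: Remove the job status file
          ansible.builtin.async_status:
            jid: "{{ risky_job.ansible_job_id }}"
            mode: cleanup
          when: risky_job.ansible_job_id is defined
```

With this layout the rollback runs only on hosts where the block failed, and later tasks need no extra conditions.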

Best Practices

Async Execution Best Practices:
  • Set Appropriate Timeouts: Use realistic async values based on expected runtime
  • Size Forks Deliberately: --forks limits how many hosts run a task at once; raise it when many hosts need to start background jobs together
  • Use poll: 0 Wisely: Only for truly independent tasks
  • Clean Up Jobs: Use mode: cleanup to remove job cache files
  • Handle Errors: Always check return codes in async_status
  • Avoid Lock Conflicts: Don't use poll: 0 with yum or other operations that hold an exclusive lock if later tasks on the same host need that lock
  • Monitor Progress: Implement proper status checking and retries
  • Document Timing: Comment expected task duration

Limitations and Gotchas

Tasks That Don't Support Async

  • Actions whose plugin does not support async (Ansible fails with "async is not supported for this task")
  • Tasks in check mode
  • Some connection plugins
  • include_tasks and import_tasks
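
One way to keep a playbook usable with --check is to skip background jobs during dry runs. This is a sketch using the built-in ansible_check_mode variable; the registered names script_job and script_status are illustrative.

```yaml
- name: Start background job (skipped during --check runs)
  ansible.builtin.command: /opt/script.sh
  async: 300
  poll: 0
  register: script_job
  when: not ansible_check_mode

- name: Wait for the job only if it was actually started
  ansible.builtin.async_status:
    jid: "{{ script_job.ansible_job_id }}"
  register: script_status
  until: script_status.finished
  retries: 30
  delay: 10
  when: not ansible_check_mode
```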

Common Issues

---
# WRONG - async doesn't work with check mode
- name: This will fail
  ansible.builtin.command: /opt/script.sh
  async: 300
  poll: 0
  check_mode: yes

# WRONG - package managers that hold a lock don't work well with poll: 0
- name: This can cause issues
  ansible.builtin.yum:
    name: httpd
    state: present
  async: 300
  poll: 0  # Later yum tasks on the same host may hit the lock while this runs

# RIGHT - use async with polling for package operations
- name: Safe package installation
  ansible.builtin.yum:
    name: httpd
    state: present
  async: 300
  poll: 10
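
If a package operation really does need to run in the background, a possible pattern is to wait for it with async_status before any later task touches the package manager on that host. A sketch; the directory task stands in for unrelated work and the registered names are illustrative.

```yaml
- name: Start package installation in the background
  ansible.builtin.yum:
    name: httpd
    state: present
  async: 300
  poll: 0
  register: httpd_install

- name: Do unrelated work that does not use the package manager
  ansible.builtin.file:
    path: /var/www/html
    state: directory
    mode: "0755"

- name: Wait for the installation before any further package task
  ansible.builtin.async_status:
    jid: "{{ httpd_install.ansible_job_id }}"
  register: httpd_status
  until: httpd_status.finished
  retries: 30
  delay: 10
```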

Next Steps