Troubleshooting Ansible: Common Errors and Solutions

Teach me Ansible | 2025-02-08 | 26 min read

Struggling with Ansible errors? This comprehensive troubleshooting guide covers the most common problems, their causes, and proven solutions. Learn debugging techniques that will save you hours of frustration.

Essential Debugging Tools

Verbose Output (-v, -vv, -vvv, -vvvv)

Increase verbosity to see what Ansible is doing:

# Basic verbosity (shows task results)
ansible-playbook site.yml -v

# More detail (adds task file paths and configuration info)
ansible-playbook site.yml -vv

# Even more detail (module arguments and the SSH commands being run)
ansible-playbook site.yml -vvv

# Connection debugging (adds low-level SSH output)
ansible-playbook site.yml -vvvv
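
Verbosity can also be used from inside a playbook. The debug module accepts a verbosity parameter, so a diagnostic message can stay in the playbook and only print when enough -v flags are given. A minimal sketch (the message text is an illustrative assumption):

```yaml
# Sketch only: the message content is illustrative.
- name: Show connection details only at -vv or higher
  debug:
    msg: "Connecting to {{ inventory_hostname }} as {{ ansible_user | default('the default user') }}"
    verbosity: 2   # skipped unless the run uses -vv or more
```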

Check Mode (Dry Run)

# Preview changes without applying them
ansible-playbook site.yml --check

# Combine with diff to see what would change
ansible-playbook site.yml --check --diff
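
In check mode, command and shell tasks are skipped, which can leave registered variables empty and break later tasks. One way around this, sketched below, is to mark read-only tasks with check_mode: false so they still run during a dry run. The file path and task names here are illustrative assumptions.

```yaml
# Sketch only: the path and names are illustrative assumptions.
- name: Read current nginx config (safe to run even in check mode)
  command: cat /etc/nginx/nginx.conf
  register: nginx_conf
  check_mode: false      # always run, because it changes nothing
  changed_when: false

- name: Report what the dry run is looking at
  debug:
    msg: "Config has {{ nginx_conf.stdout_lines | length }} lines"
  when: ansible_check_mode   # true only when --check was passed
```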

Syntax Checking

# Check playbook syntax
ansible-playbook site.yml --syntax-check

# Check a role's syntax via its test playbook
ansible-playbook tests/test.yml --syntax-check

# Validate YAML
yamllint playbook.yml

List Tasks and Hosts

# List all tasks that would execute
ansible-playbook site.yml --list-tasks

# List the hosts the playbook would target
ansible-playbook site.yml --list-hosts

# List all tags
ansible-playbook site.yml --list-tags

Common Error #1: "Failed to connect to the host via ssh"

Error Message:

fatal: [webserver1]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh",
    "unreachable": true
}

Causes and Solutions:

1. SSH Key Not Configured

# Test SSH connection manually
ssh user@webserver1

# Copy SSH key to remote host
ssh-copy-id user@webserver1

# Or specify SSH key in inventory
[webservers]
webserver1 ansible_ssh_private_key_file=~/.ssh/id_rsa

2. Wrong Username

# Specify correct user in playbook
- hosts: webservers
  remote_user: ubuntu
  # or set the connection variable instead
  vars:
    ansible_user: ubuntu

# Or in inventory
[webservers]
webserver1 ansible_user=ubuntu

3. SSH Port Not Default

# inventory.ini
[webservers]
webserver1 ansible_port=2222

4. Host Key Verification Failed

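# Add to ansible.cfg (disables host key verification; use only in
# trusted lab or test environments)
[defaults]
host_key_checking = False

# Or, preferably, add the host to known_hosts after verifying its key
ssh-keyscan webserver1 >> ~/.ssh/known_hosts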
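
The connection settings from the fixes above can live together in one place. Here is a sketch of the same host in a YAML-format inventory, assuming the user, port, and key path used in the earlier examples:

```yaml
# inventory.yml (sketch; values mirror the examples above and are assumptions)
all:
  children:
    webservers:
      hosts:
        webserver1:
          ansible_user: ubuntu
          ansible_port: 2222
          ansible_ssh_private_key_file: ~/.ssh/id_rsa
```

Run ansible-inventory -i inventory.yml --list to confirm the variables are picked up before retrying the playbook.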

Common Error #2: "MODULE FAILURE"

Error Message:

fatal: [webserver1]: FAILED! => {
    "changed": false,
    "module_stderr": "/bin/sh: 1: /usr/bin/python: not found",
    "module_stdout": "",
    "msg": "MODULE FAILURE"
}

Causes and Solutions:

Python Not Installed on Target

# For Ubuntu/Debian (refresh the package index first)
ansible webserver1 -m raw -a "apt-get update && apt-get install -y python3" -b

# For RHEL/CentOS
ansible webserver1 -m raw -a "yum install python3 -y" -b

Wrong Python Interpreter

# Specify Python 3 in playbook
- hosts: all
  vars:
    ansible_python_interpreter: /usr/bin/python3

# Or in inventory
[all:vars]
ansible_python_interpreter=/usr/bin/python3

Auto-Discovery

# ansible.cfg
[defaults]
interpreter_python = auto_silent
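
The same bootstrap can be done from a playbook. The key detail is gather_facts: false, because fact gathering itself needs Python on the target. A minimal sketch, assuming Debian/Ubuntu targets:

```yaml
# Sketch only: assumes apt-based targets; adapt the command for other distros.
- name: Bootstrap Python on hosts that lack it
  hosts: all
  gather_facts: false   # fact gathering would fail without Python
  become: true
  tasks:
    - name: Install python3 only when it is missing
      raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)

    - name: Gather facts now that Python is available
      setup:
```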

Common Error #3: "Permission denied"

Error Message:

fatal: [webserver1]: FAILED! => {
    "changed": false,
    "msg": "Failed to set permissions on the temporary files",
    "failed": true
}

Solutions:

Use become (sudo)

- name: Install package
  apt:
    name: nginx
    state: present
  become: yes  # Enable sudo

# Or at play level
- hosts: all
  become: yes
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present

Specify Sudo User

- name: Task requiring root
  command: systemctl restart nginx
  become: yes
  become_user: root
  become_method: sudo

Fix Sudo Configuration

# Allow user to sudo without password
# On remote host, run: sudo visudo
# (this grants unrestricted passwordless root; limit it to automation accounts)
ubuntu ALL=(ALL) NOPASSWD:ALL
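
One more case worth knowing: the specific message "Failed to set permissions on the temporary files" usually shows up when become_user is an unprivileged account rather than root. Ansible then needs a way to hand its temporary files to that user, and installing the acl package on the target is a common fix (enabling SSH pipelining is another). A sketch, where the postgres user is purely an illustrative assumption:

```yaml
# Sketch only: "postgres" is a hypothetical unprivileged account.
- name: Ensure acl is installed so files can be shared with become_user
  package:
    name: acl
    state: present
  become: true

- name: Run a command as an unprivileged user
  command: whoami
  become: true
  become_user: postgres
  changed_when: false
```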

Common Error #4: "No package matching 'X' found"

Error Message:

fatal: [webserver1]: FAILED! => {
    "changed": false,
    "msg": "No package matching 'nginx' found available"
}

Solutions:

Update Package Cache First

- name: Update apt cache
  apt:
    update_cache: yes
  become: yes

- name: Install nginx
  apt:
    name: nginx
    state: present
  become: yes
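
The two tasks can also be combined. The apt module accepts update_cache and cache_valid_time together, so the cache is only refreshed when it is stale. A minimal sketch (the one-hour value is an arbitrary choice):

```yaml
# Sketch only: 3600 seconds is an illustrative threshold.
- name: Install nginx, refreshing the cache if it is older than an hour
  apt:
    name: nginx
    state: present
    update_cache: yes
    cache_valid_time: 3600
  become: yes
```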

Check Package Name for OS

- name: Install web server
  package:
    name: "{{ web_package }}"
    state: present
  vars:
    web_package: "{{ 'nginx' if ansible_os_family == 'Debian' else 'httpd' }}"
  become: yes

Enable Required Repositories

- name: Enable EPEL repository
  yum:
    name: epel-release
    state: present
  become: yes
  when: ansible_os_family == "RedHat"

Common Error #5: "Undefined variable"

Error Message:

fatal: [webserver1]: FAILED! => {
    "msg": "The task includes an option with an undefined variable.
    The error was: 'db_password' is undefined"
}

Solutions:

Define Missing Variable

# In playbook
- hosts: all
  vars:
    db_password: secret123

# Or in group_vars/all.yml
db_password: secret123

# Or pass via command line
ansible-playbook site.yml -e "db_password=secret123"

Use Default Filter

- name: Set password with fallback
  debug:
    msg: "Password is {{ db_password | default('defaultpass') }}"

Check Variable is Defined

- name: Fail if variable not set
  fail:
    msg: "db_password must be defined"
  when: db_password is not defined

- name: Only run if variable exists
  debug:
    msg: "Password is {{ db_password }}"
  when: db_password is defined
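
For optional module parameters there is a third option: default(omit) drops the parameter entirely when the variable is undefined, so the module falls back to its own default. A sketch using hypothetical names (deploy and user_shell are assumptions, not from this guide):

```yaml
# Sketch only: "deploy" and "user_shell" are hypothetical names.
- name: Create user, setting a shell only when user_shell is defined
  user:
    name: deploy
    shell: "{{ user_shell | default(omit) }}"
  become: yes
```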

Common Error #6: "Template error"

Error Message:

fatal: [webserver1]: FAILED! => {
    "msg": "AnsibleError: template error while templating string:
    unexpected '}'"
}

Solutions:

Fix Jinja2 Syntax

# Wrong - unbalanced closing braces
server_name {{ server_name };

# Correct - every {{ needs a matching }}, including in .j2 files
server_name {{ server_name }};

# Wrong - same quote type inside and outside the expression
msg: "{{ "hello" }}"

# Correct - alternate single and double quotes
msg: "{{ 'hello' }}"

Escape Special Characters

# When a string must contain literal {{ or }}
- name: Print message
  debug:
    msg: "{{ '{{' }} This is literal braces {{ '}}' }}"
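
For longer literal text, quoting each brace pair gets noisy. Two alternatives are a Jinja2 raw block and the !unsafe YAML tag, both of which tell Ansible not to template the content. A minimal sketch (not_a_variable is just placeholder text):

```yaml
# Sketch only: "not_a_variable" is placeholder text, never evaluated.
- name: Print literal braces using a raw block
  debug:
    msg: "{% raw %}{{ not_a_variable }}{% endraw %}"

- name: Print literal braces using the unsafe tag
  debug:
    msg: !unsafe "{{ not_a_variable }}"
```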

Common Error #7: "YAML Syntax Error"

Error Message:

ERROR! Syntax Error while loading YAML.
  expected <block end>, but found '<block sequence start>'

Solutions:

Fix Indentation

# Wrong - mixed indentation
tasks:
  - name: Task 1
    debug:
      msg: "Hello"
    - name: Task 2  # Wrong indent
      debug:

# Correct
tasks:
  - name: Task 1
    debug:
      msg: "Hello"
  - name: Task 2
    debug:
      msg: "World"

Quote Special Strings

# Wrong - unquoted special characters
name: Load: Balancer

# Correct
name: "Load: Balancer"

# Wrong - @ at start
email: @example.com

# Correct
email: "@example.com"

Multi-line Strings

# Use | for literal blocks
script: |
  #!/bin/bash
  echo "Line 1"
  echo "Line 2"

# Use > for folded blocks
description: >
  This is a long
  description that will
  be folded into one line
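
Trailing newlines are a frequent surprise with block scalars. Adding a minus sign (|- or >-) strips the final newline, which matters when the value ends up in a command or a comparison. A sketch you can run to see how each form renders (the key names are arbitrary):

```yaml
# Sketch only: key names are arbitrary labels for the output.
- name: Show how block scalars render
  debug:
    msg:
      literal: |
        line 1
        line 2
      folded: >
        these lines
        become one line
      folded_stripped: >-
        same folding but
        no trailing newline
```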

Common Error #8: "Timeout waiting for connection"

Solutions:

Increase Timeout

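# ansible.cfg
[defaults]
# Connection timeout in seconds (the default is 10)
timeout = 30

# Or, for long-running tasks, run asynchronously so the task
# does not depend on one long-lived connection
- name: Slow operation
  command: /path/to/slow/script.sh
  async: 300  # Maximum runtime in seconds
  poll: 10    # Check status every 10 seconds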
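
If the host is simply not up yet (after a reboot or fresh provisioning), waiting is often better than raising timeouts. Here is a sketch using the wait_for_connection module, plus the task-level timeout keyword (available in Ansible 2.10 and later) to cap a single task. The durations are illustrative assumptions.

```yaml
# Sketch only: durations are illustrative; the script path is from the example above.
- name: Wait for a freshly started host
  hosts: webservers
  gather_facts: false   # facts would fail while the host is unreachable
  tasks:
    - name: Wait up to 5 minutes for the connection to come up
      wait_for_connection:
        delay: 5
        timeout: 300

    - name: Abort a hung command after 60 seconds
      command: /path/to/slow/script.sh
      timeout: 60
```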

Check Network Connectivity

# Test connection
ansible webserver1 -m ping

# Test with verbose output
ansible webserver1 -m ping -vvv

Common Error #9: "Failed to lock apt"

Error Message:

Could not get lock /var/lib/dpkg/lock-frontend - open
(11: Resource temporarily unavailable)

Solutions:

Wait for Lock

- name: Wait for apt lock
  shell: while fuser /var/lib/dpkg/lock /var/lib/dpkg/lock-frontend >/dev/null 2>&1; do sleep 1; done
  become: yes
  changed_when: false

- name: Install package
  apt:
    name: nginx
    state: present
  become: yes

Use Retries

- name: Install nginx
  apt:
    name: nginx
    state: present
  become: yes
  register: result
  retries: 5
  delay: 10
  until: result is succeeded
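
Newer versions of the apt module can also wait for the lock themselves through the lock_timeout parameter. A sketch, assuming ansible-core 2.12 or later (the 120-second value is arbitrary):

```yaml
# Sketch only: assumes ansible-core 2.12+; 120 seconds is an illustrative value.
- name: Install nginx, waiting up to 2 minutes for the apt lock
  apt:
    name: nginx
    state: present
    lock_timeout: 120
  become: yes
```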

Common Error #10: "Handler not found"

Error Message:

ERROR! The requested handler 'restart nginx' was not found

Solutions:

Check Handler Name Matches

# In tasks/main.yml
- name: Deploy config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx  # Must match handler name exactly

# In handlers/main.yml
- name: restart nginx  # Must match notify exactly
  service:
    name: nginx
    state: restarted

Include Handler File

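# In the play, import the handler file under the handlers section
# (include_tasks in the task list would run the handlers as ordinary tasks)
handlers:
  - name: Import handlers
    import_tasks: handlers/main.yml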

# Or use role structure (handlers auto-loaded)
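
Exact name matching is fragile when handlers get renamed. The listen keyword decouples the two: tasks notify a topic, and any handler listening on that topic runs. A sketch of a full play, where the topic name "restart web stack" is an assumption chosen for illustration:

```yaml
# Sketch only: the topic name "restart web stack" is illustrative.
- hosts: webservers
  become: true
  tasks:
    - name: Deploy config
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: restart web stack

    - name: Run pending handlers now instead of at the end of the play
      meta: flush_handlers

  handlers:
    - name: Restart nginx service
      service:
        name: nginx
        state: restarted
      listen: restart web stack   # matches the notify topic, not the handler name
```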

Debugging Techniques

1. Use debug Module

- name: Show variable value
  debug:
    var: my_variable

- name: Show message
  debug:
    msg: "The value is {{ my_variable }}"

- name: Show all variables
  debug:
    var: vars

- name: Show hostvars
  debug:
    var: hostvars[inventory_hostname]

2. Register and Inspect Results

- name: Run command
  command: cat /etc/os-release
  register: result

- name: Show stdout
  debug:
    var: result.stdout_lines

- name: Show all result data
  debug:
    var: result

3. Use assert for Validation

- name: Verify variable is set
  assert:
    that:
      - db_password is defined
      - db_password | length > 8
    fail_msg: "db_password must be defined and > 8 characters"
    success_msg: "Password validation passed"

4. Step Through Tasks

# Execute tasks one by one with confirmation
ansible-playbook site.yml --step

5. Start at Specific Task

# Start from a specific task
ansible-playbook site.yml --start-at-task="Install nginx"

6. Run Specific Tags

# Only run tagged tasks
ansible-playbook site.yml --tags "config,deploy"

# Skip specific tags
ansible-playbook site.yml --skip-tags "slow,optional"

Advanced Troubleshooting

Strategy: debug

# ansible.cfg
[defaults]
strategy = debug

# When task fails, drops to interactive debugger
# Commands: p (print), c (continue), q (quit), r (redo)
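
The debugger can also be switched on per play or per task with the debugger keyword, which avoids changing ansible.cfg for everyone. A sketch, where pkg_name is a hypothetical and deliberately undefined variable so the task fails:

```yaml
# Sketch only: "pkg_name" is a hypothetical, intentionally undefined variable.
- hosts: webservers
  debugger: on_failed   # other values include always, never, on_unreachable, on_skipped
  tasks:
    - name: Install a package whose name was never defined
      package:
        name: "{{ pkg_name }}"
        state: present
      become: true

# At the debugger prompt, one possible session:
#   p task.args                     (inspect the failing arguments)
#   task_vars['pkg_name'] = 'nginx' (supply the missing variable)
#   update_task                     (rebuild the task with the new variables)
#   redo                            (run the task again)
```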

Enable Callback Plugins

# ansible.cfg
[defaults]
# More readable output
stdout_callback = yaml
# Or, for detailed debugging output (set only one stdout_callback):
# stdout_callback = debug

# Show task timing
callbacks_enabled = profile_tasks, timer

Log Playbook Output

# ansible.cfg
[defaults]
log_path = ./ansible.log

# Or via command line
ansible-playbook site.yml 2>&1 | tee playbook.log

Prevention Best Practices

1. Always Check Syntax First

ansible-playbook site.yml --syntax-check
yamllint *.yml
ansible-lint site.yml

2. Use Check Mode in CI/CD

# In your CI pipeline
ansible-playbook site.yml --check --diff

3. Test Incrementally

# Test on one host first
- hosts: webservers[0]
  tasks:
    - name: Test task
      debug:
        msg: "Testing"
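
Once the single-host test passes, the same idea scales to a rolling rollout. The serial keyword accepts a list of batch sizes, so a play can start with one host and widen gradually. A sketch (the batch sizes are illustrative assumptions):

```yaml
# Sketch only: batch sizes are illustrative.
- hosts: webservers
  serial:
    - 1        # first batch: a single canary host
    - "25%"    # then a quarter of the group
    - "100%"   # then everything that remains
  max_fail_percentage: 0   # stop the rollout if any host in a batch fails
  tasks:
    - name: Test task
      debug:
        msg: "Testing"
```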

4. Use Molecule for Testing

cd roles/myrole
molecule test

Useful Commands Reference

# Test inventory
ansible-inventory --list
ansible-inventory --graph

# Test host connectivity
ansible all -m ping

# Get facts from host
ansible webserver1 -m setup

# Run ad-hoc command
ansible all -a "uptime"

# Find a module and read its documentation
ansible-doc -l | grep module_name
ansible-doc module_name

# See all configuration
ansible-config dump

# Inspect the variables Ansible sees for each host
ansible all -m debug -a "var=hostvars[inventory_hostname]"

Conclusion

Troubleshooting Ansible doesn't have to be frustrating. With the right techniques and understanding of common errors, you can quickly identify and fix issues. Remember to:

  • Start with increased verbosity (-vvv)
  • Check syntax before running
  • Use check mode to preview changes
  • Debug with the debug module
  • Test incrementally on small host groups

Pro Tip

Create a troubleshooting playbook that tests common failure points (connectivity, permissions, Python version, etc.). Run it before deploying to new infrastructure to catch issues early.
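
A rough starting point for such a playbook might look like the sketch below. It checks connectivity, the Python interpreter, and privilege escalation. The play name, timeout, and the Python 3 requirement are assumptions to adapt to your environment.

```yaml
# preflight.yml (sketch; names, timeout, and the Python 3 rule are assumptions)
- name: Preflight checks
  hosts: all
  gather_facts: false
  tasks:
    - name: Wait for the host to become reachable
      wait_for_connection:
        timeout: 60

    - name: Confirm a usable Python interpreter
      ping:

    - name: Gather facts
      setup:

    - name: Check Python version
      assert:
        that:
          - ansible_facts['python']['version']['major'] == 3
        fail_msg: "Python 3 is required on {{ inventory_hostname }}"

    - name: Confirm privilege escalation works
      command: id -u
      become: true
      register: id_result
      changed_when: false

    - name: Verify the task ran as root
      assert:
        that:
          - id_result.stdout == "0"
        fail_msg: "become did not escalate to root on {{ inventory_hostname }}"
```

Run it with ansible-playbook preflight.yml --limit newhosts (where newhosts is whatever group or pattern you are onboarding) before the real deployment.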