Troubleshooting Ansible: Common Errors and Solutions
Teach me Ansible | 2025-02-08 | 26 min read
Struggling with Ansible errors? This comprehensive troubleshooting guide covers the most common problems, their causes, and proven solutions. Learn debugging techniques that will save you hours of frustration.
Essential Debugging Tools
Verbose Output (-v, -vv, -vvv, -vvvv)
Increase verbosity to see what Ansible is doing:
# Basic verbosity (shows task results)
ansible-playbook site.yml -v
# More detail (adds task file paths)
ansible-playbook site.yml -vv
# Adds connection details, such as the SSH commands being run
ansible-playbook site.yml -vvv
# Connection debugging (adds the connection plugin's own debug output)
ansible-playbook site.yml -vvvv
Check Mode (Dry Run)
# Preview changes without applying them
ansible-playbook site.yml --check
# Combine with diff to see what would change
ansible-playbook site.yml --check --diff
Syntax Checking
# Check playbook syntax
ansible-playbook site.yml --syntax-check
# Check role syntax (via the role's test playbook)
ansible-playbook tests/test.yml --syntax-check
# Validate YAML
yamllint playbook.yml
List Tasks and Hosts
# List all tasks that would execute
ansible-playbook site.yml --list-tasks
# List all hosts in inventory
ansible-playbook site.yml --list-hosts
# List all tags
ansible-playbook site.yml --list-tags
Common Error #1: "Failed to connect to the host via ssh"
Error Message:
fatal: [webserver1]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh",
    "unreachable": true
}
Causes and Solutions:
1. SSH Key Not Configured
# Test SSH connection manually
ssh user@webserver1
# Copy SSH key to remote host
ssh-copy-id user@webserver1
# Or specify SSH key in inventory
[webservers]
webserver1 ansible_ssh_private_key_file=~/.ssh/id_rsa
2. Wrong Username
# Specify the correct user in the playbook
- hosts: webservers
  remote_user: ubuntu
# Or set the connection variable on the play
- hosts: webservers
  vars:
    ansible_user: ubuntu
# Or in inventory
[webservers]
webserver1 ansible_user=ubuntu
3. SSH Port Not Default
# inventory.ini
[webservers]
webserver1 ansible_port=2222
4. Host Key Verification Failed
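# Add to ansible.cfg (disables host key verification; only for trusted lab networks)
[defaults]
host_key_checking = False
# Or add the host's key to known_hosts so later key changes are detected
ssh-keyscan webserver1 >> ~/.ssh/known_hosts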
Common Error #2: "MODULE FAILURE"
Error Message:
fatal: [webserver1]: FAILED! => {
    "changed": false,
    "module_stderr": "/bin/sh: 1: /usr/bin/python: not found",
    "module_stdout": "",
    "msg": "MODULE FAILURE"
}
Causes and Solutions:
Python Not Installed on Target
# For Ubuntu/Debian (the raw module works without Python on the target)
ansible webserver1 -m raw -a "apt-get update && apt-get install -y python3" -b
# For RHEL/CentOS
ansible webserver1 -m raw -a "yum install -y python3" -b
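The same bootstrap can live in a playbook. The sketch below is a minimal example, assuming Debian/Ubuntu targets and working privilege escalation; the play name and the `test -e` guard are illustrative choices, not something the ad-hoc commands require. Fact gathering is switched off for the play because it needs Python on the target.

```yaml
# Hypothetical bootstrap play; assumes Debian/Ubuntu hosts with sudo access.
- name: Bootstrap Python on hosts that lack it
  hosts: all
  gather_facts: false   # fact gathering needs Python on the target
  become: true
  tasks:
    - name: Install python3 only when it is missing
      ansible.builtin.raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)

    - name: Gather facts now that Python is available
      ansible.builtin.setup:
```

Running this play first lets later plays use regular modules on freshly provisioned hosts.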
Wrong Python Interpreter
# Specify Python 3 in playbook
- hosts: all
  vars:
    ansible_python_interpreter: /usr/bin/python3
# Or in inventory
[all:vars]
ansible_python_interpreter=/usr/bin/python3
Auto-Discovery
# ansible.cfg
[defaults]
interpreter_python = auto_silent
Common Error #3: "Permission denied"
Error Message:
fatal: [webserver1]: FAILED! => {
    "changed": false,
    "msg": "Failed to set permissions on the temporary files",
    "failed": true
}
Solutions:
Use become (sudo)
- name: Install package
  apt:
    name: nginx
    state: present
  become: yes  # Enable sudo
# Or at play level
- hosts: all
  become: yes
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
Specify Sudo User
- name: Task requiring root
  command: systemctl restart nginx
  become: yes
  become_user: root
  become_method: sudo
Fix Sudo Configuration
# Allow user to sudo without password
# On remote host, run: sudo visudo
ubuntu ALL=(ALL) NOPASSWD:ALL
Common Error #4: "No package matching 'X' found"
Error Message:
fatal: [webserver1]: FAILED! => {
    "changed": false,
    "msg": "No package matching 'nginx' found available"
}
Solutions:
Update Package Cache First
- name: Update apt cache
  apt:
    update_cache: yes
  become: yes
- name: Install nginx
  apt:
    name: nginx
    state: present
  become: yes
Check Package Name for OS
- name: Install web server
  package:
    name: "{{ web_package }}"
    state: present
  vars:
    web_package: "{{ 'nginx' if ansible_os_family == 'Debian' else 'httpd' }}"
  become: yes
Enable Required Repositories
- name: Enable EPEL repository
  yum:
    name: epel-release
    state: present
  become: yes
  when: ansible_os_family == "RedHat"
Common Error #5: "Undefined variable"
Error Message:
fatal: [webserver1]: FAILED! => {
    "msg": "The task includes an option with an undefined variable. The error was: 'db_password' is undefined"
}
Solutions:
Define Missing Variable
# In playbook
- hosts: all
  vars:
    db_password: secret123
# Or in group_vars/all.yml
db_password: secret123
# Or pass via command line
ansible-playbook site.yml -e "db_password=secret123"
Use Default Filter
- name: Set password with fallback
  debug:
    msg: "Password is {{ db_password | default('defaultpass') }}"
Check Variable is Defined
- name: Fail if variable not set
  fail:
    msg: "db_password must be defined"
  when: db_password is not defined
- name: Only run if variable exists
  debug:
    msg: "Password is {{ db_password }}"
  when: db_password is defined
Common Error #6: "Template error"
Error Message:
fatal: [webserver1]: FAILED! => {
    "msg": "AnsibleError: template error while templating string: unexpected '}'"
}
Solutions:
Fix Jinja2 Syntax
# Wrong - unbalanced braces in a .j2 template
server_name {{ server_name };
# Correct - every {{ needs a matching }}
server_name {{ server_name }};
# Wrong - double quotes nested inside a double-quoted YAML string
msg: "{{ "hello" }}"
# Correct - use single quotes inside the expression
msg: "{{ 'hello' }}"
Escape Special Characters
# When a string must contain literal {{ or }}
- name: Print message
  debug:
    msg: "{{ '{{' }} This is literal braces {{ '}}' }}"
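For longer passages that contain literal braces, a Jinja2 raw block is often easier to read than escaping each pair. A minimal sketch (the task name and the placeholder text are illustrative):

```yaml
- name: Print literal braces with a raw block
  ansible.builtin.debug:
    # Everything between raw and endraw is passed through without templating.
    msg: "{% raw %}{{ this is printed as-is }}{% endraw %}"
```

This is handy when a template has to emit syntax for another templating system that also uses double braces.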
Common Error #7: "YAML Syntax Error"
Error Message:
ERROR! Syntax Error while loading YAML.
expected , but found ''
Solutions:
Fix Indentation
# Wrong - mixed indentation
tasks:
  - name: Task 1
    debug:
      msg: "Hello"
   - name: Task 2  # Wrong indent
     debug:
# Correct
tasks:
  - name: Task 1
    debug:
      msg: "Hello"
  - name: Task 2
    debug:
      msg: "World"
Quote Special Strings
# Wrong - unquoted special characters
name: Load: Balancer
# Correct
name: "Load: Balancer"
# Wrong - @ at start
email: @example.com
# Correct
email: "@example.com"
Multi-line Strings
# Use | for literal blocks
script: |
  #!/bin/bash
  echo "Line 1"
  echo "Line 2"
# Use > for folded blocks
description: >
  This is a long
  description that will
  be folded into one line
Common Error #8: "Timeout waiting for connection"
Solutions:
Increase Timeout
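# ansible.cfg
[defaults]
# Connection timeout in seconds
timeout = 30
# For long-running tasks, run asynchronously and poll for completion
- name: Slow operation
  command: /path/to/slow/script.sh
  async: 300  # maximum runtime in seconds
  poll: 10    # check status every 10 seconds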
Check Network Connectivity
# Test connection
ansible webserver1 -m ping
# Test with verbose output
ansible webserver1 -m ping -vvv
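If hosts time out because they are still booting or restarting, the play can wait for them explicitly instead of failing on the first attempt. The following is a sketch, assuming a `webservers` group as in the earlier examples; the delay and timeout values are arbitrary.

```yaml
# Sketch: assumes hosts in the 'webservers' group may still be booting.
- name: Wait for hosts before doing real work
  hosts: webservers
  gather_facts: false   # skip facts until the host is reachable
  tasks:
    - name: Wait up to 5 minutes for the connection to come up
      ansible.builtin.wait_for_connection:
        delay: 5        # seconds to wait before the first attempt
        sleep: 5        # seconds between attempts
        timeout: 300    # give up after this many seconds

    - name: Gather facts once the host answers
      ansible.builtin.setup:
```

Facts are gathered manually at the end because automatic gathering would itself fail while the host is unreachable.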
Common Error #9: "Failed to lock apt"
Error Message:
Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)
Solutions:
Wait for Lock
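- name: Wait for apt lock
  # Check both lock files; the error above refers to lock-frontend
  shell: while fuser /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock >/dev/null 2>&1; do sleep 1; done
  become: yes
  changed_when: false
- name: Install package
  apt:
    name: nginx
    state: present
  become: yes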
Use Retries
- name: Install nginx
  apt:
    name: nginx
    state: present
  become: yes
  register: result
  retries: 5
  delay: 10
  until: result is succeeded
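Newer releases of the apt module can also wait for the lock themselves. The sketch below assumes ansible-core 2.12 or later, where the `lock_timeout` parameter is available; the 120-second value is an arbitrary example.

```yaml
- name: Install nginx, waiting up to 2 minutes for the apt lock
  ansible.builtin.apt:
    name: nginx
    state: present
    lock_timeout: 120   # seconds to wait for the lock (assumes ansible-core >= 2.12)
  become: true
```

On older versions, the wait loop or the retry pattern remains the safer choice.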
Common Error #10: "Handler not found"
Error Message:
ERROR! The requested handler 'restart nginx' was not found
Solutions:
Check Handler Name Matches
# In tasks/main.yml
- name: Deploy config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx  # Must match handler name exactly
# In handlers/main.yml
- name: restart nginx  # Must match notify exactly
  service:
    name: nginx
    state: restarted
Include Handler File
# In a playbook, load the handler file under the handlers section
handlers:
  - name: Import handlers
    import_tasks: handlers/main.yml
# Or use role structure (handlers/main.yml is loaded automatically)
Debugging Techniques
1. Use debug Module
- name: Show variable value
  debug:
    var: my_variable
- name: Show message
  debug:
    msg: "The value is {{ my_variable }}"
- name: Show all variables
  debug:
    var: vars
- name: Show hostvars
  debug:
    var: hostvars[inventory_hostname]
2. Register and Inspect Results
- name: Run command
  command: cat /etc/os-release
  register: result
- name: Show stdout
  debug:
    var: result.stdout_lines
- name: Show all result data
  debug:
    var: result
3. Use assert for Validation
- name: Verify variable is set
  assert:
    that:
      - db_password is defined
      - db_password | length > 8
    fail_msg: "db_password must be defined and > 8 characters"
    success_msg: "Password validation passed"
4. Step Through Tasks
# Execute tasks one by one with confirmation
ansible-playbook site.yml --step
5. Start at Specific Task
# Start from a specific task
ansible-playbook site.yml --start-at-task="Install nginx"
6. Run Specific Tags
# Only run tagged tasks
ansible-playbook site.yml --tags "config,deploy"
# Skip specific tags
ansible-playbook site.yml --skip-tags "slow,optional"
Advanced Troubleshooting
Strategy: debug
# ansible.cfg
[defaults]
strategy = debug
# When task fails, drops to interactive debugger
# Commands: p (print), c (continue), q (quit), r (redo)
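The debugger can also be switched on for a single play or task with the `debugger` keyword, which avoids changing the strategy globally. A minimal sketch, with an intentionally failing command as a stand-in for a real task:

```yaml
- name: Debug only the risky task
  hosts: webservers
  tasks:
    - name: Task that fails on purpose (placeholder command)
      ansible.builtin.command: "false"
      debugger: on_failed   # open the debugger only if this task fails
```

Scoping the debugger this way keeps unattended runs, such as CI jobs, from stopping at an interactive prompt for unrelated failures.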
Enable Callback Plugins
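# ansible.cfg
# Note: keep comments on their own lines; an inline "#" becomes part of the value
[defaults]
# More readable output
stdout_callback = yaml
# Or, for detailed debugging output, use this instead:
# stdout_callback = debug
# Show task timing
callbacks_enabled = profile_tasks, timer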
Log Playbook Output
# ansible.cfg
[defaults]
log_path = ./ansible.log
# Or via command line
ansible-playbook site.yml 2>&1 | tee playbook.log
Prevention Best Practices
1. Always Check Syntax First
ansible-playbook site.yml --syntax-check
yamllint *.yml
ansible-lint site.yml
2. Use Check Mode in CI/CD
# In your CI pipeline
ansible-playbook site.yml --check --diff
3. Test Incrementally
# Test on one host first
- hosts: webservers[0]
  tasks:
    - name: Test task
      debug:
        msg: "Testing"
4. Use Molecule for Testing
cd roles/myrole
molecule test
Useful Commands Reference
# Test inventory
ansible-inventory --list
ansible-inventory --graph
# Test host connectivity
ansible all -m ping
# Get facts from host
ansible webserver1 -m setup
# Run ad-hoc command
ansible all -a "uptime"
# Find a module and read its documentation
ansible-doc -l | grep module_name
ansible-doc module_name
# See all configuration
ansible-config dump
# Inspect each host's variables
ansible all -m debug -a "var=hostvars[inventory_hostname]"
Conclusion
Troubleshooting Ansible doesn't have to be frustrating. With the right techniques and understanding of common errors, you can quickly identify and fix issues. Remember to:
- Start with increased verbosity (-vvv)
- Check syntax before running
- Use check mode to preview changes
- Debug with the debug module
- Test incrementally on small host groups
Pro Tip
Create a troubleshooting playbook that tests common failure points (connectivity, permissions, Python version, etc.). Run it before deploying to new infrastructure to catch issues early.
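One possible shape for such a playbook is sketched below. The file name `preflight.yml`, the task names, and the 60-second timeout are all hypothetical; it assumes SSH-reachable Linux hosts where privilege escalation is expected to work.

```yaml
# preflight.yml - hypothetical pre-flight checks; adjust hosts and thresholds.
- name: Pre-flight checks before deploying
  hosts: all
  gather_facts: false
  tasks:
    - name: Connectivity - wait for each host to accept connections
      ansible.builtin.wait_for_connection:
        timeout: 60

    - name: Python - gather a minimal fact set (fails if no interpreter is usable)
      ansible.builtin.setup:
        gather_subset:
          - "!all"

    - name: Python - require Python 3 on the target
      ansible.builtin.assert:
        that:
          - ansible_facts.python.version.major | int >= 3
        fail_msg: "Python 3 is required on {{ inventory_hostname }}"

    - name: Permissions - confirm privilege escalation works
      ansible.builtin.command: id -u
      become: true
      changed_when: false   # read-only check, never reports a change
```

Each task maps to one of the failure points covered earlier, so a failure here points directly at the matching section of this guide.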