Ansible Best Practices

Why Best Practices Matter

Following Ansible best practices ensures maintainable, scalable, and reliable automation code. Well-structured playbooks are easier to understand, test, and debug, leading to faster development and fewer production issues.

Core Principles:
  • Simplicity: Keep playbooks and roles simple and focused
  • Idempotence: Playbooks should be safely re-runnable
  • Version Control: Track all changes in Git
  • Documentation: Make code self-documenting with clear names and comments

Directory Structure

Standard Layout

ansible-project/
├── ansible.cfg                 # Ansible configuration
├── inventory/                  # Inventory files
│   ├── production/
│   │   ├── hosts              # Production inventory
│   │   └── group_vars/
│   │       └── all.yml
│   └── staging/
│       ├── hosts              # Staging inventory
│       └── group_vars/
│           └── all.yml
├── group_vars/                 # Variables for groups
│   ├── all.yml
│   ├── webservers.yml
│   └── databases.yml
├── host_vars/                  # Variables for specific hosts
│   ├── web01.yml
│   └── db01.yml
├── playbooks/                  # Playbooks
│   ├── site.yml               # Master playbook
│   ├── webservers.yml
│   └── databases.yml
├── roles/                      # Roles
│   ├── common/
│   ├── webserver/
│   └── database/
├── files/                      # Static files
├── templates/                  # Jinja2 templates
├── library/                    # Custom modules
├── filter_plugins/             # Custom filters
├── vars/                       # Additional variables
└── requirements.yml            # Role/collection dependencies
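
The `requirements.yml` file at the bottom of the layout is where role and collection versions are pinned. The following is a minimal sketch, assuming a hypothetical Git-hosted role (`git.example.com/ops/ansible-role-nginx`) and illustrative version numbers; substitute the sources and releases you have actually tested.

```yaml
# requirements.yml (illustrative; names, URLs, and versions are placeholders)
---
roles:
  # Hypothetical role pulled from a Git repository at a fixed tag
  - name: nginx
    src: https://git.example.com/ops/ansible-role-nginx.git
    scm: git
    version: v1.2.3

collections:
  # Pin an exact release rather than a floating range
  - name: community.general
    version: "8.0.0"
```

Roles and collections can then be installed with `ansible-galaxy role install -r requirements.yml` and `ansible-galaxy collection install -r requirements.yml`.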

Role Structure

roles/webserver/
├── tasks/
│   ├── main.yml               # Main task file
│   ├── install.yml            # Installation tasks
│   ├── configure.yml          # Configuration tasks
│   └── service.yml            # Service management
├── handlers/
│   └── main.yml               # Handlers
├── templates/
│   ├── nginx.conf.j2
│   └── vhost.conf.j2
├── files/
│   └── index.html
├── vars/
│   └── main.yml               # Role variables
├── defaults/
│   └── main.yml               # Default variables
├── meta/
│   └── main.yml               # Role metadata
├── tests/
│   ├── inventory
│   └── test.yml
└── README.md                   # Role documentation
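
With the task files split as shown, `tasks/main.yml` can stay a short index that pulls in each file. A possible sketch, assuming the three file names from the tree above and tag names chosen for illustration:

```yaml
# roles/webserver/tasks/main.yml
---
- name: Install web server packages
  ansible.builtin.import_tasks: install.yml
  tags:
    - install

- name: Configure web server
  ansible.builtin.import_tasks: configure.yml
  tags:
    - configure

- name: Manage web server service
  ansible.builtin.import_tasks: service.yml
  tags:
    - service
```

Because `import_tasks` is static, the tags are inherited by every task in the imported file.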

Naming Conventions

Tasks and Plays

# Good: Descriptive task names
- name: Install nginx web server
  apt:
    name: nginx
    state: present

- name: Copy nginx configuration
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf

- name: Ensure nginx is running and enabled
  systemd:
    name: nginx
    state: started
    enabled: yes

# Bad: Unclear names
- name: Install
  apt: name=nginx

- name: Copy file
  copy: src=config dest=/etc/config

Variables

# Good: Clear, descriptive names
nginx_port: 80
nginx_worker_processes: 4
app_version: "1.2.3"
database_host: "db.example.com"

# Use prefixes to avoid conflicts
nginx_conf_dir: /etc/nginx
nginx_log_dir: /var/log/nginx
mysql_root_password: "{{ vault_mysql_root_password }}"

# Bad: Ambiguous names
port: 80
processes: 4
version: "1.2.3"

Files and Templates

# Use .j2 extension for templates
templates/nginx.conf.j2
templates/app-config.yml.j2

# Descriptive file names
files/ssl-cert.pem
files/application-config.json

# Role-specific naming
roles/webserver/templates/webserver-nginx.conf.j2
roles/database/templates/database-my.cnf.j2

Playbook Organization

Master Playbook Pattern

# playbooks/site.yml - Master playbook
# import_playbook paths are resolved relative to this file
---
- name: Configure all servers
  import_playbook: common.yml

- name: Configure web servers
  import_playbook: webservers.yml

- name: Configure database servers
  import_playbook: databases.yml

- name: Configure load balancers
  import_playbook: loadbalancers.yml

Role-Based Playbooks

# playbooks/webservers.yml
---
- name: Configure web servers
  hosts: webservers
  become: yes
  roles:
    - common
    - security
    - webserver
    - monitoring
  tags:
    - webservers

Variable Management

Variable Precedence Strategy

# Use defaults for... defaults (roles/*/defaults/main.yml)
nginx_worker_processes: auto
nginx_port: 80

# Use group_vars for environment-specific settings
# inventory/production/group_vars/all.yml
app_environment: production   # 'environment' is a reserved keyword, so avoid it as a variable name
log_level: warning

# inventory/staging/group_vars/all.yml
app_environment: staging
log_level: debug

# Use host_vars for host-specific settings
# host_vars/web01.yml
server_id: 1
datacenter: us-east-1

# Use extra vars (-e) for runtime overrides
ansible-playbook playbook.yml -e "app_version=1.2.3"

Separating Secrets

# group_vars/all/vars.yml - Public variables
app_name: myapp
app_port: 8080

# group_vars/all/vault.yml - Encrypted secrets
vault_db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  ...
vault_api_key: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  ...

# Reference in vars.yml
db_password: "{{ vault_db_password }}"
api_key: "{{ vault_api_key }}"
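
Keeping secrets in Vault does not stop them from appearing in task output or diffs once they are used. One way to limit that exposure is `no_log`. This is a sketch only, assuming a hypothetical template `db-credentials.conf.j2` that references `db_password` and a hypothetical destination path:

```yaml
- name: Render database credentials file
  ansible.builtin.template:
    src: db-credentials.conf.j2          # hypothetical template using db_password
    dest: /etc/myapp/db-credentials.conf # hypothetical destination
    owner: root
    group: root
    mode: '0600'
  no_log: true   # keep the rendered secret out of task output and logs
```

The trade-off is that failures in this task are harder to debug, so it is usually applied only to tasks that actually handle secret values.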

Idempotence

Ensure Idempotent Tasks

# Good: Idempotent
- name: Ensure line in config file
  lineinfile:
    path: /etc/myapp/config
    line: "setting=value"
    regexp: '^setting='

- name: Ensure package is installed
  apt:
    name: nginx
    state: present

# Bad: Not idempotent; appends a duplicate line on every run
- name: Append to file
  shell: echo "line" >> /etc/myapp/config

# Better: Guard the command and report changes accurately
- name: Append to file if not present
  shell: |
    grep -qxF "line" /etc/myapp/config || { echo "line" >> /etc/myapp/config; echo appended; }
  register: append_result
  changed_when: "'appended' in append_result.stdout"
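
When a command leaves a file behind, the `creates` argument is a simpler guard: Ansible skips the task once that path exists. A sketch, assuming a hypothetical one-time setup script and a marker file that the script is assumed to write on success:

```yaml
- name: Initialize application database once
  ansible.builtin.command:
    cmd: /opt/app/bin/init-db --config /etc/myapp/config  # hypothetical script
    creates: /opt/app/.db-initialized                     # assumed marker file
```

Note that `creates` only checks that the path exists, so it suits one-time setup steps rather than edits to a file that is already present.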

Use Module State Parameters

# Always specify state
- name: Ensure service is started
  systemd:
    name: nginx
    state: started       # Declare the desired end state explicitly
    enabled: yes

- name: Ensure package is latest
  apt:
    name: nginx
    state: latest        # Explicit, but upgrades on every run; pin a version where reproducibility matters

- name: Ensure directory exists
  file:
    path: /opt/app
    state: directory
    owner: app
    group: app
    mode: '0755'

Task Organization

Use Tags Effectively

- name: Install packages
  apt:
    name: "{{ packages }}"
    state: present
  tags:
    - install
    - packages

- name: Configure application
  template:
    src: app.conf.j2
    dest: /etc/app/app.conf
  tags:
    - configure
    - config

- name: Start service
  systemd:
    name: app
    state: started
  tags:
    - service
    - deploy

# Run specific tags
ansible-playbook site.yml --tags configure
ansible-playbook site.yml --tags "install,configure"
ansible-playbook site.yml --skip-tags deploy
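
Ansible also reserves two special tags, `always` and `never`, which change how a task responds to `--tags`. A short sketch, with the vars file, cache path, and `cache-reset` tag all invented for illustration:

```yaml
- name: Load required variables
  ansible.builtin.include_vars:
    file: vars/app.yml        # hypothetical vars file
  tags:
    - always                  # runs for any --tags selection unless explicitly skipped

- name: Wipe application cache
  ansible.builtin.file:
    path: /var/cache/app      # hypothetical path
    state: absent
  tags:
    - never                   # skipped unless one of this task's tags is requested
    - cache-reset
```

With this in place, `ansible-playbook site.yml --tags cache-reset` would run the cache wipe, while a plain run would not.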

Use Blocks for Grouping

- name: Web server setup
  block:
    - name: Install nginx
      apt:
        name: nginx
        state: present

    - name: Configure nginx
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf

    - name: Start nginx
      systemd:
        name: nginx
        state: started

  when: inventory_hostname in groups['webservers']
  become: yes
  tags: webserver

Include and Import

# Use import_tasks for static includes (processed at parse time)
- name: Include static tasks
  import_tasks: install.yml
  tags: install

# Use include_tasks for dynamic includes (processed at runtime)
- name: Include dynamic tasks
  include_tasks: "{{ ansible_os_family }}.yml"

# Use import_role for static role inclusion
- name: Apply common role
  import_role:
    name: common

# Use include_role for dynamic role inclusion
- name: Apply environment-specific role
  include_role:
    name: "{{ app_environment }}_config"   # avoid the reserved name 'environment' for variables
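
One practical difference between the two styles is tag handling: tags on an `import_*` statement are inherited by the imported tasks, while tags on an `include_*` statement apply only to the include itself. A sketch of passing tags through a dynamic include with `apply`, reusing the OS-family file name pattern from above:

```yaml
- name: Include OS-specific install tasks
  ansible.builtin.include_tasks:
    file: "{{ ansible_os_family }}.yml"
    apply:
      tags:
        - install        # propagated to every task in the included file
  tags:
    - install            # lets --tags install select the include itself
```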

Code Quality

Use Whitespace and Formatting

# Good: Readable formatting
- name: Configure application
  template:
    src: app.conf.j2
    dest: /etc/app/app.conf
    owner: app
    group: app
    mode: '0644'
  notify: Restart application

# Bad: Hard to read
- name: Configure application
  template: src=app.conf.j2 dest=/etc/app/app.conf owner=app group=app mode=0644
  notify: Restart application
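
Both versions notify a handler named `Restart application`, which must be defined under exactly that name. A minimal sketch of such a handler, assuming a hypothetical role called `app` and a systemd unit also called `app`:

```yaml
# roles/app/handlers/main.yml (hypothetical role and service name)
---
- name: Restart application
  ansible.builtin.systemd:
    name: app
    state: restarted
```

The handler runs once at the end of the play, and only if a notifying task reported a change.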

Use Comments

# Explain why, not what
# Increase worker processes for high-traffic production environment
nginx_worker_processes: 16

# Workaround for Ubuntu 20.04 systemd bug #12345
- name: Reload systemd daemon
  systemd:
    daemon_reload: yes
  when:
    - ansible_distribution == "Ubuntu"
    - ansible_distribution_version == "20.04"

# Document complex logic
# Calculate optimal pool size: (cores * 2) + 1
db_pool_size: "{{ (ansible_processor_vcpus * 2) + 1 }}"

Linting

# Use ansible-lint
pip install ansible-lint

# Run on playbooks
ansible-lint playbooks/

# Run on specific playbook
ansible-lint playbooks/webservers.yml

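# Configuration: .ansible-lint
# Rule names below replace the numeric IDs (403, 701) used by ansible-lint 4.x;
# run `ansible-lint -L` to list the rules your version supports
---
skip_list:
  - package-latest  # Package installs should not use latest
  - meta-no-info    # meta/main.yml should contain relevant info

warn_list:
  - experimental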

exclude_paths:
  - .cache/
  - .github/
  - molecule/

# Use yamllint for YAML syntax
pip install yamllint

# Configuration: .yamllint
extends: default
rules:
  line-length:
    max: 120
  indentation:
    spaces: 2

Security Best Practices

Never Commit Secrets

# Use Ansible Vault for secrets
ansible-vault create group_vars/all/vault.yml
ansible-vault edit group_vars/all/vault.yml

# .gitignore
*.vault
.vault_pass
*_vault.yml
secrets/

# Environment variables for CI/CD
# Never hardcode in playbooks
db_password: "{{ lookup('env', 'DB_PASSWORD') }}"
api_key: "{{ lookup('env', 'API_KEY') }}"

Principle of Least Privilege

# Only use become when necessary
- name: Read configuration
  slurp:
    src: /etc/app/config.yml
  # No become needed for reading

- name: Install package
  apt:
    name: nginx
    state: present
  become: yes  # Only here

# Specify become_user when not root
- name: Deploy application as app user
  copy:
    src: app.jar
    dest: /opt/app/
  become: yes
  become_user: app

Validate Input

# Validate required variables
- name: Validate prerequisites
  assert:
    that:
      - db_host is defined
      - db_host | length > 0
      - db_port is defined
      - db_port | int > 0
      - db_port | int < 65536
    fail_msg: "Database configuration is invalid"

# Validate file contents
- name: Validate config file syntax
  command: nginx -t -c {{ nginx_config_file }}
  changed_when: false
  check_mode: no
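
For configuration files, validation can also happen before the file is put in place. The `template` and `copy` modules accept a `validate` command that runs against the temporary file. A sketch, assuming the main nginx configuration from the earlier examples:

```yaml
- name: Deploy nginx configuration only if it parses
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    validate: nginx -t -c %s   # %s is replaced with the temporary file path
```

If the validation command fails, the task fails and the existing file is left untouched.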

Performance Optimization

Optimize Fact Gathering

# ansible.cfg
[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400

# Disable facts when not needed
- name: Simple playbook
  hosts: all
  gather_facts: no
  tasks:
    - name: Task that doesn't need facts
      ping:

# Gather only needed facts
- name: Selective fact gathering
  hosts: all
  gather_facts: no          # Skip the implicit full gather
  tasks:
    - name: Gather distribution facts only
      setup:
        filter: "ansible_distribution*"
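
Another option is to keep implicit gathering but restrict which collectors run, using the play-level `gather_subset` keyword. A sketch, assuming the play only needs network facts on top of the minimal set:

```yaml
- name: Gather a reduced fact set
  hosts: all
  gather_facts: true
  gather_subset:
    - "!all"        # drop the default collectors; the minimal set is still gathered
    - network       # add back only what the play uses
  tasks:
    - name: Show the default IPv4 address
      ansible.builtin.debug:
        var: ansible_default_ipv4.address
```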

Use Native Modules

# Bad: Using shell for everything
- name: Install package
  shell: apt-get install -y nginx

- name: Copy file
  shell: cp /source/file /dest/file

# Good: Using native modules
- name: Install package
  apt:
    name: nginx
    state: present

- name: Copy file
  copy:
    src: /source/file
    dest: /dest/file
    remote_src: yes      # Copy between paths on the managed host, like cp

Avoid Loops Where Possible

# Bad: Loop over packages
- name: Install packages
  apt:
    name: "{{ item }}"
  loop:
    - nginx
    - postgresql
    - redis

# Good: Single transaction
- name: Install packages
  apt:
    name:
      - nginx
      - postgresql
      - redis
    state: present

Testing

Test-Driven Development

# Include tests in roles
roles/webserver/
├── molecule/
│   └── default/
│       ├── molecule.yml
│       ├── converge.yml
│       └── verify.yml
└── tests/
    ├── inventory
    └── test.yml

# Run tests
molecule test

# Continuous testing in CI/CD
# .github/workflows/test.yml
- name: Test with Molecule
  run: molecule test
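
The `verify.yml` file in the scenario directory holds the checks that Molecule runs after converging. A minimal sketch, assuming Molecule's Ansible verifier, a test instance where package facts can be collected, and a role that installs a package named `nginx`:

```yaml
# roles/webserver/molecule/default/verify.yml
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect installed packages
      ansible.builtin.package_facts:
        manager: auto

    - name: Assert nginx is installed
      ansible.builtin.assert:
        that:
          - "'nginx' in ansible_facts.packages"
        fail_msg: "nginx package is not installed"

    - name: Check that the nginx configuration parses
      ansible.builtin.command: nginx -t
      changed_when: false   # read-only check, never reports a change
```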

Validate Changes

# Always check syntax first
ansible-playbook playbook.yml --syntax-check

# Dry run before applying
ansible-playbook playbook.yml --check --diff

# Test on single host first
ansible-playbook playbook.yml --limit test-server

# Run twice to verify idempotence
ansible-playbook playbook.yml
ansible-playbook playbook.yml  # Should show changed=0

Documentation

Role README

# roles/webserver/README.md
# Webserver Role

Installs and configures Nginx web server.

## Requirements

- Ubuntu 20.04 or later
- Python 3.8+

## Role Variables

- `nginx_port`: Port for nginx (default: 80)
- `nginx_worker_processes`: Number of workers (default: auto)
- `nginx_sites`: List of sites to configure

## Dependencies

- common
- security

## Example Playbook

```yaml
- hosts: webservers
  roles:
    - role: webserver
      nginx_port: 8080
      nginx_sites:
        - name: example.com
          root: /var/www/example
```

## License

MIT

Playbook Documentation

---
# Playbook: webservers.yml
# Purpose: Configure web servers for production
# Author: DevOps Team
# Last Updated: 2024-01-15
#
# Usage:
#   ansible-playbook playbooks/webservers.yml -i inventory/production
#
# Tags:
#   - install: Install packages
#   - configure: Configure services
#   - deploy: Deploy application

- name: Configure web servers
  hosts: webservers
  become: yes
  roles:
    - webserver

Version Control

Git Best Practices

# .gitignore
*.retry
.vault_pass
*.pyc
__pycache__/
.ansible/
.cache/
*.log

# Use meaningful commit messages
git commit -m "Add nginx configuration template for multi-site support"
git commit -m "Fix: Correct variable precedence in database role"

# Branch strategy
main          # Production-ready code
develop       # Development branch
feature/*     # Feature branches
hotfix/*      # Emergency fixes

# Tag releases
git tag -a v1.2.3 -m "Release version 1.2.3"

Common Anti-Patterns to Avoid

Avoid These:
  • Using `latest`: Pin versions for reproducibility
  • Ignoring Idempotence: All playbooks should be re-runnable
  • Complex Jinja2 in Playbooks: Move logic to filters or modules
  • Hardcoded Values: Use variables and defaults
  • No Error Handling: Use blocks, failed_when, and rescue
  • Everything in One File: Split into roles and includes
  • No Testing: Test playbooks before production
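
For the error-handling item above, `block` can be combined with `rescue` and `always` sections. A sketch only, assuming a hypothetical release archive on the control node, a systemd unit named `app`, and a scratch directory invented for illustration:

```yaml
- name: Deploy application with error handling
  block:
    - name: Install new release
      ansible.builtin.unarchive:
        src: "releases/app-{{ app_version }}.tar.gz"   # hypothetical artifact
        dest: /opt/app

    - name: Restart application
      ansible.builtin.systemd:
        name: app
        state: restarted
  rescue:
    - name: Report the failed task
      ansible.builtin.debug:
        msg: "Deploy failed in task: {{ ansible_failed_task.name }}"

    - name: Fail the play after reporting
      ansible.builtin.fail:
        msg: "Deployment of {{ app_version }} failed"
  always:
    - name: Remove temporary files
      ansible.builtin.file:
        path: /tmp/app-deploy   # hypothetical scratch directory
        state: absent
```

Tasks under `rescue` run only when a task in the block fails, and tasks under `always` run in either case.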

Quick Reference Checklist

☐ Use descriptive names for tasks, plays, and variables
☐ Organize code with standard directory structure
☐ Store secrets in Ansible Vault
☐ Use version control (Git)
☐ Implement proper variable precedence
☐ Ensure idempotence
☐ Add comments for complex logic
☐ Use native modules over shell/command
☐ Tag tasks appropriately
☐ Lint playbooks with ansible-lint
☐ Test changes before deployment
☐ Document roles and playbooks
☐ Pin versions of roles and collections
☐ Use blocks for error handling
☐ Optimize fact gathering
☐ Follow security best practices

Next Steps