10 Apr 2019 • on ansible cloud9 aws software provisioning configuration management git

Jumpstart Guide to Ansible

Introduction

This guide will help you get started on using Ansible, an open-source tool you can use to automate and maintain the software and configurations of your Linux systems, as well as handle custom software deployments.

Ansible is programmed in Python, however you do not need to know Python to use it. You may need to gain an understanding of YAML, Jinja templates, and/or read the Ansible Project Documentation.

In this guide we use Ansible to maintain the state of your local development environment. After you are done you’ll be able to apply what you’ve learned beyond your local Linux environment.

To follow this guide, it is recommended that you setup an Amazon Cloud9 environment with the default Amazon Linux EC2 instance. The Cloud9 service is free, as is the EC2 service to new Amazon AWS users for the first 12 months. If you’ve used Amazon AWS before, then your cost should be less than $10 per month as long as you are using a t2.micro EC2 instance that is configured to turn off after 30 minutes of non-activity.

If you do not have an environment setup yet, see the Amazon Cloud9 Environment Setup guide to get started.

This guide also assumes that you have experince with Git. This guide is oriented around the “infrastructure as code” methodology, where all our desired configurations are stored in a Git repository.

Establishing a Git repository

Create a new Github repository to store your Ansible configuration. After you’ve done this, use git to clone the repository into your local environment.

Here’s the command I used with my own repository:

git clone git@github.com:redconfetti/cloud9-dev.git

Installing the Ansible Engine

Amazon Linux includes Python 2.7 and 3.5 by default. Installing Ansible using the Yum package manager leads to conflicts, so use the Python 2.7 version of PIP to install Ansible.

# Install Ansible using Python 2.7
/usr/bin/pip-2.7 install ansible

If you’re using a different Linux distribution, see the Ansible Installation Guide for other system installation instructions.

Ansible Configuration

The default Ansible configuration file is located under /etc/ansible/ansible.cfg. Ansible will over-ride these default settings if there are any found in a .ansible.cfg file located in your home directory, or an ansible.cfg file located in the current directory.

Ansible is designed to be used from a specific machine, designated as the “management node”, that is used to orchestrate the setup and configuration of other machines. This is why the default configurations exist globally under /etc/ansible.

For our purposes we want Ansible to use the configuration we place in our repository, so create a new ansible.cfg file in the root folder of your repository with the following contents

[defaults]
inventory = hosts.yml

This will help us to avoid needing to use the -i hosts.yml parameter with our Ansible commands, however it does introduce a possible security risk. Since we’re running an EC2 instance that we’re not sharing with other users, we’re not running the risk of another user modifying our ansible.cfg file.

Ansible Configuration Documentation:

Ansible Inventory

Ansible is able to manage the state of multiple machines. By default, the inventory is defined in /etc/ansible/hosts using a format similar to an INI file. Alternatively you can use a YAML file to define your inventory.

As you saw above, we configured Ansible to recognize the hosts.yml file located in the root directory of the repository.

Create the hosts.yml file with the following contents that defines the local machine as ‘local’.

all:
  hosts:
    local:
      ansible_host: localhost
      ansible_connection: local

This configures a host named ‘local’ that represents the local EC2 instance that you’re working from when you use the command line interface (Terminal) in Cloud9.

For more information, see Ansible - Working with Inventory

Ansible Ad-Hoc Commands

Ansible is designed to connect to hosts configured in your inventory file using SSH key authentication. Because we’re configured to connect to our local system, we do not have to worry about SSH connections or configuration.

Ansible can be used to perform checks against hosts using a single command line command, known as an ‘ad-hoc’ command.

Run the command ansible all -m ping and you should see the following.

$ ansible all -m ping
local | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

If you see the output above indicating success, then you’re ready to move onto the next step.

For more information, see Introduction To Ad-Hoc Commands.

Ansible Playbooks

Ad-hoc commands are useful, but playbooks are where we can define multiple tasks that maintain the state of your system(s).

Much like the hosts.yml file defined above, playbooks are defined in YAML format. YAML is not a programming language, but a format for storing information. In this case the information is our configuration that Ansible modules are defined to interpret and apply to the hosts we specify.

A playbook can contain one or more “plays” that applies a certain state to a group of hosts.

The Amazon Linux distribution uses a package manager called Yum (Yellowdog Updater Modified). Ansible provides a Yum module which allows us to use this package manager to install and update software packages on the host.

Create a file named local.yml with the following contents in your root directory.

---
# This playbook deploys the entire setup to the Cloud9 development environment.
- hosts: local
  tasks:
    - name: upgrade all packages
      become: yes
      yum:
        name: '*'
        state: latest

Next, run the following command to run this playbook.

ansible-playbook local.yml

You should see output similar to this:

PLAY [local] *******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [local]

TASK [upgrade all packages] ****************************************************
ok: [local]

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0

Now all of the software packages installed within your EC2 instance running Amazon Linux are updated.

See Intro to Playbooks or Modules by Category.

Ansible Modules

We just covered a task that used the Yum module to update all the installed software packages on the server.

Ansible supports many modules that serve many different purposes. You can view a list of all available modules by running:

ansible-doc -l

You can read the documentation for a specific module by using the command:

ansible-doc <module-name>

# View 'hostname' module documentation
ansible-doc hostname

You can also browse modules online starting with Modules by Category.

Ansible Templates

Most Unix/Linux utilities and daemons can be configured using plain text files. Ansible supports the ability to define configuration file templates that use the Jinja2 template syntax.

Let’s create a file called gitconfig.j2 to define our Git client configuration.

# {{ ansible_managed }}
[core]
  editor = /usr/bin/nano
[user]
  name = {{git_client_name}}
  email = {{git_client_name}}

Now update the local.yml playbook file so that it includes the ‘vars’ that define our git client name and email address. Also add the tasks to output the value of the ansible_user_id variable, and configure the .gitconfig in that users home directory.

---
# This playbook deploys the entire setup to the Cloud9 development environment.
- hosts: local
  vars:
    git_client_name: PeeWee Herman
    git_client_email: peewee@example.com
  tasks:
    - name: upgrade all packages
      yum:
        name: '*'
        state: latest
    - debug: var=ansible_user_id
    - name: configure git client
      template:
        src: gitconfig
        dest: "/home/{{ ansible_user_id }}/.gitconfig"
        owner: "{{ ansible_user_id }}"
        group: "{{ ansible_user_id }}"
        mode: 0644

As you can see, the ‘debug’ task allows us to see what value is registered for the ansible_user_id variable. This is useful for troubleshooting your task configurations when they are failing.

$ ansible-playbook local.yml

PLAY [local] *******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [local]

TASK [upgrade all packages] ****************************************************
ok: [local]

TASK [debug] *******************************************************************
ok: [local] => {
    "ansible_user_id": "ec2-user"
}

TASK [configure git client] ****************************************************
changed: [local]

PLAY RECAP *********************************************************************
localhost                  : ok=4    changed=1    unreachable=0    failed=0

Now check the state of the Git configuration file.

$ cat ~/.gitconfig
# Ansible managed
[core]
  editor = /usr/bin/nano
[user]
  name = PeeWee Herman
  email = peewee@example.com

The ‘Ansible managed’ comment at the top is simply a string that can be configured in the ansible.cfg to inform users that the configuration file being viewed is configured by Ansible.

Ansible managed is the default string for this variable. You can redefine this string to include the user-id of the Ansible user as well as the date and time, although this will result in Ansible reporting that the coniguration file has been changed everytime you run the task.

See ansible_managed

ansible.cfg

[defaults]
inventory = hosts.yml
ansible_managed = Ansible managed, do not edit directly: last update by {uid} on %Y-%m-%d, %H:%M:%S

Let’s run the Playbook one more time.

$ ansible-playbook local.yml

PLAY [local] *******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [local]

TASK [upgrade all packages] ****************************************************
ok: [local]

TASK [debug] *******************************************************************
ok: [local] => {
    "ansible_user_id": "ec2-user"
}

TASK [configure git client] ****************************************************
changed: [local]

PLAY RECAP *********************************************************************
localhost                  : ok=4    changed=1    unreachable=0    failed=0  

As you can see, it reports that it changed the configuration file. If we check the configuration file, you’ll see that it now includes the message with user, date, and time.

$ cat ~/.gitconfig
# Ansible managed, do not edit directly: last update by ec2-user on 2019-03-25, 06:28:58
[core]
  editor = /usr/bin/nano
[user]
  name = PeeWee Herman
  email = peewee@example.com

Ansible Roles

If you placed all your tasks in a playbook, it could get very large and wouldn’t be as easy to manage. This is why Ansible supports grouping your automation components (variables, templates, tasks, etc) into re-usable “roles”.

It’s often that a role named ‘common’ is created to store settings that apply across all your hosts, such as updating the system software using Yum, setting the timezone, etc. Other roles are defined separate of the ‘common’ role that apply to certain nodes of your cluster or network, such as ‘webserver’ or ‘loadbalancer’. You can even get more detailed with roles named after the daemons you want running, such as ‘apache’ or ‘nginx’. It’s all up to you.

Taken from Ansible Docs - Roles

Roles expect files to be in certain directory names. Roles must include at least one of these directories, however it is perfectly fine to exclude any which are not being used. When in use, each directory must contain a main.yml file, which contains the relevant content:

tasks - contains the main list of tasks to be executed by the role.

handlers - contains handlers, which may be used by this role or even anywhere outside this role.

files - contains files which can be deployed via this role.

templates - contains templates which can be deployed via this role.

vars - other variables for the role (see Ansible Docs - Using Variables for more information).

defaults - default variables for the role (see Ansible Docs - Using Variables for more information).

meta - defines some meta data for this role. See below for more details.

Let’s move our tasks to a new role called ‘development’.

To get started, establish a directory for your roles to contain the development role, containing a tasks and templates folder. Within tasks create a new main.yml file, and move the Git configuration template into the templates folder.

It should look like this when you’re done:

roles
- development
  - tasks
    - main.yml
  - templates
    - gitconfig.j2

mkdir -p roles/development/tasks
mkdir -p roles/development/templates
touch roles/development/tasks/main.yml
mv gitconfig.j2 roles/development/templates

Next move your tasks from the local.yml to roles/development/tasks/main.yml. Additionally, update the path for the gitconfig template so that it is reflected as templates/gitconfig (you don’t need to include the .j2 file extension).

---
- name: upgrade all packages
  become: yes
  yum:
    name: '*'
    state: latest
- debug: var=ansible_user_id
- name: configure git client
  template:
    src: templates/gitconfig
    dest: "/home/{{ ansible_user_id }}/.gitconfig"
    owner: "{{ ansible_user_id }}"
    group: "{{ ansible_user_id }}"
    mode: 0644

Now update local.yml so that it no longer defines the tasks for the play, but instead points to the development role.

---
# This playbook deploys the entire setup to the Cloud9 development environment.
- hosts: local
  vars:
    git_client_name: PeeWee Herman
    git_client_email: peewee@example.com
  roles:
    - development

If you run ansible-playbook local.yml once again, the output should be the same as before with no errors.

Variables

The Ansible engine makes it possible to re-use a role that you have defined on different hosts with differing configurations. This is made possible through the use of variables. Variables can be defined in your hosts file per each host.

Let’s move our Git name and email address from our local.yml playbook, and place it within hosts.yml so that it applies to our local host.

Here we’ve defined a ‘development’ group, and have placed our ‘local’ host underneath that group. We’ve also defined the variables that should apply to all hosts within that ‘development’ group.

all:
  children:
    development:
      hosts:
        local:
          ansible_host: localhost
          ansible_connection: local
      vars:
        git_client_name: PeeWee Herman
        git_client_email: peewee@example.com

If you want to inspect your inventory for errors, you can use the ansible-inventory command to inspect how Ansible is interpretting your configuration.

$ ansible-inventory --list
{
    "_meta": {
        "hostvars": {
            "local": {
                "ansible_connection": "local",
                "ansible_host": "localhost",
                "git_client_email": "peewee@example.com",
                "git_client_name": "PeeWee Herman",
                "repository": "https://github.com/redconfetti/redconfetti.github.io",
            }
        }
    },
    "all": {
        "children": [
            "development",
            "ungrouped"
        ]
    },
    "development": {
        "hosts": [
            "local"
        ]
    },
    "ungrouped": {}
}

$ ansible-inventory --graph
@all:
  |--@development:
  |  |--local
  |--@ungrouped:

The preferred way to store variables isn’t to define them in the hosts.yml file though. Instead you can create a host_vars directory, and then name the files within it after each host. You can also create a group_vars directory, and then name the files within it after each group.

Let’s create a host variable file for our ‘local’ host.

mkdir host_vars
touch host_vars/local.yml

Next let’s move our variables from the hosts.yml into our host_vars/local.yml file.

host_vars/local.yml

git_client_name: PeeWee Herman
git_client_email: peewee@example.com

This should result in our hosts file being slim again.

hosts.yml

all:
  children:
    development:
      hosts:
        local:
          ansible_host: localhost
          ansible_connection: local

Facts

In addition to the variables that you define, you can also use a feature of Ansible known as “facts”. Facts are variables created by the Ansible engine that contain information about the host you are logging into… what user Ansible is acting as on the system, the command line environment, networking, etc.

To get a list of all the variables Ansible collects about our system run the command: ansible local -m setup.

Defaults

Roles are intended to be re-usable, but it could be tedious having to define variables for every single value that a task or template might need defined.

This is why roles support defaults defined in defaults/main.yml.

To make our role more flexible, let’s define defaults for the name and email address used by the Git client.

mkdir -p roles/development/defaults
touch roles/development/defaults/main.yml

After you’ve done this, add the following to defaults/main.yml.

git_client_name: "{{ ansible_user_gecos }}"
git_client_email: "{{ ansible_user_id }}@{{ ansible_fqdn }}"

To properly test this, go into host_vars/local.yml and comment out the git_client_name and git_client_email variables.

# git_client_name: PeeWee Herman
# git_client_email: peewee@example.com

Run the playbook once more and see what happens.

$ ansible-playbook local.yml

$ cat ~/.gitconfig
# Ansible managed, do not edit directly: last update by ec2-user on 2019-03-25, 19:42:28
[core]
  editor = /usr/bin/nano
[user]
  name = EC2 Default User
  email = ec2-user@ip-172-31-18-245.us-west-2.compute.internal

As you can see Ansible provided our defaults in the configuration file. Now our role is more reusable.

For more information, see Ansible Docs - Using Variables.

Defining Variables in Files

It’s worth mentioning that you can also define variables in external files.

You more than likely have some passwords and API keys that you surely don’t want to check into your repository (even if it’s private).

To work around this, you can configure your playbook like the following.

---
# This playbook deploys the entire setup to the Cloud9 development environment.
- hosts: local
  roles:
   - development
  vars_files:
    - ~/.ansible_secrets.yml
  tasks:
    - debug: var=secret_password
    - debug: var=some_api_key

Next create a ~/.ansible_secrets.yml file with the following contents:

---
secret_password: abcdef123456
some_api_key: A65B90148F091E5F3F7E4DFFECD1B074

When you run your playbook, it should reflect the secrets you configured.

$ ansible-playbook local.yml

PLAY [local] *******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [local]

TASK [debug] *******************************************************************
ok: [local] => {
    "secret_password": "abcdef123456"
}

TASK [debug] *******************************************************************
ok: [local] => {
    "some_api_key": "A65B90148F091E5F3F7E4DFFECD1B074"
}

PLAY RECAP *********************************************************************
local                      : ok=3    changed=0    unreachable=0    failed=0

Now you can define variables in a file that you want to keep secret, but not check them into your repository. At most you might want to check in a file named ansible_secrets.example.yml that contains bogus passwords and keys, and can be copied to ~/.ansible_secrets.yml and then modified if needed in the future.

A good practice is to add instructions to your README.md file, like so:

# Setup

Run the following command to copy the template to your home directory
```shell
cp ansible_secrets.example.yml ~/.ansible_secrets.yml
```

Next, modify `~/.ansible_secrets.yml` to reflect the proper API keys
and passwords.

For more information, see Ansible Docs - Defining Variables In Files.

Ansible Vault

Instead of defining your variables in a file that you keep separate of your repository, there is an option to store your secrets in an encrypted file within your repository.

Ansible Vault provides support for many commands related to working with encrypted variable files.

You can run your playbook with a flag to let ansible-playbook know that it needs to decrypt one of the variable files.

ansible-playbook --ask-vault-pass local.yml

You can also place your Ansible Vault password in a file and tell Ansible to use that password file.

echo 'my$ecretpa$$' > ~/.vault_pass
ansible-playbook --vault-password-file ~/.vault_pass local.yml

You can avoid having to provide this flag altogether by defining the path to your vault password file in an environment variable (defined in ~/.bash_profile, or ~/.bashrc).

export ANSIBLE_VAULT_PASSWORD_FILE=~/.vault_pass

Or if you want to make this a setting associated with your repository, you can simply add a ‘[defaults]’ section to your ansible.cfg that defines the path to the vault password file.

[defaults]
. . .
vault_password_file = ~/.vault_pass

Keep in mind that recommend doing this only if you have a private repository that is not available to the public, but only to your development team. This makes it possible for all the secrets to be available to anyone on the team you give the password to.

Ansible Galaxy

Wouldn’t it be great if you could re-use roles that others have published?

This is possible! You can install and use roles from a public repository known as Ansible Galaxy. It’s also possible to obtain roles packaged on the web, or published in Git repositories (Github, Gitlab, or Bitbucket).

To do this, create a requirements.yml file in the root of your Ansible repository to store the roles that your configuration requires.

For our example we’ll use a role that sets the timezone

# Install timezone role
- src: yatesr.timezone

To install the required roles, run ansible-galaxy install -r requirements.yml.

$ ansible-galaxy install -r requirements.yml
- downloading role 'timezone', owned by yatesr
- downloading role from https://github.com/yatesr/ansible-timezone/archive/1.1.0.tar.gz
- extracting yatesr.timezone to /home/ec2-user/.ansible/roles/yatesr.timezone
- yatesr.timezone (1.1.0) was installed successfully

Now that you’ve installed the timezone role, we’ll need to configure the timezone we want applied to our environment.

Simply add the timezone setting to your host_vars/local.yml like below. We’re using America/Los_Angeles to configure the system to use the Pacific Daylight Time (PDT) time zone.

For other time zone values, see List of TZ Database Time Zones - List.

Also feel free to uncomment your Git name and email address so that it applies like it did before.

git_client_name: PeeWee Herman
git_client_email: peewee@example.com
timezone: America/Los_Angeles

And of course we have to add the role to our playbook.

---
# This playbook deploys the entire setup to the Cloud9 development environment.
- hosts: local
  roles:
   - development
   - yatesr.timezone
  vars_files:
    - ~/.ansible_secrets.yml

Let’s run a command to output our date/time, so we can compare it to what we see after reconfiguring our time zone.

$ date
Fri Mar 29 03:09:26 UTC 2019

Now let’s run our playbook again.

$ ansible-playbook local.yml

PLAY [local] *******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [local]

TASK [development : upgrade all packages] **************************************
ok: [local]

TASK [development : debug] *****************************************************
ok: [local] => {
    "ansible_user_id": "ec2-user"
}

TASK [development : configure git client] **************************************
ok: [local]

TASK [yatesr.timezone : include_vars] ******************************************
ok: [local] => (item=/home/ec2-user/.ansible/roles/yatesr.timezone/vars/../vars/RedHat.yml)

TASK [yatesr.timezone : Install tzdata for Debian based distros] ***************
skipping: [local]

TASK [yatesr.timezone : Install tzdata for RedHat based distros] ***************
ok: [local]

TASK [yatesr.timezone : Install tzdata for Archlinux based distros] ************
skipping: [local]

TASK [yatesr.timezone : Set timezone config] ***********************************
ok: [local]

TASK [yatesr.timezone : Set link to localtime] *********************************
ok: [local]

PLAY RECAP *********************************************************************
local                      : ok=8    changed=0    unreachable=0    failed=0  

Now when we ask the system to output the date, we get the Pacific date/time.

$ date
Mon Mar 25 21:17:18 PDT 2019

See Ansible Galaxy Docs for more information.

Conclusion

I hope this guide has been useful in jumpstarting your interest in using Ansible. I surely tried to mention many of the things I needed to get started and start using Ansible.

I hope you’re more comfortable with Ansible and find yourself navigating the officialdocumentation with more ease now that you understand what it does and how to use it.

Feel free to refer to my cloud9-dev repository to get other ideas on how you can setup your local development environment in Cloud9 - redconfetti/cloud9-dev.