# 2022-05-25 Combining variables in Ansible Ansible is known for it's simplicity. To put it short: Ansible playbooks are just tasks executed on hosts and sprinkled with variables. Some of these variables change the way Ansible behaves as a whole and the rest of them are left for user. They are usually organized into variables assigned to hosts and variables assigned to groups. Host variables are boring, since we can't practically reuse them. Group variables are where the fun begins. Most projects have a set of global configuration settings like address of LDAP server, user accounts etc. However when introducing groups some problem emerges: what, if we want to combine variables from different groups or group and host? ## Why would you want to combine Ansible variables? Let's say that you have Ansible role for provisioning user accounts. You're smart, so you don't import it separately for each account and instead use loops. You have a common set of project accounts, but suddenly third party contractor from outside joins project and requires access to only one host. Of course you can hardcode additional account creation in playbook and call it a day, but it will be hide & seek game for future maintainers. Variables are supposed to express project's configuration. Ad-hoc additions to roles blurs separation between reusable modules and configuration. I've used accounts in above example, but I hope you get the idea: we have role for managing certain service, common set of resources for this service and unique additions assigned more granularly to hosts. You may say that there's nothing wrong in reapplying single role for host, but what if it's `authorized_keys` file for SSH we're managing and we don't want any stray keys? It means that role has exclusive control other some service and reapplying it with different variables would erase prior configuration. That's when first questions about *merging* and *combining* variables are asked. ## Known solutions This problem is not new, so let's review some existing solutions. ### Using Jinja2 filters Since variables have Python-compatible data types, nothing stops us from using Jinja2 filters for transformation. In Ansible we have: - [combine](https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#combining-hashes-dictionaries) - Allows to merge dictionaries/mappings including deep merge. - [union](https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#selecting-from-sets-or-lists-set-theory) and other set theory filters - Allow to concatenate lists. Using set theory filters and combine allows to render configuration based on more than one variable. This way instead of `system_accounts` variable we can have `system_accounts_host` and `system_accounts_global` pair. They can be merged upon using role like so: ```yaml - name: Setup system accounts import_role: name: system_accounts vars: system_accounts: "{{ system_accounts_global | union(system_accounts_host) }}" ``` This way we have to modify playbook in order to respect host/group specific variables. It'll be especially cumbersome when we have multilevel configuration with child groups like subregions or availability zones in datacenter. ### hash_behaviour configuration `hash_behaviour` is probably one of lesser known Ansible configuration settings, because it has potential to break a lot of things. The idea is that by default Ansible variables are replaced in specified order, but you can change this behaviour to merging instead. It seems like the most obvious solution, but: - `hash_behaviour` is applied globally. A lot of code relies on default behaviour, so expect problems. - Some modules like `include_vars` ignore this setting anyway. - It was deprecated once and documentation strongly advises to avoid using it. It's handy to know that such option exists, but it's too limited to be actually useful. Puppet solved this issue better with [lookup_options](https://puppet.com/docs/puppet/7/hiera_merging.html#setting_lookup_options_to_refine_the_result_of_a_lookup-defining-merge-behavior-with-lookup-options) which allow to specify merge behaviour per variable. ### Using plugins At least 2 plugins for handling variables merging exist: - [Symaxion/ansible-common](https://gitlab.com/Symaxion/ansible-common/) - Looks nice, but suffers similar issues to `hash_behaviour` approach. - [leapfrogonline/ansible-merge-vars](https://github.com/leapfrogonline/ansible-merge-vars) - Requires extra task for merging variables. Adding plugin has also one obvious downside: it's yet another dependency which needs to be downloaded for each installation and may become unmaintained some day. Of course it hugely depends on project. Some plugins have been consistently maintained for long years despite small user-base. ## Yet another approach > It works similarly to [ansible-merge-vars](https://github.com/leapfrogonline/ansible-merge-vars), > but doesn't require any plugin. Have you ever heard of `varnames` lookup plugin? It's actually [mentioned in documentation](https://docs.ansible.com/ansible/devel/reference_appendices/config.html#default-hash-behaviour) as an alternative to `hash_behaviour`. It allows us to query defined variables names using regular expressions. It means that we no longer have to hardcode variables names like in approach with `combine`/`union`, but we still need to make extra names for variables. We will construct variable's value based on a few other variables matching expression. Jinja2 macros for doing exactly this can be written like so: ``` {% macro merge_list(pattern) -%} {% set ns = namespace(output=[]) -%} {% for name in lookup('varnames', pattern).split(',') -%} {% set ns.output = ns.output | union(lookup('vars', name)) -%} {% endfor -%} {{ ns.output }} {% endmacro -%} {% macro merge_dict(pattern, recursive=True) -%} {% set ns = namespace(output={}) -%} {% for name in lookup('varnames', pattern).split(',') -%} {% set ns.output = ns.output | combine(lookup('vars', name), recursive=recursive) -%} {% endfor -%} {{ ns.output }} {% endmacro -%} ``` Of course we don't want to copy this much of code every time we want to create merged variable. Instead let's save these macros in `templates/macros.j2` file in our project. Some conventions are required to make our playbook easier to maintain. Let's say that for every variable `x` partial variables shall be named according to pattern `__x__m`, where `` is group or host name and `__m` suffix stands for `merged`. We can create merged variable like so: ```yaml all__system_groups__m: jakski: {} system_groups: > {% from 'templates/macros.j2' import merge_dict with context -%} {{ merge_dict('__system_groups__m$') }} ``` Now, if we wanted to add a few more groups which are specific to host `prod1`: ```yaml prod1__system_groups__m: prod-app: {} backups: {} ``` Above snippet can be placed in `host_vars`, but nothing stops us from using this approach in `group_vars` or even creating merged variable based on partials from multiple groups. It's not exactly quick & easy, but this way we rely only on builtin plugins and Jinja2, so there's no problem with dropping it into an already working project and use only for specified variables.