3 Rules to Follow When Developing Ansible Modules
One of the tenets of The Zen of Python is "simple is better than complex."
This basic principle can be seen throughout Ansible, which is built on Python, even in the custom module development process. With a "Python 101" level of expertise, it is entirely possible to create your own custom module, which makes Ansible easily extensible and powerful.
This simplicity is one of the reasons why Ansible quickly became one of my favorite tools for infrastructure automation. In my various roles at World Wide Technology, I've had the opportunity to create more than 100 custom modules related to infrastructure automation and have read the source code of more modules than I can count. During this process, I have slowly developed a series of best practices for developing modules related to infrastructure automation, which are summarized below.
Know your audience
If you take one thing away from this post, it's this: remember your target audience.
The world of programming is still new for many system administrators (i.e., your target audience), so it is vital to keep your code as simple as possible (which you should be doing anyway). By following the "simple is better than complex" tenet, you'll allow your audience, who may only have a basic understanding of Python, to better understand your code and hopefully contribute back to your work.
What are some steps you can take to ensure your code is understood by everyone?
Use variable names that are descriptive and abundantly clear. This is going to sound exceedingly basic, but more often than not, I don't see this rule being followed. When someone reads one of your variable names, it should be obvious what that variable represents.
Code that has been automatically generated by OEM-provided tools illustrates this point.
The generated lines that establish connections to NetApp Clustered Data ONTAP and Cisco UCS Manager store the connection in variables named "s" and "handle." Since each line includes an IP address, you may be able to infer what its function is.
Now place yourself in the shoes of a system administrator who has just learned the basics of programming and is not used to reading through code. Chances are these variable names will do nothing but confuse them.
Solving this issue is easy. By changing "s" and "handle" to something more obvious like "connect," your code becomes immensely more readable. You also now have the added benefit of having similar variable names across modules.
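As a minimal sketch of that rename, assume a hypothetical DeviceConnection class standing in for whatever connection object the vendor SDK provides. The class, its arguments, and the address are illustrative only and are not taken from any real SDK.

```python
from dataclasses import dataclass


@dataclass
class DeviceConnection:
    """Hypothetical stand-in for a vendor SDK connection object."""
    host: str
    username: str
    password: str

    def describe(self) -> str:
        return f"{self.username}@{self.host}"


# Generated-style naming: the reader has to guess what "s" holds.
s = DeviceConnection("192.0.2.10", "admin", "example-password")

# Descriptive naming: the purpose is clear without reading the SDK docs,
# and the same name can be reused in every module.
connect = DeviceConnection("192.0.2.10", "admin", "example-password")

print(connect.describe())
```

Both variables hold the same kind of object. Only the second one tells a newcomer what it is for.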
Another way to make your code easier to read revolves around the module.params['<variable_name>'] syntax, which is used by Ansible to read the parameters provided in the playbook.
If a task in your playbook sets a name parameter, you would use module.params['name'] in your module code to pull the value of that parameter.
Similar to the variable name issues above, module.params can be confusing when first looking at Ansible modules. You can help solve this problem by declaring ansible = module.params in your code, which results in the new syntax being ansible['name']. A system administrator still may not know the exact function of the code, but they will at least know it's related to Ansible.
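Here is a small sketch of the alias. To keep it runnable without Ansible installed, a SimpleNamespace stands in for the module object that AnsibleModule would normally return. The parameter names and values are made up for the illustration.

```python
from types import SimpleNamespace

# Stand-in for the object returned by AnsibleModule(...); in a real module
# the params dictionary is filled from the task in the playbook.
module = SimpleNamespace(params={"name": "web_vlan", "vlan_id": 100})

# One short alias near the top of the module ...
ansible = module.params

# ... lets every later lookup read as "a value that came from Ansible".
vlan_name = ansible["name"]
vlan_id = ansible["vlan_id"]

print(vlan_name, vlan_id)
```

The alias does not copy anything. It is simply a second name for the same dictionary.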
Do one thing and do it well
When discussing Ansible module development, I have often heard "do one thing and do it well" mentioned. But how do you define "one thing"? A good guide for deciding this is the acronym CRUD, which stands for create, read, update and delete. These four actions are the one thing your module should always do to the object you are automating (a NetApp aggregate or a VLAN, for example).
The most important of these actions is "read." In general, all modules should be idempotent, which means your module changes something only when a change is necessary, so running it a second time has no further effect. For example, if you are creating a new VLAN, you should always first check to see if it already exists. If it does, you can give the end user a message that it is already present and skip the creation process. The same principle holds true if you are deleting or updating an object.
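The read-before-create pattern can be sketched as follows. A plain dictionary stands in for the device's VLAN table, and ensure_vlan_present is a hypothetical helper name. The "changed" key mirrors the flag that Ansible modules report back to the playbook.

```python
def ensure_vlan_present(existing_vlans, vlan_id, name):
    """Create the VLAN only if a read shows it is missing.

    existing_vlans is a dict standing in for the device's configuration.
    Returns a result dict shaped like the one a module would report.
    """
    # Read first: this is the cheap call.
    if vlan_id in existing_vlans:
        return {"changed": False, "msg": f"VLAN {vlan_id} is already present"}

    # Only reach the expensive create call when it is needed.
    existing_vlans[vlan_id] = name
    return {"changed": True, "msg": f"VLAN {vlan_id} created"}


device_vlans = {}
first_run = ensure_vlan_present(device_vlans, 100, "web")
second_run = ensure_vlan_present(device_vlans, 100, "web")
print(first_run["changed"], second_run["changed"])
```

The second call finds the VLAN during the read and returns without touching the device, which is what makes it safe to re-run a playbook from the start.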
When dealing with infrastructure automation, your module is not yet production ready if idempotence is not built in. With OEM APIs and SDKs, reading an object is several times faster than manipulating an object. Preventing any unnecessary action from occurring will result in noticeable time savings in your playbook execution.
More importantly, your playbook runs will have a higher error rate if your modules are not idempotent. For example, if you try to create a NetApp aggregate that is already present, your module will error out and your playbook will fail. If this happens in the middle of a playbook, you will not be able to re-run your playbook from the start unless your modules are idempotent. I've learned this particular lesson the hard way a few times, and trust me when I say that your life will be easier if you can run a playbook from the start without having to worry about an error that could have been avoided by doing a read before executing your task.
The one problem with combining CRUD functionality into a single module is that the variables you need to read an object may not be the same as the variables needed to create an object, and so on. You can solve this in a number of ways, including setting the variable as not being required and ignoring it in your code if it is not defined.
The problem with that approach is that you lose some of the built-in parameter checking that occurs before your module code runs. For example, if you're missing a variable in your playbook required to create an object, you won't know until that individual task is executed, and even then, it may not be obvious what the problem is. To solve this issue, you can use the required_if argument of AnsibleModule. This allows you to include logic that says: "if the user is creating an object, only these variables are required. If a user is deleting an object, only these variables are required," and so on.
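A sketch of what such rules can look like is shown here. The parameter names (action, name, disk_count) are hypothetical. The missing_parameters function is only an illustrative re-implementation so the example runs without Ansible installed; in a real module, AnsibleModule performs this check for you.

```python
# Parameter definitions; "action", "name", and "disk_count" are
# hypothetical names chosen for this illustration.
argument_spec = dict(
    action=dict(type="str", required=True, choices=["create", "delete"]),
    name=dict(type="str"),
    disk_count=dict(type="int"),
)

# Each entry reads: if <parameter> equals <value>, require these parameters.
required_if = [
    ["action", "create", ["name", "disk_count"]],
    ["action", "delete", ["name"]],
]

# In a real module both structures would be passed to the constructor:
#   AnsibleModule(argument_spec=argument_spec, required_if=required_if)


def missing_parameters(params, rules):
    """Illustrative re-implementation of the check, not Ansible's own code."""
    missing = []
    for key, value, requirements in rules:
        if params.get(key) == value:
            missing += [r for r in requirements if params.get(r) is None]
    return missing


print(missing_parameters({"action": "create", "name": "aggr1"}, required_if))
print(missing_parameters({"action": "delete", "name": "aggr1"}, required_if))
```

With rules like these, a create task that forgets disk_count is rejected with a clear message, while a delete task only needs the name.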
Improving your playbooks
When building complex infrastructure automation playbooks, it is not uncommon to have hundreds of tasks. Even though the process of creating a playbook is simple to begin with, it can quickly become a tedious process. So how can you make the playbook creation process easier?
One of the common problems with infrastructure automation modules is that most require you to enter an IP address, username and password for every single task in a playbook. The Cisco NX-OS modules (among others) solve this problem with the inclusion of a "provider" utility that can be imported into each module. The "provider" allows you to define your IP address, username and password once, and then have those automatically passed to each task, preventing you from having to provide the same information repeatedly. If your modules do not include a similar feature, you are doing a disservice to yourself.
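One way a module could support this idea is sketched below. This is not how the NX-OS modules implement it. The resolve_connection helper and the key names are assumptions made for the illustration.

```python
# Hypothetical connection settings that every task would otherwise repeat.
CONNECTION_KEYS = ("host", "username", "password")


def resolve_connection(params):
    """Fill in missing connection settings from an optional provider dict."""
    provider = params.get("provider") or {}
    resolved = {}
    for key in CONNECTION_KEYS:
        # A value set directly on the task wins over the provider value.
        value = params.get(key)
        resolved[key] = value if value is not None else provider.get(key)
    return resolved


# Defined once, for example as a playbook variable ...
provider = {
    "host": "192.0.2.20",
    "username": "admin",
    "password": "example-password",
}

# ... and passed unchanged to every task that needs it.
task_params = {"name": "web", "vlan_id": 100, "provider": provider}
print(resolve_connection(task_params)["host"])
```

Letting a task-level value override the provider is a design choice in this sketch. It keeps the shared definition as the default while still allowing a one-off exception.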
On a related note, I'm not a fan of the "provider" terminology. It falls under the vague parameter name problem that we discussed earlier. However, I can't come up with a good alternative. If you have any ideas, leave them in the comments.
Another common pattern I see is using the state parameter to define whether you are creating, updating or deleting an object. Often you will set state: present to create an object or state: absent to delete an object. This is a common theme in many core modules, so my assumption is that developers are often just following patterns they see elsewhere. In the context of infrastructure automation, though, this terminology does not make much sense. Instead, a more obvious approach is to use action as your parameter and create, delete, and update as its choices.
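As a rough sketch of the action approach, the parameter can be declared with a fixed list of choices and used to pick a handler. The handler functions here are hypothetical placeholders that a real module would replace with vendor API calls.

```python
# Hypothetical handlers; a real module would call the vendor API here.
def create_object(name):
    return f"created {name}"


def update_object(name):
    return f"updated {name}"


def delete_object(name):
    return f"deleted {name}"


ACTIONS = {
    "create": create_object,
    "update": update_object,
    "delete": delete_object,
}

# The choices list lets AnsibleModule reject any other value during
# argument validation.
argument_spec = dict(
    action=dict(type="str", required=True, choices=sorted(ACTIONS)),
    name=dict(type="str", required=True),
)


def run(params):
    """Dispatch to the handler named by the action parameter."""
    return ACTIONS[params["action"]](params["name"])


print(run({"action": "create", "name": "vlan100"}))
```

A playbook author then writes action: create or action: delete, which reads the same way they would describe the task out loud.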
In summary, it is important to remember that your target audience may have little programming experience and may not be familiar with reading code. If you follow the "simple is better than complex" principle that is found throughout Ansible, you will encourage more contribution to your work and the larger ecosystem. Not to mention that the simpler your code base is, the simpler your life will be as well.