Modules
This in-depth dive helps you understand Ansible's program flow to execute modules. It is written for people working on the portions of the Core Ansible Engine that execute a module. Those writing Ansible Modules may also find this in-depth dive to be of interest, but individuals simply using Ansible Modules will not likely find this to be helpful.
Types of Modules
Ansible supports several different types of modules in its code base. Some of these are for backwards compatibility and others are to enable flexibility.
Action Plugins
Action Plugins look like modules to end users who are writing playbooks, but they're distinct entities for the purposes of this document. Action Plugins always execute on the controller and are sometimes able to do all work there (for instance, the debug Action Plugin, which prints some text for the user to see, or the assert Action Plugin, which can test whether several values in a playbook satisfy certain criteria).
More often, Action Plugins set up some values on the controller, then
invoke an actual module on the managed node that does something with
these values. An easy-to-understand example of this is the template Action Plugin.
The template Action Plugin takes values
from the user to construct a file in a temporary location on the
controller using variables from the playbook environment. It then
transfers the temporary file to a temporary file on the remote system.
After that, it invokes the copy module,
which operates on the remote
system to move the file into its final location, sets file permissions,
and so on.
New-style Modules
All of the modules that ship with Ansible fall into this category.
New-style modules have the arguments to the module embedded inside of them in some manner. Non-new-style modules must copy a separate file over to the managed node, which is less efficient as it requires two over-the-wire connections instead of only one.
Python
New-style Python modules use the Ansiballz framework for constructing modules. All official modules (shipped with Ansible) use either this or the powershell module framework.
These modules use imports from ansible.module_utils
in
order to pull in boilerplate module code, such as argument parsing,
formatting of return values as JSON
, and various file operations.
Note
In Ansible, up to version 2.0.x, the official Python modules used the
module_replacer
framework. For module authors, Ansiballz
is largely a superset of module_replacer
functionality, so you usually do not need to know about one versus the
other.
Powershell
New-style powershell modules use the module_replacer
framework for constructing modules.
These modules get a library of powershell code embedded in them before
being sent to the managed node.
JSONARGS
Scripts can arrange for an argument string to be placed within them
by placing the string
<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>
somewhere
inside of the file. The module typically sets a variable to that value
like this:
json_arguments = """<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>"""
Which is expanded as:
json_arguments = """{"param1": "test's quotes", "param2": "\"To be or not to be\" - Hamlet"}"""
Note
Ansible outputs a JSON
string with bare quotes. Double quotes are used
to quote string values, double quotes inside of string values are
backslash escaped, and single quotes may appear unescaped inside of a
string value. To use JSONARGS, your scripting language must have a way
to handle this type of string. The example uses Python's triple quoted
strings to do this. Other scripting languages may have a similar quote
character that won't be confused by any quotes in the JSON or it may
allow you to define your own start-of-quote and end-of-quote characters.
If the language doesn't give you any of these then you'll need to write
a non-native JSON module or Old-style module instead.
The module typically parses the contents of json_arguments using a JSON library and then uses them as native variables throughout the rest of its code.
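As a minimal sketch of that parsing step, the following shows the already-expanded string from the example above being decoded with Python's json library. The raw-string prefix (r) is an assumption of this sketch rather than something the text above prescribes: it keeps the backslash-escaped double quotes intact so that the JSON parser sees them.

```python
import json

# Stand-in for the text Ansible would substitute for the placeholder.
# Assumption: a raw triple-quoted string is used so that \" reaches the
# JSON parser unchanged instead of being interpreted by Python first.
json_arguments = r"""{"param1": "test's quotes", "param2": "\"To be or not to be\" - Hamlet"}"""

# Decode the JSON text into native Python values.
params = json.loads(json_arguments)

print(params["param1"])
print(params["param2"])
```

After decoding, the rest of the script can treat params as an ordinary dictionary.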
Non-native want JSON modules
If a module has the string WANT_JSON
in it anywhere,
Ansible treats it as a non-native module that accepts a filename as its
only command line parameter. The filename is for a temporary file
containing a JSON
string containing the module's parameters. The module needs to open the
file, read and parse the parameters, operate on the data, and print its
return data as a JSON encoded dictionary to stdout before exiting.
These types of modules are self-contained entities. As of Ansible 2.1, Ansible only modifies them to change a shebang line if present.
Examples of Non-native modules written in ruby are in the Ansible for Rubyists repository.
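The contract described above (one filename argument, JSON parameters in, a JSON dictionary on stdout) can be sketched in Python as follows. This is an illustration only: the name parameter and the demonstration code at the bottom, which creates the parameters file locally, are assumptions of the sketch and would not appear in a real module.

```python
import json
import os
import tempfile

# WANT_JSON
# The marker string can appear anywhere in the file; a comment is enough.


def run_module(args_path):
    """Read the parameters file and build the module's return data."""
    with open(args_path) as handle:
        params = json.load(handle)
    name = params.get("name", "world")  # "name" is a hypothetical parameter
    return {"changed": False, "msg": "Hello, %s" % name}


def main(argv):
    """Entry point: argv[1] is the path of the JSON parameters file."""
    result = run_module(argv[1])
    print(json.dumps(result))  # return data goes to stdout as JSON
    return result


# Local demonstration only: create the file Ansible would normally supply.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as handle:
    json.dump({"name": "Ansible"}, handle)
demo_result = main(["want_json_demo", handle.name])
os.remove(handle.name)
```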
Binary Modules
From Ansible 2.2 onwards, modules may also be small binary programs. Ansible doesn't perform any magic to make these portable to different systems so they may be specific to the system on which they were compiled or require other binary runtime dependencies. Despite these drawbacks, a site may sometimes have no choice but to compile a custom module against a specific binary library if that's the only way they have to get access to certain resources.
Binary modules take their arguments and return data to Ansible in the same way as want JSON modules.
One example of a binary module written in go.
Old-style Modules
Old-style modules are similar to want JSON modules, except that the file that they take contains key=value pairs for their parameters instead of JSON.
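To illustrate the difference in input format, here is a rough sketch of how a script might read key=value pairs. It uses Python's shlex module for quoting; the exact quoting rules Ansible applies are not reproduced here, and the function name and sample parameters are hypothetical.

```python
import shlex


def parse_key_value(text):
    """Turn 'a=1 b="two words"' into a dictionary of strings."""
    params = {}
    for token in shlex.split(text):
        key, separator, value = token.partition("=")
        if separator:  # ignore tokens that are not key=value pairs
            params[key] = value
    return params


args = parse_key_value('src=/tmp/a dest="/tmp/b c" mode=0644')
print(args)
```

Note that every value arrives as a string, so the module has to convert types itself.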
Ansible decides that a module is old-style when it doesn't have any of the markers that would show that it is one of the other types.
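The marker-based decision can be pictured with a short sketch. It only looks for the marker strings mentioned in this document; the order of the checks and the labels returned are assumptions, and binary modules (which are not text) are left out.

```python
def classify_module(source):
    """Return a label for the module type based on marker strings."""
    if ("#<<INCLUDE_ANSIBLE_MODULE_COMMON>>" in source
            or "from ansible.module_utils." in source):
        return "new-style python"
    if "# POWERSHELL_COMMON" in source:
        return "new-style powershell"
    if "<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>" in source:
        return "jsonargs"
    if "WANT_JSON" in source:
        return "want json"
    # No marker found: fall back to the old-style key=value convention.
    return "old-style"


print(classify_module("#!/bin/sh\necho hello\n"))
```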
How modules are executed
When a user uses ansible
or ansible-playbook
, they specify a task to execute.
The task is usually the name of a module along with several parameters
to be passed to the module. Ansible takes these values and processes
them in various ways before they are finally executed on the remote
machine.
executor/task_executor
The TaskExecutor receives the module name and parameters that were parsed from the playbook (or from the command line in the case of /usr/bin/ansible). It uses the name to decide whether it's looking at a module or an Action Plugin. If it's a module, it loads the Normal Action Plugin and passes the name, variables, and other information about the task and play to that Action Plugin for further processing.
Normal Action Plugin
The normal
Action Plugin executes the module on the
remote host. It is the primary coordinator of much of the work to
actually execute the module on the managed machine.
- It takes care of creating a connection to the managed machine by instantiating a Connection class according to the inventory configuration for that host.
- It adds any internal Ansible variables to the module's parameters (for instance, the ones that pass along no_log to the module).
- It takes care of creating any temporary files on the remote machine and cleans up afterwards.
- It does the actual work of pushing the module and module parameters to the remote host, although the module_common code described in the next section does the work of deciding which format those will take.
- It handles any special cases regarding modules (for instance, various complications around Windows modules that must have the same names as Python modules, so that internal calling of modules from other Action Plugins works).
Much of this functionality comes from the ActionBase class, which lives in plugins/action/__init__.py. It makes use of Connection and Shell objects to do its work.
Note
When tasks are run with the async: parameter, Ansible uses the async Action Plugin instead of the normal Action Plugin to invoke it. That program flow is currently not documented. Read the source for information on how that works.
executor/module_common.py
Code in executor/module_common.py takes care of assembling the module to be shipped to the managed node. The module is first read in, then examined to determine its type. PowerShell and JSON-args modules are passed through Module Replacer. New-style Python modules are assembled by Ansiballz. Non-native-want-JSON, Binary modules, and Old-Style modules aren't touched by either of these and pass through unchanged.

After the assembling step, one final modification is made to all modules that have a shebang line. Ansible checks whether the interpreter in the shebang line has a specific path configured via an ansible_$X_interpreter inventory variable. If it does, Ansible substitutes that path for the interpreter path given in the module. After this, Ansible returns the complete module data and the module type to the Normal Action, which continues execution of the module.
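The shebang substitution can be sketched roughly as below. This is a simplified illustration, not the code in module_common.py: the function name is hypothetical, host_vars stands in for the inventory variables that apply to the host, and the variable name is derived from the basename of the interpreter path.

```python
import os
import shlex


def rewrite_shebang(module_text, host_vars):
    """Swap the shebang interpreter for a configured path, if there is one."""
    first_line, newline, rest = module_text.partition("\n")
    if not first_line.startswith("#!"):
        return module_text  # no shebang: nothing to modify
    parts = shlex.split(first_line[2:])
    if not parts:
        return module_text
    # /usr/bin/python -> "python" -> ansible_python_interpreter
    variable = "ansible_%s_interpreter" % os.path.basename(parts[0])
    configured = host_vars.get(variable)
    if not configured:
        return module_text
    new_first_line = "#!" + " ".join([configured] + parts[1:])
    return new_first_line + newline + rest


module = "#!/usr/bin/python\nprint('hi')\n"
rewritten = rewrite_shebang(
    module, {"ansible_python_interpreter": "/opt/py/bin/python3"}
)
print(rewritten)
```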
Next we'll go into some details of the two assembler frameworks.
Module Replacer
The Module Replacer framework is the original framework implementing new-style modules. It is essentially a preprocessor (like the C Preprocessor for those familiar with that programming language). It does straight substitutions of specific substring patterns in the module file. There are two types of substitutions:
- Replacements that only happen in the module file. These are public replacement strings that modules can utilize to get helpful boilerplate or access to arguments.
  - from ansible.module_utils.MOD_LIB_NAME import * is replaced with the contents of ansible/module_utils/MOD_LIB_NAME.py. These should only be used with new-style Python modules.
  - #<<INCLUDE_ANSIBLE_MODULE_COMMON>> is equivalent to from ansible.module_utils.basic import * and should also only apply to new-style Python modules.
  - # POWERSHELL_COMMON substitutes the contents of ansible/module_utils/powershell.ps1. It should only be used with new-style Powershell modules.
- Replacements that are used by ansible.module_utils code. These are internal replacement patterns. They may be used internally, in the above public replacements, but shouldn't be used directly by modules.
  - "<<ANSIBLE_VERSION>>" is substituted with the Ansible version. In new-style Python modules under the Ansiballz framework the proper way is to instead instantiate an AnsibleModule and then access the version from AnsibleModule.ansible_version.
  - "<<INCLUDE_ANSIBLE_MODULE_COMPLEX_ARGS>>" is substituted with a string which is the Python repr of the JSON encoded module parameters. Using repr on the JSON string makes it safe to embed in a Python file. In new-style Python modules under the Ansiballz framework this is better accessed by instantiating an AnsibleModule and then using AnsibleModule.params.
  - <<SELINUX_SPECIAL_FILESYSTEMS>> substitutes a string which is a comma separated list of file systems which have a file system dependent security context in SELinux. In new-style Python modules, if you really need this you should instantiate an AnsibleModule and then use AnsibleModule._selinux_special_fs. The variable has also changed from a comma separated string of file system names to an actual python list of filesystem names.
  - <<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>> substitutes the module parameters as a JSON string. Care must be taken to properly quote the string as JSON data may contain quotes. This pattern is not substituted in new-style Python modules as they can get the module parameters another way.
  - The string syslog.LOG_USER is replaced wherever it occurs with the syslog_facility which was named in ansible.cfg or any ansible_syslog_facility inventory variable that applies to this host. In new-style Python modules this has changed slightly. If you really need to access it, you should instantiate an AnsibleModule and then use AnsibleModule._syslog_facility to access it. It is no longer the actual syslog facility and is now the name of the syslog facility. See the documentation on internal arguments for details.
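The preprocessor idea can be shown with a small sketch that performs two of the substitutions listed above using plain string replacement. It is an illustration of the technique, not the Module Replacer code itself; the function name and the sample module text are hypothetical.

```python
import json


def substitute_placeholders(module_source, params, version):
    """Apply module_replacer-style textual substitutions to module_source."""
    args_json = json.dumps(params)
    replacements = [
        # The quotes are part of the pattern, so repr() supplies new ones.
        ('"<<ANSIBLE_VERSION>>"', repr(version)),
        ('"<<INCLUDE_ANSIBLE_MODULE_COMPLEX_ARGS>>"', repr(args_json)),
    ]
    for pattern, value in replacements:
        module_source = module_source.replace(pattern, value)
    return module_source


template = (
    'VERSION = "<<ANSIBLE_VERSION>>"\n'
    'ARGS = "<<INCLUDE_ANSIBLE_MODULE_COMPLEX_ARGS>>"\n'
)
expanded = substitute_placeholders(template, {"name": "it's a demo"}, "2.0.0")

# Because repr() was used, the expanded text is valid Python source.
namespace = {}
exec(expanded, namespace)
print(namespace["VERSION"], json.loads(namespace["ARGS"]))
```

Using repr on the JSON text is what lets a value containing both single and double quotes survive the round trip.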
Ansiballz
Ansible 2.1 switched from the module_replacer
framework to the Ansiballz framework
for assembling modules. The Ansiballz framework differs from module
replacer in that it uses real Python imports of things in ansible/module_utils
instead
of merely preprocessing the module. It does this by constructing a
zipfile -- which includes the module file, files in ansible/module_utils
that
are imported by the module, and some boilerplate to pass in the module's
parameters. The zipfile is then Base64 encoded and wrapped in a small
Python script which decodes the Base64 encoding and places the zipfile
into a temp directory on the managed node. It then extracts just the
ansible module script from the zip file and places that in the temporary
directory as well. Then it sets the PYTHONPATH to find python modules
inside of the zip file and invokes python
on the extracted ansible module.
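The zip-and-wrap approach can be sketched with the standard library alone. The sketch below builds a zipfile in memory, Base64 encodes it, then plays the part of the managed node by decoding it into a temporary directory and importing from it. It is a simplification: the package name demo_utils is hypothetical, and the import path is adjusted in-process through sys.path rather than by setting PYTHONPATH and invoking a separate python.

```python
import base64
import io
import os
import sys
import tempfile
import zipfile

# Controller side: build the zipfile and Base64 encode it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo_utils/__init__.py", "")
    archive.writestr(
        "demo_utils/helpers.py",
        "def greet(name):\n    return 'hello ' + name\n",
    )
payload = base64.b64encode(buffer.getvalue())

# Managed-node side: decode the payload into a temporary directory.
temp_dir = tempfile.mkdtemp()
zip_path = os.path.join(temp_dir, "payload.zip")
with open(zip_path, "wb") as handle:
    handle.write(base64.b64decode(payload))

# Python can import directly from a zipfile that is on the import path.
sys.path.insert(0, zip_path)
from demo_utils.helpers import greet

print(greet("managed node"))
```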
Note
Ansible wraps the zipfile in the Python script for two reasons:
- for compatibility with Python-2.4 and Python-2.6 which have less featureful versions of Python's -m command line switch.
- so that pipelining will function properly. Pipelining needs to pipe the Python module into the Python interpreter on the remote node. Python understands scripts on stdin but does not understand zip files.
In Ansiballz, any imports of Python modules from the ansible.module_utils
package trigger inclusion of that Python file into the zipfile.
Instances of #<<INCLUDE_ANSIBLE_MODULE_COMMON>>
in the module are turned into
from ansible.module_utils.basic import *
and ansible/module_utils/basic.py
is then included in the
zipfile. Files that are included from module_utils
are themselves scanned for imports of
other Python modules from module_utils
to be included in the zipfile as
well.
Warning
At present, the Ansiballz Framework cannot determine whether an
import should be included if it is a relative import. Always use an
absolute import that has ansible.module_utils
in it to allow Ansiballz to
determine that the file should be included.
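One way to picture the import scan, and why relative imports are missed, is the sketch below. It uses Python's ast module to collect absolute imports under ansible.module_utils; it is not claimed to be how Ansiballz itself does it, and the function name is hypothetical.

```python
import ast


def find_module_utils_imports(source, package="ansible.module_utils"):
    """List absolute imports of modules that live under the given package."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, which carries no package name.
            if node.level != 0 or not node.module:
                continue
            if node.module == package:
                for alias in node.names:
                    found.add(package + "." + alias.name)
            elif node.module.startswith(package + "."):
                found.add(node.module)
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.startswith(package + "."):
                    found.add(alias.name)
    return sorted(found)


sample = (
    "from ansible.module_utils.basic import AnsibleModule\n"
    "from ansible.module_utils import urls\n"
    "import ansible.module_utils.facts\n"
    "from . import sibling\n"
    "import os\n"
)
print(find_module_utils_imports(sample))
```

The relative import of sibling is skipped because nothing in the statement names ansible.module_utils, which mirrors the limitation described in the warning.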
Passing args
In module_replacer
,
module arguments are turned into a JSON-ified string and substituted
into the combined module file. In Ansiballz
, the JSON-ified string is passed into the
module via stdin. When an ansible.module_utils.basic.AnsibleModule
is
instantiated, it parses this string and places the args into AnsibleModule.params
where
it can be accessed by the module's other code.
Note
Internally, the AnsibleModule
uses the helper function, ansible.module_utils.basic._load_params
, to load
the parameters from stdin and save them into an internal global
variable. Very dynamic custom modules which need to parse the parameters
prior to instantiating an AnsibleModule
may use
_load_params
to retrieve the parameters. Be aware that
_load_params
is an internal function and may change in
breaking ways if necessary to support changes in the code. However,
we'll do our best not to break it gratuitously, which is not something
that can be said for either the way parameters are passed or the
internal global variable.
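The stdin hand-off can be illustrated with a few lines of Python. This is only a sketch of the idea: it does not call _load_params, the exact structure of the JSON that Ansible sends is not reproduced, and io.StringIO stands in for the real stdin so the example runs on its own.

```python
import io
import json


def load_params(stream):
    """Read a JSON document from a stream and return it as a dictionary."""
    return json.loads(stream.read())


# Stand-in for sys.stdin; the parameter names are hypothetical.
fake_stdin = io.StringIO('{"path": "/etc/motd", "state": "present"}')
params = load_params(fake_stdin)
print(params["path"], params["state"])
```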
Internal arguments
Both module_replacer
and Ansiballz
send
additional arguments to the module beyond those which the user specified
in the playbook. These additional arguments are internal parameters that
help implement global Ansible features. Modules often do not need to
know about these explicitly as the features are implemented in ansible.module_utils.basic
but certain features need support from the module so it's good to know
about them.
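Since the internal arguments described below all begin with _ansible_, they are easy to tell apart from user-supplied parameters. The following sketch separates the two groups; it is an illustration only, not how AnsibleModule processes them, and the function name and sample values are hypothetical.

```python
def split_internal_arguments(arguments):
    """Separate user parameters from _ansible_* internal arguments."""
    user = {}
    internal = {}
    for key, value in arguments.items():
        if key.startswith("_ansible_"):
            internal[key] = value
        else:
            user[key] = value
    return user, internal


user_args, internal_args = split_internal_arguments({
    "path": "/etc/motd",
    "_ansible_no_log": False,
    "_ansible_debug": True,
})
print(user_args)
print(internal_args)
```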
_ansible_no_log
This is a boolean. If it's True then the playbook specified
no_log
(in a task's parameters or as a play parameter).
This automatically affects calls to AnsibleModule.log
. If a module implements its own
logging then it needs to check this value. The best way to look at this
is for the module to instantiate an AnsibleModule
and then check the value of AnsibleModule.no_log
.
Note
no_log
specified in a module's argument_spec is handled
by a different mechanism.
_ansible_debug
This is a boolean that turns on more verbose logging. If a module
uses AnsibleModule.debug
rather than AnsibleModule.log
then
the messages are only logged if this is True. This also turns on logging
of external commands that the module executes. This can be changed via
the debug
setting in ansible.cfg
or the environment variable ANSIBLE_DEBUG
. If, for
some reason, a module must access this, it should do so by instantiating
an AnsibleModule
and
accessing AnsibleModule._debug
.
_ansible_diff
This boolean is turned on via the --diff
command line
option. If a module supports it, it will tell the module to show a
unified diff of changes to be made to templated files. The proper way
for a module to access this is by instantiating an AnsibleModule
and accessing
AnsibleModule._diff
.
_ansible_verbosity
This value could be used for finer grained control over logging. However, it is currently unused.
_ansible_selinux_special_fs
This is a list of names of filesystems which should have a special
selinux context. They are used by the AnsibleModule
methods which operate on files
(changing attributes, moving, and copying). The list of names is set via
a comma separated string of filesystem names from ansible.cfg
:
# ansible.cfg
[selinux]
special_context_filesystems=nfs,vboxsf,fuse,ramfs
If a module cannot use the builtin AnsibleModule
methods
to manipulate files and needs to know about these special context
filesystems, it should instantiate an AnsibleModule
and
then examine the list in AnsibleModule._selinux_special_fs
.
This replaces ansible.module_utils.basic.SELINUX_SPECIAL_FS
from
module_replacer
. In
module replacer it was a comma separated string of filesystem names.
Under Ansiballz it's an actual list.
New in version 2.1.
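The change from a comma separated string to a list can be sketched with Python's configparser, using the ansible.cfg fragment shown above. This is a standalone illustration of the conversion, not the code Ansible uses to read its configuration.

```python
import configparser

# The same fragment as in the ansible.cfg example above.
config_text = """
[selinux]
special_context_filesystems=nfs,vboxsf,fuse,ramfs
"""

config = configparser.ConfigParser()
config.read_string(config_text)

# The setting is stored as one comma separated string...
raw_value = config.get("selinux", "special_context_filesystems")

# ...and is turned into a list of filesystem names.
special_fs = [name.strip() for name in raw_value.split(",") if name.strip()]
print(special_fs)
```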
_ansible_syslog_facility
This parameter controls which syslog facility ansible module logs to.
It may be set by changing the syslog_facility
value in
ansible.cfg
. Most
modules should just use AnsibleModule.log
which will then make use of this.
If a module has to use this on its own, it should instantiate an AnsibleModule
and then
retrieve the name of the syslog facility from AnsibleModule._syslog_facility
. The code will look
slightly different than it did under module_replacer
due to how hacky the old way was:
# Old way
import syslog
syslog.openlog(NAME, 0, syslog.LOG_USER)

# New way
import syslog
facility_name = module._syslog_facility
facility = getattr(syslog, facility_name, syslog.LOG_USER)
syslog.openlog(NAME, 0, facility)
New in version 2.1.
_ansible_version
This parameter passes the version of ansible that runs the module. To
access it, a module should instantiate an AnsibleModule
and then retrieve it from AnsibleModule.ansible_version
. This replaces ansible.module_utils.basic.ANSIBLE_VERSION
from module_replacer
.
New in version 2.1.
Special Considerations
Pipelining
Ansible can transfer a module to a remote machine in one of two ways:
- it can write out the module to a temporary file on the remote host and then use a second connection to the remote host to execute it with the interpreter that the module needs
- or it can use what's known as pipelining to execute the module by piping it into the remote interpreter's stdin.
Pipelining only works with modules written in Python at this time because Ansible only knows that Python supports this mode of operation. Supporting pipelining means that whatever format the module payload takes before being sent over the wire must be executable by Python via stdin.
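The mechanism pipelining relies on, an interpreter executing a script it receives on stdin, can be demonstrated locally. In this sketch a local subprocess stands in for the remote interpreter and the connection to it; the module text is a hypothetical one-liner.

```python
import subprocess
import sys

# A tiny stand-in for the module payload.
module_source = "print('hello from stdin')\n"

# "-" tells the interpreter to read the program from standard input,
# so the payload is never written to a file first.
completed = subprocess.run(
    [sys.executable, "-"],
    input=module_source,
    capture_output=True,
    text=True,
    check=True,
)
print(completed.stdout, end="")
```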
Why pass args over stdin?
Passing arguments via stdin was chosen for the following reasons:
- When combined with pipelining, this keeps the module's arguments from temporarily being saved onto disk on the remote machine. This makes it harder (but not impossible) for a malicious user on the remote machine to steal any sensitive information that may be present in the arguments.
- Command line arguments would be insecure as most systems allow unprivileged users to read the full commandline of a process.
- Environment variables are usually more secure than the commandline but some systems limit the total size of the environment. This could lead to truncation of the parameters if we hit that limit.