Due to the way we're now calculating delegate_to, if that value is based
on a loop variable ('item') we need to calculate all of the possible
delegated_to variables for that loop.
Fixes#12499
There doesn't appear to be anything that actually uses tmp_path in the
connection plugins so we don't need to pass that in to exec_command.
That change also means that we don't need to pass tmp_path around in
many places in the action plugins any more. there may be more cleanup
that can be done there as well (the action plugin's public run() method
takes tmp as a keyword arg but that may not be necessary).
As a sideeffect of this patch, some potential problems with chmod and
the patch, assemble, copy, and template modules has been fixed (those
modules called _remote_chmod() with the wrong order for their
parameters. Removing the tmp parameter fixed them.)
The process is already gone, so there's not going to be any new data
showing up on its stderr; we only want to make sure that we haven't
missed something that was already written. So polling once is enough.
This change is motivated by an ssh oddity: when ControlPersist is
enabled, the first (i.e. master) connection goes into the background; we
see EOF on its stdout and the process exits, but we never see EOF on its
stderr. So if we ran a command like this:
ANSIBLE_SSH_PIPELINING=1 ansible -T 30 -vvv somehost -u someuser -m command -a whoami
We would first do select([stdout,stderr], timeout) and read the command
module output, then select([stdout,stderr], timeout) again and read EOF
on stdout, then select([stderr], timeout) AGAIN (though the process has
exited), and select() would wait for the full timeout before returning
rfd=[], and then we would exit. The use of a very short timeout in the
code masked the underlying problem (that we don't see EOF on stderr).
It's always preferable to call select() with a long timeout so that the
process doesn't use any CPU until one of the events it's interested in
happens (and then select will return independent of elapsed time).
(A long timeout value means "if nothing happens, sleep for up to <x>";
omitting the timeout value means "if nothing happens, sleep forever";
specifying a zero timeout means "don't sleep at all", i.e. poll for
events and return immediately.)
This commit uses a long timeout, but explicitly detects the condition
where we've seen EOF on stdout and the process has exited, but we have
not seen EOF on stderr. If and only if that happens, it reruns select()
with a short timeout (in practice it could just exit at that point, but
I chose to be extra cautious). As a result, we end up calling select()
far less often, and use less CPU while waiting, but don't sleep for a
long time waiting for something that will never happen.
Note that we don't omit the timeout to select() altogether because if
we're waiting for an escalation prompt, we DO want to give up with an
error after some time. We also don't set exceptfds, because we're not
actually acting on any notifications of exceptional conditions.
On Python 2, shlex.split() raises if you pass it a unicode object with
non-ASCII characters in it. The Ansible codebase copes by explicitly
converting the string using to_bytes() before passing it to
shlex.split().
On Python 3, shlex.split() raises ('bytes' object has no attribute 'read')
if you pass a bytes object. Oops.
This commit introduces a new wrapper function, shlex_split, that
transparently performs the to_bytes/to_unicode conversions only on
Python 2.
Currently I've only converted one call site (the one that was causing a
unit test to fail on Python 3). If this approach is deemed suitable,
I'll convert them all.
Without this, we could execute «ssh -q ...» and call select(), which
would timeout after the default 10s, and only then send initial data.
(This is a relic of the earlier change where we always ran ssh with
-vvv, so the situation where it would sit quietly never happened in
practice; but this would have been the right thing to do even then.)
Make the code compatible with Pythons 2.4 through 3.5 by using
sys.exc_info()[1] instead.
This is necessary but not sufficient for Python 3 compatibility.
The event loop (even after it was brought into one place in _run in the
previous commit) was hard to follow. The states and transitions weren't
clear or documented, and the privilege escalation code was non-blocking
while the rest was blocking.
Now we have a state machine with four states: awaiting_prompt,
awaiting_escalation, ready_to_send (initial data), and awaiting_exit.
The actions in each state and the transitions between then are clearly
documented.
The check_incorrect_password() method no longer checks for empty strings
(since they will always match), and check_become_success() uses equality
rather than a substring match to avoid thinking an echoed command is an
indication of successful escalation. Also adds a check_missing_password
connection method to detect the error from sudo -n/doas -n.
The main exec_command/put_file/fetch_file methods now _build_command and
call _run to handle input from/output to the ssh process. The purpose is
to bring connection handling together in one place so that the locking
doesn't have to be split across functions.
Note that this doesn't change the privilege escalation and connection IO
code at all—just puts it all into one function.
Most of the changes are just moving code from one place to another (e.g.
from _connect to _build_command, from _exec_command and _communicate to
_run), but there are some other notable changes:
1. We test for the existence of sshpass the first time we need to use
password authentication, and remember the result.
2. We set _persistent in _build_command if we're using ControlPersist,
for later use in close(). (The detection could be smarter.)
3. Some apparently inadvertent inconsistencies between put_file and
fetch_file (e.g. argument quoting, sftp -b use) have been removed.
Also reorders functions into a logical sequence, removes unused imports
and functions, etc.
Aside: the high-level EXEC/PUT/FETCH description should really be logged
from ConnectionBase, while individual subclasses log transport-specific
details.
* Make LookupBase an abc with required methods (run()) marked as an
abstractmethod
* Mark methods that don't use self as @staticmethod
* Document how to implement the run method of a lookup plugin.
Follow up to 8769f03c, which allows the undefined var error to be raised
if we're getting vars with a full context (play/host/task) and the host
has already gathered facts. In this way, vars_files containing variables
that fail to be templated are not silently ignored.
This fixes a failing unit test.
In actual use (which is still quite far), I'm not sure if bytes ->
unicode conversion should be done here (in which case the code will fail
with an AttributeError: 'bytes' object has no attribute 'readlines'), or
inside self._connection.exec_command() (in which case my change is
correct).
Now, instead of relying on hostvars on the executor side, we compile
the vars for the delegated to host in a special internal variable and
have the PlayContext object look for things there when applying task/
var overrides, which is much cleaner and takes advantage of the code
already dealing with all of the magic variable variations.
Fixes#12127Fixes#12079
* Clearing interpreter settings from variables, so those set for the
original host aren't incorrectly applied to the delegated to host
* Fixed incorrect string for remote user in delegated hosts hostvars
* Properly looking for multiple possiblities in the delegated-to hosts
hostvars (ansible_ssh_host vs. ansible_host)
Use six.moves.range instead (aliased to xrange on Python 2, aliased to
range on Python 3).
Also I couldn't resist replacing the elaborate chr/ord/randrange dance
with the simpler random.choice(string.ascii_lowercase) that was already
used elsewhere in the Ansible codebase.
The earlier distinction was never used; .ipv6_address was always a copy
of .ipv4_address, and the latter was always used to set the remote_addr
field in the PlayContext.
Also uses the canonical ansible_host/ansible_port names when setting the
address and port from variables.
The earlier-recommended "pat1:pat2:pat3[x:y]" notation doesn't work well
with IPv6 addresses, so we recommend ',' as a separator instead. We know
that commas can't occur within a pattern, so we can just split on it.
We still have to accept the "foo:bar" notation because it's so commonly
used, but we issue a deprecation warning for it.
Fixes#12296Closes#12404Closes#12329
* Add exception handling when running PowerShell modules to provide exception message and stack trace.
* Enable strict mode for all PowerShell modules and internal commands.
* Update common PowerShell code to fix strict mode errors.
* Fix an issue with Set-Attr where it would not replace an existing property if already set.
* Add tests for exception handling using modified win_ping modules.
Hi @amenonsen - thanks for fixing up the hunting down the unicode bug and expanding test_addresses. The code looks good, merging!-- Be systematic about parsing and validating hostnames and addresses
These used to go in vars_cache, so merging them in after that as they
are "live" variables and the user would most likely want to see these
above anything else.
Labels must start with an alphanumeric character, may contain
alphanumeric characters or hyphens, but must not end with a hyphen.
We enforce those rules, but allow underscores wherever hyphens are
accepted, and allow alphanumeric ranges anywhere.
We relax the definition of "alphanumeric" to include Unicode characters
even though such inventory hostnames cannot be used in practice unless
an ansible_ssh_host is set for each of them.
We still don't enforce length restrictions—the fact that we have to
accept ranges makes it more complex, and it doesn't seem especially
worthwhile.
This adds a parse_address(pattern) utility function that returns
(host,port), and uses it wherever where we accept IPv4 and IPv6
addresses and hostnames (or host patterns): the inventory parser
the the add_host action plugin.
It also introduces a more extensive set of unit tests that supersedes
the old add_host unit tests (which didn't actually test add_host, but
only the parsing function).
There was code to support set literals (on Python 2.7 and newer), but it
was buggy: SAFE_NODES.union() doesn't modify SAFE_NODES in place,
instead it returns a new set object that is then silently discarded.
I added a unit test and fixed the code. I also changed the version
check to use sys.version_tuple instead of a string comparison, for
consistency with the subsequent Python 3.4 version check that I added in
the previous commit.
Two things changed in Python 3.4:
- 'basestring' is no longer defined, so use six.string_types
- True/False are now special AST node types (NamedConstant) rather than
just names
(Good thing we had tests, or I wouldn't have noticed the 2nd thing!)
I found only one place where safe_eval() is called inside the ansible
codebase: in lib/template/__init__.py. The call to safe_eval(result,
...) is protected by result.startswith('...'), which means result cannot
possibly be a byte string on Python 3 (or startswith() would raise, so
six.string_types (which excludes byte strings on Python 3) is fine here.
PyYAML has a SafeRepresenter in lib/... that defines
def represent_unicode(self, data):
return self.represent_scalar(u'tag:yaml.org,2002:str', data)
and a different SafeRepresenter in lib3/... that defines
def represent_str(self, data):
return self.represent_scalar('tag:yaml.org,2002:str', data)
so the right thing to do on Python 3 is to use represent_str.
(AnsibleUnicode is a subclass of six.text_type, i.e. 'str' on Python 3.)
needed for winrm, disabled closing connections in ssh to avoid issues with that persistance, need to normalize all this in future
This reverts commit 23a22397bf.
Required some rewiring in inventory code to make sure we're using
the DataLoader class for some data file operations, which makes mocking
them much easier.
Also identified two corner cases not currently handled by the code, related
to inventory variable sources and which one "wins". Also noticed we weren't
properly merging variables from multiple group/host_var file locations
(inventory directory vs. playbook directory locations) so fixed as well.
This was commented out earlier because of the lack of interprocess
locking and prepare_writeable_dir in v2.
The locking was not needed: it could only protect against other siblings
of this process (since they were all locking a temporary file that was
opened in the parent), and those would be running as the same user and
with the same umask. Also, os.makedirs() tolerates intermediate paths
being created by other processes. For any other kind of error, both
locking and non-locking code paths would fail in the same way.
So all we really need to do is make sure we have write permissions.
(We also move the cp_dir handling code to where we actually set the
ControlPath ourselves; if the user has set it via ssh_*args already,
we don't need to bother.)
commit 9921bb9d20
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Mon Aug 10 20:19:44 2015 +0530
Document --ssh-extra-args command-line option
commit 8b25595e7b
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Thu Aug 13 13:24:57 2015 +0530
Don't disable GSSAPI/Pubkey authentication when using --ask-pass
This commit is based on a bug report and PR by kolbyjack (#6846) which
was subsequently closed and rebased as #11690. The original problem was:
«The password on the delegated host is different from the one I
provided on the command line, so it had to use the pubkey, and the
main host doesn't have a pubkey on it yet, so it had to use the
password.»
(This commit is revised and included here because #11690 would conflict
with the changes in #11908 otherwise.)
Closes#11690
commit 119d032389
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Thu Aug 13 11:16:42 2015 +0530
Be more explicit about why SSH arguments are added
This adds vvvvv log messages that spell out in detail where each SSH
command-line argument is obtained from.
Unfortunately, we can't be sure if, say, self._play_context.remote_user
is obtained from ANSIBLE_REMOTE_USER in the environment, remote_user in
ansible.cfg, -u on the command line, or an ansible_ssh_user setting in
the inventory or on a task or play. In some cases, e.g. timeout, we
can't even be sure if it was set by the user or just a default.
Nevertheless, on the theory that at five v's you can use all the hints
available, I've mentioned the possible sources in the log messages.
Note that this caveat applies only to the arguments that ssh.py adds by
itself. In the case of ssh_args and ssh_extra_args, we know where they
are from, and say so, though we can't say WHERE in the inventory they
may be set (e.g. in host_vars or group_vars etc.).
commit b605c285ba
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Tue Aug 11 15:19:43 2015 +0530
Add a FAQ entry about ansible_ssh_extra_args
commit 49f8edd035
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Mon Aug 10 20:48:50 2015 +0530
Allow ansible_ssh_args to be set as an inventory variable
Before this change, ssh_args could be set only in the [ssh_connection]
section of ansible.cfg, and was applied to all hosts. Now it's possible
to set ansible_ssh_args as an inventory variable (directly, or through
group_vars or host_vars) to selectively override the global setting.
Note that the default ControlPath settings are applied only if ssh_args
is not set, and this is true of ansible_ssh_args as well. So if you want
to override ssh_args but continue to set ControlPath, you'll need to
repeat the appropriate options when setting ansible_ssh_args.
(If you only need to add options to the default ssh_args, you may be
able to use the ansible_ssh_extra_args inventory variable instead.)
commit 37c1a5b679
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Mon Aug 10 19:42:30 2015 +0530
Allow overriding ansible_ssh_extra_args on the command-line
This patch makes it possible to do:
ansible somehost -m setup \
--ssh-extra-args '-o ProxyCommand="ssh -W %h:%p -q user@bouncer.example.com"'
This overrides the inventory setting, if any, of ansible_ssh_extra_args.
Based on a patch originally by @Richard2ndQuadrant.
commit b023ace8a8
Author: Abhijit Menon-Sen <ams@2ndQuadrant.com>
Date: Mon Aug 10 19:06:19 2015 +0530
Add an ansible_ssh_extra_args inventory variable
This can be used to configure a per-host or per-group ProxyCommand to
connect to hosts through a jumphost, e.g.:
inventory:
[gatewayed]
foo ansible_ssh_host=192.0.2.1
group_vars/gatewayed.yml:
ansible_ssh_extra_args: '-o ProxyCommand="ssh -W %h:%p -q bounceuser@gateway.example.com"'
Note that this variable is used in addition to any ssh_args configured
in the [ssh_connection] section of ansible.cfg (so you don't need to
repeat the ControlPath settings in ansible_ssh_extra_args).
* When iterating over a child state, a failure should be propagated
up so parent blocks don't continue iterating
* Make sure a child state exists before trying to search it
Fixes#12210
I don't think six.iteritems is available here, but I also don't expect
there to be enough platforms to ever make the speed difference between
.items() and .iteritems() noticeable.
Replace .iteritems() with six.iteritems() everywhere except in
module_utils (because there's no 'six' on the remote host). And except
in lib/ansible/galaxy/data/metadata_template.j2, because I'm not sure
six is available there.