6e82df451a
This change is motivated by an ssh oddity: when ControlPersist is enabled, the first (i.e. master) connection goes into the background; we see EOF on its stdout and the process exits, but we never see EOF on its stderr.

So if we ran a command like this:

    ANSIBLE_SSH_PIPELINING=1 ansible -T 30 -vvv somehost -u someuser -m command -a whoami

We would first do select([stdout,stderr], timeout) and read the command module output, then select([stdout,stderr], timeout) again and read EOF on stdout, then select([stderr], timeout) AGAIN (though the process has exited), and select() would wait for the full timeout before returning rfd=[], and then we would exit.

The use of a very short timeout in the code masked the underlying problem (that we don't see EOF on stderr).

It's always preferable to call select() with a long timeout so that the process doesn't use any CPU until one of the events it's interested in happens (and then select will return independent of elapsed time).

(A long timeout value means "if nothing happens, sleep for up to <x>"; omitting the timeout value means "if nothing happens, sleep forever"; specifying a zero timeout means "don't sleep at all", i.e. poll for events and return immediately.)

This commit uses a long timeout, but explicitly detects the condition where we've seen EOF on stdout and the process has exited, but we have not seen EOF on stderr. If and only if that happens, it reruns select() with a short timeout (in practice it could just exit at that point, but I chose to be extra cautious). As a result, we end up calling select() far less often, and use less CPU while waiting, but don't sleep for a long time waiting for something that will never happen.

Note that we don't omit the timeout to select() altogether because if we're waiting for an escalation prompt, we DO want to give up with an error after some time. We also don't set exceptfds, because we're not actually acting on any notifications of exceptional conditions.
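
The read loop described above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the function name read_until_done, the constants LONG_TIMEOUT and SHORT_TIMEOUT, and the 0.1 second value are all assumptions made for the example, and the escalation-prompt handling (where a timeout should raise an error) is left out. It relies on select() working on pipes, so it is POSIX-only.

```python
import os
import select
import subprocess

LONG_TIMEOUT = 30    # hypothetical value: sleep up to this long if nothing happens
SHORT_TIMEOUT = 0.1  # hypothetical value: used only for the final look at stderr


def read_until_done(p, timeout=LONG_TIMEOUT, short_timeout=SHORT_TIMEOUT):
    """Collect stdout/stderr from a Popen object `p` (hypothetical helper).

    Returns (stdout_bytes, stderr_bytes, returncode).
    """
    rpipes = [p.stdout, p.stderr]
    chunks = {p.stdout: [], p.stderr: []}

    while rpipes:
        # The special case: EOF already seen on stdout and the process has
        # exited, but stderr is still open (e.g. held by a backgrounded
        # ssh master). Only then do we use the short timeout.
        last_look = p.poll() is not None and p.stdout not in rpipes
        wait = short_timeout if last_look else timeout

        # No exceptfds: we would not act on exceptional conditions anyway.
        rfd, _, _ = select.select(rpipes, [], [], wait)

        if not rfd:
            # Nothing became readable. If the process is gone, stop waiting;
            # otherwise keep sleeping (prompt timeouts are not modelled here).
            if last_look or p.poll() is not None:
                break
            continue

        for f in rfd:
            data = os.read(f.fileno(), 65536)
            if data:
                chunks[f].append(data)
            else:
                rpipes.remove(f)  # EOF on this pipe

    returncode = p.wait()
    return b"".join(chunks[p.stdout]), b"".join(chunks[p.stderr]), returncode
```

With a child such as `sh -c 'echo hi; sleep 3 >/dev/null &'`, the backgrounded sleep keeps the stderr pipe open after sh exits, which is a small stand-in for the ControlPersist situation: the loop normally returns after the short timeout rather than sleeping for the long one.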