Hi @amenonsen - thanks for fixing up the hunting down the unicode bug and expanding test_addresses. The code looks good, merging!-- Be systematic about parsing and validating hostnames and addresses
These used to go in vars_cache, so merging them in after that as they
are "live" variables and the user would most likely want to see these
above anything else.
Labels must start with an alphanumeric character, may contain
alphanumeric characters or hyphens, but must not end with a hyphen.
We enforce those rules, but allow underscores wherever hyphens are
accepted, and allow alphanumeric ranges anywhere.
We relax the definition of "alphanumeric" to include Unicode characters
even though such inventory hostnames cannot be used in practice unless
an ansible_ssh_host is set for each of them.
We still don't enforce length restrictions—the fact that we have to
accept ranges makes it more complex, and it doesn't seem especially
worthwhile.
This adds a parse_address(pattern) utility function that returns
(host,port), and uses it wherever where we accept IPv4 and IPv6
addresses and hostnames (or host patterns): the inventory parser
the the add_host action plugin.
It also introduces a more extensive set of unit tests that supersedes
the old add_host unit tests (which didn't actually test add_host, but
only the parsing function).
There was code to support set literals (on Python 2.7 and newer), but it
was buggy: SAFE_NODES.union() doesn't modify SAFE_NODES in place,
instead it returns a new set object that is then silently discarded.
I added a unit test and fixed the code. I also changed the version
check to use sys.version_tuple instead of a string comparison, for
consistency with the subsequent Python 3.4 version check that I added in
the previous commit.
Two things changed in Python 3.4:
- 'basestring' is no longer defined, so use six.string_types
- True/False are now special AST node types (NamedConstant) rather than
just names
(Good thing we had tests, or I wouldn't have noticed the 2nd thing!)
I found only one place where safe_eval() is called inside the ansible
codebase: in lib/template/__init__.py. The call to safe_eval(result,
...) is protected by result.startswith('...'), which means result cannot
possibly be a byte string on Python 3 (or startswith() would raise, so
six.string_types (which excludes byte strings on Python 3) is fine here.
PyYAML has a SafeRepresenter in lib/... that defines
def represent_unicode(self, data):
return self.represent_scalar(u'tag:yaml.org,2002:str', data)
and a different SafeRepresenter in lib3/... that defines
def represent_str(self, data):
return self.represent_scalar('tag:yaml.org,2002:str', data)
so the right thing to do on Python 3 is to use represent_str.
(AnsibleUnicode is a subclass of six.text_type, i.e. 'str' on Python 3.)