csharplang/meetings/2018/LDM-2018-02-14.md
Joseph Musser 72a4daa853 Fixed typos (#2167)
* Fixed typos in proposals/

* Fixed typos in meetings/2018/
2019-02-12 13:45:25 -08:00

3.7 KiB

C# Language Design Notes for Feb 14, 2018

Warning: These are raw notes, and still need to be cleaned up. Read at your own peril!

Ranges

  1. Ranges are pairs of Indexes, Index'es can be from beginning or end, ^ syntax or similar denotes from end
  2. Ranges are pairs of ints, there's a convention of using ~x, that is, one's complement, for "from end"
  3. Ranges are pairs of Indexes, but the conversion from int uses the ~x convention rather than a new operator
  4. Ranges are pairs of ints, all from beginning, but with a special bit for "open at the end"
  5. Ranges are pairs of ints, all from beginning, with no notion of "open"

1-3 hope to generalize to indexing in general, not just ranges.

coll[^x]; // option 1, from end
coll[~x]; // option 2-3, from end

For people with existing indexers who want to add this, there are problems:

  • if you replace an int indexer with an Index, then you have a binary compat problem
  • if you add an overload, then option 2 and 3 would never pick that overload

For indexing, ^0 would mean "length minus zero", which would be out of range. So there's an irony where we do all this work to allow expressing 0 from end, which would not be a legal index.

Arguably, then, all of this bending-over-backwards is a consequence of wanting to express "zero from end" as the upper bound of an exclusive range.

Making it inclusive at the end makes ^0 or ~0 actually denote the last element. In which case, it's even more necessary to express "zero from end".

"xxx|yyy|zzz", you want to truncate based on the position of the pipe. You want to extract the yyy.

var startIndex = s.IndexOf("|")+1;
var endIndex = s.IndexOf(startIndex, "|"); // -1 ?
var result = s.Substring(startIndex, endIndex-startIndex);
var result = s[startIndex..endIndex]

Why should getting the end be easier than getting the beginning? Both should have to add/subtract one to get rid of the |. Or at least, so goes the argument for inclusive.

The empty range, on the other hand, is a very convincing argument for exclusive. 0..-1 is just really nasty.

Speculating about floating point ranges, having them inclusive-start, exclusive-end would actually likely be the best solution. They are imprecise by nature, so the only realistic way to get good adjacent "buckets" (where every element would be either in one or the others), is to not have symmetry between the endpoints on this.

var slice = myTensor[x1..x2, y1.:l, ^z1.., ..];

Once you get multidimensional, everything "from end" gets really annoying to do manually, because even getting the end for a given dimension is quite involved.

There's a small design question around ^0.. - is it legal or not?

  • We're overwhelmingly slightly preferring exclusive.
  • We do think that the scenario of counting from end is important, at least for ranges
  • Unsure that its value extends enough to other indexing scenarios.

Exploring the hat options:

  1. Limit hat to range expressions
  2. Bake in Index to the language and return it from hat
  3. Conversion from expression to Index
  4. Typeless expression which gets meaning from range

1 keeps our options open, but leads to a natural next request. 2 lets indices be computed independently. 3 leaves open betterness for the future, at the cost of not allowing var i = ^x. 4 is a semantic version of 1. Unclear if different things are allowed between the two.

Adding the Index type and the ^ operator is heavy lifting for relatively limited gain.

Would we add indexers to existing types that can index from-end? Maybe to some, probably not to arrays (for performance reasons). Probably not more or less depending on whether we use ~ or ^.

We're still on the balance.