PowerShell/test/powershell/Language/Parser
Joel Sallow (/u/ta11ow) 3f52adb0ae Add Binary Parsing Support & Refactor TryGetNumberValue & ScanNumberHelper (#7993)
Fixes #7557 

* Adds support for binary parsing in format echoing hex: `0b11010110`
  * Works with all existing type suffixes and multipliers.
  * Supports arbitrary length parsing with `n` suffix using BigInteger; details below.
* Adds `NumberFormat` enum to specify hex/binary/base 10 for the tokenizer, replacing old `bool hex`.
* Adds `n` suffix for all numeric literals to support returning value as a `BigInteger` if requested. This bypasses the issue of large literals losing accuracy when they cast through `double`.
* Adds tests for all new behaviours.

---

### Binary / Hex Parsing Implementation

* Mimics old sign bit behaviour for int and long types. Sign bits accepted for 8 or 16-bit Hex parsing, and 8, 16, 32, 64 for binary.
  * i.e., `0xFFFFFFFF -eq ([int]-1)` and `0xFFFFFFFFFFFFFFFF -eq ([long]-1)`, but suffixing `u` creates `int.MaxValue` and `long.MaxValue`, respectively, instead.
* Sign bits higher than this are accepted for bigint-suffixed numerals:
    * Hex: Bigint-suffixed hex treats the high bit of any literal with a length multiple of 8 as the sign bit
    * Binary: Bigint-suffixed binary accepts sign bits at 96 and 128 chars, and from there on every 8 characters.
    * Prefixing the literal with a 0 will bypass this and be treated as unsigned, e.g. `0b011111111`
* Specifying an `u`nsigned suffix (or combination suffix that includes `u`) ignores sign bits, similar to how parsing a hex string using `[Convert]::ToUint32()` would do so.
* Supports negating literals using `-` prefix. This can result in positive numbers due to sign bits being permitted, just like hex literals.

---

### Refactored numeric tokenizer parsing

**New flow:**

1. Check for `real` (`.01`, `0.0`, or `0e0` syntaxes)
    1. If the decimal suffix is present, TryParse directly into decimal. If the TryParse fails, TryGetNumberValue returns `false`.
    2. TryParse as `Double`, and apply multiplier to value. If the TryParse fails, TryGetNumberValue returns `false`.
        1. Check type suffixes and attempt to cast into appropriate type. This will return `false` if the value exceeds the specified type's bounds.
        2. Default to parsing as `double` where no suffix has been applied.
2. Check number format.
    * If binary, manually parse into BigInteger using optimized helper function to directly construct the BigInteger bytes from the string.
    * If hex, TryParse into `BigInteger` using some special casing to retain original behaviours in int/long ranges.
    * If neither binary nor hex, TryParse normally as a `BigInteger`.
3. Apply multiplier value before attempting any casts to ensure type bounds can be appropriately checked without overflows.
4. Check type suffixes.
    * If a specific type suffix is used, check type bounds and attempt to parse into that type.
      * If the value exceeds the type's available values, the parse fails. Otherwise, a straight cast is performed.
5. If no suffix is used, the following types are bounds-checked, in order, resulting in the first successful test determining the type of the number. 
    * `int`
    * `long`
    * `decimal` (base-10 literals only)
    * `double` (base-10 literals only)
    * ~~`BigInteger` for binary or hex literals.~~ If the value is outside `long` range (for hex and binary) or `double` range (for base 10), the parse will fail; higher values must be explicitly requested using the `n`/`N` BigInteger suffix.

---

*This is a breaking change* as binary literals are now read as numbers instead of generic tokens which could potentially have been used as function / cmdlet names or file names.

Notes:
* Binary literal support was approved by the committee in #7557 
* ~~The same issue is still under further discussion for underscore support in numeric literals and whether BigInteger parsing ought to be exposed to the user at all.~~
    * ~~Supporting underscore literals is a further breaking change causing some generic tokens like `1_000_000` to be read as numerals instead.~~ Per @SteveL-MSFT's [comment](https://github.com/PowerShell/PowerShell/pull/7993#issuecomment-442651543) this proposal was rejected.
    * ~~Removing underscore support or preventing standard parsing from accepting BigInteger ranges is a relatively trivial matter. It is my personal opinion that there is no particular reason *not* to hand the user a BigInteger when they enter a sufficiently large literal, but I will defer to the PowerShell Committee's judgement on this.~~
2019-04-03 15:10:02 -07:00
..
Ast.Tests.ps1 Use new Pester syntax: -Parameter for Pester tests in Language. (#6304) 2018-03-21 10:47:08 -07:00
AutomaticVariables.Tests.ps1 Use new Pester syntax: -Parameter for Pester tests in Language. (#6304) 2018-03-21 10:47:08 -07:00
BNotOperator.Tests.ps1 Use new Pester syntax: -Parameter for Pester tests in Language. (#6304) 2018-03-21 10:47:08 -07:00
Conversions.Tests.ps1 Parse numeric strings as numbers again during conversions (#8681) 2019-02-04 12:22:05 -08:00
ExtensibleCompletion.Tests.ps1 Use new Pester syntax: -Parameter for Pester tests in Language. (#6304) 2018-03-21 10:47:08 -07:00
LanguageAndParser.TestFollowup.Tests.ps1 Change hashtable to use OrdinalIgnoreCase to be case-insensitive in all Cultures (#8566) 2019-01-10 09:11:43 +05:00
MethodInvocation.Tests.ps1 Use new Pester syntax: -Parameter for Pester tests in Language. (#6304) 2018-03-21 10:47:08 -07:00
ParameterBinding.Tests.ps1 Consolidation of all Windows PowerShell work ported to PSCore6 (#8257) 2018-11-13 16:16:29 -08:00
Parser.Tests.ps1 Add Binary Parsing Support & Refactor TryGetNumberValue & ScanNumberHelper (#7993) 2019-04-03 15:10:02 -07:00
Parsing.Tests.ps1 Update copyright and license headers (#6134) 2018-02-13 09:23:53 -08:00
RedirectionOperator.Tests.ps1 Remove use of cmdlet aliases from .\test\powershell (#8546) 2018-12-28 13:48:23 +05:00
TypeAccelerator.Tests.ps1 Consolidation of all Windows PowerShell work ported to PSCore6 (#8257) 2018-11-13 16:16:29 -08:00
UsingAssembly.Tests.ps1 Remove use of cmdlet aliases from .\test\powershell (#8546) 2018-12-28 13:48:23 +05:00
UsingNamespace.Tests.ps1 Remove use of cmdlet aliases from .\test\powershell (#8546) 2018-12-28 13:48:23 +05:00