terminal/build/Helix
Michael Niksa 5d082ffe67
Helix Testing (#6992)
Use the Helix testing orchestration framework to run our Terminal LocalTests and Console Host UIA tests.

## References
#### Creates the following new issues:
- #7281 - re-enable local tests that were disabled to turn on Helix
- #7282 - re-enable UIA tests that were disabled to turn on Helix
- #7286 - investigate and implement appropriate compromise solution to how Skipped is handled by MUX Helix scripts

#### Consumes from:
- #7164 - The update to TAEF includes wttlog.dll. The WTT logs are what MUX's Helix scripts use to track the run state, convert to XUnit format, and notify both Helix and AzDO of what's going on.

#### Produces for:
- #671 - Making Terminal UIA tests is now possible
- #6963 - MUX's Helix scripts are already ready to capture PGO data on the Helix machines as certain tests run. Presuming we can author some reasonable scenarios, turning on the Helix environment gets us a good way toward automated PGO.

#### Related:
- #4490 - We lost the AzDO integration of our test data when I moved from the TAEF/VSTest adapter directly back to TE. Thanks to the WTTLog + Helix conversion scripts to XUnit + new upload phase, we have it back!

## PR Checklist
* [x] Closes #3838
* [x] I work here.
* [x] Literally adds tests.
* [ ] Should I update a testing doc in this repo?
* [x] Am core contributor. Hear me roar.
* [ ] Correct spell-checking the right way before merge.

## Detailed Description of the Pull Request / Additional comments
We have had two classes of tests that don't work in our usual build-machine testing environment:
1. Tests that require interactive UI automation or input injection (a.k.a. require a logged-in user)
2. Tests that require the entire Windows Terminal to stand up (because our Xaml Islands dependency requires 1903 or later and the Windows Server instance for the build is based on 1809.)

The Helix testing environment solves both of these and is brought to us by our friends over in https://github.com/microsoft/microsoft-ui-xaml.

This PR takes a large portion of scripts and pipeline configuration steps from the Microsoft-UI-XAML repository and adjusts them for Terminal needs.
You can see the source of most of the files in either https://github.com/microsoft/microsoft-ui-xaml/tree/master/build/Helix or https://github.com/microsoft/microsoft-ui-xaml/tree/master/build/AzurePipelinesTemplates

The files were modified for reasons that include (but are not limited to):
- Our test binaries are named differently than MUX's test binaries
- We don't need certain types of testing that MUX does.
- We use C++ and C# tests while MUX was using only C# tests (so the naming pattern and some of the parsing of those names is different, e.g. `::` separators in C++ and `.` separators in C#)
- Our pipeline phases work a bit differently than MUX's and/or we need significantly fewer pieces in the testing matrix (for example, we don't test a wide variety of OS versions).
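
As a rough illustration of the naming difference mentioned above, here is a minimal Python sketch of splitting a fully qualified test name into its class and method parts. The helper name `split_test_name` and the sample names are hypothetical; the actual parsing lives in the PowerShell scripts in this directory and may work differently.

```python
from typing import Tuple


def split_test_name(full_name: str) -> Tuple[str, str]:
    """Split a fully qualified test name into (class_name, method_name).

    Assumption: C++ names separate parts with '::' and C# names with '.'.
    """
    # Prefer '::' when it is present; otherwise fall back to '.'.
    separator = "::" if "::" in full_name else "."
    # rpartition splits on the LAST separator, so the method is the final part.
    class_name, _, method_name = full_name.rpartition(separator)
    return class_name, method_name


# Hypothetical sample names, one per style.
cpp_class, cpp_method = split_test_name("Sample::Tests::MethodA")
cs_class, cs_method = split_test_name("Sample.Tests.MethodA")
```

Splitting on the last separator keeps namespaces attached to the class name, which is usually what a test report wants to group by.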

The build now runs in a few stages:
1. The usual build and run of unit tests/feature tests, packaging verification, and whatnot. This phase now also picks up and packs anything required for running tests in Helix into an artifact. (It also unifies the artifact name between the things Helix needs and the existing build outputs into the single `drop` artifact to make life a little easier.)
2. The Helix preparation build runs next. It picks up those artifacts, generates all the scripts required for Helix to understand the test modules/functions from our existing TAEF tests, packs it all up, and queues it on the Helix pool.
3. Helix generates a VM for our testing environment and runs all the TAEF tests that require it. The orchestrator at helix.dot.net watches over this and tracks the success/fail and progress of each module and function. The scripts from our MUX friends handle installing dependencies, making the system quiet for better reliability, detecting flaky tests and rerunning them, and coordinating all the log uploads (including for the subruns of tests that are re-run.)
4. A final build phase is run to look through the results with the Helix API and clean up the marking of tests that are flaky, link all the screenshots and console output logs into the AzDO tests panel, and other such niceties.

We are set to run Helix tests on the Feature test policy of only x64 for now. 

Additionally, because setting up the Helix VMs takes so long, we are *NOT* running these on the PR trigger right now, as I believe we all very much value our 15ish minute PR turnaround (and the VM takes another 15 minutes just to get going, for whatever reason). For now, they will only run as a rolling build on master after PRs are merged. We should still know when there's an issue within about an hour of something merging, and multiple PRs merging in quick succession will be handled by the rolling build as a single batch run (not one run per PR).

In addition to setting up the entire Helix testing pipeline for the tests that require it, I've preserved our classic way of running unit and feature tests (that don't require an elaborate environment) directly on the build machines. But with one bonus feature... They now use some of the scripts from MUX to transform their log data and report it to AzDO so it shows up beautifully in the build report. (We used to have this before I removed the MStest/VStest wrapper for performance reasons, but now we can have reporting AND performance!) See https://dev.azure.com/ms/terminal/_build/results?buildId=101654&view=ms.vss-test-web.build-test-results-tab for an example. 

I explored running all of the tests on Helix, but... the Helix setup time is long and the resources are more expensive. I felt it was better to preserve the "quick signal" by continuing to run these directly on the build machine (and skipping the more expensive/slow Helix setup if they fail). It also works well with the split between PR builds not running Helix and the rolling build running Helix. PR builds will get a good chunk of tests for a quick turnaround, and the rolling build will finish the more thorough job a bit more slowly.
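
The gating described above can be summarized in a small Python sketch. This is only an illustration of the policy, not the pipeline configuration itself; the function name `plan_stages` and the stage labels are hypothetical and do not correspond to names in the YAML templates.

```python
from typing import List


def plan_stages(is_pull_request: bool, local_tests_passed: bool) -> List[str]:
    """Return the stages that would run for one build (hypothetical labels)."""
    # The build plus the quick unit/feature tests always run on the build machine.
    stages = ["build_and_local_tests"]
    # Helix is reserved for the rolling build, and is skipped when the
    # quick local tests have already failed.
    if not is_pull_request and local_tests_passed:
        stages += ["helix_prepare_and_queue", "helix_run", "helix_process_results"]
    return stages


pr_build = plan_stages(is_pull_request=True, local_tests_passed=True)
rolling_build = plan_stages(is_pull_request=False, local_tests_passed=True)
```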

## Validation Steps Performed
- [x] Ran the updated pipelines with Pull Request configuration ensuring that Helix tests don't run in the usual CI
- [x] Ran with simulation of the rolling build to ensure that the tests now running in Helix will pass. All failures marked for follow on in reference issues.
2020-08-18 18:23:24 +00:00
AzurePipelinesHelperScripts.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
ConvertWttLogToXUnit.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
EnsureMachineState.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
GenerateTestProjFile.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
global.json Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
HelixTestHelpers.cs Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
InstallTestAppDependencies.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
OutputFailedTestQuery.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
OutputSubResultsJsonFiles.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
OutputTestResults.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
packages.config Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
PrepareHelixPayload.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
ProcessHelixFiles.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
readme.md Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
runtests.cmd Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
RunTestsInHelix.proj Helix Testing (#6992) 2020-08-18 18:23:24 +00:00
UpdateUnreliableTests.ps1 Helix Testing (#6992) 2020-08-18 18:23:24 +00:00

This directory contains code and configuration files to run WinUI tests in Helix.

Helix is a cloud-hosted test execution environment which is accessed via the Arcade SDK. More details:

WinUI tests are scheduled in Helix by the Azure DevOps Pipeline: RunHelixTests.yml.

The workflow is as follows:

  1. NuGet Restore is called on the packages.config in this directory. This downloads any runtime dependencies that are needed to run tests.
  2. PrepareHelixPayload.ps1 is called. This copies the necessary files from various locations into a Helix payload directory. This directory is what will get sent to the Helix machines.
  3. RunTestsInHelix.proj is executed. This proj has a dependency on Microsoft.DotNet.Helix.Sdk which it uses to publish the Helix payload directory and to schedule the Helix Work Items. The WinUI tests are parallelized into multiple Helix Work Items.
  4. Each Helix Work Item calls runtests.cmd with a specific query to pass to TAEF which runs the tests.
  5. If a test is detected to have failed, we run it again, first once, then eight more times if it fails again. If it fails all ten times, we report the test as failed; otherwise, we report it as unreliable, which will show up as a warning, but which will not fail the build. When a test is reported as unreliable, we include the results for each individual run via a JSON string in the original test's errorMessage field.
  6. TAEF produces logs in WTT format. Helix is able to process logs in XUnit format. We run ConvertWttLogToXUnit.ps1 to convert the logs into the necessary format.
  7. RunTestsInHelix.proj has EnableAzurePipelinesReporter set to true. This allows the XUnit formatted test results to be reported back to the Azure DevOps Pipeline.
  8. Once all tests have been reported, we process the unreliable tests: we read the JSON string from the errorMessage field and call the Azure DevOps REST API to add sub-results to each unreliable test and mark it as "warning". This lets people see exactly how the test failed in the runs where it did.
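
The rerun policy in step 5 and the JSON hand-off used in step 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the logic of the actual scripts: the function names, the result dictionary, and the shape of the JSON payload are all hypothetical. Only the counts (one rerun, then eight more, ten runs in total) and the use of the errorMessage field come from the description above.

```python
import json
from typing import Callable, Dict, List


def run_with_retries(run_test: Callable[[], bool]) -> Dict[str, str]:
    """Run a test and rerun it on failure: once, then eight more times."""
    results: List[bool] = [run_test()]
    if not results[0]:
        results.append(run_test())  # first rerun
        if not results[1]:
            # Eight further reruns, for ten runs in total.
            results.extend(run_test() for _ in range(8))

    if results[0]:
        return {"outcome": "passed", "errorMessage": ""}
    if not any(results):
        return {"outcome": "failed", "errorMessage": ""}
    # Mixed results: report as unreliable and carry the per-run results
    # along as a JSON string (hypothetical schema).
    sub_results = [{"run": i + 1, "passed": ok} for i, ok in enumerate(results)]
    return {"outcome": "unreliable", "errorMessage": json.dumps(sub_results)}


def read_sub_results(report: Dict[str, str]) -> List[dict]:
    """Recover the per-run results from an unreliable test's errorMessage."""
    if report["outcome"] != "unreliable":
        return []
    return json.loads(report["errorMessage"])
```

In this sketch a later processing step would call `read_sub_results` on each unreliable test and forward the entries to whatever reporting API is in use. Carrying the data in an existing string field avoids needing a side channel between the test run and the post-processing phase.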