terminal/tools/PGODatabase/readme.md
Michael Niksa 7dadde5dd6
Implement PGO in pipelines for AMD64 architecture; supply training test scenarios (#10071)

## References
- #3075 - Relevant to speed interests there and other linked issues.

## PR Checklist
* [x] Closes #6963
* [x] I work here.
* [x] New UIA Tests added and passed. Manual build runs also tested.

## Detailed Description of the Pull Request / Additional comments
- Creates a new pipeline run for creating instrumented binaries for Profile Guided Optimization (PGO).
- Creates a new suite of UIA tests on the full Windows Terminal app to run PGO training scenarios on instrumented binaries (and incidentally can be used to write other UIA tests later for the full Terminal app.)
- Creates a new NuGet artifact to store trained PGO databases (PGD files) at `Microsoft.Internal.Windows.Terminal.PGODatabase`
- Creates a new NuGet artifact to supply large-scale test content for automated tests at `Microsoft.Internal.Windows.Terminal.TestContent`
- Adjusts the release pipeline to run binaries in PGO optimized mode where content from PGO databases is leveraged at link time to optimize the final release build

The following binaries are trained:
- OpenConsole.exe
- WindowsTerminal.exe
- TerminalApp.dll
- TerminalConnection.dll
- Microsoft.Terminal.Control.dll
- Microsoft.Terminal.Remoting.dll
- Microsoft.Terminal.Settings.Editor.dll
- Microsoft.Terminal.Settings.Model.dll

Adding `<PgoTarget>true</PgoTarget>` to a new `vcxproj` file will automatically enroll its DLL/EXE for PGO instrumentation and optimization going forward.

Two training test scenarios are implemented:
- Smoke test the Terminal by just opening it and typing a bit of text then exiting. (Should help focus on the standard launch path.)
- Optimize bulk text output by launching terminal, outputting `big.txt`, then exiting.

Additional scenarios can be contributed to the `WindowsTerminal_UIATests` project with the `[TestProperty("IsPGO", "true")]` annotation to add them to the suite of scenarios for PGO.

**NOTE:** There are currently no weights applied to the various test scenarios. We will revisit that in the future when/if necessary.

## Validation Steps Performed
- [x] - Training run completed at https://dev.azure.com/ms/terminal/_build?definitionId=492&_a=summary
- [x] - Optimization run completed locally (by forcing `PGOBuildMode` to `Optimize` on my local machine, manually retrieving the databases with NuGet, and building).
- [x] - Validated locally that x86 and ARM64 do not get trained and automatically skip optimization as databases are not present for them.
- [x] - Smoke tested optimized binary versus latest releases. `big.txt` output through CMD takes ~11-12 seconds prior to PGO and just over 8 seconds with PGO.
2021-05-13 21:12:30 +00:00


Profile Guided Optimization

NOTE: All PGO work builds on work from Microsoft/Microsoft-UI-XAML

Description

We generate a PGO database NuGet package that is versioned by the product release version plus the branch name and time stamp of the code that was used for instrumentation and training. In CI/release builds, an initialization step enumerates all available versions and filters out those for other releases and branches. From the remaining applicable versions, it picks the one closest to, but not after, the time stamp of the last commit (or of the fork point from the instrumented branch). That package version is then installed and the version references are updated. The PGO branch is determined by the variable $pgoBranch in tools/PGODatabase/config.ps1. It needs to be updated if a forked branch should be PGO'd.
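As a rough illustration of the selection step described above, here is a minimal Python sketch. It is not the script used by the pipeline: the `PgoPackage` type, the `pick_database` function, and the sample data are all hypothetical, and the time stamps are made up for the example.

```python
from datetime import datetime
from typing import NamedTuple


class PgoPackage(NamedTuple):
    """Hypothetical record for one published PGO database package."""
    major: int
    minor: int
    branch: str            # branch the database was trained on
    trained_at: datetime   # time stamp of the instrumented code


def pick_database(packages, major, minor, branch, cutoff):
    """Return the newest applicable package trained at or before `cutoff`.

    Packages for other releases or branches are filtered out first.
    Returns None when nothing applicable exists.
    """
    applicable = [
        p for p in packages
        if (p.major, p.minor, p.branch) == (major, minor, branch)
        and p.trained_at <= cutoff
    ]
    return max(applicable, key=lambda p: p.trained_at, default=None)


# Made-up sample data, for illustration only.
packages = [
    PgoPackage(2, 4, "main", datetime(2020, 1, 31, 20, 33)),
    PgoPackage(2, 4, "main", datetime(2020, 1, 31, 22, 5)),
    PgoPackage(2, 4, "release/2.4", datetime(2020, 1, 31, 20, 54)),
    PgoPackage(2, 3, "main", datetime(2020, 1, 31, 21, 0)),
]

# A commit made at 21:30 on main for release 2.4 gets the 20:33 database:
# the 22:05 one is too new, and the others are for another branch or release.
chosen = pick_database(packages, 2, 4, "main", datetime(2020, 1, 31, 21, 30))
print(chosen)
```

Returning `None` rather than raising keeps the sketch neutral about what a build should do when no database applies.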

Scenarios

For the purpose of illustration, let's assume the following is a chronological list (newest first) of check-ins to two branches (main and release/2.4). Some of them have had an instrumentation/training run done on them and have generated PGO NuGets (version numbers in the last column). To simplify, let's assume that the release major and minor versions are the same for all check-ins, as they merely act as filters for which versions are considered to be available.

1b27fd5f -- main      --
7b303f74 -- main      --
930ff585 -- main      -- 2.4.2001312227-main
63948a75 -- main      --
0d379b51 -- main      --
f23f1fad -- main      -- 2.4.2001312205-main
bcf9adaa -- main      --
6ef44a23 -- main      --
310bc133 -- release/2.4 --
80a4ab55 -- release/2.4 -- 2.4.2001312054-release_2_4
18b956f6 -- release/2.4 --
4abd4d54 -- main      -- 2.4.2001312033-main
d150eae0 -- main      -- 2.4.2001312028-main
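
The version strings in the list can be split into their parts with a small helper. The Python sketch below is illustrative only: `PgoVersion`, `parse_version`, and `branch_suffix` are hypothetical names, the time-stamp component is treated as an opaque number that sorts chronologically, and the rule for turning a branch name into a suffix is an assumption inferred from the single `release/2.4` -> `release_2_4` example.

```python
import re
from typing import NamedTuple


class PgoVersion(NamedTuple):
    """Hypothetical parsed form of a PGO package version string."""
    major: int
    minor: int
    stamp: int    # time-stamp component, compared numerically
    branch: str   # branch suffix as it appears in the version string


VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(.+)$")


def parse_version(text):
    """Split e.g. '2.4.2001312227-main' into its four components."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a PGO package version: {text!r}")
    major, minor, stamp, branch = match.groups()
    return PgoVersion(int(major), int(minor), int(stamp), branch)


def branch_suffix(branch_name):
    """Assumed mapping from branch name to suffix: non-alphanumerics become '_'."""
    return re.sub(r"[^0-9A-Za-z]", "_", branch_name)


newer = parse_version("2.4.2001312227-main")
older = parse_version("2.4.2001312205-main")
print(newer.stamp > older.stamp)
print(branch_suffix("release/2.4"))
```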

Optimizing on a PGO'd branch

If we are building on main (which in this example is PGO'd), the version picked will be the one that has the same major and minor versions AND the same branch name, and that was generated on the SHA being built or on the closest check-in before it.

E.g.

1b27fd5f -- 2.4.2001312227-main
f23f1fad -- 2.4.2001312205-main
bcf9adaa -- 2.4.2001312033-main

Optimizing release branch

A branch that will itself be PGO'd requires slightly different handling. Let's say release/2.4 forked from main on commit 4abd4d54. Initially, it will be configured to track main, and 18b956f6 will be optimized with 2.4.2001312033-main. When the configuration is changed to track release/2.4 (by changing the branch name $pgoBranch in the tools/PGODatabase/config.ps1 script), builds will only use databases trained on release/2.4 itself.

E.g.

18b956f6 -- if tracking main -> 2.4.2001312033-main,
            if tracking release/2.4 -> ERROR (no database exists)
310bc133 -- 2.4.2001312054-release_2_4
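
The lookups for main and for release/2.4 can be reproduced with a short Python sketch over the check-in list from the Scenarios section. This is a simplification, not the real script: it walks the flat chronological list instead of the actual commit graph, and `HISTORY` and `database_for` are hypothetical names introduced here.

```python
# (sha, branch, package version or None), newest first, copied from the list above.
HISTORY = [
    ("1b27fd5f", "main", None),
    ("7b303f74", "main", None),
    ("930ff585", "main", "2.4.2001312227-main"),
    ("63948a75", "main", None),
    ("0d379b51", "main", None),
    ("f23f1fad", "main", "2.4.2001312205-main"),
    ("bcf9adaa", "main", None),
    ("6ef44a23", "main", None),
    ("310bc133", "release/2.4", None),
    ("80a4ab55", "release/2.4", "2.4.2001312054-release_2_4"),
    ("18b956f6", "release/2.4", None),
    ("4abd4d54", "main", "2.4.2001312033-main"),
    ("d150eae0", "main", "2.4.2001312028-main"),
]


def database_for(sha, tracked_branch):
    """Find the database generated on `sha` or on the closest earlier check-in.

    Only databases trained on `tracked_branch` are considered.
    Raises LookupError when no such database exists.
    """
    shas = [entry[0] for entry in HISTORY]
    start = shas.index(sha)  # raises ValueError for an unknown SHA
    for _, branch, package in HISTORY[start:]:
        if package is not None and branch == tracked_branch:
            return package
    raise LookupError(f"no {tracked_branch} database at or before {sha}")


print(database_for("bcf9adaa", "main"))
print(database_for("18b956f6", "main"))
print(database_for("310bc133", "release/2.4"))
try:
    database_for("18b956f6", "release/2.4")
except LookupError as error:
    print("ERROR:", error)
```

The last call fails because 18b956f6 precedes the first training run on release/2.4, which matches the ERROR case in the example.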

Optimizing topic branch

Assuming a topic branch will not have a training run done, it can still use the database from the branch it was forked from. Let's say we have a branch which was forked from main on 4abd4d54. If we don't change which branch it's tracking, it will keep using 2.4.2001312033-main. Merging main at f23f1fad into the topic branch will change the database used to 2.4.2001312205-main.
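
To see why a merge from main changes the database, it helps to model the commit graph rather than a flat list. The Python sketch below is only an illustration under stated assumptions: main is assumed to be a linear chain of the check-ins listed earlier, `topic-1` and `topic-merge` are hypothetical commits on the topic branch, and `database_for` is a made-up helper, not the real lookup.

```python
from collections import deque

# child -> parents. The main SHAs come from the list in the Scenarios section
# (assumed to form a linear chain); the "topic-*" commits are hypothetical.
PARENTS = {
    "d150eae0": [],
    "4abd4d54": ["d150eae0"],
    "6ef44a23": ["4abd4d54"],
    "bcf9adaa": ["6ef44a23"],
    "f23f1fad": ["bcf9adaa"],
    "topic-1": ["4abd4d54"],                  # first commit after the fork
    "topic-merge": ["topic-1", "f23f1fad"],   # merge of main into topic
}

# Databases trained on main, keyed by the commit they were generated from.
MAIN_PACKAGES = {
    "d150eae0": "2.4.2001312028-main",
    "4abd4d54": "2.4.2001312033-main",
    "f23f1fad": "2.4.2001312205-main",
}


def stamp(package):
    """Extract the numeric time-stamp component of a package version."""
    return int(package.split(".")[2].split("-")[0])


def database_for(sha):
    """Return the newest main database reachable from `sha` through its ancestors."""
    seen, queue, found = {sha}, deque([sha]), []
    while queue:
        current = queue.popleft()
        if current in MAIN_PACKAGES:
            found.append(MAIN_PACKAGES[current])
        for parent in PARENTS[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    if not found:
        raise LookupError(f"no main database reachable from {sha}")
    return max(found, key=stamp)


print(database_for("topic-1"))
print(database_for("topic-merge"))
```

Before the merge, the only main check-ins reachable from the topic branch are the fork point and its ancestors; after the merge, f23f1fad becomes reachable and its newer database wins.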