Improve perf by avoiding vector reallocation in renderer clusters and VT output graphics options (#6420)

## Summary of the Pull Request
Caches vectors in the class and uses a new helper to opportunistically shrink/grow as viewport sizes change in order to save performance on alloc/free of commonly used vectors.

## PR Checklist
* [x] Scratches a perf itch.
* [x] I work here.
* [x] wil tests added
* [x] No add'l doc.
* [x] Am core contributor.

## Detailed Description of the Pull Request / Additional comments
Two fixes:
1. For outputting lots of text, the base renderer class spent a lot of time allocating and freeing and reallocating the `Cluster` vector that adapts the text buffer information into render clusters. I've now cached this vector in the base render class itself and I shrink/grow it based on the viewport update that happens at the top of every frame. To prevent too much thrashing in the downward/shrink direction, I wrote the `til::manage_vector` helper that contains a threshold to only shrink if it asks for small enough of a size relative to the existing one. I used 80% of the existing size as the threshold for this one.
2. For outputting lots of changing colors, the VT graphics output engine spent a bunch of time allocating and reallocating the vector for `GraphicsOptions`. This one doesn't really have a predictable size, but I never expect it to get extremely big. So I just held it in the base class.

## Validation Steps Performed
* [x] Ran the til unit test
* [x] Checked render cluster vector time before/after against `big.txt` from #1064
* [x] Checked VT graphics output vector time before/after against `cacafire`

Case | Before | After
---|---|---|
`big.txt` | ![image](https://user-images.githubusercontent.com/18221333/84088632-cbaa8400-a9a1-11ea-8932-04b2e12a0477.png) | ![image](https://user-images.githubusercontent.com/18221333/84088996-b6822500-a9a2-11ea-837c-5e32a110156e.png)
`cacafire` | ![image](https://user-images.githubusercontent.com/18221333/84089153-22648d80-a9a3-11ea-8567-c3d80efa16a6.png) | ![image](https://user-images.githubusercontent.com/18221333/84089190-34463080-a9a3-11ea-98e5-a236b12330d6.png)
This commit is contained in:
Michael Niksa 2020-06-10 15:02:05 -07:00 committed by GitHub
parent 8dcfd61278
commit 0c93b2ebbe
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 82 additions and 10 deletions

View file

@ -18,6 +18,26 @@
namespace til // Terminal Implementation Library. Also: "Today I Learned"
{
template<typename T>
void manage_vector(std::vector<T>& vector, typename std::vector<T>::size_type requestedSize, float shrinkThreshold)
{
const auto existingCapacity = vector.capacity();
const auto requiredCapacity = requestedSize;
// Check by integer first as float math is way more expensive.
if (requiredCapacity < existingCapacity)
{
// Now check if it's even worth shrinking. We don't want to shrink by 1 at a time, so meet a threshold first.
if (requiredCapacity <= gsl::narrow_cast<size_t>((static_cast<float>(existingCapacity) * shrinkThreshold)))
{
// There's no real way to force a shrink, so make a new one.
vector = std::vector<T>{};
}
}
// Reserve won't shrink on its own and won't grow if we have enough space.
vector.reserve(requiredCapacity);
}
}
// These sit outside the namespace because they sit outside for WIL too.

View file

@ -28,7 +28,8 @@ Renderer::Renderer(IRenderData* pData,
std::unique_ptr<IRenderThread> thread) :
_pData(pData),
_pThread{ std::move(thread) },
_destructing{ false }
_destructing{ false },
_clusterBuffer{}
{
_srViewportPrevious = { 0 };
@ -370,6 +371,12 @@ bool Renderer::_CheckViewportAndScroll()
_srViewportPrevious = srNewViewport;
// If we're keeping some buffers between calls, let them know about the viewport size
// so they can prepare the buffers for changes to either preallocate memory at once
// (instead of growing naturally) or shrink down to reduce usage as appropriate.
const size_t lineLength = gsl::narrow_cast<size_t>(til::rectangle{ srNewViewport }.width());
til::manage_vector(_clusterBuffer, lineLength, _shrinkThreshold);
if (coordDelta.X != 0 || coordDelta.Y != 0)
{
for (auto engine : _rgpEngines)
@ -683,7 +690,6 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
// we should have an iterator/view adapter for the rendering.
// That would probably also eliminate the RenderData needing to give us the entire TextBuffer as well...
// Retrieve the iterator for one line of information.
std::vector<Cluster> clusters;
size_t cols = 0;
// Retrieve the first color.
@ -714,7 +720,7 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
const auto currentRunTargetStart = screenPoint;
// Ensure that our cluster vector is clear.
clusters.clear();
_clusterBuffer.clear();
// Reset our flag to know when we're in the special circumstance
// of attempting to draw only the right-half of a two-column character
@ -746,7 +752,7 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
// If we're on the first cluster to be added and it's marked as "trailing"
// (a.k.a. the right half of a two column character), then we need some special handling.
if (clusters.empty() && it->DbcsAttr().IsTrailing())
if (_clusterBuffer.empty() && it->DbcsAttr().IsTrailing())
{
// If we have room to move to the left to start drawing...
if (screenPoint.X > 0)
@ -757,7 +763,7 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
trimLeft = true;
// And add one to the number of columns we expect it to take as we insert it.
columnCount = it->Columns() + 1;
clusters.emplace_back(it->Chars(), columnCount);
_clusterBuffer.emplace_back(it->Chars(), columnCount);
}
else
{
@ -770,7 +776,7 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
else
{
columnCount = it->Columns();
clusters.emplace_back(it->Chars(), columnCount);
_clusterBuffer.emplace_back(it->Chars(), columnCount);
}
if (columnCount > 1)
@ -785,7 +791,7 @@ void Renderer::_PaintBufferOutputHelper(_In_ IRenderEngine* const pEngine,
} while (it);
// Do the painting.
THROW_IF_FAILED(pEngine->PaintBufferLine({ clusters.data(), clusters.size() }, screenPoint, trimLeft, lineWrapped));
THROW_IF_FAILED(pEngine->PaintBufferLine({ _clusterBuffer.data(), _clusterBuffer.size() }, screenPoint, trimLeft, lineWrapped));
// If we're allowed to do grid drawing, draw that now too (since it will be coupled with the color data)
// We're only allowed to draw the grid lines under certain circumstances.

View file

@ -121,6 +121,9 @@ namespace Microsoft::Console::Render
SMALL_RECT _srViewportPrevious;
static constexpr float _shrinkThreshold = 0.8f;
std::vector<Cluster> _clusterBuffer;
std::vector<SMALL_RECT> _GetSelectionRects() const;
void _ScrollPreviousSelection(const til::point delta);
std::vector<SMALL_RECT> _previousSelection;

View file

@ -463,7 +463,9 @@ bool OutputStateMachineEngine::ActionCsiDispatch(const wchar_t wch,
size_t clearType = 0;
unsigned int function = 0;
DispatchTypes::EraseType eraseType = DispatchTypes::EraseType::ToEnd;
std::vector<DispatchTypes::GraphicsOptions> graphicsOptions;
// We hold the vector in the class because client applications that do a lot of color work
// would spend a lot of time reallocating/resizing the vector.
_graphicsOptions.clear();
DispatchTypes::AnsiStatusType deviceStatusType = static_cast<DispatchTypes::AnsiStatusType>(0); // there is no default status type.
size_t repeatCount = 0;
// This is all the args after the first arg, and the count of args not including the first one.
@ -502,7 +504,7 @@ bool OutputStateMachineEngine::ActionCsiDispatch(const wchar_t wch,
success = _GetEraseOperation(parameters, eraseType);
break;
case VTActionCodes::SGR_SetGraphicsRendition:
success = _GetGraphicsOptions(parameters, graphicsOptions);
success = _GetGraphicsOptions(parameters, _graphicsOptions);
break;
case VTActionCodes::DSR_DeviceStatusReport:
success = _GetDeviceStatusOperation(parameters, deviceStatusType);
@ -613,7 +615,7 @@ bool OutputStateMachineEngine::ActionCsiDispatch(const wchar_t wch,
TermTelemetry::Instance().Log(TermTelemetry::Codes::EL);
break;
case VTActionCodes::SGR_SetGraphicsRendition:
success = _dispatch->SetGraphicsRendition({ graphicsOptions.data(), graphicsOptions.size() });
success = _dispatch->SetGraphicsRendition({ _graphicsOptions.data(), _graphicsOptions.size() });
TermTelemetry::Instance().Log(TermTelemetry::Codes::SGR);
break;
case VTActionCodes::DSR_DeviceStatusReport:

View file

@ -71,6 +71,7 @@ namespace Microsoft::Console::VirtualTerminal
Microsoft::Console::ITerminalOutputConnection* _pTtyConnection;
std::function<bool()> _pfnFlushToTerminal;
wchar_t _lastPrintedChar;
std::vector<DispatchTypes::GraphicsOptions> _graphicsOptions;
bool _IntermediateScsDispatch(const wchar_t wch,
const std::basic_string_view<wchar_t> intermediates);

View file

@ -0,0 +1,36 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.
#include "precomp.h"
#include "til.h"
using namespace WEX::Common;
using namespace WEX::Logging;
using namespace WEX::TestExecution;
// Tests for things in the TIL base class.
class BaseTests
{
TEST_CLASS(BaseTests);
TEST_METHOD(ManageVector)
{
constexpr float shrinkThreshold = 0.5f;
std::vector<int> foo;
foo.reserve(20);
Log::Comment(L"Expand vector.");
til::manage_vector(foo, 30, shrinkThreshold);
VERIFY_ARE_EQUAL(30, foo.capacity());
Log::Comment(L"Try shrink but by not enough for threshold.");
til::manage_vector(foo, 18, shrinkThreshold);
VERIFY_ARE_EQUAL(30, foo.capacity());
Log::Comment(L"Shrink because it is meeting threshold.");
til::manage_vector(foo, 15, shrinkThreshold);
VERIFY_ARE_EQUAL(15, foo.capacity());
}
};

View file

@ -14,6 +14,7 @@ DLLDEF =
SOURCES = \
$(SOURCES) \
BaseTests.cpp \
BitmapTests.cpp \
ColorTests.cpp \
OperatorTests.cpp \

View file

@ -10,6 +10,7 @@
</PropertyGroup>
<Import Project="$(SolutionDir)src\common.build.pre.props" />
<ItemGroup>
<ClCompile Include="BaseTests.cpp" />
<ClCompile Include="BitmapTests.cpp" />
<ClCompile Include="OperatorTests.cpp" />
<ClCompile Include="PointTests.cpp" />

View file

@ -13,6 +13,8 @@
<ClCompile Include="RectangleTests.cpp" />
<ClCompile Include="BitmapTests.cpp" />
<ClCompile Include="OperatorTests.cpp" />
<ClCompile Include="MathTests.cpp" />
<ClCompile Include="BaseTests.cpp" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\precomp.h" />