terminal/src/terminal/parser/stateMachine.hpp
James Holderness 7fcff4d33a
Refactor VT control sequence identification (#7304)
This PR changes the way VT control sequences are identified and
dispatched, to be more efficient and easier to extend. Instead of
parsing the intermediate characters into a vector, and then having to
identify a sequence using both that vector and the final char, we now
use just a single `uint64_t` value as the identifier.

The identifier is constructed by taking the private parameter prefix,
each of the intermediate characters, and then the final character, and
shifting them into a 64-bit integer one byte at a time, in reverse
order, so that the first character ends up in the lowest byte. For
example, the `DECTLTC` control has a private parameter prefix of `?`,
one intermediate of `'`, and a final character of `s`. The ASCII values
of those characters are `0x3F`, `0x27`, and `0x73` respectively, and
reversing them gets you `0x73273F`, so that would then be the identifier
for the control.
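
As a rough illustration of that packing, here is a minimal sketch. The
`MakeSequenceId` helper is hypothetical and is not the `VTIDBuilder` used
in the actual code; it only assumes that the characters arrive in the order
prefix, intermediates, final.

```cpp
#include <cstdint>

// Hypothetical helper, not the VTIDBuilder from this PR. Packs up to eight
// characters into a 64-bit identifier, first character in the lowest byte.
// Any characters beyond the eighth are ignored.
constexpr uint64_t MakeSequenceId(const char* chars) noexcept
{
    uint64_t id = 0;
    for (int shift = 0; *chars != '\0' && shift < 64; ++chars, shift += 8)
    {
        id |= static_cast<uint64_t>(static_cast<unsigned char>(*chars)) << shift;
    }
    return id;
}

// DECTLTC: prefix '?' (0x3F), intermediate '\'' (0x27), final 's' (0x73).
static_assert(MakeSequenceId("?'s") == 0x73273F, "DECTLTC identifier");
```

Because the result is a plain integer, identifiers built this way could be
used directly as `switch` case labels.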

The reason for storing them in reverse order is that we sometimes need
to look at the first intermediate to determine the operation, and treat
the rest of the sequence as a kind of sub-identifier (the character set
designation sequences are one example of this). When in reverse order,
this can easily be achieved by masking the value with `0xFF` to get the
first intermediate from the low byte, and then shifting the value right
by 8 bits to get a new identifier with the rest of the sequence.
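
The mask-and-shift step could look something like the sketch below. The
`SplitId` struct and `SplitFirst` function are hypothetical names chosen
for illustration, and the example value assumes the `ESC ( B` character set
designation (intermediate `(` is `0x28`, final `B` is `0x42`).

```cpp
#include <cstdint>

// Hypothetical result type: the first character of the sequence, plus a
// new identifier made of the remaining characters.
struct SplitId
{
    char first;
    uint64_t rest;
};

// Hypothetical helper: the low byte holds the first character, and a right
// shift by 8 bits drops it, leaving the rest as a sub-identifier.
constexpr SplitId SplitFirst(const uint64_t id) noexcept
{
    return { static_cast<char>(id & 0xFF), id >> 8 };
}

// ESC ( B packs to 0x4228: '(' in the low byte, 'B' in the next one.
static_assert(SplitFirst(0x4228).first == '(', "first intermediate");
static_assert(SplitFirst(0x4228).rest == 0x42, "remaining identifier is 'B'");
```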

With 64 bits we have enough space for a private prefix, six
intermediates, and the final char, which is way more than we should ever
need (the _DEC STD 070_ specification recommends supporting at least
three intermediates, but in practice we're unlikely to see more than
two).

With this new way of identifying controls, it should now be possible for
every action code to be unique (for the most part). So I've also used
this PR to clean up the action codes a bit, splitting the codes for the
escape sequences from the control sequences, and sorting them into
alphabetical order (which also does a reasonable job of clustering
associated controls).

## Validation Steps Performed

I think the existing unit tests should be good enough to confirm that
all sequences are still being dispatched correctly. However, I've also
manually tested a number of sequences to make sure they were still
working as expected, in particular those that used intermediates, since
they were the most affected by the dispatch code refactoring.

Since these changes also affected the input state machine, I've done
some manual testing of the conpty keyboard handling (both with and
without the new Win32 input mode enabled) to make sure the keyboard VT
sequences were processed correctly. I've also manually tested the
various VT mouse modes in Vttest to confirm that they were still working
correctly too.

Closes #7276
2020-08-18 18:57:52 +00:00


// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.
/*
Module Name:
- stateMachine.hpp
Abstract:
- This declares the entire state machine for handling Virtual Terminal Sequences
- The design is based on the specifications at http://vt100.net
- The actual implementation of actions decoded by the StateMachine should be
  implemented in an IStateMachineEngine.
*/
#pragma once
#include "IStateMachineEngine.hpp"
#include "telemetry.hpp"
#include "tracing.hpp"
#include <memory>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

namespace Microsoft::Console::VirtualTerminal
{
    // The DEC STD 070 reference recommends supporting up to at least 16384 for
    // parameter values, so 32767 should be more than enough. At most we might
    // want to increase this to 65535, since that is what XTerm and VTE support,
    // but for now 32767 is the safest limit for our existing code base.
    constexpr size_t MAX_PARAMETER_VALUE = 32767;

    class StateMachine final
    {
#ifdef UNIT_TESTING
        friend class OutputEngineTest;
        friend class InputEngineTest;
#endif

    public:
        StateMachine(std::unique_ptr<IStateMachineEngine> engine);

        void SetAnsiMode(bool ansiMode) noexcept;

        void ProcessCharacter(const wchar_t wch);
        void ProcessString(const std::wstring_view string);

        void ResetState() noexcept;

        bool FlushToTerminal();

        const IStateMachineEngine& Engine() const noexcept;
        IStateMachineEngine& Engine() noexcept;

    private:
        void _ActionExecute(const wchar_t wch);
        void _ActionExecuteFromEscape(const wchar_t wch);
        void _ActionPrint(const wchar_t wch);
        void _ActionEscDispatch(const wchar_t wch);
        void _ActionVt52EscDispatch(const wchar_t wch);
        void _ActionCollect(const wchar_t wch) noexcept;
        void _ActionParam(const wchar_t wch);
        void _ActionCsiDispatch(const wchar_t wch);
        void _ActionOscParam(const wchar_t wch) noexcept;
        void _ActionOscPut(const wchar_t wch);
        void _ActionOscDispatch(const wchar_t wch);
        void _ActionSs3Dispatch(const wchar_t wch);
        void _ActionDcsPassThrough(const wchar_t wch);

        void _ActionClear();
        void _ActionIgnore() noexcept;

        void _EnterGround() noexcept;
        void _EnterEscape();
        void _EnterEscapeIntermediate() noexcept;
        void _EnterCsiEntry();
        void _EnterCsiParam() noexcept;
        void _EnterCsiIgnore() noexcept;
        void _EnterCsiIntermediate() noexcept;
        void _EnterOscParam() noexcept;
        void _EnterOscString() noexcept;
        void _EnterOscTermination() noexcept;
        void _EnterSs3Entry();
        void _EnterSs3Param() noexcept;
        void _EnterVt52Param() noexcept;
        void _EnterDcsEntry();
        void _EnterDcsParam() noexcept;
        void _EnterDcsIgnore() noexcept;
        void _EnterDcsIntermediate() noexcept;
        void _EnterDcsPassThrough() noexcept;
        void _EnterDcsTermination() noexcept;

        void _EventGround(const wchar_t wch);
        void _EventEscape(const wchar_t wch);
        void _EventEscapeIntermediate(const wchar_t wch);
        void _EventCsiEntry(const wchar_t wch);
        void _EventCsiIntermediate(const wchar_t wch);
        void _EventCsiIgnore(const wchar_t wch);
        void _EventCsiParam(const wchar_t wch);
        void _EventOscParam(const wchar_t wch) noexcept;
        void _EventOscString(const wchar_t wch);
        void _EventOscTermination(const wchar_t wch);
        void _EventSs3Entry(const wchar_t wch);
        void _EventSs3Param(const wchar_t wch);
        void _EventVt52Param(const wchar_t wch);
        void _EventDcsEntry(const wchar_t wch);
        void _EventDcsIgnore(const wchar_t wch) noexcept;
        void _EventDcsIntermediate(const wchar_t wch);
        void _EventDcsParam(const wchar_t wch);
        void _EventDcsPassThrough(const wchar_t wch);
        void _EventDcsTermination(const wchar_t wch);

        void _AccumulateTo(const wchar_t wch, size_t& value) noexcept;

        enum class VTStates
        {
            Ground,
            Escape,
            EscapeIntermediate,
            CsiEntry,
            CsiIntermediate,
            CsiIgnore,
            CsiParam,
            OscParam,
            OscString,
            OscTermination,
            Ss3Entry,
            Ss3Param,
            Vt52Param,
            DcsEntry,
            DcsIgnore,
            DcsIntermediate,
            DcsParam,
            DcsPassThrough,
            DcsTermination
        };

        Microsoft::Console::VirtualTerminal::ParserTracing _trace;

        std::unique_ptr<IStateMachineEngine> _engine;

        VTStates _state;

        bool _isInAnsiMode;

        std::wstring_view _run;

        VTIDBuilder _identifier;
        std::vector<size_t> _parameters;

        std::wstring _oscString;
        size_t _oscParameter;

        std::optional<std::wstring> _cachedSequence;

        // This is tracked per state machine instance so that separate calls to Process*
        // can start and finish a sequence.
        bool _processingIndividually;
    };
}