terminal/src/terminal/parser/stateMachine.hpp
James Holderness 2559ed6efa
Introduce a mechanism for passing through DCS data strings (#9307)
This PR introduces a mechanism via which DCS data strings can be passed
through directly to the dispatch method that will be handling them, so
the data can be processed as it is received, rather than being buffered
in the state machine. This also simplifies the way string termination is
handled, so it now more closely matches the behaviour of the original
DEC terminals.

* Initial support for DCS sequences was introduced in PR #6328.
* Handling of DCS (and other) C1 controls was added in PR #7340.
* This is a prerequisite for Sixel (#448) and Soft Font (#9164) support.

The way this now works, a `DCS` sequence is dispatched as soon as the
final character of the `VTID` is received. Based on that ID, the
`OutputStateMachineEngine` should forward the call to the corresponding
dispatch method, and its the responsibility of that method to return an
appropriate handler function for the sequence.

From then on, the `StateMachine` will pass on all of the remaining bytes
in the data string to the handler function. When a data string is
terminated (with `CAN`, `SUB`, or `ESC`), the `StateMachine` will pass
on one final `ESC` character to let the handler know that the sequence
is finished. The handler can also end a sequence prematurely by
returning false, and then all remaining data bytes will be ignored.

Note that once a `DCS` sequence has been dispatched, it's not possible
to abort the data string. Both `CAN` and `SUB` are considered valid
forms of termination, and an `ESC` doesn't necessarily have to be
followed by a `\` for the string terminator. This is because the data
string is typically processed as it's received. For example, when
outputting a Sixel image, you wouldn't erase the parts that had already
been displayed if the data string is terminated early.

With this new way of handling the string termination, I was also able to
simplify some of the `StateMachine` processing, and get rid of a few
states that are no longer necessary. These changes don't apply to the
`OSC` sequences, though, since we're more likely to want to match the
XTerm behavior for those cases (which requires a valid `ST` control for
the sequence to be accepted).

## Validation Steps Performed

For the unit tests, I've had to make a few changes to some of the
`OutputEngineTests` to account for the updated `StateMachine`
processing. I've also added a new `StateMachineTest` to confirm that the
data strings are correctly passed through to the string handler under
all forms of termination.

To test whether the framework is actually usable, I've been working on
DRCS Soft Font support branched off of this PR, and haven't encountered
any problems. To test the throughput speed, I also hacked together a
basic Sixel parser, and that seemed to perform reasonably well.

Closes #7316
2021-04-30 19:17:30 +00:00

167 lines
5.5 KiB
C++

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.
/*
Module Name:
- stateMachine.hpp
Abstract:
- This declares the entire state machine for handling Virtual Terminal Sequences
- The design is based from the specifications at http://vt100.net
- The actual implementation of actions decoded by the StateMachine should be
implemented in an IStateMachineEngine.
*/
#pragma once
#include "IStateMachineEngine.hpp"
#include "telemetry.hpp"
#include "tracing.hpp"
#include <memory>
namespace Microsoft::Console::VirtualTerminal
{
// The DEC STD 070 reference recommends supporting up to at least 16384 for
// parameter values, so 32767 should be more than enough. At most we might
// want to increase this to 65535, since that is what XTerm and VTE support,
// but for now 32767 is the safest limit for our existing code base.
constexpr size_t MAX_PARAMETER_VALUE = 32767;
// The DEC STD 070 reference requires that a minimum of 16 parameter values
// are supported, but most modern terminal emulators will allow around twice
// that number.
constexpr size_t MAX_PARAMETER_COUNT = 32;
class StateMachine final
{
#ifdef UNIT_TESTING
friend class OutputEngineTest;
friend class InputEngineTest;
#endif
public:
StateMachine(std::unique_ptr<IStateMachineEngine> engine);
void SetAnsiMode(bool ansiMode) noexcept;
void ProcessCharacter(const wchar_t wch);
void ProcessString(const std::wstring_view string);
void ResetState() noexcept;
bool FlushToTerminal();
const IStateMachineEngine& Engine() const noexcept;
IStateMachineEngine& Engine() noexcept;
private:
void _ActionExecute(const wchar_t wch);
void _ActionExecuteFromEscape(const wchar_t wch);
void _ActionPrint(const wchar_t wch);
void _ActionEscDispatch(const wchar_t wch);
void _ActionVt52EscDispatch(const wchar_t wch);
void _ActionCollect(const wchar_t wch) noexcept;
void _ActionParam(const wchar_t wch);
void _ActionCsiDispatch(const wchar_t wch);
void _ActionOscParam(const wchar_t wch) noexcept;
void _ActionOscPut(const wchar_t wch);
void _ActionOscDispatch(const wchar_t wch);
void _ActionSs3Dispatch(const wchar_t wch);
void _ActionDcsDispatch(const wchar_t wch);
void _ActionClear();
void _ActionIgnore() noexcept;
void _ActionInterrupt();
void _EnterGround() noexcept;
void _EnterEscape();
void _EnterEscapeIntermediate() noexcept;
void _EnterCsiEntry();
void _EnterCsiParam() noexcept;
void _EnterCsiIgnore() noexcept;
void _EnterCsiIntermediate() noexcept;
void _EnterOscParam() noexcept;
void _EnterOscString() noexcept;
void _EnterOscTermination() noexcept;
void _EnterSs3Entry();
void _EnterSs3Param() noexcept;
void _EnterVt52Param() noexcept;
void _EnterDcsEntry();
void _EnterDcsParam() noexcept;
void _EnterDcsIgnore() noexcept;
void _EnterDcsIntermediate() noexcept;
void _EnterDcsPassThrough() noexcept;
void _EnterSosPmApcString() noexcept;
void _EventGround(const wchar_t wch);
void _EventEscape(const wchar_t wch);
void _EventEscapeIntermediate(const wchar_t wch);
void _EventCsiEntry(const wchar_t wch);
void _EventCsiIntermediate(const wchar_t wch);
void _EventCsiIgnore(const wchar_t wch);
void _EventCsiParam(const wchar_t wch);
void _EventOscParam(const wchar_t wch) noexcept;
void _EventOscString(const wchar_t wch);
void _EventOscTermination(const wchar_t wch);
void _EventSs3Entry(const wchar_t wch);
void _EventSs3Param(const wchar_t wch);
void _EventVt52Param(const wchar_t wch);
void _EventDcsEntry(const wchar_t wch);
void _EventDcsIgnore() noexcept;
void _EventDcsIntermediate(const wchar_t wch);
void _EventDcsParam(const wchar_t wch);
void _EventDcsPassThrough(const wchar_t wch);
void _EventSosPmApcString(const wchar_t wch) noexcept;
void _AccumulateTo(const wchar_t wch, size_t& value) noexcept;
enum class VTStates
{
Ground,
Escape,
EscapeIntermediate,
CsiEntry,
CsiIntermediate,
CsiIgnore,
CsiParam,
OscParam,
OscString,
OscTermination,
Ss3Entry,
Ss3Param,
Vt52Param,
DcsEntry,
DcsIgnore,
DcsIntermediate,
DcsParam,
DcsPassThrough,
SosPmApcString
};
Microsoft::Console::VirtualTerminal::ParserTracing _trace;
std::unique_ptr<IStateMachineEngine> _engine;
VTStates _state;
bool _isInAnsiMode;
std::wstring_view _run;
VTIDBuilder _identifier;
std::vector<VTParameter> _parameters;
bool _parameterLimitReached;
std::wstring _oscString;
size_t _oscParameter;
IStateMachineEngine::StringHandler _dcsStringHandler;
std::optional<std::wstring> _cachedSequence;
// This is tracked per state machine instance so that separate calls to Process*
// can start and finish a sequence.
bool _processingIndividually;
};
}