terminal/src/terminal/parser/telemetry.hpp
James Holderness 6742965bb8
Disable the acceptance of C1 control codes by default (#11690)
There are some code pages with "unmapped" code points in the C1 range,
which results in them being translated into Unicode C1 control codes,
even though that is not their intended use. To prevent these characters
from triggering unintended escape sequences, this PR disables C1
controls by default.

Switching to ISO-2022 encoding will re-enable them, though, since that
is the most likely scenario in which they would be required. They can
also be explicitly enabled, even in UTF-8 mode, with the `DECAC1` escape
sequence.

What I've done is add a new mode to the `StateMachine` class that
controls whether C1 code points are interpreted as control characters or
not. When disabled, these code points are simply dropped from the
output, similar to the way a `NUL` is interpreted.
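
A minimal sketch of that behavior, assuming a much simpler parser than
the real `StateMachine`. The `C1FilterSketch` type and its `Filter`
method are hypothetical names used only for illustration. The point it
shows is that a C1 code point (U+0080 to U+009F) never reaches the
printed output, whether or not the mode is enabled; the mode only
decides if it is treated as a control.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the parser mode described above.
// This is not the real StateMachine class.
struct C1FilterSketch
{
    bool acceptC1 = false; // C1 controls are disabled by default

    // C1 control codes occupy U+0080 through U+009F.
    static bool IsC1(const wchar_t wch) noexcept
    {
        return wch >= 0x80 && wch <= 0x9F;
    }

    // Returns the printable text. controlsSeen reports how many C1 code
    // points would have been dispatched as controls (only when enabled).
    std::wstring Filter(const std::wstring& input, std::size_t& controlsSeen) const
    {
        std::wstring output;
        controlsSeen = 0;
        for (const auto wch : input)
        {
            if (wch == L'\0')
            {
                continue; // NUL is dropped from the output
            }
            if (IsC1(wch))
            {
                if (acceptC1)
                {
                    ++controlsSeen; // would be handled as a control code
                }
                continue; // in either mode, it is never printed
            }
            output.push_back(wch);
        }
        return output;
    }
};
```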

This isn't exactly the way they were handled in the v1 console (which I
think replaces them with the font _notdef_ glyph), but it matches the
XTerm behavior, which seems more appropriate considering this is in VT
mode. And it's worth noting that Windows Explorer seems to work the same
way.

As mentioned above, the mode can be enabled by designating the ISO-2022
coding system with a `DOCS` sequence, and it will be disabled again when
UTF-8 is designated. You can also enable it explicitly with a `DECAC1`
sequence (originally this was actually a DEC printer sequence, but it
doesn't seem unreasonable to use it in a terminal).
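
As a rough illustration of how those sequences affect the mode, the
sketch below assumes the usual encodings: `ESC % @` for the ISO-2022
`DOCS` designation, `ESC % G` for UTF-8, and `ESC SP 7` for `DECAC1`.
Those byte forms are an assumption here (the text above does not spell
them out), and `ApplySequence` is a hypothetical helper, not part of
the adapter.

```cpp
#include <cassert>
#include <string>

// Assumed encodings: DOCS is ESC % <final>, and DECAC1 is ESC SP 7.
const std::string DocsIso2022 = "\x1b%@";
const std::string DocsUtf8 = "\x1b%G";
const std::string Decac1 = "\x1b 7";

// Hypothetical helper: returns the state of the C1 mode after the
// given sequence has been processed.
inline bool ApplySequence(const bool acceptC1, const std::string& seq)
{
    if (seq == DocsIso2022 || seq == Decac1)
    {
        return true; // ISO-2022 or an explicit DECAC1 enables C1 controls
    }
    if (seq == DocsUtf8)
    {
        return false; // designating UTF-8 disables them again
    }
    return acceptC1; // any other sequence leaves the mode unchanged
}
```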

I've also extended the operations that save and restore "cursor state"
(e.g. `DECSC` and `DECRC`) to include the state of the C1 parser mode,
since it's closely tied to the code page and character sets which are
also saved there. Similarly, when a `DECSTR` sequence resets the code
page and character sets, I've now made it reset the C1 mode as well.
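
The save/restore relationship can be sketched as follows. This is a
heavily reduced model with hypothetical names (`CursorStateSketch`,
`TerminalSketch`); the real cursor state holds far more than a position
and one flag, and the reset shown here covers only the C1 mode.

```cpp
#include <cassert>

// Hypothetical, reduced model of the state captured by DECSC.
struct CursorStateSketch
{
    int row = 0;
    int column = 0;
    bool c1ControlsAccepted = false; // saved alongside the cursor state
};

class TerminalSketch
{
public:
    CursorStateSketch current;

    // DECSC: capture the current state, including the C1 parser mode.
    void SaveCursor() noexcept { _saved = current; }

    // DECRC: restore everything that DECSC captured.
    void RestoreCursor() noexcept { current = _saved; }

    // DECSTR: put the C1 mode back to its default (disabled).
    void SoftReset() noexcept { current.c1ControlsAccepted = false; }

private:
    CursorStateSketch _saved;
};
```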

I should note that the new `StateMachine` mode is controlled via a
generic `SetParserMode` method (with a matching API in the `ConGetSet`
interface) to allow for easier addition of other modes in the future.
And I've reimplemented the existing ANSI/VT52 mode in terms of these
generic methods, so it no longer needs its own separate APIs.
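
One way such a generic mode API could look is sketched below. The enum
values, the `GetParserMode` accessor, and the `std::bitset` storage are
assumptions made for illustration; only the `SetParserMode` name comes
from the description above.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical mode list; the real enum may differ.
enum class ParserModeSketch : std::size_t
{
    AcceptC1,
    Ansi,
    Count // only used to size the bitset
};

class StateMachineSketch
{
public:
    // ANSI mode starts enabled, C1 acceptance starts disabled.
    StateMachineSketch() { SetParserMode(ParserModeSketch::Ansi, true); }

    // One setter covers every mode, so adding a mode only means
    // adding an enum value.
    void SetParserMode(const ParserModeSketch mode, const bool enabled)
    {
        _modes.set(static_cast<std::size_t>(mode), enabled);
    }

    bool GetParserMode(const ParserModeSketch mode) const
    {
        return _modes.test(static_cast<std::size_t>(mode));
    }

private:
    std::bitset<static_cast<std::size_t>(ParserModeSketch::Count)> _modes;
};
```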

## Validation Steps Performed

Some of the unit tests for OSC sequences were using a C1 `0x9C` for the
string terminator, which doesn't work by default anymore. Since that's
not a good practice anyway, I thought it best to change those to a
standard 7-bit terminator. However, in tests that were explicitly
validating the C1 controls, I've just enabled the C1 parser mode at the
start of the tests in order to get them working again.
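
For reference, a 7-bit string terminator is the two-character sequence
`ESC \`, in contrast to the single C1 character `0x9C`. The helpers
below are hypothetical and are not taken from the test code; they only
show how an OSC string can be built without any C1 code points.

```cpp
#include <cassert>
#include <string>

// Builds an OSC sequence from 7-bit characters only:
// ESC ] <payload> ESC \ .
inline std::wstring MakeOsc7Bit(const std::wstring& payload)
{
    return L"\x1b]" + payload + L"\x1b\\";
}

// Returns true if the string contains any code point in the C1 range.
inline bool ContainsC1(const std::wstring& text) noexcept
{
    for (const auto wch : text)
    {
        if (wch >= 0x80 && wch <= 0x9F)
        {
            return true;
        }
    }
    return false;
}
```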

There were also some ANSI mode adapter tests that had to be updated to
account for the fact that it has now been reimplemented in terms of the
`SetParserMode` API.

I've added a new state machine test to validate the changes in behavior
when the C1 parser mode is enabled or disabled. And I've added an
adapter test to verify that the `DesignateCodingSystems` and
`AcceptC1Controls` methods toggle the C1 parser mode as expected.

I've manually verified the test cases in #10069 and #10310 to confirm
that they're no longer triggering control sequences by default.
As I explained above, though, the C1 code points are completely dropped
from the output rather than displayed as _notdef_ glyphs. I think this
is a reasonable compromise.

Closes #10069
Closes #10310
2021-11-17 23:40:31 +00:00


// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.
/*
Module Name:
- telemetry.hpp
Abstract:
- This module is used for recording all telemetry feedback from the console virtual terminal parser
*/
#pragma once
// Including TraceLogging essentials for the binary
#include <windows.h>
#include <winmeta.h>
#include <TraceLoggingProvider.h>
#include <climits>
TRACELOGGING_DECLARE_PROVIDER(g_hConsoleVirtTermParserEventTraceProvider);
namespace Microsoft::Console::VirtualTerminal
{
    class TermTelemetry sealed
    {
    public:
        // Implement this as a singleton class.
        static TermTelemetry& Instance() noexcept
        {
            static TermTelemetry s_Instance;
            return s_Instance;
        }

        // Names primarily from http://inwap.com/pdp10/ansicode.txt
        enum Codes
        {
            CUU = 0,
            CUD,
            CUF,
            CUB,
            CNL,
            CPL,
            CHA,
            CUP,
            ED,
            EL,
            SGR,
            DECSC,
            DECRC,
            DECSET,
            DECRST,
            DECKPAM,
            DECKPNM,
            DSR,
            DA,
            DA2,
            DA3,
            DECREQTPARM,
            VPA,
            HPR,
            VPR,
            ICH,
            DCH,
            SU,
            SD,
            ANSISYSSC,
            ANSISYSRC,
            IL,
            DL,
            DECSTBM,
            NEL,
            IND,
            RI,
            OSCWT,
            HTS,
            CHT,
            CBT,
            TBC,
            ECH,
            DesignateG0,
            DesignateG1,
            DesignateG2,
            DesignateG3,
            LS2,
            LS3,
            LS1R,
            LS2R,
            LS3R,
            SS2,
            SS3,
            DOCS,
            HVP,
            DECSTR,
            RIS,
            DECSCUSR,
            DTTERM_WM,
            OSCCT,
            OSCSCC,
            OSCRCC,
            REP,
            OSCFG,
            OSCBG,
            DECAC1,
            DECSWL,
            DECDWL,
            DECDHL,
            DECALN,
            OSCSCB,
            XTPUSHSGR,
            XTPOPSGR,
            // Only use this last enum as a count of the number of codes.
            NUMBER_OF_CODES
        };

        void Log(const Codes code) noexcept;
        void LogFailed(const wchar_t wch) noexcept;
        void SetShouldWriteFinalLog(const bool writeLog) noexcept;
        void SetActivityId(const GUID* activityId) noexcept;
        unsigned int GetAndResetTimesUsedCurrent() noexcept;
        unsigned int GetAndResetTimesFailedCurrent() noexcept;
        unsigned int GetAndResetTimesFailedOutsideRangeCurrent() noexcept;

    private:
        // Used to prevent multiple instances
        TermTelemetry() noexcept;
        ~TermTelemetry();
        TermTelemetry(TermTelemetry const&) = delete;
        TermTelemetry(TermTelemetry&&) = delete;
        TermTelemetry& operator=(const TermTelemetry&) = delete;
        TermTelemetry& operator=(TermTelemetry&&) = delete;

        void WriteFinalTraceLog() const;

        unsigned int _uiTimesUsedCurrent;
        unsigned int _uiTimesFailedCurrent;
        unsigned int _uiTimesFailedOutsideRangeCurrent;
        unsigned int _uiTimesUsed[NUMBER_OF_CODES];
        unsigned int _uiTimesFailed[CHAR_MAX + 1];
        unsigned int _uiTimesFailedOutsideRange;
        GUID _activityId;
        bool _fShouldWriteFinalLog;
    };
}