kibana/x-pack/plugins
Ryland Herrick fbe48221ae
[Security Solution][Detections] Signals Migration API (#84721)
* WIP: basic reindexing works, lots of edge cases and TODOs to tackle

* Add note

* Add version metadata to signals documents

* WIP: Starting over from the ground up

* Removes obsolete endpoints/functions
* Adds endpoint for checking the migration status of signals indices
* Adds helper functions to represent the logical pieces of answering
  that question

* Fleshing out upgrade of signals

* triggers reindex for each index
* starts implementing followup endpoint to "finalize" after reindexing
  is finished

* Fleshing out more of the upgrade path

Still moving logic around a bunch.

* Pad the version number of our destination migration index

Instead of e.g. `.siem-signals-default-000001-r5`, this will generate
`.siem-signals-default-000001-r000005`.

This shouldn't matter much, but it may make it easier for users at a
glance to see the story of each index.
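
A minimal sketch of the padded naming described above. The helper name `destinationIndexName` and its `width` parameter are hypothetical and chosen for illustration; only the input and output shapes come from the example in this commit message.

```typescript
// Hypothetical helper: the name and signature are assumptions, not the plugin's API.
// Appends "-r" plus a zero-padded version to the source index name.
function destinationIndexName(sourceIndex: string, version: number, width: number = 6): string {
  const padded = String(version).padStart(width, '0');
  return `${sourceIndex}-r${padded}`;
}

console.log(destinationIndexName('.siem-signals-default-000001', 5));
// prints ".siem-signals-default-000001-r000005"
```

Fixed-width suffixes also sort lexicographically in version order, which is one reason padding can make an index listing easier to scan.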

* Fleshing out more upgrade finalization

* Verifies that task matches the specified parameters
* Verifies that document counts are the same
* updates aliases
* finalization endpoint requires both source/dest indexes since we can't
  determine that from the task itself.

* Ensure that new signals are generated with an appropriate schema_version

* Apply migration cleanup policy to obsolete signals indexes

After upgrading a particular signals index, we're left with both the old
and new copies of the index. While the former is unlinked, it's still
taking up disk space; this ensures that it will eventually be deleted,
but gives users enough time to recover data if necessary.

This also ensures that, as with the normal signals ILM policy, it is
present during our normal sanity checks.

* Move more logic into component functions

* Fix type errors

* Refactor to make things a little more organized

* Moves migration-related routes under signals/ to match their routing
* Generalizes migration-agnostic helpers, moves them to appropriate
  folders (namely index/)
* Inlined getMigrationStatusInRange, a hyper-specific function with
  limited utility elsewhere

* Add some JSDoc comments around our new functions

This is as much to get my thoughts in order as it is for posterity.

Next: tests!

* Adds integration tests around migration status route

* Adds io-ts schema for route params
* Adds es_archiver data to represent an outdated signals index

* Adds API integration tests for our signals upgrade endpoint

* Adds io-ts schema for route params
* Adds second signals index archive, updates docs
* Adds test helper to wait for a given index to have documents
* Adds test helper to retrieve the relevant index name from a call to
  esArchive.load

* WIP: Fleshing out finalization tests

* Consolidate terminology around a migration

We're no longer making a distinction between an upgrade vs. an update
vs. a migration vs. a reindex: a migration is the concept that
encompasses this work. Both an index and individual documents can
require a migration, but both follow the same code path to migrate.

* Implement encoding of migration details

This will be a slightly better API: rather than having to pass all three
fields to finalize the migration, API users can instead send the token.
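
One way such a token could work, as a rough sketch only. It assumes the three fields are the source index, the destination index, and the reindex task ID, and it uses URI-encoded JSON purely for illustration; the actual field names and encoding are not stated here, and every identifier below is hypothetical.

```typescript
// Hypothetical shape of the details packed into a token (an assumption).
interface MigrationDetails {
  sourceIndex: string;
  destinationIndex: string;
  taskId: string;
}

// Pack the details into a single opaque string.
function encodeMigrationToken(details: MigrationDetails): string {
  return encodeURIComponent(JSON.stringify(details));
}

// Unpack a token, rejecting anything that lacks the three string fields.
function decodeMigrationToken(token: string): MigrationDetails {
  const parsed = JSON.parse(decodeURIComponent(token));
  const { sourceIndex, destinationIndex, taskId } = parsed;
  if (
    typeof sourceIndex !== 'string' ||
    typeof destinationIndex !== 'string' ||
    typeof taskId !== 'string'
  ) {
    throw new Error('Invalid migration token');
  }
  return { sourceIndex, destinationIndex, taskId };
}
```

A caller would keep the token returned when the migration is created and send it back unchanged to finalize.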

* Better transformation of errors thrown from the elasticsearch client

These often contain detailed information that we were previously
dropping. This will give better info on the migration finalization
endpoint, but should give more information across all detection_engine
endpoints in the case of an es client error.

* Finishing integration tests around finalization endpoint

This led to a few changes in the responses from our different
endpoints; mainly, we pass both the migration token AND its constituent
parts to aid in debugging.

* Test an error case due to a reindexing failure

This would be really hard to reproduce with an integration test since
we'd need to generate a specific reindex failure. Much easier to stub
some ES calls to exercise that code in a unit test.

* Remove unnecessary version info from signals documents

We now record a single document-level version field. This represents the
version of the document's _source, which is generated by our rule
execution.

When either a mapping _or_ a transformation is added, this version will
be bumped such that new signals will contain the newest version, while
the index itself may still contain the old mappings.

The transformation pipeline will use the signal version to short-circuit
unnecessary transformations.

* Migrate an index relative to the ACTUAL template version

This handles the case where a user is attempting to migrate, but has not
yet rolled over to the newest template. Running rules may insert "new"
signals into an "old" index, but from the perspective of the app no
migration is necessary in that case.

If/when they roll over, the aforementioned index (and possibly older
ones) will be qualified as outdated, and can be migrated.

* Enrich our migration_status endpoint with an is_outdated qualification

This can be determined programmatically, but for users manually
interpreting this response, the qualification will help.

* Update migration scripts

* More uniform version checking

* getIndexVersion always returns a number
* version comparisons use isOutdated
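
The two points above can be sketched roughly as follows. Where the version is stored (here a `_meta.version` field on the mapping) and the fallback of `0` are assumptions for illustration; the function bodies and signatures are guesses, not the plugin's code.

```typescript
// Hypothetical mapping shape: the location of the version is an assumption.
interface IndexMapping {
  _meta?: { version?: number };
}

// Always yields a number: a missing version is treated as 0 (assumed fallback).
function getIndexVersion(mapping: IndexMapping): number {
  return mapping._meta?.version ?? 0;
}

// Single place for version comparisons.
function isOutdated({ current, target }: { current: number; target: number }): boolean {
  return current < target;
}

const templateVersion = 5; // illustrative value
console.log(isOutdated({ current: getIndexVersion({}), target: templateVersion }));
// prints "true"
```

Because `getIndexVersion` never returns `undefined`, callers can compare versions without null checks.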

* Fix signal generation unit tests

We now generate a version field to indicate the version under which the
signal was created/migrated.

* Support reindex options to be sent to create_migration endpoint

Rather than having to perform a manual reindex, this should give API
users some control over the performance of their automated migration.

* Fix signal generation integration tests

These were failing on our new signal field.

* Add unit tests for getMigrationStatus

* Add a basic test for getSignalsIndicesInRange

Since this is ultimately just an aggregation query there's not much else
to test.

* Add unit test for the naming of our destination migration index

* Handle write indices in our migration logic

* Treat write indices as any other index in migration status endpoint
* Migration API rejects requests containing write indices
* Migration API rejects requests containing unknown/non-signals indices

* Add original hot phase to migration cleanup policy

Without this phase, ILM gets confused as it tries to move to the delete
phase and fails.

* Update old comment

The referenced field has changed.

* Delete task document as part of finalization

* Accurately report recoverable errors on create_signals_migration route

If we have a recoverable error: e.g. the destination index already
exists, or a specified index is a write index, we now report those
errors as part of the normal 200 response as these do not preclude other
specified indices from being migrated.

However, if non-signals indices are specified, we do continue to reject
the entire request, as that's indicative of misuse of the endpoint.
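
The behavior described above might look roughly like this. It is a sketch under stated assumptions: `ClusterState`, `IndexResult`, `planMigrations`, and the error strings are all hypothetical, and the real route talks to Elasticsearch rather than to an in-memory snapshot.

```typescript
// Hypothetical snapshot of what the route knows about the cluster.
interface ClusterState {
  signalsIndices: string[];
  writeIndices: string[];
  existingIndices: string[];
}

// Hypothetical per-index entry in the 200 response.
interface IndexResult {
  index: string;
  migrationIndex: string | null;
  error: string | null;
}

function planMigrations(
  requested: string[],
  state: ClusterState,
  destinationFor: (index: string) => string
): IndexResult[] {
  // Non-signals indices indicate misuse, so the whole request is rejected.
  const unknown = requested.filter((index) => !state.signalsIndices.includes(index));
  if (unknown.length > 0) {
    throw new Error(`Not signals indices: ${unknown.join(', ')}`);
  }
  // Recoverable problems are reported per index and do not block the others.
  return requested.map((index) => {
    if (state.writeIndices.includes(index)) {
      return { index, migrationIndex: null, error: 'index is a write index' };
    }
    const destination = destinationFor(index);
    if (state.existingIndices.includes(destination)) {
      return { index, migrationIndex: null, error: 'destination index already exists' };
    }
    return { index, migrationIndex: destination, error: null };
  });
}
```

Throwing only for non-signals indices keeps the distinction between a request that is wrong as a whole and one that is partly satisfiable.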
2020-12-10 13:12:39 -06:00
actions [Alerting & Actions ] More debug logging (#85149) 2020-12-08 18:41:20 -05:00
alerting_builtins Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
alerts [Alerting] Introduces a ActionSubGroup which allows for more granular action group scheduling (#84751) 2020-12-10 15:16:42 +00:00
apm [APM] Service overview: Dependencies table (#83416) 2020-12-10 10:32:01 +01:00
audit_trail Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
beats_management Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
canvas Upgrade EUI to v30.5.1 (#84677) 2020-12-04 09:39:03 -07:00
case [Security Solution][Case] Add in-progress status to case (#84321) 2020-12-04 21:36:23 +02:00
cloud Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
code Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
console_extensions [ML] Update console autocomplete for ML data frame evaluate API (#83151) 2020-11-17 12:48:25 +02:00
cross_cluster_replication [Telemetry] Introduce UI Counters (#84224) 2020-12-04 17:47:04 +02:00
dashboard_enhanced Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
dashboard_mode Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
data_enhanced [Search] Session SO polling (#84225) 2020-12-09 14:05:01 +02:00
discover_enhanced Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
drilldowns Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
embeddable_enhanced Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
encrypted_saved_objects ECS audit events for alerting (#84113) 2020-12-04 19:13:30 +00:00
enterprise_search [Workplace Search] Polish Workplace Search Sources & Groups UI (#85071) 2020-12-08 15:30:41 -06:00
event_log [Alerting] Introduces a ActionSubGroup which allows for more granular action group scheduling (#84751) 2020-12-10 15:16:42 +00:00
features Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
file_upload Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
fleet [Fleet][EPM] Move SO work from getFileHandler to service method (#85594) 2020-12-10 13:28:41 -05:00
global_search [GS] add tag and dashboard suggestion results (#85144) 2020-12-09 11:05:59 +01:00
global_search_bar [GS] adding tags UI to search results (#85084) 2020-12-10 11:16:21 -06:00
global_search_providers [GS] adding tags UI to search results (#85084) 2020-12-10 11:16:21 -06:00
graph [Graph] Fix graph saved object references (#85295) 2020-12-10 09:36:06 +01:00
grokdebugger [Grokdebugger] Fix simulate error handling (#83036) 2020-11-11 15:43:17 +01:00
index_lifecycle_management fix serialization of rollover (#85582) 2020-12-10 17:53:01 +01:00
index_management Integrate painless autocomplete in runtime fields editor (#84943) 2020-12-07 12:55:53 -05:00
infra [Logs UI] Custom rendering for <LogStream /> columns (#85148) 2020-12-10 17:19:40 +01:00
ingest_manager Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
ingest_pipelines [Telemetry] Introduce UI Counters (#84224) 2020-12-04 17:47:04 +02:00
lens Lens save modal should conditionally save to library (#85568) 2020-12-10 09:55:52 -06:00
license_management Upgrade EUI to v30.5.1 (#84677) 2020-12-04 09:39:03 -07:00
licensing Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
lists Migrate API keys functionality to a new Elasticsearch client. (#85029) 2020-12-09 20:43:24 +01:00
logstash Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
maps [Maps] fix unlinking an embedded map by reference Causes Error (#85485) 2020-12-10 08:31:28 -07:00
maps_legacy_licensing [Maps] lazy load maps_legacy, tile_map, and region_map bundle (#78027) 2020-09-24 12:45:43 -06:00
ml [ML] Adds security_linux and security_windows Modules (#85065) 2020-12-10 14:02:41 -05:00
monitoring [Monitoring] Optimizing alerting code (#83681) 2020-12-08 10:16:06 -05:00
observability [APM] Service overview: Dependencies table (#83416) 2020-12-10 10:32:01 +01:00
painless_lab Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
remote_clusters [Telemetry] Introduce UI Counters (#84224) 2020-12-04 17:47:04 +02:00
reporting [Reporting] Bump puppeteer 5.4.1 + roll chromium rev (#85066) 2020-12-04 14:50:07 -08:00
rollup Remove 'minute' frequency option from SLM policy form because ES won't allow a frequency faster than every 15 minutes. (#84854) 2020-12-09 09:14:57 -08:00
runtime_fields Add help text for runtime fields source. (#85204) 2020-12-08 14:51:47 -05:00
saved_objects_tagging [GS] adding tags UI to search results (#85084) 2020-12-10 11:16:21 -06:00
searchprofiler Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
security Require gold license for ECS audit logging (#85537) 2020-12-10 16:34:26 +00:00
security_solution [Security Solution][Detections] Signals Migration API (#84721) 2020-12-10 13:12:39 -06:00
snapshot_restore Remove 'minute' frequency option from SLM policy form because ES won't allow a frequency faster than every 15 minutes. (#84854) 2020-12-09 09:14:57 -08:00
spaces [jest] fix errors and warnings (#85291) 2020-12-09 15:04:21 +01:00
stack_alerts Geo containment alert sparsity handling: preserve active status for non-updated alerts (#85364) 2020-12-10 07:27:01 -07:00
task_manager Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
telemetry_collection_xpack Add bulk assign action to tag management (#84177) 2020-12-07 11:18:43 +01:00
transform [Transform] Replace legacy elasticsearch client (#84932) 2020-12-09 12:55:54 +01:00
translations [ILM] Add shrink field to hot phase (#84087) 2020-12-10 10:50:31 +01:00
triggers_actions_ui [Alerts] Hide case connector (#85398) 2020-12-09 21:38:41 +02:00
ui_actions_enhanced [jest] fix errors and warnings (#85291) 2020-12-09 15:04:21 +01:00
upgrade_assistant Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
uptime [Uptime ]Update empty message for certs list (#78575) 2020-12-10 10:08:06 +01:00
vis_type_timeseries_enhanced Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
watcher Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00
xpack_legacy Jest multi-project configuration (#77894) 2020-12-02 11:42:23 -08:00