*Maps* comes with https://maps.elastic.co/#file[predefined regions] that allow you to quickly visualize regions by metrics. *Maps* also offers the ability to map your own regions. You can use any region data you'd like, as long as your source data contains an identifier for the corresponding region.
But how can you map regions when your source data does not contain a region identifier? This is where reverse geocoding comes in. Reverse geocoding is the process of assigning a region identifier to a feature based on its location.
GeoIP is a common way of transforming an IP address to a longitude and latitude. GeoIP is roughly accurate on the city level globally and neighborhood level in selected countries. It’s not as good as an actual GPS location from your phone, but it’s much more precise than just a country, state, or province.
You’ll use the <<get-started, web logs sample data set>> that comes with Kibana for this tutorial. Web logs sample data set has longitude and latitude. If your web log data does not contain longitude and latitude, use {ref}/geoip-processor.html[GeoIP processor] to transform an IP address into a {ref}/geo-point.html[geo_point] field.
To install web logs sample data set:
. On the home page, click *Try sample data*.
. On the *Sample web logs* card, click *Add data*.
[float]
=== Step 2: Index Combined Statistical Area (CSA) regions
GeoIP level of detail is very useful for driving decision-making. For example, say you want to spin up a marketing campaign based on the locations of your users or show executive stakeholders which metro areas are experiencing an uptick of traffic.
That kind of scale in the United States is often captured with what the Census Bureau calls the Combined Statistical Area (CSA). Combined Statistical Area is roughly equivalent with how people intuitively think of which urban area they live in. It does not necessarily coincide with state or city boundaries.
CSAs generally share the same telecom providers and ad networks. New fast food franchises expand to a CSA rather than a particular city or municipality. Basically, people in the same CSA shop in the same IKEA.
To get the CSA boundary data:
. Download the https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html[Cartographic Boundary shapefile (.shp)] from the Census Bureau’s website.
. To use the data in Kibana, convert it to GeoJSON format. Follow this https://gist.github.com/YKCzoli/b7f5ff0e0f641faba0f47fa5d16c4d8d[helpful tutorial] to use QGIS to convert the Cartographic Boundary shapefile to GeoJSON. Or, download a https://raw.githubusercontent.com/elastic/examples/master/blog/reverse-geocoding/csba.json[prebuilt GeoJSON version].
Once you have your GeoJSON file:
. Open the main menu, and click *Maps*.
. Click *Create map*.
. Click *Add layer*.
. Click *Upload GeoJSON*.
. Use the file chooser to import the CSA GeoJSON file.
. Set index name to *csa* and click *Import file*.
. When importing is complete, click *Add as document layer*.
. Add Tooltip fields:
.. Click *+ Add* to open field select.
.. Select *NAME*, *GEOID*, and *AFFGEOID*.
.. Click *Add*.
. Click *Save & close*.
Looking at the map, you get a sense of what constitutes a metro area in the eyes of the Census Bureau.
To visualize CSA regions by web log traffic, the web log traffic must contain a CSA region identifier. You'll use {es} {ref}/enrich-processor.html[enrich processor] to add CSA region identifiers to the web logs sample data set. You can skip this step if your source data already contains region identifiers.
. Open the main menu, then click *Dev Tools*.
. In *Console*, create a {ref}/geo-match-enrich-policy-type.html[geo_match enrichment policy]:
+
[source,js]
----------------------------------
PUT /_enrich/policy/csa_lookup
{
"geo_match": {
"indices": "csa",
"match_field": "coordinates",
"enrich_fields": [ "GEOID", "NAME"]
}
}
----------------------------------
. To initialize the policy, run:
+
[source,js]
----------------------------------
POST /_enrich/policy/csa_lookup/_execute
----------------------------------
. To create a ingest pipeline, run:
+
[source,js]
----------------------------------
PUT _ingest/pipeline/lonlat-to-csa
{
"description": "Reverse geocode longitude-latitude to combined statistical area",
"processors": [
{
"enrich": {
"field": "geo.coordinates",
"policy_name": "csa_lookup",
"target_field": "csa",
"ignore_missing": true,
"ignore_failure": true,
"description": "Lookup the csa identifier"
}
},
{
"remove": {
"field": "csa.coordinates",
"ignore_missing": true,
"ignore_failure": true,
"description": "Remove the shape field"
}
}
]
}
----------------------------------
. To update your existing data, run:
+
[source,js]
----------------------------------
POST kibana_sample_data_logs/_update_by_query?pipeline=lonlat-to-csa
----------------------------------
. To run the pipeline on new documents at ingest, run:
. Open the <<set-time-filter, time filter>>, and set the time range to the last 30 days.
. Scan through the list of *Available fields* until you find the `csa.GEOID` field. You can also search for the field by name.
. Click image:images/reverse-geocoding-tutorial/add-icon.png[Add icon] to toggle the field into the document table.
. Find the 'csa.NAME' field and add it to your document table.
Your web log data now contains `csa.GEOID` and `csa.NAME` fields from the matching *csa* region. Web log traffic not contained in a CSA region does not have values for `csa.GEOID` and `csa.NAME` fields.
Congratulations! You have completed the tutorial and have the recipe for visualizing custom regions. You can now try replicating this same analysis with your own data.