This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Siembol Enrichment Service

Overview

Siembol Enrichment is an enrichment engine used to add useful data to events to assist in detection and investigations.

The data that is used to enrich events is stored in JSON files in a file store in the following format:

{
    "key": {
        "column1": "value",
        "column2": "value2",
        ...
    }
}

When creating a rule you can specify the table to use, the column to join on, and the column to add to the event.
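
As a rough illustration of the table format above, the following sketch loads a table and reads selected columns for one key. The table content, the column names, and the `lookup` helper are hypothetical and are not part of Siembol.

```python
import json

# Hypothetical table content in the documented {"key": {"column": "value"}} layout.
TABLE_JSON = """
{
    "10.0.0.1": {"hostname": "web-01", "owner": "platform"},
    "10.0.0.2": {"hostname": "db-01", "owner": "data"}
}
"""


def lookup(table: dict, key: str, columns: list[str]) -> dict:
    """Return the requested columns of the row stored under `key`.

    An empty dict is returned when the key is not present in the table.
    """
    row = table.get(key, {})
    return {col: row[col] for col in columns if col in row}


table = json.loads(TABLE_JSON)
print(lookup(table, "10.0.0.1", ["hostname"]))
```

Keying the table by the join value keeps each lookup a single dictionary access, which is presumably why the format nests the columns under the key.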

Enrichment rule

  • rule_name - Rule name that uniquely identifies the rule
  • rule_author - The author of the rule, i.e., the user who last modified the rule
  • rule_version - The version of the rule
  • rule_description - This field contains a single text input that allows you to set a description for the rule. This should be a short, helpful comment that allows anyone to identify the purpose of this rule
  • source_type - This field allows you to determine the type of data you want to match on. It is essentially a matcher for the source_type field. This field does not support regex; however, using * as an input matches all source types. The source_type field is set during parsing and is equal to the name of the last parser which was used to parse the log
  • matchers - Matchers allow you to further filter the events that the enrichment will be applied to
  • table_mapping - Mappings for enriching events
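
To make the rule fields above more concrete, here is a sketch of a rule written as a Python dict, together with the source_type check as described (literal comparison, with * matching everything). All values, the table name, and the `source_type_matches` helper are hypothetical; the exact serialized form of a rule in Siembol may differ.

```python
# Hypothetical rule using only the field names listed above.
rule = {
    "rule_name": "dns_owner_enrichment",
    "rule_author": "jdoe",
    "rule_version": 1,
    "rule_description": "Adds the owning team to DNS events",
    "source_type": "dns",
    "matchers": [],
    "table_mapping": {
        "table_name": "dns_owners",
        "joining_key": "${dns_query}",
    },
}


def source_type_matches(rule_source_type: str, event: dict) -> bool:
    """Literal comparison with the event's source_type; '*' matches every source type."""
    if rule_source_type == "*":
        return True
    return event.get("source_type") == rule_source_type


event = {"source_type": "dns", "dns_query": "example.com"}
print(source_type_matches(rule["source_type"], event))
```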

Matchers

Matchers allow you to further filter the events that the enrichment will be applied to. You can add as many matchers as you want.

  • is_enabled - The matcher is enabled
  • description - The description of the matcher
  • matcher_type - Type of matcher, either REGEX_MATCH or IS_IN_SET
  • is_negated - The matcher is negated
  • field - The name of the field on which the matcher will be evaluated

There are two types of matchers:

  • REGEX_MATCH - A regex_match allows you to use a regex statement to match a specified field. There are two string inputs:

  • IS_IN_SET - It compares the value of a field to a set of strings defined in data. If the value is in the set then the matcher returns true.

    • data - A list of strings to compare the value to, delimited by new lines. Regex is not supported, so each line must be a literal match; however, field substitution is supported in this field
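
The matcher behaviour described above can be sketched as follows. This is a simplified model rather than Siembol's implementation, and it makes several assumptions: a REGEX_MATCH matcher keeps its pattern in a `data` field, the pattern may match anywhere in the value, a missing field evaluates to false before negation, a disabled matcher does not filter anything, and field substitution inside `data` is not modelled. Python's `re` syntax also differs from Java's in places.

```python
import re


def evaluate_matcher(matcher: dict, event: dict) -> bool:
    """Evaluate one matcher against an event (simplified, hypothetical model)."""
    if not matcher.get("is_enabled", True):
        return True  # assumption: a disabled matcher never filters an event out

    value = event.get(matcher["field"])
    if value is None:
        result = False  # assumption: a missing field does not match
    elif matcher["matcher_type"] == "REGEX_MATCH":
        # assumption: the pattern is stored in `data` and may match anywhere
        result = re.search(matcher["data"], str(value)) is not None
    elif matcher["matcher_type"] == "IS_IN_SET":
        # `data` is new line delimited; each line is compared literally
        result = str(value) in matcher["data"].split("\n")
    else:
        raise ValueError(f"unknown matcher_type: {matcher['matcher_type']}")

    # is_negated flips the result
    return result != matcher.get("is_negated", False)


def event_passes(matchers: list[dict], event: dict) -> bool:
    """Assumption: an event must satisfy every matcher of the rule."""
    return all(evaluate_matcher(m, event) for m in matchers)


matchers = [
    {"is_enabled": True, "matcher_type": "REGEX_MATCH", "is_negated": False,
     "field": "url", "data": r"^https?://"},
    {"is_enabled": True, "matcher_type": "IS_IN_SET", "is_negated": True,
     "field": "user", "data": "root\nadmin"},
]
print(event_passes(matchers, {"url": "http://example.com", "user": "alice"}))
```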

Table Mapping

The table mapping tab is where you configure the enrichment you want to perform.

  • table_name - The name of the table which contains the data you want to enrich the event with

  • joining_key - The string used to join the event with the table (the key JSON field). This field supports substitution, e.g. ${field_name} or http://${host_field_name}/${path_field_name}. The resulting string is looked up in the key field of the table

  • tags - Tags are added to the event after successfully joining the table with the joining key. You can add as many tags as you want

    • tag_name - The name of the tag
    • tag_value - The value of the tag
  • enriching_fields - Fields from the enriching table that are added after successfully joining the table with the joining key. You can add as many enriching fields as you want

    • table_field_name - The column in the enrichment table that you want to add
    • event_field_name - The name you want the field to have in the event after enriching
Note: you can only enrich from one table per rule. If you want to enrich the same event from multiple tables, you need to create multiple rules.
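
The join described in this section can be sketched roughly as below: substitute event fields into the joining key, look the result up in the table, and on success add the tags and the selected table columns. The helper names, the sample table, and the handling of missing fields (no join, event returned unchanged) are assumptions for illustration and not Siembol's actual implementation.

```python
import re

_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def substitute(template: str, event: dict):
    """Replace each ${field} with the event's value; return None if a field is missing."""
    missing = False

    def replace(match):
        nonlocal missing
        name = match.group(1)
        if name not in event:
            missing = True
            return ""
        return str(event[name])

    result = _VARIABLE.sub(replace, template)
    return None if missing else result


def enrich(event: dict, table: dict, mapping: dict) -> dict:
    """Return a copy of the event with tags and enriching fields added on a successful join."""
    key = substitute(mapping["joining_key"], event)
    if key is None or key not in table:
        return dict(event)  # assumption: no join means the event is left unchanged

    row = table[key]
    enriched = dict(event)
    for tag in mapping.get("tags", []):
        enriched[tag["tag_name"]] = tag["tag_value"]
    for field in mapping.get("enriching_fields", []):
        if field["table_field_name"] in row:
            enriched[field["event_field_name"]] = row[field["table_field_name"]]
    return enriched


# Hypothetical table and mapping
table = {"http://example.com/login": {"category": "auth", "risk": "low"}}
mapping = {
    "table_name": "url_categories",
    "joining_key": "http://${host}/${path}",
    "tags": [{"tag_name": "is_url_enriched", "tag_value": "true"}],
    "enriching_fields": [
        {"table_field_name": "category", "event_field_name": "url_category"}
    ],
}
print(enrich({"host": "example.com", "path": "login"}, table, mapping))
```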

Admin config

  • topology.name - The name of the Storm topology
  • kafka.spout.num.executors - The number of executors for kafka spout
  • enriching.engine.bolt.num.executors - The number of executors for enriching rule engine
  • memory.enriching.bolt.num.executors - The number of executors for memory enrichments from tables
  • merging.bolt.num.executors - The number of executors for merging enriched fields
  • kafka.writer.bolt.num.executors - The number of executors for producing output messages
  • enriching.rules.zookeeper.attributes - The zookeeper attributes for updating enrichment rules
    • zk.url - Zookeeper servers url. Multiple servers are separated by comma
    • zk.path - Path to a zookeeper node
  • enriching.tables.zookeeper.attributes - The zookeeper attributes for notifying the update of enrichment tables
    • zk.url - Zookeeper servers url. Multiple servers are separated by comma
    • zk.path - Path to a zookeeper node
  • kafka.batch.writer.attributes - Kafka batch writer attributes for producing output messages
  • storm.attributes - Storm attributes for the enrichment topology
  • bootstrap.servers - Kafka broker servers url. Multiple servers are separated by comma
  • enriching.input.topics - The list of kafka input topics for reading messages
  • enriching.output.topic - Output kafka topic name for correctly processed messages
  • enriching.error.topic - Output kafka topic name for error messages
  • enriching.tables.hdfs.uri - The url for hdfs cluster where enriching tables are stored
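
As an illustration of the settings above, the following sketch writes an admin config as a Python dict and splits the comma-separated server lists. Every value (host names, paths, topic names, executor counts) is hypothetical, the flat layout simply mirrors the list above and may not match the real nesting, and kafka.batch.writer.attributes and storm.attributes are left out because their contents are not described here.

```python
def split_servers(value: str) -> list[str]:
    """Split a comma-separated server list and drop surrounding whitespace."""
    return [server.strip() for server in value.split(",") if server.strip()]


# Hypothetical values; only the key names come from the list above.
admin_config = {
    "topology.name": "siembol-enrichment",
    "kafka.spout.num.executors": 1,
    "enriching.engine.bolt.num.executors": 1,
    "memory.enriching.bolt.num.executors": 1,
    "merging.bolt.num.executors": 1,
    "kafka.writer.bolt.num.executors": 1,
    "enriching.rules.zookeeper.attributes": {
        "zk.url": "zk-1:2181,zk-2:2181",
        "zk.path": "/siembol/enrichment/rules",
    },
    "enriching.tables.zookeeper.attributes": {
        "zk.url": "zk-1:2181,zk-2:2181",
        "zk.path": "/siembol/enrichment/tables",
    },
    "bootstrap.servers": "kafka-1:9092, kafka-2:9092",
    "enriching.input.topics": ["siembol.parsed"],
    "enriching.output.topic": "siembol.enriched",
    "enriching.error.topic": "siembol.errors",
    "enriching.tables.hdfs.uri": "hdfs://namenode:8020",
}

zk_servers = split_servers(admin_config["enriching.rules.zookeeper.attributes"]["zk.url"])
kafka_brokers = split_servers(admin_config["bootstrap.servers"])
print(zk_servers, kafka_brokers)
```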