extract

Extract data from plain text, using a pattern

Captured values are always stored as text; numbers are not converted automatically, so use convert for that. You may use named capture groups instead of output-fields; either way, field names are restricted to letters, digits, and underscores.
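To illustrate both points, here is a minimal sketch in Python. It uses Python's `re` module purely as a stand-in; the regex engine behind extract may differ in details. It shows that a named capture group supplies the field name, and that the captured value stays a string until it is converted explicitly.

```python
import re

line = "num=42"

# Named capture group: the group name doubles as the field name.
match = re.search(r"num=(?P<n>\d+)", line)
fields = match.groupdict()

# The captured value is text, not a number.
print(fields)

# Turning it into a number is a separate, explicit step
# (the role that convert plays in a pipeline).
n = int(fields["n"])
print(n + 1)
```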

Field Name      Description                          Type             Default
input-field     Field containing the data            field            _raw
pattern         The pattern to match on              regex            -
output-fields   Field names where values are stored  array of fields  -
drop            Remove non-matching events           bool             false
remove          Remove the field after usage         bool             false
warning         Warn on non-matching events          bool             false

input-field

Field containing the data

Type: field

Example

input:

{"uptime":" 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23"}

action:

extract:
  input-field: uptime
  remove: true
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

output:

{"m1":"0.40","m5":"0.28","m15":"0.23"}

pattern

The pattern to match on

Type: regex

Example

input:

num=1
num=2
num=3

action:

extract:
  pattern: 'num=(?P<n>\d+)'

output:

{"_raw":"num=1","n":"1"}
{"_raw":"num=2","n":"2"}
{"_raw":"num=3","n":"3"}

output-fields

Field names where values are stored

Type: array of fields

Example: extract the round trip time for the ping

input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

action:

extract:
  input-field: _raw
  remove: true
  drop: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

output:

{"latency":"time=0.060"}
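Note that `(\S+)` captures the whole whitespace-delimited token, which is why the value above keeps its `time=` prefix. The sketch below uses Python's `re` module as a stand-in for the regex engine and compares the pattern from the example with a hypothetical tighter one that anchors on `time=`.

```python
import re

line = "64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms"

# Pattern from the example: captures the full token before " ms".
loose = re.search(r"(\S+) ms$", line).group(1)

# Hypothetical alternative: anchor on "time=" to capture only the number.
tight = re.search(r"time=(\S+) ms$", line).group(1)

print(loose)
print(tight)
```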

Example: an optional match may or may not set a field

input:

4-01
02

action:

extract:
  input-field: _raw
  remove: true
  pattern: '(?:(\d+)-)?(\d+)'
  output-fields: [day, hour]

output:

{"day":"4","hour":"01"}
{"hour":"02"}
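The behaviour of an optional group can be sketched in Python as follows. This is an illustration under stated assumptions, not the action's implementation: it uses Python's `re` module, a non-capturing outer group so that exactly two capture groups line up with the two field names, and a hypothetical helper named `extract_fields`.

```python
import re

# Optional "day-" prefix; the outer group is non-capturing,
# so only two capture groups remain: day and hour.
PATTERN = re.compile(r"(?:(\d+)-)?(\d+)")
NAMES = ["day", "hour"]

def extract_fields(line):
    """Map capture groups to field names, skipping groups that did not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # A group that did not take part in the match yields None.
    return {name: value
            for name, value in zip(NAMES, match.groups())
            if value is not None}

print(extract_fields("4-01"))
print(extract_fields("02"))
```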

drop

Remove non-matching events

Type: bool

Example

input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

action:

extract:
  input-field: _raw
  remove: true
  drop: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

output:

{"latency":"time=0.060"}

Example: Without drop

input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

action:

extract:
  input-field: _raw
  remove: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

output:

{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}
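The two examples above suggest the following semantics: a non-matching event passes through unchanged unless drop is set, and remove only takes effect on events that matched. The Python sketch below models that reading. The function `apply_extract` and its parameters are hypothetical, inferred from the documented outputs rather than from the action's source.

```python
import re

def apply_extract(event, pattern, names, input_field="_raw",
                  drop=False, remove=False):
    """Return the transformed event, or None if the event is dropped."""
    match = re.search(pattern, event.get(input_field, ""))
    if match is None:
        # Non-matching events pass through untouched unless drop is set.
        return None if drop else event
    out = dict(event)
    if remove:
        out.pop(input_field, None)
    out.update(zip(names, match.groups()))
    return out

events = [
    {"_raw": "PING localhost (127.0.0.1) 56(84) bytes of data."},
    {"_raw": "64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms"},
]

def run(drop):
    results = []
    for event in events:
        result = apply_extract(event, r"(\S+) ms$", ["latency"],
                               remove=True, drop=drop)
        if result is not None:
            results.append(result)
    return results

print(run(drop=True))
print(run(drop=False))
```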

remove

Remove the field after usage

Type: bool

Example: Parse output of uptime command

input:

 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23

action:

extract:
  input-field: _raw
  remove: true
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

output:

{"m1":"0.40","m5":"0.28","m15":"0.23"}

Example: Without remove, the input field is kept

input:

 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23

action:

extract:
  input-field: _raw
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

output:

{"_raw":" 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23","m1":"0.40","m5":"0.28","m15":"0.23"}

warning

Warn on non-matching events

Type: bool

Example

input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

action:

extract:
  input-field: _raw
  remove: true
  drop: true
  warning: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

output:

[WARNING] extract: no captures with regex args action-extract --input-field _raw '(?P<latency>\S+) ms$' --remove --warning --drop
LINE: {"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}

Example: Without warning

input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

action:

extract:
  input-field: _raw
  remove: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

output:

{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}