NAME

rwbag - Build a binary Bag from SiLK Flow records

SYNOPSIS

rwbag --bag-file=KEY,COUNTER,OUTPUTFILE
      [--bag-file=KEY,COUNTER,OUTPUTFILE ...]
      [{ --pmap-file=PATH | --pmap-file=MAPNAME:PATH }]
      [--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
      [--invocation-strip] [--print-filenames] [--copy-input=PATH]
      [--compression-method=COMP_METHOD]
      [--ipv6-policy={ignore,asv4,mix,force,only}]
      [--site-config-file=FILENAME]
      {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

rwbag --help

rwbag --legacy-help

rwbag --version

LEGACY SYNOPSIS

rwbag [--sip-flows=OUTPUTFILE] [--dip-flows=OUTPUTFILE]
      [--sport-flows=OUTPUTFILE] [--dport-flows=OUTPUTFILE]
      [--proto-flows=OUTPUTFILE] [--sensor-flows=OUTPUTFILE]
      [--input-flows=OUTPUTFILE] [--output-flows=OUTPUTFILE]
      [--nhip-flows=OUTPUTFILE]
      [--sip-packets=OUTPUTFILE] [--dip-packets=OUTPUTFILE]
      [--sport-packets=OUTPUTFILE] [--dport-packets=OUTPUTFILE]
      [--proto-packets=OUTPUTFILE] [--sensor-packets=OUTPUTFILE]
      [--input-packets=OUTPUTFILE] [--output-packets=OUTPUTFILE]
      [--nhip-packets=OUTPUTFILE]
      [--sip-bytes=OUTPUTFILE] [--dip-bytes=OUTPUTFILE]
      [--sport-bytes=OUTPUTFILE] [--dport-bytes=OUTPUTFILE]
      [--proto-bytes=OUTPUTFILE] [--sensor-bytes=OUTPUTFILE]
      [--input-bytes=OUTPUTFILE] [--output-bytes=OUTPUTFILE]
      [--nhip-bytes=OUTPUTFILE]
      [--note-add=TEXT] [--note-file-add=FILE]
      [--print-filenames] [--copy-input=PATH]
      [--compression-method=COMP_METHOD]
      [--ipv6-policy={ignore,asv4,mix,force,only}]
      [--site-config-file=FILENAME]
      {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

DESCRIPTION

rwbag reads SiLK Flow records and builds one or more Bag files. A Bag is similar to a set, but each key is associated with a counter. Usually the key is some aspect of a flow record (an IP address, a port, the protocol, et cetera), and the counter is a volume (such as the number of flow records or the sum of bytes or packets) for the flow records that match that key. A Bag file supports a single key field and a single counter field; use the Aggregate Bag tools (e.g., rwaggbag(1)) when the key or counter contains multiple fields.

The --bag-file switch is required and it specifies how to create a Bag file. The argument to the switch names the key field to use for the bag, the counter field, and the location where the bag file is to be written. The switch may be repeated to create multiple Bag files.

rwbag reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwbag reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

If adding a value to a key would cause the value to overflow the maximum value that Bags support, the key's value will be set to the maximum and processing will continue. In addition, if this is the first value to overflow in this Bag, a warning will be printed to the standard error.

If rwbag runs out of memory, it will exit immediately. The output Bag files will remain behind, each with a size of 0 bytes.

Use rwbagcat(1) to see the contents of a bag. To create a bag from textual input or from an IPset, use rwbagbuild(1). rwbagtool(1) allows you to manipulate binary bag files.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

--bag-file=KEY,COUNTER,OUTPUTFILE

Bin flow records by unique KEY, compute the COUNTER for each bin, and write the result to OUTPUTFILE. The lists of available KEY and COUNTER values are given immediately below. OUTPUTFILE is the name of a non-existent file, a named pipe, or the keyword stdout or - to write the binary Bag to the standard output. Repeat the --bag-file switch to create multiple Bag files in a single pass over the data. Only one OUTPUTFILE may use the standard output. See "LEGACY BAG CREATION SWITCHES" for deprecated methods to create Bag files. This switch or one of its legacy equivalents is required. Since SiLK 3.12.0.

rwbag supports the following names for KEY. The case of KEY is ignored.

sIPv4

source IP address, either IPv4 or IPv6

sIPv6

source IP address, either IPv4 or IPv6

dIPv4

destination IP address, either IPv4 or IPv6

dIPv6

destination IP address, either IPv4 or IPv6

sPort

source port for TCP or UDP, or equivalent

dPort

destination port for TCP or UDP, or equivalent

protocol

IP protocol

packets

count of packets recorded for this flow record

bytes

count of bytes recorded for this flow record

flags

bit-wise OR of TCP flags over all packets in the flow

sTime

starting time of the flow, in seconds resolution

duration

duration of the flow, in seconds resolution

eTime

ending time of the flow, in seconds resolution

sensor

numeric ID of the sensor where the flow was collected

input

router SNMP input interface or vlanId if packing tools were configured to capture it (see sensor.conf(5))

output

router SNMP output interface or postVlanId

nhIPv4

router next hop IP address, either IPv4 or IPv6

nhIPv6

router next hop IP address, either IPv4 or IPv6

initialFlags

TCP flags on first packet in the flow

sessionFlags

bit-wise OR of TCP flags over all packets except the first in the flow

attributes

flow attributes set by the flow generator

application

guess as to the content of the flow

sip-country

the country code of the source IP address. Uses the mapping file specified by the SILK_COUNTRY_CODES environment variable or the country_codes.pmap mapping file, as described in "FILES". (See also ccfilter(3).) Since SiLK 3.12.0.

scc

an alias for sip-country

dip-country

the country code of the destination IP address

dcc

an alias for dip-country

sip-pmap:MAPNAME

the value that the source IP address maps to in the mapping file whose map-name is MAPNAME. The type of that prefix map must be IPv4-address or IPv6-address. Use --pmap-file to load the mapping file and optionally set its map-name. Since the MAPNAME must be known when the --bag-file switch is parsed, the --pmap-file switch(es) should precede the --bag-file switch(es).

dip-pmap:MAPNAME

the value that the destination IP address maps to in the mapping file whose map-name is MAPNAME. See sip-pmap:MAPNAME.

sport-pmap:MAPNAME

the value that the protocol/source-port pair maps to in the mapping file whose map-name is MAPNAME. The type of that prefix map must be proto-port. Use --pmap-file to load the mapping file and optionally set its map-name. Since the MAPNAME must be known when the --bag-file switch is parsed, the --pmap-file switch(es) should precede the --bag-file switch(es).

dport-pmap:MAPNAME

the value that the protocol/destination-port pair maps to in the mapping file whose map-name is MAPNAME. See sport-pmap:MAPNAME.

rwbag supports the following names for COUNTER. The case of COUNTER is ignored.

records

count of the number of flow records that match the key

flows

an alias for records

sum-packets

the sum of the packet counts for flow records that match the key

packets

an alias for sum-packets

sum-bytes

the sum of the byte counts for flow records that match the key

bytes

an alias for sum-bytes

--pmap-file=PATH
--pmap-file=MAPNAME:PATH

Load the prefix map file from PATH for use when the key part of the argument to the --bag-file switch is one of sip-pmap, dip-pmap, sport-pmap, or dport-pmap. Specify PATH as - or stdin to read from the standard input. If MAPNAME is specified, it overrides the map-name contained in the prefix map file itself. If no map-name is available, rwbag exits with an error. The switch may be repeated to load multiple prefix map files; each file must have a unique map-name. To create a prefix map file, use rwpmapbuild(1). Since SiLK 3.12.0.

--note-strip

Do not copy the notes (annotations) from the input files to the output file(s). When this switch is not specified, notes from the input files are copied to the output. Since SiLK 3.12.2.

--note-add=TEXT

Add the specified TEXT to the header of every output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
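As a brief sketch of the annotation workflow, the commands below attach a note while building a Bag and then display it with rwfileinfo(1). The file names data.rw and proto-byte.bag and the note text are placeholders, and the guard that skips the commands when rwbag or the input file is missing is only there so the snippet is harmless to paste anywhere.

```shell
note="bytes per protocol, weekly report"    # placeholder annotation text

# Run only when rwbag is installed and the placeholder input exists.
if command -v rwbag >/dev/null 2>&1 && [ -r data.rw ]; then
    # Store the note in the header of the output Bag.
    rwbag --note-add="$note" \
          --bag-file=protocol,sum-bytes,proto-byte.bag data.rw
    # Print only the annotations stored in the Bag's header.
    rwfileinfo --fields=annotations proto-byte.bag
fi
```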

--note-file-add=FILENAME

Open FILENAME and add the contents of that file to the header of every output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--invocation-strip

Do not record any command line history: do not copy the invocation history from the input files to the output file(s), and do not record the current command line invocation in the output. The invocation may be viewed with rwfileinfo(1). Since SiLK 3.12.0.

--print-filenames

Print to the standard error the names of input files as they are opened.

--copy-input=PATH

Copy all binary SiLK Flow records read as input to the specified file or named pipe. PATH may be stdout or - to write flows to the standard output as long as no Bag file is being written there.
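One possible use of --copy-input is to build a Bag and feed the same records to a second tool in a single pass. The sketch below assumes a placeholder input file data.rw and a placeholder output name sip-byte.bag; the rwuniq(1) switches are the ones used in the "EXAMPLES" section.

```shell
# Run only when rwbag is installed and the placeholder input exists.
if command -v rwbag >/dev/null 2>&1 && [ -r data.rw ]; then
    # rwbag writes the Bag to sip-byte.bag and copies every input record
    # to the standard output, where rwuniq summarizes them by protocol.
    rwbag --bag-file=sipv4,sum-bytes,sip-byte.bag \
          --copy-input=stdout data.rw \
      | rwuniq --field=protocol --value=bytes --sort-output
fi
```

Because the standard output carries the copied flow records here, the Bag itself must be written to a file rather than to stdout.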

--ipv6-policy=POLICY

Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:

ignore

Ignore any flow record marked as IPv6, regardless of the IP addresses it contains. Only IP addresses contained in IPv4 flow records will be added to the bag(s).

asv4

Convert IPv6 flow records that contain addresses in the ::ffff:0:0/96 netblock (that is, IPv4-mapped IPv6 addresses) to IPv4 and ignore all other IPv6 flow records.

mix

Process the input as a mixture of IPv4 and IPv6 flow records. When creating a bag whose key is an IP address and the input contains IPv6 addresses outside of the ::ffff:0:0/96 netblock, this policy is equivalent to force; otherwise it is equivalent to asv4.

force

Convert IPv4 flow records to IPv6, mapping the IPv4 addresses into the ::ffff:0:0/96 netblock.

only

Process only flow records that are marked as IPv6. Only IP addresses contained in IPv6 flow records will be added to the bag(s).

Regardless of the IPv6 policy, when all IPv6 addresses in the bag are in the ::ffff:0:0/96 netblock, rwbag treats them as IPv4 addresses and writes an IPv4 bag. When any other IPv6 addresses are present in the bag, the IPv4 addresses in the bag are mapped into the ::ffff:0:0/96 netblock and rwbag writes an IPv6 bag.
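The order in which the policy is chosen (the switch first, then SILK_IPV6_POLICY, then the default of mix) can be sketched as a small shell function. The function effective_policy is a hypothetical helper written for illustration and is not part of SiLK; it only models the lookup order described above for a build with IPv6 support, and it does not validate the value given to the switch.

```shell
# effective_policy [SWITCH_VALUE]
# Hypothetical helper (not part of SiLK): print the policy that applies
# given an optional --ipv6-policy value and the current environment.
effective_policy() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"          # the switch, when given, takes priority
        return 0
    fi
    case "${SILK_IPV6_POLICY:-}" in
        ignore|asv4|mix|force|only)
            printf '%s\n' "$SILK_IPV6_POLICY" ;;   # valid environment value
        *)
            printf '%s\n' mix ;;                   # unset or invalid value
    esac
}
```

For instance, with SILK_IPV6_POLICY set to only, effective_policy prints only, while effective_policy asv4 prints asv4 regardless of the environment.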

--compression-method=COMP_METHOD

Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.

none

Do not compress the output using an external library.

zlib

Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.

lzo1x

Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.

snappy

Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.

best

Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwbag searches for the site configuration file in the locations specified in the "FILES" section.

--xargs
--xargs=FILENAME

Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwbag opens each named file in turn and reads records from it as if the filenames had been listed on the command line.
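A minimal sketch of both forms of --xargs follows. The input names day1.rw and day2.rw and the output names are placeholders, and the guard simply skips the rwbag commands when the tool or the inputs are not present.

```shell
# Placeholder input names; substitute real SiLK Flow files.
list=$(mktemp)
printf '%s\n' day1.rw day2.rw > "$list"     # one file name per line

if command -v rwbag >/dev/null 2>&1 && [ -r day1.rw ] && [ -r day2.rw ]; then
    # Read the file names from the list file...
    rwbag --bag-file=sipv4,records,sip-flow.bag --xargs="$list"
    # ...or from the standard input when --xargs has no argument.
    rwbag --bag-file=dipv4,records,dip-flow.bag --xargs < "$list"
fi
```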

--help

Print the available options and exit.

--legacy-help

Print help, including legacy switches. See the "LEGACY BAG CREATION SWITCHES" section below for these switches.

--version

Print the version number and information about how SiLK was configured, then exit the application.

LEGACY BAG CREATION SWITCHES

The following switches are deprecated as of SiLK 3.12.0. These switches may be used in conjunction with the --bag-file switch.

--sip-flows=OUTPUTFILE

Equivalent to --bag-file=sIPv4,records,OUTPUTFILE. Count number of flows by unique source IP.

--sip-packets=OUTPUTFILE

Equivalent to --bag-file=sIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique source IP.

--sip-bytes=OUTPUTFILE

Equivalent to --bag-file=sIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique source IP.

--dip-flows=OUTPUTFILE

Equivalent to --bag-file=dIPv4,records,OUTPUTFILE. Count number of flows by unique destination IP.

--dip-packets=OUTPUTFILE

Equivalent to --bag-file=dIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique destination IP.

--dip-bytes=OUTPUTFILE

Equivalent to --bag-file=dIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique destination IP.

--sport-flows=OUTPUTFILE

Equivalent to --bag-file=sPort,records,OUTPUTFILE. Count number of flows by unique source port.

--sport-packets=OUTPUTFILE

Equivalent to --bag-file=sPort,sum-packets,OUTPUTFILE. Count number of packets by unique source port.

--sport-bytes=OUTPUTFILE

Equivalent to --bag-file=sPort,sum-bytes,OUTPUTFILE. Count number of bytes by unique source port.

--dport-flows=OUTPUTFILE

Equivalent to --bag-file=dPort,records,OUTPUTFILE. Count number of flows by unique destination port.

--dport-packets=OUTPUTFILE

Equivalent to --bag-file=dPort,sum-packets,OUTPUTFILE. Count number of packets by unique destination port.

--dport-bytes=OUTPUTFILE

Equivalent to --bag-file=dPort,sum-bytes,OUTPUTFILE. Count number of bytes by unique destination port.

--proto-flows=OUTPUTFILE

Equivalent to --bag-file=protocol,records,OUTPUTFILE. Count number of flows by unique protocol.

--proto-packets=OUTPUTFILE

Equivalent to --bag-file=protocol,sum-packets,OUTPUTFILE. Count number of packets by unique protocol.

--proto-bytes=OUTPUTFILE

Equivalent to --bag-file=protocol,sum-bytes,OUTPUTFILE. Count number of bytes by unique protocol.

--sensor-flows=OUTPUTFILE

Equivalent to --bag-file=sensor,records,OUTPUTFILE. Count number of flows by unique sensor ID.

--sensor-packets=OUTPUTFILE

Equivalent to --bag-file=sensor,sum-packets,OUTPUTFILE. Count number of packets by unique sensor ID.

--sensor-bytes=OUTPUTFILE

Equivalent to --bag-file=sensor,sum-bytes,OUTPUTFILE. Count number of bytes by unique sensor ID.

--input-flows=OUTPUTFILE

Equivalent to --bag-file=input,records,OUTPUTFILE. Count number of flows by unique input interface index.

--input-packets=OUTPUTFILE

Equivalent to --bag-file=input,sum-packets,OUTPUTFILE. Count number of packets by unique input interface index.

--input-bytes=OUTPUTFILE

Equivalent to --bag-file=input,sum-bytes,OUTPUTFILE. Count number of bytes by unique input interface index.

--output-flows=OUTPUTFILE

Equivalent to --bag-file=output,records,OUTPUTFILE. Count number of flows by unique output interface index.

--output-packets=OUTPUTFILE

Equivalent to --bag-file=output,sum-packets,OUTPUTFILE. Count number of packets by unique output interface index.

--output-bytes=OUTPUTFILE

Equivalent to --bag-file=output,sum-bytes,OUTPUTFILE. Count number of bytes by unique output interface index.

--nhip-flows=OUTPUTFILE

Equivalent to --bag-file=nhIPv4,records,OUTPUTFILE. Count number of flows by unique next hop IP.

--nhip-packets=OUTPUTFILE

Equivalent to --bag-file=nhIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique next hop IP.

--nhip-bytes=OUTPUTFILE

Equivalent to --bag-file=nhIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique next hop IP.
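When updating old scripts, the equivalences listed above can be applied mechanically. The function below is a sketch of that mapping; legacy_to_bag_file is a hypothetical helper written for this page, not a SiLK tool, and it handles only the unabbreviated --NAME=OUTPUTFILE form of a legacy switch.

```shell
# legacy_to_bag_file SWITCH
# Hypothetical helper (not part of SiLK): rewrite one legacy switch, such
# as --sip-bytes=out.bag, as the equivalent --bag-file switch.
legacy_to_bag_file() {
    case "$1" in
        --*-*=?*) ;;                # expect --KEY-COUNTER=OUTPUTFILE
        *) echo "not a legacy switch: $1" >&2; return 1 ;;
    esac
    arg=${1#--}                     # sip-bytes=out.bag
    name=${arg%%=*}                 # sip-bytes
    file=${arg#*=}                  # out.bag
    case "${name%-*}" in            # key part of the switch name
        sip)    key=sIPv4 ;;
        dip)    key=dIPv4 ;;
        nhip)   key=nhIPv4 ;;
        sport)  key=sPort ;;
        dport)  key=dPort ;;
        proto)  key=protocol ;;
        sensor|input|output) key=${name%-*} ;;
        *) echo "unknown legacy switch: $1" >&2; return 1 ;;
    esac
    case "${name##*-}" in           # counter part of the switch name
        flows)   counter=records ;;
        packets) counter=sum-packets ;;
        bytes)   counter=sum-bytes ;;
        *) echo "unknown legacy switch: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "--bag-file=$key,$counter,$file"
}
```

For example, legacy_to_bag_file --sip-bytes=out.bag prints --bag-file=sIPv4,sum-bytes,out.bag.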

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

Bag of Protocol:Byte

Read the SiLK Flow file data.rw and create the Bag proto-byte.bag that contains the total byte-count seen for each protocol by using protocol as the key and sum-bytes as the counter:

$ rwbag --bag-file=protocol,sum-bytes,proto-byte.bag data.rw

Use rwbagcat(1) to view the result:

$ rwbagcat proto-byte.bag
         1|            10695328|
         6|        120536195111|
        17|            24500079|

Specify the output path as - to pass the Bag file from rwbag directly into rwbagcat.

$ rwbag --bag-file=protocol,sum-bytes,- data.rw    \
  | rwbagcat
         1|            10695328|
         6|        120536195111|
        17|            24500079|

Compare that to this rwuniq(1) command.

$ rwuniq --field=protocol --value=bytes --sort-output data.rw
pro|               Bytes|
  1|            10695328|
  6|        120536195111|
 17|            24500079|

One advantage of Bag files over rwuniq is that the data remains in binary form where it can be manipulated by rwbagtool(1).

Two Bags in a Single Pass

Read records from rwfilter(1) and build Bag files sip-flow.bag and dip-flow.bag that count the number of flows seen for each source address and for each destination address, respectively.

$ rwfilter ... --pass=stdout                       \
  | rwbag --bag-file=sipv4,records,sip-flow.bag    \
       --bag-file=dipv4,records,dip-flow.bag

Using a Network Prefix

To create sip16-byte.bag that contains the number of bytes seen for each /16 found in the source address field, use the rwnetmask(1) tool prior to feeding the input to rwbag:

$ rwfilter ... --pass=stdout                       \
  | rwnetmask --4sip-prefix-length=16              \
  | rwbag --bag-file=sipv4,sum-bytes,sip16-byte.bag

$ rwbagcat sip16-byte.bag | head -4
       10.4.0.0|               18260|
       10.5.0.0|              536169|
       10.9.0.0|               55386|
      10.11.0.0|             5110438|

To group the IP addresses of an existing Bag into /16 prefixes when printing it, use the --network-structure switch of rwbagcat(1).

$ rwfilter ... --pass=stdout                   \
  | rwbag --bag-file=sipv4,sum-bytes,-         \
  | rwbagcat --network-structure=B             \
  | head -4
       10.4.0.0/16|               18260|
       10.5.0.0/16|              536169|
       10.9.0.0/16|               55386|
      10.11.0.0/16|             5110438|

Bag of Country Codes

As of SiLK 3.12.0, a Bag file may contain a country code as its key. Create scc-pkt.bag that sums the packet count by country.

$ rwbag --bag-file=sip-country,sum-packets,scc-pkt.bag data.rw
$ rwbagcat scc-pkt.bag
--|                 840|
a1|                 284|
a2|                   1|
ae|                   8|

Bag of Prefix Map Values

rwbag and rwbagbuild(1) can use a prefix map file as the key in a Bag file as of SiLK 3.12.0. For example, to look up each source address in the prefix map file ip-map.pmap that maps from address to "type of service", use the --pmap-file switch to specify the prefix map file, and specify the Bag's key as sip-pmap:map-name, where map-name is either the map-name stored in the prefix map file or a name that is provided as part of the --pmap-file argument. (A prefix map's map-name is available via the rwfileinfo(1) command.)

$ rwfileinfo --field=prefix-map ip-map.pmap
ip-map.pmap:
  prefix-map          v1: service-host
$
$ rwbag --pmap-file=ip-map.pmap                            \
       --bag-file=sip-pmap:service-host,bytes,srvhost.bag  \
       data.rw

Multiple --pmap-file switches may be specified, which may be useful when generating multiple Bag files in a single invocation. On the command line, the --pmap-file switch that defines the map-name must precede the --bag-file switch where the map-name is used.

The prefix map file is not stored as part of the Bag, so you must provide the name of the prefix map when running rwbagcat.

$ rwbagcat srvhost.bag
rwbagcat: The --pmap-file switch is required for \
        Bags containing sip-pmap keys
$ rwbagcat --pmap-file=ip-map.pmap srvhost.bag
         external|         59950837766|
         internal|         60602999159|
              ntp|              588316|
              dns|            14404581|
             dhcp|             2560696|

rwbag also has support for prefix map files that map from a protocol-port pair to a label. The proto-port.pmap file does not have a map-name so a name must be provided on the rwbag command line.

$ rwfileinfo --field=prefix-map proto-port.pmap
proto-port.pmap:
$
$ rwbag --pmap-file=srvport:proto-port.pmap                \
       --bag-file=sport-pmap:srvport,flows,srvport.bag     \
       data.rw
$ rwbagcat --pmap-file=proto-port.pmap srvport.bag | head -4
     ICMP|               15622|
      UDP|               62216|
  UDP/DNS|               62216|
 UDP/DHCP|               15614|

ENVIRONMENT

SILK_COUNTRY_CODES

This environment variable allows the user to specify the country code mapping file that rwbag uses when mapping an IP to a country for the sip-country and dip-country keys. The value may be a complete path or a file relative to the SILK_PATH. See the "FILES" section for standard locations of this file.

SILK_IPV6_POLICY

This environment variable is used as the value for --ipv6-policy when that switch is not provided.

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_COMPRESSION_METHOD

This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file switch when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwbag may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwbag may use this environment variable. See the "FILES" section for details.

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.
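To check which of these candidates exists on a given host, a lookup along the following lines may help. It is only a sketch: find_site_config is a hypothetical helper, and it assumes that the locations are tried in the order listed and that the first readable file is the one used; rwbag's own handling (for example, of a SILK_CONFIG_FILE that names a missing file) may differ.

```shell
# find_site_config
# Hypothetical helper (not part of SiLK): print the first readable
# candidate from the list above, or return 1 when none is found.
find_site_config() {
    for f in \
        "${SILK_CONFIG_FILE:-}" \
        "${SILK_DATA_ROOTDIR:+$SILK_DATA_ROOTDIR/silk.conf}" \
        /data/silk.conf \
        "${SILK_PATH:+$SILK_PATH/share/silk/silk.conf}" \
        "${SILK_PATH:+$SILK_PATH/share/silk.conf}" \
        /usr/share/silk/silk.conf \
        /usr/share/silk.conf
    do
        # Skip candidates whose environment variable is unset or empty.
        if [ -n "$f" ] && [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}
```

The printed path can then be passed explicitly with --site-config-file to make an invocation independent of the environment.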

${SILK_COUNTRY_CODES}
${SILK_PATH}/share/silk/country_codes.pmap
${SILK_PATH}/share/country_codes.pmap
/usr/share/silk/country_codes.pmap
/usr/share/country_codes.pmap

Possible locations for the country code mapping file required by the sip-country and dip-country keys.

SEE ALSO

rwbagbuild(1), rwbagcat(1), rwbagtool(1), rwaggbag(1), rwfileinfo(1), rwfilter(1), rwnetmask(1), rwpmapbuild(1), rwuniq(1), ccfilter(3), sensor.conf(5), silk(7), zlib(3)