NAME

rwaggbag - Build a binary Aggregate Bag from SiLK Flow records

SYNOPSIS

rwaggbag --keys=KEY --counters=COUNTER
      [--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
      [--invocation-strip] [--print-filenames] [--copy-input=PATH]
      [--compression-method=COMP_METHOD]
      [--ipv6-policy={ignore,asv4,mix,force,only}]
      [--output-path=PATH]
      [--site-config-file=FILENAME]
      {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

rwaggbag --help

rwaggbag --help-fields

rwaggbag --version

DESCRIPTION

rwaggbag reads SiLK Flow records and builds an Aggregate Bag file. To build an Aggregate Bag from textual input, use rwaggbagbuild(1).

An Aggregate Bag is a binary file that maps a key to a counter, where the key and the counter are both composed of one or more fields. For example, an Aggregate Bag could contain the sum of the packet count and the sum of the byte count for each unique source IP and source port pair.

For each SiLK Flow record it reads, rwaggbag extracts the values of the fields listed in the --keys switch and combines them into a key. It then searches for an existing bin with that key, creating a new bin if none is found, and adds the value of each field listed in the --counters switch to that bin's counter. Both the --keys and --counters switches are required.
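
The binning described above can be modeled in a few lines of Python. This is only a conceptual sketch, not rwaggbag's implementation: the function name aggregate, the dictionary-based records, and the sample values are assumptions made for illustration, and the counter names are mapped to record fields by stripping the "sum-" prefix.

```python
from collections import defaultdict


def aggregate(records, keys, counters):
    """Group records by the key fields and accumulate the counter fields.

    Hypothetical model: each record is a dict, `keys` and `counters`
    are lists of field names in the spirit of --keys and --counters.
    """
    # Each new bin starts with every requested counter set to zero.
    bins = defaultdict(lambda: dict.fromkeys(counters, 0))
    for rec in records:
        key = tuple(rec[k] for k in keys)
        counter = bins[key]  # creates the bin the first time the key is seen
        for name in counters:
            if name == "records":
                counter[name] += 1
            else:
                # "sum-packets" accumulates the record's "packets" value, etc.
                counter[name] += rec[name[len("sum-"):]]
    return dict(bins)


# Made-up sample records, for illustration only.
flows = [
    {"sport": 53, "dport": 1024, "packets": 2, "bytes": 120},
    {"sport": 53, "dport": 1024, "packets": 3, "bytes": 200},
    {"sport": 80, "dport": 4000, "packets": 10, "bytes": 5000},
]
result = aggregate(
    flows,
    keys=["sport", "dport"],
    counters=["records", "sum-packets", "sum-bytes"],
)
```

In this model the two records that share the key (53, 1024) land in one bin whose counters are the record count and the summed packet and byte values, while the third record gets a bin of its own.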

rwaggbag reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwaggbag reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

If rwaggbag runs out of memory, it will exit immediately. The output Aggregate Bag file remains behind with a size of 0 bytes.

To print the contents of an Aggregate Bag as text, use rwaggbagcat(1). The rwaggbagbuild(1) tool can create an Aggregate Bag from textual input. rwaggbagtool(1) allows you to manipulate binary Aggregate Bag files.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
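
The abbreviation rule can be sketched as follows. This is an illustrative model under stated assumptions, not the parser SiLK uses: the function resolve_option is hypothetical, and the option list is simply copied from the SYNOPSIS above.

```python
# Option names taken from the SYNOPSIS of this page.
OPTIONS = [
    "keys", "counters", "note-strip", "note-add", "note-file-add",
    "invocation-strip", "print-filenames", "copy-input",
    "compression-method", "ipv6-policy", "output-path",
    "site-config-file", "xargs", "help", "help-fields", "version",
]


def resolve_option(name, options=OPTIONS):
    """Return the full option name for `name` (hypothetical helper)."""
    if name in options:
        # An exact match wins even when it is a prefix of another option,
        # for example "help" versus "help-fields".
        return name
    matches = [opt for opt in options if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: --{name}")
    raise ValueError(f"ambiguous option --{name}: {', '.join(matches)}")
```

Under this model, "key" resolves to "keys" and "help" resolves to itself, while "note" is rejected because it is a prefix of three different options.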

--keys=KEY

Create a key for binning flow records using the values of the comma-separated field(s) listed in KEY. The field names are case-insensitive, a name may be abbreviated to its shortest unique prefix, and a name may be used only once. The available KEY fields are:

sIPv4

source IP address when IPv4

sIPv6

source IP address when IPv6

dIPv4

destination IP address when IPv4

dIPv6

destination IP address when IPv6

sPort

source port for TCP or UDP, or equivalent

dPort

destination port for TCP or UDP, or equivalent

protocol

IP protocol

packets

count of packets recorded for this flow record

bytes

count of bytes recorded for this flow record

flags

bit-wise OR of TCP flags over all packets in the flow

sTime

starting time of the flow, in seconds resolution

duration

duration of the flow, in seconds resolution

eTime

ending time of the flow, in seconds resolution

sensor

numeric ID of the sensor where the flow was collected

input

router SNMP input interface or vlanId if packing tools were configured to capture it (see sensor.conf(5))

output

router SNMP output interface or postVlanId

nhIPv4

router next hop IP address when IPv4

nhIPv6

router next hop IP address when IPv6

initialFlags

TCP flags on first packet in the flow as reported by yaf(1)

sessionFlags

bit-wise OR of TCP flags over all packets in the flow except the first as reported by yaf

attributes

flow attributes set by the flow generator

application

content of the flow as reported in the applabel field of yaf

class

class of the sensor at the collection point

type

type of the sensor at the collection point

icmpType

ICMP type value for ICMP and ICMPv6 flows, 0 otherwise

icmpCode

ICMP code value for ICMP and ICMPv6 flows, 0 otherwise

scc

the country code of the source IP address. Uses the mapping file specified by the SILK_COUNTRY_CODES environment variable or the country_codes.pmap mapping file, as described in "FILES". (See also ccfilter(3).) Since SiLK 3.19.0.

dcc

the country code of the destination IP address. See scc. Since SiLK 3.19.0.

--counters=COUNTER

Add to the bin determined by the fields in --keys the values of the comma-separated field(s) listed in COUNTER. The field names are case-insensitive, a name may be abbreviated to its shortest unique prefix, and a name may be used only once. The available COUNTER fields are:

records

count of the number of flow records that match the key

sum-packets

the sum of the packet counts for flow records that match the key

sum-bytes

the sum of the byte counts for flow records that match the key

sum-duration

the sum of the durations (in seconds) for flow records that match the key

--note-strip

Do not copy the notes (annotations) from the input file(s) to the output file. When this switch is not specified, notes from the input file(s) are copied to the output.

--note-add=TEXT

Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.

--note-file-add=FILENAME

Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--invocation-strip

Do not record any command line history: do not copy the invocation history from the input files to the output file(s), and do not record the current command line invocation in the output. The invocation may be viewed with rwfileinfo(1).

--print-filenames

Print to the standard error the names of input files as they are opened.

--copy-input=PATH

Copy all binary SiLK Flow records read as input to the specified file or named pipe. PATH may be stdout or - to write flows to the standard output as long as the --output-path switch is specified to redirect rwaggbag's output to a different location.

--output-path=PATH

Write the binary Aggregate Bag output to PATH, where PATH is a filename, a named pipe, the keyword stderr to write the output to the standard error, or the keyword stdout or - to write the output to the standard output. If PATH names an existing file, rwaggbag exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this switch is not given, the output is written to the standard output. Attempting to write the binary output to a terminal causes rwaggbag to exit with an error.

--ipv6-policy=POLICY

Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:

ignore

Ignore any flow record marked as IPv6, regardless of the IP addresses it contains. Only IP addresses contained in IPv4 flow records will be added to the Aggregate Bag.

asv4

Convert IPv6 flow records that contain addresses in the ::ffff:0:0/96 netblock (that is, IPv4-mapped IPv6 addresses) to IPv4 and ignore all other IPv6 flow records.

mix

Process the input as a mixture of IPv4 and IPv6 flow records. When creating a bag whose key is an IP address and the input contains IPv6 addresses outside of the ::ffff:0:0/96 netblock, this policy is equivalent to force; otherwise it is equivalent to asv4.

force

Convert IPv4 flow records to IPv6, mapping the IPv4 addresses into the ::ffff:0:0/96 netblock.

only

Process only flow records that are marked as IPv6. Only IP addresses contained in IPv6 flow records will be added to the Aggregate Bag.
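
The per-record policies can be illustrated with Python's standard ipaddress module. This is a simplified sketch, not SiLK's code: the function apply_policy is hypothetical, it looks at a single address rather than a whole flow record, and it omits mix because that policy depends on the input as a whole.

```python
import ipaddress


def apply_policy(addr, policy):
    """Return the address as the policy would keep it, or None if ignored.

    Hypothetical single-address model of the ignore, asv4, force, and
    only policies; a real flow record carries several addresses.
    """
    ip = ipaddress.ip_address(addr)
    if policy == "ignore":
        return ip if ip.version == 4 else None
    if policy == "only":
        return ip if ip.version == 6 else None
    if policy == "asv4":
        if ip.version == 4:
            return ip
        # ipv4_mapped is None unless the address is in ::ffff:0:0/96.
        return ip.ipv4_mapped
    if policy == "force":
        if ip.version == 6:
            return ip
        # Map the IPv4 address into the ::ffff:0:0/96 netblock.
        return ipaddress.IPv6Address(f"::ffff:{ip}")
    raise ValueError(f"unsupported policy in this sketch: {policy}")
```

For example, under asv4 the address ::ffff:192.0.2.1 is kept as the IPv4 address 192.0.2.1 and 2001:db8::1 is dropped, while under force the IPv4 address 192.0.2.1 becomes its IPv4-mapped IPv6 equivalent.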

--compression-method=COMP_METHOD

Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.

none

Do not compress the output using an external library.

zlib

Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.

lzo1x

Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.

snappy

Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.

best

Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwaggbag searches for the site configuration file in the locations specified in the "FILES" section.

--xargs
--xargs=FILENAME

Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwaggbag opens each named file in turn and reads records from it as if the filenames had been listed on the command line.

--help

Print the available options and exit.

--help-fields

Print the names and descriptions of the keys and counters that may be used in the --keys and --counters switches and exit. Since SiLK 3.22.0.

--version

Print the version number and information about how SiLK was configured, then exit the application.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

To create an Aggregate Bag that sums the packet count for each destination IP address in the SiLK Flow file data.rw and print the result as text:

$ rwaggbag --key=dipv6 --counter=sum-packets data.rw   \
  | rwaggbagcat

To sum the number of records, packet count, and byte count for each destination port, and write the result to the file dport.aggbag:

$ rwaggbag --key=dport --counter=records,sum-packets,sum-bytes    \
       --output-path=dport.aggbag data.rw

To count the number of records seen for each unique source port, destination port, and protocol:

$ rwaggbag --key=sport,dport,proto --counter=records data.rw   \
  | rwaggbagcat

ENVIRONMENT

SILK_COUNTRY_CODES

This environment variable allows the user to specify the country code mapping file that rwaggbag uses when mapping an IP to a country for the scc and dcc keys. The value may be a complete path or a file relative to the SILK_PATH. See the "FILES" section for standard locations of this file.

SILK_IPV6_POLICY

This environment variable is used as the value for --ipv6-policy when that switch is not provided.

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_COMPRESSION_METHOD

This environment variable is used as the value for --compression-method when that switch is not provided.

SILK_CONFIG_FILE

This environment variable is used as the value for --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwaggbag may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwaggbag may use this environment variable. See the "FILES" section for details.

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.
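
The candidate list can be expanded for a given environment with a short Python sketch. It assumes the locations are tried in the order listed, which this page does not state explicitly. The function names site_config_candidates and find_site_config are hypothetical.

```python
import os


def site_config_candidates(env=None):
    """List candidate silk.conf paths, in the order shown above.

    Assumption: entries that depend on an unset environment variable
    are skipped, and the listed order is the search order.
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get("SILK_CONFIG_FILE"):
        candidates.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        candidates.append(f"{env['SILK_DATA_ROOTDIR']}/silk.conf")
    candidates.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        candidates.append(f"{env['SILK_PATH']}/share/silk/silk.conf")
        candidates.append(f"{env['SILK_PATH']}/share/silk.conf")
    candidates.append("/usr/share/silk/silk.conf")
    candidates.append("/usr/share/silk.conf")
    return candidates


def find_site_config(env=None):
    """Return the first candidate that exists as a file, or None."""
    return next(
        (path for path in site_config_candidates(env) if os.path.isfile(path)),
        None,
    )
```

Printing site_config_candidates() in the shell where rwaggbag will run shows which paths the current environment makes eligible, which can help when diagnosing a missing site configuration.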

$SILK_COUNTRY_CODES
$SILK_PATH/share/silk/country_codes.pmap
$SILK_PATH/share/country_codes.pmap
/usr/share/silk/country_codes.pmap
/usr/share/country_codes.pmap

Possible locations for the country code mapping file required by the scc and dcc keys.

NOTES

rwaggbag and the other Aggregate Bag tools were introduced in SiLK 3.15.0.

SEE ALSO

rwaggbagbuild(1), rwaggbagcat(1), rwaggbagtool(1), rwbag(1), rwfileinfo(1), rwfilter(1), rwnetmask(1), rwset(1), rwuniq(1), ccfilter(3), sensor.conf(5), silk(7), yaf(1), zlib(3)