rwbag - Build a binary Bag from SiLK Flow records
rwbag --bag-file=KEY,COUNTER,OUTPUTFILE
[--bag-file=KEY,COUNTER,OUTPUTFILE ...]
[{ --pmap-file=PATH | --pmap-file=MAPNAME:PATH }]
[--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
[--invocation-strip] [--print-filenames] [--copy-input=PATH]
[--compression-method=COMP_METHOD]
[--ipv6-policy={ignore,asv4,mix,force,only}]
[--site-config-file=FILENAME]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}
rwbag --help
rwbag --legacy-help
rwbag --version
LEGACY SYNOPSIS
rwbag [--sip-flows=OUTPUTFILE] [--dip-flows=OUTPUTFILE]
[--sport-flows=OUTPUTFILE] [--dport-flows=OUTPUTFILE]
[--proto-flows=OUTPUTFILE] [--sensor-flows=OUTPUTFILE]
[--input-flows=OUTPUTFILE] [--output-flows=OUTPUTFILE]
[--nhip-flows=OUTPUTFILE]
[--sip-packets=OUTPUTFILE] [--dip-packets=OUTPUTFILE]
[--sport-packets=OUTPUTFILE] [--dport-packets=OUTPUTFILE]
[--proto-packets=OUTPUTFILE] [--sensor-packets=OUTPUTFILE]
[--input-packets=OUTPUTFILE] [--output-packets=OUTPUTFILE]
[--nhip-packets=OUTPUTFILE]
[--sip-bytes=OUTPUTFILE] [--dip-bytes=OUTPUTFILE]
[--sport-bytes=OUTPUTFILE] [--dport-bytes=OUTPUTFILE]
[--proto-bytes=OUTPUTFILE] [--sensor-bytes=OUTPUTFILE]
[--input-bytes=OUTPUTFILE] [--output-bytes=OUTPUTFILE]
[--nhip-bytes=OUTPUTFILE]
[--note-add=TEXT] [--note-file-add=FILE]
[--print-filenames] [--copy-input=PATH]
[--compression-method=COMP_METHOD]
[--ipv6-policy={ignore,asv4,mix,force,only}]
[--site-config-file=FILENAME]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}
rwbag reads SiLK Flow records and builds one or more Bag files. A Bag is similar to a set but each key is associated with a counter. Usually the key is some aspect of a flow record (an IP address, a port, the protocol, et cetera), and the counter is a volume (such as the number of flow records or the sum of bytes or packets) for the flow records that match that key. A Bag file supports a single key field and a single counter field; use the Aggregate Bag tools (e.g., rwaggbag(1)) when the key or counter contains multiple fields.
The --bag-file switch is required and it specifies how to create a Bag file. The argument to the switch names the key field to use for the bag, the counter field, and the location where the bag file is to be written. The switch may be repeated to create multiple Bag files.
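As a rough mental model of the two paragraphs above, a Bag can be pictured as a mapping from one key to one counter, and repeating --bag-file as filling several such mappings during a single pass over the records. The Python sketch below is a conceptual illustration only, not how rwbag is implemented; the record layout, the field names in the dictionaries, and the helper names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical flow records; the dictionary field names are illustrative only.
flows = [
    {"protocol": 6, "sip": "10.4.1.1", "bytes": 1200, "packets": 3},
    {"protocol": 17, "sip": "10.5.2.2", "bytes": 300, "packets": 1},
    {"protocol": 6, "sip": "10.4.1.1", "bytes": 800, "packets": 2},
]


def counter_value(flow, counter):
    """Return the amount one flow record adds to its bin."""
    if counter == "records":
        return 1
    if counter == "sum-bytes":
        return flow["bytes"]
    if counter == "sum-packets":
        return flow["packets"]
    raise ValueError(f"unknown counter {counter!r}")


def build_bags(flows, specs):
    """Fill one key->counter mapping per (key, counter) pair in one pass."""
    bags = [defaultdict(int) for _ in specs]
    for flow in flows:
        for bag, (key, counter) in zip(bags, specs):
            bag[flow[key]] += counter_value(flow, counter)
    return [dict(bag) for bag in bags]


proto_bytes, sip_records = build_bags(
    flows, [("protocol", "sum-bytes"), ("sip", "records")]
)
print(proto_bytes)  # {6: 2000, 17: 300}
print(sip_records)  # {'10.4.1.1': 2, '10.5.2.2': 1}
```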
rwbag reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwbag reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.
If adding a value to a key would cause the value to overflow the maximum value that Bags support, the key's value will be set to the maximum and processing will continue. In addition, if this is the first value to overflow in this Bag, a warning will be printed to the standard error.
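The overflow rule above amounts to a saturating add with a one-time warning per Bag. The following Python sketch models that behavior under stated assumptions: the class name is hypothetical, and COUNTER_MAX is a placeholder limit chosen for illustration, not the actual maximum that Bag files support.

```python
import sys

# Assumed placeholder limit; the real maximum counter value is not given here.
COUNTER_MAX = 2**64 - 1


class SaturatingBag:
    """Toy key->counter store that clamps instead of overflowing."""

    def __init__(self, maximum=COUNTER_MAX):
        self.maximum = maximum
        self.counters = {}
        self.warned = False  # a warning is printed only for the first overflow

    def add(self, key, amount):
        current = self.counters.get(key, 0)
        if amount > self.maximum - current:
            # Clamp to the maximum and keep processing.
            self.counters[key] = self.maximum
            if not self.warned:
                print(f"warning: counter overflow for key {key!r}",
                      file=sys.stderr)
                self.warned = True
        else:
            self.counters[key] = current + amount


bag = SaturatingBag(maximum=100)  # tiny limit so the clamp is easy to see
bag.add("a", 60)
bag.add("a", 60)   # 120 would exceed 100, so "a" is set to 100 and a warning is printed
bag.add("b", 500)  # clamps as well, but no second warning
print(bag.counters)  # {'a': 100, 'b': 100}
```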
If rwbag runs out of memory, it will exit immediately. The output Bag files will remain behind, each with a size of 0 bytes.
Use rwbagcat(1) to see the contents of a bag. To create a bag from textual input or from an IPset, use rwbagbuild(1). rwbagtool(1) allows you to manipulate binary bag files.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Bin flow records by unique KEY, compute the COUNTER for each bin, and write the result to OUTPUTFILE. The lists of available KEY and COUNTER values are given immediately below. OUTPUTFILE is the name of a non-existent file, a named pipe, or the keyword stdout or - to write the binary Bag to the standard output. Repeat the --bag-file switch to create multiple Bag files in a single pass over the data. Only one OUTPUTFILE may use the standard output. See "LEGACY BAG CREATION SWITCHES" for deprecated methods to create Bag files. This switch or one of its legacy equivalents is required. Since SiLK 3.12.0.
rwbag supports the following names for KEY. The case of KEY is ignored.
source IP address, either IPv4 or IPv6
source IP address, either IPv4 or IPv6
destination IP address, either IPv4 or IPv6
destination IP address, either IPv4 or IPv6
source port for TCP or UDP, or equivalent
destination port for TCP or UDP, or equivalent
IP protocol
count of packets recorded for this flow record
count of bytes recorded for this flow record
bit-wise OR of TCP flags over all packets in the flow
starting time of the flow, in seconds resolution
duration of the flow, in seconds resolution
ending time of the flow, in seconds resolution
numeric ID of the sensor where the flow was collected
router SNMP input interface or vlanId if packing tools were configured to capture it (see sensor.conf(5))
router SNMP output interface or postVlanId
router next hop IP address, either IPv4 or IPv6
router next hop IP address, either IPv4 or IPv6
TCP flags on first packet in the flow
bit-wise OR of TCP flags over all packets except the first in the flow
flow attributes set by the flow generator
guess as to the content of the flow
the country code of the source IP address. Uses the mapping file specified by the SILK_COUNTRY_CODES environment variable or the country_codes.pmap mapping file, as described in "FILES". (See also ccfilter(3).) Since SiLK 3.12.0.
an alias for sip-country
the country code of the destination IP address
an alias for dip-country
the value that the source IP address maps to in the mapping file whose map-name is MAPNAME. The type of that prefix map must be IPv4-address or IPv6-address. Use --pmap-file to load the mapping file and optionally set its map-name. Since the MAPNAME must be known when the --bag-file switch is parsed, the --pmap-file switch(es) should precede the --bag-file switch(es).
the value that the destination IP address maps to in the mapping file whose map-name is MAPNAME. See sip-pmap:MAPNAME.
the value that the protocol/source-port pair maps to in the mapping file whose map-name is MAPNAME. The type of that prefix map must be proto-port. Use --pmap-file to load the mapping file and optionally set its map-name. Since the MAPNAME must be known when the --bag-file switch is parsed, the --pmap-file switch(es) should precede the --bag-file switch(es).
the value that the protocol/destination-port pair maps to in the mapping file whose map-name is MAPNAME. See sport-pmap:MAPNAME.
rwbag supports the following names for COUNTER. The case of COUNTER is ignored.
count of the number of flow records that match the key
an alias for records
the sum of the packet counts for flow records that match the key
an alias for sum-packets
the sum of the byte counts for flow records that match the key
an alias for sum-bytes
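To make the KEY,COUNTER,OUTPUTFILE argument format concrete, here is a minimal Python sketch of how such an argument could be validated. It is not rwbag's parser. The key set is limited to names that appear elsewhere in this page, the alias table covers only the counter names used in the examples, and the choices to split on the first two commas and to match stdout exactly are assumptions.

```python
# Key names that appear in this page (case is ignored); not the full list.
PLAIN_KEYS = {"sipv4", "dipv4", "sport", "dport", "protocol", "sensor",
              "input", "output", "nhipv4", "sip-country", "dip-country"}
PMAP_KEYS = {"sip-pmap", "dip-pmap", "sport-pmap", "dport-pmap"}
# Counter names used in this page, mapped to their canonical form.
COUNTERS = {"records": "records", "flows": "records",
            "sum-packets": "sum-packets",
            "sum-bytes": "sum-bytes", "bytes": "sum-bytes"}
STDOUT_NAMES = {"-", "stdout"}


def parse_bag_file(arg):
    """Split one KEY,COUNTER,OUTPUTFILE argument into a normalized tuple."""
    parts = arg.split(",", 2)  # assumption: OUTPUTFILE keeps any later commas
    if len(parts) != 3:
        raise ValueError(f"expected KEY,COUNTER,OUTPUTFILE, got {arg!r}")
    key, counter, output = parts
    name, sep, mapname = key.partition(":")
    name = name.lower()
    if sep:
        # Prefix-map keys carry a map-name whose case is preserved.
        if name not in PMAP_KEYS or not mapname:
            raise ValueError(f"invalid prefix-map key {key!r}")
        key = f"{name}:{mapname}"
    elif name in PLAIN_KEYS:
        key = name
    else:
        raise ValueError(f"unknown key {key!r}")
    if counter.lower() not in COUNTERS:
        raise ValueError(f"unknown counter {counter!r}")
    return key, COUNTERS[counter.lower()], output


def parse_all(args):
    """Parse every argument and allow at most one to use standard output."""
    specs = [parse_bag_file(a) for a in args]
    if sum(out in STDOUT_NAMES for _, _, out in specs) > 1:
        raise ValueError("only one OUTPUTFILE may use the standard output")
    return specs


print(parse_bag_file("Protocol,SUM-BYTES,proto-byte.bag"))
# ('protocol', 'sum-bytes', 'proto-byte.bag')
```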
Load the prefix map file from PATH for use when the key part of the argument to the --bag-file switch is one of sip-pmap, dip-pmap, sport-pmap, or dport-pmap. Specify PATH as - or stdin to read from the standard input. If MAPNAME is specified, it overrides the map-name contained in the prefix map file itself. If no map-name is available, rwbag exits with an error. The switch may be repeated to load multiple prefix map files; each file must have a unique map-name. To create a prefix map file, use rwpmapbuild(1). Since SiLK 3.12.0.
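The map-name rules for --pmap-file (an explicit MAPNAME wins, some name must exist, and names must be unique) can be sketched in a few lines of Python. This is a model of the rules as described, not rwbag's code: the function names are hypothetical, splitting the argument on the first colon is an assumption, and the name stored inside the file is passed in as a plain value rather than read from disk.

```python
def split_pmap_arg(arg):
    """Split 'MAPNAME:PATH' or 'PATH' (assumes PATH itself has no ':')."""
    name, sep, path = arg.partition(":")
    return (name, path) if sep else (None, arg)


def register_pmap(registry, arg, name_in_file):
    """Record a prefix map under its effective map-name and return that name."""
    name, path = split_pmap_arg(arg)
    effective = name or name_in_file  # a MAPNAME on the command line wins
    if not effective:
        raise ValueError(f"no map-name available for {path}")
    if effective in registry:
        raise ValueError(f"map-name {effective!r} is already in use")
    registry[effective] = path
    return effective


registry = {}
# The file carries its own map-name, as ip-map.pmap does in the examples.
print(register_pmap(registry, "ip-map.pmap", "service-host"))    # service-host
# The file has no map-name, so one is supplied on the command line.
print(register_pmap(registry, "srvport:proto-port.pmap", None))  # srvport
```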
Do not copy the notes (annotations) from the input files to the output file(s). When this switch is not specified, notes from the input files are copied to the output. Since SiLK 3.12.2.
Add the specified TEXT to the header of every output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
Open FILENAME and add the contents of that file to the header of every output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
Do not record any command line history: do not copy the invocation history from the input files to the output file(s), and do not record the current command line invocation in the output. The invocation may be viewed with rwfileinfo(1). Since SiLK 3.12.0.
Print to the standard error the names of input files as they are opened.
Copy all binary SiLK Flow records read as input to the specified file or named pipe. PATH may be stdout or - to write flows to the standard output as long as no Bag file is being written there.
Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:
Ignore any flow record marked as IPv6, regardless of the IP addresses it contains. Only IP addresses contained in IPv4 flow records will be added to the bag(s).
Convert IPv6 flow records that contain addresses in the ::ffff:0:0/96 netblock (that is, IPv4-mapped IPv6 addresses) to IPv4 and ignore all other IPv6 flow records.
Process the input as a mixture of IPv4 and IPv6 flow records. When creating a bag whose key is an IP address and the input contains IPv6 addresses outside of the ::ffff:0:0/96 netblock, this policy is equivalent to force; otherwise it is equivalent to asv4.
Convert IPv4 flow records to IPv6, mapping the IPv4 addresses into the ::ffff:0:0/96 netblock.
Process only flow records that are marked as IPv6. Only IP addresses contained in IPv6 flow records will be added to the bag(s).
Regardless of the IPv6 policy, when all IPv6 addresses in the bag are in the ::ffff:0:0/96 netblock, rwbag treats them as IPv4 addresses and writes an IPv4 bag. When any other IPv6 addresses are present in the bag, the IPv4 addresses in the bag are mapped into the ::ffff:0:0/96 netblock and rwbag writes an IPv6 bag.
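The policies and the final IPv4-versus-IPv6 decision can be illustrated with Python's standard ipaddress module. The sketch below is a simplified model: it treats a single address as standing in for a whole flow record (a record is considered IPv6 when its address is IPv6), and the function names are hypothetical.

```python
import ipaddress


def apply_policy(addr, policy):
    """Return the address to count, or None when the record is ignored."""
    ip = ipaddress.ip_address(addr)
    is_v6 = ip.version == 6
    if policy == "ignore":
        return None if is_v6 else ip
    if policy == "asv4":
        # ipv4_mapped is None unless the address is in ::ffff:0:0/96.
        return ip.ipv4_mapped if is_v6 else ip
    if policy == "mix":
        return ip
    if policy == "force":
        return ip if is_v6 else ipaddress.IPv6Address(f"::ffff:{ip}")
    if policy == "only":
        return ip if is_v6 else None
    raise ValueError(f"unknown policy {policy!r}")


def normalize_bag_keys(keys):
    """Decide whether the finished bag holds IPv4 or IPv6 keys."""
    v6_keys = [k for k in keys if k.version == 6]
    if all(k.ipv4_mapped is not None for k in v6_keys):
        # Every IPv6 key is IPv4-mapped: the bag is written as IPv4.
        return [k.ipv4_mapped if k.version == 6 else k for k in keys]
    # Otherwise IPv4 keys are mapped into ::ffff:0:0/96 for an IPv6 bag.
    return [k if k.version == 6 else ipaddress.IPv6Address(f"::ffff:{k}")
            for k in keys]


print(apply_policy("::ffff:10.1.2.3", "asv4"))  # 10.1.2.3
print(apply_policy("2001:db8::1", "asv4"))      # None
```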
Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.
Do not compress the output using an external library.
Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.
Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.
Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.
Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.
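The fallback order described for the last value above can be written as a short selection function. This Python sketch only models the stated preference order; the function name is hypothetical, and returning "none" when nothing is available or when the destination is not a file is an assumption about the label, not a documented return value.

```python
def choose_best(available, writing_to_file):
    """Pick a method in the order lzo1x, snappy, zlib; files only."""
    if not writing_to_file:
        return "none"  # standard output and named pipes stay uncompressed
    for method in ("lzo1x", "snappy", "zlib"):
        if method in available:
            return method
    return "none"  # assumption: no library found means no compression


print(choose_best({"zlib", "snappy"}, writing_to_file=True))  # snappy
print(choose_best({"zlib", "lzo1x"}, writing_to_file=False))  # none
```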
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwbag searches for the site configuration file in the locations specified in the "FILES" section.
Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwbag opens each named file in turn and reads records from it as if the filenames had been listed on the command line.
Print the available options and exit.
Print help, including legacy switches. See the "LEGACY BAG CREATION SWITCHES" section below for these switches.
Print the version number and information about how SiLK was configured, then exit the application.
LEGACY BAG CREATION SWITCHES
The following switches are deprecated as of SiLK 3.12.0. These switches may be used in conjunction with the --bag-file switch.
Equivalent to --bag-file=sIPv4,records,OUTPUTFILE. Count number of flows by unique source IP.
Equivalent to --bag-file=sIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique source IP.
Equivalent to --bag-file=sIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique source IP.
Equivalent to --bag-file=dIPv4,records,OUTPUTFILE. Count number of flows by unique destination IP.
Equivalent to --bag-file=dIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique destination IP.
Equivalent to --bag-file=dIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique destination IP.
Equivalent to --bag-file=sPort,records,OUTPUTFILE. Count number of flows by unique source port.
Equivalent to --bag-file=sPort,sum-packets,OUTPUTFILE. Count number of packets by unique source port.
Equivalent to --bag-file=sPort,sum-bytes,OUTPUTFILE. Count number of bytes by unique source port.
Equivalent to --bag-file=dPort,records,OUTPUTFILE. Count number of flows by unique destination port.
Equivalent to --bag-file=dPort,sum-packets,OUTPUTFILE. Count number of packets by unique destination port.
Equivalent to --bag-file=dPort,sum-bytes,OUTPUTFILE. Count number of bytes by unique destination port.
Equivalent to --bag-file=protocol,records,OUTPUTFILE. Count number of flows by unique protocol.
Equivalent to --bag-file=protocol,sum-packets,OUTPUTFILE. Count number of packets by unique protocol.
Equivalent to --bag-file=protocol,sum-bytes,OUTPUTFILE. Count number of bytes by unique protocol.
Equivalent to --bag-file=sensor,records,OUTPUTFILE. Count number of flows by unique sensor ID.
Equivalent to --bag-file=sensor,sum-packets,OUTPUTFILE. Count number of packets by unique sensor ID.
Equivalent to --bag-file=sensor,sum-bytes,OUTPUTFILE. Count number of bytes by unique sensor ID.
Equivalent to --bag-file=input,records,OUTPUTFILE. Count number of flows by unique input interface index.
Equivalent to --bag-file=input,sum-packets,OUTPUTFILE. Count number of packets by unique input interface index.
Equivalent to --bag-file=input,sum-bytes,OUTPUTFILE. Count number of bytes by unique input interface index.
Equivalent to --bag-file=output,records,OUTPUTFILE. Count number of flows by unique output interface index.
Equivalent to --bag-file=output,sum-packets,OUTPUTFILE. Count number of packets by unique output interface index.
Equivalent to --bag-file=output,sum-bytes,OUTPUTFILE. Count number of bytes by unique output interface index.
Equivalent to --bag-file=nhIPv4,records,OUTPUTFILE. Count number of flows by unique next hop IP.
Equivalent to --bag-file=nhIPv4,sum-packets,OUTPUTFILE. Count number of packets by unique next hop IP.
Equivalent to --bag-file=nhIPv4,sum-bytes,OUTPUTFILE. Count number of bytes by unique next hop IP.
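Because every legacy switch follows the same naming pattern, the equivalences listed above can be generated mechanically. The Python sketch below rewrites a legacy switch from the LEGACY SYNOPSIS into its --bag-file form using the key and counter names given in this section; the function name is hypothetical and this helper is not part of SiLK.

```python
# Key part of the legacy switch name -> KEY used in the equivalences above.
LEGACY_KEYS = {
    "sip": "sIPv4", "dip": "dIPv4", "sport": "sPort", "dport": "dPort",
    "proto": "protocol", "sensor": "sensor", "input": "input",
    "output": "output", "nhip": "nhIPv4",
}
# Counter part of the legacy switch name -> COUNTER.
LEGACY_COUNTERS = {
    "flows": "records", "packets": "sum-packets", "bytes": "sum-bytes",
}


def translate_legacy(switch):
    """Rewrite e.g. '--sip-flows=FILE' as '--bag-file=sIPv4,records,FILE'."""
    name, sep, outfile = switch.partition("=")
    if not name.startswith("--") or not sep or not outfile:
        raise ValueError(f"expected --KEY-COUNTER=OUTPUTFILE, got {switch!r}")
    key_part, _, counter_part = name[2:].rpartition("-")
    if key_part not in LEGACY_KEYS or counter_part not in LEGACY_COUNTERS:
        raise ValueError(f"not a legacy bag switch: {name}")
    key = LEGACY_KEYS[key_part]
    counter = LEGACY_COUNTERS[counter_part]
    return f"--bag-file={key},{counter},{outfile}"


print(translate_legacy("--sip-flows=sip-flow.bag"))
# --bag-file=sIPv4,records,sip-flow.bag
print(translate_legacy("--nhip-bytes=nh.bag"))
# --bag-file=nhIPv4,sum-bytes,nh.bag
```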
In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.
Read the SiLK Flow file data.rw and create the Bag proto-byte.bag that contains the total byte-count seen for each protocol by using protocol as the key and sum-bytes as the counter:
$ rwbag --bag-file=protocol,sum-bytes,proto-byte.bag data.rw
Use rwbagcat(1) to view the result:
$ rwbagcat proto-byte.bag
1| 10695328|
6| 120536195111|
17| 24500079|
Specify the output path as - to pass the Bag file from rwbag directly into rwbagcat.
$ rwbag --bag-file=protocol,sum-bytes,- data.rw \
| rwbagcat
1| 10695328|
6| 120536195111|
17| 24500079|
Compare that to this rwuniq(1) command.
$ rwuniq --field=protocol --value=bytes --sort-output data.rw
pro| Bytes|
1| 10695328|
6| 120536195111|
17| 24500079|
One advantage of Bag files over rwuniq is that the data remains in binary form where it can be manipulated by rwbagtool(1).
Read records from rwfilter(1) and build Bag files sip-flow.bag and dip-flow.bag that count the number of flows seen for each source address and for each destination address, respectively.
$ rwfilter ... --pass=stdout \
| rwbag --bag-file=sipv4,records,sip-flow.bag \
--bag-file=dipv4,records,dip-flow.bag
To create sip16-byte.bag that contains the number of bytes seen for each /16 found in the source address field, use the rwnetmask(1) tool prior to feeding the input to rwbag:
$ rwfilter ... --pass=stdout \
| rwnetmask --4sip-prefix-length=16 \
| rwbag --bag-file=sipv4,sum-bytes,sip16-byte.bag
$ rwbagcat sip16-byte.bag | head -4
10.4.0.0| 18260|
10.5.0.0| 536169|
10.9.0.0| 55386|
10.11.0.0| 5110438|
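Masking each source address to its /16 before counting is what makes all hosts in a network share one key. The Python sketch below shows the same idea with the standard ipaddress module; the input tuples are made-up sample data and the helper name is hypothetical, so this illustrates the arithmetic rather than the behavior of rwnetmask itself.

```python
import ipaddress
from collections import Counter

# Made-up (source address, byte count) pairs for illustration.
flows = [("10.4.3.2", 1000), ("10.4.200.9", 234), ("10.5.0.7", 50)]


def mask_v4(addr, prefix_len):
    """Zero the host bits of an IPv4 address, keeping prefix_len bits."""
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net.network_address)


bag = Counter()
for sip, nbytes in flows:
    bag[mask_v4(sip, 16)] += nbytes  # both 10.4.x.x hosts share one key

print(dict(bag))  # {'10.4.0.0': 1234, '10.5.0.0': 50}
```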
To group the IP addresses of an existing Bag into /16 prefixes when printing it, use the --network-structure switch of rwbagcat(1).
$ rwfilter ... --pass=stdout \
| rwbag --bag-file=sipv4,sum-bytes,- \
| rwbagcat --network-structure=B \
| head -4
10.4.0.0/16| 18260|
10.5.0.0/16| 536169|
10.9.0.0/16| 55386|
10.11.0.0/16| 5110438|
As of SiLK 3.12.0, a Bag file may contain a country code as its key. Create scc-pkt.bag that sums the packet count by country.
$ rwbag --bag-file=sip-country,sum-packets,scc-pkt.bag data.rw
$ rwbagcat scc-pkt.bag
--| 840|
a1| 284|
a2| 1|
ae| 8|
rwbag and rwbagbuild(1) can use a prefix map file as the key in a Bag file as of SiLK 3.12.0. For example, to look up each source address in the prefix map file ip-map.pmap that maps from address to "type of service", use the --pmap-file switch to specify the prefix map file, and specify the Bag's key as sip-pmap:map-name, where map-name is either the map-name stored in the prefix map file or a name that is provided as part of the --pmap-file argument. (A prefix map's map-name is available via the rwfileinfo(1) command.)
$ rwfileinfo --field=prefix-map ip-map.pmap
ip-map.pmap:
prefix-map v1: service-host
$
$ rwbag --pmap-file=ip-map.pmap \
--bag-file=sip-pmap:service-host,bytes,srvhost.bag \
data.rw
Multiple --pmap-file switches may be specified, which may be useful when generating multiple Bag files in a single invocation. On the command line, the --pmap-file switch that defines the map-name must precede the --bag-file switch where the map-name is used.
The prefix map file is not stored as part of the Bag, so you must provide the name of the prefix map when running rwbagcat.
$ rwbagcat srvhost.bag
rwbagcat: The --pmap-file switch is required for \
Bags containing sip-pmap keys
$ rwbagcat --pmap-file=ip-map.pmap srvhost.bag
external| 59950837766|
internal| 60602999159|
ntp| 588316|
dns| 14404581|
dhcp| 2560696|
rwbag also has support for prefix map files that map from a protocol-port pair to a label. The proto-port.pmap file does not have a map-name so a name must be provided on the rwbag command line.
$ rwfileinfo --field=prefix-map proto-port.pmap
proto-port.pmap:
$
$ rwbag --pmap-file=srvport:proto-port.pmap \
--bag-file=sport-pmap:srvport,flows,srvport.bag \
data.rw
$ rwbagcat --pmap-file=proto-port.pmap srvport.bag | head -4
ICMP| 15622|
UDP| 62216|
UDP/DNS| 62216|
UDP/DHCP| 15614|
This environment variable allows the user to specify the country code mapping file that rwbag uses when mapping an IP to a country for the sip-country and dip-country keys. The value may be a complete path or a file relative to the SILK_PATH. See the "FILES" section for standard locations of this file.
This environment variable is used as the value for --ipv6-policy when that switch is not provided.
The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.
This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.
This environment variable is used as the value for the --site-config-file when that switch is not provided.
This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwbag may use this environment variable when searching for the SiLK site configuration file.
This environment variable gives the root of the install tree. When searching for configuration files, rwbag may use this environment variable. See the "FILES" section for details.
FILES
Possible locations for the SiLK site configuration file, which are checked when the --site-config-file switch is not provided.
Possible locations for the country code mapping file required by the sip-country and dip-country keys.
rwbagbuild(1), rwbagcat(1), rwbagtool(1), rwaggbag(1), rwfileinfo(1), rwfilter(1), rwnetmask(1), rwpmapbuild(1), rwuniq(1), ccfilter(3), sensor.conf(5), silk(7), zlib(3)