NAME

rwaggbagcat - Output a binary Aggregate Bag file as text

SYNOPSIS

rwaggbagcat [--fields=FIELDS
      [--missing-field=FIELD=STRING [--missing-field=FIELD=STRING...]]]
      [--timestamp-format=FORMAT] [--ip-format=FORMAT]
      [--integer-sensors] [--integer-tcp-flags]
      [--no-titles] [--no-columns] [--column-separator=C]
      [--no-final-delimiter] [{--delimited | --delimited=C}]
      [--output-path=PATH] [--pager=PAGER_PROG]
      [--site-config-file=FILENAME]
      [AGGBAGFILE [AGGBAGFILE...]]

rwaggbagcat --help

rwaggbagcat --help-fields

rwaggbagcat --version

DESCRIPTION

rwaggbagcat reads a binary Aggregate Bag as created by rwaggbag(1) or rwaggbagbuild(1), converts it to text, and outputs it to the standard output, the pager, or the specified file.

As of SiLK 3.22.0, rwaggbagcat accepts a --fields switch to control the order in which the fields are printed.

rwaggbagcat reads the AGGBAGFILEs specified on the command line; if no AGGBAGFILE arguments are given, rwaggbagcat attempts to read an Aggregate Bag from the standard input. To read the standard input in addition to the named files, use - or stdin as an AGGBAGFILE name. If any input does not contain an Aggregate Bag file, rwaggbagcat prints an error to the standard error and exits abnormally.

When multiple AGGBAGFILEs are specified on the command line, each is handled individually. To process the files as a single Aggregate Bag, use rwaggbagtool(1) to combine the Aggregate Bags and pipe the output of rwaggbagtool into rwaggbagcat. When several files are printed individually, using --fields gives consistent output across the files and causes the titles to appear only once. If --fields names a key or counter that is not present in one of the files, an empty value is printed for that field unless --missing-field supplies a replacement string.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

--fields=FIELDS

Print only the key and/or counter fields given in this comma-separated list. Fields are printed in the order given in FIELDS, and keys and counters may appear in any order or not at all. Specifying --fields changes only the order in which the columns are printed; it does not reorder the entries (rows) in the Aggregate Bag file. If FIELDS includes a field that is not present in an input Aggregate Bag file, rwaggbagcat prints the string specified for that field by --missing-field, or an empty value if no string was specified. The title line is printed only once even if multiple Aggregate Bag files are read.

The names of the fields that may appear in FIELDS are:

sIPv4

source IP address, IPv4 only

dIPv4

destination IP address, IPv4 only

nhIPv4

next hop IP address, IPv4 only

any-IPv4

a generic IPv4 address

sIPv6

source IP address, IPv6 only

dIPv6

destination IP address, IPv6 only

nhIPv6

next hop IP address, IPv6 only

any-IPv6

a generic IPv6 address

sPort

source port

dPort

destination port

any-port

a generic port

protocol

IP protocol

packets

packet count

bytes

byte count

flags

bit-wise OR of TCP flags over all packets

initialFlags

TCP flags on the first packet

sessionFlags

bit-wise OR of TCP flags on the second through final packet

sTime

starting time in seconds

eTime

ending time in seconds

any-time

a generic time in seconds

duration

duration of flow

sensor

sensor name or ID at the collection point

class

class at collection point

type

type at collection point

input

router SNMP ingress interface or vlanId

output

router SNMP egress interface or postVlanId

any-snmp

a generic SNMP value

attribute

flow attributes set by the flow generator

application

guess as to the content of the flow

icmpType

ICMP type

icmpCode

ICMP code

scc

the country code of the source

dcc

the country code of the destination

any-cc

a generic country code

custom-key

a generic key

records

counter: count of records that match the key

sum-packets

counter: sum of packet counts

sum-bytes

counter: sum of byte counts

sum-duration

counter: sum of duration values

custom-counter

counter: a generic counter

Since SiLK 3.22.0.

--missing-field=FIELD=STRING

When --fields is active, print STRING as the value for FIELD when FIELD is not present in the input Aggregate Bag file. The default value is the empty string. The switch may be repeated to set the missing value string for multiple fields. rwaggbagcat exits with an error if FIELD is not present in --fields or if this switch is specified but --fields is not. STRING may be any string. Since SiLK 3.22.0.
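The interaction between --fields and --missing-field can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not rwaggbagcat's implementation: the function select_columns and its parameters are hypothetical names, rows are represented as plain dictionaries, field names are matched exactly (no abbreviation or case folding), and the output is unpadded.

```python
def select_columns(rows, fields, missing=None, sep="|"):
    """Render rows (dicts keyed by field name) as delimited text lines.

    Hypothetical helper: models --fields and --missing-field only.
    """
    missing = missing or {}
    # Documented rule: every --missing-field name must also be in --fields.
    unknown = [name for name in missing if name not in fields]
    if unknown:
        raise ValueError(f"missing-field names not in fields: {unknown}")

    lines = [sep.join(fields)]  # the title line is emitted a single time
    for row in rows:
        # Absent fields get their replacement string, or "" by default.
        cells = [str(row[name]) if name in row else missing.get(name, "")
                 for name in fields]
        lines.append(sep.join(cells))
    return lines


rows = [
    {"sPort": 0, "dPort": 0, "sum-bytes": 6169968},
    {"sPort": 0, "dPort": 769, "sum-bytes": 842912},
]
for line in select_columns(rows, ["sIPv4", "protocol", "dPort", "sum-bytes"],
                           missing={"sIPv4": "n/a"}):
    print(line)
```

The row order is left untouched; only the columns are selected and rearranged, which matches the description of --fields above.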

--timestamp-format=FORMAT

Specify the format, timezone, and/or modifier to use when printing timestamps. When this switch is not specified, the SILK_TIMESTAMP_FORMAT environment variable is checked for a format, timezone, and modifier. If it is empty or contains invalid values, timestamps are printed in the default format, and the timezone is UTC unless SiLK was compiled with local timezone support. FORMAT is a comma-separated list of a format, a timezone, and/or a modifier. The format is one of:

default

Print the timestamps as YYYY/MM/DDThh:mm:ss.sss.

iso

Print the timestamps as YYYY-MM-DD hh:mm:ss.sss.

m/d/y

Print the timestamps as MM/DD/YYYY hh:mm:ss.sss.

epoch

Print the timestamps as the number of seconds since 00:00:00 UTC on 1970-01-01.

When a timezone is specified, it is used regardless of the default timezone support compiled into SiLK. The timezone is one of:

utc

Use Coordinated Universal Time to print timestamps.

local

Use the TZ environment variable or the local timezone.
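To make the format names concrete, the following Python sketch renders one epoch value in each pattern quoted above. It is an approximation for illustration: format_timestamp and PATTERNS are hypothetical names, the timezone is fixed to UTC, and the strftime strings are transcribed from the descriptions in this section rather than taken from SiLK itself.

```python
from datetime import datetime, timezone

# strftime patterns transcribed from the format descriptions above.
PATTERNS = {
    "default": "%Y/%m/%dT%H:%M:%S",
    "iso": "%Y-%m-%d %H:%M:%S",
    "m/d/y": "%m/%d/%Y %H:%M:%S",
}


def format_timestamp(epoch_seconds, fmt="default"):
    """Render an epoch time (UTC assumed) in one of the named formats."""
    if fmt == "epoch":
        return str(int(epoch_seconds))
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    millis = moment.microsecond // 1000  # the ".sss" part of each pattern
    return f"{moment.strftime(PATTERNS[fmt])}.{millis:03d}"


for name in ("default", "iso", "m/d/y", "epoch"):
    print(name, format_timestamp(1234396802, name))
```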

--ip-format=FORMAT

Specify how IP addresses are printed, where FORMAT is a comma-separated list of the arguments described below. When this switch is not specified, the SILK_IP_FORMAT environment variable is checked for a value and that format is used if it is valid. The default FORMAT is canonical.

canonical

Print IP addresses in the canonical format. If the column is IPv4, use dot-separated decimal (192.0.2.1). If the column is IPv6, use colon-separated hexadecimal (2001:db8::1) or a mixed IPv4-IPv6 representation for IPv4-mapped IPv6 addresses (the ::ffff:0:0/96 netblock, e.g., ::ffff:192.0.2.1) and IPv4-compatible IPv6 addresses (the ::/96 netblock other than ::/127, e.g., ::192.0.2.1).

no-mixed

Print IP addresses in the canonical format (192.0.2.1 or 2001:db8::1) but do not use the mixed IPv4-IPv6 representations. For example, use ::ffff:c000:201 instead of ::ffff:192.0.2.1. Since SiLK 3.17.0.

decimal

Print IP addresses as integers in decimal format. For example, print 192.0.2.1 and 2001:db8::1 as 3221225985 and 42540766411282592856903984951653826561, respectively.

hexadecimal

Print IP addresses as integers in hexadecimal format. For example, print 192.0.2.1 and 2001:db8::1 as c0000201 and 20010db8000000000000000000000001, respectively.

zero-padded

Make all IP address strings contain the same number of characters by padding numbers with leading zeros. For example, print 192.0.2.1 and 2001:db8::1 as 192.000.002.001 and 2001:0db8:0000:0000:0000:0000:0000:0001, respectively. For IPv6 addresses, this setting implies no-mixed, so that ::ffff:192.0.2.1 is printed as 0000:0000:0000:0000:0000:ffff:c000:0201. As of SiLK 3.17.0, may be combined with any of the above, including decimal and hexadecimal.
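The decimal, hexadecimal, and zero-padded renderings can be reproduced with Python's standard ipaddress module, which is a convenient way to check a value seen in rwaggbagcat output. The helper names below are hypothetical, and the sketch covers only these three arguments used on their own, not their combinations.

```python
import ipaddress


def as_decimal(text):
    """Address as a decimal integer string."""
    return str(int(ipaddress.ip_address(text)))


def as_hexadecimal(text):
    """Address as a hexadecimal integer string, without padding."""
    return format(int(ipaddress.ip_address(text)), "x")


def as_zero_padded(text):
    """Canonical form with every octet or hextet padded to full width."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # Three decimal digits per octet.
        return ".".join(f"{octet:03d}" for octet in addr.packed)
    # Eight groups of four hex digits; no mixed IPv4-IPv6 notation.
    digits = f"{int(addr):032x}"
    return ":".join(digits[i:i + 4] for i in range(0, 32, 4))


for sample in ("192.0.2.1", "2001:db8::1"):
    print(as_decimal(sample), as_hexadecimal(sample), as_zero_padded(sample))
```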

The following arguments modify certain IP addresses prior to printing. These arguments may be combined with the above formats.

map-v4

Change an IPv4 column to IPv4-mapped IPv6 addresses (addresses in the ::ffff:0:0/96 netblock) prior to formatting. Since SiLK 3.17.0.

unmap-v6

For an IPv6 column, change any IPv4-mapped IPv6 addresses (addresses in the ::ffff:0:0/96 netblock) to IPv4 addresses prior to formatting. Since SiLK 3.17.0.

The following argument is also available:

force-ipv6

Set FORMAT to map-v4,no-mixed.
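The address transformations named by map-v4 and unmap-v6 can be sketched with Python's ipaddress module. The functions map_v4 and unmap_v6 are hypothetical helpers that operate on a single address; rwaggbagcat applies the corresponding change to a whole column before formatting.

```python
import ipaddress


def map_v4(text):
    """Move an IPv4 address into ::ffff:0:0/96; leave IPv6 unchanged."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr


def unmap_v6(text):
    """Turn an IPv4-mapped IPv6 address back into IPv4; leave others alone."""
    addr = ipaddress.ip_address(text)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return addr.ipv4_mapped
    return addr


mapped = map_v4("192.0.2.1")
print(mapped.version, format(int(mapped), "x"))
print(unmap_v6("::ffff:192.0.2.1"))
```

Addresses outside the ::ffff:0:0/96 netblock, such as 2001:db8::1, pass through unmap_v6 unchanged in this sketch.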

--integer-sensors

Print the integer ID of the sensor rather than its name.

--integer-tcp-flags

Print the TCP flag fields (flags, initialFlags, sessionFlags) as an integer value. Typically, the characters F,S,R,P,A,U,E,C are used to represent the TCP flags.
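The relationship between the flag letters and an integer value can be illustrated as follows. The bit values are the standard TCP header flag bits; that the integer printed by rwaggbagcat is exactly this flags byte is an assumption of the sketch, and flags_to_int and int_to_flags are hypothetical helper names.

```python
# Standard TCP header flag bits, keyed by the letters listed above.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def flags_to_int(letters):
    """Combine flag letters (padding spaces ignored) into one integer."""
    value = 0
    for ch in letters:
        if ch.isspace():
            continue
        value |= FLAG_BITS[ch.upper()]
    return value


def int_to_flags(value):
    """Expand an integer into flag letters, in F,S,R,P,A,U,E,C order."""
    return "".join(ch for ch, bit in FLAG_BITS.items() if value & bit)


print(flags_to_int("SA"))   # SYN and ACK set
print(int_to_flags(27))     # FIN, SYN, PSH, ACK
```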

--no-titles

Turn off column titles. By default, titles are printed.

--no-columns

Disable fixed-width columnar output.

--column-separator=C

Use the specified character C between columns and after the final column. When this switch is not specified, the default of '|' is used.

--no-final-delimiter

Do not print the column separator after the final column. Normally a delimiter is printed.

--delimited
--delimited=C

Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.

--output-path=PATH

Write the textual output to PATH, where PATH is a filename, a named pipe, the keyword stderr to write the output to the standard error, or the keyword stdout or - to write the output to the standard output (and bypass the paging program). If PATH names an existing file, rwaggbagcat exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this option is not given, the output is either sent to the pager or written to the standard output.

--pager=PAGER_PROG

When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the --output-path switch is given or if the value of the pager is determined to be the empty string, no paging is performed and all output is written to the terminal.
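The precedence among --pager, SILK_PAGER, and PAGER can be written out as a small decision function. This is a reading of the rules stated here and in the ENVIRONMENT section, not code from SiLK; choose_pager and its parameters are hypothetical, and the environment is passed in as a plain dictionary.

```python
def choose_pager(cli_pager, env, output_path_given=False, stdout_is_tty=True):
    """Return the pager program to run, or None when output is not paged."""
    # Paging applies only to terminal output without --output-path.
    if output_path_given or not stdout_is_tty:
        return None
    if cli_pager is not None:          # --pager overrides everything
        candidate = cli_pager
    elif "SILK_PAGER" in env:          # SILK_PAGER overrides PAGER
        candidate = env["SILK_PAGER"]
    else:
        candidate = env.get("PAGER", "")
    return candidate or None           # an empty string disables paging


print(choose_pager(None, {"SILK_PAGER": "", "PAGER": "less"}))
print(choose_pager("more", {"SILK_PAGER": "less"}))
```

In the first call SILK_PAGER is set but empty, so paging is disabled even though PAGER names a program; in the second call the command-line value wins.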

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwaggbagcat searches for the site configuration file in the locations specified in the "FILES" section.

--help

Print the available options and exit.

--help-fields

Print the names and descriptions of the keys and counters that may be used in the --fields and --missing-field switches and exit. Since SiLK 3.22.0.

--version

Print the version number and information about how SiLK was configured, then exit the application.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

The formatting switches on rwaggbagcat are similar to those on the other SiLK tools.

Creating and printing an Aggregate Bag file

First, use rwaggbag(1) to create an Aggregate Bag file from the SiLK Flow file data.rw:

$ rwaggbag --key=sport,dport --counter=sum-pack,sum-byte \
       --output-path=ab.aggbag data.rw

To print the contents of the Aggregate Bag file:

$ rwaggbagcat ab.aggbag | head -4
sPort|dPort|    sum-packets|           sum-bytes|
    0|    0|          73452|             6169968|
    0|  769|          15052|              842912|
    0|  771|          14176|              793856|
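When the default columnar output is consumed by a script, the padding and the trailing delimiter have to be stripped. A minimal Python sketch, using the sample output above as input, might look like this; parse_columnar is a hypothetical helper, and all values are kept as strings.

```python
COLUMNAR_SAMPLE = """\
sPort|dPort|    sum-packets|           sum-bytes|
    0|    0|          73452|             6169968|
    0|  769|          15052|              842912|
    0|  771|          14176|              793856|
"""


def parse_columnar(text, sep="|"):
    """Split fixed-width, delimiter-terminated lines into dictionaries."""
    def cells(line):
        parts = [part.strip() for part in line.split(sep)]
        if parts and parts[-1] == "":
            parts.pop()  # drop the empty cell after the final delimiter
        return parts

    lines = [line for line in text.splitlines() if line.strip()]
    titles = cells(lines[0])
    return [dict(zip(titles, cells(line))) for line in lines[1:]]


for row in parse_columnar(COLUMNAR_SAMPLE):
    print(row["dPort"], row["sum-bytes"])
```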

Reordering the columns

Use the --fields switch (added in SiLK 3.22.0) to control the order of the columns in the output or to select only some columns:

$ rwaggbagcat --fields=dPort,sPort,sum-bytes ab.aggbag | head -4
dPort|sPort|           sum-bytes|
    0|    0|             6169968|
  769|    0|              842912|
  771|    0|              793856|

The --fields switch only changes the positions of the columns. The sPort field is still the primary key in the output shown above.

The --fields switch may also include fields that are not in the input. By default, rwaggbagcat prints an empty value for those fields, but the --missing-field switch may be used to display any string instead. The argument to --missing-field is FIELD=STRING where FIELD is one of the fields in --fields.

$ rwaggbagcat --fields=sipv4,proto,dport,sum-bytes \
       --missing=sipv4=n/a ab.aggbag | head -4
         sIPv4|pro|dPort|           sum-bytes|
           n/a|   |    0|             6169968|
           n/a|   |  769|              842912|
           n/a|   |  771|              793856|

Using --fields with IP addresses

When creating an Aggregate Bag file with the source IP address and protocol as keys, rwaggbagcat prints the columns in a different order depending on whether the address is treated as IPv4 or IPv6.

When the key is the source IPv4 address and the protocol, the Aggregate Bag is built with the source address as the primary key:

$ rwaggbag --key=sipv4,proto --counter=records data.rw         \
  | rwaggbagcat
         sIPv4|pro|   records|
   10.4.52.235|  6|         1|
  10.5.231.251|  6|         1|
   10.9.77.117|  6|         1|

Reading the same file but treating the data as IPv6 results in the protocol being the primary key:

$ rwaggbag --key=sipv6,proto --counter=records data.rw         \
  | rwaggbagcat
pro|                                  sIPv6|   records|
  1|                   ::ffff:10.40.151.242|         1|
  1|                   ::ffff:10.44.140.138|         1|
  1|                    ::ffff:10.53.204.62|         1|

In the latter case, the --fields switch may be used to display the source IPv6 address first. The switch changes only the positions of the columns; it does not reorder the entries (rows):

$ rwaggbag --key=sipv6,proto --counter=records data.rw         \
  | rwaggbagcat --fields=sipv6,proto,records
                                 sIPv6|pro|   records|
                  ::ffff:10.40.151.242|  1|         1|
                  ::ffff:10.44.140.138|  1|         1|
                   ::ffff:10.53.204.62|  1|         1|

Removing the columns or the title from the output

To produce comma-separated data:

$ rwaggbagcat --delimited=, ab.aggbag | head -4
sPort,dPort,sum-packets,sum-bytes
0,0,73452,6169968
0,769,15052,842912
0,771,14176,793856
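Output produced with --delimited=, can be read directly by a CSV parser. The following Python sketch loads the sample output above with the standard csv module; read_delimited is a hypothetical helper, and converting every cell to an integer works here only because all four columns in this sample are numeric.

```python
import csv
import io

DELIMITED_SAMPLE = """\
sPort,dPort,sum-packets,sum-bytes
0,0,73452,6169968
0,769,15052,842912
0,771,14176,793856
"""


def read_delimited(text, delimiter=","):
    """Parse delimited rwaggbagcat-style text into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # The title line supplies the dictionary keys.
    return [{name: int(value) for name, value in row.items()}
            for row in reader]


records = read_delimited(DELIMITED_SAMPLE)
total_bytes = sum(row["sum-bytes"] for row in records)
print(len(records), total_bytes)
```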

To remove the title:

$ rwaggbagcat --no-title ab.aggbag | head -4
    0|    0|          73452|             6169968|
    0|  769|          15052|              842912|
    0|  771|          14176|              793856|
    0| 2048|          14356|             1205904|

Customizing the IP and timestamp format

To change the format of IP addresses:

$ rwaggbag --key=sipv4,dipv4 --counter=sum-pack,sum-byte data.rw   \
  | rwaggbagcat --ip-format=decimal | head -4
     sIPv4|     dIPv4|    sum-packets|           sum-bytes|
 168047851|3232295339|            255|               18260|
 168159227|3232293505|            331|              536169|
 168381813|3232282689|            563|               55386|

To change the format of timestamps:

$ rwaggbag --key=stime,etime --counter=sum-pack,sum-byte data.rw   \
  | rwaggbagcat --timestamp-format=epoch | head -4
     sTime|     eTime|    sum-packets|           sum-bytes|
1234396802|1234396802|              2|                 259|
1234396802|1234398594|            526|               38736|
1234396803|1234396803|              9|                 504|

ENVIRONMENT

SILK_IP_FORMAT

This environment variable is used as the value for --ip-format when that switch is not provided.

SILK_TIMESTAMP_FORMAT

This environment variable is used as the value for --timestamp-format when that switch is not provided.

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_PAGER

When set to a non-empty string, rwaggbagcat automatically invokes this program to display its output a screen at a time. If set to an empty string, rwaggbagcat does not automatically page its output.

PAGER

When set and SILK_PAGER is not set, rwaggbagcat automatically invokes this program to display its output a screen at a time.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file switch when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwaggbagcat may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files and plug-ins, rwaggbagcat may use this environment variable. See the "FILES" section for details.

TZ

When the argument to the --timestamp-format switch includes local or when a SiLK installation is built to use the local timezone, the value of the TZ environment variable determines the timezone in which rwaggbagcat displays timestamps. (If both of those are false, the TZ environment variable is ignored.) If the TZ environment variable is not set, the machine's default timezone is used. Setting TZ to the empty string or 0 causes timestamps to be displayed in UTC. For system information on the TZ variable, see tzset(3) or environ(7). (To determine if SiLK was built with support for the local timezone, check the Timezone support value in the output of rwaggbagcat --version.)

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.

NOTES

The --fields, --missing-field, and --help-fields switches were added in SiLK 3.22.0.

rwaggbagcat and the other Aggregate Bag tools were introduced in SiLK 3.15.0.

SEE ALSO

rwaggbag(1), rwaggbagbuild(1), rwaggbagtool(1), silk(7), tzset(3), environ(7)