
About Consolidation

Attached, you will find a Perl script that creates two separate rrd's and generates a single graph based on both of them. Inline, you will find several constants to play with. The script fills both files with data generated by a loop. The base value is 2; each following data point is incremented by 0.1, so after 40 iterations the value has increased to 6.

Defining the first rrd file

First, let's define some constants needed for creating the rrd file:

#!/usr/bin/perl
use strict;
use warnings;
use RRDs;       # perl bindings shipped with rrdtool

# create first DB
# name of rrd file for test data
my $db1         = "/tmp/rrddemo1.rrd";
my $interval    = 300;          # time between two data points (pdp's)
my $heartbeat   = 2*$interval;  # heartbeat
my $xff         = 0.5;          # xfiles factor: fraction of a cdp's pdp's that may be unknown

The timespan for this file is computed dynamically from the current timestamp:

# last timestamp of rrd should equal the current time,
# rounded down to the last full interval
my $no_iter     = 40;
my $end         = time;         # current unix timestamp
$end            = $interval * int($end/$interval);
my $start       = $end - $no_iter * $interval;
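The rounding step above can be checked with a few lines of Python (illustrative only; the timestamp is an arbitrary example matching the sample output further down):

```python
# Round a unix timestamp down to the last full interval,
# mirroring the Perl line: $end = $interval * int($end/$interval)
interval = 300          # seconds between two pdp's
no_iter = 40

now = 1160827042        # an arbitrary "current" timestamp
end = interval * (now // interval)
start = end - no_iter * interval

print(end)              # 1160826900 -- a multiple of 300
print(start)            # 1160814900 -- exactly 40 intervals earlier
```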

By default, the file contains 2 rra's for each of the 4 consolidation functions (AVERAGE, MAX, MIN, LAST).

# define all consolidation functions to be used
my $CF1         = "AVERAGE";
my $CF2         = "MAX";
my $CF3         = "MIN";
my $CF4         = "LAST";

The first rra holds 5 data points (pdp's). The second one holds 9 consolidated data points (cdp's), each generated automatically by rrdtool from 5 pdp's. With 2 rra layouts for each of the 4 consolidation functions, you end up with 2*4=8 rra's.

# steps and rows
my $rra1step    = 1;                            # no of steps in rra 1
my $rra1rows    = 5;                            # no of pdp's in rra 1
my $rra2step    = 5;                            # no of steps (pdp's of rra 1) to form one cdp
my $rra2rows    = int($no_iter/$rra2step)+1;    # no of cdp's in rra 2
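The sizing arithmetic for the second rra, int(40/5)+1 = 9 rows, can be reproduced quickly (Python, illustrative):

```python
no_iter  = 40                        # updates written by the script
rra2step = 5                         # pdp's consolidated into one cdp in rra 2
rra2rows = no_iter // rra2step + 1   # rows needed to cover the whole run

print(rra2rows)   # 9 -- the "9 data points" mentioned above
```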

The rrd file is created by means of the Perl module RRDs:

RRDs::create($db1,
    "--step=" . $interval,
    "--start=" . ($start-10),
    "DS:demo:GAUGE:$heartbeat:U:U",         # define datasource ('demo' is an assumed name)
    "RRA:$CF1:$xff:$rra1step:$rra1rows",    # consolidation function 1, both rra's
    "RRA:$CF1:$xff:$rra2step:$rra2rows",
    "RRA:$CF2:$xff:$rra1step:$rra1rows",    # consolidation function 2
    "RRA:$CF2:$xff:$rra2step:$rra2rows",
    "RRA:$CF3:$xff:$rra1step:$rra1rows",    # consolidation function 3
    "RRA:$CF3:$xff:$rra2step:$rra2rows",
    "RRA:$CF4:$xff:$rra1step:$rra1rows",    # consolidation function 4
    "RRA:$CF4:$xff:$rra2step:$rra2rows",
    ) or die "Cannot create rrd ($RRDs::error)";

Defining the second rrd file

This rrd contains exactly one rra, with enough space for all (default: 40) data points generated by the script, so no consolidation takes place.

# create second DB
# it will hold all data in its first rra
# without consolidation
# (therefore it is much bigger than the first one)
# name of rrd file for test data
my $db2         = "/tmp/rrddemo2.rrd";

RRDs::create($db2,
    "--step=" . $interval,
    "--start=" . ($start-10),
    "DS:demo:GAUGE:$heartbeat:U:U",         # define datasource ('demo' is an assumed name)
    "RRA:$CF1:$xff:1:" . ($no_iter+1),      # consolidation function 1: one rra holding every pdp
    ) or die "Cannot create rrd ($RRDs::error)";

Running the Perl Script

You may run the script without any parameters. In this case, it will create the two rrd files, fill them, and generate one png file:

# generate rrd graph
my $graph       = "/tmp/rrddemo1.png";
# defines some constants for graphing
my $width       = 500;
my $height      = 180;
# filesizes for the comment lines below
my $db1size     = -s $db1;
my $db2size     = -s $db2;

RRDs::graph($graph,
    "--title=RRDtool Test: consolidation principles",
    "--start=" . $start,
    "--end=" . $end,
    "--width=" . $width,
    "--height=" . $height,
    "DEF:demo2=$db2:demo:AVERAGE",          # raw data ('demo' is the assumed ds name)
    "DEF:demo12=$db1:demo:MAX",             # consolidated data from the first rrd
    "COMMENT:raw data as follows, filesize=$db2size\\n",
    "LINE1:demo2#CCCCCC:RAW DATA, no consolidation\\n",
    "COMMENT:Consolidated data as follows, filesize=$db1size\\n",
    "LINE1:demo12#00FF00:CF=MAX equals CF=LAST in this case\\n",
#   "LINE1:demo14#000000:CF=LAST\\n",
       ) or die "graph failed ($RRDs::error)";

The result may be viewed in a browser, e.g.

firefox file:///tmp/rrddemo1.png

The result should be similar to:

(image: consolidation rrddemo1)

Discussing the results

One of the basic principles of rrd's is that they do not grow while storing additional data. Let us look at this more closely. Remember that the script increments the value by 0.1 for each data point, but the first rra holds only 5 data points, e.g. the values 2.0, 2.1, 2.2, 2.3, 2.4. What happens when the next value, 2.5, is added? This is where the consolidation functions come in, e.g. AVERAGE: in this case, the average of all 5 values (2.2) is stored in the second rra.

So the data is consolidated: only 1 consolidated data point is stored instead of the 5 originally entered ones. As a result, you will lose "some information". There is no way to tell that the average 2.2 was built from exactly those 5 values; it may just as well have been built from 1.0, 1.5, 2.2, 2.9, 3.4. This is why people often want to increase the size of the first rra to store more data points.

But remember, there are more consolidation functions. MAX yields 2.4 in the case above, MIN yields 2.0, and LAST also results in 2.4 (the last of the 5 primary data points). Even with all of them it is not possible to rebuild the originally entered data, but MIN, MAX, AVERAGE and even LAST together give you a good idea of it.
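The four consolidation functions applied to the 5 example values can be reproduced directly (a Python sketch of the arithmetic, not part of the original script):

```python
pdps = [2.0, 2.1, 2.2, 2.3, 2.4]                 # the 5 primary data points

cdp_avg  = round(sum(pdps) / len(pdps), 1)       # AVERAGE -> 2.2
cdp_max  = max(pdps)                             # MAX     -> 2.4
cdp_min  = min(pdps)                             # MIN     -> 2.0
cdp_last = pdps[-1]                              # LAST    -> 2.4

print(cdp_avg, cdp_max, cdp_min, cdp_last)
```

Only these four numbers survive in the second rra; the five individual values are gone.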

In the long run, this saves lots of disk space and is VERY fast in processing. And even if you "lose" the original data, you still see the range between MIN and MAX as well as the AVERAGE.

Using with cacti

To use this feature in cacti, you will have to modify your graph templates. Most of them contain line definitions based on AVERAGE. You may want to add another line using the consolidation function MAX or MIN. You won't notice any effect until you graph a time frame greater than about 2 days (the default span of the first rra). In this example, AVERAGEs were graphed using an AREA, whereas MAXimums use a LINE1 in a slightly darker shade of the corresponding color. This gives nice graphs even for the daily view, IMHO. The example uses an additional feature, a CDEF=CURRENT_DATA_SOURCE,-1,* to mirror outbound traffic to the negative side.
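The CDEF above is an RPN expression: each data point is multiplied by -1. A minimal Python sketch of such an evaluator (a hypothetical helper, just to illustrate the mirroring; rrdtool evaluates the real thing internally):

```python
def eval_rpn(expr, value):
    """Evaluate a tiny rrdtool-style RPN expression for one data point."""
    stack = []
    for tok in expr.split(","):
        if tok == "CURRENT_DATA_SOURCE":
            stack.append(value)         # substitute the current sample
        elif tok == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            stack.append(float(tok))    # a numeric literal like -1
    return stack[0]

# mirror an outbound sample of 42 bytes/s to the negative side
print(eval_rpn("CURRENT_DATA_SOURCE,-1,*", 42.0))   # -42.0
```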

(image: consolidation traffic)

Please notice that MAX does not always match AVERAGE, which is not that surprising from a mathematical point of view. AVERAGEs show volume-based information whereas MAXimums show peak usage. Both pieces of information are useful.
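Why MAX and AVERAGE diverge is easy to see with bursty data (illustrative numbers, not taken from the script):

```python
# a mostly quiet interval with one short traffic burst (bytes/s)
samples = [0, 0, 0, 0, 100]

avg  = sum(samples) / len(samples)   # AVERAGE: volume over time
peak = max(samples)                  # MAX: peak usage

print(avg, peak)   # 20.0 100 -- the average hides the burst completely
```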

See the script working

If you would like to see what's going on when running the script, you may call it with the verbose parameter:

perl <scriptname> verbose | more

Then it will produce output like:

RRD definitions: Start: 1160814900, End: 1160826900, Updates every: 300

update: 1160814900:2
update: 1160815200:2.1
update: 1160815500:2.2
update: 1160815800:2.3
update: 1160816100:2.4

update: 1160816400:2.5
update: 1160816700:2.6
update: 1160817000:2.7
update: 1160817300:2.8
update: 1160817600:2.9
update: 1160817900:3
update: 1160818200:3.1
...
update: 1160826300:5.8
update: 1160826600:5.9
update: 1160826900:6
Last 5 minutes CF AVERAGE:
1160825400: 5.6
1160825700: 5.7
1160826000: 5.8
1160826300: 5.9
1160826600: 6
Last 6*5 minutes CF AVERAGE:
1160817900: 3
1160819400: 3.5
1160820900: 4
1160822400: 4.5
Last 30 minutes CF LAST:
1160817900: 3.2
1160819400: 3.7
1160820900: 4.2
1160822400: 4.7
1160823900: 5.2
1160825400: 5.7
1160826900: N/A
Filesize of rrdfile 1 at /tmp/rrddemo1.rrd: 2336
Filesize of rrdfile 2 at /tmp/rrddemo2.rrd: 864

Note: in this very case, the filesize of the rrd using consolidation is bigger, but for real-world rrd's it is the other way round. Now you may study all rrd file values in detail.
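Why is the consolidating rrd the bigger one here? It stores 8 rra's (4 CFs times 2 rra's each) plus their per-rra headers, while the raw rrd has a single rra. Counting rows makes this plausible (Python, a rough model that ignores the fixed header overhead):

```python
rra1rows, rra2rows = 5, 9   # rows per rra layout in the first rrd
cfs = 4                     # AVERAGE, MAX, MIN, LAST

rows_db1 = cfs * (rra1rows + rra2rows)   # 8 rra's in the consolidating rrd
rows_db2 = 41                            # one rra holding every update (no_iter+1)

print(rows_db1, rows_db2)   # 56 41 -- db1 stores more rows despite consolidating
```

With long-running real-world data the picture flips: the raw rra would need one row per update forever, while the consolidated rra's stay at a fixed size.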
