A Netdata json formatter for storing metrics in Timescale.
Added tag v0.3.0 for changeset 1f09cfb560e0
Added tag v0.2.0 for changeset 96b8799a565a

heads

tip
browse log
v0.3.0
browse .tar.gz

clone

read-only
https://hg.sr.ht/~mahlon/netdata-timescale-relay
read/write
ssh://hg@hg.sr.ht/~mahlon/netdata-timescale-relay

#Netdata-TSRelay

#What's this?

This program is designed to accept JSON streams from Netdata clients, and write metrics to a PostgreSQL table - specifically, Timescale backed tables (though that's not technically a requirement.)

#Installation

You'll need a working Nim build environment and PostgreSQL development headers to compile the binary.

Simply run make to build it. Put it wherever you please.

#Configuration

There are a few assumptions that should be satisfied before running this successfully.

#Database setup

You'll need to create the destination table.

CREATE TABLE netdata (
	time timestamptz default now() not null,
	host text not null,
	metrics jsonb default '{}'::jsonb not null
);

Index it based on how you intend to query the data, including JSON functional indexing, etc. See PostgreSQL documentation for details.

Strongly encouraged: Promote this table to a Timescale "hypertable". See Timescale docs for that, but a quick example to partition automatically at weekly boundaries would look something like this, if you're running v0.9.0 or better:

SELECT create_hypertable( 'netdata', 'time', migrate_data => true, chunk_time_interval => '1 week'::interval );

Timescale also has some great examples and advice for efficient JSON indexing and queries.

#Netdata

You'll likely want to pare down what netdata is sending. Here's an example configuration for exporting.conf -- season this to taste (what charts to send and frequency.)

Note: This example uses the "exporting" module introduced in Netdata v1.23. If your netdata is older than that, you'll be using the deprecated "backend" instead in the main netdata.conf file.

[exporting:global]
	enabled = yes

[json:timescale]
	hostname             = your-hostname
	enabled              = yes
	data source          = average
	destination          = localhost:14866
	prefix               = netdata
	update every         = 10
	buffer on failures   = 10
	send charts matching = !cpu.cpu* !ipv6* !users.* nfs.rpc net.* net_drops.* net_packets.* !system.interrupts* system.* disk.* disk_space.* disk_ops.* mem.*

#Running the Relay

#Options

  • [-q|--quiet]: Quiet mode. No output at all. Ignored if -d is supplied.
  • [-d|--debug]: Debug mode. Show incoming data.
  • [-D|--dropconn]: Drop the TCP connection to netdata between samples. This may be more efficient depending on your environment and number of clients. Defaults to false.
  • [-o|--dbopts]: PostgreSQL connection information. (See below for more details.)
  • [-h|--help]: Display quick help text.
  • [-a|--listen-addr]: A specific IP address to listen on. Defaults to INADDR_ANY.
  • [-p|--listen-port]: The port to listen for netdata JSON streams. Default is 14866.
  • [-P|--persistent]: Don't disconnect from the database between samples. This may be more efficient with a small number of clients, when not using a pooler, or with a very high sample size/rate. Defaults to false.
  • [-T|--dbtable]: Change the table name to insert to. Defaults to netdata.
  • [-t|--timeout]: Maximum time in milliseconds to wait for data. Slow connections may need to increase this from the default 500 ms.
  • [-v|--version]: Show version.

Notes

Nim option parsing might be slightly different than what you're used to. Flags that require arguments must include an '=' or ':' character.

  • --timeout=1000 valid
  • --timeout:1000 valid
  • -t:1000 valid
  • --timeout 1000 invalid
  • -t 1000 invalid

All database connection options are passed as a key/val string to the dbopts flag. The default is:

"host=localhost dbname=netdata application_name=netdata-tsrelay"

... which uses the default PostgreSQL port, and connects as the running user.

Reference the PostgreSQL Documentation for all available options (including how to store passwords in a separate file, enable SSL mode, etc.)

#Daemonizing

Use a tool of your choice to run this at system startup in the background. My personal preference is daemontools, but I won't judge you if you use something else.

Here's an example using the simple daemon wrapper tool:

# daemon \
	-o /var/log/netdata_tsrelay.log \
	-p /var/run/netdata_tsrelay.pid \
	-u nobody -cr \
	/usr/local/bin/netdata_tsrelay \
		--dbopts="dbname=metrics user=metrics host=db-master port=6432 application_name=netdata-tsrelay"

#Scaling

Though performant by default, if you're going to be storing a LOT of data (or have a lot of netdata clients), here are some suggestions for getting the most bang for your buck:

  • Use the pgbouncer connection pooler.
  • DNS round robin the hostname where netdata_tsrelay lives across N hosts -- you can horizontally scale without any gotchas.
  • Edit your netdata.conf file to only send the metrics you are interested in.
  • Decrease the frequency at which netdata sends its data. (When in "average" mode, it averages over that time automatically.)
  • Use Timescale hypertables.
  • Add database indexes specific to how you intend to consume the data.
  • Use the PostgreSQL JSON Operators, which take advantage of GIN indexing.
  • Put convenience SQL VIEWs around the data you're fetching later, for easier graph building with Grafana (or whatever.)