Riak KV monitoring with Netdata

Collects database stats from Riak KV's /stats endpoint.
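As a rough illustration of what reading this endpoint involves, the sketch below fetches the JSON document and keeps a few numeric values. It is not the module's actual code: the helper names `fetch_stats` and `pick` are hypothetical, and the metric keys in the usage comment (`node_gets`, `node_puts`, `sys_process_count`) are assumed to be present in your Riak version's output.

```python
import json
from urllib.request import urlopen

# Default stats URL, as stated in the Configuration section of this page.
DEFAULT_URL = "http://localhost:8098/stats"


def fetch_stats(url=DEFAULT_URL, timeout=5):
    """Return the stats endpoint's JSON body as a dict (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def pick(stats, keys):
    """Keep only the requested keys that exist and hold numeric values."""
    result = {}
    for key in keys:
        value = stats.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            result[key] = value
    return result


# Example usage against a live node (key names are assumptions):
#   print(pick(fetch_stats(), ["node_gets", "node_puts", "sys_process_count"]))
```

Filtering to numeric values matters because the endpoint also returns strings (node names, versions) that cannot be charted.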

Requirements

The following charts are included; most are derived from the metrics documented by Riak.

  1. Throughput in operations/s
  • KV operations

    • gets
    • puts
  • Data type updates

    • counters
    • sets
    • maps
  • Search queries

    • queries
  • Search documents

    • indexed
  • Strong consistency operations

    • gets
    • puts
  2. Latency in milliseconds
  • KV latency of the past minute

    • get (mean, median, 95th / 99th / 100th percentile)
    • put (mean, median, 95th / 99th / 100th percentile)
  • Data type latency of the past minute

    • counter_merge (mean, median, 95th / 99th / 100th percentile)
    • set_merge (mean, median, 95th / 99th / 100th percentile)
    • map_merge (mean, median, 95th / 99th / 100th percentile)
  • Search latency of the past minute

    • query (median, min, max, 95th / 99th percentile)
    • index (median, min, max, 95th / 99th percentile)
  • Strong consistency latency of the past minute

    • get (mean, median, 95th / 99th / 100th percentile)
    • put (mean, median, 95th / 99th / 100th percentile)
  3. Erlang VM metrics
  • System counters

    • processes
  • Memory allocation in MB

    • processes.allocated
    • processes.used
  4. General load / health metrics
  • Siblings encountered in KV operations during the past minute

    • get (mean, median, 95th / 99th / 100th percentile)
  • Object size in KV operations during the past minute in KB

    • get (mean, median, 95th / 99th / 100th percentile)
  • Message queue length in unprocessed messages

    • vnodeq_size (mean, median, 95th / 99th / 100th percentile)
  • Index operations encountered by Search

    • errors
  • Protocol buffer connections

    • active
  • Repair operations coordinated by this node

    • read
  • Active finite state machines by kind

    • get
    • put
    • secondary_index
    • list_keys
  • Rejected finite state machines

    • get
    • put
  • Writes to Search that failed due to bad data format, by reason

    • bad_entry
    • extract_fail
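The units in the chart list above imply some conversion from raw values. The following is a minimal sketch of that arithmetic, assuming that Riak reports latencies in microseconds and memory in bytes, that "MB" means 1024 * 1024 bytes, and that throughput is derived from two successive samples of a cumulative counter. The function names are hypothetical, and the module's own calculations may differ.

```python
def us_to_ms(microseconds):
    """Convert a latency from microseconds (assumed raw unit) to milliseconds."""
    return microseconds / 1000.0


def bytes_to_mb(n_bytes):
    """Convert bytes to megabytes, taking 1 MB as 1024 * 1024 bytes."""
    return n_bytes / (1024 * 1024)


def rate_per_second(prev_total, curr_total, interval_s):
    """Operations per second between two samples of a cumulative counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # The counter went backwards, e.g. after a node restart; report no
        # throughput for this interval rather than a negative rate.
        return 0.0
    return delta / interval_s
```

For example, a counter that moves from 100 to 140 over a 2-second interval corresponds to 20 operations/s.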

Configuration

Edit the python.d/riakkv.conf configuration file using edit-config from the Netdata config directory, which is typically at /etc/netdata.

cd /etc/netdata # Replace this path with your Netdata config directory, if different
sudo ./edit-config python.d/riakkv.conf

The module needs to be passed the full URL to Riak's stats endpoint. For example:

myriak:
  url: 'http://myriak.example.com:8098/stats'

With no explicit configuration given, the module will attempt to connect to http://localhost:8098/stats.
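The fallback behaviour described above can be sketched as follows. This is an illustration only, assuming the configuration file has already been parsed into a dict that contains nothing but job sections; `resolve_jobs` and the job name `local` are hypothetical and not taken from the module.

```python
DEFAULT_URL = "http://localhost:8098/stats"


def resolve_jobs(config):
    """Map each job name to the stats URL it should poll.

    `config` is the parsed configuration as a dict of job sections
    (an assumption of this sketch).
    """
    if not config:
        # No explicit configuration: try the local node.
        # The job name "local" is a placeholder chosen for this sketch.
        return {"local": DEFAULT_URL}
    # A job without a url key (or with an empty body) falls back to the default.
    return {name: (job or {}).get("url", DEFAULT_URL)
            for name, job in config.items()}
```

With the example job above, `resolve_jobs({"myriak": {"url": "http://myriak.example.com:8098/stats"}})` maps `myriak` to that URL, while an empty configuration yields the localhost default.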

The plugin's default update frequency is 2 seconds because Riak refreshes its metrics internally once per second. Collecting every second as well would produce odd jitter in the resulting graphs.
