Get started with Graphite

Graphite is a third-party monitoring application that stores real-time metrics and provides customizable ways to view them. Puppet Enterprise (PE) can export many metrics to Graphite. After enabling Graphite support, Puppet Server exports a set of metrics by default that is designed to be immediately useful to Puppet administrators.

Restriction: Graphite setups are deeply customizable and can report many different Puppet Server metrics on demand; however, this requires considerable configuration and additional server resources. Furthermore, the grafanadash and puppet-graphite modules are not Puppet-supported.

We recommend using another method to view and manage Puppet Server metrics, such as the puppet_operational_dashboards module, our Splunk plugin, or the Metrics API.

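To use Graphite with PE, you must set up a Graphite server (for example, by using the grafanadash module on a dedicated agent) and then enable Puppet Server's Graphite support.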

Use the grafanadash module

Grafana provides a web-based, customizable, Graphite-compatible dashboard. The grafanadash module installs and configures a basic Graphite test instance with the Grafana extension. When installed on a Puppet agent, this module demonstrates how Graphite and Grafana can consume and display Puppet Server metrics.

CAUTION: The grafanadash module is not a Puppet-supported module. It is for testing and demonstration purposes only, is considered insecure, and is tested against CentOS 7 only. Install this module only on a dedicated agent. Do not install the grafanadash module on your primary server. This module makes the following security policy changes, which are inappropriate for a primary server:
  • SELinux can cause issues with Graphite and Grafana, so the module temporarily disables SELinux. If you reboot the machine after using the grafanadash module to install Graphite, you must disable SELinux again and restart the Apache service to use Graphite and Grafana.
  • The module disables the iptables firewall and enables cross-origin resource sharing on Apache, which are potential security risks.
For these reasons, we recommend using another method to view and manage Puppet Server metrics, such as the puppet_operational_dashboards module or the Metrics API.

Install the grafanadash module

Install the grafanadash module on a dedicated *nix agent. The module's grafanadash::dev class installs and configures a Graphite server, the Grafana extension, and a default dashboard.

  1. Install a dedicated *nix PE agent to serve as the Graphite server. For instructions, refer to Installing agents.
  2. As root on the agent node, run: sudo puppet module install puppetlabs-grafanadash
  3. As root on the agent node, run: sudo puppet apply -e 'include grafanadash::dev'

Run Grafana

Grafana runs as a web-based dashboard, and the grafanadash module configures it to use port 10000 by default. To view Puppet Server metrics in Grafana, you must configure a metrics dashboard.

Grafana does not display Puppet metrics by default. You must create a metrics dashboard or edit and import a JSON-based dashboard, such as our sample metrics dashboard JSON file.
Tip: You can also use the puppet_operational_dashboards module to visualize Puppet Server metrics.
  1. Open a web browser on a computer that can reach your grafanadash agent node and navigate to http://<AGENT_HOSTNAME>:10000.
    You'll see a test screen indicating whether Grafana can successfully connect to your Graphite server.
    If Grafana is configured to use a hostname that your current computer can't resolve, click View details and go to the Requests tab to determine the hostname Grafana is trying to use. Then add the IP address and hostname to the hosts file.
    • On *nix and macOS agents, the file is located at: /etc/hosts
    • On Windows agents, the file is located at: C:\Windows\system32\drivers\etc\hosts
  2. Download the sample metrics dashboard JSON file, save the file as sample_metrics_dashboard.json, and open it in a text editor on the same computer you're using to access Grafana.
  3. Throughout the file, replace primary.example.com with the hostname of your primary server.
    Important: The hostname value must also be used as the metrics_server_id value when you Enable Puppet Server's Graphite support.
  4. Save the file.
  5. In the Grafana UI, click Search (Folder icon) > Import > Browse, then select your sample_metrics_dashboard.json file.
Results
This loads a dashboard with nine graphs that display various metrics exported from the Puppet Server to the Graphite server. However, these graphs remain empty until you Enable Puppet Server's Graphite support. For information about the aspects of the sample dashboard, refer to Sample Grafana dashboard graphs.
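
Steps 2 through 4 amount to a plain text substitution, so they can be scripted if you need to repeat them. The following is a minimal sketch, not part of the grafanadash module. The function names and the hostname puppet.internal.example are hypothetical, and the sketch assumes the sample file is valid JSON that uses primary.example.com only as the placeholder hostname.

```python
import json
from pathlib import Path

# Placeholder hostname used throughout the sample dashboard file.
PLACEHOLDER = "primary.example.com"


def retarget_dashboard(dashboard_text: str, server_id: str) -> str:
    """Replace every occurrence of the placeholder hostname with server_id."""
    updated = dashboard_text.replace(PLACEHOLDER, server_id)
    # Parse the result so a malformed file fails here instead of during the Grafana import.
    json.loads(updated)
    return updated


def retarget_file(path: str, server_id: str) -> None:
    """Rewrite the dashboard file at path in place (hypothetical helper)."""
    file = Path(path)
    text = file.read_text(encoding="utf-8")
    file.write_text(retarget_dashboard(text, server_id), encoding="utf-8")
```

For example, calling retarget_file("sample_metrics_dashboard.json", "puppet.internal.example") would update the file before you import it. Use the same value for metrics_server_id.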

Enable Puppet Server's Graphite support

Use the PE Master node group in the Puppet Enterprise (PE) console to configure Puppet Server's metrics output settings.

  1. In the PE console, go to Node groups > PE Infrastructure > PE Master.
  2. On the Classes tab, locate the puppet_enterprise::profile::master class, and add these parameters:
    1. Set metrics_graphite_enabled to true (the default is false).
    2. Set metrics_server_id to the primary server hostname.
    3. Set metrics_graphite_host to the hostname of the agent node where you're running Graphite and Grafana.
    4. Set metrics_graphite_update_interval_seconds to an integer representing a number of seconds. This is the frequency at which Graphite updates, and the default value is 60 seconds.
  3. Verify that these parameters are set to their default values, unless your Graphite server uses a non-standard port:
    1. Confirm metrics_jmx_enabled is set to true.
    2. Confirm metrics_graphite_port is set to 2003 or the Graphite port on your Graphite server.
    3. Confirm profiler_enabled is set to true.
  4. Commit changes.
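
Before waiting for Puppet Server's first update, it can help to confirm that the Graphite node accepts data on the configured port. The sketch below is a hedged connectivity check, not a PE feature. It assumes the Graphite server accepts Graphite's plaintext protocol (one "path value timestamp" line per data point) on the port set in metrics_graphite_port. The metric path test.connectivity and the function names are hypothetical.

```python
import socket
import time


def format_metric(path: str, value: float, timestamp: int) -> bytes:
    """Build one plaintext-protocol line: '<path> <value> <timestamp>' plus a newline."""
    return f"{path} {value} {timestamp}\n".encode("ascii")


def send_test_metric(host: str, port: int = 2003, timeout: float = 5.0) -> None:
    """Send one throwaway data point; raises OSError if the port cannot be reached."""
    line = format_metric("test.connectivity", 1, int(time.time()))
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line)
```

Calling send_test_metric with the value you set for metrics_graphite_host should return without an error if the port is reachable. A successful send shows only that the port accepts connections, not that Puppet Server metrics are arriving.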

Sample Grafana dashboard graphs

In the Run Grafana steps, you used a JSON file to set up a sample Grafana dashboard. You can customize this dashboard by clicking the title of any graph and clicking Edit.

Each entry below gives the graph name, followed by its description.

Active requests: This graph serves as a "health check" for Puppet Server. It shows a flat line representing the number of CPUs in your system, a metric indicating the total number of HTTP requests the server is actively processing at any moment, and a rolling average of the number of active requests.

If the number of requests being processed exceeds the number of CPUs for any significant length of time, your server might be receiving more requests than it can efficiently process.

Request durations: This graph breaks down the average response times for different types of requests made by Puppet agents. This indicates how expensive catalog and report requests are compared to the other types of requests. It also provides a way to see changes in catalog compilation times when you modify your Puppet code.

A sharp upward curve for all request types indicates an overloaded server. Expect these to trend downward after the server load is reduced.

Request ratios: This graph shows how many requests of each type Puppet Server has handled. Under normal circumstances, you'll see about the same number of catalog, node, and report requests, because these all happen once per agent run. The number of file and file metadata requests correlates with how many remote file resources are in the agents' catalogs.

External HTTP Communications: This graph tracks the amount of time it takes Puppet Server to send data and requests for common operations to, and receive responses from, external HTTP services, such as PuppetDB.

File Sync: This graph tracks how long Puppet Server spends on File Sync operations, for both its storage and client services.

JRubies: This graph tracks how many JRubies are in use, how many are free, the mean number of free JRubies, and the mean number of requested JRubies.

If the number of free JRubies is often less than one, or the mean number of free JRubies is less than one, Puppet Server is requesting and consuming more JRubies than are available. This overload reduces Puppet Server's performance. While this might simply be a symptom of an under-resourced server, it can also be caused by poorly optimized Puppet code or, if PuppetDB is in use, by bottlenecks in the server's communications with PuppetDB.

If catalog compilation times have increased but PuppetDB performance remains the same, examine your Puppet code for potentially unoptimized code. If PuppetDB communication times have increased, tune PuppetDB for better performance or allocate more resources to it.

If neither catalog compilation nor PuppetDB communication times are degraded, the Puppet Server process might be under-resourced on your server. If you have available CPU time and memory, increase the JRuby max active instances to allow it to allocate more JRubies. Otherwise, consider adding compilers to distribute the catalog compilation load.

JRuby Timers: This graph tracks these JRuby pool metrics:
  • Borrow time: The mean amount of time that Puppet Server uses (or "borrows") each JRuby from the pool.
  • Wait time: The total amount of time that Puppet Server waits for a free JRuby instance.
  • Lock held time: The amount of time that Puppet Server holds a lock on the pool, during which JRubies cannot be borrowed. This occurs while Puppet Server synchronizes code for File Sync.
  • Lock wait time: The amount of time that Puppet Server waits to acquire a lock on the pool.
These metrics help identify sources of potential JRuby allocation bottlenecks.

Memory Usage: This graph tracks how much heap and non-heap memory Puppet Server uses.

Compilation: This graph breaks catalog compilation down into various phases to show how expensive each phase is on the primary server.
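
The guidance for the Active requests and JRubies graphs can be expressed as two small checks. This is an illustrative sketch only. The function names are hypothetical, and the five-sample threshold is an assumed stand-in for "any significant length of time", which this page does not quantify. The first function expects a list of sampled active-request counts, with None marking a missing sample.

```python
from typing import Optional, Sequence


def sustained_overload(
    active_requests: Sequence[Optional[float]],
    num_cpus: int,
    min_samples: int = 5,  # assumed threshold; tune to your sampling interval
) -> bool:
    """Return True if active requests exceed num_cpus for min_samples consecutive samples."""
    streak = 0
    for value in active_requests:
        if value is not None and value > num_cpus:
            streak += 1
            if streak >= min_samples:
                return True
        else:
            # A missing sample or a value at or below the CPU count resets the streak.
            streak = 0
    return False


def jruby_triage(compile_time_up: bool, puppetdb_time_up: bool, spare_cpu_and_memory: bool) -> str:
    """Map the observations described for the JRubies graph to a suggested next step."""
    if puppetdb_time_up:
        return "tune PuppetDB or allocate more resources to it"
    if compile_time_up:
        return "review Puppet code for unoptimized code"
    if spare_cpu_and_memory:
        return "increase JRuby max active instances"
    return "add compilers"
```

The triage function applies only when the JRubies graph already shows a shortage of free JRubies. It encodes the order of checks described above and does not measure anything itself.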

Example Grafana dashboard excerpt

The following example shows only the targets parameter of a dashboard. It demonstrates:
  • The full names of Puppet's exported Graphite metrics
  • A way to add targets directly to an exported Grafana dashboard's JSON content
This example assumes the Puppet Server instance has a domain of primary.example.com.
"panels": [
    {
        "span": 4,
        "editable": true,
        "type": "graphite",

...

        "targets": [
            {
                "target": "alias(puppetlabs.primary.example.com.num-cpus,'num cpus')"
            },
            {
                "target": "alias(puppetlabs.primary.example.com.http.active-requests.count,'active requests')"
            },
            {
                "target": "alias(puppetlabs.primary.example.com.http.active-histo.mean,'average')"
            }
        ],
        "aliasColors": {},
        "aliasYAxis": {},
        "title": "Active Requests"
    }
]

Refer to the Grafana dashboard JSON sample file for a complete, detailed example of how a Grafana dashboard accesses these exported Graphite metrics.
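
If you generate dashboard JSON for several servers, the target strings in the excerpt can be built from the metrics_server_id value. The following sketch reproduces only the three targets shown above. The function name is hypothetical, and the metric names are copied from the excerpt, not from a full list of exported metrics.

```python
import json


def active_request_targets(server_id: str) -> list:
    """Build the three 'Active Requests' targets for the given metrics_server_id."""
    prefix = f"puppetlabs.{server_id}"
    # (metric suffix, alias shown in the graph legend), as in the excerpt above.
    metrics = [
        ("num-cpus", "num cpus"),
        ("http.active-requests.count", "active requests"),
        ("http.active-histo.mean", "average"),
    ]
    return [{"target": f"alias({prefix}.{name},'{label}')"} for name, label in metrics]


# Print the list in the same shape as the "targets" parameter in the excerpt.
print(json.dumps(active_request_targets("primary.example.com"), indent=4))
```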