Guidelines for running scans at scale

You can run Puppet Comply to scan a maximum of 5000 nodes in a batch. Before you run Comply at scale, review the guidelines for configuring the environment and running the scan. The process of running Comply at scale was tested at Puppet Labs. Because many factors affect performance, results in your system environment might vary.

Configure the scan process

To help optimize the scan process, follow these guidelines:
  • In Puppet orchestrator, ensure that the task_concurrency parameter is set to the default value of 250. This value sets the maximum number of task or plan actions that can run concurrently in the orchestrator. If you set the parameter to 250 and run a scan of 5000 nodes, the orchestrator will be fully consumed until the scans are completed on all 5000 nodes. (For more information about optimizing performance, see Tune task and plan performance in Puppet Enterprise (PE).)
  • Schedule scans during periods of low activity to help ensure adequate network throughput.
  • Plan adequate time for the initial inventory ingestion from Puppet Enterprise (PE). In lab testing, the ingestion of 5000 nodes took 2.5 minutes.
  • Plan adequate time for the scan. In lab testing, a scan of 5000 nodes took 50 minutes.
  • Configure scans in batches of up to 5000 nodes.
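
The batch and concurrency figures above can be sketched as a small planning calculation. This is a minimal illustration only, not part of Comply or the orchestrator: the node names and the helper functions plan_batches and concurrency_rounds are hypothetical, and treating the orchestrator as working through nodes in discrete rounds of 250 is a simplifying assumption.

```python
import math

MAX_BATCH_SIZE = 5000    # maximum nodes per scan batch (from the guidelines)
TASK_CONCURRENCY = 250   # default value of the orchestrator task_concurrency parameter


def plan_batches(nodes, batch_size=MAX_BATCH_SIZE):
    """Split a list of node names into consecutive batches of at most batch_size."""
    return [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]


def concurrency_rounds(node_count, concurrency=TASK_CONCURRENCY):
    """Approximate number of rounds of concurrent tasks needed to cover node_count nodes.

    Assumption: the orchestrator is modeled as running in discrete rounds.
    """
    return math.ceil(node_count / concurrency)


# Hypothetical inventory of 12000 nodes, used only for illustration.
nodes = [f"node{i:05d}.example.com" for i in range(12000)]
batches = plan_batches(nodes)

print([len(batch) for batch in batches])     # [5000, 5000, 2000]
print(concurrency_rounds(len(batches[0])))   # 20
```

Under these assumptions, a full batch of 5000 nodes occupies all 250 concurrent slots for about 20 rounds, which is why the orchestrator is unavailable for other tasks until the scan finishes.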

Upgrade Comply in a large-scale environment

Before you upgrade Comply in an environment with thousands of nodes, review the limitations and consider the best strategy for your environment.

During the standard upgrade process, a new version of the CIS-CAT Pro Assessor is downloaded to each Puppet-managed node. However, Comply supports a limited number of concurrent downloads of the assessor. In lab testing, a maximum of about 120 concurrent downloads was achieved. Thus, if you initiate an upgrade of thousands of nodes, not all nodes will be updated on the first run.

You can resolve the issue in one of the following ways:
  • Run Puppet manually on a maximum of 120 nodes. Repeat the process until all nodes are updated.
  • Configure Comply to host the assessor file on an internal web server and then upgrade Comply.
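
For the first option, the number of manual runs can be estimated by splitting the node list into groups of at most 120. The following is a minimal sketch under stated assumptions: the helper upgrade_groups and the node names are hypothetical, and 120 is the approximate limit observed in lab testing rather than a guaranteed value.

```python
from itertools import islice

MAX_CONCURRENT_DOWNLOADS = 120  # approximate limit observed in lab testing


def upgrade_groups(nodes, per_run=MAX_CONCURRENT_DOWNLOADS):
    """Yield successive groups of at most per_run nodes for manual Puppet runs."""
    iterator = iter(nodes)
    while True:
        group = list(islice(iterator, per_run))
        if not group:
            return
        yield group


# Hypothetical inventory of 5000 nodes, used only for illustration.
nodes = [f"node{i:04d}.example.com" for i in range(5000)]
groups = list(upgrade_groups(nodes))

print(len(groups))      # 42 manual runs
print(len(groups[-1]))  # 80 nodes in the final run
```

At roughly 42 manual runs for 5000 nodes, hosting the assessor file internally is likely the more practical option in large environments.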

To host the assessor file internally and upgrade Comply, complete the following steps:
  1. In the Puppet Enterprise (PE) console, click Node Groups > PE Infrastructure > PE Agent > Classes.
  2. In the Add new class field, select the Comply class.
  3. In the Parameter name field, select scanner_source.
  4. Set the value of the scanner_source parameter to the URL where the assessor will be hosted. For example, the URL can have the following structure, where server-hosting-assessor-ip specifies the IP address of the server that will host the assessor:
  5. Commit the changes.
  6. In the PE console, click Run > Puppet.
  7. Complete the upgrade process by selecting the relevant nodes and running the job.

Optimize scanning and reporting at scale

You can compare the results of your scanning and reporting processes against the results obtained in lab testing. If performance is not adequate in your environment, determine the cause of bottlenecks and address the issues.

In lab testing, with no other tasks running, the average run times for scans were as follows:
  • A scan of 1000 nodes required about 10 minutes.
  • A scan of 2000 nodes required about 20 minutes.
  • A scan of 5000 nodes required about 50 minutes.
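
The lab figures above work out to about 1 minute per 100 nodes. As a rough aid for comparing your own results, the following sketch derives that rate from the three lab data points and reports how much slower an observed scan was. The function names and the observed time of 45 minutes are hypothetical, and linear scaling is an assumption that holds only within the tested range of up to 5000 nodes.

```python
# Lab results from the list above: number of nodes -> scan time in minutes.
LAB_RESULTS = {1000: 10, 2000: 20, 5000: 50}


def estimate_scan_minutes(node_count, results=LAB_RESULTS):
    """Estimate scan time by scaling the aggregate lab rate linearly (an assumption)."""
    total_nodes = sum(results.keys())
    total_minutes = sum(results.values())
    return node_count * total_minutes / total_nodes


def slowdown_factor(node_count, observed_minutes):
    """Ratio of an observed scan time to the lab-based estimate."""
    return observed_minutes / estimate_scan_minutes(node_count)


print(estimate_scan_minutes(3000))   # 30.0
print(slowdown_factor(3000, 45))     # 1.5 (hypothetical observed time of 45 minutes)
```

A factor well above 1 does not by itself indicate a problem, because the lab runs had no other tasks competing for the orchestrator.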

The run times for scans are affected by the host type. In general, scans on Microsoft Windows systems take longer than scans on *nix systems. Run times can also vary significantly depending on many other factors. For example, run times are longer for nodes with many user accounts and for nodes with many types of software installed. Because the lab tests ran with no other tasks, the lab results represent a best-case scenario.

The time required to generate and load a report increases with the number of nodes scanned. In lab testing, the initial inventory ingestion from PE took 2.5 minutes for 5000 nodes. The process of loading a report for 5000 nodes took 30 seconds to 3 minutes. The average report size was 5 MB.

To help understand performance issues, you can analyze log files. For more information, see Access logs.