Running Continuous Delivery for Puppet Enterprise Impact Analysis at scale

About this document

Product and version: Continuous Delivery for Puppet Enterprise (PE) 4.x
Document version: 1.1
Updated: 02 November 2022 (first published February 2022)

Introduction

Impact Analysis (IA) is one of the main features of Continuous Delivery for PE. It scans your Puppet catalogs and reports which nodes could be affected by a new code change, without actually applying the change.

This article gives an overview of the common use cases for IA so that you can decide which approach works best for you.

If you are new to IA, please take a look at the Continuous Delivery for PE configuration documentation.

Based on direct customer feedback we have identified these three common scenarios for IA:

- When IA results are needed once in a while
- When IA results are needed often
- When IA results are always needed and are an essential part of your pipeline

Scenario: When IA results are needed once in a while

You might have a pipeline running in Continuous Delivery for PE in which IA is not a stage, or you might need IA results only when a major change is about to be released.

Recommendation: Run IA on demand

If no one reads or requires the IA results frequently, you might not need to run IA automatically as a stage in your pipeline. Save resources by running it on demand, whenever you need the results.

Scenario: When IA results are needed often

In this scenario, you might want an IA report a couple of times per week. You might even have a second pipeline that contains IA as a stage.

Recommendation: Limit the size of your test environment

Your test environment should be representative of your Production environment, but it doesn't have to be identical to it.

Say your Production environment contains 20,000 nodes. You could speak with your internal engineering customers and create a subset of 500 nodes that represents every type of node in your system.

Running IA against a test environment of 500 nodes takes less time than targeting the whole set of machines.
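As a rough illustration of how such a subset could be drawn, the sketch below takes a proportional sample per node type while keeping at least one node of every type. It is not part of Continuous Delivery for PE: the function name, the `(certname, node_type)` inventory format, and the example host names are all assumptions made for this example, and the node type labels should come from whatever grouping you agree on with your engineering customers.

```python
import random
from collections import defaultdict


def pick_representative_nodes(nodes, target_size, seed=0):
    """Pick roughly target_size nodes, keeping at least one node of every type.

    nodes: list of (certname, node_type) tuples. Both fields are hypothetical
    inputs; export them from your own inventory in whatever way suits you.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    by_type = defaultdict(list)
    for certname, node_type in nodes:
        by_type[node_type].append(certname)

    total = len(nodes)
    subset = []
    for node_type, certnames in sorted(by_type.items()):
        # Proportional share of the target, but never fewer than one node.
        share = max(1, round(target_size * len(certnames) / total))
        subset.extend(rng.sample(certnames, min(share, len(certnames))))
    return subset


# Made-up inventory of 20,000 nodes across four node types.
inventory = (
    [(f"web{i}.example.com", "web") for i in range(12000)]
    + [(f"db{i}.example.com", "db") for i in range(6000)]
    + [(f"app{i}.example.com", "app") for i in range(1990)]
    + [(f"lb{i}.example.com", "loadbalancer") for i in range(10)]
)
subset = pick_representative_nodes(inventory, target_size=500)
print(len(subset))
```

Because rare node types are rounded up to one node, the result can be slightly larger than the target. A random sample is only a starting point; the people who own each node type are better placed to confirm that the chosen nodes really are representative.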

Scenario: When IA results are always needed and are an essential part of your pipeline

This is the scenario where your team gets value from every IA result. Maybe IA is a required stage in your pipeline and your developers submit multiple pull requests per day.

You might start seeing symptoms that indicate it's time to tune your IA configuration.

Recommendation: Tune IA

The first recommendation is to limit the size of your test environment, as described in the previous scenario.
The second recommendation is to change the IA destination:
- By having a dedicated compiler for IA.
  - The main benefit is that your Puppet Enterprise infrastructure will not be directly affected by IA runs.
  - You can increase the "concurrent catalog compilations" setting, which defaults to 10.
- By having a pool of compilers for IA behind a load balancer.
  - This setup lets you run more IA compilations in parallel and reduces the overall time IA takes to run (remember that the size of your catalogs plays an important role here).
  - When IA points to a pool of compilers, you have more room to increase concurrency, so you can tune "concurrent catalog compilations" to more than 30 (10 by default). Make sure to test how the system behaves as you increase this setting.
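To get a feel for why concurrency matters, the following back-of-the-envelope sketch estimates a best-case IA duration. It is an idealized model, not a description of how Continuous Delivery for PE schedules work: it assumes every catalog compilation takes the same time and that the compilers can sustain the configured concurrency without slowing down. The function name and the 12-second average are made up for illustration; measure your own numbers before relying on it.

```python
import math


def estimate_ia_seconds(compilations, avg_compile_seconds, concurrency=10):
    """Best-case duration of an IA run under an idealized 'waves' model.

    compilations: number of catalog compilations the run needs (assumption:
        you have measured or estimated this for your environment).
    avg_compile_seconds: assumed average time for one compilation.
    concurrency: the "concurrent catalog compilations" setting (10 by default).
    """
    # Compilations are processed in waves of at most `concurrency` at a time.
    waves = math.ceil(compilations / concurrency)
    return waves * avg_compile_seconds


# Hypothetical comparison: 500 compilations at 12 seconds each.
default_run = estimate_ia_seconds(500, 12, concurrency=10)
tuned_run = estimate_ia_seconds(500, 12, concurrency=30)
print(default_run, tuned_run)
```

In practice, compile times usually grow as compilers become saturated, so real gains tend to be smaller than this model suggests. That is why the setting should be increased gradually and tested at each step.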

Note

If you run dedicated IA compilers, use either tags or trusted facts so that you can easily identify the IA compilers separately when viewing performance and capacity data.