April 2, 2020

Getting Started with Automation: A Beginner's Guide

If you're anything like the rest of us DevOps folks, you're suddenly being asked to make a lot of unplanned infrastructure changes without much prep time. That combo can make getting started with automation a pretty daunting task.

If this is new ground for you and your team, here are some tips on getting started with automation with fewer missteps along the way.

How Do I Get Into Automation?

Getting started with automation can look different for every organization. A great first step is to take stock of the workflows that could benefit from automation, and the repetitive tasks that could be automated using code.

Make a Plan That Allows for Course Corrections

It's tempting to jump in with both feet and just start working. There's a lot to do and no time to waste. Unfortunately, this eagerness often leads to mistakes, like automating the wrong thing or configuring things incorrectly. Sometimes it even leads to disasters when poorly planned infrastructure isn't able to handle the load thrown at it. But at the same time, planning out the minutiae consumes precious time that you'll never get back.

I suggest a middle-of-the-road approach. Sketch out a plan of action. Choose the technologies you're going to use. Assess potential pitfalls and roadblocks. Assign work to team members, with clear agreements on how their pieces will interoperate. For example, this is not the time for one team to build single sign-on with GitHub OAuth while another team integrates with your existing LDAP service.

Plan for changes, though. You're likely working with incomplete information, and the environment you're working in is changing continuously, even daily.

For example, your company may suddenly pivot to providing virtual services or home delivery whether or not you've got the infrastructure to support it. So anticipate these direction changes and adapt. Plan daily virtual standups to keep the team in sync and clearly communicate when your changes will affect other team members.

Start Small and Iterate

As you're building your plan, remember that you cannot do it all at once. There's an overused phrase in the industry that says "don't try to boil the ocean." This means that if you invest a large amount of time and effort into a single all-inclusive rollout, then that rollout has a much higher chance of failure than if you'd taken a slower iterative approach.

Instead, I suggest that one of your first milestones be ensuring that all machines in your infrastructure, from servers to individual laptops, have a configuration management agent like Puppet installed whether or not they're actually managing anything.

These agents report back information about each machine, giving you insight into the true state of your infrastructure and letting you base your plan on data rather than assumptions. Then, when you're ready, you can roll out configuration policies incrementally as they're written.

Getting Started with Automation Means Tackling the Low-Hanging Fruit

This might sound odd coming from an automation company. But when time is of the essence, it's the large number of simple tasks that will give you the best return on time investment. Large or tricky automation jobs do have a big impact, but they also take a lot of time to build, test, and troubleshoot.

When you've got a tool like Bolt that lets you easily scale your existing processes to a whole fleet at once, the cumulative time savings for starting with the little things add up quickly. See more about Bolt below.

Hold off on the complex jobs until you have time to do them right. For now, use the time you save on the low-hanging fruit to make sure that your remaining manual processes succeed.

Document All Your Automation Decisions

As noted above, even the best-laid plans are subject to change. This means that you'll be making a lot of decisions that are effectively going to be codified into practice, at least for the time being.

Make sure you document everything, especially the rationale behind the decision. As you take your notes, make sure that they'll be understandable when you go back to turn them into documentation. When future you or other team members need to troubleshoot or refactor, this will be invaluable.

A Guide to Getting Started with Automation Using Bolt

Often, sysadmins will have a list of shell commands that they run, or a collection of small shell scripts used to provision and configure machines.

The challenge when using those at scale is consistency. How do you ensure that you and all your team members run all the commands and in the same order each time? Copying and pasting from a wiki mostly works when you're building a single machine and have time to go back and check your work, but that certainly doesn't scale to hundreds of machines.

But those scripts still have value. Bolt can help you reuse that existing knowledge and scale it up to your new challenge. Instead of learning a new language and rewriting everything all at once, Bolt lets you use your existing scripts with little or no modification. In other words, it can help you quickly ramp up these easy parts and leave you more time for the hard problems. I'll show you the basics here.
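As a rough illustration of the consistency problem, a pasted list of commands can be collected into one script that always runs the same steps in the same order and stops at the first failure. This is only a sketch: the file name, the `run_step` and `provision` helpers, and the step names are all hypothetical, and the `echo` placeholders stand in for whatever commands are in your own notes.

```shell
#!/usr/bin/env bash
# provision.sh: hypothetical wrapper around a pasted list of commands.

# Announce a step, run it, and report failure to the caller.
run_step() {
  local name="$1"
  shift
  echo "==> ${name}"
  if ! "$@"; then
    echo "FAILED: ${name}" >&2
    return 1
  fi
}

# The steps always run in this order; the && chain stops at the first failure.
provision() {
  run_step "refresh package index" echo "placeholder: package index refreshed" &&
  run_step "install packages" echo "placeholder: packages installed" &&
  run_step "write configuration" echo "placeholder: configuration written"
}

provision
```

A script shaped like this is the kind of existing asset the rest of this guide hands to Bolt unchanged.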

Install Bolt on Your Workstation

Bolt runs on your own workstation, so you'll first want to install the package for your operating system.
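Once the package is installed, a quick check like the following can confirm that the `bolt` command is on your PATH. This is a minimal sketch; the `have_command` helper is a hypothetical name introduced here for illustration.

```shell
# Report whether a command is available on PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command bolt; then
  # Print the installed version to confirm the package works.
  bolt --version
else
  echo "bolt was not found on PATH; check the install steps for your OS" >&2
fi
```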

Try a Remote Command in Bolt

Then let's try out a remote command. We'll use the "remote" host of localhost for validation first. Notice the nested quotes. That's because we're passing the entire string to the remote machine as a command to run.

    $ bolt command run "echo 'hello world'" --targets localhost
    Started on localhost...
    Finished on localhost:
      STDOUT:
        hello world
    Successful on 1 target: localhost
    Ran on 1 target in 0.01 sec

That worked pretty well. Now let's try it on a real remote host. Choose the address of a machine that you can SSH into and run the command again using that address for the target. This time you'll need to pass login credentials, or you can see the docs for other authentication methods, such as configuring automatic key-based SSH login in your ~/.ssh/config.

    $ bolt command run "echo 'hello world'" --targets <address> --user root --password hunter2
    Started on <address>...
    Finished on <address>:
      STDOUT:
        hello world
    Successful on 1 target: <address>
    Ran on 1 target in 0.51 sec

Configure an Inventory File

You can pass as many targets as you like in a comma separated list. But eventually that's going to end up being tedious, so let's configure an inventory file to simplify this.
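Before moving to an inventory file, here is a sketch of how a comma-separated target list might be assembled if you already keep addresses in a text file. The `join_targets` helper, the temporary host file, and the `.example.com` hostnames are all assumptions made for illustration; the snippet only prints the resulting command rather than running it.

```shell
# Join the non-empty, non-comment lines of a host file into a comma-separated list.
join_targets() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}

# Demo with a throwaway file and placeholder hostnames.
hosts_file="$(mktemp)"
printf '%s\n' 'web1.example.com' '# staging' 'web2.example.com' '' 'db1.example.com' > "$hosts_file"

targets="$(join_targets "$hosts_file")"

# Show the command that would be run against the joined list.
echo "bolt command run \"echo 'hello world'\" --targets ${targets}"
rm -f "$hosts_file"
```

This works for a handful of machines, but the list still has to be rebuilt and passed on every invocation.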

Create a YAML file at ~/.puppetlabs/bolt/inventory.yaml that looks like this:

    ---
    version: 2
    groups:
      - name: linux
        targets:
          - <hostname>
          - <hostname>
      - name: windows
        targets:
          - winrm://<hostname>:55985
          - winrm://<hostname>:55985
    config:
      ssh:
        user: <user>
        # use either password or private-key
        password: <password>
        #private-key: ~/.ssh/id_rsa
      winrm:
        user: <user>
        password: <password>
        ssl: false

Now you can run that same command on all your nodes at once by passing the group name as a target!

    $ bolt command run "echo 'hello world'" --targets linux
    Started on <hostname>...
    Finished on <hostname>:
      STDOUT:
        hello world
    Successful on 1 target: <hostname>
    Ran on 1 target in 0.72 sec

Try It on a Shell Script

Excellent work! Now it's time to try that with a shell script. For this example, I'm using the bashcheck script to do a quick shell vulnerability assessment. Save that script to your local directory, or use any other shell script you'd like.

    $ bolt script run bashcheck.sh --targets linux
    Started on <hostname>...
    Finished on <hostname>:
      STDOUT:
        Testing /usr/bin/bash ...
        Bash version 4.2.46(2)-release
        Variable function parser pre/suffixed [(), redhat], bugs not exploitable
        Not vulnerable to CVE-2014-6271 (original shellshock)
        Not vulnerable to CVE-2014-7169 (taviso bug)
        Not vulnerable to CVE-2014-7186 (redir_stack bug)
        Test for CVE-2014-7187 not reliable without address sanitizer
        Not vulnerable to CVE-2014-6277 (lcamtuf bug #1)
        Not vulnerable to CVE-2014-6278 (lcamtuf bug #2)
    Successful on 1 target: <hostname>
    Ran on 1 target in 0.92 sec

As you can see, not only is my test machine already patched against these vulnerabilities, but Bolt does all the work of transferring the script to the host nodes and running it for you. All you need to do is ensure that the host machines have an interpreter capable of running the script.

PRO TIP: if you set the shebang line properly, you can use any scripting language you'd like. If you're targeting Windows hosts, you can run PowerShell scripts by using a .ps1 file extension.
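To make the shebang point concrete, here is a minimal sketch of a script you could try in place of bashcheck. The file name `sysinfo.sh` and the `report` function are hypothetical; the first line tells the target machine which interpreter to use, so the same pattern applies to other scripting languages installed on your hosts.

```shell
#!/bin/sh
# sysinfo.sh: hypothetical script to try with `bolt script run`.
# The shebang above selects the interpreter on the target machine.

# Print the machine's network name and kernel release.
report() {
  echo "host: $(uname -n)"
  echo "kernel: $(uname -sr)"
}

report
```

Saved as sysinfo.sh, it could be run the same way as the bashcheck example, with `bolt script run sysinfo.sh --targets linux`.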

Now you know just enough to be dangerous, and it only took 5-10 minutes of reading and experimenting. You've now got the ability to easily run commands and scripts across your entire infrastructure.

We know that you're hard-pressed for time now, so start with this. Write shell scripts to configure your machines as needed and then let Bolt run them across your whole fleet. You'll get a report back of successes and failures so you'll know exactly which machines need more attention. And then when you're ready to learn more, come back and check out the more advanced features.
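When a configuration script runs across a fleet, some machines will likely see it more than once, so it helps if each step is safe to repeat. The sketch below shows one way to do that for a single setting. The `ensure_line` helper, the file name, and the `setting=enabled` line are hypothetical, and the demo writes to a throwaway file rather than a real config file.

```shell
#!/usr/bin/env bash
# ensure_line.sh: hypothetical example of a configuration step that is safe to re-run.

# Append a line to a file only if that exact line is not already present.
ensure_line() {
  local line="$1" file="$2"
  touch "$file" || return 1
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a throwaway file; a real script would target an actual config file.
demo_file="$(mktemp)"
ensure_line "setting=enabled" "$demo_file"
ensure_line "setting=enabled" "$demo_file"   # second run changes nothing
line_count="$(wc -l < "$demo_file" | tr -d ' ')"
echo "lines in file: ${line_count}"
rm -f "$demo_file"
```

Running the helper twice leaves a single copy of the line, which is the property you want before pointing a script at hundreds of machines.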

