Building Docker images with Puppet

Everybody should be building Docker images! But what if you don’t want to write all those shell scripts (which is basically what a Dockerfile is: a bunch of shell commands in RUN declarations), or if you are already using some Puppet modules to build VMs?

It is easy enough to build a new Docker image from Puppet manifests. For instance, I have built this Jenkins slave Docker image, so here are the steps.

The Devops Israel team has built a number of Docker images on CentOS with Puppet preinstalled, so that is a good start.


FROM devopsil/puppet:3.5.1

Otherwise you can just install Puppet in any bare image using the normal installation instructions. Something to take into account is that Docker images are quite minimal and may not have some needed packages installed. In this case the centos6 image didn’t have tar installed and some things failed to run. In some CentOS images the centosplus repo needs to be enabled for the installation to succeed.


FROM centos:centos6
RUN rpm --import https://yum.puppetlabs.com/RPM-GPG-KEY-puppetlabs && \
    rpm -ivh http://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm

# Need to enable centosplus for the image libselinux issue
RUN yum install -y yum-utils
RUN yum-config-manager --enable centosplus

RUN yum install -y puppet tar

Once Puppet is installed we can apply any manifest to the server; we just need to put the right files in the right places. If we need extra modules we can copy them from the host, maybe using librarian-puppet to manage them. Note that I’m avoiding running librarian or any other tool in the image, as that would require installing extra packages that may not be needed at runtime.


ADD modules/ /etc/puppet/modules/

The main manifest can go anywhere, but the default place is /etc/puppet/manifests/site.pp. The default Hiera data file goes into /var/lib/hiera/common.yaml.


ADD site.pp /etc/puppet/manifests/
ADD common.yaml /var/lib/hiera/common.yaml

Then we can just run puppet apply and check that no errors happened:


RUN puppet apply /etc/puppet/manifests/site.pp --verbose --detailed-exitcodes || [ $? -eq 2 ]
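
The `|| [ $? -eq 2 ]` guard is needed because, with --detailed-exitcodes, puppet apply exits with 2 when it applied changes successfully, and Docker would otherwise treat that as a failed build step. A minimal Ruby sketch of that interpretation follows; the helper name is made up for illustration and the meanings are the ones the Puppet documentation gives for this flag.

```ruby
# Exit codes returned by `puppet apply --detailed-exitcodes`.
EXIT_CODE_MEANINGS = {
  0 => 'no changes, no failures',
  1 => 'the run itself failed',
  2 => 'changes applied, no failures',
  4 => 'no changes, some resources failed',
  6 => 'changes applied, some resources failed'
}.freeze

# Hypothetical helper: true when the image build should continue,
# mirroring `puppet apply ... || [ $? -eq 2 ]`.
def puppet_run_ok?(exit_code)
  [0, 2].include?(exit_code)
end

[0, 1, 2, 4, 6].each do |code|
  status = puppet_run_ok?(code) ? 'continue' : 'abort the build'
  puts "#{code}: #{EXIT_CODE_MEANINGS[code]} -> #{status}"
end
```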

After that it’s the usual Docker CMD configuration. In this case we call the Jenkins slave jar from a shell script that reads some environment variables with information about the Jenkins master, so they can be overridden at runtime with docker run -e.


ADD cmd.sh /cmd.sh

#ENV JENKINS_USERNAME jenkins
#ENV JENKINS_PASSWORD jenkins
#ENV JENKINS_MASTER http://jenkins:8080

CMD su jenkins-slave -c '/bin/sh /cmd.sh'
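
The real cmd.sh is a shell script and is not reproduced here. As a rough sketch of the "environment variable with a default" pattern it relies on, written in Ruby for illustration: the variable names and default values come from the commented ENV lines above, while the helper name and the example master URL are assumptions.

```ruby
# Defaults taken from the commented-out ENV lines in the Dockerfile.
DEFAULTS = {
  'JENKINS_USERNAME' => 'jenkins',
  'JENKINS_PASSWORD' => 'jenkins',
  'JENKINS_MASTER'   => 'http://jenkins:8080'
}.freeze

# Hypothetical helper: values found in the environment win over the defaults,
# which is what `docker run -e NAME=value` makes possible.
def jenkins_settings(env = ENV)
  DEFAULTS.each_with_object({}) do |(name, default), settings|
    settings[name] = env.fetch(name, default)
  end
end

# Simulate `docker run -e JENKINS_MASTER=http://ci.example.com:8080 ...`
settings = jenkins_settings('JENKINS_MASTER' => 'http://ci.example.com:8080')
puts settings['JENKINS_MASTER']
puts settings['JENKINS_USERNAME']
```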

The Puppet configuration is simple enough

node 'default' {
  package { 'wget':
    ensure => present
  } ->
  class { '::jenkins::slave': }
}

and Hiera customizations, using a patched Jenkins module for this to work.


# Jenkins slave
jenkins::slave::ensure: stopped
jenkins::slave::enable: false

And that’s all; you can see the full source code on GitHub. If you are into Docker, check out this IBM research paper comparing the performance of virtual machines (KVM) and Linux containers (Docker).

Using Puppet’s metadata.json in Librarian-Puppet and Blacksmith

I have published new versions of librarian-puppet and puppet-blacksmith gems that handle the new Puppet metadata.json format for module data and dependencies.

librarian-puppet 1.3.1 and 1.0.8 [changelog] include two important changes. First, there is no need to create a Puppetfile if you have a Modulefile or metadata.json; they will be used by default. Of course you can still add a Puppetfile to bring in modules from git, a directory, or GitHub tarballs.

The other change is that the metadata.json files of all the dependencies are now parsed for transitive dependencies, so it works with the latest Puppet Labs modules and with those migrated from the old Modulefile format. That also means that the puppet gem is no longer needed if there is no Modulefile in your tree of dependencies, which was a source of pain for some users.
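
To make the metadata.json format concrete, here is a small sketch that lists the dependencies a module declares. It is only an illustration and not librarian-puppet’s actual code: the helper name, the module names and the version ranges are invented, while the `dependencies`, `name` and `version_requirement` keys are the ones metadata.json uses.

```ruby
require 'json'

# A trimmed-down metadata.json; module names and ranges are examples only.
METADATA = <<~JSON
  {
    "name": "acme-myapp",
    "version": "1.0.0",
    "dependencies": [
      { "name": "puppetlabs/stdlib", "version_requirement": ">= 4.1.0" },
      { "name": "puppetlabs/apache", "version_requirement": ">= 0.9.0 < 2.0.0" }
    ]
  }
JSON

# Hypothetical helper: [name, requirement] pairs declared by a module.
# A dependency without a version requirement accepts any version.
def declared_dependencies(metadata_json)
  JSON.parse(metadata_json).fetch('dependencies', []).map do |dep|
    [dep['name'], dep.fetch('version_requirement', '>= 0')]
  end
end

declared_dependencies(METADATA).each do |name, requirement|
  puts "#{name} (#{requirement})"
end
```

Resolving transitive dependencies then amounts to repeating this step for the metadata.json of each module found.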

The 1.0.x branch is kept updated to run in Ruby 1.8 while 1.1+ requires Ruby 1.9 and uses the Puppet Forge API v3.

Puppet Blacksmith, the gem to automate pushing modules to the Puppet Forge, was also updated in version 2.2+ [changelog] to use metadata.json in addition to Modulefile.

Anatomy of a DevOps Orchestration Engine: (III) Agents

Previously: (II) Architecture

In Maestro we typically use a Maestro master server and multiple Maestro agents. Each Maestro agent is just a small service where the actual work happens: it processes the work sent by the master via ActiveMQ and executes the plugins with the data received.

Architecture

The two main goals of the agent are load distribution and heterogeneous composition support. The more agents running, the more compositions can be executed in parallel, and compositions can target specific agents based on their features, such as architecture, operating system,… which is a must for development environments. For simplicity each agent can only run one composition at a time, but you could have multiple agent processes running on a single server.

The agent uses Puppet’s Facter to gather the machine facts (operating system, memory size, cloud provider data,…) and sends all that information to the master, which can use it to filter what compositions run on the agent. For instance I may want to run a composition on a Windows agent, or on an agent that has some specific piece of software installed. Facter supports external facts, so it is really easy to add new filtering capabilities and not be limited to what Facter provides out of the box. A small text file can be added to /etc/facter/facts.d/ and the facts it defines will be reported to the master server.
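
As a sketch of what such an external fact file looks like, the snippet below writes facts as key=value lines, the plain-text format Facter reads from facts.d. The helper name, file name and fact names are invented for the example, and a temporary directory stands in for /etc/facter/facts.d/ so the code can run anywhere.

```ruby
require 'tmpdir'

# Hypothetical helper: write facts as key=value lines into <dir>/<name>.txt.
def write_external_facts(dir, name, facts)
  path = File.join(dir, "#{name}.txt")
  File.write(path, facts.map { |key, value| "#{key}=#{value}\n" }.join)
  path
end

# On a real agent the directory would be /etc/facter/facts.d/.
Dir.mktmpdir do |dir|
  path = write_external_facts(dir, 'maestro',
                              'has_android_sdk' => 'true',
                              'build_tools' => 'maven,ant')
  puts File.read(path)
end
```

A composition could then be restricted to agents reporting has_android_sdk=true, assuming the master is configured to filter on that fact.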

Agents are installed alongside all the tools that may be needed, from Git, to clone repos, to Jenkins swarm, to reuse the agents as Jenkins slaves, or MCollective agents, to allow updating the agent itself automatically with Puppet when new manifests are deployed to the Puppet master. In our internal environment any commit to Puppet manifests or modules automatically triggers our rspec-puppet tests, the deployment of those manifests to the Puppet master, and a cascading Puppet update of all the machines in our staging environment using MCollective. All our Puppet modules are likewise built and tested on each commit, and a new version is published to the Puppet Forge automatically, using rspec-puppet and Puppet Blacksmith.

Maestro also supports manually assigning agents to pools, and matching compositions with agent pools, so compositions can be limited to run in a predefined set of agents.
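
The filtering by facts and by pool described above can be pictured with a few lines of Ruby. This is a sketch under assumptions, not Maestro’s implementation: the data model (hashes with :name, :pool and :facts) and the method name are hypothetical.

```ruby
# Hypothetical data model: each agent has a pool and the facts Facter reported.
AGENTS = [
  { name: 'agent-1', pool: 'default', facts: { 'operatingsystem' => 'CentOS',  'architecture' => 'x86_64' } },
  { name: 'agent-2', pool: 'windows', facts: { 'operatingsystem' => 'windows', 'architecture' => 'x64' } },
  { name: 'agent-3', pool: 'default', facts: { 'operatingsystem' => 'Ubuntu',  'architecture' => 'x86_64' } }
].freeze

# Keep the agents in the requested pool (if any) whose facts match
# every required value.
def eligible_agents(agents, required_facts: {}, pool: nil)
  agents.select do |agent|
    (pool.nil? || agent[:pool] == pool) &&
      required_facts.all? { |fact, value| agent[:facts][fact] == value }
  end
end

centos = eligible_agents(AGENTS, required_facts: { 'operatingsystem' => 'CentOS' })
puts centos.map { |agent| agent[:name] }.inspect

default_pool = eligible_agents(AGENTS, pool: 'default')
puts default_pool.map { |agent| agent[:name] }.inspect
```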

The agent process is written in Ruby and runs under JRuby in the JVM, thus supporting multiple operating systems and architectures, and the ability to write extensions in Java or Ruby easily. It connects to the master’s Composition Execution Engine through ActiveMQ using STOMP for messaging.

Plugins

Plugins are small pieces of code written in Java or Ruby that run in the agent to execute the actual work. We have made all plugins available on GitHub so they can be used as examples to create new plugins for custom tasks.

Plugins can be added to Maestro at runtime and automatically show up in the composition editor. The plugin manifest defines the plugin images, what tasks are defined, and what fields each task has. Based on the workload received, the agent downloads and executes the plugin, which just accesses the fields in the workload and does the actual work, whatever it might be, sending output back to LuCEE and populating the composition context.

For instance the Fog plugin can manage multiple clouds, such as EC2, where it can start and stop instances. The plugin receives the fields defined in the composition (credentials, image id,…), calls the EC2 API, streams the status to the Maestro output (successfully created, instance id,…) and puts some data (ids of the instances created, public ips,…) in the composition context for other tasks to use. All of that in less than 100 lines of code.

The context is important to avoid redefining field values and to provide some meaningful defaults, so if you have a provision task and a deprovision task, the values in the latter are inherited from the former.
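
A minimal sketch of that inheritance, assuming the context is a plain hash shared by the tasks of a composition. The task functions, field names and fake instance ids are all hypothetical; a real plugin would call the cloud API instead.

```ruby
# Hypothetical task: "starts" instances and records their ids in the context.
def provision(fields, context)
  ids = Array.new(fields.fetch('count', 1)) { |i| "i-example#{i}" } # stand-in for a cloud API call
  context['instance_ids'] = ids
  ids
end

# Hypothetical task: explicit fields win, otherwise fall back to what an
# earlier task left in the context.
def deprovision(fields, context)
  fields.fetch('instance_ids') { context.fetch('instance_ids', []) }
end

context = {}
provision({ 'count' => 2 }, context)
puts deprovision({}, context).inspect                              # inherited from provision
puts deprovision({ 'instance_ids' => ['i-other'] }, context).inspect # explicit value wins
```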

Agent cloud manager

The agent cloud manager is a service that runs on Google Compute Engine and watches a number of Maestro installations to provide automatic agent scaling. Based on preconfigured parameters such as min/max number of agents for each agent pool, max waiting time,… and the current status of each agent pool queue, the service can start new machines from specific images, suspend them (destroy the instance but keep the disk), or completely destroy them.
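
The kind of decision the service takes for each pool could look like the sketch below. The struct fields, the method name and the exact rules are assumptions made for illustration; the real service also has to choose between suspending and destroying machines, which is left out here.

```ruby
# Hypothetical pool settings and current status.
PoolConfig = Struct.new(:min_agents, :max_agents, :max_wait_seconds)
PoolStatus = Struct.new(:running_agents, :idle_agents, :queued_compositions, :longest_wait_seconds)

# Returns :start, :stop or :none for one agent pool.
def scaling_action(config, status)
  return :start if status.running_agents < config.min_agents

  waiting_too_long = status.queued_compositions > 0 &&
                     status.longest_wait_seconds > config.max_wait_seconds
  return :start if waiting_too_long && status.running_agents < config.max_agents

  nothing_queued = status.queued_compositions.zero?
  return :stop if nothing_queued && status.idle_agents > 0 &&
                  status.running_agents > config.min_agents

  :none
end

config = PoolConfig.new(1, 5, 300)
puts scaling_action(config, PoolStatus.new(2, 0, 3, 600)) # work has been queued for too long
puts scaling_action(config, PoolStatus.new(3, 2, 0, 0))   # idle agents and an empty queue
puts scaling_action(config, PoolStatus.new(1, 1, 0, 0))   # already at the minimum
```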

We are also giving Docker a try instead of using full VMs, and have created a couple of interesting Docker images on CentOS for developers: a Jenkins swarm slave image and a build agent image that includes everything we use in development (Java, Ant, Maven, RVM with 1.9, 2.0, 2.1 and JRuby, Git, Svn), all configurable with credentials at runtime.

Anatomy of a DevOps Orchestration Engine: (II) Architecture

Previously: (I) Workflow

Maestro architecture is basically defined by a master server and multiple agents, written in Java and Ruby (JRuby) for the backend and JavaScript for the frontend using AngularJS, and integrating several open source services. It is quite heterogeneous, with multiple languages, build tools, packages,… using the best tool for the job in each part of the stack.

Architecture

Master

The master services include

  • Maestro REST API
  • End user web interface
  • Composition Execution Engine (LuCEE)
  • ActiveMQ for STOMP messaging
  • PostgreSQL (or MySQL)
  • MongoDB

Maestro REST API

The REST API is a webapp written in Java, using Spring, packaged with a Jetty server. It is documented with Swagger annotations that generate a really nice web interface automatically that allows trying all the operations from the browser.

It handles caching and security, based on LDAP or database records, and delegates to the Composition Execution Engine (LuCEE), typically through the LuCEE REST API but also via STOMP messaging to avoid continuous polling.

It also implements handlers to execute compositions on commit callbacks from GitHub, Git, SVN,…

End user web interface

The end user UI is written in AngularJS using the AngularJS Bootstrap components and Less stylesheets. It connects to the REST API, so everything that can be done through the webapp can also be automated using the REST API (automation, automation, automation!). I have found Angular really nice to work with, apart from the complicated service, factory, provider,… abstractions, with good modularity and the ability to reuse third party plugins.

It is built with Maven and Grunt (better for the JavaScript parts), using Bower to manage all the JavaScript dependencies (angular core, bootstrap, ladda button spinner,…), and Karma + PhantomJS for headless UI tests without needing a real browser.

Composition Execution Engine (LuCEE)

LuCEE is a webapp that manages the execution of compositions, sending/receiving work to/from the agents through ActiveMQ STOMP queues, and storing state in the PostgreSQL database. LuCEE uses the Ruote workflow engine for work scheduling, and manages the compositions queue and agent routing. Basically, it checks what compositions need to be executed and decides on what agent to execute them, based on composition requirements, free agents, and other factors, e.g. prioritizing previously used agents that would likely have a cached copy of sources and dependencies to speed things up.
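
The "prefer an agent that already ran this composition" idea can be sketched as follows. This is not LuCEE’s routing code: the agent records, the :recent_compositions field and the method name are assumptions, and requirement matching is left out to keep the example short.

```ruby
# Pick a free agent, preferring one that recently ran the same composition
# and therefore likely has sources and dependencies cached.
def pick_agent(agents, composition_id)
  free = agents.reject { |agent| agent[:busy] }
  cached, others = free.partition do |agent|
    agent[:recent_compositions].include?(composition_id)
  end
  (cached + others).first # nil when every agent is busy
end

agents = [
  { name: 'agent-1', busy: true,  recent_compositions: [7] },
  { name: 'agent-2', busy: false, recent_compositions: [] },
  { name: 'agent-3', busy: false, recent_compositions: [7, 9] }
]

chosen = pick_agent(agents, 7)
puts chosen[:name]
```

Here agent-1 is skipped because it is busy, and agent-3 is preferred over agent-2 because it has already run composition 7.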

It is written in Ruby, which made it quick to implement a first version, with a simple REST API using Sinatra and a STOMP connector to send messages to the Maestro REST webapp through ActiveMQ.

It is packaged as a JRuby war with Warbler, and both LuCEE and the REST API wars are run in the same Jetty server, all packaged as an RPM for easier deployment.

ActiveMQ

ActiveMQ handles all the communication between LuCEE, the REST API webapp, and the agents using multiple STOMP queues. All the communication between LuCEE and agents, such as workloads, agent output, agent status,… is sent over a queue so it can be easily scaled across a high number of agents.
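
For readers who have not seen STOMP, a message on the wire is just text: a command line, headers, a blank line, the body and a NUL terminator. The sketch below builds such a SEND frame by hand; the queue name and the payload fields are made up, and a real agent would use a STOMP client library rather than assembling frames itself.

```ruby
require 'json'

# Build a STOMP SEND frame: command, headers, blank line, body, NUL byte.
def stomp_send_frame(destination, body, headers = {})
  all_headers = { 'destination' => destination,
                  'content-length' => body.bytesize }.merge(headers)
  header_lines = all_headers.map { |key, value| "#{key}:#{value}" }.join("\n")
  "SEND\n#{header_lines}\n\n#{body}\0"
end

# Hypothetical queue name and workload fields.
workload = JSON.generate('composition' => 42, 'task' => 'git checkout')
frame = stomp_send_frame('/queue/maestro.agent.work', workload,
                         'content-type' => 'application/json')
puts frame.inspect
```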

LuCEE also pushes changes in the database to the REST API webapp so it can update the caches without needing continuous polling.

PostgreSQL

LuCEE uses PostgreSQL (or MySQL, or any other SQL database supported by Ruby DataMapper) as its main storage to save compositions, projects, tasks,… The SQL database is also used by the REST API webapp to store permissions and user data when not using LDAP.

MongoDB

We found that in order to do more complex dashboards and reports we needed to store all sorts of unstructured data from the plugins, from run time or status to anything that a plugin developer may want, such as the GitHub payload data received or a test stacktrace. That data is sent by the agents to LuCEE and then stored in MongoDB, and can be queried directly (all your data belong to you) or through a reporting pane in the webapp.

Next: (III) Agents

Anatomy of a DevOps Orchestration Engine: (I) Workflow

At MaestroDev we have been building what may be called, for lack of a better name, a DevOps Orchestration Engine, and it is long overdue to talk about what we have been doing there and, most importantly, how.

The basic idea of the application is to tie together the different systems involved in a Continuous Delivery cycle: Continuous Integration server, SCM, build tools, packaging tools, cloud resources, notification systems,… and streamline the process through these different tools. So it hooks into a bunch of popular tools to orchestrate interactions between them. An example:

[Screenshot: example composition]

This workflow, or as we call it, composition, will

  1. download a war file from a Maven repository (previously built by Jenkins)
  2. start an Amazon EC2 instance with Tomcat preinstalled
  3. deploy the war
  4. checkout the acceptance tests from Git
  5. run some tests with Maven (Selenium tests using SauceLabs) against that instance
  6. wait for a user to confirm before moving to the next step (to record the human approval or to do some extra manual tests if needed)
  7. destroy the Amazon EC2 instance

Maestro provides a nice web UI that gives visibility over the composition execution and an aggregated log from all the tools that run during the composition in a single place.

[Screenshot: composition execution and aggregated log]

But the power comes from combining compositions, as there are tasks for typical flows, such as forking and joining compositions, calling another composition in case of a failure, or waiting for a composition to finish.

[Screenshot: compositions tied together]

Here we have a more complex setup with five compositions tied together.

  • * – A composition that calls compositions 1 and 2.
  • 1 – A Jenkins build
  • 2 – The acceptance tests composition mentioned before
  • 2a – Notification composition in case the acceptance tests fail
  • 3 – Deployment to production

So you can see that compositions are not just limited to build, test, deploy. The tasks can be combined as needed to build your specific process.

Tasks are contributed by plugins, easily written in Ruby or Java, which define what fields are needed in the UI and what to do with those fields and the composition context. Maestro includes a lot of prebuilt tasks, publicly available on GitHub, from executing shell scripts to Jenkins job creation or Amazon Route 53 record management, but a plugin can do anything.

All the tasks share a common context and use sensible defaults, so if the SCM checkout path is not defined it creates a specific working directory for the composition, and that is reused by the Maven, Ant,… plugins to avoid copying and pasting the fields. That’s also how an EC2 deprovision task doesn’t need any configuration if there was a provision task before in the composition: by default it will just deprovision the instances started previously in the composition.

You can take a look at our Maestro public instance, showing some examples and builds of public projects, mostly Puppet modules that are automatically built and deployed to the Puppet Forge, and the build and release compositions of Maestro plugins. In the next posts I’ll be talking about the technologies used and the distributed architecture of Maestro.

Next: (II) Architecture

Writing on InfoQ about DevOps

A few weeks ago I started to write news posts at InfoQ about DevOps, or anything remotely close; that’s the good thing about DevOps meaning something different depending on who you ask ;)

I’d like to write more here too. I have some post ideas about Docker, Puppet, IoT, MQTT,… let’s see if I find the time.

librarian-puppet 1.1 released with new Puppet Forge support

Just released librarian-puppet version 1.1.0, a version that adds support for the new Puppet Forge v3 API and fixes the issues in Puppet 3.6+ and Puppet Enterprise 3.2+, versions that started using the new v3 API. From 1.1 the ruby requirement is 1.9+ due to the puppet_forge library used.

Librarian-puppet is a bundler for your puppet infrastructure. You can use librarian-puppet to manage the puppet modules your infrastructure depends on, whether the modules come from the Puppet Forge, Git repositories or just a path.

  • Librarian-puppet can reuse the dependencies listed in your Modulefile
  • Forge modules can be installed from Puppetlabs Forge or an internal Forge such as Pulp
  • Git modules can be installed from a branch, tag or specific commit, optionally using a path inside the repository
  • Modules can be installed from GitHub using tarballs, without needing Git installed
  • Module dependencies are resolved transitively without needing to list all the modules explicitly

Librarian-puppet manages your modules/ directory for you based on your Puppetfile. Your Puppetfile becomes the authoritative source for what modules you require and at what version, tag or branch.
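
A Puppetfile is plain Ruby using a small DSL. The sketch below shows a sample Puppetfile with a Forge module, a Git module and a path module, together with a toy reader that evaluates it and records each declaration. The reader is only an illustration and not how librarian-puppet is implemented; the module names, versions, URL and path are examples.

```ruby
# A sample Puppetfile; module names, versions and locations are examples.
PUPPETFILE = <<~'RUBY'
  forge "https://forgeapi.puppetlabs.com"

  mod 'puppetlabs/apache', '0.9.0'
  mod 'puppetlabs/firewall',
    :git => 'https://github.com/puppetlabs/puppetlabs-firewall.git',
    :ref => '0.4.2'
  mod 'acme/internal', :path => './local-modules/internal'
RUBY

# Toy reader (hypothetical): evaluates the Puppetfile as Ruby and
# records the forge URL and every module declaration.
class PuppetfileReader
  attr_reader :forge_url, :modules

  def initialize
    @modules = {}
  end

  def forge(url)
    @forge_url = url
  end

  def mod(name, version = nil, source = {})
    # `mod 'x', :git => ...` passes the options hash as the second argument
    version, source = nil, version if version.is_a?(Hash)
    @modules[name] = { version: version, source: source }
  end
end

reader = PuppetfileReader.new
reader.instance_eval(PUPPETFILE)
puts "forge: #{reader.forge_url}"
reader.modules.each { |name, spec| puts "#{name}: #{spec}" }
```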

Changelog

1.1.1

  • Issue #227 Fix Librarian::Puppet::VERSION undefined

1.1.0

  • Issue #210 Use forgeapi.puppetlabs.com and API v3
    • Accessing the v3 API requires Ruby 1.9 due to the puppet_forge library used

1.0.3

  • Issue #223 Cannot bounce Puppetfile.lock! error when Forge modules contain duplicated dependencies

1.0.2

  • Issue #211 Pass the PuppetLabs Forge API v3 endpoint to puppet module when running on Puppet >= 3.6.0
  • Issue #198 Reduce the length of tmp dirs to avoid issues in windows
  • Issue #206 githubtarball call for released versions does not consider pagination
  • Issue #204 Fix regex to detect Forge API v3 url
  • Issue #199 undefined method run! packaging a git source
  • Verify SSL certificates in github calls

1.0.1

  • Issue #190 Pass the PuppetLabs Forge API v3 endpoint to puppet module when running on Puppet Enterprise >= 3.2
  • Issue #196 Fix error in error handling when puppet is not installed

Announcing librarian-puppet 1.0.0

I’m proud to announce the release of librarian-puppet version 1.0.0. It was about time to get to 1.x after more than 200k gem installations. See my previous post about managing Puppet modules to take advantage of its features.

Librarian-puppet is a bundler for your puppet infrastructure. You can use librarian-puppet to manage the puppet modules your infrastructure depends on, whether the modules come from the Puppet Forge, Git repositories or just a path.

  • Librarian-puppet can reuse the dependencies listed in your Modulefile
  • Forge modules can be installed from Puppetlabs Forge or an internal Forge such as Pulp
  • Git modules can be installed from a branch, tag or specific commit, optionally using a path inside the repository
  • Modules can be installed from GitHub using tarballs, without needing Git installed
  • Module dependencies are resolved transitively without needing to list all the modules explicitly

Librarian-puppet manages your modules/ directory for you based on your Puppetfile. Your Puppetfile becomes the authoritative source for what modules you require and at what version, tag or branch.

Changelog

1.0.0

  • Remove deprecation warning for github_tarball sources, some people are actually using it

0.9.17

0.9.16

  • Issue #181 Should use qualified module names for resolution to work correctly
  • Deprecate github_tarball sources
  • Reduce number of API calls for github_tarball sources

0.9.15

  • Issue #187 Fixed parallel installation issues
  • Issue #185 Sanitize the gem/bundler environment before spawning (ruby 1.9+)

0.9.14

  • Issue #182 Sanitize the environment before spawning (ruby 1.9+)
  • Issue #184 Support transitive dependencies in modules using :path
  • Git dependencies using modulefile syntax make librarian-puppet fail
  • Issue #108 Don’t fail on malformed Modulefile from a git dependency

0.9.13

  • Issue #176 Upgrade to librarian 0.1.2
  • Issue #179 Need to install extra gems just in case we are in ruby 1.8
  • Issue #178 Print a meaningful message if puppet gem can’t be loaded for :git sources

0.9.12

  • Remove extra dependencies from gem added when 0.9.11 was released under ruby 1.8

0.9.11

  • Add modulefile dsl to reuse Modulefile dependencies
  • Consider Puppetfile-dependencies recursively in git-source
  • Support changing tmp, cache and scratch paths
  • librarian-puppet package causes an infinite loop
  • Show a message if no versions are found for a module
  • Make download of tarballs more robust
  • Require open3_backport in ruby 1.8 and install if not present
  • Git dependencies in both Puppetfile and Modulefile cause a Cannot bounce Puppetfile.lock! error
  • Better sort of github tarball versions when there are mixed tags starting with and without ‘v’
  • Fix error if a git module has a dependency without version
  • Fix git dependency with :path attribute
  • Cleaner output when no Puppetfile found
  • Reduce the number of API calls to the Forge
  • Don’t sort versions as strings. Rely on the forge returning them ordered
  • Pass --module_repository to puppet module install to install from other forges
  • Cache forge responses and print an error if returns an invalid response
  • Add a User-Agent header to all requests to the GitHub API
  • Convert puppet version requirements to rubygems, pessimistic and ranges
  • Use librarian gem

0.9.10

  • Catch GitHub API rate limit exceeded
  • Make Librarian::Manifest Semver 2.0.0 compatible

Security Testing Using Infrastructure-As-Code

Article originally published in Agile Record magazine, Issue #17, Security Testing in an Agile Environment. It can be downloaded for free as a PDF.

Security Testing Using Infrastructure-As-Code

Infrastructure-As-Code means that infrastructure should be treated as code – a really powerful concept. Server configuration, packages installed, relationships with other servers, etc. should be modeled with code to be automated and have a predictable outcome, removing manual steps prone to errors. That doesn’t sound bad, does it?

The goal is to automate all the infrastructure tasks programmatically. In an ideal world you should be able to start new servers, configure them, and, more importantly, be able to repeat it over and over again, in a reproducible way, automatically, by using tools and APIs.

Have you ever had to upgrade a server without knowing whether the upgrade was going to succeed or not for your application? Are the security updates going to affect your application? There are so many system factors that can indirectly cause a failure in your application, such as different kernel versions, distributions, or packages.

When you have a decent set of integration tests it is not that hard to make changes to your infrastructure with that safety net. There are a number of tools designed to make your life easier, so there is no need to tinker with bash scripts or manual steps prone to error.

We can find three groups of tools:

  • Provisioning tools, like Puppet or Chef, manage the configuration of servers with packages, services, config files, etc. in a reproducible way and over hundreds of machines.
  • Virtual Machine automation tools, like Vagrant, enable new virtual machines to be started easily in different environments, from virtual machines in VirtualBox or VMware to cloud providers such as Amazon AWS or Rackspace, and then provision them with Puppet or Chef.
  • Testing tools, like rspec, Cucumber, or Selenium, enable unit and integration tests to be written that verify that the server is in a good state continuously as part of your continuous integration process.

Vagrant

Learning Puppet can be tedious: getting the different pieces (master, agents) up and running, writing your first manifests, etc. A good way to start is to use Vagrant, which started as an Oracle VirtualBox command line automation tool and allows you to create new VMs locally or on cloud providers and provision them with Puppet or Chef easily.

Vagrant projects are composed of base boxes, specifically configured for Vagrant with Puppet/Chef, a vagrant username and password, and any customizations you may want to add, plus the configuration to apply to those base boxes, defined with Puppet or Chef. That way we can have several projects sharing the same base boxes where the Puppet/Chef definitions are different. For instance, a database VM and a web server VM can both use the same base box, e.g. a CentOS 6 minimal server, and just have different Puppet manifests. When Vagrant starts them up it will apply the specific configuration. That also allows you to share boxes and configuration files across teams. For instance, one base box with the Linux flavor can be used in a team, and in source control we can have just the Puppet manifests to apply for the different configurations, which anybody from Operations to Developers can use. If a problem arises in production, a developer can quickly instantiate an equivalent environment using the Vagrant and Puppet configuration, making issues from a different environment easy to reproduce.

There is a list of available VMs or base boxes ready to use with Vagrant at www.vagrantbox.es, but you can build your own and share it anywhere. For VirtualBox they are just (big) VM files that can be easily built using VeeWee (https://github.com/jedi4ever/veewee) or by changing a base box and rebundling it with Packer (http://www.packer.io).

Usage

Once you have installed Vagrant (http://docs.vagrantup.com/v2/installation/index.html) and VirtualBox (https://www.virtualbox.org/) you can create a new project.

vagrant init will create a sample Vagrantfile, the project definition file, which can be customized.

$ vagrant init myproject

Then in the Vagrantfile you can change the default box settings and add basic Puppet provisioning.

Vagrant.configure("2") do |config|
  config.vm.box = "CentOS-6.4-x86_64-minimal"
  config.vm.box_url = "https://repo.maestrodev.com/archiva/repository/public-releases/com/maestrodev/vagrant/CentOS/6.4/CentOS-6.4-x86_64-minimal.box"

  # create a virtual network so we can access the vm by ip
  config.vm.network "private_network", ip: "192.168.33.13"
  config.vm.hostname = "qa.acme.local"

  # provision the vm with the Puppet manifests and modules of this project
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "manifests"
    puppet.manifest_file = "site.pp"
    puppet.module_path = "modules"
  end
end

In manifests/site.pp you can try any Puppet code, e.g. create a file:

node 'qa.acme.local' {
  file { '/root/secret':
    mode    => '0600',
    owner   => 'root',
    content => 'secret file, for root eyes only',
  }
}

vagrant up will download the box the first time, start the VM, and apply the configuration defined in Puppet.

$ vagrant up

vagrant ssh will open a shell into the box. Under the hood, Vagrant forwards a host port to port 22 of the box.

$ vagrant ssh

If you make any changes to the Puppet manifests you can rerun the provisioning step.

$ vagrant provision

The vm can be suspended and resumed at any time

$ vagrant suspend
$ vagrant resume

and later on destroyed, which will delete all the VM files.

$ vagrant destroy

And then we can start again from scratch with vagrant up, getting a completely new VM where we can make any mistakes!

Puppet

In Puppet we can configure any aspect of a server: packages, files, permissions, services, etc. You have seen how to create a file; now let’s see an example of configuring the Apache httpd server and the Linux iptables firewall to open a port.

First we need the Puppet modules to manage httpd and the firewall rules, to avoid writing all the bits and pieces ourselves. Modules are reusable Puppet components that you can find at the Puppet Forge (http://forge.puppetlabs.com/) or typically on GitHub. To install these two modules into the VM, run the following commands, which will download the modules and install them in the /etc/puppet/modules directory.

vagrant ssh -c "sudo puppet module install --version 0.9.0 puppetlabs/apache"
vagrant ssh -c "sudo puppet module install --version 0.4.2 puppetlabs/firewall"

You can find more information about the Apache (http://forge.puppetlabs.com/puppetlabs/apache/0.9.0) and the Firewall (http://forge.puppetlabs.com/puppetlabs/firewall/0.4.2) modules in their Forge pages. We are just going to add some simple examples to manifests/site.pp to install the Apache server with a virtual host that will listen on port 80.

node 'qa.acme.local' {

  class { 'apache': }

  # create a virtualhost
  apache::vhost { "${::hostname}.local":
    port    => 80,
    docroot => '/var/www',
  }
}

Now if you try to access this server on port 80 you will not be able to, as iptables is configured by default to block all incoming connections. Try accessing http://192.168.33.13 (the IP we configured previously in the Vagrantfile for the private virtual network) and see for yourself.

To open the firewall, we need to open the port explicitly in the manifests/site.pp by adding

firewall { '100 allow apache':
  proto  => 'tcp',
  port   => '80',
  action => 'accept',
}

and running vagrant provision again. Now you should see Apache’s default page at http://192.168.33.13.

So far we have created a virtual machine where the Apache server is automatically installed and the firewall port is opened. You could start from scratch at any time by running vagrant destroy and vagrant up again.

Testing

Let’s write some tests to ensure that everything is working as expected. We are going to use Ruby as the language of choice.

Unit testing with rspec-puppet

rspec-puppet (http://rspec-puppet.com/) is an rspec extension that makes it easy to unit test Puppet manifests.

Create a spec/spec_helper.rb file to add some shared config for all the specs

require 'rspec-puppet'

RSpec.configure do |c|
  c.module_path = 'modules'
  c.manifest_dir = 'manifests'
end

and we can start creating unit tests for the host that we defined in Puppet.

# spec/hosts/qa_spec.rb

require 'spec_helper'

describe 'qa.acme.local' do

  # test that the httpd package is installed
  it { should contain_package('httpd') }

  # test that there is a firewall rule set to 'accept'
  it { should contain_firewall('100 allow apache').with_action('accept') }

  # ensure that there is only one firewall definition
  it { should have_firewall_resource_count(1) }

end

After installing rspec-puppet with gem install rspec-puppet, you can run rspec to execute the tests.

...

Finished in 1.4 seconds

3 examples, 0 failures

Success!

Integration testing with Cucumber

Unit testing is fast and can catch a lot of errors quickly, but how can we check that the machine is actually configured as we expected?

Let’s use Cucumber (http://cukes.info/), a BDD tool, to create an integration test that checks whether a specific port is open in the virtual machine we started.

Create a features/smoke_tests.feature file with:

Feature: Smoke tests
Smoke testing scenarios to make sure all system components are up and running.

Scenario: Services should be up and listening to their assigned port
Then the "apache" service should be listening on port "80"

Install Cucumber with gem install cucumber and run cucumber. The first run will output a message saying that the step definition has not been created yet.

Feature: Smoke tests
Smoke testing scenarios to make sure all system components are up and running.

Scenario: Services should be up and listening to their assigned port # features/smoke_tests.feature:4
Then the "apache" service should be listening on port "80" # features/smoke_tests.feature:5

1 scenario (1 undefined)

1 step (1 undefined)

0m0.001s

You can implement step definitions for undefined steps with these snippets:

Then(/^the "(.*?)" service should be listening on port "(.*?)"$/) do |arg1, arg2|
  pending # express the regexp above with the code you wish you had
end

So let’s create a features/step_definitions/tcp_ip_steps.rb file that implements our "service should be listening on port" step by opening a TCP socket.

Then /^the "(.*?)" service should be listening on port "(.*?)"$/ do |service, port|
  # the target host comes from the URL environment variable
  host = URI.parse(ENV['URL']).host
  begin
    # opening the socket fails if nothing is listening on the port
    s = TCPSocket.new(host, port)
    s.close
  rescue StandardError
    raise("#{service} is not listening at #{host} on port #{port}")
  end
end

Now rerun Cucumber, this time setting the URL environment variable that the step definition reads to find out where the machine is running: URL=http://192.168.33.13 cucumber.

Feature: Smoke tests
Smoke testing scenarios to make sure all system components are up and running.

Scenario: Services should be up and listening to their assigned port # features/smoke_tests.feature:4
Then the "apache" service should be listening on port "80" # features/step_definitions/tcp_ip_steps.rb:1

1 scenario (1 passed)

1 step (1 passed)

0m0.003s

Success! The port is actually open in the virtual machine.
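The socket check used in the step definition can also be tried on its own, outside Cucumber. The following is only a minimal sketch: port_open? is a hypothetical helper name, and the two second timeout is an assumption added here so that the check does not hang on a host that silently drops packets.

```ruby
require 'socket'

# Hypothetical helper (not part of Cucumber or the step definition above).
# Returns true if something accepts TCP connections on host:port,
# false if the connection is refused, times out or the host cannot be resolved.
def port_open?(host, port, timeout = 2)
  # Socket.tcp yields the connected socket and closes it after the block
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end
```

With the virtual machine from this example running, calling port_open?('192.168.33.13', 80) should return true once the firewall rule has been applied.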

Wash, rinse, repeat

This was a small example of what can be achieved using Infrastructure as Code and automation tools such as Puppet and Vagrant, combined with standard testing tools like rspec or Cucumber. When a continuous integration tool like Jenkins is thrown into the mix to run these tests continuously, the result is an automatic end-to-end solution that tests systems like any other code, avoiding regressions and enabling Continuous Delivery (http://blog.csanchez.org/2013/11/12/continuous-delivery-with-maven-puppet-and-tomcat-video-from-apachecon-na-2013/), with automation all the way from source to production.

A more detailed example can be found in my continuous-delivery project at GitHub (https://github.com/carlossg/continuous-delivery).

New release of librarian puppet

I’ve been helping with the development of librarian-puppet, pushing upstream a lot of fixes we had made in the past and applying long-outstanding pull requests in order to get a release out. You can finally get what is (probably) the last release before 1.0.0, which should be stable enough for day-to-day use.

Besides bug fixes, probably the best new feature is the ability to reuse the Modulefile dependencies by creating the simplest possible Puppetfile, if you only need modules from the Puppet Forge:

forge "http://forge.puppetlabs.com"

modulefile
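To illustrate what reusing the Modulefile dependencies means, here is a minimal sketch that collects the dependency lines from Modulefile-style text. This is not how librarian-puppet implements it: ModulefileReader is a hypothetical class, and the module name and version numbers in the sample are made up for the example.

```ruby
# Hypothetical sketch, not librarian-puppet's actual code: evaluate
# Modulefile-style text and keep only its `dependency` declarations.
class ModulefileReader
  attr_reader :dependencies

  def initialize
    @dependencies = []
  end

  # `dependency 'author/module', 'requirement'` lines end up here
  def dependency(name, requirement = '>= 0')
    @dependencies << [name, requirement]
  end

  # other common Modulefile keywords are accepted and ignored
  %w[name version source author license summary description project_page].each do |keyword|
    define_method(keyword) { |*_args| }
  end

  def self.read(text)
    reader = new
    reader.instance_eval(text)
    reader.dependencies
  end
end

# Sample Modulefile content (made-up module name and versions)
MODULEFILE = <<~EOS
  name 'acme-qa'
  version '0.1.0'
  dependency 'puppetlabs/apache', '>= 0.9.0'
  dependency 'puppetlabs/firewall'
EOS

deps = ModulefileReader.read(MODULEFILE)
deps.each { |name, requirement| puts "#{name} #{requirement}" }
```

With the modulefile keyword in the Puppetfile, a list like this does not have to be written a second time as mod entries.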

 

The changelog

0.9.13

  • Issue #176 Upgrade to librarian 0.1.2
  • Issue #179 Need to install extra gems just in case we are in ruby 1.8
  • Issue #178 Print a meaningful message if puppet gem can’t be loaded for :git sources

0.9.12

  • Remove extra dependencies from gem added when 0.9.11 was released under ruby 1.8

0.9.11

  • Add modulefile dsl to reuse Modulefile dependencies
  • Consider Puppetfile-dependencies recursively in git-source
  • Support changing tmp, cache and scratch paths
  • librarian-puppet package causes an infinite loop
  • Show a message if no versions are found for a module
  • Make download of tarballs more robust
  • Require open3_backport in ruby 1.8 and install if not present
  • Git dependencies in both Puppetfile and Modulefile cause a Cannot bounce Puppetfile.lock! error
  • Better sort of github tarball versions when there are mixed tags starting with and without ‘v’
  • Fix error if a git module has a dependency without version
  • Fix git dependency with :path attribute
  • Cleaner output when no Puppetfile found
  • Reduce the number of API calls to the Forge
  • Don’t sort versions as strings. Rely on the forge returning them ordered
  • Pass --module_repository to puppet module install to install from other forges
  • Cache forge responses and print an error if returns an invalid response
  • Add a User-Agent header to all requests to the GitHub API
  • Convert puppet version requirements to rubygems, pessimistic and ranges
  • Use librarian gem
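As an illustration of the changelog entry about converting Puppet version requirements to rubygems ones, the following minimal sketch maps a few Puppet-style requirement strings to Gem::Requirement objects. It is an assumption about what such a conversion can look like, not the code used in librarian-puppet, and to_gem_requirement is a hypothetical helper name.

```ruby
require 'rubygems'

# Hypothetical helper, not librarian-puppet's actual implementation.
def to_gem_requirement(puppet_requirement)
  case puppet_requirement
  when /\A(\d+)\.x\z/
    # "1.x" becomes the pessimistic requirement "~> 1.0"
    Gem::Requirement.new("~> #{$1}.0")
  when /\A(\d+)\.(\d+)\.x\z/
    # "1.2.x" becomes "~> 1.2.0"
    Gem::Requirement.new("~> #{$1}.#{$2}.0")
  else
    # a range such as ">=1.0.0 <2.0.0" becomes [">= 1.0.0", "< 2.0.0"];
    # a bare version such as "1.0.0" becomes "= 1.0.0"
    constraints = puppet_requirement.scan(/([<>=~!]*)\s*(\d[\w.]*)/).map do |op, version|
      "#{op.empty? ? '=' : op} #{version}"
    end
    Gem::Requirement.new(*constraints)
  end
end

range = to_gem_requirement('>=1.0.0 <2.0.0')
puts range.satisfied_by?(Gem::Version.new('1.5.0'))
```

The resulting objects can be checked against a concrete version with satisfied_by?, as in the last two lines.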