December 25, 2014

Day 25 - Windows has Configuration Management?!?

Written by: Steven Murawski (@stevenmurawski)
Edited by: William Shipway (@shipw)

Windows Server administration has long been the domain of “admins” mousing their way through a number of Microsoft and third party management UIs (and I was one of them for a while). There have always been a stalwart few who, by hook or by crook, found a way to automate the almost unautomateable. But this group remained on the fringes of Windows administration. They were labeled as heretics and shunned, until someone needed to do something not easily accomplished by a swipe of the mouse.

The sea winds have shifted and over the past seven or eight years, Microsoft released PowerShell and began focusing on providing a first class experience to the tool makers and automation-minded. The earlier group of tool makers and automators gained traction and began to develop a larger following, as more Microsoft and third party products added support for PowerShell. That intrepid group of early automators formed the core of the PowerShell community and began welcoming new converts - whether they were true believers or forced into acceptance by the lack of some capability in their comfortable management UIs. Now, most Windows Server administrators have delved into the command line and have begun to succumb to the siren call of automation.

Just as the PowerShell community’s evangelism was reaching a fever pitch, Microsoft added another management tool - Desired State Configuration. The tool-makers and automators were stunned. Cries of “what about my deployment scripts?” and “but, I already built my VM templates!” echoed through the halls. Early adopters of PowerShell v3 lamented “isn’t this what workflows were for?”. Some had already begun to explore the dark arts of configuration management using tools like Chef and Puppet to bring order to their infrastructure management. With the help of those in the community who blazed a trail in implementing configuration management on Windows, those cries of dismay began to turn into rabid curiosity and even envy. The administrators began to read books like The Phoenix Project and hear stories from companies like Stack Exchange, Etsy, Facebook, and Amazon about this cult of DevOps. They wanted access to this new realm of possibilities, where production deployments don’t mean a week of late nights in the office and requests for new servers don’t go to the bottom of the pile to sit for a month to “percolate”.

Read on, dear reader, to understand the full story of Desired State Configuration and its place in the new DevOps world where Windows Server administrators find themselves.

An Introduction to Desired State Configuration

With the release of Windows Server 2012 R2 and Windows Management Framework 4, Microsoft introduced Desired State Configuration (DSC). DSC consists of three main components: the Local Configuration Manager, a configuration Domain Specific Language (DSL), and resources (with a pattern for building more). DSC is available on Windows Server 2012 R2 and Windows 8.1 64 bit out of the box and can be installed on Windows Server 2012, Windows Server 2008 R2, and Windows 7 64 bit with Windows Management Framework 4. There is an evolving ecosystem around Desired State Configuration, including support for a number of systems management and deployment projects. To me, one of the most important benefits of the introduction of Desired State Configuration is the awakening of the Windows administration community to configuration management concepts.

A Platform Play

The inclusion of Desired State Configuration may seem like a slap in the face to existing configuration management vendors, but that is not the case. Desired State Configuration is a platform level capability similar to PerfMon or Event Tracing for Windows. DSC is not intended to wholesale replace other configuration management platforms, but to be a base which other platforms can build on in a consistent manner.

The Evolution of DSC

One of the major knocks against administering Windows servers in the past has been the horrendous story around automation. Command-line tools were either lacking coverage or just plain missing. The shell was in a sorry state.

Then, shortly before Windows Server 2008 shipped, PowerShell came about. Initially, PowerShell had relatively poor native coverage for managing Windows, but it worked with .NET, WMI, and COM, so it could do just about anything you needed.

More coverage was introduced with each release of Windows Server. Windows Server 2012 had an explosion of coverage via native PowerShell commands for just about everything on the platform.

PowerShell emerged as the management API for configuring Windows servers. The downside of a straight PowerShell interface is that PowerShell commands aren’t necessarily idempotent. Some, like Add-WindowsFeature, are, and do the right thing if the command is run repeatedly. Others, like New-Website, are not, and will throw errors if the site already exists.
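
To make that difference concrete, here is a rough sketch of the guard an administrator had to write by hand around a non-idempotent command (the site name and path are invented for illustration):

Add-WindowsFeature Web-Server            # safe to repeat - a no-op if the feature is installed

Import-Module WebAdministration
if (-not (Get-Website -Name 'FourthCoffee')) {
    # New-Website is not idempotent, so only create the site when it is missing
    New-Website -Name 'FourthCoffee' -PhysicalPath 'c:\websites\fourthcoffee'
}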

DSC was introduced to provide a common management API that offers consistent behavior. Under the covers, it is mostly PowerShell that is running, but the patterns the resources follow ensure that only the work that needs to be done is done, and when a resource is in the proper state, that it is left alone.

Being a platform feature means that there is a consistent, supported mechanism for customers and vendors to manage and evolve the configured state of Windows servers.

Standards Based

Desired State Configuration was built using standards already supported on the Windows platform - CIM and WSMAN.

CIM, Common Information Model, is the DMTF standard that WMI is based upon and provides structure and schema for DSC.

WSMAN, WS-Management, is a web services protocol and DMTF standard for management traffic. WinRM and PowerShell remoting are built on this transport as well.

While these might not be the greatest standards in the world, they do provide a consistent manner for interacting with the Desired State Configuration service.

An Evolving API

Though Windows Management Framework (WMF) 4 is still relatively new (it has been out for just over a year), WMF 5 development is well under way and includes many enhancements and bug fixes. One major change is to make the DSC engine’s API friendlier for third-party configuration management systems to use.

There was also a recent rollup patch for Server 2012 R2 (KB3000850) that contains a number of bugfixes and some tweaks for ensuring compatibility with changes coming in WMF 5.

Diving In

Now that we’ve got a bit of history and rationale for existence out of the way, we can dig into the substance of Desired State Configuration.

The Local Configuration Manager

The engine that manages the consistency of a Windows server is the Local Configuration Manager (LCM). The LCM is exposed as a WMI (CIM) class (MSFT_DscLocalConfigurationManager) in the Root/Microsoft/Windows/DesiredStateConfiguration namespace.

The LCM is responsible for periodically checking the state of resources in a configuration document. This agent controls

  • whether resources are allowed to reboot the node as part of a configuration cycle
  • how the agent should treat deviance from the configuration state (apply and never check, apply and report deviance, apply and autocorrect problems)
  • how often consistency checks should be run
  • and more…
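
For example, a minimal sketch of tuning those settings (WMF 4 syntax; the node name is a placeholder) is just a configuration containing a LocalConfigurationManager block, applied with Set-DscLocalConfigurationManager:

configuration LcmSettings
{
    node 'server1'
    {
        LocalConfigurationManager
        {
            RebootNodeIfNeeded             = $false
            ConfigurationMode              = 'ApplyAndAutoCorrect'
            ConfigurationModeFrequencyMins = 30
        }
    }
}

LcmSettings -OutputPath .\LcmSettings
Set-DscLocalConfigurationManager -Path .\LcmSettings -Verbose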

It has a plugin/extension point with the concept of Download Managers. Download Managers are used for Pull mode configurations. There are two download managers that ship in the box, one using a simple REST endpoint to retrieve configurations and one using an SMB file share. As it currently stands, these are not open for replacement by third parties (but it could be made so - please weigh in with the PowerShell team about that before WMF 5 is done!).

A Quick Note - Push vs. Pull

DSC configurations can be imperatively pushed to a node (via the Start-DscConfiguration cmdlet or directly to the WMI API), or if a Download Manager is configured it can pull a configuration and resources from a central repository (currently either SMB file share or REST-based pull server). If a node is in PULL mode, when a new configuration is retrieved, it is parsed to find the various modules required for the configuration to be applied. If any of the requisite modules and versions are not present on the local node, the pull server can supply those.
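
As a quick illustration of the push side, assuming a folder of generated MOF documents named SysAdvent (as in the example later in this post), a push looks something like:

Start-DscConfiguration -Path .\SysAdvent -Wait -Verbose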

DSC Resources

Resources are the second major component of the DSC ecosystem, and are what make things happen in the context of DSC. There are three ways of creating DSC resources: they can be written in PowerShell, as WMI classes, or (in Windows Management Framework 5) as PowerShell classes. As PowerShell class-based resources are still an experimental feature and the level of effort to create WMI-based resources is pretty high, we’ll focus on PowerShell-based resources here.

DSC resources are implemented as PowerShell modules. They are hosted inside another PowerShell module under a DSCResources folder. The host module needs to have a module metadata file and have a module version defined in order for it to host DSC resources.

The resources themselves are PowerShell modules that expose three functions or cmdlets:

  • Get-TargetResource
  • Test-TargetResource
  • Set-TargetResource

Get-TargetResource returns the currently configured state (or lack thereof) of the resource. The function returns a hashtable that the LCM converts to an object at a later stage.

Test-TargetResource is used to determine if the resource is in the desired state or not. It returns a boolean.

Set-TargetResource is responsible for getting the resource into the desired state. Set-TargetResource is only executed after Test-TargetResource.
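
A bare-bones sketch of that contract might look like the following (the Name and SomeSetting parameters are invented for illustration):

function Get-TargetResource
{
    param ([Parameter(Mandatory)][string]$Name)
    # Return the current state of the resource as a hashtable
    @{ Name = $Name; SomeSetting = 'current-value' }
}

function Test-TargetResource
{
    param ([Parameter(Mandatory)][string]$Name, [string]$SomeSetting)
    # Return $true only when the resource is already in the desired state
    (Get-TargetResource -Name $Name).SomeSetting -eq $SomeSetting
}

function Set-TargetResource
{
    param ([Parameter(Mandatory)][string]$Name, [string]$SomeSetting)
    # Do whatever work is required to reach the desired state
    Write-Verbose "Setting $Name to $SomeSetting"
}

Export-ModuleMember -Function *-TargetResource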

The Configuration DSL

Also introduced with Desired State Configuration are some domain specific language extensions on top of PowerShell. Actually, Windows Management Framework 4 added some public extension points in PowerShell for creating new keywords, which is what DSC uses.

Stick with me here, as it may get a bit confusing - I’ll be using “configuration” in two contexts. First is the configuration script. This is defined in PowerShell and can be defined in a script file, a module, or an ad hoc entry at the command line. The second use of “configuration” is in the context of the configuration document. This is the final serialized representation of the configuration for a particular machine or class of machines. This document is in Managed Object Format (MOF) and is how CIM classes are serialized.

The first keyword defined is configuration. The configuration keyword indicates that the subsequent scriptblock defines a configuration and should be parsed differently. All your standard PowerShell constructs and commands are valid inside of a configuration, as are a few new keywords. There are two static keywords and a series of dynamic keywords available inside a configuration.

The first two static keywords are node and Import-DscResource. I’ll deal with the latter first, since it seems very oddly named. Import-DscResource looks like a cmdlet or function by name, but it is a keyword that is valid only inside a configuration and only outside the context of a node. Import-DscResource identifies custom and third-party modules whose DSC resources should be made available in the configuration. By default, only DSC resources in modules located at $pshome/modules (usually c:\windows\system32\windowspowershell\v1.0\modules) can be used without Import-DscResource specifying which modules to make resources available from.

The second static keyword is node. Node is used to identify the machine or class of machines that the configuration targets. Resources are generally assigned inside node declarations.

The configuration also includes a number of potential dynamic keywords which represent the DSC resources available for the configuration.

An example configuration script looks something like:

configuration SysAdvent
{
    Import-DscResource -ModuleName cWebAdministration

    node $AllNodes.where({$_.role -like 'web'}).NodeName
    {
      windowsfeature IIS
      {
        Name = 'web-server'
      }

      cWebsite FourthCoffee
      {
        Name = 'FourthCoffee'
        State = 'Started'
        ApplicationPool = 'FourthCoffeeAppPool'
        PhysicalPath = 'c:\websites\fourthcoffee'
        DependsOn = '[windowsfeature]IIS'
      }
    }
}


The above configuration script, when run, creates a command in the current PowerShell session called SysAdvent. Running that command will generate a configuration document for every server in a collection that has the role of a web server. The configuration command has a common parameter of ConfigurationData which is where AllNodes comes from (more on that in a bit). The result of this command will be a MOF document describing the desired configuration for every node identified as a web server.

MOF documents created by the command are written in a folder (of the same name as the configuration) created in the current working directory. Files are named for the node they represent (e.g. server1.mof). You can specify a custom output location. Here is our newly created MOF document:

/*
@GenerationDate=12/22/2014 04:12:56
*/

instance of MSFT_RoleResource as $MSFT_RoleResource1ref
{
 SourceInfo = "::7::7::windowsfeature";
 ModuleName = "PSDesiredStateConfiguration";
 ModuleVersion = "1.0";
 ResourceID = "[WindowsFeature]IIS";
 Name = "web-server";
 ConfigurationName = "SysAdvent";
};

instance of PSHOrg_cWebsite as $PSHOrg_cWebsite1ref
{
 ResourceID = "[cWebsite]FourthCoffee";
 PhysicalPath = "c:\\websites\\fourthcoffee";
 State = "Started";
 ApplicationPool = "FourthCoffeeAppPool";
 SourceInfo = "::12::7::cWebsite";
 Name = "FourthCoffee";
 ModuleName = "cWebAdministration";
 ModuleVersion = "1.1.1";
 DependsOn = {
    "[WindowsFeature]IIS"};
 ConfigurationName = "SysAdvent";
};

instance of OMI_ConfigurationDocument
{
 GenerationDate="12/22/2014 04:12:56";
};

Other Tidbits

There are a few other things one should know in preparation for digging into DSC.

ConfigurationData and AllNodes

Configurations have support for a convention-based approach to separating environmental data from the structural configuration. The configuration script represents the structure or model for the machine, and the environmental data (via ConfigurationData) fleshes out the details.

ConfigurationData is represented by a hashtable with at least one key - AllNodes. AllNodes is an array of hashtables representing the nodes that should have configurations generated, and it becomes an automatic variable that can be referenced in the configuration (like in the example above). The hashtable passed via ConfigurationData is also available inside the configuration, so you can create custom keys and reference those in your configuration script. The PowerShell team reserves the right to use any key in the ConfigurationData hashtable that is prefixed with PS.


$ConfigurationData = @{
  AllNodes = @(
      @{NodeName = '*'; InterestingData = 'Every node can reference me.'},
      @{NodeName = 'Server1'; Role = 'Web'},
      @{NodeName = 'Server2'; Role = 'SQL'}
  )
}

SysAdvent -ConfigurationData $ConfigurationData

Resources in DSC are not ordered by default and there is no guarantee of ordering. The current WMF 4 implementation and the previews of WMF 5 all seem to serially process resources, but there is NO guarantee that will stay that way. If you need things to happen in a certain order, you need to use DependsOn to tell a resource what needs to happen first before that one can execute.

Node Names

In PUSH mode, the node name is either the server name, FQDN, or IP address (any valid way you can address that node via PowerShell remoting).

In PULL mode, the node name is not the server name. Servers are assigned a GUID and they use that to identify which configuration to retrieve from a pull server. Where this GUID comes from is up to you - you can generate them on the fly, pull one from AD, or use one from another system. Since the GUID is the identifier, you can use one GUID to represent an individual server or a class of servers.

WMF 5 - In Production

If you are running Windows Server 2012 R2, you can stay on the bleeding edge AND get production support. The PowerShell team recently announced that if you are using WMF 5, you can get production support for what they call “stable” designs - those features that either existed in previous versions of the Management Framework or have reached a level of maturity that the team is ready to support. Other features, which are more in flux, are labeled experimental and don’t carry the same support level. With this change, you can safely deploy WMF 5 and begin to test new features and get bug fixes faster than waiting for the full release. WMF previews are released roughly quarterly.

With WMF 5, you can dig into new and advanced features like Debug mode, partial configurations, and separate pull servers for different resource types.

Building an Ecosystem

No tooling is complete without a community around it and Desired State Configuration is no different.

PowerShellGet and OneGet

OneGet and PowerShellGet are coming onto the scene with WMF 5 (although after they release they should be available somewhat downlevel too). OneGet is a package manager manager and provides an abstraction layer on top of things like nuget, chocolatey, and PowerShellGet, and eventually tools like npm, RubyGems, and more. PowerShellGet provides a way to publish and consume external modules, including those that contain DSC resources.

Finding new resources becomes as easy as:

Find-Module -Includes DscResource

Third Parties


Back in July 2014, we at Chef made a preview of our DSC integration available (video, cookbook), and in September we shipped our first production-supported integration (the dsc_script resource), with more coming. DSC offers Chef increased coverage on the Windows platform.


The guys at ScriptRock (full disclosure - they are friends of mine) have done a pretty interesting thing by taking a configuration visualization and testing tool and offering an export of the configuration as a DSC script. Very cool.


There is a Puppet module on the Forge showing some DSC integration. I’m not too familiar with the state of that project, but it’s great to see it!


Brewmaster from Aditi is a deployment tool and can leverage DSC to get a server in shape to host a particular application, allowing you to distribute a DSC configuration with an application.


PowerShell.Org hosts a DSC Hub containing forums, blog posts, podcasts, videos and a free e-book on DSC.

So, What Are You Waiting For?

Start digging in! There’s a ton of content out there. Shout at me on Twitter (@stevenmurawski) or via my blog if you have any questions.

December 24, 2014

Day 24 - 12 days of SecDevOps

Written by: Jen Andre (@fun_cuddles)
Edited by: Ben Cotton (@funnelfiasco)

Ah, the holidays. The time of year when we want to be throwing back the eggnogs, chilling in front of our fake fireplaces, maybe catching a funny Christmas day movie… but oh no we can’t, because guess what, a certain entertainment company was held hostage by a security breach the likes of which corporate America has never seen before… and no more movie for you.

It’s an interesting time to be a security defender. The recent Sony breach has just put a period on the worst-of-the-worst scenarios that we tinfoil-hat, paranoid security people have been ranting about all along: one bad breach could be business shattering.

But let’s step back, and look at the theme of this blog: the 12 days of SecDevOps. Besides being a ridiculous title that I’m 90% sure my ops director chose specifically as a troll for me (thanks, Pete), it underlines an important concept. Whether `security` is in your job title or not, operations is increasingly becoming the front-line for implementing security defenses.

Given that reality, and the fact that security breaches are NOT going away, and that most of us don’t have yacht-sized security budgets, I thought it would be interesting to come up with 12 practical, high-impact things that small organizations could be doing to shore up their security posture.

Day 1: Fear and Loathing and Risk Assessment and Hipsters

Risk assessment. It’s not just some big words auditors love to use. It’s simply weighing the probability of bad things happening against the cost to mitigate the risk of that bad thing happening. And using that to make good security decisions as you make day-to-day architecture and ops choices:

risk = (threat) x (probability) x (business impact)*

*whoever told you there would be no math lied to you

You may not be aware of it, but as an ops person you are likely doing risk assessment already, except more likely around things like uptime and reliability. Consider this scenario:

  • John, the web guy, proposes replacing PostgreSQL with SomeNewHipsterDB.
  • You ask yourself, ‘huh, what’s the chances that I’m going to get paged at 3am because writes stop happening and my web site starts screaming in pain?’ You are probably not having warm-fuzzy feelings about this plan.
  • Your development and ops team evaluates the benefits to the engineering team and business for switching to SomeNewHipsterDB and weighs it against the probability that you are going to get woken up all of the time, and the impact it will have on your sunny disposition and decide that yeah… maybe not gonna do it.
  • Or, you do, except you mitigate this risk by saying ‘John, you will be forever paged for all SomeNewHipsterDB issues. Done.’

Cool. Now do this for security. Every time you are making architecture choices, or changing configuration of your infrastructure, or considering some new third-party service SaaS you’ll be sending data to, you should be asking yourself: what’s the impact if that service or system gets hacked? How will you mitigate the risks?

This doesn’t have to be a formal or fancy report. It can be a running text file or spreadsheet with all of the possible points of failure. Get everyone involved with thinking of ways pieces of the infrastructure or organization can be hacked, and ways you are protected against those worst-case scenarios. It can be like ‘ANYONE WHO OWNS OUR CHEF SERVER COULD DESTROY EVERYTHING [but we have uber-monitoring and Jane over there reviews audit logs daily]’. Start with the scenario - what if…? - and have conversations in which engineers and business owners defend why what you’re doing is good enough. Make security a fundamentally collaborative process.

Day 2: Shared Secrets: Figure it Out Now

There are three things in life that are inevitable: death, taxes… and the fact that a sales guy left to his own devices will always put all of his passwords in a plain text file (or if fancy, an Excel spreadsheet).

The lesson is this: password management isn’t something that just the technical team decides and manages for itself. We should be advocating organization-wide education on managing credentials, because guess what? Access to Salesforce, Gmail, and all of these SaaS services with sensitive business data are being used by people who are not engineers.

Solution? As part of every employee’s onboarding process, install password management on an employee’s workstation, and show them how to use it (e.g. 1Password or LastPass, or whatever your tool of choice is). Start doing this from the outset, as it’s best to figure this out on Day 1 rather than 200 employees in.

Day 3: Shared Secrets for Infrastructure, Too

When it comes to infrastructure secrets, there are extra concerns because in most cases, systems need to be able to access these secrets in a non-interactive, automated way (e.g. I need to be able to spin up an app server that knows how to authenticate to my database).

If all of your infra passwords start unencrypted somewhere in a git repo, You Are Going To Have A Bad Time. Noah has a good article on various options for managing shared secrets in your infrastructure.

Day 4: Config Management On All Of The Things (So You Aren’t Sweating from Shell Shocks)

This should be obvious to everyone who drinks from the DevOps Koolaid, but CM has done beautiful things for patch management. It may be tempting to deploy a one-off box used for dev manually without config management installed, but guess what? In the case of Browser Stack, that turned out to be a massive achilles heel.

Making the process easy for devs to get access to the infrastructure they need (while giving you the ability to manage systems) is key. Do this right away.

Day 5: Secure your Development Environments (Because No One Else Will)

If left to their own devices, development environments tend to veer toward chaos. This isn’t just because developers are lazy (and as a developer, I mean this in the nicest possible way) but because of the nature of the prototyping and testing process.

From a security perspective, this all means bad juju (see Browser Stack example above). I can assure you that if you start building your prototype or dev infrastructure exposed to the public internet, deploying it without even the basic config management, it will stay that way forever.

So: if you are using AWS, start with an Amazon VPC with strict perimeter security, and require VPN access for any development infrastructure. Get some config management on everything, even if it’s just for system patches.

Put some bounds around the chaos early on, and this will make it easy to mature the security controls as the product and organization mature.

Day 6: 2-Factor all of the things (well, the important things)

Require 2-factor wherever you can. Google Apps has made enforcing this super easy, and technologies like DuoSecurity and YubiKey make adding 2-factor to your critical infrastructure (e.g., your VPN accounts) far, far less annoying than it used to be.

Day 7: Encrypt your Emails (and other communications)

Encrypt your emails. It’s annoying to set up, but guess what? Hackers just love to post juicy stuff on pastebin. Again, from Day 1, help every single employee configure PGP or SMIME encryption as part of the onboarding process. Once installed, it’s relatively painless to use (as long as you don’t mind archaic mail clients from 1999).

This is especially important to drill into executives because they tend to have more sensitive emails (e.g. their private boardroom chatter), and are particularly susceptible to phishing-style attacks. With the recent Sony email leaks, you now have some leverage. You can throw the ‘Angelina Jolie’ emails in front of them and ask: how much do you think business and reputations would suffer were their entire email archives publicly disclosed via a breach?

For many of us, chat is as crucial as email in terms of the type of reputation-critical information we put there. It may not be reasonable to switch to a self-hosted chat solution, but in that case, ensure you are picking a service that helps YOU mitigate your risk. E.g., do you need all of the history? Do you need private history for user chats?

Day 8: Security Monitoring: Start Small, Plan Big

Put the infrastructure in place to collect as much security data as possible, then start slowly making potential security issues visible by adding reports and alerts that deal with threat scenarios you are most worried about.

Start small. Remember that risk assessment list you made? Identify what you are most afraid of (um, that PHP CMS that has hundreds of vulnerabilities reported per year? Your VPN server?) and tackle monitoring for those items first.

Instrumenting your infrastructure from day 1 for security monitoring (even if it’s just collecting all of the system and application logs) puts you in a good position later on to start sophisticated reporting and intrusion detection on that data.

Day 9: Code/Design Reviews

Although there have been a lot of advancements in static and dynamic source code analysis tools (which you can integrate right into your CI process), a good old-fashioned code review by a human being goes a long way. If you’re using GitHub, just make it part of the development workflow and testing pipeline. Whenever changes are made to authentication or authorization, have someone look for automated tests that deal with those cases.

Day 10: Test Your Users

Phish yourself regularly. It’s really easy to do, and can be illuminating to the rest of the business, which may not be as technical as the operations/engineering side and may not really understand the impact of opening an attachment in an email or not checking URLs where they are logging into a website. You can use some open source tools, but there are also many services now that you can pay to do this for you.

Day 11: Make an Incident Response Plan Now

So, you see something odd in your logs. Like, Bob your DBA ran a Postgres backup on production DB, tar’d it up, and sent it to an FTP server in Singapore. Bob lives in Reston VA, and this is definitely not normal. You start seeing evidence of other weird stuff ‘bob’ is doing that he shouldn’t be.

What now? Do you email Bob and say ‘something weird is happening?’ Do you call the Director of Ops? Do you put a message in a lonely chat room?

Figure out a plan for escalating possible critical security issues. It doesn’t have to be fancy or use specialized ITIL incident response workflow tools. Make a group in PagerDuty. Have an out-of-band channel for communicating details, in case your normal network goes the way of Sony and is totally compromised or just plain is not working. Maybe it’s as simple as an email list that doesn’t use the corporate email accounts, or a conference bridge everyone can hop on.

Day 12: Don’t be the Security ‘A**hole’

You. Yes, you. Don’t be the security a**hole that gets in everyone’s way and loses sight of the real reason for everyone’s existence: to run a business. You can be the security champion without being the blocker. In fact, that’s the only way to be effective. If a user is coming to you and saying ‘this is really really annoying, I don’t want to do it’ - listen to them. Too many security personnel disregard the usability issue of security controls for the sake of security theater, which leads to (unsurprisingly) abandonment, cynicism, and apathy when it comes to real security concerns.

DevOps is really a philosophy: it’s not a job title or a set of tools, it’s the concept of using modern tools and processes to facilitate collaboration between the engineers who deliver the code and those who must maintain it. Um, that was a lot of words, but the key word is collaboration. It’s no longer acceptable to throw ‘security over the wall’ and expect your users and ops people to just do what you say.

The best security cultures are not prescriptive, they are collaborative. They understand that business needs to get done. They are intellectually honest and admit ‘yeah, we could get hacked’ - but what can we do about this in a way that doesn’t bring everything to a halt? Zane Lackey has a great talk on building a modern security engineering organization that expounds many of these ideas, and more.

December 23, 2014

Day 23 - The Importance of Pluralism, or The Danger of the Letter "S"

Written by: Mike Fiedler (@mikefiedler)
Edited by: Hugh Brown (@saintaardvark)

Prologue: A Concept

One aspect of Chef that’s confusing to people comes up when searching for nodes that have some attribute: just what is the difference between a node’s reported ‘role’ attribute and its ‘roles’ attribute? It seems like it could almost be taken for a typo – but underlying it are some very deep statements about pluralism, pluralization, and the differences between them.

One definition of the term ‘pluralism’ is “a condition or system in which two or more states, groups, principles, sources of authority, etc., coexist.” And while pluralism is common in descriptions of politics, religion and culture, it also has a place in computing: to describe situations in which many systems are in more than one desired state.

Once a desired state is determined, it’s enforced. But then time passes – days, minutes, seconds or even nanoseconds – and every moment has the potential to change the server’s actual state. Files are edited, hardware degrades, new data is pulled from external sources; anyone who has run a production service can attest to this.

Act I: Terms

Businesses commonly offer products. These products may be composed of multiple systems, where each system could be a collection of services, which run on any number of servers, which run on some number of hosts. Each host, in turn, provides another set of services to the server that makes up part of the system, which then makes up part of the product, which the business sells.

An example to illustrate: MyFace offers a social web site (the product), which may need a web portal, a user authentication system, index and search systems, long-term photo storage systems, and many more. The web portal system may need servers like Apache or Nginx, running on any number of instances. A given server instance will need to use any number of host services, such as I/O, CPU, memory, and more.

So what we loosely have is: products => systems => services => servers => hosts => services. (Turtles, turtles, turtles.)

In Days of Yore, when a Company ran a ‘Web Site’, they may have had a single System, maybe some web content Service, made up of a web Server, a database Server (maybe even on the same host) - both consuming host services (CPU, Memory, Disk, Network) - to provide the Service the Company then sells, hopefully at a profit (right!?).

Back then, if you wanted to enact a change on the web and database at the same time (maybe release a new feature), it was relatively simple, as you could control both things in one place, at roughly the same time.


In English, to pluralize something, we generally add a suffix of “s” to the word. For instance, to convey more than one instance, “instance” becomes “instances”, “server” becomes “servers”, “system” becomes “systems”, “turtle” becomes “turtles”.

We commonly use pluralization to describe the concept of a collection of similar items, like “apples”, “oranges”, “users”, “web pages”, “databases”, “servers”, “hosts”, “turtles”. I think you see the pattern.

This extends even into programming languages and idiomatic use in development frameworks. For example, a Rails application will typically pluralize the table name for a model named Apple to apples.

This emphasizes that the table in question does not store a singular Apple, rather many Apple instances will be located in a table named apples.

This is not pluralism, this is pluralization - don’t get them confused. Let’s move on to the next act.

Act II: Progress

We’ve evolved quite a bit since the Days of Yore. Now, a given business product can span hundreds or even thousands of systems of servers running on hosts all over the world.

As systems grow, it becomes more difficult to enact a desired change at a deterministic point in time across a fleet of servers and hosts.

In the realm of systems deployment, many solutions perform what has become known as “test-and-repair” operations - meaning that, when provided a “map” of the desired state (typically expressed as human-written, readable code), they will test the current state of a given host and perform “repair” operations to bring the host to the desired state - whether that means installing packages, writing files, or the like.

Each system calls this map something different - cfengine:policies, bcfg2:specifications, puppet:modules, chef:recipes, ansible:playbooks, and so on. While they don’t always map 1:1, they all have some sort of concept for ‘things that are similar, but not the same.’ Such hosts will have unique IP addresses and hostnames, while sharing enough of a set of common features to be termed something like “web heads” or the like.

Act III: Change

In the previous sections, I laid the groundwork to understand one of the more subtle features in Chef. This feature may be available in other services, but I’ll describe the one I know.

Using Chef, there is a common deployment model where Chef Clients check in with a Chef Server to ask “What is the desired state I should have?” The Chef terminology is ‘a node asks the server for its run list’.

A run list can contain a list of recipes and/or roles. A recipe tells Chef how to accomplish a particular set of tasks, like installing a package or editing a file. A role is typically a collection of recipes, and maybe some role-specific metadata (‘attributes’ in Chef lingo).

The node may be in any state at this point. Chef will test for each desired state, and take action to enforce it: install this package, write that file, etc. The end result should either be “this node now conforms to the desired state” or “this node was unable to comply”.

When the node completes successfully, it will report back to Chef Server that “I am node ‘XYZZY’, and my roles are ‘base’ and ‘webhead’, my recipes are ‘base::packages’, ‘nginx’, ‘webapp’” along with a lot of node-specific metadata (IP addresses, CPU, Memory, Disk, and much more).

This information is then indexed and available for others to search for. A common use case we have is where a load balancing node will perform a search for all nodes holding the webhead role, and add these to the balancing list.

Pièce de résistance, or Searching for Servers

In a world where we continue to scale and deploy systems rapidly and repeatedly, we often choose to reduce the need for strong consistency amongst a cluster of hosts. This means we cannot expect to change all hosts at the precise same moment. Rather we opt for eventual consistency: either all my nodes will eventually be correct, or failures will occur and I’ll be notified that something is wrong.

This changes how we think about deployments and, more importantly, how we use our tools to find other nodes.

Using Chef’s search feature, a search like this:

webheads = search(:node, 'role:webheads')

will use the node index (a collection of node data) to look for nodes with the webheads role in the node’s run list - this will also return nodes that have not yet completed an initial Chef run and reported the complete run list back to Chef Server.

This means that my load balancer could find a node that is still mid-provisioning, and potentially begin to send traffic to a node that’s not ready to receive yet, based on the role assignment alone.

A better search, in this case might be:

webheads = search(:node, 'roles:webheads')

One letter, and all the difference.

This search now looks for an “expanded list” that the node has reported back. Any node with the role webheads that has completed a Chef run would be included. If the mandate is that only webhead nodes get the webhead role assigned to them, then I can safely use this search to include nodes that have completed their provisioning cycle.
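
As a sketch of the load balancer example from earlier (the template source, variables, and service names here are hypothetical), the safer roles: search feeds only converged nodes into the backend list:

webheads = search(:node, 'roles:webheads')

service 'haproxy' do
  supports reload: true
  action [:enable, :start]
end

template '/etc/haproxy/haproxy.cfg' do
  source 'haproxy.cfg.erb'
  # only nodes that have completed a run (and so report 'roles') end up here
  variables(backends: webheads.map { |n| n['ipaddress'] })
  notifies :reload, 'service[haproxy]'
end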

Another way to use this search to our benefit is to search one axis and compare with another to find nodes that never completed provisioning:

badnodes = search(:node, 'role:webheads AND NOT roles:webheads')
# Or, with knife command line:
$ knife search node 'role:webheads AND NOT roles:webheads'

This will grab any nodes with an assignment but not a completion – very helpful when launching large numbers of nodes.

Note: This is not restricted to roles; this also applies to recipe/recipes. I’ve used roles here, as we use them heavily in our organization, but the same search patterns apply for using recipes directly in a run list.


This little tidbit of role vs roles has proven time and again to be a confusing point when someone tries to pick up more of Chef’s searching abilities. But having both adjectives describe a state of the node is helpful in making a determination of what state the node is in, and whether it should be included in some other node’s list (such as in the loadbalancer/webhead example from before).

Now, you may argue against the use of roles entirely, or the use of Chef Server and search, and use something else for service discovery. This is a valid argument - but be careful you’re not tethering a racehorse to a city carriage. If you don’t fully understand its abilities, someday it might run away on you.


A surgeon spends a lot of time learning how to use a sharpened bit of metal to fix the human body. While there are many instruments he or she will go on to master, the scalpel remains the fundamental tool, available when all else is gone.

While we don’t have the same risks involved as a surgeon, the tools we use can be more complex, and provide us with a large amount of power at our fingertips.

It behooves us to learn how they work, and when and how to use their features to provide better systems and services for our businesses.

Chef’s ability to discern between what a node has been told about itself, and what it reports about itself, can make all the difference when using Chef to accomplish complex deployment scenarios and maintain flexible infrastructure as code. This not only lets you accomplish fundamentals of service discovery and less hard-coded configurations, but lets you avoid the uncertainty of bringing in yet another outside tool.

On that note, Happy Holiday(s)!

December 22, 2014

Day 22 - Largely Unappreciated Applicability

Written by: John Vincent (@lusis)
Edited by: Joseph Kern (@josephkern)

I have had the privilege of writing a post for SysAdvent for the past several years. In general these posts have been focused on broader cultural issues. This year I wanted to do something more technical and this topic gave me a chance to do that. It’s also just REALLY cool so there’s that.


I’m sure most people are familiar with Nginx but I’m going to provide a short history anyway. Nginx is a webserver created by Igor Sysoev around 2002 to address the C10K problem. The C10K problem isn’t really a “problem” anymore in the sense that it was originally. It’s morphed into the C10M problem. With the rise of sensors, we may be dealing with a Cgazillion problem before we know it.

Nginx addressed this largely with an event loop (I know some folks think the event loop was invented in 2009). The dominant webserver at the time (and still), Apache, used a model of spawning a new process or a new thread for each connection. Nginx did it differently in that it spawned a master process (which handles configuration, launching workers, and the like) and a pool of worker processes, each with its own event loop. Workers share no state between each other and select from a shared socket to process requests. This particular model works and scales very well. There’s more on the history of Nginx in its section in the second edition of AOSA. It’s a good read and while not 100% current, the basics are unchanged.


Lua is a programming language invented in 1993. The title of this article is a shout out to how underappreciated Lua is not only as a language but in its myriad uses. Most people who have heard of Lua know it as the language used for World of Warcraft plugins.

Lua is an interesting language. Anyone with experience in Ruby will likely find themselves picking it up very quickly. It has a very small core language, first-class functions and coroutines. It is dynamically typed and has one native data structure - the table. When you work in Lua, you will learn to love and appreciate the power of tables. They feel a lot like a ruby hash and are the foundation of most advanced Lua.

It has no classes, but they can be implemented after a fashion using tables. Since Lua has first-class functions, you can create a “class” by lumping data and functions into a table. There’s no inheritance, but instead you have prototypes (there’s a bit of sugar to help you out when working with these ‘objects’ - e.g. calling foo:somefunc() implies self as the first argument, as opposed to foo.somefunc(self)).
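
Here is a small sketch of that pattern - a table acting as a prototype, with the colon syntax supplying self:

local Dog = {}
Dog.__index = Dog

function Dog.new(name)
  local self = setmetatable({}, Dog)   -- new table, using Dog as its prototype
  self.name = name
  return self
end

function Dog:bark()                    -- same as Dog.bark(self)
  print(self.name .. " says woof")
end

local rex = Dog.new("Rex")
rex:bark()                             -- prints "Rex says woof"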

For a good read on the language’s history, see Wikipedia and the Lua website.

For some basics on the language itself, the Wikipedia article has code samples, and there is also the official documentation. There is also a section on Lua in the newest edition of the Seven Languages series - Seven More Languages in Seven Weeks.

I’ve also written a couple of modules (primarily for use with OpenResty).

If you want to see an example of how the “classes” work with Lua, take a look at the github example and compare the usage described in the README with the module itself.

Combining the two

As I mentioned, Lua is an easily embeddable language. I’ve been unable to find a date on when Lua support was added to Nginx but it was a very early version (~ 0.5).

One of the pain points of Nginx is that it doesn’t support dynamically loaded modules. All extended functionality outside the core must be compiled in. Lua support in Nginx made it so that you could add some advanced functionality to Nginx via Lua that would normally require a C module and a recompile.

Much of the Nginx API itself is exposed to Lua directly, and Lua can be hooked into multiple places in the Nginx request lifecycle - for example, to set variables, rewrite requests, control access, generate content, and filter response headers and bodies.

All of these are documented on the Nginx website.

For example, if I wanted to have the response body be entirely created by Lua, I could do the following in Nginx:

location /foo {
  content_by_lua '
    ngx.header.content_type = "text/plain"
    local username = "bob"
    ngx.say("hello ", username)
  ';
}

This example would return hello bob as plain text to your browser when you requested /foo from Nginx.

Obviously escaping could get to be a headache here so most of the *_by_lua directives (which is for inlined Lua code in the Nginx config files) can be replaced with a *_by_lua_file where the Lua code is stored in an external file.

Another neat trick you have available is the cosocket API, where you can actually open arbitrary non-blocking network connections via Lua from inside an Nginx worker.
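
A rough sketch of what that looks like inside a content_by_lua block (the host and port are placeholders, and error handling is trimmed for brevity):

local sock = ngx.socket.tcp()
sock:settimeout(1000)                          -- milliseconds
local ok, err = sock:connect("127.0.0.1", 6379)
if not ok then
  ngx.say("failed to connect: ", err)
  return
end
sock:send("PING\r\n")
local line = sock:receive()                    -- read one line from the upstream
ngx.say("upstream said: ", line)
sock:setkeepalive(10000, 100)                  -- hand the connection back to the pool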

As you can see, this is pretty powerful. Additionally, the Lua functionality is provided in Nginx via a project called LuaJIT which offers amazing speed and predictable usage. By default, Lua code is cached in Nginx but this can be disabled at run-time to help speed up the development process.

Enter the OpenResty

If it wasn’t clear yet, the combination of Nginx and Lua basically gives you an application server right in the Nginx core. Others have created Lua modules specifically for use within Nginx and a few years ago an enterprising soul started bundling them up into something called OpenResty.

OpenResty combines checkpointed versions of Nginx, modified versions of the Lua module (largely maintained by the OpenResty folks anyway), curated versions of LuaJIT and a boatload of Nginx-specific Lua modules into a single distribution. OpenResty builds of Nginx can be used anywhere out-of-the-box that you would use a non-Lua version of Nginx. Currently OpenResty is sponsored by CloudFlare where the primary author, Yichun Zhang (who prefers to go by “agentzh” everywhere) is employed.

OpenResty is a pretty straightforward “configure/make/make install” beast. There is a slightly dated omnibus project on Github from my friend Brian Akins that we’ve contributed to in the past (and will be contributing our current changes back to in the future). Much of my appreciation and knowledge of Lua and OpenResty comes directly from Brian and his omnibus packages are how I got started.

But nobody builds system packages anymore

Obviously system packages are the domain of greyhaired BOFHs who think servers are for serving. Since we’re all refined and there are buzzword quotas to be maintained, you should probably just use Docker (but you have to say it like Benny says “Spaceship”).

Seriously though, Docker as a packaging format is pretty neato and for what I wanted to do, Docker was the best route. To that end I give you an OpenResty tutorial in a box (well, a container).

The purpose of this repo is to help you get your feet wet with some examples of using Lua with Nginx via the latest OpenResty build. It ships with a Makefile to wrap up all the Docker invocations and hopefully make things dead simple. It works its way up from the basics I’ve described all the way to communicating between workers via a shared dictionary, making remote API calls to Github, two Slack chat websocket “clients” and the skeleton of a dynamic load balancer in Nginx backed by etcd.

In addition, because I know how difficult it can be to develop and troubleshoot against code running inside Nginx, I’ve created a web-based REPL for testing out and experimenting with the Nginx Lua API.

To use the basic examples in the container, you can simply clone the repo and run make all. This will build the container and then start OpenResty listening on port 3131 (and etcd on 5001 for one of the demos). The directory var_nginx will be mounted inside the container as /var/nginx and contains all the necessary config files and Lua code for you to poke/prod/experiment with. Logs will be written to var_nginx/logs so you can tail them if you’d like. As you can see it also uses Bootstrap for the UI so we’ve pretty much rounded out the “what the hell have you built” graph.

Please note that while the repo presents some neat tricks, the code inside is not optimized by any stretch. The etcd code especially may have some blocking implications but I’ve not yet confirmed that. The purpose is to teach and inspire more than “take it and run it in prod”.

Advanced Examples

If you’d like to work with the Slack examples, you’ll need to generate a slack “bot” integration token for use. The Makefile includes support for running an etcd container appropriate for use with the tutorial container. If you aren’t a Slack user then here’s a screenshot so you can see what it WOULD look like:

Wrap up

Maybe this post has inspired you to at least take a look at OpenResty. Lua is a really neat language and very easy to pick up and add to your toolbelt. We use OpenResty builds of Nginx in many places internally, from proxy servers to even powering our own internal SSO system based on Github OAuth and group memberships. While most people simply use Nginx as a proxy and static content service, we treat it like an application server and leverage the flexibility of not requiring another microservice to handle certain tasks (in addition to using it as a proxy and static content service).

The combination of Nginx and Lua won’t replace all your use cases but by learning the system better, you can better leverage the use of Nginx across the board.

December 21, 2014

Day 21 - Baking Delicious Resources with Chef

Written by: Jennifer Davis (@sigje)
Edited by: Nathen Harvey (@nathenharvey)

Growing up, every Christmas time included the sweet smells of fresh baked cookies. The kitchen would get incredibly messy as we prepped a wide assortment from carefully frosted sugar cookies to peanut butter cookies. Holiday tins would be packed to the brim to share with neighbors and visiting friends.

Sugar Cookies

My earliest memories of this tradition are of my grandmother showing me how to carefully imprint each peanut butter cookie with a crosshatch. We’d dip the fork into sugar to prevent the dough from sticking and then carefully press into the cookie dough. Carrying on the cookie tradition, I am introducing the concepts necessary to extend your Chef knowledge and bake up cookies using LWRPs.

To follow the walkthrough example as written you will need to have the Chef Development Kit (Chef DK), Vagrant, and Virtual Box installed (or use the Chef DK with a modified .kitchen.yml configuration to use a cloud compute provider such as Amazon).

Resource and Provider Review

Resources are the fundamental building blocks of Chef. There are many available resources included with Chef. Resources are declarative interfaces, meaning that we describe the state we want the resource to be in, rather than the steps required to reach that state. Resources have a type, name, one or more parameters, actions, and notifications.

Let’s take a look at one sample resource, Route.

route "NAME" do
  gateway ""
  action :delete
end

The route resource describes the system routing table. The type of resource is route. The name of the resource is the string that follows the type. The route resource includes optional parameters of device, gateway, netmask, provider, and target. In this specific example, we are only declaring the gateway parameter. In the above example we are using the delete action and there are no notifications.

Each Chef resource includes one or more providers responsible for actually bringing the resource to the desired state. It is usually not necessary to select a provider when using the Chef-provided resources, Chef will select the best provider for the job at hand. We can look at the underlying Chef code to examine the provider. For example here is the Route provider code and rubydoc for the class.

While there are ready-made resources and providers, they may not be sufficient to meet our needs to programmatically describe our infrastructure with small clear recipes. We reach that point where we want to reduce repetition, reduce complexity, or improve readability. Chef gives us the ability to extend functionality with Definitions, Heavy Weight Resources and Providers (HWRP), and Light Weight Resources and Providers (LWRP).

Definitions are essentially recipe macros. They are stored within a definitions directory within a specific cookbook. They cannot receive notifications.

HWRPs are pure ruby stored in the libraries directory within a specific cookbook. They cannot use core resources from the Chef DSL by default.

LWRPs, the main subject of this article, are a combination of Chef DSL and ruby. They are useful to abstract repeated patterns. They are parsed at runtime and compile into ruby classes.


Extending resources requires us to revisit the elements of a resource: type, name, parameters, actions, and notifications.

Idempotence and convergence must also be considered.

Idempotence means that the provider ensures that the state of a resource is only changed if a change is required to bring that resource into compliance with our desired state or policy.

Convergence means that the provider brings the current resource state closer to the desired resource state.

Resources have a type. The LWRP’s resource type is defined by the name of the file within the cookbook. This implicit name follows the formula of: cookbook_resource. If the default.rb file is used the new resource will be named cookbook.

File names should match for the LWRP’s resource and provider within the resources and providers directories. The chef generators will ensure that the files are created appropriately.

The resource and its available actions are described in the LWRP’s resource file.

The steps required to bring the piece of the system to the desired state are described in the LWRP’s provider file. Both idempotence and convergence must also be considered when writing the provider.

Resource DSL

The LWRP resource file defines the characteristics of the new resource we want to provide using the Chef Resource DSL. The Resource DSL has multiple methods: actions, attribute, and default_action.

Resources have a name. The Resource DSL allows us to tag a specific parameter as the name of the resource with :name_attribute.

Resources have actions. The Resource DSL uses the actions method to define a set of supported actions with a comma separated list of symbols. The Resource DSL uses the default_action method to define the action used when no action is specified in the recipe.

Note: It is recommended to always define a default_action.

Resources have parameters. The Resource DSL uses the attribute method to define a new parameter for the resource. We can provide a set of validation parameters associated with each parameter.

Let’s take a look at an example of a LWRP resource from existing cookbooks.

djbdns includes the djbdns_rr resource.

actions :add
default_action :add

attribute :fqdn,     :kind_of => String, :name_attribute => true
attribute :ip,       :kind_of => String, :required => true
attribute :type,     :kind_of => String, :default => "host"
attribute :cwd,      :kind_of => String

The rr resource as defined here will have one action: add, and 4 attributes: fqdn, ip, type, and cwd. The validation parameters for the attribute show that all of these attributes are expected to be of the String class. Additionally ip is the only required attribute when using this resource in our recipes.
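
Using that resource from a recipe is then as simple as something like the following (the FQDN and IP are invented for illustration). Because fqdn is the name attribute, the resource name doubles as the record’s FQDN, and type falls back to its default of "host":

djbdns_rr 'www.example.com' do
  ip '192.0.2.10'
end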

Provider DSL

The LWRP provider file defines the “how” of our new resource using the Chef Provider DSL.

In order to ensure that our new resource functionality is idempotent and convergent we need the:

  • desired state of the resource
  • current state of the resource
  • end state of the resource after the run

Requirement      Chef DSL Provider Method
-----------      ------------------------
Desired State    new_resource
Current State    load_current_resource
End State        updated_by_last_action

Let’s take a look at an example of a LWRP provider from an existing cookbook to illustrate the Chef DSL provider methods.

djbdns includes the djbdns_rr provider.

action :add do
  type = new_resource.type
  fqdn = new_resource.fqdn
  ip = new_resource.ip
  cwd = new_resource.cwd ? new_resource.cwd : "#{node['djbdns']['tinydns_internal_dir']}/root"

  unless IO.readlines("#{cwd}/data").grep(/^[\.\+=]#{fqdn}:#{ip}/).length >= 1
    execute "./add-#{type} #{fqdn} #{ip}" do
      cwd cwd
      ignore_failure true
    end
    new_resource.updated_by_last_action(true)
  end
end

new_resource returns an object that represents the desired state of the resource. We can access all attributes as methods of that object. This allows us to know programmatically our desired end state of the resource.

type = new_resource.type assigns the value of the type attribute of the new_resource object that is created when we use the rr resource in a recipe with a type parameter.


load_current_resource is an empty method by default. We need to define this method such that it returns an object that represents the current state of the resource. This method is responsible for loading the current state of the resource into @current_resource.

In our example above we are not using load_current_resource.
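
If the provider did use it, a sketch might look roughly like this. This is illustrative only, not code from the djbdns cookbook, and the @record_exists instance variable is an invented convention:

def load_current_resource
  @current_resource = Chef::Resource::DjbdnsRr.new(new_resource.name)
  @current_resource.fqdn(new_resource.fqdn)
  @current_resource.ip(new_resource.ip)

  # Record whether a matching line already exists in the tinydns data file,
  # so the action block can decide whether any change is required.
  data_file = "#{node['djbdns']['tinydns_internal_dir']}/root/data"
  @record_exists = ::File.exist?(data_file) &&
    IO.readlines(data_file).grep(/^[\.\+=]#{new_resource.fqdn}:#{new_resource.ip}/).any?
end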


updated_by_last_action notifies Chef that a change happened to converge our resource to its desired state.

As part of the unless block executing new_resource.updated_by_last_action(true) will notify Chef that a change happened to converge our resource.


We need to define a method in the provider for each action declared in the LWRP resource file. This method should handle doing whatever is needed to bring the resource to the desired state.

We see that the one action defined is :add, which matches the actions declared in our LWRP resource file.

Cooking up a cookies_cookie resource

Preparing our kitchen

First, we need to set up our kitchen for some holiday baking! Test Kitchen is part of the suite of tools that come with the Chef DK. This omnibus package includes a lot of tools that can be used to personalize and optimize your workflow. For now, it’s back to the kitchen.

Kitchen Utensils

Note: On Windows you need to verify your PATH is set correctly to include the installed packages. See this article for guidance.

Download and install both Vagrant and VirtualBox if you don’t already have them. You can also modify your .kitchen.yml to use AWS instead.

We’re going to create a “cookies” cookbook that will hold all of our cookie recipes. First we will use the chef CLI to generate the cookbook using the default generator. You can customize default cookbook creation for your own environments.

chef generate cookbook cookies
Compiling Cookbooks...
Recipe: code_generator::cookbook

followed by more output.

We’ll be working within our cookies cookbook so go ahead and switch into the cookbook’s directory.

$ cd cookies

By running chef generate cookbook we get a number of preconfigured items. One of these is a default Test Kitchen configuration file. We can examine our kitchen configuration by looking at the .kitchen.yml file:

$ cat .kitchen.yml

---
driver:
  name: vagrant

provisioner:
  name: chef_zero

platforms:
  - name: ubuntu-12.04
  - name: centos-6.5

suites:
  - name: default
    run_list:
      - recipe[cookies::default]

The driver section configures how Test Kitchen creates and manages instances. In this case we will be using the kitchen-vagrant driver that comes with Chef DK. We could easily configure this to use AWS or another cloud compute provider instead.

The provisioner is chef_zero which allows us to use most of the functionality of integrating with a Chef Server without any of the overhead of having to install and manage one.

The platforms define the operating systems that we want to test against. Today we will only work with the CentOS platform as defined in this file. You can delete or comment out the Ubuntu line.
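
With the Ubuntu entry commented out, the platforms section would read:

platforms:
  # - name: ubuntu-12.04
  - name: centos-6.5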

The suites section defines what we want to test. This includes a run_list with the cookies::default recipe.

Next, we will spin up the CentOS instance.

Preheat Oven

Note: Test Kitchen will automatically download the vagrant box file if it’s not already available on your workstation. Make sure you’re connected to a sufficiently speedy network!

$ kitchen create

Let’s verify that our instance has been created.

$ kitchen list

➜  cookies git:(master) ✗ kitchen list
Instance             Driver   Provisioner  Last Action
default-centos-65    Vagrant  ChefZero     Created

This confirms that a local virtualized node has been created.

Let’s go ahead and converge our node which will install chef on the virtual node.

$ kitchen converge

Cookie LWRP prep

We need to create a LWRP resource and provider file and update our default recipe.

We create the LWRP base files using the chef CLI included in the Chef DK. This will create two files: resources/cookie.rb and providers/cookie.rb.

chef generate lwrp cookie

Let’s edit our cookie LWRP resource file and add a single supported action of create.

Edit the resources/cookie.rb file with the following content:

actions :create

Next edit our cookie LWRP provider file and define the supported create action. Our create method will log a message that includes the name of our new_resource to STDOUT.

Edit the providers/cookie.rb file with the following content:


action :create do
  log " My name is #{new_resource.name}"
end

Note: use_inline_resources was introduced in Chef version 11. This modifies how LWRP resources are handled to enable the inline evaluation of resources. This changes how notifications work, so read carefully before modifying LWRPs in use!

Note: The Chef Resource DSL method is actions because we are defining multiple actions that will be defined individually within the providers file.

We will now test out our new resource functionality by writing a recipe that uses it. Edit the cookies cookbook default recipe. The new resource follows the naming format of #{cookbookname}_#{resource}.

cookies_cookie "peanutbutter" do
   action :create

Converge the image again.

$ kitchen converge

Within the output:

Converging 1 resources
Recipe: cookies::default
  * cookies_cookie[peanutbutter] action create[2014-12-19T02:17:39+00:00] INFO: Processing cookies_cookie[peanutbutter] action create (cookies::default line 1)
 (up to date)
  * log[ My name is peanutbutter] action write[2014-12-19T02:17:39+00:00] INFO: Processing log[ My name is peanutbutter] action write (/tmp/kitchen/cache/cookbooks/cookies/providers/cookie.rb line 2)
[2014-12-19T02:17:39+00:00] INFO:  My name is peanutbutter

Our cookies_cookie resource is successfully logging a message!

Improving the Cookie LWRP

We want to improve our cookies_cookie resource. We are going to add some parameters. To determine the appropriate parameters of a LWRP resource we need to think about the components of the resource we want to modify.

Delicious delicious ingredients parameter

There are some basic common components of cookies. The essential components are fat, binder, sweetener, leavening agent, flour, and additions like chocolate chips or peanut butter. The fat provides flavor, texture, and spread of a cookie. The binder helps “glue” the ingredients together. The sweetener affects the color, flavor, texture, and tenderness of a cookie. The leavening agent adds air to our cookie, changing its texture and height. The flour provides texture as well as the bulk of the cookie structure. The additional ingredients differentiate each cookie’s flavor.

A generic recipe would involve combining all the wet ingredients and dry ingredients separately and then blending them together adding the additional ingredients last. For now, we’ll lump all of our ingredients into a single parameter.

Other than ingredients, we need to know the temperature at which we are going to bake our cookies, and for how long.

When we add parameters to our LWRP resource, each one starts with the keyword attribute, followed by the attribute name and zero or more validation parameters.

Edit the resources/cookie.rb file:

actions :create  

attribute :name, :name_attribute => true
attribute :bake_time
attribute :temperature
attribute :ingredients

We’ll update our recipe to incorporate these attributes.

cookies_cookie "peanutbutter" do
   bake_time 10
   temperature 350
   action :create

Using a Data Bag

While we could add the ingredients in a string or array, in this case we will separate them away from our code. One way to do this is with data bags.

We’ll use a data_bag to hold our cookie ingredients. Production data_bags normally live outside of our cookbook, within our organization’s policy repo. We are developing and using chef_zero, so we’ll include our data bag within our cookbook in the test/integration/data_bags directory.

To do this in our development environment we update our .kitchen.yml so that chef_zero finds our data_bags.

For testing our new resource functionality, add the following to the default suite section of your .kitchen.yml:

data_bags_path: "test/integration/data_bags"

At this point your .kitchen.yml should look roughly like this (a sketch assuming you removed the Ubuntu platform; the exact file generated by your Chef DK version may differ slightly):
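
---
driver:
  name: vagrant

provisioner:
  name: chef_zero

platforms:
  - name: centos-6.5

suites:
  - name: default
    run_list:
      - recipe[cookies::default]
    data_bags_path: "test/integration/data_bags"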

Next, create the directory for our cookies_ingredients data_bag:

$ mkdir -p test/integration/data_bags/cookies_ingredients

Create a peanutbutter item in our cookies_ingredients data_bag by creating a file named peanutbutter.json in the directory we just created:

  "id" : "peanutbutter",
  "ingredients" :
      "1 cup peanut butter",
      "1 cup sugar",
      "1 egg"

We’ll update our recipe to actually use the cookies_ingredients data_bag:

search('cookies_ingredients', '*:*').each do |cookie_type|
  cookies_cookie cookie_type['id'] do
    ingredients cookie_type['ingredients']
    bake_time 10
    temperature 350
    action :create
  end
end

Now, we’ll update our LWRP resource to actually validate input parameters, and update our provider to create a file on our node, and use the attributes. We’ll also create an ‘eat’ action for our resource.

Edit the resources/cookie.rb file with the following content:

actions :create, :eat

attribute :name, :name_attribute => true
# bake time in minutes
attribute :bake_time, :kind_of => Integer
# temperature in F
attribute :temperature, :kind_of => Integer
attribute :ingredients, :kind_of => Array

We’ll update our provider so that we create a file on our node rather than just logging to STDOUT. We’ll use a template resource in our provider, so we will create the required template.

Create a template file:

$ chef generate template basic_recipe

Edit the templates/default/basic_recipe.erb to have the following content:

Recipe: <%= @name %> cookies

<% @ingredients.each do |ingredient| %>
<%= ingredient %>
<% end %>

Combine wet ingredients.
Combine dry ingredients.

Bake at <%= @temperature %>F for <%= @bake_time %> minutes.

Now we will update our cookie provider to use the template, and pass the attributes over to our template. We will also define our new eat action, that will delete the file we create with create.

Edit the providers/cookie.rb file with the following content:


action :create do

  template "/tmp/#{new_resource.name}" do
    source "basic_recipe.erb"
    mode "0644"
    variables(
      :ingredients => new_resource.ingredients,
      :bake_time   => new_resource.bake_time,
      :temperature => new_resource.temperature,
      :name        => new_resource.name
    )
  end
end

action :eat do

  file "/tmp/#{new_resource.name}" do
    action :delete
  end
end

Try out our updated LWRP by converging your Test Kitchen.

kitchen converge

Let’s confirm the creation of our peanutbutter resource by logging into our node.

kitchen login

Our new file was created at /tmp/peanutbutter. Check it out:

[vagrant@default-centos-65 ~]$ cat /tmp/peanutbutter
Recipe: peanutbutter cookies

1 cup peanut butter
1 cup sugar
1 egg

Combine wet ingredients.
Combine dry ingredients.

Bake at 350F for 10 minutes.

Peanut Butter Cookie Time

Let’s try out our eat action. Update our recipe with:

search("cookies_ingredients", "*:*").each do |cookie_type|
  cookies_cookie cookie_type['id'] do
    action :eat

Converge our node, login and verify that the file doesn’t exist anymore.

$ kitchen converge
$ kitchen login
Last login: Fri Dec 19 05:45:23 2014 from
[vagrant@default-centos-65 ~]$ cat /tmp/peanutbutter
cat: /tmp/peanutbutter: No such file or directory

To add additional cookie types we can just create new data_bag items.
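
For example, a hypothetical chocolate chip item saved as test/integration/data_bags/cookies_ingredients/chocolatechip.json might look like this (the ingredient list is illustrative):

{
  "id" : "chocolatechip",
  "ingredients" : [
      "1 cup butter",
      "1 cup sugar",
      "2 eggs",
      "2 cups flour",
      "2 cups chocolate chips"
  ]
}

On the next converge, the search in our recipe would pick it up and render a second recipe card alongside the peanut butter one.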

Cleaning up the kitchen

Messy Kitchen

Finally, once we are done testing in our kitchen today, we can go ahead and clean up our virtualized instance with kitchen destroy.

kitchen destroy

Next Steps

We have successfully made up a batch of peanut butter cookies, yet barely scratched the surface of extending Chef with LWRPs. Check out Chapter 8 in Jon Cowie’s book Customizing Chef and Doug Ireton’s helpful three-part article on creating LWRPs. You should examine and extend this example to use load_current_resource and updated_by_last_action. Try to figure out how to add why_run functionality. I look forward to seeing you share your LWRPs with the Chef community!

Feedback and suggestions are welcome

Additional Resources

Thank you

Thank you to my awesome editors who helped me ensure that these cookies were tasty!

December 20, 2014

Day 20 - The Pursuit of Learning through Bad Ideas

Written by: Michael Stahnke (@stahnma)
Edited by: Michelle Carroll (@miiiiiche)

I have a confession: I love terrible ideas. I really enjoy trying to think of the absolute worst way to solve problems, largely because being a contrarian is fun. Then I realized something — coming up with the exact wrong way to solve a problem is not only a good time, but can actually be helpful.

My love for sharing terrible ideas was codified when one of my teams (and several people from other areas inside engineering) decided to embrace this behavior and create “Bad Idea Monday.” After participating in several debates fueled by the worst ideas available, some tangible benefits emerged.

Happy employees do better work. This has been proven countless times. What makes employees happy? Fun things, perks, benefits, and pay are up there, but in my experience, what really gets people engaged is learning. Encouraging and embracing new ways of learning are paramount to building the culture you want. Capturing the desire to talk about the worst ways to solve your problems provides a lot of fresh opportunities to learn.

The worst can make you better

As you throw out the absolute worst idea possible to solve something, several outcomes can occur.

  1. Your idea, while terrible, just isn’t bad enough. Somebody else in the discussion thinks they can do better (worse). They try to one-up you. They often succeed, and it’s amazing. This sport of spouting bad ideas leads to collaboration, as one person’s idea gets picked up and added to by others.

  2. A terrible idea isn’t understood by everybody to be terrible. This often happens when there’s a wide range of experience, either in the job, or within this specific problem domain. The discussion can help spread knowledge, as a more experienced team member explains why your solution of “install head mounted GoPro cameras for auditing purposes” might not actually make your audits any cleaner.

  3. Experienced people get a new viewpoint on problems. The problems you face today may be similar to ones you’ve seen before. Trying to think of the worst possible solution forces you to deviate from your usual viewpoint, and can lead to another level of understanding. It can also lead to you reaching for tools or solutions that you’d normally not have considered.

  4. You come up with a real, legitimate solution. It’s likely one you and your team would not have arrived at without getting creative and trying to think of the worst idea. For example, choosing a Google spreadsheet[1] as the back end for an internal service. It sounds like a terrible idea. A spreadsheet isn’t really a database. It doesn’t really have a great query language, it can’t handle lots of updates per second, but it has access control, it’s a familiar interface for non-technical folks, and doesn’t require significant upgrades or maintenance.

  5. The team learns to debate and discuss ideas. This is important. Because these ideas are intentionally terrible, people don’t get offended when somebody shoots down the idea (or builds on it to come up with something worse). It helps the team learn how to debate properly. Learning how to dismantle ideas without judgment is a much healthier and more productive practice than attacking the person with the idea.

How does it work?

Bad Idea Monday doesn’t have to be a Monday, but it works well when it is. Because, let’s be honest, Mondays are the day of the week that people normally dread. There are copious jokes, cartoons, and comics about how much we all hate the first day back at the work after a nice weekend. Capitalize on Monday’s bad reputation, and use it to get your team to generate the worst possible ideas.

How do you get started? First, you need a problem. This problem could come from your ticketing system, a chat conversation, or a face-to-face discussion of something just not working the way it should. The input queue is more or less limitless. After you have a situation, don’t try to solve it — at least not the way you normally would. Turn it on its head. This doesn’t require a meeting. It can happen in any medium, and occur numerous times throughout the day.

Allow me to walk through an example.

Bad Idea Monday in practice

When Puppet Labs was moving our server-side stack from a Ruby-based solution to Clojure and JRuby, we uncovered a new set of problems. We knew we needed a JRE, but that was about all we knew. Did we need a specific JRE? Did we want to compile a JVM for the ~30 permutations of platforms supported as masters on Puppet Enterprise? Were we going to have to package it? Did we want to require that the end-user brings in libalsa because that’s what normal JVMs do?

So the fundamental problem: how do we ship/bundle a JVM to our enterprise customers? What’s the worst answer to this? We could just unzip a binary of the JVM and somehow work it into our filesystem path — that solution was rejected because it wasn’t bad enough. We could use netcat and dd for distribution, but that wasn’t interesting enough. Then we got an idea. An awful idea. We got a wonderful, awful idea!

the grinch gets a bad idea

We ship the JVM as a gem. Rubygems allows you to compile things on the fly. Rubygems is cross platform. Rubygems is available over the network. Sure, this content wasn’t Ruby, but why should that stop us?

This is a terrible idea. Why? Well, you would need way too many dependencies. You have to have Ruby on the box already. You have to be connected to a network for a successful installation. You can’t express C-header dependencies in Rubygems. You have to have a compiler on the target system. You have to wait something like 35 minutes for the JDK to compile during a Rubygems installation. In most cases, you actually need a JVM in order to bootstrap and compile a JVM. You have to write a mkmf file to instruct the machine how to do that. At the time, signing gems was basically unheard of. You probably don’t want the JVM in your Ruby load path, but maybe you could move the files in a gem postinstall with enough finagling.

This conversation ended shortly after it started, with the team providing these counterexamples, in addition to others not covered here. We knew it was doomed. It was fun though.

We ended up shipping a version of OpenJDK that we built and optimized for our workload using the native package manager for the platforms. However, when we were dealing with some pretty hairy Ruby problems in subsequent releases, we were able to build on our knowledge of the limitations (and advantages) of the more esoteric features of Rubygems — stuff we’d looked into while identifying why it was the worst way to deliver a Java solution. When we needed to bundle some Ruby content with our distribution, that earlier discussion was extremely useful.

What did we learn from the conversation?

  • Knowledge of some of the newer (and esoteric) features of Rubygems. By the end, we’d figured out answers to questions like: What does the postinstall situation really look like? What’s the state of signing a package? What type of compiler manipulation can reasonably be done and expected on an end-user’s system?
  • Why library managers are bad general purpose package managers.[2] This may seem obvious, but it’s a good discussion for those who haven’t really thought about it.
  • Bootstrapping a JVM is a hard problem.

We also had a great time thinking of ways to bend Rubygems to our will.

The rest of the week

The team liked Bad Idea Monday so much, they created theme days for the rest of the week. I’ll walk through them quickly:

Positive Tuesday. This is a day to be positive. The original intent was to offset the perceived negativity perpetuated with bad ideas that happened on Monday, but it’s really not needed for those reasons. The thing I like about it is the ‘find something you like about it’ attitude, which sometimes can help. Everything is not always wonderful. When it’s not, at least on a Tuesday, we can try to improve our outlook by identifying the good parts (or potentially decent outcomes) of an otherwise less-than-awesome situation. This assists in scenarios where you may have lost a debate, but need to move forward. It can bolster a “disagree and commit” interaction paradigm.

Noncommittal Wednesday. Why make a decision today when you could put it off until tomorrow? I think this started as the neutral leg to balance the bad (Monday) and the good (Tuesday). Since then, this day hasn’t done much. I mean, I could tell you more about it, but I just can’t seem to commit to it.

Troll Thursday. Trolling your coworkers can be fun. We keep it pretty clean and innocent, but some days, you just have to see if you can engage the team on something ridiculous, believe some crazy story, or convince them that DECnet[3] really is the one true networking protocol. I enjoy Troll Thursday because it can be used for learning rather than simply for my own amusement. Also, I am not immune to being trolled. ABT.

FriDre. On Friday, two things happen. One, somebody will forget. Two, we will remind them. Heck, our chat bot will remind you. I’ll admit that Not Forgetting About Dre[4] is a little less fun now that he’s the first billionaire in hip hop. Nonetheless, remembering Dre is something that’s been a part of the culture at Puppet Labs for a long time — nearly as long as I’ve been on board. What purpose does it serve? Other than being fun, I have no idea. I’m even pretty sure I’m the one who decided we shouldn’t forget about Dre.


These theme days have made it easier for me to demonstrate three things: the team is creative, they have fun while they work, and they’re an awesome group. We have a wide variety of people, ranging from their mid-twenties to mid-forties. We have people who have worked in tech for years, and people in their first technical role. Some live in the US, and at least one doesn’t. We’re not all men. We’re not all packaging geeks. In short, it’s a good mix. A big part of building this team and culture has been finding ways to keep things fun and to drive learning, even as the organization grows and faces new sets of challenges. I encourage you to take an unorthodox look at encouraging learning, management styles, and the non-technical ideas your teammates are bringing to the table — maybe you’ll find something new to dive into.


[1] If you’re wondering, the service in question is backed by a Google spreadsheet.

[2] An excellent talk by Ryan McKern called “Packaging is the Worst Way to Distribute Software, Except for Everything else."


[4] This can help you remember.

Further Learning

December 19, 2014

Day 19 - Infosec Basics: Reason behind Madness

Written by: Jan Schaumann (@jschauma)
Edited by: Ben Cotton (@funnelfiasco)

Sysadmins are a stereotypically grumpy bunch. Oh wait, no, that was infosec people. Or was it infosec sysadmins? The two jobs are intersecting at the corner of cynicism and experience, and while any senior system administrator worth their salt has all the information security basics down, we still find the two camps at loggerheads all too frequently.

Information Security frequently covers not only the general aspects of applying sound principles, but also the often ridiculed area of “compliance”, where rules too frequently seem blindly imposed without a full understanding of the practical implications or even their effectiveness. To overcome this divide, it is necessary for both camps to better understand one another’s daily routine, practices, and the reasons behind them.

Information Security professionals would do well to reach out and sit with the operational staff for extended periods of time, to work with them and get an understanding of how the performance, stability, and security requirements are imposed and met in the so-called real-world.

Similarly, System Administrators need to understand the reasons behind any requirements imposed or suggested by an organization’s Security team(s). In an attempt to bring the two camps a little bit closer, this post will present some of the general information security principles to show that there’s reason behind what may at times seem madness.

The astute reader will be amused to find occasionally conflicting requirements, statements, or recommendations. It is worthwhile to remember Sturgeon’s Law. (No, not his revelation, although that certainly holds true in information security just as well as in software engineering or internet infrastructure.)

Nothing is always absolutely so.

Understanding this law, knowing when to apply it, and being able to decide when an exception to the rules is warranted is what makes a senior engineer. But before we go making exceptions, let’s first begin by understanding the concepts.

Defense in Depth

Security is like an onion: the more layers you peel away, the more it stinks. Within this analogy lies one of the most fundamental concepts applied over and over to protect your systems, your users and their data: the principle of defense in depth. In simple terms, this means that you must secure your assets against any and all threats – both from the inside (of your organization or network) as well as from the outside. One layer is not enough.

Having a firewall that blocks all traffic from the Big Bad Internet except port 443 does not mean that once you’re on the web server, you should be able to connect to any other system in the network. But this goes further: your organization’s employees connect to your network over a password protected wireless network or perhaps a VPN, but being able to get on the internal network should not grant you access to all other systems, nor to view data flying by across the network. Instead, we want to secure our endpoints and data even against adversaries who already are on a trusted network.

As you will see, defense in depth relates to many of the other concepts we discuss here. For now, keep in mind that you should never rely solely on a layer of protection that is outside of your control.

Your biggest threat comes from the inside

Internal services are often used by large numbers of internal users; sometimes they need to be available to all internal users. Even experienced system administrators may question why it is necessary to secure and authenticate a resource that is supposed to be available to “everybody”. But defense in depth requires us to, and it hints at an uncomfortable belief held by your infosec colleagues: your organization either already has been compromised and you just don’t know it, or it will be compromised in the very near future. Always assume that the attacker is already on the inside.

While this may seem paranoid, experience has shown time and again that the majority of attacks occur or are aided from within the trusted network. This is necessarily so: attackers can seldom gather all the information or gain all the access required to achieve their goals purely from the outside (DDoS attacks may count as the obligatory exception to this rule – see above re Sturgeon’s Law). Instead, they usually follow a general process in which they first gain access to a system within the network and then elevate their privileges from there.

This is one of the reasons why it is important to secure internal resources to the same degree as services accessible from the outside. Traffic on the internal network should be encrypted in transit to prevent an adversary on your network being able to pull it off the wire (or the airwaves, as the case may be); it should require authentication to confirm (and log) the party accessing the data and deny anonymous use.

This can be inconvenient, especially when you have to secure a service that has been available without authentication and around which other tools have been built. Which brings us to the next point…

You can’t just rub some crypto on it

Once the Genie’s out of the bottle, it’s very, very difficult to get it back in. Granting people access or privileges is easy, taking them away is near impossible. That means that securing an existing service after it has been in use is an uphill battle, and one of the reasons why System Administrators and Information Security engineers need to work closely in the design, development and deployment of any new service.

To many junior operations people, “security” and “encryption” are near equivalent, and using “crypto” (perhaps even: ‘military grade cryptography’!) is seen as robitussin for your systems: rub some on it and walk it off. You’re gonna be fine.

But encryption is only one aspect of (information) security, and it can only help mitigate some threats. Given our desire for defense in depth, we are looking to implement end-to-end encryption of data in transit, but that alone is not sufficient. In order to improve our security posture, we also require authentication and authorization of our services’ consumers (both human and software alike).

Authentication != authorization

Authentication and authorization are two core concepts in information security which are confused or equated all too often. The reason for this is that in many areas the two are practically conflated. Consider, for example, the Unix system: by logging into the system, you are authenticating yourself, proving that you are who you claim to be, for example by offering proof of access to a given private ssh key. Once you are logged in, your actions are authorized, most commonly, by standard Unix access controls: the kernel decides whether or not you are allowed to read a file by looking at the bits in an inode’s st_mode, your uid and your group membership.

Many internal web services, however, perform authentication and authorization (often referred to as “authN” and “authZ” respectively) simultaneously: if you are allowed to log in, you are allowed to use the service. In many cases, this makes sense – however, we should be careful to accept this as a default. Authentication to a service should, generally, not imply access of all resources therein, yet all too often we transpose this model even to our trusty old Unix systems, where being able to log in implies having access to all world-readable files.

Principle of least privilege

Applying the concept of defense in depth to authorization brings us to the principle of least privilege. As noted above, we want to avoid having authentication imply authorization, and so we need to establish more fine-grained access controls. In particular, we want to make sure that every user has exactly the privileges and permissions they require, but no more. This concept spans all systems and all access – it applies equally to human users requiring access to, say, your HR database as well as to system accounts running services, trying to access your user data… and everything in between.

Perhaps most importantly (and most directly applicable to system administrators), this precaution to only grant the minimal required access also needs to be considered in the context of super-user privileges, where it demands fine-grained access control lists and/or detailed sudoers(5) rules. Especially in environments where more and more developers, site reliability engineers, or operational staff require the ability to deploy, restart, or troubleshoot complex systems, it is important to clearly define who can do what.

Extended filesystem Access Control Lists are a surprisingly underutilized tool: coarse division of privileges by generic groups (“admins”, “all-sudo”, or “wheel”, perhaps) is all too frequently the norm, and sudo(8) privileges are granted almost always in an all-or-nothing approach.
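
As a sketch of the difference (the group, user, service, and paths here are purely illustrative), compare a blanket grant with a narrowly scoped one:

# /etc/sudoers.d/deploy -- illustrative fragment only
# All-or-nothing (the pattern to avoid):
#   %deploy ALL=(ALL) ALL
# Least privilege: deployers may restart one specific service, nothing more
%deploy ALL=(root) NOPASSWD: /sbin/service myapp restart

# Similarly, an extended filesystem ACL can grant one service account read
# access to a single directory instead of widening a generic group:
#   setfacl -R -m u:appuser:rX /var/log/myapp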

On the flip side, it is important for information security engineers to understand that trying to restrict users in their effort to get their job done is a futile endeavor: users will always find a way around restrictions that get in their way, often times in ways that further compromise overall security (“ssh tunnels” are an immediate red flag here, as they frequently are used to circumvent firewall restrictions and in the process may unintentionally create a backdoor into production systems). Borrowing a bit from the Zen of Python, it is almost always better to explicitly grant permissions than to implicitly assume they are denied (and then find that they are worked around).

Perfect as the enemy of the Good

Information security professionals and System Administrators alike have a tendency to strive for perfect solutions. System Administrators, however, often have enough practical experience to know that those rarely exist, and that deploying a reasonable, but not perfect, solution that can be iterated upon in the future is almost always preferable.

Herein lies a frequent fallacy, however, one which many an engineer has derived: if a given restriction can be circumvented, then it is useless. If we cannot secure a resource 100%, then trying to do so is pointless, and may in fact be harmful.

A common scenario might be sudo(8) privileges: many of the commands we may grant developers to run using elevated privileges can be abused or exploited to gain a full root shell (prime example: anything that invokes an editor that allows you to run commands, such as via vi(1)’s “!command” mechanism). Would it not be better to simply grant the user full sudo(8) access to begin with?

Generally: no. The principle of least privilege requires us to be explicit and restrict access where we can. Knowing that the rules in place may be circumvented by a hostile user lets us circle back to the important concept of defense in depth, but it is no reason to make things easier for the attackers. (The audit log provided by requiring specific sudo(8) invocations is another beneficial side-effect.)

We mustn’t let “perfect” be the enemy of the “good” and give up when we cannot solve 100% of the problems. At the same time, though, it is also worth noting that we equally mustn’t let “good enough” become the enemy of the “good”: a half-assed solution that “stops the bleeding” will all too quickly become the new permanent basis for a larger system. As all sysadmins know too well, there is no such thing as a temporary solution.

If these demands seem conflicting to you… you’re right. Striking the right balance here is what is most difficult, and senior engineers of both camps will distinguish themselves by understanding the benefits and drawbacks of either approach.

Understanding your threat model

As we’ve seen above, and as you no doubt will experience yourself, we constantly have to make trade-offs. We want defense in depth, but we do not want to make our systems unusable; we require encryption for data in transit even on trusted systems, because, well, we don’t actually trust these systems; we require authentication and authorization, and desire to have sufficient fine-grained control to abide by the principle of least privilege, yet we can’t let “perfect” be the enemy of the “good”.

Deciding which trade-offs to make, which security mechanisms to employ, and when “good enough” is actually that, and not an excuse to avoid difficult work… all of this, infosec engineers will sing in unison, depends on your threat model.

But defining a “threat model” requires a deep understanding of the systems at hand, which is why System Administrators and their expertise are so valued. We need to be aware of what is being protected from what threat. We need to know what our adversaries and their motivations and capabilities are before we can determine the methods with which we might mitigate the risks.

Do as DevOps Does

As system administrators, it is important to understand the thought process and concepts behind security requirements. As a by-and-large self-taught profession, we rely on collaboration to learn from others.

As you encounter rules, regulations, demands, or suggestions made by your security team, keep the principles outlined in this post in mind, and then engage them and try to understand not only what exactly they’re asking of you, but also why they’re asking. Make sure to bring your junior staff along, to allow them to pick up these concepts and apply them in the so-called real world, in the process developing solid security habits.

Just like you, your information security colleagues, too, get up every morning and come to work with the desire to do the best job possible, not to ruin your day. Invite them to your team’s meetings; ask them to sit with you and learn about your processes, your users, your requirements.

Do as DevOps does, and ignite the SecOps spark in your organization.

Further reading:

There are far too many details that this already lengthy post could not possibly cover in adequate depth. Consider the following a list of recommended reading for those who want to learn more:

Security through obscurity is terrible; that does not mean that obscurity cannot still provide some (additional) security.

Be aware of the differences between active and passive attacks. Active attacks may be easier to detect, as they are actively changing things in your environment; passive attacks like wire tapping or traffic analysis, are much harder to detect. These types of attacks have a different threat model.

Don’t assume your tools are not going to be in the critical path.

Another example of why defense in depth is needed is the fact that often times seemingly minor or unimportant issues can be combined to become a critical issue.

The “Attacker Life Cycle”, frequently used within the context of so-called “Advanced Persistent Threats”, may help you understand an adversary’s process more completely, and thus develop your threat model.

This old essay by Bruce Schneier is well worth a read and covers similar ground as this posting. It includes this valuable lesson: When in doubt, fail closed. “When an ATM fails, it shuts down; it doesn’t spew money out its slot.”