Shell script templating

Fifteen years ago I was using cfengine to script fully automated server installations. Since then I've used Puppet, Docker, and Vagrant in different projects to script installations rather than build golden images. I have also felt the pain of trying to introduce too many complicated tools at once.

To that end, I've been building a shell script templating tool. There are essentially five moving parts:

  1. src/ directory: version-controlled shell scripts and configuration files using Jinja2 templating. For example, a script might say:

        
        cd {{apache_configdir}}
        chgrp {{apache_group}} *
        
    
  2. vars/ directory: server/instance-specific variables in JSON files. (This could easily be extended to support additional formats, e.g. YAML or INI-style files.) Variable names are "scoped" by the name of the file that they're in. For example:

        vars/apache.json:
        {"configdir": "/etc/httpd/conf.d",
         "group": "apache"
        }
    

    would create apache_configdir and apache_group.

  3. run/ directory: where the "compiled" scripts are stored.
  4. bin/compile_scripts.py is a Python script to "compile" the src/ content into run/ using vars/.
  5. driver.sh, a shell script that (a) compiles the scripts and (b) execs run/driver.sh.
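
As a rough illustration of parts 2 through 4, here is a minimal sketch of what a compile step could look like. It is not the actual bin/compile_scripts.py: the function names load_scoped_vars and compile_scripts are hypothetical, the choice to fail on undefined variables is an assumption, and it assumes the third-party Jinja2 package is installed.

```python
import json
import shutil
from pathlib import Path


def load_scoped_vars(vars_dir):
    """Flatten vars/<scope>.json files into {"<scope>_<key>": value}."""
    scoped = {}
    for path in sorted(Path(vars_dir).glob("*.json")):
        with path.open() as handle:
            data = json.load(handle)
        # The file name (minus .json) becomes the variable prefix.
        for key, value in data.items():
            scoped[f"{path.stem}_{key}"] = value
    return scoped


def compile_scripts(src_dir, vars_dir, run_dir):
    """Render every file under src_dir to the same relative path in run_dir."""
    # Jinja2 is third-party (pip install Jinja2). It is imported here so
    # that load_scoped_vars stays usable without it.
    from jinja2 import Environment, FileSystemLoader, StrictUndefined

    env = Environment(
        loader=FileSystemLoader(str(src_dir)),
        undefined=StrictUndefined,   # assumption: a misspelled variable should be an error
        keep_trailing_newline=True,  # keep each script's final newline
    )
    context = load_scoped_vars(vars_dir)
    for name in env.list_templates():
        target = Path(run_dir) / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(env.get_template(name).render(context))
        # Carry over permission bits so executable scripts stay executable.
        shutil.copymode(Path(src_dir) / name, target)
```

With the vars/apache.json example above, load_scoped_vars("vars") would return a dict with the keys apache_configdir and apache_group, and compile_scripts("src", "vars", "run") would write the rendered copies under run/.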

I'm still reflecting on whether the above is sufficiently "simpler" than using a tool like Puppet. But I like that you can read the scripts in run/ to see exactly what's going to happen, and that the variables in vars/ are relatively straightforward to view and edit.

Benefits/reasons to do this

  • Shell scripts can be reasonably abstracted from their environment.
  • Everything can be done in bash scripts, without having to learn another tool.
  • This is close to a twelve-factor app for shell scripting. Weirdly, configuration files can be stored in a repository while the credentials and other state are populated in vars/. (vars/, in turn, can have variables populated from environment variables and/or other sources.)
  • Passwords and other configuration data can be stored locally, while their configuration files can still be in version control.
  • I built a totally sweet lastpass-cli + jq one-liner to populate vars/secrets.json with usernames/passwords stored in lastpass. This way passwords can originate in lastpass but be deployed appropriately to configuration files via variables such as {{secrets_mysql_adminuser}}.

Current challenges/issues

  • As with any configuration tool it's easy to abstract "too much," or alternatively "not enough." How much should scripts in src/ assume about OS-specific details, e.g. whether libraries are in /usr/lib64 vs /usr/lib vs /usr/local/lib?
  • Variables can originate in too many places. For example, I currently pull variables such as $CONFIG_ENV into vars/config.json, plus runtime-derived information such as the current user into vars/driver.json.
  • Containers have many advantages over this approach, because you can make more assumptions about the context of your environment. For example, I have built a bash function local_yum to extract RPMs via rpm2cpio into a local environment, because I don't want the script to require elevated privileges.
  • There may be many core tools that have been around for so long that they're hard even to identify. For example, install(1) is one of the best configuration tools! Its -C option means "don't take action unless the file to install is different from the currently installed file."
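
To make the second bullet above concrete, here is one way the environment-derived and runtime-derived variables could be written into vars/ before compiling. This is only a sketch: the function name write_runtime_vars and the JSON key names ("env", "user") are hypothetical. Only $CONFIG_ENV, vars/config.json, and vars/driver.json come from the setup described here.

```python
import getpass
import json
import os
from pathlib import Path


def write_runtime_vars(vars_dir, env_map=None):
    """Write config.json (from the environment) and driver.json (runtime facts)."""
    # Maps environment variable name -> key in config.json (hypothetical naming),
    # so $CONFIG_ENV would be available to templates as {{config_env}}.
    if env_map is None:
        env_map = {"CONFIG_ENV": "env"}

    vars_path = Path(vars_dir)
    vars_path.mkdir(parents=True, exist_ok=True)

    # Copy only the variables that are actually set.
    config = {key: os.environ[name] for name, key in env_map.items() if name in os.environ}
    # Runtime-derived information, e.g. the current user as {{driver_user}}.
    driver = {"user": getpass.getuser()}

    for stem, data in (("config", config), ("driver", driver)):
        text = json.dumps(data, indent=2, sort_keys=True) + "\n"
        (vars_path / f"{stem}.json").write_text(text)
```

Keeping each source in its own file at least makes it visible where a given variable came from, since the file name is also the variable's prefix.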
