I was playing around with Terraform and GCP the other day to run up a couple of different environments. It niggled me that I needed to hack around to get my instances working with Puppet; sure, it was only a provisioning script, but it still lacked elegance.

A bit of google-fu led me to Lucy Wyman’s awesome post about combining Bolt and Terraform. I spent a while diving into how Lucy had set things up and I was really impressed; it was fairly easy to adapt that workflow to what I was looking for.

A few days of effort later and I had a single bolt plan which would call terraform to provision the infrastructure and then use bolt to apply all of the configuration. It was neat, but a little verbose when it came to adding nodes to the inventory.

Lucy’s plan (and the one I adapted from it) used the following code to build the inventory.

plans/build_pe.pp

run_command("cd ${$tf_path} && terraform apply", $localhost)

$ip_string = run_command("cd ${$tf_path} && terraform output public_ips", $localhost).map |$r| { $r['stdout'] }

$ips = Array($ip_string).map |$ip| { $ip.strip }

# Turn IPs into Bolt targets, and add to inventory
$targets = $ips.map |$ip| {
  Target.new("${$ip}").add_to_group('terraform')
}

inventory.yaml

groups:
  - name: terraform
    nodes: [] # This will be populated by the Bolt plan
    config:
      transport: ssh
      ssh:
        private-key: ~/.ssh/id_rsa-phraseless
        user: ubuntu
        host-key-check: false

Iterating over that for different inventory groups wasn’t a huge problem, but the lack of elegance still bothered me. Still, it was a great solution and the best that could be done. Sorry, that should be “and was the best that could be done”.

On the 17th of May, Bolt 1.20.0 was released. The feature that really caught my eye was:

A new plugin in inventory v2 loads Terraform state and maps resource properties to target parameters. This plugin enables using a Terraform project to dynamically determine the targets to use when running Bolt

The inventory file changes were straightforward and made life far easier going forward.

I decided to try building a moderately complex set of infrastructure with as few moving parts as possible. My idea was to do a full Puppet Enterprise deployment, including:

  • Puppet Enterprise Master
  • PE HA replica
  • Compilers behind a load balancer
  • Continuous Delivery for Puppet Enterprise (CD4PE)
  • Workers for the CD4PE

The Terraform side looks like this:

pe.tf

resource "google_dns_managed_zone" "frontend" {
  project    = "puppet"
  dns_name   = "gcp.example.com."
  name       = "gcp-example"
  visibility = "public"
}

resource "google_dns_record_set" "frontend" {
  project      = "puppet"
  name         = "puppetmaster.gcp.example.com."
  managed_zone = "gcp-example"
  rrdatas      = ["${google_compute_instance.master.network_interface.0.access_config.0.nat_ip}"]
  ttl          = "300"
  type         = "A"
}

resource "google_compute_firewall" "puppet" {
  project = "puppet"
  name    = "pe-master"
  network = "${google_compute_network.pe-network.name}"
  target_tags = [ "pe-master" ]
  allow {
    protocol = "tcp"
    ports = [ "22", "5432", "443","8140","8142","8143","8170","4433","8123","8080","8081" ]
  }
}

resource "google_compute_network" "pe-network" {
  project                 = "puppet"
  name                    = "pe-network"
  auto_create_subnetworks = "true"
}

resource "google_compute_instance" "master" {
  project      = "puppet"
  zone         = "australia-southeast1-b"
  name         = "master"
  machine_type = "n1-standard-4"
  boot_disk {
    initialize_params {
      size = 30
      image = "centos-cloud/centos-7"
    }
  }
  network_interface {
    network       = "${google_compute_network.pe-network.self_link}"
    access_config = {
    }
  }
  lifecycle {
    ignore_changes = [ "attached_disk" ]
  }

  tags = [ "pe-master", "https-server" ]

  metadata = {
   "sshKeys"        = "tony_green:${file("~/.ssh/gcp.pub")}"
   "enable-oslogin" = true
  }
}

resource "google_compute_instance" "compiler" {
  count                     = 3
  allow_stopping_for_update = true
  project                   = "puppet"
  zone                      = "australia-southeast1-a"
  name                      = "compiler-${count.index + 1}"
  machine_type              = "n1-standard-2"
  boot_disk {
    initialize_params {
      image = "centos-cloud/centos-7"
    }
  }
  network_interface {
    network       = "${google_compute_network.pe-network.self_link}"
    access_config = {
    }
  }

  tags = [ "compiler", "pe-master" ]

  metadata = {
   "sshKeys"        = "tony_green:${file("~/.ssh/gcp.pub")}"
   "enable-oslogin" = true
  }
}

resource "google_compute_instance" "cd4pe" {
  project      = "puppet"
  zone         = "australia-southeast1-a"
  name         = "cd4pe"
  machine_type = "n1-standard-4"
  boot_disk {
    initialize_params {
      size = 30
      image = "centos-cloud/centos-7"
    }
  }
  network_interface {
    network       = "${google_compute_network.pe-network.self_link}"
    access_config = {
    }
  }

  tags = [ "pe-master", "cd4pe", "https-server" ]

  metadata = {
    "enable-oslogin" = true
    "sshKeys"        = "tony_green:${file("~/.ssh/gcp.pub")}"
  }

  depends_on = ["google_compute_instance.master"]
}
resource "google_compute_instance" "replica" {
  project      = "puppet"
  zone         = "australia-southeast1-a"
  name         = "replica"
  machine_type = "n1-standard-4"
  boot_disk {
    initialize_params {
      image = "centos-cloud/centos-7"
    }
  }
  network_interface {
    network       = "${google_compute_network.pe-network.self_link}"
    access_config = {
    }
  }

  tags = [ "pe-master", "https-server" ]

  metadata = {
    "enable-oslogin" = true
    "sshKeys"        = "tony_green:${file("~/.ssh/gcp.pub")}"
  }

  depends_on = ["google_compute_instance.master"]
}

resource "google_compute_instance_group" "compiler-cluster" {
  project     = "puppet"
  name        = "compiler-cluster"
  description = "Terraform test instance group"

  instances = [
    "${google_compute_instance.compiler.*.self_link}"
  ]

  named_port {
    name = "puppet-8140"
    port = "8140"
  }

  named_port {
    name = "puppet-8142"
    port = "8142"
  }

  zone = "australia-southeast1-a"
}

resource "google_compute_forwarding_rule" "compiler-forwarding-rule" {
  project               = "puppet"
  name                  = "compiler-lb"
  load_balancing_scheme = "INTERNAL"
  network               = "${google_compute_network.pe-network.name}"
  backend_service       = "${google_compute_region_backend_service.compiler-lb.self_link}"
  ports                 = [ "8140", "8142" ]
  region                = "australia-southeast1"
}

resource "google_compute_region_backend_service" "compiler-lb" {
  project          = "puppet"
  name             = "compiler-lb"
  protocol         = "TCP"
  timeout_sec      = 10
  session_affinity = "NONE"
  region           = "australia-southeast1"

  backend {
    group = "${google_compute_instance_group.compiler-cluster.self_link}"
  }

  health_checks = ["${google_compute_health_check.compiler-healthcheck.self_link}"]
}

resource "google_compute_health_check" "compiler-healthcheck" {
  project            = "puppet"
  name               = "compiler-healthcheck"
  check_interval_sec = 5
  timeout_sec        = 5

  tcp_health_check {
    port = "8140"
  }
}

output "master_ip" {
  value = ["${google_compute_instance.master.network_interface.0.access_config.0.nat_ip}"]
}

output "compiler_ip" {
  value = ["${google_compute_instance.compiler.*.network_interface.0.access_config.0.nat_ip}"]
}

output "replica_ip" {
  value = ["${google_compute_instance.replica.network_interface.0.access_config.0.nat_ip}"]
}

output "cd4pe_ip" {
  value = ["${google_compute_instance.cd4pe.network_interface.0.access_config.0.nat_ip}"]
}

The Bolt plan and inventory look like this:

inventory.yaml

---
version: 2
groups:
  - name: master
    target-lookups:
      - plugin: terraform
        dir: ~/dev/terraform_puppet
        resource_type: google_compute_instance.master
        uri: network_interface.0.access_config.0.nat_ip
  - name: replica
    target-lookups:
      - plugin: terraform
        dir: ~/dev/terraform_puppet
        resource_type: google_compute_instance.replica
        uri: network_interface.0.access_config.0.nat_ip
  - name: compilers
    target-lookups:
      - plugin: terraform
        dir: ~/dev/terraform_puppet
        resource_type: google_compute_instance.compiler
        uri: network_interface.0.access_config.0.nat_ip
  - name: cd4pe
    target-lookups:
      - plugin: terraform
        dir: ~/dev/terraform_puppet
        resource_type: google_compute_instance.cd4pe
        uri: network_interface.0.access_config.0.nat_ip
config:
  transport: ssh
  ssh:
    private-key: ~/.ssh/gcp
    user: centos
    host-key-check: false
    run-as: root

plans/build_pe.pp

plan terraform_puppet::build_pe(
  String $tf_path,
) {

  $localhost = get_targets('localhost')
  run_command("cd ${$tf_path} && terraform apply --auto-approve", $localhost)

  # Pull targets out of terraform, configured in the inventory.yaml file
  $master = get_targets('master')
  if $master.size != 1 { fail("Must specify a single master, not ${master}") }

  $replica = get_targets('replica')
  if $replica.size != 1 { fail("Must specify a single replica, not ${replica}") }

  $cd4pe = get_targets('cd4pe')
  if $cd4pe.size != 1 { fail("Must specify a single cd4pe, not ${cd4pe}") }

  $compilers = get_targets('compilers')

  # Transfer installer scripts to the master and do the install
  upload_file('~/dev/terraform_puppet/provision','/tmp/provision',$master)
  run_command('/tmp/provision/setup.sh',$master, '_run_as' => 'root')

  # Install the agent on the replica
  run_command('curl -k https://master.c.puppet.internal:8140/packages/current/install.bash | sudo bash -s -- --puppet-service-ensure stopped',$replica, '_run_as' => 'root')
  run_command('/opt/puppetlabs/bin/puppet agent -t || exit 0',$replica, '_run_as' => 'root')

  # Configure the replica
  run_command('/opt/puppetlabs/bin/puppet-infrastructure provision replica replica.c.puppet.internal',$master, '_run_as' => 'root')

  # Install the agents on the compilers
  run_command('curl -k https://master.c.puppet.internal:8140/packages/current/install.bash | sudo bash -s main:dns_alt_names=puppet.example.com',$compilers, '_run_as' => 'root')
  run_command('/opt/puppetlabs/bin/puppet agent -t || exit 0',$compilers, '_run_as' => 'root')
  run_command('/opt/puppetlabs/bin/puppet agent -t || exit 0',$master, '_run_as' => 'root')

  run_command('curl -k https://master.c.puppet.internal:8140/packages/current/install.bash | sudo bash -s -- --puppet-service-ensure stopped',$replica, '_run_as' => 'root')
  run_command('/opt/puppetlabs/bin/puppet agent -t || exit 0',$replica, '_run_as' => 'root')

  # Install the agent on cd4pe
  run_command('curl -k https://master.c.puppet.internal:8140/packages/current/install.bash | sudo bash -s -- --puppet-service-ensure stopped',$cd4pe, '_run_as' => 'root')
  run_command('/opt/puppetlabs/bin/puppet agent -t || exit 0',$cd4pe, '_run_as' => 'root')

  run_command('/opt/puppetlabs/bin/puppet task run pe_installer_cd4pe::install cd4pe_admin_email="puppet@example.com" cd4pe_admin_password="p@ssword" -n cd4pe.c.puppet.internal', $master, '_run_as' => 'root')
}

The step to provision PE just copies up a script that downloads the latest version of Puppet Enterprise, untars it, and runs the installer with a custom pe.conf to get things like Code Manager working.
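I haven’t included my exact script here, but a minimal sketch of what a setup.sh along those lines might look like is below. The PE version, download URL and paths are placeholders rather than the values I actually use.

setup.sh (sketch)

#!/bin/bash
# Sketch of a PE bootstrap script - the version, URL and paths are illustrative only.
set -e

PE_VERSION="2019.1.0"                                  # example version
PE_DIR="puppet-enterprise-${PE_VERSION}-el-7-x86_64"
PE_TARBALL="${PE_DIR}.tar.gz"

# Download and unpack the PE installer (substitute a real download URL)
curl -L -o "/tmp/${PE_TARBALL}" "https://example.com/downloads/${PE_TARBALL}"
tar -xzf "/tmp/${PE_TARBALL}" -C /tmp

# Run the installer non-interactively with the custom pe.conf shipped in /tmp/provision
"/tmp/${PE_DIR}/puppet-enterprise-installer" -c /tmp/provision/pe.conf

# Run the agent a couple of times so the master converges to its final state
/opt/puppetlabs/bin/puppet agent -t || true
/opt/puppetlabs/bin/puppet agent -t || true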

What worked

In under 30 minutes I can go from having no PE infrastructure to a full HA master, compile masters, and CD4PE. Everything other than CD4PE is fully automated and is running in its final state by the end of the process.

This was a proof of concept for me, using PE as it’s a platform I’m very familiar with. Adapting it to other applications is trivial.

What didn’t work

Getting started with the dynamic inventory was tricky; it turned out the problem was that I was using a remote backend for my Terraform state. Mine was stored in a GCP storage bucket, so the local terraform.tfstate file just contained a pointer to the remote storage and the new Bolt plugin couldn’t read it. Once I moved back to local state storage (not something I want to do long term) everything “just worked”.
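For reference, the kind of backend configuration that tripped the plugin up looks something like this (the bucket name is made up). With this in place the local state file is only a stub pointing at the bucket, which is why the plugin had nothing to read.

backend.tf (example)

terraform {
  backend "gcs" {
    bucket = "example-terraform-state"  # placeholder bucket name
    prefix = "puppet"
  }
}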

I need to look into that in more detail and talk to the Bolt team to see if there’s something that I’m missing.

The plugin also needs to have the terraform state in place before the plan starts, so having the terraform apply command inside the plan won’t work on the first run and may not work correctly on subsequent runs. Again, this isn’t a huge problem and could be something that I’m doing wrong or a feature that will be added in future.
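For now the workaround is simply to run the apply once by hand before kicking off the plan, something like:

# Generate local state first so the inventory plugin has something to read
cd ~/dev/terraform_puppet
terraform apply --auto-approve

# Then run the plan; the apply inside it becomes a no-op
bolt plan run terraform_puppet::build_pe tf_path=~/dev/terraform_puppet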

I’m looking forward to seeing the CD4PE setup get a bit more puppetized. It’s still a fair bit of pointy-clickiness once you have it installed.

What’s next?

  • I’m still pretty new to terraform and I know there’s a lot of duplication in my code, so I’m going to clean that up next.
  • I’m not happy with the way I’m using bolt to install the puppet agents on the nodes. I know I could just use the apply_prep function, but that just installs the agent; it doesn’t point it at the puppet master (at least as far as I could tell). I’ll look into how I can make that nicer (a rough sketch of the idea is below, after this list).
  • Using terraform import to pull state information out of existing infrastructure so that I can use bolt to manage it.
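To expand on the second point, here’s a rough sketch of the direction I’m thinking of: use apply_prep to get the agent onto the targets, then an apply block to point it at the master. This isn’t what I’m running today; the plan name and the puppet config set approach are assumptions for the sake of the example, and a PE install would still want the install.bash from the master to pick up the PE repos.

plans/prep_agents.pp (sketch)

plan terraform_puppet::prep_agents(
  TargetSpec $nodes,
  String     $master_fqdn = 'master.c.puppet.internal',
) {
  # Install the puppet-agent package on the targets
  apply_prep($nodes)

  # Point each agent at the master
  apply($nodes, '_run_as' => 'root') {
    exec { 'set puppet server':
      command => "/opt/puppetlabs/bin/puppet config set server ${master_fqdn} --section main",
      unless  => "/bin/grep -q 'server = ${master_fqdn}' /etc/puppetlabs/puppet/puppet.conf",
    }
  }
}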

Conclusion

Bolt continues to go from strength to strength. The dynamic inventory (currently supporting terraform and PuppetDB connections) really makes life a lot easier.
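For comparison, the PuppetDB variant follows the same target-lookups pattern; a quick sketch (the group name and PQL query are just examples, not something from this build):

groups:
  - name: redhat-nodes
    target-lookups:
      - plugin: puppetdb
        query: "inventory { facts.osfamily = 'RedHat' }"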

Since you can drive bolt off your existing puppet control repository, you can get even more creative with how you stand up and manage infrastructure.

This process could easily work with a mixed-vendor/hybrid cloud environment.