Ten months into this job, I still feel like an OpenStack novice, but at least it feels better than it did a couple of months ago. In fact, last week we had what I felt was a big automation win: we deployed a Ceph OSD node from bare metal to joining the cluster without any ‘manual’ intervention. That automation needs more, well, automation, but at least it’s repeatable and consistent now. But I’ve leapt ahead of myself. This is a heavily abbreviated history of how we got here:
- Luca deployed OpenStack with Fuel. Five short words which actually represent months of detailed work and a fair bit of complaining from his cubicle. Disk partitioning, network bonds, bridges, VLANs, GRE, VXLAN, MTU settings, bugs, confusing or missing or out-of-date documentation, people in the wrong timezone for proper conversations… oh my. I helped a bit.
- I created an All-In-One (AIO) deployment with the puppet-openstack-integration (POI) project. I started comparing the (hiera) data between it and the Fuel-deployed stack.
- Using POI I deployed a compute node almost to the point of working, but we managed to break our dev stack before we got to iron out the final kinks.
- Luca got us started with MAAS, which proved a little more intuitive than xCAT and, being built by Canonical, works well with Ubuntu. We customised the MAAS deployment process to suit our hardware and needs.
- Ceph is less tightly integrated into OpenStack than the other components, which makes it another good candidate for early deployment tooling, so we got started with Puppet-Ceph. In the end we found spjmurray’s Ceph module more intuitive and reliable, and it handled the new long-term stable release, 12.x (Luminous), almost as soon as it was released.
Here’s how we deploy a Ceph OSD node:
- PXE boot the node:
    $ ipmitool -I lanplus -H $IP -U $user -P $pass chassis power off
    $ ipmitool -I lanplus -H $IP -U $user -P $pass chassis bootdev pxe
    $ ipmitool -I lanplus -H $IP -U $user -P $pass chassis power on
- Commission the node: Straightforward MAAS step from the documentation.
- Customise the node: Network bridges, disk partitions, hostname. We have a hundred-line script to do this, and the main tools in use are the MAAS CLI and jq.
- Prepare the curtin (curt installer) script (largely one-off work, although we continue to tweak it). Currently this just installs the Puppet Agent.
- Deploy the node: Straightforward MAAS step from the documentation.
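The PXE boot, commission and deploy steps above can be strung together in a small wrapper. What follows is only a minimal sketch, not our actual script: the function names (`run`, `pxe_boot`, `commission_node`, `deploy_node`) and the variables `IPMI_HOST`, `IPMI_USER`, `MAAS_PROFILE`, `SYSTEM_ID` and `DRY_RUN` are made up for illustration. It assumes the MAAS CLI is already logged in under `$MAAS_PROFILE`, and it uses ipmitool's `-E` option so the password comes from the `IPMI_PASSWORD` environment variable rather than the command line. It deliberately leaves out the customisation step and any waiting for MAAS to finish commissioning before deploying.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps; all names here are illustrative.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Power the node off, set next boot to PXE, and power it back on.
# -E tells ipmitool to read the password from $IPMI_PASSWORD.
pxe_boot() {
    for action in "power off" "bootdev pxe" "power on"; do
        # $action is left unquoted on purpose so it splits into two words.
        run ipmitool -I lanplus -H "$IPMI_HOST" -U "$IPMI_USER" -E chassis $action || return 1
    done
}

# Ask MAAS to commission a machine it has already enlisted.
commission_node() {
    run maas "$MAAS_PROFILE" machine commission "$SYSTEM_ID"
}

# Ask MAAS to deploy the machine. In practice this should only be called
# once commissioning has finished and the node has been customised.
deploy_node() {
    run maas "$MAAS_PROFILE" machine deploy "$SYSTEM_ID"
}
```

Setting `DRY_RUN=1` before calling `pxe_boot` prints the three ipmitool commands instead of running them, which is a cheap way to check the arguments before pointing the script at real hardware.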
Once the node is deployed, Puppet and our modules (which in turn use the Ceph module) take over, and we have more OSDs in our cluster!
    $ ceph osd df tree
     ID CLASS   WEIGHT REWEIGHT   SIZE    USE  AVAIL %USE  VAR PGS TYPE NAME
     -1       51.97385        - 53220G   177G 53043G 0.33 1.00   - root default
    ...
    -21       20.06506        - 20547G 70573M 20478G 0.34 1.01   -     host new-node
     12   hdd  1.82410  1.00000  1867G  6458M  1861G 0.34 1.02  90         osd.23
     13   hdd  1.82410  1.00000  1867G  6433M  1861G 0.34 1.01  95         osd.24
     25   hdd  1.82410  1.00000  1867G  6344M  1861G 0.33 1.00  71         osd.25
     26   hdd  1.82410  1.00000  1867G  6429M  1861G 0.34 1.01  74         osd.26
     27   hdd  1.82410  1.00000  1867G  6394M  1861G 0.33 1.00 103         osd.27
     28   hdd  1.82410  1.00000  1867G  6412M  1861G 0.34 1.01  94         osd.28
     29   hdd  1.82410  1.00000  1867G  6429M  1861G 0.34 1.01 102         osd.29
     30   hdd  1.82410  1.00000  1867G  6559M  1861G 0.34 1.03 104         osd.30
     31   hdd  1.82410  1.00000  1867G  6343M  1861G 0.33 1.00  76         osd.31
     32   hdd  1.82410  1.00000  1867G  6474M  1861G 0.34 1.02  98         osd.32
     33   hdd  1.82410  1.00000  1867G  6293M  1861G 0.33 0.99  69         osd.33
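Since the aim is to need no manual steps, even the final "did the OSDs turn up?" check could be scripted. Here is a rough sketch, assuming the plain-text layout of `ceph osd df tree` shown above, with TYPE and NAME as the last two columns. The function name `count_host_osds` is invented for this example. It only recognises `host` and `root` bucket lines, so a CRUSH map with other bucket types might need more care.

```shell
#!/bin/sh
# count_host_osds HOST: read `ceph osd df tree` text on stdin and print how
# many osd.N rows are listed under the given host bucket.
count_host_osds() {
    awk -v host="$1" '
        NF < 2            { next }                            # skip blank and "..." lines
        $(NF-1) == "host" { in_host = ($NF == host); next }   # start of a host bucket
        $(NF-1) == "root" { in_host = 0; next }               # start of a new root
        in_host && $NF ~ /^osd\.[0-9]+$/ { n++ }              # an OSD row under our host
        END               { print n + 0 }
    '
}
```

It would be used as `ceph osd df tree | count_host_osds new-node`, and the result compared against the number of data disks expected in the node.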