VMWare upgrade 5.1 -> 6.0 notes

In 2013, the fine folks at Interconnekt sold us a fine Dell PowerEdge T420 server to run VMWare and a bunch of virtual machines. VMWare came installed on a dual SD module, and this came in handy later. We quickly discovered that VMWare ESXi 5.1 has a RAM limit of 32GB, so we removed the other 32GB and sailed on.

Three years on, and we have bumped up against that limit, our VMs exhausting available supply. We could pare down our use on some machines, but knowing there is an upgrade available and some DIMMs in the cupboard, I’m upgrading.

Dell recommend their customised version of VMWare, so I downloaded that through the Drivers & Downloads part of their website. I was stumped finding the link for a while, but eventually found I had to ‘change OS’ from Windows to VMWare, and found it under Enterprise Solutions. There is nothing to suggest skipping 5.5 is a bad idea, so I grabbed 6.0 and burned it to CD.

Backup the VMs

I installed ghettoVCB and did a test backup run of all machines:

$ wget -O ghettoVCB.vib https://github.com/lamw/ghettoVCB/blob/master/vghetto-ghettoVCB.vib?raw=true
$ scp ghettoVCB.vib root@server:
esxi # esxcli software vib install -v /ghettoVCB.vib
 [DependencyError]
 VIB virtuallyGhetto_bootbank_ghettoVCB_1.0.0-0.0.0 violates extensibility rule checks: [u'(line 23: col 0) Element vib failed to validate content']
 VIB virtuallyGhetto_bootbank_ghettoVCB_1.0.0-0.0.0's acceptance level is community, which is not compliant with the ImageProfile acceptance level partner
 To change the host acceptance level, use the 'esxcli software acceptance set' command.
 Please refer to the log file for more details.
esxi # esxcli software vib install -v /ghettoVCB.vib -f
Installation Result
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed: virtuallyGhetto_bootbank_ghettoVCB_1.0.0-0.0.0
   VIBs Removed: 
   VIBs Skipped: 

GhettoVCB creates a snapshot, does a backup of that snapshot, then deletes the snapshot. It fails if a snapshot already exists.

esxi # vim /etc/ghettovcb/ghettoVCB.conf
esxi # /opt/ghettovcb/bin/ghettoVCB.sh -a -c /etc/ghettovcb/ghettoVCB.conf -l /ghettoVCB-`date +%Y%m%d`.log

Some VMs had old unused snapshots, and some were in an old format, probably from when they were copied across from previous versions of ESXi. Therefore those backups failed. Right click, upgrade VM worked. Delete old snapshots. A successful test backup took 24 hours to our NAS – slow, but that’s okay in this case.

I didn’t want to manage changes in the event of failure, so I shutdown all VMs before running the actual backup. 24 hours later, I was ready to upgrade ESXi and install more RAM.

Upgrade ESXi to 6.0.0 Update 2

I put ESXi in maintenance mode so it wouldn’t automatically boot VMs on next boot. I first wanted to see that all is well. Shutdown ESXi, physically removed all HDDs and network cable from NAS – for paranoia’s sake, boot up with the CD I previously burned. Select upgrade and the dual SD module. Dammit, the upgrade failed because there’s a CommunitySupported VIB installed – ghettoVCB! Boot up, remove it:

esxi # esxcli software vib remove --vibname=ghettoVCB

Shutdown and boot the CD again. The upgrade was simple as can be, and for interest I timed it: 8 minutes of processing time. ESXi boots, complains about missing HDDs, and works just fine.

I’d heard of another benefit that I am now going to realise: I don’t have to boot my Windows VM to run vSphere Client any more! I might still for some things, because web clients sometimes suck, but not having to is great.

Install 4 x 8GB 2Rx4 1333MHz RAM DIMMs

Now, on to the RAM upgrade. The machine is nicely laid out, with easy access to the DIMMs. Naturally I want to put two of my DIMMs into each bank of slots, but where? There are two used in each out of six. The information panel on the case was helpful but not conclusive. Over to the Owner’s Manual.

“Populate all sockets with white release tabs first and then black.”
“Advanced ECC mode extends SDDC from x4 DRAM based DIMMs to both x4 and x8 DRAMs”
“Memory Optimized (Independent Channel) Mode … supports SDDC only for memory modules that use x4 device width and does not impose any specific slot population requirements”

A bit of discussion led me to believe that I should just place them closes to the CPU and with x4 DIMMs it didn’t make a lot of difference. It worked. Sadly I left out the big plastic baffle from the case when I reassembled – not a huge problem, but I need to schedule another outage to reinstall it.

Job done. I’m happy.

FreeBSD and pg_upgrade

Thank you kreynolds, your post makes my current job look dead easy. I want to upgrade PostgreSQL from 9.3.12 to 9.5.4, and after checking through all of the release notes between the two versions, I have narrowed the relevant ones down to the two biggest releases: 9.4 and 9.5 (thanks in part to our devs not using the darkest corners of PostgreSQL features… yet).

A side note: PostgreSQL is a shining example of how to do release notes. All in one place, linked properly between versions, available back as far as the eye can see, to pre-1.0 in the 1990s!

Another win for documentation: the FreeBSD handbook. Curated by members of the project, a simple URL, and quality information.

I could use pg_dumpall > backup.sql, upgrade the package, then psql -f backup.sql postgres, or do it in place faster with pg_upgrade. FreeBSD doesn’t simply allow both old and new packages to be installed together, so enter jails.

# jailroot=/usr/tmp/pg_upgrade
# bsdinstall jail $jailroot
# pkg -r $jailroot install postgresql93-server

In the dialog boxes, choose a nearby mirror and no extra components. Wait for the installation to complete.

# su pgsql -c 'pg_dumpall -c | bzip2' > /usr/tmp/pgdump.sql.bz2
# service puppet stop
# service postgresql onestop
# pkg delete postgresql93-client postgresql93-contrib postgresql93-server
# mv /usr/local/pgsql/data{,.93}
# pkg install postgresql95-client postgresql95-contrib postgresql95-server
# service postgresql oneinitdb
# pg_controldata -D /usr/local/pgsql/data.93

“Latest checkpoint location” needs to be the same on master and replication slaves. pg_upgrade(1) has good information on the 16 step process.

# su -l pgsql -c "pg_upgrade -b $jailroot/usr/local/bin -B /usr/local/bin -d /usr/local/pgsql/data.93 -D /usr/local/pgsql/data -j 16 -k --check"
Checking for reg* system OID user data types       fatal
Your installation contains one of the reg* data types in user tables.
These data types reference system OIDs that are not preserved by
pg_upgrade, so this cluster cannot currently be upgraded. You can
remove the problem tables and restart the upgrade. A list of the problem
columns is in the file:
    tables_using_reg.txt
[root@flora ~]# wc -l ~pgsql/tables_using_reg.txt 
 38 /usr/local/pgsql/tables_using_reg.txt

Bother. I can’t use pg_upgrade. Oh well, I’ll talk to the devs for next time, and get on with dump and restore.

Recover from a skipped FreeBSD upgrade step

Per my last post, upgrading a FreeBSD machine from one major version to another is dead easy, thanks in part to the clean separation between core system components and anything an administrator or user might add. Today I learned that even if one fails to follow the steps correctly, recovery is straightforward.

The process (e.g. for upgrading to 8.4-RELEASE):

# freebsd-update -r 8.4-RELEASE upgrade
# freebsd-update install
Reboot
# freebsd-update install
Rebuild ports as required
# freebsd-update install
– oops, missed this step without noticing.
The machine worked fine. A few days later I ran into this ugly error which meant nothing to me:
# make -C /usr/ports/editors/vim-lite
Unknown modifier 't'
...
"/usr/ports/Mk/bsd.sites.mk", line 953: Malformed conditional (!empty(_PERL_CPAN_ID) && ${_PERL_CPAN_FLAG:tl} == "cpan")
...
"/usr/ports/Mk/bsd.port.mk", line 2885: Unclosed conditional/for loop

(full listing at http://pastebin.com/kPg03nhy)

The ports tree is NFS mounted, and other systems mounting it worked fine.

The friendly folks at BUGS (specifically nox on the IRC channel) suggested that it might be make itself. So
# which make
/usr/bin/make

No problem. What about the file itself?
# file `which make`
/usr/bin/make: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, for FreeBSD 8.3, stripped

Ah, that doesn’t look right. Sure enough, it shows 8.4 on another machine. I can copy that file across, no worries, but what else is wrong?
<@peter> Is there some sort of audit function?
Yes there is: freebsd-update IDS gave me a full listing of everything that didn’t match – roughtly 11,800 files! A bit of scripting, ignoring the files locally managed (e.g. /etc/ntp.conf), and I’ve now got all of those files across to the once-broken box. Fixed!

FreeBSD system upgrade – it Just Works

Since Windows 3.x, I’ve never had much luck with operating system upgrades, and by many accounts, neither has much of the world’s computer-using population. The safest method has been to backup, format, install the new one, restore. It would seem things are getting better. Mac OS X all but requires upgrades now (since by two versions old you’re not receiving security patches any more, and some programs refuse to work without security patches), and they work. I’ve still had significant trouble with upgrades to Windows 7, but have heard a few good stories. The GNU/Linux experience seems as varied as the OS itself, so I won’t try to comment much except that I’ve had a couple of failed Ubuntu upgrades, albeit a few years ago.

FreeBSD seems to Just Work. I’d never used it before a couple of years ago, but on first reading of the relevant handbook page, the steps became clear, and they Just Worked!

# freebsd-update -r 8.4-RELEASE upgrade

I can’t be sure, but I reckon a big part of it is the clear separation between the core of the OS, and the packages installed through the ports system (or prebuilt binaries).

Other operating systems which have decent package management (of which I have most experience with GNU/Linux) seem to come with a lot of packages installed, but not FreeBSD. Also, in the vast majority of cases non-core packages that get installed to FreeBSD don’t get to touch system locations like /etc, rather they use /usr/local/etc for the config files. In fact /usr/local seems to be the general prefix for all things package-related. At system upgrade time, only the core elements are upgraded, leaving the packages untouched, which means they can be upgraded separately.

Of course, there are then arguments to be made about which functionality should be core, because the core could get unmanageably big, and offer services only a very small few would want. Too small and it becomes a pain to bootstrap a machine to a useable state. I reckon FreeBSD has balanced this well – I get ssh, ntp, vi, and a range of other basic items from core. If I want vim, bash or sudo, I install them.

As a relatively new user, I’m pretty pleased with all of this.