Fixing ‘Can’t have a partition outside the disk!’

Another day, another strange bug. This is just how my life works, apparently.

When deploying newer versions of ESXi on Cisco C-series servers that use Cisco’s FlexFlash SD card storage (although, admittedly, this could be any server vendor), you may run into this fantastic error message: “Can’t have a partition outside the disk! Unable to read partition table for device.” This essentially means that the volume on the storage is formatted in a way that an ESXi installation can’t use.

In looking into this issue we found Cisco KB CSCus51007 and countless blogs that tell you to do one of four things:

  1. Install ESXi 5.5U1 Customized Image and then upgrade to the flavor of ESXi of your choice.
  2. Take the SD card out, put it in a different computer, re-partition it (or make it blank) and then install ESXi.
  3. Boot into GParted and re-partition the SD card.
  4. Insert the SD card into a working ESXi host and use the recovery console shell to format the SD card.

I thought about this for a moment and came to a hypothesis. Is it possible to get to the recovery shell from the ESXi installer? Knowing what I know about how ESXi works, it just loads everything into memory at boot. The installer has to work the same way, right? So, when I booted up my image of 6.0 U3, I got to the window where you select your disk and wrote down the C#:T#:L# of the SD card volume and hit ALT+F1.

This was a triumph. One may even call it a huge success. The recovery console was available. A coworker of mine came over to see what I was up to and I explained what I was doing. He was as intrigued as I was, as storage is his jam. I logged in as root (which, in the installer, has no password set) and punched in ls. It showed us that there was a /vmfs/ directory, just like a live version of ESXi. In /vmfs/devices/disks/ we found our device.

The command you need to run to convert the volume to GPT (which replaces the existing partition table) is as follows: partedUtil mklabel "/dev/disks/deviceID" gpt. An example would look like this: partedUtil mklabel "/dev/disks/mpx.vmhba11:C0:T0:L0" gpt. Wait, why isn’t /vmfs/devices/ in the path name? /vmfs/devices/ is actually a symlink to /dev/. The command, from our testing, doesn’t even work when you use the symlink.

From there, I hit ALT+F2 and re-scanned the storage for the installer. ESXi 6.0 U3 installed without issue. I hope this helps some admins in the future, as the resources out there on this problem aren’t great! A TL;DR is below with steps. Enjoy!

TL;DR: just give me the fix!

  1. Boot into ESXi’s installer
  2. Get to the disk selection screen and take note of the disk identifier
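  3. Hit ALT+F1 and log in as root (no password is set in the installer)
  4. Navigate to /dev/disks/
  5. Use ls to find your disk identifier
  6. Use the following command to convert the disk into GPT: partedUtil mklabel "/dev/disks/deviceID" gpt
  7. Hit ALT+F2 to return to the installer, re-scan the storage and install ESXi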
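
For anyone who wants to rehearse the console part before touching a real card, here is a minimal sketch of the steps above. The device name is just the example from this post, and the DRY_RUN switch and run helper are my own additions for illustration (they are not part of ESXi). With DRY_RUN left at 1, nothing is executed and the commands are only printed. The getptbl call at the end is an optional check that was not part of the original fix.

```shell
#!/bin/sh
# Sketch of the shell steps from the ESXi installer console (ALT+F1).
# DEVICE is the example identifier from this post; substitute the one you
# noted on the disk selection screen.
DEVICE="${DEVICE:-/dev/disks/mpx.vmhba11:C0:T0:L0}"

# DRY_RUN is a safety switch added for this sketch (not part of ESXi).
# Left at 1, the commands are only printed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Confirm the device shows up where the installer keeps its disks.
run ls /dev/disks/

# Write a new, empty GPT label. This replaces the existing partition table.
run partedUtil mklabel "$DEVICE" gpt

# Optional check: print the partition table that was just written.
run partedUtil getptbl "$DEVICE"
```

On the installer console you would more likely just type the partedUtil line by hand. The wrapper is only there so the sequence can be read and tested safely first.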

Disclaimer: If you break something using this, I take zero responsibility. Any advice you take from me is on you.

VMware: Why Customize?

Today I deployed a greenfield (enterprise speak for “brand new without needing to think about past deployments”) vCenter Server 6.0 environment. Those words don’t mean too much to most people, but for the VMware admins out there they are like birds singing on a summer morning, with blue skies and a slightly warm breeze. Today was a good day. [Insert Ice Cube Meme here.]

Deployment day was supposed to be yesterday. My team and I kicked off the deployment on Wednesday. As we went through the external platform services controller (PSC) deployment, we made two conscious decisions: our SSO domain would be something that isn’t vsphere.local, and we would use the NTP sources that we have set up in the environment to provide NTP. These things don’t seem like that big of a deal. NTP has been around for ages. It’s an essential service of any enterprise environment. In many cases, the SSO domain is a vanity domain that only exists in the vCenter environment. VMware’s guidance is just to ensure that it is not your LDAP/Active Directory domain name.

The first thing that we ran into was an error with the PSC deployment failing its firstboot scripts, complaining that DNS was not set up correctly. Spoiler alert: DNS for the PSC was set up properly. Upon further investigation, a team member stumbled across a blog post that pointed out that you should only use one NTP source. “Great, good, the technology just isn’t there to deploy with two NTP sources,” I said to the team as we all had a good laugh. We redeployed the PSC and all was well, or so we thought.

We ran the vCenter Server Appliance (VCSA) deployment wizard and blew through it with ease. We set it to large, gave it a name, punched in only one NTP source (as we assumed the VCSA also couldn’t handle more than one NTP source) and started the deploy. Like the PSC, it failed to run its firstboot scripts. Again, the VCSA was complaining about DNS. Again, DNS was not the issue. We tried three slightly different deployments, thinking that there may be a gotcha in the deployment process that isn’t documented or an issue with deploying VCSA to a vCenter that is running 5.5 U3. Stumped and ready to leave for the day, my Canadian counterpart opened a ticket with VMware (which is still unresolved despite us getting it working this morning) and we called it a day.

Later that night, as I was checking my email to ensure that I didn’t need to Make Infrastructure Great Again before heading to bed (I’m on my on-call rotation), I saw a message from my Canadian counterpart. He had found this fantastic blog post on why you shouldn’t change your SSO domain from the default, ‘vsphere.local.’ Posted almost one year ago, the article states “VMware Engineering are aware and will resolve this in a future release of vSphere 6.0.” I question this, as it still isn’t fixed. We also decided that we would change the NTP option from specifying NTP servers to the “use ESXi host’s time” option. Our ESXi hosts are all set to use the same NTP sources, so it really didn’t seem like that big of a change in deployment methodology.

We took all of this very valuable information and deployed a successful greenfield vCenter 6.0 environment this morning! The PSC and VCSA deployments all did what they were supposed to do in a very short amount of time. It’s nearly complete, as I think I only need to configure a handful of things tomorrow as I await a couple other components that are still in the provisioning cycle. Good times all around!

The moral of the story is this: don’t overcomplicate your VMware deployments unless absolutely necessary!

Goals, Ambitions and Dreams

I don’t know about you, but there are some days where I think, “Man, I could be doing something with my time that is more productive than sitting here playing CS:GO.” Prior to getting my master’s degree in Information Security and Assurance (a.k.a. Cybersecurity), I loved to write. While I wrote daily during my bachelor’s degree, it was fluffy writing. My master’s degree kicked my ass. I haven’t been able to write anything without flashbacks to writing tomes of technical details and theorems. OK, that’s a tad hyperbolic: I did write a whitepaper last year on Hyperconverged Solutions for my place of work and I churn out opinion pieces on Facebook almost daily (which are short, to the point and fluffy).

What I want to do here with Definitely Alive is to have a place to write about subjects I find interesting, to note small technical things I happen to run across, and to document some aspects of the goings-on of the world. I am also challenging myself to publish 300 posts this year. No, this isn’t Sparta. This is Definitely Alive, my small corner of the Internet where I am going to try to re-spark that love of writing I long for.

300 posts, 365 days. I should probably set myself a reminder to write a post every day, however impossible that may seem. Then again, sometimes I am driving home and think about things I could write about while I am decompressing. I hope this becomes an extension of my daily decompression.

So, feel free to reach out to me. Links to my LinkedIn and Twitter, both of which I use quite frequently, are at the bottom of the page. Oh, in case you were wondering, I hit 600 hours in CS:GO today. That’s got to be healthy, right? RIGHT?