Tag Archives: Nutanix

Export a VM on AHV

Step 1: Find UUID of the vDisk.

Connect to a CVM, enter aCLI and run the command vm.get [vm name]

Copy the vmdisk_uuid.
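If you would rather not copy the UUID by hand, here is a minimal sketch that pulls it out of the vm.get output. It assumes the output contains a line with vmdisk_uuid followed by the UUID; the function name is mine, and the exact output layout may differ between versions, so treat it as a starting point.

```shell
#!/usr/bin/env bash
# Extract every vmdisk_uuid value from text shaped like `acli vm.get` output.
# Assumption: each disk appears on a line containing `vmdisk_uuid` and a UUID.
extract_vmdisk_uuids() {
  grep 'vmdisk_uuid' \
    | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
}

# Usage on a CVM (hypothetical VM name):
#   acli vm.get my-vm | extract_vmdisk_uuids
```

A VM with several disks prints one UUID per line, so check which disk is which before exporting.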

image

image

Step 2: Export the vDisk

vDisks of AHV VMs are located in a hidden folder on the container named .acropolis.  We use the qemu-img command to export the vDisk.  One cool thing is that the vDisk is exported in a thin format, so even if it is provisioned as a 100GB drive, it will only export the actual size used.

Make sure the VM is powered off, then run the following command:

qemu-img convert -c nfs://127.0.0.1/[container]/.acropolis/vmdisk/[UUID] -O qcow2 nfs://127.0.0.1/[container]/[vmdisk].qcow2

Example:
qemu-img convert -c nfs://127.0.0.1/Nutanix/.acropolis/vmdisk/5c0996b9-f114-475f-98c0-ea4d09e8e447 -O qcow2 nfs://127.0.0.1/Nutanix/export_me.qcow2
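To avoid typos in the long NFS paths, a small helper can assemble the command from its three variable parts. This is only a sketch: the function name is made up, and it prints the command for review rather than running it.

```shell
#!/usr/bin/env bash
# Print the qemu-img export command for one vDisk.
# Arguments: container name, vmdisk UUID, output file name (all supplied by you).
build_export_cmd() {
  local container="$1" uuid="$2" out="$3"
  local base="nfs://127.0.0.1/${container}"
  printf 'qemu-img convert -c %s/.acropolis/vmdisk/%s -O qcow2 %s/%s\n' \
    "$base" "$uuid" "$base" "$out"
}

# Review the printed command, then run it once the VM is powered off.
build_export_cmd Nutanix 5c0996b9-f114-475f-98c0-ea4d09e8e447 export_me.qcow2
```

Called with the values from the example, it prints the same command shown there.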

Step 3: Copy the vDisk

Once the export completes, you can now whitelist a Windows 2012 R2 server and simply browse to the container and copy the vDisk.

image

Nutanix – The difference is SERVICE!

It’s still sinking in that I spent the past week at new hire training for one of the hottest and fastest growing tech startups. I’m joining their federal sales team as a systems engineer.  I am humbled by the opportunity to work with so many great people.

Part of the training comprised meeting all of the senior leadership of the company, each giving a 30-60 minute overview of their role at the company and how to interface with them.  They definitely have the vision and drive to be disruptive to the tech industry.  And passion. It’s not often that you become part of something where everyone you work with has the same passion for the product.

At the end of sales training I went down to an empty conference room on the first floor of Nutanix HQ.  As I sat there catching up on the email that had filled my new mailbox and messaging my wife about the day, I was surprised by the CEO of Nutanix, Dheeraj Pandey.  He popped in and asked me what I had thought of the training.  I told him that it was like drinking Kool-Aid from a firehose.  I realized from his facial expression that what I had said probably didn’t come out right.  If I could get a do-over, I would say something like this: coming from a service delivery role, I was a little overloaded with information on sales-y type stuff, but felt challenged to practice and hone my messaging so I could share my passion for the product and company… yeah, that sounded sales-y.

Obviously I need some practice.   Why would I want to challenge myself by going from service delivery to pre sales?  Because I want to share my passion for the technology and be a part of building the future.

As a kid I was fascinated by tech.  I was always collecting computer parts and getting old hand-me-down 486s to connect to the internet.  In high school I ran my own web/gaming server in my bedroom.  I was passionate about learning technology.  More than anything else I wanted to work at the local ISP.  I thought if I could get a job there I would be able to get my hands on the latest technology and be part of building the future.  I tried to apply for a job there every year.  Finally, when I was 18, I was able to convince them that I was passionate about the technology they had, and they relented and gave me a chance, hiring me as a help desk tech.

The helpdesk techs were also given the task of taking all of the new sales calls.  We were often asked, “Why should I buy your local service for $X when I can get the same service for less somewhere else?”  I would rather have been asked how to configure Trumpet Winsock to connect using SLIP on Windows 3.11.

After a couple weeks of fumbling for answers I asked my boss for help.  He asked me a couple of questions.

  1. When you call a company, do you like to wait on hold before you are able to speak to someone?
  2. When your car needs maintenance, do you take it somewhere far away for the work to be performed?
  3. Do you like being able to know the name of the mechanic that performs the maintenance on your car?

My boss explained to me that the difference between us and our competitors was SERVICE!  Our customers preferred our service to our competitors even though we weren’t the least expensive service in town.  It was our responsibility to make sure that we delivered that premium level of service or our customers would leave.  No matter what we should answer the phone before the call queued.  No matter what we should solve our customer’s issue before they got off the phone.  We would even let our customers bring their computer into the shop and we would fix their issue for free if we could.

As a Nutanix customer I experienced the same level of customer service and commitment.  Just one example of this is when HR hired 1000 additional people and told us they were starting in a week.  We panicked.  I’m sure Nutanix panicked when we told them we needed to double our order and needed it before Monday.  Somehow they pulled through and our hardware arrived in less time than I could have even submitted a PO with other vendors.  We met our crazy deadline.  Nutanix even sent someone on-site to help install it… at no additional charge!  And that is the difference between other infrastructure vendors and Nutanix… the difference is SERVICE!

New hire training confirmed for me that, just as my first boss taught me so many years ago, the message that customer service is THE MOST IMPORTANT THING comes from the senior leadership at Nutanix. They get it.

That day when I added all of the additional hardware for 1000 unexpected VDI users I recognized I had just seen something that I had never seen before.  I powered on the hardware, and a few minutes later I had 1000 additional desktops, without having to configure any LUNs, switches, or cable anything other than power and ethernet.  I had seen the future of infrastructure.  It was the power of the software defined datacenter.  It was webscale.  It was the same feeling that I had when I saw vMotion for the first time and thought holy shit! This changes everything!  This is amazing! I need to learn all that I can about this technology!  This is the future and I need to be a part of building the future!

Two years later Nutanix is giving me a chance to be part of building the future.  I will try my hardest to keep up with these amazing people and continue to share my passion and enthusiasm with all of you.

Use PowerCLI to Automate Disaster Recovery Failover On Nutanix

Using VMware SRM on Nutanix has a few challenges.  SRM expects replication to happen at the datastore level.  By default, Nutanix protection domains replicate at the VM level.  It is possible to set up Nutanix replication at the datastore level, but you lose the granularity of being able to take VM-specific snapshots.  SRM is also dependent on vCenter and SSO.  We were having a few issues that caused us to migrate from the Windows version of vCenter to the vCenter Server Appliance, and doing so broke SRM, so it had to be set up again.  Instead of setting it up again, I figured we would get more flexibility if I could do the same thing with PowerCLI.  Unfortunately, Nutanix’s PowerShell cmdlet Migrate-NTNXProtectionDomain was published before the failover part of the command was actually implemented, so after the script runs you still need to perform the additional step of logging into Prism and clicking Migrate. The script checks whether the VMs are Windows or Linux. If they are Linux, the script expects a staged file called failover, which copies a staged network interface configuration file into place.
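For the Linux side, here is a rough sketch of what a staged failover helper could look like. It is not the original file: the function name, the backup suffix, and the example paths are all placeholders I made up to illustrate the idea of swapping in a pre-staged NIC configuration.

```shell
#!/usr/bin/env bash
# Hypothetical guest-side "failover" helper: swap in a pre-staged NIC config.
# Paths and names are placeholders, not the original script's values.
apply_staged_nic_config() {
  local staged="$1" active="$2"
  if [ ! -f "$staged" ]; then
    echo "staged config not found: $staged" >&2
    return 1
  fi
  # Keep a copy of the current config so the change can be reverted on failback.
  if [ -f "$active" ]; then
    cp -p "$active" "${active}.pre-failover"
  fi
  cp "$staged" "$active"
}

# Example (hypothetical paths on a RHEL-style guest):
#   apply_staged_nic_config /root/ifcfg-eth0.dr /etc/sysconfig/network-scripts/ifcfg-eth0
```

The network service still has to be restarted afterwards for the new addressing to take effect.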

Change Nutanix CVM RAM with PowerCLI

*Update – story behind the script*
Finally I have a few minutes to write the story behind this script.

One of our VMware View environments was experiencing performance problems. The CPUs on our VMs would constantly spike to 100% after they were powered on. Our admins relayed back to engineering that they were having density issues. We reached out to Nutanix, who recommended that we increase the cache size to be able to absorb more IOPS. To increase the cache size on Nutanix you simply need to power off the controller virtual machine (CVM) on a host, increase its RAM, and power it back on. While this is a non-disruptive process if you power the CVMs off and on one at a time, it becomes a very disruptive process if someone makes a mistake and powers off more than one CVM at a time. It is also very time intensive because you must check that the CVM services are completely back up before you perform the procedure on the next CVM. With 120 hosts in our environment, and averaging 10 minutes per manual CVM procedure, it looked like it was going to take about 20 hours to perform this task. For us this means 3-4 days in maintenance windows!

I figured there had to be a way to automate this and eliminate the human component so we could perform this maintenance task all in one maintenance window. After a couple of hours of fiddling with PowerCLI, figuring out which CVM service is the last to come up, and running the script in our test environment to work out the bugs, we were ready to run it in production. In our environment the average run time per CVM was about 5 minutes, but the best part is that it really saves hours of admin time. An admin only needs to babysit the script while it is running instead of needing to perform an intensive manual process. This shows the huge benefit of Software Defined Storage. Imagine trying to update cache on a traditional SAN without any downtime… isn’t going to happen.
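The core of the automation is a "one at a time, and wait until it is healthy" loop. The PowerCLI script itself is not reproduced here, so the following is only a shell sketch of that control flow; step_cmd and ready_cmd are hypothetical hooks you would supply (for example, one that resizes and restarts a CVM, and one that checks its services).

```shell
#!/usr/bin/env bash
# Run a maintenance step on one CVM at a time, waiting for each to report
# healthy before touching the next. `step_cmd` and `ready_cmd` are
# caller-supplied commands (hypothetical hooks); each receives the CVM name.
rolling_apply() {
  local step_cmd="$1" ready_cmd="$2" poll_secs="$3" max_polls="$4"
  shift 4
  local cvm tries
  for cvm in "$@"; do
    "$step_cmd" "$cvm" || { echo "step failed on $cvm, stopping" >&2; return 1; }
    tries=0
    until "$ready_cmd" "$cvm"; do
      tries=$((tries + 1))
      if [ "$tries" -ge "$max_polls" ]; then
        echo "$cvm did not come back, stopping" >&2
        return 1
      fi
      sleep "$poll_secs"
    done
  done
}
```

Stopping at the first failure is deliberate: it keeps a problem on one CVM from turning into two CVMs being down at the same time.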

It later turned out that the issue in our environment was a classic VMware View admin mistake of installing updates and then shutting down immediately and recomposing the pool. The updates needed to finish installing after reboot, so they finished installing on all of the linked clones when they powered on. Combined with refresh on logoff, which occurs multiple times per day, it was a sure way to test the max performance of our equipment!

What #NixVblock should have been

Nutanix is running a marketing campaign #NixVblock.  As part of the marketing  campaign they had a video that I really can’t describe better than the way Sean Massey put it:

“VBlock is supposed to be an uninteresting, high maintenance woman who hears three voices in her head and dresses like three separate people.

The “VBlock” character is supposed to represent the negatives of the competing VCE vBlock product.  Instead, it comes off as the negative stereotype of a crazy ex that has been cranked past 11 into offensive territory.”

While I was not personally offended by the video, it was inappropriate, and I was very disappointed.  It had the feeling of the inside joke that you tell someone else who isn’t involved and then you come off as an insensitive jerk.  You didn’t mean to be an insensitive jerk, you just wanted to let your new friend in on the joke too.  When you turn that joke into a marketing video for your company, comparing your competitor to a crazy date and broadcast it to the world in an official marketing campaign, that is sexist and immature.  Would VCE put out a video like that?  The immaturity of the video just makes Nutanix come out looking like the underdog that they are… nipping at the heels of VMware, Cisco and EMC.

Since Nutanix is still a startup, perhaps they still have interns running the marketing department?  I really only need to ask the marketing department one question that should illustrate why I am upset that they chose such an immature method to attempt to communicate their product’s technical superiority to vBlock (which that video doesn’t even attempt to address).  Who is the intended audience of that video?  Is it customers that haven’t purchased Nutanix before but are also considering VCE?   Consider that some of my US Federal customers have many organizations run by women.  Is that video something that I should point them to that will make them choose Nutanix over VCE?  Is that video going to help convince them that Nutanix is actually the more mature feature rich product?

I have actually experienced trying to procure VCE for a project.  VCE is actually a separate company that resells VMware, Cisco and EMC in one package.  They market that the value add is that their support is qualified in all three products and won’t redirect you to VMware, Cisco or EMC.  But in reality this only helps tier 1 sys admins.  If you forget to check a box, VCE will help you, but if you encounter anything that is a serious bug in one of the technologies, you are going to get redirected to the source.  Also, when I tried to procure VCE it came out as SIGNIFICANTLY more expensive than just buying the components separately and putting them together myself… I guess that VCE SME has to eat too?!  Imagine that… putting in a middle man costs more money rather than less… Who would have thought it?!

Another disadvantage you have with VCE is that you lose the ability to compete the internal components.  For example, I lose the ability to compete VMware with Citrix, Cisco with Brocade or Arista, and EMC with NetApp, which lowers costs for my customers.  I also had the requirement to have US citizen on US soil support, and at the time the VCE rep couldn’t answer whether they had it or not… i.e. I was going to get redirected to the component supplier when I called support anyway.  In the end, I just bought the separate VMware, Cisco, and EMC components and bolted them together myself.

Of course that was long before Nutanix.  Which brings me to the title of this post.  All Nutanix really had to do was highlight the features that Nutanix has that vBlock doesn’t have.  Let’s compare.

| Nutanix | vBlock |
| --- | --- |
| Built-in VM aware Disaster Recovery integrated into GUI with N:Many replication | Not built in.  Can buy RecoverPoint for block replication and MirrorView for file replication. Not VM aware unless you’re talking about vSphere replication, but that’s not really storage-level replication |
| VM aware storage snapshots | Block or file level snapshots |
| Simple web based GUI interface | Cluttered Java interface that I can only get to when I alter security policies to allow some version of Java 5 releases old. |
| Storage Controller on every node | 2 Storage Controllers |
| Infinitely scalable | Forklift upgrade |
| Shared nothing architecture | Shared everything architecture |
| Built in Compression / Deduplication | Why would you compress / dedupe?  How would VCE make you buy more disks? |
| Shadow Clones | Nothing like shadow clones. |
| Built in storage analytics that detail IO by disk, VM and node | Not built in.  You can buy the EMC Storage Analytics plug-in for vCOPS for $20K. |
| Prism Central management interface can span multiple clusters | You can argue that Unisphere can do this too, but is still in Java and sucks. |

I could sit here for an hour adding to this list, but I think I’ve made my point.

Nutanix, please don’t fire anyone for failing with that video.  We can forgive you, and you need to allow people to make mistakes, learn and grow from them, but going forward please stick to marketing your strengths.  You don’t need to put anyone down; what you are doing stands out on its own.  Take the high road and you’ll win more friends.  I also get that it may have grown out of an inside joke and sometimes it is hard to see any potential complications from the inside, but you have enough money to get an external PR agency for future marketing campaign analysis.

Nutanix and VMware vSphere Host Profiles

Host profiles seem like a great idea… Make sure that all of your hosts are configured consistently and enforce compliance. However, when it comes to actually applying a host profile the caveat is that you need to put the host in maintenance mode to apply it. This means that you have to vMotion any running VMs to another host and then enter maintenance mode… A process that could take quite a while depending on the number of VMs you have running.

On Nutanix there is the pesky issue that there is one VM that you can not vMotion to another host… the CVM! The CVM (Controller Virtual Machine) is the storage controller that lives on the host. The physical disks are presented to the VM through VMDirectPath. Since Virtual Machines that are tied to physical devices on the host can not be vMotioned the host will fail to enter maintenance mode. It is possible to shut down a CVM on one node, then put that host into maintenance mode, apply the host profile, exit maintenance mode, power on the CVM, then SSH into the CVM to make sure it is back into the storage cluster before you rinse and repeat for all of your hosts. However, that is a very manual process! It would be bearable to perform on one block (four Nutanix hosts), but if you have hundreds of hosts it will take weeks and a small army of dedicated sys admins to complete the task.

It’s too bad that VMware couldn’t have host profiles distinguish between minor and major changes when dealing with applying host profiles. For example adding a port group would be a minor change, not requiring entering into maintenance mode, while attaching a vSwitch to a vNIC would be a major change requiring maintenance mode because of its potential to disrupt traffic for all of the VMs on that host.

Do we really need host profiles? Nutanix is trying to market the idea that infrastructure should be web-scale. I don’t really like the term web-scale because I think it implies that you’re trying to build some kind of internet service, but that’s beside the point… What they are trying to say is that it should be easy to massively scale infrastructure. That includes not having to manually configure a bunch of settings. Putting all of the hosts in your environment into maintenance mode just to apply some settings definitely isn’t scalable. There is no reason to do it!

Every change that a host profile makes can be accomplished through PowerCLI without putting your host into maintenance mode. My recommendation for Nutanix hosts is to use PowerCLI to make any changes to your hosts that you want to be consistent throughout your environment, and then maintain your PowerCLI script and apply it to new hosts that you add to your environment.

You could also make a script that checks the settings on the hosts to monitor for compliance, for example to make sure that no one has added a VLAN to just one host. If you are using vCloud in your environment, VMware includes VCM (vCenter Configuration Manager), which accomplishes the same task, with the added component of generating automated compliance reports.
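As a rough illustration of that kind of compliance check, here is a shell sketch that flags any setting not present on every host. It assumes you have already exported the data as "host setting" pairs, one per line, with no spaces in either field; how you collect them (PowerCLI, esxcli, or anything else) is up to you, and the function name is made up.

```shell
#!/usr/bin/env bash
# Read "host setting" pairs on stdin and print every setting that is missing
# from at least one host, with the number of hosts that have it.
# Assumption: neither field contains whitespace.
find_inconsistent_settings() {
  awk '
    { if (!(($1, $2) in seen)) { seen[$1, $2] = 1; count[$2]++ }
      if (!($1 in hosts))      { hosts[$1] = 1; total++ } }
    END { for (s in count) if (count[s] != total)
            printf "%s present on %d of %d hosts\n", s, count[s], total }
  ' | sort
}

# Usage with a hypothetical export file:
#   find_inconsistent_settings < portgroups.txt
```

An empty result means every setting in the export appears on every host that appears in the export.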

Of course, I’m assuming that your hosts are running VMware. Nutanix also supports running Hyper-V and KVM, where it’s almost a given that you are going to need scripts to maintain consistency in the environment.

Nutanix CVM Autopathing Test

I have a Nutanix cluster that needs to be upgraded from 3.1.2 to 3.5.2.1 (or 3.5.3.1 if it is out by the time I get around to upgrading it). That got me to thinking about the upgrade process. When you perform a Nutanix Operating System (NOS) upgrade, it performs what Nutanix calls a “rolling upgrade”. This in effect only performs the upgrade on one CVM at a time. While the CVM is being upgraded, the storage on that node is directed to another CVM.

My first thought was, “How does that actually work?” Thanks to Zach Vaughn @z_n_v, Nutanix SE Extraordinaire, my eyes were opened.  When the cluster detects that a CVM is down, it SSHs to the hypervisor (I’m referring to ESXi) and adds a route to the external IP of another CVM in the cluster. The cluster performs this check every 30 seconds, so it is possible that your VM will be without storage for 30 seconds. How disastrous could this be? (I’m told that as of NOS version 3.5.3.1 this will be much faster than 30 seconds). The following video shows what happens.
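To make the idea concrete, here is a toy sketch of the decision the cluster has to make on each check. This is not Nutanix’s code; the function name and the check_cmd health probe are hypothetical, and it only models "use the local CVM if it answers, otherwise the first healthy peer".

```shell
#!/usr/bin/env bash
# Illustration only: choose where a host's storage traffic should go.
# Prints "local" when the local CVM passes the health check, otherwise the
# first healthy peer CVM, or "none" when nothing is reachable.
# `check_cmd` is a caller-supplied health probe (hypothetical).
pick_storage_target() {
  local check_cmd="$1" local_cvm="$2"
  shift 2
  if "$check_cmd" "$local_cvm"; then
    echo "local"
    return 0
  fi
  local peer
  for peer in "$@"; do
    if "$check_cmd" "$peer"; then
      echo "$peer"
      return 0
    fi
  done
  echo "none"
  return 1
}
```

If a decision like this only runs every 30 seconds, a CVM that dies right after a check can leave its host without storage for close to the full interval, which matches the hang described here.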

This test was performed on a Nutanix 1350 block running NOS 3.5.2.1. The desktop is running on Node C. I start encoding a video using handbrake which is writing to the user’s desktop on the local disk. When I shut down the CVM on Node C the desktop appears to hang for 20 seconds. However, it is possible that the PCoIP server process stops responding for those 20 seconds, as when the desktop resumes you can see that it has still received pings from the hypervisor.

I ran this test from a different machine and the View Client seemed to stay connected. The difference being that it was an iMac connected via ethernet and I recorded the video on my Macbook Pro connected via wireless. The desktop continued to receive pings, but the handbrake process stopped while the disk was unavailable for about 20 seconds and then resumed when the route to the CVM was changed on the hypervisor. If I can get that to work again I’ll try to post another video.

Export Nutanix Configuration to CSV through Powershell and REST API

What do you do when you have over 100 Nutanix nodes scattered across multiple datacenters and need to audit the configurations, or record the configurations for documentation?

Write a powershell script that queries the REST API of course!

In this instance I needed a known starting point.  I didn’t have all of the IP addresses of the CVMs, hosts, etc in a format that I could query.  What I did have was all of the hosts in vCenter along with all of their CVMs.  So this script starts by connecting to all of the vCenters in the Datacenters and getting a list of all of the CVMs and their IP addresses.  It then runs REST API queries against the CVM IPs.
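The PowerShell script is not reproduced here, but the query step can be sketched in shell with curl. Treat the port and URL path below as assumptions based on the Prism v1 REST API and verify them against your NOS version; the function names are made up, and the sketch only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Build the REST URL for one CVM. The port and path are assumptions based on
# the Prism v1 API; check them against your NOS release.
cluster_url() {
  printf 'https://%s:9440/PrismGateway/services/rest/v1/cluster/\n' "$1"
}

# Print one curl command per CVM IP read from stdin (nothing is executed).
# -s is quiet mode, -k skips certificate checks, -u makes curl prompt for
# the password of the given user.
print_cluster_queries() {
  local user="$1" ip
  while read -r ip; do
    if [ -n "$ip" ]; then
      printf 'curl -sk -u %s %s\n' "$user" "$(cluster_url "$ip")"
    fi
  done
}

# Usage with a hypothetical file of CVM IPs, one per line:
#   print_cluster_queries admin < cvm_ips.txt
```

Each response is JSON, so turning it into CSV rows still needs a parser of your choice.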


Here’s what the output looks like when opened in Excel (and scrubbed of proprietary information):

image


Any blocks that are not configured yet, or are not running a version of NOS that has the REST API, or do not have network connectivity will return System.Collections.Hashtable values as you can see below.

image

Nutanix Block Startup / Shutdown Powershell Scripts

Anyone who has Nutanix lab blocks that need to be started / stopped frequently may appreciate these scripts.

Upgrade Nutanix 1350 block to ESXi 5.5

Nutanix recommends that you upgrade to vSphere 5.5 using VMware Update Manager instead of directly mounting the ISO.

Another way to upgrade instead of installing Update Manager is to just download the offline bundle and run the command:

esxcli software vib update -d "FILEPATH to OFFLINE BUNDLE"

Here are the steps that I used to upgrade my nodes from ESXi 5.0.0 to ESXi 5.5.

  1. Download ESXi 5.5 bundle from VMware.
  2. Upload the bundle to the root of my Nutanix datastore

    image

  3. SSH to the CVM.  From the CVM we can execute a script that will run on all of the hosts:

    for i in `hostips`; do echo $i && ssh root@$i "esxcli software vib install -d /FILEPATH TO OFFLINE BUNDLE"; done

    *I missed that hostips is wrapped in backticks, not single quotes, so I just logged on to each host and ran "esxcli software vib install -d /FILEPATH TO OFFLINE BUNDLE".

    image

  4. Shutdown the CVM.  We are able to shut down one CVM at a time without disrupting the state of the cluster.   Then reboot the host.

     image

  5. Rut-roh!  My host didn’t come back into vCenter.  When I try to force it to reconnect it tells me that some virtual machines powered back on without following the cluster EVC rules.  Upgrading to ESXi 5.5 must have reset the EVC setting on that host.

    image

    To remedy it I shut down the CVM, force the host to reconnect, then power the CVM back on.  On the next node I just put the host into maintenance mode before I reboot.
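If you want to double check the loop from step 3 before running it against every host, a dry-run variant can print the per-host commands first. This is just a sketch with a made-up function name; the bundle path stays whatever your own path is.

```shell
#!/usr/bin/env bash
# Print the per-host install command instead of running it, so the list can
# be reviewed first. Host IPs are read from stdin, separated by spaces or
# newlines; on a CVM they can come from `hostips` (the helper used in step 3).
print_install_cmds() {
  local bundle="$1" ip
  for ip in $(cat); do
    printf 'ssh root@%s "esxcli software vib install -d %s"\n' "$ip" "$bundle"
  done
}

# Usage on a CVM, substituting your own bundle path:
#   hostips | print_install_cmds "/FILEPATH TO OFFLINE BUNDLE"
```

Once the output looks right, the printed lines can be run one host at a time, which also fits the one-CVM-at-a-time reboot in step 4.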