Geaux Virtual

Helping virtualize the datacenter…

Looking beyond Long Distance VMotion…

First post in a while, but this should be changing in the near future.

I’m back from VMworld 2009, where one topic caught my interest: Long Distance VMotion.  However, when I looked at the technology behind it, it left me wanting more.

What more could I be looking for?  Datacenter VMotion.

How does this differ from Long Distance VMotion?  Datacenter VMotion would combine Long Distance VMotion with migrating the storage live as well (Long Distance Storage VMotion?).  If storage is already replicated to the second location, it would just be a matter of syncing the changed blocks of the VM to the second location.  One click, and all VMs (or at least the critical VMs) migrate to the second location to complete the Datacenter VMotion.

September 5, 2009 Posted by jguidroz | VMware | 2 Comments

Where’s my CPU Affinity??

We were having a discussion today about a vendor that said they currently do not support virtualizing their server application because of the “real-time clock” issues with virtualization, but that they were working on this.  This led me to go find the VMworld 2008 presentation on Real-Time Applications.  The solution in the presentation was to set CPU Affinity for the VM.  Well, for the next 15 minutes or so, we went looking for CPU affinity with no luck in finding it.  Finally, we stumbled across it….

CPU affinity is hidden in the preferences for the VM when the VM is actively part of a DRS cluster.  For VMs in the DRS cluster with DRS disabled, CPU affinity can be found under the Advanced CPU selection in the resources tab.

May 8, 2009 Posted by jguidroz | VMware | No Comments Yet

svmotion said NO!

Today we needed to do a full restore of one of our servers; however, we did not have enough space on the LUN to take a snapshot of the system disk of the VM and perform a full restore.  So I decided I would svmotion the VM system disk to another LUN.

svmotion said NO!

It seems that if a VM has independent-persistent disks, svmotion will not move any of its virtual disks, even if the independent-persistent disks are remaining on their original LUN.  I currently do not have my vSphere 4.0 RC hosts connected to a SAN, so I cannot test whether this is still the case in vSphere 4.0.

May 4, 2009 Posted by jguidroz | VMware | No Comments Yet

So you say to use Update Manager

Update Manager 1.0 is great.  Automatic update downloads.  Compliance baselines.  Support for VM and ESX host patches.

Sure there are some issues, such as downloading updates for ESX 3.0.3 and ESXi even though none of these versions exist in my Datacenter (I hear this is fixed in vSphere 4.0, though I can’t confirm as the Update Manager RC crashes when trying to run with domain credentials).

Now to what I’m here to talk about: using Update Manager to update hosts across a WAN.  When you have a few patches to push, it works great.  Slow, but it works.  So what do you do when you have 2 GB+ of patches to install on 24 hosts located across 7 different states?

PATCH DEPOTS!!!

Here is what I did to update my 24 remote hosts.  I copied all the patches that had been downloaded since my last update back in June of 2008 to a folder on a VMFS volume at each location.  I also copied the contents.xml.sig and contents.xml files to this directory.  I then logged on to each host using ssh and ran esxupdate to patch the hosts.  Note that you either have to run esxupdate from the patch depot directory or specify the depot location when executing esxupdate.

So from the patch depot, you would execute esxupdate -b ESX350-Update04 --test update to first test the installs.  Then run esxupdate -b ESX350-Update04 update to update the servers.  And I have to say, this patched the servers a lot quicker than Update Manager did.  From reading the VMware Communities, it seems that after any patch that restarts hostd, Update Manager waits 10 minutes before installing the next patch.  This is not an issue with esxupdate on the host itself.
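The test-then-update sequence above can be sketched as a small helper.  This is only an illustration: it builds the two command lines quoted in this post (the bundle name ESX350-Update04 and the --test option come from the post itself) and does not execute anything, and the function name is hypothetical.

```python
def esxupdate_commands(bundle):
    """Return the dry-run and real esxupdate command lines for one bundle.

    The commands mirror the ones quoted in the post; nothing is executed here.
    """
    base = ["esxupdate", "-b", bundle]
    dry_run = base + ["--test", "update"]   # test the installs first
    real_run = base + ["update"]            # then apply the patches
    return dry_run, real_run


dry_run, real_run = esxupdate_commands("ESX350-Update04")
print(" ".join(dry_run))
print(" ".join(real_run))
```

Run from the patch depot directory on each host, the first line is the test pass and the second is the actual update.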

In vSphere 4.0, it seems like Update Manager has the ability to utilize patch depots at different locations, but I believe the patch depots reside on a host, not on a SAN LUN.  Again, I can’t confirm this just yet as the Update Manager RC isn’t running for me right now.  If the patch depot cannot reside on a SAN LUN, I think I’ll be submitting a feature request for this.  Sure, residing on a host is at least a step in the right direction, but why push patches over the network when each host has access to the same LUNs on the SAN?

April 29, 2009 Posted by jguidroz | VMware | No Comments Yet

Fixing my Update Manager issue

I recently upgraded vCenter to 2.5 Update 3, and this was my first time using Update Manager to update the hosts at one of my sites to ESX 3.5 Update 3 since the vCenter upgrade.  I migrated all the VMs off my “test” system, and ran an Update Manager scan to see which updates needed to be applied.  And this is when I saw this error message:

VMware Update Manager had a failure.

So I proceeded to try again.  Same error.  Restart the service on the vCenter server.  Same error.  I stopped the service to look at the logs, and I saw a SQL error regarding a foreign key constraint.  Searching for the error on Google, VMware, etc., I came across KB1007512.  This was the exact issue I had seen.  So following the KB, I had to let all the ESX/ESXi updates re-download.

Fast forward a couple of hours to the completion of the re-download of 5.70GB of updates.  I restart the Update Manager service and rescan my host.  I receive the same error:

VMware Update Manager had a failure.

Stop the Update Manager service once again.  Upon inspection of the log files, I see a different SQL error, this time about duplicate primary keys.  Search Google, VMware, etc.  Nothing.

I really wanted to push this off till the morning, but with the hectic schedule I had the next day, I decided to use my trusty VMware Support.  I gave them a call, and after I uploaded various log files to the support team, they had a fix for my issue.  In the Update Manager database, there are two tables: VCI_SCANHISTORY and VCI_SCANRESULTS.  The identity value for VCI_SCANHISTORY must be larger than the identity value for VCI_SCANRESULTS.  Running SELECT MAX(ID) against VCI_SCANHISTORY and VCI_SCANRESULTS showed the identity values to be 20 and 134, respectively.  The fix was to issue the following command:

DBCC CHECKIDENT ("VCI_SCANHISTORY", RESEED, 135)

This command reset the identity value of VCI_SCANHISTORY to 135.  With the identity value reset, I restarted the Update Manager service and successfully scanned my host to begin updating.  Thanks to VMware Support, I was able to go to bed at a reasonable hour.
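The arithmetic behind that fix can be sketched as follows.  This is my own illustration, not something VMware Support provided: it assumes the rule is simply “reseed VCI_SCANHISTORY to one more than the largest ID in VCI_SCANRESULTS,” which matches the 134 to 135 step above, and the function names are hypothetical.  It only builds the statement text; it does not touch a database.

```python
def reseed_value(history_max_id, results_max_id):
    """Return the new seed for VCI_SCANHISTORY, or None if no reseed is needed.

    Assumption: the history identity only has to be larger than the
    results identity, so one past the results maximum is enough.
    """
    if history_max_id > results_max_id:
        return None
    return results_max_id + 1


def reseed_statement(table, seed):
    """Build the DBCC CHECKIDENT statement text for the given table and seed."""
    return f'DBCC CHECKIDENT ("{table}", RESEED, {seed})'


seed = reseed_value(20, 134)  # the two MAX(ID) values from the post
print(reseed_statement("VCI_SCANHISTORY", seed))
```

With the values from my database (20 and 134), this produces the same statement Support had me run.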

March 17, 2009 Posted by jguidroz | VMware | No Comments Yet

My beef with Update Manager

First, I’ll start off with what caused me to make this post.

KB Article 1007512 – Scan of host fails after upgrade to Virtual Center 2.5 Update 3.

The resolution to this KB article is to rename the hostupdate folder and let Update Manager download all the ESX patches again.  5.60 GB worth!!!  So, without further ado, here are my beefs with Update Manager:

  1. Update Manager is not smart enough to know I do not have any ESX 3.0.3 hosts in my cluster.  Furthermore, there is no option to disable downloading ESX 3.0.3 updates.
  2. Unlike with Windows hosts where Update Manager only downloads the updates that are needed as per the baselines, Update Manager downloads every ESX update available, even if it’s not needed.
  3. Resolution to KB 1007512
  4. No ability to schedule reboots when patching a Windows host.  You can schedule when a patch will be pushed to a Windows host, but you cannot push patches and then schedule a reboot for a maintenance window at a later time.
  5. Unable to have more than one patch repository (would be very useful in networks with one VC server, but multiple clusters located across a WAN).

Now, Update Manager is a really good product for a 1.0 version.  I hope to see some of these changes in future releases of this software.

March 12, 2009 Posted by jguidroz | VMware | No Comments Yet

Campus Network Design

My role in modernizing our work network started a little over a year ago when I joined the network team.  To better understand our network, it’s best to think of it as a campus network.  We have a lot of buildings, only a few of which have a large number of users, all connected together in a switched network.  In total, I believe over 100 switches currently make up our network.  The list of issues on our network is long:

  • All users, switch management, servers, printers, etc. all on VLAN 1
  • Access switches daisy chained together
  • Switches handling routing that are not the core switches of the network

Now outside of everything being on VLAN 1, some may ask what exactly is wrong with this layout.  Well for one, it’s very difficult to locate devices on the network, since all devices are on the same subnet.  Thankfully, we have a Fluke Optiview that can locate devices in a large layer 2 network.  The network is also prone to broadcast, unicast, and multicast flooding.  In fact, for a couple of weeks, we experienced unicast flooding due to a misconfiguration between our core switches and the newly-installed-but-completely-out-of-my-control layer 3 switches. (Maybe one day I’ll do a post about outsourcing and its “benefits”).

So I’ve set out to improve the network.  I am working to model the network after the Cisco Campus Network Design.  The basis of this design is that access switches, routers, VPN routers, datacenters, etc. connect to distribution switches, and the distribution switches connect to two core switches.  The two core switches are the control center for the network.  Now, if you follow the Cisco recommendations, the core switches only handle layer 3, and the distribution switches handle layer 2 and layer 3.  Recently, there was a design discussion about bringing routing all the way down to the access layer.

I see two issues with bringing routing all the way to the access layer in our network.  First, we have a guest VLAN for our contractors and a security VLAN for our security network, each of which is spread across our network as a single VLAN.  VRF-lite is basically VLANs for layer 3, but I believe each network in the VRF would be in its own subnet.  However, I haven’t verified this yet.  If this is the case, then that will increase management.  Second, with over 100 switches and two networks (data and voice) on each switch, we are talking about 200 separate networks.  That is 200 subnets to manage DHCP addresses for, compared to the 2 we have now.  This would be a huge increase in management.  Granted, there are benefits: we could easily locate devices on the network by just performing a traceroute, broadcast or unicast flooding would be limited to a single switch, and with the access switches doing routing, you could standardize on a similar configuration for all switches.

But do the increased efficiencies in the network offset the increased management of the network?

Let’s forget about routing at the access layer, and go back to the design with the distribution switches being layer 2/3 devices.  I would still have to look at using VRF-lite for the security and guest networks.  With about 8 distribution points in the network, I could have a minimum of 16 VLANs (1 voice and 1 data VLAN for each distribution point) and 16 subnets, but there would be more than one access switch per VLAN.  A limit could be set of, say, 5 access switches per VLAN.  This would increase the number of VLANs and subnets to manage, but would reduce the number of devices in each VLAN.  But in this setup, there is still the issue of locating devices on the network.  If one went with the minimum setup of 16 VLANs and 16 subnets, private VLANs could be used to isolate traffic between buildings, but even private VLANs are not without their security issues, so this would need to be taken into account.
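To put rough numbers on the options, here is a quick back-of-the-envelope sketch.  The 100 switches, 8 distribution points, 2 networks (data and voice), and limit of 5 access switches per VLAN come from this post; the way the switches are split across the distribution points is purely hypothetical, since I haven’t given the real layout.

```python
import math

NETWORKS_PER_GROUP = 2  # one data and one voice network


def routed_access_subnets(switch_count):
    """Subnets needed when every access switch routes its own networks."""
    return switch_count * NETWORKS_PER_GROUP


def distribution_vlans(switches_per_distribution, max_switches_per_vlan=None):
    """VLANs needed when routing stops at the distribution layer.

    With no limit, each distribution point gets one data and one voice VLAN.
    With a limit, each point is split into groups of at most that many switches.
    """
    groups = 0
    for count in switches_per_distribution:
        if max_switches_per_vlan is None:
            groups += 1
        else:
            groups += math.ceil(count / max_switches_per_vlan)
    return groups * NETWORKS_PER_GROUP


# Hypothetical split of 100 access switches across 8 distribution points.
layout = [13, 13, 13, 13, 12, 12, 12, 12]

print(routed_access_subnets(sum(layout)))  # routing at the access layer
print(distribution_vlans(layout))          # minimum: one pair per distribution point
print(distribution_vlans(layout, 5))       # at most 5 access switches per VLAN
```

Under that assumed layout, the three designs come out to 200, 16, and 48 networks respectively, which shows how quickly the count grows as the VLANs get smaller.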

All in all, it appears I have a daunting task ahead of me.  At some point, I will have to make concessions on both manageability and efficiency in the network.  This exercise has, though, made me think about subnets and manageability vs. efficiency.  Say I wanted a different subnet for each access switch.  I install a 24 port switch.  Is it really efficient to assign a /24 subnet to this switch just for management’s sake, since this is what most people are used to?
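As a rough illustration of that question, the sketch below compares a /24 against the smallest IPv4 subnet that still covers a 24 port switch.  The 10.0.0.0 addresses are just example values, and the helper name is hypothetical; it counts one host per port and ignores any extra addresses for gateways or management.

```python
import ipaddress


def smallest_prefix(hosts_needed):
    """Return (prefix_length, usable_hosts) for the smallest subnet that fits.

    Usable hosts exclude the network and broadcast addresses.
    """
    for prefix in range(30, 0, -1):
        usable = 2 ** (32 - prefix) - 2
        if usable >= hosts_needed:
            return prefix, usable
    raise ValueError("no IPv4 prefix is large enough")


ports = 24
prefix, usable = smallest_prefix(ports)

# Example network only; the real addressing plan is not part of this post.
habit = ipaddress.ip_network("10.0.0.0/24")
tight = ipaddress.ip_network(f"10.0.0.0/{prefix}")

print(habit, habit.num_addresses - 2, "usable addresses")
print(tight, tight.num_addresses - 2, "usable addresses")
```

The trade-off is the one described above: the tighter subnet wastes far fewer addresses, but the /24 is easier for people to read and reason about.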

I’ll update this with links once I figure out what’s going on with not being able to add links.  I believe it may have something to do with using Safari 4 Beta.

March 10, 2009 Posted by jguidroz | Networking | No Comments Yet

Another blog….so what…

Another blog is created.  Will anyone notice?

So what is this all about?  I’m looking to share my ideas and thoughts on virtual datacenter design and networking layouts.  Every now and then, other non-related information may be posted.  If anyone finds a use for my ideas, I’ll be more than happy to accept a beer as payment.  I’m a VMware VCP living in South Louisiana working towards the VMware VCDX.  I’m currently waiting for my Enterprise Exam to be scheduled.

March 7, 2009 Posted by jguidroz | VMware | 1 Comment

And it begins…

The beginning of something.  We’ll just see where this goes.

March 7, 2009 Posted by jguidroz | Random | No Comments Yet