New Record
I set a new personal record today: 24 new VMWare virtual machines built in one day.
21 agents, 3 managers. There are 7 new Windows nodes to build as well. I also have to set up all of the clustering goodness on the nodes.
This puts us closer to filling QA's request for 54 new servers (Linux, Windows, UNIX).
It's days like these that I think that if I knew Perl better, I could write a wrapper using the VMWare API and make this whole process easier. It also makes me wonder why nobody has done anything like that yet.
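As a rough illustration of where such a wrapper might start, here is a minimal Python sketch (not Perl, and not touching the VMWare API at all) that only plans a batch like today's: 21 agents and 3 managers spread round-robin across hosts. The host names, the VM naming scheme, and the `plan_batch` helper are all hypothetical; actually creating and registering the VMs would still have to go through VMWare's own tooling.

```python
from itertools import cycle


def plan_batch(hosts, agents=21, managers=3):
    """Return a list of (vm_name, host) pairs, assigned round-robin.

    Host names and the naming scheme are made up for illustration.
    """
    names = [f"manager{i:02d}" for i in range(1, managers + 1)]
    names += [f"agent{i:02d}" for i in range(1, agents + 1)]
    # zip() stops when the finite list of names runs out.
    return list(zip(names, cycle(hosts)))


if __name__ == "__main__":
    hosts = ["vmhost01", "vmhost02", "vmhost03"]  # hypothetical host names
    for vm, host in plan_batch(hosts):
        print(f"{vm} -> {host}")
```

Even a plan like this, written down before clicking through the console 24 times, would take some of the tedium out of a day like today.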
My Large VMWare Server Farm
It seems like many people come to this blog from Google searches about VMWare, CentOS, and OpenFiler. I figured it might be good to talk about my VMWare Server deployment at work, since it's something that I am fairly proud of.
I have fifteen Dell PowerEdge 1950 servers. They're 1U each, with dual quad-core Intel Xeon CPUs ranging from 1.8 to 2.2GHz. They each have 16GB of RAM. Ten of them have 143GB 15K 3.5" SAS drives, and five of them have 143GB 10K 2.5" SAS drives. The servers that have the 10K drives have a backplane that will allow you to plug in 4 drives. The servers with the 15K drives have backplanes that only take 2 drives. Each server has two onboard Broadcom NICs, a PCI-X Broadcom NIC, and a recently added dual port Intel e1000 NIC. I'll get into that in a second.
Each VMWare server runs CentOS 4.4 64 bit ServerCD edition. For those of you who don't know, CentOS is a 100% Red Hat Enterprise Linux binary compatible distribution. It's built from Red Hat sources and, due to the nature of the GPL, is able to be released by the CentOS group for those of us who want Red Hat Linux but don't want or need to pay for Red Hat support. I would argue, given my experiences with Red Hat support, that the support offerings of CentOS are superior.
I am a firm believer in keeping things as simple as possible. I have seen many other Linux sysadmins go crazy with the software they deploy and the hacks they roll into production, only to be bogged down in a morass of "one offs" or to leave behind a legacy of poorly documented systems that really need their original owner to run right. I don't like that, which is why I tend to stay on the straight and narrow. I keep my partitioning simple. I (generally) keep the packages I install restricted to the ones available through official CentOS channels. Some may consider this heresy, but if there is an RPM available for something, I'd rather install that than build from source. All of this leads to systems that "just work" and that can hum along and do their jobs with a minimum amount of fuss. Could I squeeze some extra performance out if I did a custom compiled kernel? Sure. Do I want to be troubleshooting VMWare at 3AM because something in that kernel broke virtual networking? No way.
On all but a few of our VMWare servers, we run VMWare Server 1.0.3. New servers that have just made it into production are getting 1.0.4, with a general upgrade planned in the somewhat near future. Not because we're seeing problems, but if we have to take boxes down to add new hardware (the Intel e1000 NICs that I am getting to in a second) we might as well upgrade VMWare while we're at it.
We chose VMWare Server for the price. You absolutely cannot beat it for the price, which is free. We spoke with VMWare about getting VMWare ESX in, and even in its most basic form, it would have been prohibitively expensive. Here at GA we're concerned about getting the most value for our money. By going with VMWare Server we lose the ability to have multiple snapshots per VM, which would be nice but is not a deal breaker. We also lose the central management, but you can make up for that by buying VMWare VirtualCenter 1.4, which we did. I'm not too happy with it, but it could be because it just doesn't scale well to the level that we're using it, or it could be set up better. Probably both.
Each VMWare server has three NICs: two onboard and one PCI-X. eth0 and eth1 are both bridged interfaces - eth0 handles all of the main traffic to each node, and also serves as the management interface to the VMWare server itself. eth1 handles Oracle private interconnect traffic for RAC, and cluster heartbeats for Windows SQL Server clusters. eth2, the PCI-X NIC, handles all of the storage traffic. Each VMWare server has a dedicated uplink on its own VLAN to a Dell PowerEdge 2900 that is acting as a big NFS server.
We ran into a problem with the PowerEdge 1950's on-board NICs. If you put them under any sort of load (which we were, with multiple VMs trying to copy media and provision databases on ASM) the bus that the NICs were sitting on would reset. That would drop all of the VMs off the network for a time, and the switches that the NICs were plugged into would show that the link had gone down and then back up. This is a bad thing. We're also not the first people to see it. After a fight with Dell (who were not really inclined to help us because of CentOS or VMWare Server) I got them to send us an Intel e1000 card. Installing this in the spare PCI-X slot made our network problems go away. So, we're in the midst of bringing down all of our VMWare servers, disabling the on-board NICs, and installing these Intel cards.
Another problem we're running into is that Dell PowerEdge 2900. We have ~70 VMs on it, and when they get under heavy load some of the VMs experience SCSI resets, which sometimes result in database creates failing, and support tickets in our queue. According to some of the folks on the Linux-PowerEdge mailing list, the hardware RAID controller that is in the box - the PERC 5/i - generally sucks under Linux, offering performance slower than software RAID. There are rumors of an updated driver from Dell that will make it run faster -- we'll have to see how that pans out. In the meantime, we're going to be ordering fifteen 750GB SATA drives, one for each VMWare server. That will increase our total available VM storage to 11TB or so, which is better than the 2TB we get from the 2900. Moving VMs onto local disk also means that we lose out on nifty features like "if the VMWare server goes down, we can bring these VMs back up on another machine."
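The capacity figure is easy to sanity-check. A quick sketch, assuming one 750GB drive per server, decimal units (1TB = 1000GB), and no RAID or filesystem overhead; the `total_tb` helper is just for illustration:

```python
def total_tb(drive_count, drive_gb):
    """Raw capacity in decimal terabytes, ignoring RAID and filesystem overhead."""
    return drive_count * drive_gb / 1000


local_tb = total_tb(15, 750)  # fifteen 750GB drives
print(local_tb)               # 11.25, i.e. "11TB or so"
```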
You may be curious how many VMs we can stuff on one of those 1950s. Well, with a mix of local and NFS storage, we've gotten up to 15 VMs running at once. These aren't weenie VMs either - they're either RHEL nodes with 512MB or 1GB (usually 1GB) of RAM and 15GB of disk, or Windows nodes with 512MB to 1GB of RAM, 15GB of disk, and clusters running. They're either running Oracle or MS SQL, and while they're not handling millions of transactions, they're being used by my development and QA staff.
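A back-of-the-envelope memory budget suggests that RAM is a likely limit at that density. This sketch assumes every guest gets the full 1GB and ignores VMWare's per-VM overhead and any memory overcommit, so treat it as a rough guide only:

```python
HOST_RAM_GB = 16   # per PowerEdge 1950
VM_RAM_GB = 1      # the usual guest size
VM_COUNT = 15      # the most we've run at once

guest_total = VM_COUNT * VM_RAM_GB
headroom = HOST_RAM_GB - guest_total
print(f"{guest_total}GB for guests, {headroom}GB left for the host OS")
```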
As you might expect, power and cooling requirements for this bunch of servers are high. They're all in one APC NetShelter VX rack, fed by three 15A 110V AC lines. Some other infrastructure servers are also on those circuits, but we're using up roughly 30A in that one rack alone. Cooling is hard -- we've blown past what the 5 ton AC unit in the room can handle, and the two portable A/C units don't do much to help. We're in the process of moving gear to a colo.
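For a sense of scale, the rack's draw can be converted to a heat load. This is a rough sketch that assumes the full 30A is drawn at 110V and that all of it ends up as heat; the conversion factors (about 3.412 BTU/h per watt, 12,000 BTU/h per ton of cooling) are the standard ones, and the `heat_load_btu` helper is just for illustration.

```python
BTU_PER_HOUR_PER_WATT = 3.412
BTU_PER_HOUR_PER_TON = 12_000


def heat_load_btu(amps, volts):
    """Approximate heat output in BTU/h for a given electrical draw."""
    return amps * volts * BTU_PER_HOUR_PER_WATT


rack_btu = heat_load_btu(30, 110)
rack_tons = rack_btu / BTU_PER_HOUR_PER_TON
print(f"{rack_btu:.0f} BTU/h, about {rack_tons:.2f} tons of cooling")
```

By that estimate, this one rack needs a little under a ton of cooling on its own.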
All said, this environment has helped GA really expand. If we had to make an investment in physical servers we would have spent in excess of $500k to purchase all of that gear. With less than $70k invested, we're able to accomplish nearly the same thing -- and more, once we work the bugs out. I've been a huge fan of virtualization since VMWare first came on the market, and in my case it's really been worth it to deploy.
VMWare Server + Win2k3 64 bit + Linux NFS = Not Fun
I haven't written a tech blog post in a while, but I've been working on an interesting, albeit frustrating, problem over the last few days.
At work I have 12 Dell PowerEdge 1950 servers, each with dual quad core Xeons (ranging from 1.8GHz to 2.3GHz), 16GB of RAM, and 138GB SAS drives. They're running VMWare Server 1.0.3 on CentOS 4.4, with all of the latest OS level updates installed.
We're virtualizing 120+ Red Hat Enterprise Linux 4 U2, U4, Windows 2000, and Windows 2003 Server nodes, both 32 and 64 bit. These nodes run my company's software, Oracle, and MS SQL Server.
The bulk of those VMs live on a Dell PowerEdge 2900 server with 8 x 500GB SATA drives, and a Dell PERC 5/i RAID controller in a RAID 5 config. The CPU is a quad core 1.8GHz Xeon. It has 2GB of RAM. The server is running CentOS 5 and is sharing its disks with NFS v3. There's a 2Gb bonded Ethernet connection using the onboard Broadcom NICs to a Dell PowerConnect 5324 switch.
We were seeing that Windows 2003 64 bit nodes, when under moderate to heavy load, would experience massive packet loss. Additionally, the VMWare Server Client would not redraw the servers' screens reliably. Finally, the node would bluescreen with a KERNEL_DATA_INPAGE_ERROR. This would happen when our software was copying SQL Server media to the node in preparation to provision a database. This would only happen with 64 bit Windows - 32/64 bit Linux would be fine, and 32 bit Windows would be fine.
The Windows Event Log would be littered with warnings and errors about "The device, \Device\Scsi\symmpi1, is not ready for access yet." It didn't take a rocket scientist to figure out that something was happening to make these machines try to access swap, fail, and bluescreen.
Now, I had been told by users that this was happening on nodes that were on local disk as well as our remote NFS server. I did extensive testing and was not able to reproduce the problem when the nodes were on local disk. It turns out that I was given erroneous information, and that nodes that people thought were local were in fact on NFS. Once I moved my test nodes over to NFS, I could reproduce the problem.
VMWare has a KB article that addresses this issue. In fact, it seems fairly common for people who run their VMs over an iSCSI SAN. Once I applied the registry change, my VMs stopped bluescreening, but our file copy operation would still fail.
Looking on the VMWare Server, you would see load averages of ~20-30, and iowait around 25%. Looking at the NFS box, you could see that I/O to /dev/sda2 was eating up about 100% of CPU.
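For reference, the iowait figure that tools like top report can be derived from the aggregate `cpu` line of `/proc/stat`. A minimal sketch, assuming a Linux 2.6 kernel that exposes the iowait column; the helper names and the sample numbers are made up for illustration and are not readings from my boxes:

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # Column order: user nice system idle iowait irq softirq steal ...
    # Only the first eight columns are summed; later guest columns
    # are already counted inside user/nice.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


# Two made-up samples taken a few seconds apart.
s1 = "cpu 1000 0 500 8000 400 50 50"
s2 = "cpu 1100 0 600 8500 650 50 50"
print(f"{iowait_percent(s1, s2):.1f}% iowait")
```

On a live box the two samples would come from reading the first line of `/proc/stat` a few seconds apart.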
I changed our NFS mount options. No dice. I turned on Jumbo Frames on the bridged nic on my test VMware server. No dice. Each step would make things a "little" better, but not solve the problems.
Then, I moved the VM images over to our Netapp, which was no small feat since most of the space is used. I finally freed up about 120GB, enough for my 5 test VMs and their snapshots, and went to testing. I fired the VMs back up and ran through another provisioning event.
Not only did my packet loss issues seem to go away, but for once I was able to run a Windows 2003 64 bit node on NFS and provision MS SQL instances without bluescreening.
Our Netapp isn't the newest model. It's a FAS 270 with 1.2TB of space. It's connected to another Dell switch in another rack, with a 1Gb uplink to my core switches. The Netapp does not even have Jumbo Frames enabled. Somehow, though, it's kicking the crap out of my Dell NFS box, despite being seemingly "inferior."
My questions at the moment are:
- Is my config on this NFS box fundamentally broken somehow?
- Is Linux's NFS server really bad? Would I be better off with BSD or Solaris?
- Is something up with the driver for the PERC 5/i? Is write caching enabled?
- Is there something up with the LSI driver in Win64 that does not show up in Win32 or in Linux?
- If I have to rebuild this NFS box, where do I put 1TB worth of VMWare images while I rebuild the box?
Solaris x86 on a Dell Poweredge 2900
We got a Dell PowerEdge 2900 in with the intention of making it a big ass file server. The basic specs on it are:
- Quad Core Xeon 1.6GHz (Dell was running a special, free upgrade to the quad core from the dual core)
- 2GB RAM - PC2 5300, 4 x 512MB
- 8 x 500GB 7200 RPM SATA drives
- Dell PERC 5/i SATA RAID
- 5U Rack chassis
My intent was to install Solaris x86 on the box, set up a ZFS partition, install NFS and Samba and make a nice file server to hold VMWare images and a file dump for the developers. Unfortunately I found out that there are no drivers for the RAID controller for Solaris x86 from Dell, Sun, or LSI.
So, I loaded CentOS 5 (CentOS 4.4 won't boot on it for some reason - hangs before GRUB tries to run), and installed VMWare Server. I'm going to install Solaris x86 under a VM and give it access to a raw partition to hold its data. This should keep things speedy. I did read, however, that Solaris x86 will core dump VMWare Server 1.0.3 if it tries to access a raw partition. Hopefully that won't be the case for me.
I also need to get Dell OpenManage installed on all of these servers so I can monitor their health and get alerts if they lose a drive in their RAID arrays.
I also need to get the storage network up and running. For now it's going to be on its own VLAN. If I have my druthers, though, it will be on a physically separate switch. All of the new VMWare servers I bought have a third TOE NIC that I was going to use just for accessing the NFS server that will host the VM images. The PE2900 will probably end up having 2 of its interfaces bonded to get 2Gb/s access to the LAN.
It's never ending. At least I got to leave before 7 tonight. Still didn't get home until 9:15 or so.