Virtualization – Good or Bad?

Virtualization is a craze in the making. All the warning signs are there – you have to throw all your machines away and replace them with several hundred thousand dollars' worth of new ones. Effectively you have to ditch your distributed system and replace it with a distributed system simulated on a supercomputer. Wow. That’s Cool!!! But is it a good idea?

The first time I had the technology (and its capabilities) explained to me, I was captivated. But I’ve spent the last few weeks thinking about what it means to me as a designer and developer of enterprise web applications and I’m beginning to question the wisdom of virtualization as a technology.

The latency of all inter-tier calls will shrink massively, from milliseconds to microseconds. That will allow the server to render pages at a MUCH higher rate, since its threads are no longer stuck waiting on calls to external services such as RDBMSs. If that is the case, what effect does it have on the design of typical distributed systems? Can we dispense with the usual strictures against chatty interfaces? Can we drop the cursors and other programmatic tricks whose whole purpose is to avoid network usage?
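
Here's a rough back-of-envelope illustration of that claim. All of the numbers below are assumptions made up for the sake of the arithmetic – the per-page CPU work, the number of inter-tier calls a chatty page makes, and the round-trip latencies – but they show how the round trips, not the CPU, come to dominate render time:

```python
# Back-of-envelope sketch (all numbers are assumptions) of how inter-tier
# round-trip latency, rather than CPU work, comes to dominate page render time.

CPU_TIME_PER_PAGE = 0.005   # seconds of actual work to build one page (assumed)
CALLS_PER_PAGE = 100        # inter-tier round trips made by a "chatty" page (assumed)

def page_time(round_trip):
    # Wall-clock time for one page: CPU work plus time spent blocked on calls.
    return CPU_TIME_PER_PAGE + CALLS_PER_PAGE * round_trip

scenarios = [
    ("LAN hop between physical tiers", 0.003),     # ~3 ms per round trip (assumed)
    ("co-located VMs on one host",     0.000005),  # ~5 microseconds (assumed)
]

for label, round_trip in scenarios:
    t = page_time(round_trip)
    print(f"{label:32s}: {t * 1000:7.2f} ms/page, "
          f"{1 / t:6.0f} pages/sec per blocked thread")
```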

The beauty of virtualization is that it lets us undo some of the overzealous use of remoting and web services we all indulged in a few years back, because it was the craze at the time. Physical deployment wasn’t a benefit of n-tier design; it was just what people thought they ought to be doing to their n-tier designs – to allow scaling. The problem was that people never took the time to do stress and volume testing or capacity planning, so they didn’t know where their time was being wasted – they just deployed across a web farm, just in case. The end result was that simple pages could take a large chunk of a second to render because of all this needless communication across the network.

Virtualization magically turns our physical partitioning back into logical partitioning. This is a great thing, but not something we need to spend a gazillion dollars to achieve: just use in-proc calls wherever possible. If we were to back off from physical partitioning, what would we be left with? An n-tier logical design deployed on as few machines as possible (whilst still retaining fail-over).
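
To be concrete about what keeping the partitioning logical buys us, here is a minimal sketch – the names are hypothetical, not from any real system – in which the tiers stay separated behind an interface and the in-proc-versus-remote question becomes a deployment detail rather than a design constraint:

```python
# Minimal sketch (hypothetical names): keep the n-tier partitioning *logical*,
# behind an interface, so in-proc vs. remote is a deployment choice.
from abc import ABC, abstractmethod

class CatalogService(ABC):
    """Business-tier contract that the presentation tier codes against."""
    @abstractmethod
    def price_of(self, sku: str) -> float: ...

class InProcCatalogService(CatalogService):
    """The same logical tier deployed in-process: just a method call."""
    def price_of(self, sku: str) -> float:
        return {"X100": 9.99, "X200": 24.50}.get(sku, 0.0)

class RemoteCatalogProxy(CatalogService):
    """The same contract over the wire – only wheeled out if we truly must scale out."""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
    def price_of(self, sku: str) -> float:
        raise NotImplementedError("serialize, call self.endpoint, deserialize")

def render_product_page(catalog: CatalogService, sku: str) -> str:
    # The presentation tier neither knows nor cares where the business tier lives.
    return f"<h1>{sku}</h1><p>${catalog.price_of(sku):.2f}</p>"

print(render_product_page(InProcCatalogService(), "X100"))
```

Swap InProcCatalogService for RemoteCatalogProxy in one place and you have your physical partitioning back – but only if measurement says you actually need it.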

Question: If you had a supercomputer to run a website with, would you use a sizable proportion of its CPU cycles to simulate an old-style web farm, or would you dedicate all of them to rendering web pages? I guess the thing virtualization allows you to do is make a slow transition from physically distributed systems to monolithically deployed ones. Which would be better – a VM server running a single VM for the presentation and business-logic tiers, or several VMs doing that work? I’d be interested to see how the performance metrics look for these two scenarios.
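
If anyone does run that comparison, even a crude harness like the one below would give a first answer. It is only a sketch: the URLs are placeholders for the two deployments, the requests are sequential, and a real test would want concurrent load, warm-up, and identical data behind both topologies:

```python
# Crude sketch of a comparison harness: point it at the same test page served
# by each topology and compare throughput and tail latency. URLs are placeholders.
import statistics
import time
import urllib.request

def measure(url: str, requests: int = 200) -> None:
    timings = []
    for _ in range(requests):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()                            # drain the whole page body
        timings.append(time.perf_counter() - start)
    total = sum(timings)
    p95 = statistics.quantiles(timings, n=20)[-1]      # 95th-percentile latency
    print(f"{url}: {requests / total:6.1f} pages/sec, "
          f"median {statistics.median(timings) * 1000:.1f} ms, "
          f"p95 {p95 * 1000:.1f} ms")

measure("http://single-vm-host/test-page")   # presentation + business logic in one VM
measure("http://multi-vm-host/test-page")    # the same tiers split across several VMs
```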

Virtualization will mean big changes for development teams, since it may become possible for them to test their code on installations literally identical to production. What other changes are in store?

3 thoughts on “Virtualization – Good or Bad?”

  1. Hey Andrew, new site?

    The thing that attracts me to virtualization is the scalability. If you put the UI and application tier on one machine (physical or virtual), then decide you want them separated, that’s a lot of change and downtime. If you put them in two separate VMs on one server, you could move one to another server as demand increases.

    This is especially useful for small applications or businesses. You might run your website and SQL Server on the same machine. But what if you decide to separate them? You have to get another box, install Windows and SQL Server, configure all the network mumbo jumbo, take the application offline, then change the connection strings (assuming you put them in a config file🙂 ).

    Alternatively, if they were two virtual machines on the same server, you’d just xcopy the VM image of your SQL Server onto the other server one night, plug in a network cable and take the original offline.

    It also means you can make better use of your resources. Consider putting an Exchange server and a batch-processing server (e.g., a spider?) on the same machine. Exchange is usually busy during the day but sleeps at night, while the batch server would be a night-only thing. You wouldn’t want to install them both on the same Windows installation, but running them as two VMs on one box might be a much cheaper way to make full use of your resources.

    Of course you have to ask yourself whether the VM overhead is worth it, because as you said, in the case of many applications it really isn’t.

  2. Yeah – The Hemingway style. I like it.

    Like I said, I was captivated by the possibilities. I am just thinking out loud about what it means for us developers.

    I understand that the physical separation of a system allows scale-out deployment IF you can identify what application logic or database activity is the performance bottleneck. In many cases DB activity is the issue, and we will need to separate out the DB for the sake of fail-over anyway. But in many systems that I’ve looked at, the cost of communications is a large fraction of the cost of page generation.

    I just question whether the best use of these huge grunty machines is running simulations rather than web sites. I can see how a VM server would reduce the capital expenditure of a hosting service, since they can pile VMs onto a big machine and thus save space, provide enhanced services, etc. They can also provide great facilities for disaster recovery. But do we care about that as designers?

    What I’m wondering is whether the application designers should be treating these machines as scaled up or scaled out. I don’t think we’re making good use of our resources if we scale out on a VM – you are multiplying the housekeeping code that must be run.

    I assume that if you simulate a distributed system on VMs, you still have to simulate the whole TCP/IP stack down to the driver level before the VM can step in. This wastefulness would be frowned upon even today. Why would we turn a blind eye to it? We acknowledge that we need performance enhancements for the purposes of scaling. Why not just use the machine?

  3. Hmmm, I agree and disagree – some things will change, like datacentres needing a lot less kit. Other things, like development environments, will face the same challenges. E.g. that ‘all-in-one’ virtual server image won’t be the same as production, because of the CPU, I/O and memory latency, and the other weirdisms that differentiate a virtual server on one host from a virtual server on another.

    Potentially it may be worse – because the constraints (I/O bandwidth, I/O latency etc.) are now dynamically contended, where previously they were static. E.g. a flash load on the web server could starve the database journal log of I/O. Across two physical systems, disk I/O bandwidth is possibly lower than on one beefy server, but it is much easier to capacity-manage and to understand spikes and simultaneous flash loads.

    One thing that virtualisation will bring is a proper understanding of resource utilisation – in effect, profiling beyond the application and into the OS and the hardware itself. This will impact development significantly, as the exact characteristics not only of CPU and memory use but of memory bandwidth, disk, network, working-set cache sizes and so on will all have to be understood.

    Developers could also previously pretty much leave system config to the sysadmin, as scalability demanded. Now they need to ensure that their application will not hog CPU, memory, I/O, network or other resources – because those resources may be over-contended between the multiple virtual servers on that physical server.

    So, though we’ve got a whole new Information Processing ‘medicine’ to cure some problems, we also have a new, bigger needle, and we are not experienced in exactly where to shove it!

    Overall, I see a move to a ‘tool-per-server’ approach, because it’s simply more efficient for development; but just as a good DB admin can make or break an application, we will now rely on good virtualisation admins.
    Their immediate challenge is that virtualisation products don’t have much in the way of tunability. Ideally you would want a virtual-server config that gives you exactly the reserved resources of a particular physical server (e.g. a Compaq DL320-G3, 2 x 3 GHz, 2 GB RAM, etc.). Then you could at least start with a system that is initially uncontended, then monitor peak use and judge how much overcontention you can risk.

    At the moment, it’s the reverse situation: cost-saving drivers are pushing massive, unengineered overcontention, which is much more risky.
    Oh, and you may be trapped in (addicted to) virtualisation: virtualisation permits brief, intense, dedicated resource use – but smaller physical servers don’t have the same ability to cope with a high flash resource requirement. Anyone who’s ever seen what happens when an application expects more physical RAM than is available, and thrashes, knows this is a world of pain.

    Just to finish with more of my ‘medicine’ analogy: we’ve just not done any proper clinical trials…😦
