This was the weekend when my main demo environment was reinstalled with Windows Server 8… I love the new Data Deduplication feature, which removes duplicate chunks in files on the file system, even inside VHD files etc… This means that if you have many files that are fully or partly identical, the data will only claim real hard drive space for the chunks that are unique; redundant chunks are stored just once.
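To make the idea of chunk-level deduplication concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how Windows Server implements it: the fixed 4 KB chunk size, the SHA-256 hashing, and the names dedup_store and rebuild are all my own assumptions for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed size (assumption); not what Windows uses


def dedup_store(files, chunk_size=CHUNK_SIZE):
    """Store each distinct chunk once and keep a per-file list of chunk hashes."""
    chunk_store = {}  # chunk hash -> chunk bytes, stored only once
    recipes = {}      # file name -> ordered list of chunk hashes
    for name, data in files.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)  # keep the first copy only
            recipe.append(digest)
        recipes[name] = recipe
    return chunk_store, recipes


def rebuild(name, chunk_store, recipes):
    """Reassemble a file from its chunk list to show that no data was lost."""
    return b"".join(chunk_store[digest] for digest in recipes[name])


if __name__ == "__main__":
    # Two hypothetical VHD-like blobs that share most of their content.
    base = b"A" * 8192 + b"B" * 4096
    files = {"vm1.vhd": base + b"C" * 4096, "vm2.vhd": base + b"D" * 4096}
    store, recipes = dedup_store(files)
    logical = sum(len(data) for data in files.values())
    physical = sum(len(chunk) for chunk in store.values())
    print(f"logical size {logical} bytes, stored {physical} bytes")
```

In this toy example the two files add up to 32768 bytes, but only four distinct chunks (16384 bytes) need to be stored, and both files can still be rebuilt byte for byte.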
Please note that Deduplication works best for content folder stores, virtualization depots or backup stores etc. It's not really intended for live, constantly changing data.
That said, I have been hesitant to leave the scheduled dedup chunking jobs enabled while my VMs are running (I crashed a few VMs beyond salvation during a scheduled dedup operation). For a demo Hyper-V host, I recommend disabling all the dedup schedules in Task Scheduler, and only running dedup manually every once in a while, when all the virtual machines are saved or turned off.
Estimate the Deduplication savings
If you want to examine how much space you can save on a volume without actually enabling deduplication, you can run the ddpeval.exe tool. You can also copy ddpeval.exe from a Windows Server 8 installation to a Windows Server 2008 R2 machine and run it there, which is useful for finding out whether that machine would benefit from a Windows Server 8 upgrade in terms of deduplication. You can run ddpeval.exe against local drives or remote shares.
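If you are curious what such an estimate involves, the rough idea can be sketched in a few lines of Python. This is not ddpeval.exe and does not reproduce its algorithm or its numbers; it simply walks a folder, hashes fixed-size chunks, and counts how many bytes have been seen before. The function name estimate_savings and the 64 KB chunk size are assumptions made for the example.

```python
import hashlib
import os


def estimate_savings(root, chunk_size=64 * 1024):
    """Return (total_bytes, unique_bytes) for all readable files under root.

    A crude stand-in for a dedup estimate: fixed-size chunks, SHA-256 hashes.
    """
    seen = set()  # hashes of chunks encountered so far
    total = 0     # bytes read across all files
    unique = 0    # bytes belonging to chunks seen for the first time
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
                        digest = hashlib.sha256(chunk).digest()
                        if digest not in seen:
                            seen.add(digest)
                            unique += len(chunk)
            except OSError:
                continue  # skip files that are locked or otherwise unreadable
    return total, unique
```

Calling estimate_savings on a folder of your choice returns the bytes scanned and the bytes that would remain if every repeated chunk were stored only once; the difference between the two is the rough saving. Expect the real tool to report different figures, since it chunks and compresses data in its own way.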
Here is the output from running ddpeval.exe on one of my Windows Server 2008 R2 deployment servers.
Deduplication is a File Services role service that you add via Server Manager. Once it is installed, you can enable data deduplication on your data drives (not the OS volume).
In my demo environment I had a few Hyper-V hosts with about a terabyte of virtual machines and ISO files. After installing Windows Server 8 beta and restoring my backup of files, my disks looked like this… i.e. before adding deduplication
Then I added the Deduplication File Services role service via Server Manager, and forced the data deduplication schedule to run immediately (via Task Scheduler). After about three hours or so my drives looked like this:
I still have the same data on the drive, but now I have 372 GB of free space instead of only 73 GB. Life is good… 🙂
For more info about Deduplication, check the following:
Data Deduplication Planning and Deployment