Optimizing VHDX Files in a Hyper-V Lab

In my labs, I have hundreds of VMs spread across multiple Hyper-V hosts (and a few VMware hosts too, but that's out of scope for this post). The operating systems on my Hyper-V hosts are a mix of Windows Server 2016, 2019, and 2022, and I have Data Deduplication enabled on all volumes used for the VMs. Since I rely heavily on Data Deduplication, I typically don't have to worry about optimizing my VHDX files for day-to-day operations. But when it comes to migrating VMs to another volume, or backing them up, it makes quite a difference to copy 150 GB instead of 300 GB over the network.

VMs on a disk volume with data deduplication enabled.

The Optimize-VHD Cmdlet

Since the release of Windows Server 2012, the PowerShell module for Hyper-V has included the Optimize-VHD cmdlet. It is just one of hundreds of cmdlets in that module, and it is used to optimize the allocation of space in your virtual hard disk files, assuming your virtual disks are dynamic disks, that is.
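Since the cmdlet only helps with disks that are not fixed-size, it can be worth checking the disk type before spending time on optimization. The following is a minimal sketch, assuming the Hyper-V PowerShell module is installed. The function name Get-OptimizableVhdx and the example path are hypothetical and only used for illustration:

```powershell
# Hypothetical helper: list VHDX files under a folder that are not fixed-size.
# Requires the Hyper-V PowerShell module (for Get-VHD).
function Get-OptimizableVhdx {
    param(
        [Parameter(Mandatory)]
        [string]$Path   # Folder to search recursively for VHDX files
    )

    Get-ChildItem -Path $Path -Filter *.vhdx -Recurse |
        ForEach-Object { Get-VHD -Path $_.FullName } |
        Where-Object { $_.VhdType -ne 'Fixed' } |
        Select-Object -Property Path, VhdType,
            @{ Name = 'FileSizeGB'; Expression = { [math]::Round($_.FileSize / 1GB, 1) } },
            @{ Name = 'MaxSizeGB';  Expression = { [math]::Round($_.Size / 1GB, 1) } }
}

# Example (the path is an assumption, replace it with your own):
# Get-OptimizableVhdx -Path 'D:\VMs'
```

FileSizeGB is the current size of the file on disk, while MaxSizeGB is the maximum size the virtual disk can grow to.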

To make sure Optimize-VHD runs with the best efficiency, you need to mount the VHDX file in read-only mode prior to running the command. If the VHDX file is not mounted in read-only mode first, the cmdlet will ignore the Full mode you told it to run in and default to Prezeroed mode.
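The mount, optimize, dismount sequence described above could be wrapped as follows. This is a minimal sketch, assuming the Hyper-V PowerShell module and an elevated session. The function name Invoke-FullVhdxOptimization and the example path are hypothetical:

```powershell
# Hypothetical helper: run Optimize-VHD in Full mode against a read-only mount.
# Requires the Hyper-V PowerShell module.
function Invoke-FullVhdxOptimization {
    param(
        [Parameter(Mandatory)]
        [string]$Path   # Full path to the VHDX file
    )

    # Mount read-only, without assigning any drive letters
    Mount-VHD -Path $Path -ReadOnly -NoDriveLetter
    try {
        Optimize-VHD -Path $Path -Mode Full
    }
    finally {
        # Always dismount, even if the optimization fails
        Dismount-VHD -Path $Path
    }
}

# Example (the path is an assumption, replace it with your own):
# Invoke-FullVhdxOptimization -Path 'C:\VHDX_For_Optimize\Example.vhdx'
```

The try/finally block is a design choice to avoid leaving the VHDX mounted if Optimize-VHD throws an error.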

You can read more about the various modes the Optimize-VHD cmdlet supports here: https://learn.microsoft.com/en-us/powershell/module/hyper-v/optimize-vhd

Defrag vs. No Defrag

In my environment, I found that defragmenting the volumes inside the VHDX prior to optimization allowed me to shrink the VHDX file even further.

Benchmarking

Below are some metrics from one of my labs. For this test, I copied the VHDX file from a VM hosting a deployment server to a temporary location. I then ran the Optimize-VHD command against the VHDX file, both with and without defragmenting its volumes first. To make sure the comparison was fair, I re-copied the VHDX file from the original source between the tests. I also compared the runtimes between SATA SSDs and NVMe SSDs.

Full Mode, No Defrag

  • Runtime on SATA SSD: 14 minutes and 11 seconds
  • Runtime on NVMe SSD: 35 seconds
  • VHDX size before optimization: 213 GB
  • VHDX size after optimization: 169 GB
  • Reclaimed disk space: 44 GB

Full Mode, With Defrag

  • Runtime on SATA SSD: 35 minutes and 46 seconds
  • Runtime on NVMe SSD: 2 minutes and 17 seconds
  • VHDX size before optimization: 213 GB
  • VHDX size after optimization: 159 GB
  • Reclaimed disk space: 54 GB

Summary

For me, waiting an extra 2 minutes is definitely worth it, since it reduced the VHDX size by another 10 GB. The preceding results are also good evidence for staying away from SATA SSDs and using NVMe SSDs instead.
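To put the trade-off in numbers, the benchmark figures above can be turned into percentages and speedup factors. This is only a small illustrative calculation. The variable and property names are made up for the example, and the values are copied from the results listed earlier:

```powershell
# Figures taken from the benchmark results above
$SizeBeforeGB = 213
$Results = @(
    [PSCustomObject]@{ Test = 'No Defrag';   SizeAfterGB = 169; NvmeSeconds = 35;          SataSeconds = 14 * 60 + 11 }
    [PSCustomObject]@{ Test = 'With Defrag'; SizeAfterGB = 159; NvmeSeconds = 2 * 60 + 17; SataSeconds = 35 * 60 + 46 }
)

$Summary = foreach ($Result in $Results) {
    $ReclaimedGB = $SizeBeforeGB - $Result.SizeAfterGB
    [PSCustomObject]@{
        Test             = $Result.Test
        ReclaimedGB      = $ReclaimedGB
        # Share of the original file size that was reclaimed
        ReclaimedPercent = [math]::Round($ReclaimedGB / $SizeBeforeGB * 100, 1)
        # How many times faster the NVMe run was compared to SATA
        NvmeSpeedup      = [math]::Round($Result.SataSeconds / $Result.NvmeSeconds, 1)
    }
}

$Summary | Format-Table -AutoSize
```

With these inputs, the defrag run reclaims about 25.4 percent of the file compared to about 20.7 percent without defrag, at a cost of 102 extra seconds on the NVMe SSD.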

The PowerShell Script

Below is the PowerShell script I used for the Full Mode, With Defrag test:

Note: Please don't give me a hard time for using Write-Host at the end of the script 🙂 I think that is perfectly OK when you write a script that is purposely crafted to be run interactively. For fully automated tasks, however, you should avoid it.

# Get the VHDX file(s)
$VirtualDisks = Get-ChildItem -Path "C:\VHDX_For_Optimize\DEMO-OSD-MDT03-PSD" -Filter *.vhdx -Recurse

$Time = Measure-Command {
    [System.Collections.ArrayList]$VHDXInfo = @()
    foreach ($VHDX in $VirtualDisks) {
       
        $DiskSizeBeforeInGB = [math]::Round($(Get-Item -Path $VHDX.FullName).length/1GB)

        $Mount = Mount-VHD -Path $VHDX.FullName -Passthru
        $Volumes = $Mount | Get-Disk | Get-Partition | Get-Volume | Select-Object -Property DriveLetter, FileSystem, Drivetype | Where-Object {$_.DriveLetter -notin '',$null} 

        # Defrag each volume
        foreach ($Volume in $Volumes){

            $DriveLetter = $Volume.DriveLetter+":"
            # Code for VHDX stored on SSD drives (not using /d)
            defrag $DriveLetter /x
            defrag $DriveLetter /k /l
            defrag $DriveLetter /x # repeated
            defrag $DriveLetter /k # repeated, but without trim (/l)
        }
        
        Dismount-VHD -Path $VHDX.FullName
        
        # Mount the VHDX file read-only, not mapping any drive letters
        Mount-VHD -Path $VHDX.FullName -NoDriveLetter -ReadOnly
        Optimize-VHD -Path $VHDX.FullName -Mode Full
        Dismount-VHD -Path $VHDX.FullName

        $DiskSizeAfterInGB = [math]::Round($(Get-Item -Path $VHDX.FullName).length/1GB)

        # Record the before and after sizes for this VHDX file
        $obj = [PSCustomObject]@{
            DiskSizeBeforeInGB = $DiskSizeBeforeInGB
            DiskSizeAfterInGB = $DiskSizeAfterInGB
        }

        # Add the result to the array list
        $VHDXInfo.Add($obj)|Out-Null
    }

    $TotalDiskSizeBeforeInGB = ($VHDXInfo.DiskSizeBeforeInGB | Measure-Object -Sum).Sum
    $TotalDiskSizeAfterInGB = ($VHDXInfo.DiskSizeAfterInGB | Measure-Object -Sum).Sum
    $SavingsInGB = $TotalDiskSizeBeforeInGB -  $TotalDiskSizeAfterInGB
    Write-Host "Total disk size before optimization: $TotalDiskSizeBeforeInGB GB"
    Write-Host "Total disk size after optimization: $TotalDiskSizeAfterInGB GB"
    Write-Host "Total savings after optimization is: $SavingsInGB GB"

}

# Use TotalMinutes so that runs longer than one hour are reported correctly
Write-Host "Optimization runtime was $([math]::Floor($Time.TotalMinutes)) minutes and $($Time.Seconds) seconds"

About the author

Johan Arwidmark
