As most of us know, the data deduplication feature in Server 2012 saves space by locating and removing duplication within data files. Instead of storing multiple copies of identical data, only a single copy takes up space and all duplicates refer to it. Windows achieves this with a filter driver that monitors local and remote I/O and a deduplication service that controls three important scheduled jobs on the server: Optimization, Garbage Collection and Scrubbing.
I have installed the Data Deduplication feature. Why don’t these jobs show up on my server? – That’s probably because you have not enabled data deduplication on any of the volumes. Once you enable it, the jobs will show up.
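As a minimal sketch of that step, the commands below enable deduplication on one volume and then list the schedules and scheduled tasks. The drive letter E: is only an assumption for illustration, so substitute your own data volume.

```powershell
# Assumption: E: is the data volume to deduplicate (illustrative only).
Enable-DedupVolume -Volume "E:"

# List the deduplication schedules known to the server.
Get-DedupSchedule

# The underlying tasks can also be listed from the Task Scheduler library.
Get-ScheduledTask -TaskPath "\Microsoft\Windows\Deduplication\"
```

If the volume was enabled successfully, the three jobs described below should appear in this output.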
Let’s get into the details of those jobs.
Background Optimization – This job invokes the ddpcli.exe command against the enabled volumes. It starts the optimization process, which deduplicates the duplicate data on those volumes. You can check the results by running the PowerShell command “Get-DedupStatus” before and after the job runs.
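The before-and-after check can be sketched roughly as follows. The volume E: is an assumed example, and the properties shown are just the ones that seem most useful for a comparison. Here the optimization is started by hand with Start-DedupJob rather than waiting for the scheduled job.

```powershell
# Assumption: E: is a volume with deduplication already enabled.
$volume = "E:"

# Status before the job (may be empty if the volume has never been optimized).
Get-DedupStatus -Volume $volume |
    Format-List Volume, SavedSpace, FreeSpace, OptimizedFilesCount, InPolicyFilesCount

# Run an optimization job manually and block until it finishes.
Start-DedupJob -Volume $volume -Type Optimization -Wait

# Status after the job, for comparison with the first output.
Get-DedupStatus -Volume $volume |
    Format-List Volume, SavedSpace, FreeSpace, OptimizedFilesCount, InPolicyFilesCount
```

A rise in SavedSpace and OptimizedFilesCount between the two outputs suggests the job found data to deduplicate.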
Weekly Garbage Collection – The Garbage Collection job is configured to run weekly by default. It cleans up the chunk store by removing chunks that are no longer referenced, which releases disk space. The job can also be invoked manually when needed.
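One way to invoke garbage collection manually is sketched below, again assuming E: as the example volume.

```powershell
# Assumption: E: is a volume with deduplication enabled.
# Queue a garbage collection job for that volume.
Start-DedupJob -Volume "E:" -Type GarbageCollection

# Show queued and running deduplication jobs, including their progress.
Get-DedupJob
```

Garbage collection is most worthwhile after a large amount of data has been deleted from the volume, since that is when unreferenced chunks accumulate.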
Weekly Scrubbing – Deduplication keeps redundant copies of critical and most frequently accessed data chunks to reduce the risk of corruption. It records any corruption it detects in a log file, and the “Weekly Scrubbing” job later analyzes that log and tries to make repairs.
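A scrubbing run can also be started by hand, roughly as sketched here. The volume E: is an assumed example, and the event log channel name is my assumption about where scrubbing results are reported, so verify it in Event Viewer on your server.

```powershell
# Assumption: E: is a volume with deduplication enabled.
# Run a scrubbing job and block until it finishes.
Start-DedupJob -Volume "E:" -Type Scrubbing -Wait

# Assumption: scrubbing results are written to this event log channel.
# -ErrorAction SilentlyContinue suppresses the error raised when no events exist.
Get-WinEvent -LogName "Microsoft-Windows-Deduplication/Scrubbing" -MaxEvents 20 -ErrorAction SilentlyContinue |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -Wrap
```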