Data Deduplication is the best feature in Server 2012/2012 R2. For any shop, it provides a huge benefit for 5 minutes of work! When configured, data deduplication analyzes files for duplicate chunks, removes the duplicate portions, and replaces them with references to a single copy kept in the volume's chunk store. This can give you some amazing space savings!
Data deduplication is useful for three main scenarios:
- Folder Redirection/Home Folders/User Share: imagine all of the PDFs that are emailed out and saved into each user’s documents.
- Software Distribution/Application shares: a lot of applications share the same components.
- VDI: If you have 100 VMs running the same OS, the disk space saved with data deduplication would be insane! Microsoft saved close to 90% in their environment!
In this guide, we will set up data deduplication and cover some best practices for integration.
How to Configure Data Deduplication in Server 2012/2012 R2
First, data deduplication cannot be configured on a system or boot volume. With that out of the way, pick a machine running Server 2012 or higher. In Server Manager, launch the Add Roles and Features Wizard. Under Server Roles, expand File and Storage Services, then File and iSCSI Services, select the Data Deduplication role service, and finish the wizard. This role does not require a reboot.
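If you prefer PowerShell over the wizard, the same role service can be added from an elevated prompt. A minimal sketch (FS-Data-Deduplication is the feature name on 2012/2012 R2):

# Install the Data Deduplication role service (no reboot required)
Install-WindowsFeature -Name FS-Data-Deduplication

# Verify that it installed
Get-WindowsFeature -Name FS-Data-Deduplication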
Select the File and Storage Services node in Server Manager and then select Volumes. Right-click your data volume and select Configure Data Deduplication…
Change the deduplication type from Disabled to General purpose file server (or check the enable box if you are on Server 2012). Leave the "Deduplicate files older than (in days)" setting at the default value. You may want to exclude certain folders/files from deduplication. For example, SCCM 2012 requires a few folders to be excluded.
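The same settings can be applied with PowerShell if you prefer. A minimal sketch, assuming D: is your data volume; the excluded folder path is only a placeholder for whatever your application requires:

# Enable deduplication on the data volume
# (-UsageType Default is the 2012 R2 general purpose file server profile; omit it on plain 2012)
Enable-DedupVolume -Volume "D:" -UsageType Default

# Optionally adjust the file age and exclude folders that should never be deduplicated
Set-DedupVolume -Volume "D:" -MinimumFileAgeDays 3 -ExcludeFolder "D:\Apps\NoDedup"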
Select Set Deduplication Schedule and check the Enable throughput optimization box. Adjust the duration so that the optimization runs outside of your work hours; the server will be taxed while this optimization occurs. Use the Server Manager performance counters to keep an eye on resources during optimization.
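The same schedule can be created from PowerShell. A minimal sketch; the window below (9 PM for 9 hours on weeknights) is just an example of an after-hours run:

# Create a throughput-optimization window outside of business hours
New-DedupSchedule -Name "NightlyOptimization" -Type Optimization -Start "21:00" -DurationHours 9 -Days Monday,Tuesday,Wednesday,Thursday,Friday

# Review the schedules on the server
Get-DedupSchedule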
Depending on your server size, it may take a day or two for volumes to be completely optimized. If you want to see results quickly, you have two options:
- Launch DDPEval.exe (ex: ddpeval.exe \\Server-01\Data\). This tool is in the System32 folder on any server with the Data Deduplication role installed. You can copy the EXE to any 2008 R2+ machine to evaluate potential savings.
- Start a dedup job with PowerShell (see the progress-check sketch below). The following command will dedup volume D and consume up to 50% of the server's RAM: Start-DedupJob -Volume D: -Type Optimization -Memory 50
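Either way, you can watch the job and check the realized savings with the built-in cmdlets:

# Show running/queued dedup jobs and their progress
Get-DedupJob

# Show per-volume savings once optimization has run
Get-DedupStatus
Get-DedupVolume | Format-List Volume, SavedSpace, SavingsRate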
That wraps up our data deduplication guide. If you wish to learn more, the links below will help:
Fantastic guide. Just built a new File Server migrating from 2008 -> 2012 R2. Migrated JUST our HomeDrives and so far it's saved me 90 GB (19% deduplication rate)… I wonder what will happen when I move over our central share. I'll post the results then!
Awesome news, Chris!
Awesome tip, thanks! I’d like to use this along with BranchCache to save space at the main office and speed things up for a branch office. Any conflict with using the two simultaneously? Do they work well in tandem?
Should not be any issues with combining them.
We had enabled this on a file server and it saved a ton of space. We tested with Acronis at both the file level and the Hyper-V VM restore level, and both worked. However, restoring individual files to a different location on a different server/workstation that did not have deduplication enabled did not work. The files were corrupted and inaccessible. So dedup can certainly cause *some* issues with restores, though your two primary restore jobs will work fine (file to original location, full VM). We didn't do a lot of testing when we found the issue, but I wanted to throw my 2c in that there can be issues with Acronis Backup for HV.
Thanks for your tips, Jason! I saw a corrupt restore of a PST file yesterday. I don't believe data dedupe had anything to do with it (the file was constantly being written to), but I am going to check the way I restored it.
This killed a bunch of little application databases and interfered with my backup. Spent the next two hours undoing what it did.
Application databases are not good candidates for dedup. Those files change way too often. I would limit dedup to volumes containing the three types of data listed above (home folders/user shares, software distribution shares, VDI VHDs on 2012 R2).
Here are the general candidate guidelines from TechNet:
Great candidates for deduplication:
◦ Folder redirection servers
◦ Virtualization depot or provisioning library
◦ Software deployment shares
◦ SQL Server and Exchange Server backup volumes
◦ VDI VHDs (supported only on Windows Server 2012 R2)
Should be evaluated based on content:
◦ Line-of-business servers
◦ Static content providers
◦ Web servers
◦ High-performance computing (HPC)
Not good candidates for deduplication:
◦ Hyper-V hosts
◦ WSUS
◦ Servers running SQL Server or Exchange Server
◦ Files approaching or larger than 1 TB in size
◦ VDI VHDs on Windows Server 2012
I am so dumb! I skipped right over the scenario section and just rolled this out without testing… thanks for pointing me in the right direction and for putting together guides like this!
And if the application databases are on a volume that you really want to dedup, you can exclude those files/folders from the process by using exclusions.
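For example (a sketch; the folder path and file extensions below are only placeholders for your own database locations):

# Skip a folder full of small application databases
Set-DedupVolume -Volume "D:" -ExcludeFolder "D:\Apps\Databases"

# Or skip files by extension
Set-DedupVolume -Volume "D:" -ExcludeFileType "mdf","ldf"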
Does this also work with virtualized servers (Hyper-V)? Or does it impose additional requirements on the SAN environment?
Works on VMs and physical machines, and there are no additional requirements.
Also, are there any negatives from using block-level backup solutions like ShadowProtect or Acronis? I wonder how that would work with Acronis since it does dedup as well during the backup.
According to Microsoft, “Block-based backup applications should work without modification, and they maintain the optimization on the backup media.”
I would imagine that Acronis would just not have anything to dedup. You may even be able to cut your backup times by turning off dedup in Acronis during the backup.
Any negatives you experienced? I thought most dedup systems had extremely high memory requirements?
Other than the CPU/memory hit during the optimization timeframe, I haven't seen any additional load on our servers. It is a post-processing dedup job; new file writes are not altered at all. Files aren't even touched until they are X days old.
I would advise that you check out your performance stats first and ensure that you aren’t close to maxing out memory/CPU on a daily basis.
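A quick way to spot-check that from PowerShell (a sketch using the standard Windows performance counters):

# Sample CPU and available memory every 5 seconds for about a minute
Get-Counter -Counter '\Processor(_Total)\% Processor Time','\Memory\Available MBytes' -SampleInterval 5 -MaxSamples 12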