Turning on data deduplication

Deduplication is something that we as people do naturally. Every once in a while, you clean out the refrigerator, right? And if there are seven half-empty bottles of ketchup, you probably deduplicate that and throw some away. Or your closet. If you dig around and find thirty blue shirts, chances are that you can part with a few to save some space. These things make common sense, and so does deduplication when talking about the data that is stored on our servers.

Starting with Windows Server 2012, data deduplication became available as a built-in feature that you enable on a per-volume basis. When enabled, Windows runs scheduled optimization jobs that search the volume for duplicate data and consolidate it. If you have two copies of the same file stored in two different locations, the second copy is doing nothing but consuming extra hard disk space. Data deduplication keeps the duplicated data on disk only once and serves it from that single copy whenever the file is called for from either location.
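
To make the idea concrete, here is a minimal Python sketch of a deduplicating store. It is an illustration only, not how Windows implements the feature: the ChunkStore class, the tiny fixed 4-byte chunk size, and the file paths are all assumptions made for the demo. It splits each file into chunks, keeps every distinct chunk once, and rebuilds a file from its list of chunk hashes.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, chosen only to keep the demo readable


class ChunkStore:
    """Toy deduplicating store: each distinct chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes
        self.files = {}   # file path -> ordered list of chunk digests

    def write(self, path, data):
        """Split data into chunks and store only the chunks not seen before."""
        digests = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[path] = digests

    def read(self, path):
        """Rebuild a file by joining its chunks in order."""
        return b"".join(self.chunks[d] for d in self.files[path])

    def logical_size(self):
        """Total size of all files as the user sees them."""
        return sum(len(self.chunks[d]) for ds in self.files.values() for d in ds)

    def physical_size(self):
        """Bytes actually kept once duplicates are consolidated."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
report = b"quarterly sales figures"
store.write(r"D:\Sales\report.txt", report)          # hypothetical path
store.write(r"D:\Archive\report-copy.txt", report)   # same content, second location
print("logical bytes:", store.logical_size())
print("physical bytes:", store.physical_size())
```

Both paths still read back the full file, but the store holds only one copy of the underlying data, which is the same trade the optimization jobs make on a real volume.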

In Server 2016, we have the ability to extend this deduplication into Hyper-V, specifically for VDI-type deployments. This is huge! Think about all of the virtual desktops that are going to be spun up on that host. With so many similar systems running from the same volume, there is the potential to have thousands of duplicated files, each duplicated numerous times. In this recipe, we are going to walk through the steps to enable data deduplication on a server so that you can start trying this out in your own environments.

Getting ready

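For this recipe, we will be enabling data deduplication on a single server running Windows Server 2016. The server needs a data volume in addition to its system volume, because the data volume is where we will turn deduplication on.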

How to do it…

To enable data deduplication on our server, follow these steps:

  1. Open up Server Manager and click on the Add roles and features link.
  2. Click Next until you get to the Select server roles screen.
  3. Expand File and Storage Services | File and iSCSI Services and check the box next to Data Deduplication.

  4. Finish the wizard to complete the installation of the Data Deduplication role service.
  5. Now in the left pane of Server Manager, click on File and Storage Services.
  6. Click on Volumes.
  7. Right-click on a data volume and choose Configure Data Deduplication….

  8. Click on the Data deduplication drop-down box and specify whether you intend to run deduplication on a General purpose file server, Virtual Desktop Infrastructure (VDI) server, or Virtualized Backup Server. If you try switching between these options, you will notice that the default list of file extensions to exclude from deduplication changes automatically. These are the file types that Microsoft has determined need to be excluded from deduplication in order for it to run effectively in that scenario.
  9. If there are any specific files or folders that you want the deduplication process to leave alone, you can specify them here as exclusions. There is also a button named Set Deduplication Schedule… where you can specify the times of day at which the optimization jobs run to consolidate the data.

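If you would rather script these steps than click through the wizard, the sketch below builds the PowerShell commands that roughly correspond to them. It is a hedged outline, not a tested deployment script: the dedup_commands helper, the E: volume, and the E:\Temp folder are hypothetical names chosen for illustration. The cmdlets themselves (Install-WindowsFeature, Enable-DedupVolume, Set-DedupVolume) ship with Windows Server, and the mapping of drop-down choices to -UsageType values reflects my understanding of the Deduplication module; check both against Microsoft's documentation before running anything.

```python
# Wizard drop-down choice -> value for Enable-DedupVolume -UsageType
USAGE_TYPES = {
    "General purpose file server": "Default",
    "Virtual Desktop Infrastructure (VDI) server": "HyperV",
    "Virtualized Backup Server": "Backup",
}


def dedup_commands(volume, usage, excluded_folders=()):
    """Return the PowerShell command lines that mirror the wizard steps.

    This helper only builds strings; it does not run anything.
    """
    if usage not in USAGE_TYPES:
        raise ValueError(f"unknown usage type: {usage!r}")
    commands = [
        # Steps 1-4: install the Data Deduplication role service
        "Install-WindowsFeature -Name FS-Data-Deduplication",
        # Steps 5-8: enable deduplication on the volume for the chosen workload
        f'Enable-DedupVolume -Volume "{volume}" -UsageType {USAGE_TYPES[usage]}',
    ]
    if excluded_folders:
        # Step 9: folders that the optimization jobs should leave alone
        folders = ",".join(f'"{folder}"' for folder in excluded_folders)
        commands.append(f'Set-DedupVolume -Volume "{volume}" -ExcludeFolder {folders}')
    return commands


# Hypothetical volume and folder, used only for the demo
for line in dedup_commands("E:", "General purpose file server", [r"E:\Temp"]):
    print(line)
```

Paste the printed lines into an elevated PowerShell session on the server to apply them, adjusting the volume letter and exclusions to match your own environment.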

How it works…

Data deduplication is very easy to enable, and it can be a powerful tool for saving disk space on your file servers. A graph available in one of the following links displays Server 2012 R2 deduplication statistics in terms of space-saving percentages for different kinds of data. These numbers are quite a bit larger than I expected to see: around 50 percent for general file shares and over 80 percent for VHD libraries! Windows Server 2016 adds support for even larger volumes and files, so even more of your data can benefit. We can now deduplicate volumes up to 64 TB in size, and individual files up to 1 TB! Try data deduplication on some of your own systems and watch your available disk space start to increase.
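
To put those percentages in perspective, here is a quick back-of-the-envelope calculation in Python. The estimate_dedup helper and the share sizes (4 TB and 10 TB) are made-up examples; the savings rates are the rough figures quoted above, and 0.80 is used as a conservative stand-in for "over 80 percent". Your real results will depend entirely on your own data.

```python
def estimate_dedup(logical_tb, savings_rate):
    """Return (space still used, space freed) in TB for a given savings rate."""
    if not 0 <= savings_rate < 1:
        raise ValueError("savings_rate must be between 0 and 1")
    freed_tb = logical_tb * savings_rate
    return logical_tb - freed_tb, freed_tb


# Hypothetical volumes; the rates are the approximate figures from the text
examples = [
    ("general file share", 4.0, 0.50),
    ("VHD library", 10.0, 0.80),
]

for label, size_tb, rate in examples:
    used_tb, freed_tb = estimate_dedup(size_tb, rate)
    print(f"{label}: {size_tb:.1f} TB of data -> "
          f"{used_tb:.1f} TB on disk, {freed_tb:.1f} TB freed")
```

After deduplication has been running for a while, you can compare estimates like these with the savings that Server Manager reports for the volume.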
