User's guide
Copyright © Acronis, Inc., 2000-2010 
Deduplication at target 
After backup to a deduplicating vault is completed, the storage node runs the indexing task to 
deduplicate data in the vault as follows: 
1.  It moves the items (disk blocks or files) from the archives to a special file within the vault, storing 
duplicate items there only once. This file is called the deduplication data store. If there are both 
disk-level and file-level backups in the vault, there are two separate data stores for them. Items 
that cannot be deduplicated remain in the archives. 
2.  In the archives, it replaces the moved items with the corresponding references to them. 
As a result, the vault contains a number of unique, deduplicated items, with each item having one or 
more references to it from the vault's archives.  
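The two indexing steps above can be illustrated with a minimal sketch. It is not the storage node's actual implementation: the function name index_archives, the in-memory dictionaries, and the use of SHA-256 digests to recognize duplicate items are all assumptions made for illustration. The sketch also ignores items that cannot be deduplicated, which in the product remain in the archives.

```python
import hashlib


def index_archives(archives):
    """Deduplicate items across archives (illustrative only).

    archives: dict mapping an archive name to a list of items (bytes).
    Returns (data_store, indexed), where data_store maps a digest to the
    single stored copy of an item, and indexed maps each archive name to
    the list of digests that now stand in for its items.
    """
    data_store = {}
    indexed = {}
    for name, items in archives.items():
        refs = []
        for item in items:
            # Assumption: identical items are detected by SHA-256 digest.
            digest = hashlib.sha256(item).hexdigest()
            data_store.setdefault(digest, item)  # step 1: store each unique item once
            refs.append(digest)                  # step 2: the archive keeps only a reference
        indexed[name] = refs
    return data_store, indexed


# Two machines deployed from the same source share most of their items.
archives = {
    "machine_a": [b"os-block", b"app-block", b"data-a"],
    "machine_b": [b"os-block", b"app-block", b"data-b"],
}
data_store, indexed = index_archives(archives)
print(len(data_store))  # 4 unique items stored for 6 archived items
```

In this sketch the shared items end up with two references each, one from every archive that contained them.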
The indexing task may take considerable time to complete. You can see this task's state in the Tasks 
view on the management server. 
Compacting 
After one or more backups or archives have been deleted from the vault, either manually or during 
cleanup, the vault may contain items that are no longer referenced by any archive. Such items 
are deleted by the compacting task, which is a scheduled task performed by the storage node. 
By default, the compacting task runs every Sunday night at 03:00. You can reschedule the task as 
described in Actions on storage nodes (p. 315), under "Change the compacting task schedule". You 
can also manually start or stop the task from the Tasks view. 
Because deletion of unused items is resource-consuming, the compacting task performs it only when 
a sufficient amount of data to delete has accumulated. The threshold is determined by the 
Compacting Trigger Threshold (p. 331) configuration parameter. 
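The compacting logic described above can be sketched as follows. This is an illustration under stated assumptions, not the product's code: the function name compact and the threshold_bytes argument are hypothetical, and the unit of the real Compacting Trigger Threshold parameter is not given in this section, so a plain byte count is used as a stand-in.

```python
def compact(data_store, archives, threshold_bytes):
    """Delete unreferenced items once enough of them have accumulated.

    data_store: dict mapping an item id to the stored item (bytes).
    archives: dict mapping an archive name to the list of item ids it references.
    threshold_bytes: hypothetical stand-in for the Compacting Trigger Threshold.
    Returns the number of bytes reclaimed, or 0 if compacting was skipped.
    """
    # Collect every item id that is still referenced by at least one archive.
    referenced = {ref for refs in archives.values() for ref in refs}
    unused = [item_id for item_id in data_store if item_id not in referenced]
    reclaimable = sum(len(data_store[item_id]) for item_id in unused)

    # Deletion is expensive, so skip it until enough data can be reclaimed.
    if reclaimable < threshold_bytes:
        return 0

    for item_id in unused:
        del data_store[item_id]
    return reclaimable


store = {"a": b"1234", "b": b"5678", "c": b"90"}
remaining_archives = {"archive_1": ["a"]}  # the archives that used "b" and "c" were deleted

skipped = compact(store, remaining_archives, threshold_bytes=10)   # 6 bytes unused: below threshold
reclaimed = compact(store, remaining_archives, threshold_bytes=5)  # 6 bytes unused: compacting runs
```

The first call leaves the store untouched because only 6 bytes are reclaimable; the second call removes both unreferenced items.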
2.12.6.3  When deduplication is most effective 
Deduplication produces the maximum effect in the following cases: 
•  When backing up similar data from different sources in full backup mode. Such is the case 
when you back up operating systems and applications deployed from a single source over the 
network. 
•  When performing incremental backups of similar data from different sources, provided that the 
changes to the data are also similar. Such is the case when you deploy updates to these systems 
and then perform an incremental backup. Again, it is recommended that you first back up one machine 
and then the others, all at once or one by one. 
•  When performing incremental backups of data that does not change itself, but changes its 
location. Such is the case when multiple pieces of data circulate over the network or within one 
system. Each time a piece of data moves, it is included in the incremental backup, which becomes 
sizeable even though it contains no new data. Deduplication helps to solve the problem: each time 
an item appears in a new place, a reference to the item is saved instead of the item itself. 
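The last case, data that moves without changing, can be made concrete with a small sketch. It rests on assumptions: the function incremental_size, the path-to-content dictionaries, and SHA-256 digests as item identifiers are illustrative only, and the small size of a saved reference is ignored.

```python
import hashlib


def incremental_size(previous, current, known_digests=None):
    """Return the number of bytes an incremental backup would store.

    previous, current: dicts mapping a path to file content (bytes).
    known_digests: set of digests already in the data store; pass None
    to model a backup without deduplication.
    """
    stored = 0
    for path, content in current.items():
        if previous.get(path) == content:
            continue  # unchanged at the same location: not part of the increment
        if known_digests is not None:
            digest = hashlib.sha256(content).hexdigest()
            if digest in known_digests:
                continue  # item already stored: only a reference is saved
            known_digests.add(digest)
        stored += len(content)
    return stored


payload = b"x" * 1000
previous = {"/docs/report.bin": payload}
current = {"/archive/report.bin": payload}  # same data, new location
known = {hashlib.sha256(payload).hexdigest()}

without_dedup = incremental_size(previous, current)      # 1000: the moved file is stored again
with_dedup = incremental_size(previous, current, known)  # 0: only a reference is needed
```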
Deduplication and incremental backups 
If the data changes randomly, deduplication during incremental backup will not produce much 
effect because: 
•  The deduplicated items that have not changed are not included in the incremental backup. 