Volume Clone Split Extremely Slow in clustered Data ONTAP

Problem

My colleague had been dealing with growth on an extremely large volume (60 TB) for some time. After discussion with the business groups, it was agreed to split the volume into two separate volumes. The largest directory identified was 20 TB, which could be moved to its own volume. Discussions started on the best possible way to get this job done quickly.

Possible Solutions

  • robocopy / securecopy the directory to another volume. Past experience says this could be a lot more time consuming (see the robocopy sketch after this list).
  • ndmpcopy the large directory to a new volume. The ndmpcopy session needs to be kept open, and if the job fails during the transfer, we have to restart from the beginning. Also, there are no progress updates available.
  • clone the volume, delete the data that is not required, then split the clone. This seemed like a neat solution.
  • vol move. We didn’t want to copy the entire 60 TB volume and then delete data, so we didn’t consider this option.
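
For the first option, a typical multi-threaded robocopy invocation might look like the sketch below. The share paths, thread count, and retry/logging options are placeholders and would need tuning for the environment:

C:\> robocopy \\snowy\vol_001\bigdir \\snowy\vol_002\bigdir /E /COPYALL /MT:32 /R:2 /W:5 /LOG:C:\logs\bigdir_copy.log

Here /E copies subdirectories (including empty ones), /COPYALL preserves all file attributes and security, and /MT:32 runs 32 copy threads, which is usually what makes robocopy competitive at all on large trees.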

So, we agreed on the third solution (clone, delete, split).
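
On paper, the sequence looks roughly like this; the parent volume name snowy_vol_001 and the junction path are assumptions inferred from the clone name, and the deletes themselves are done from an NFS/CIFS client against the clone:

snowy-mgmt::> volume clone create -vserver snowy -flexclone snowy_vol_001_clone -parent-volume snowy_vol_001 -junction-path /snowy_vol_001_clone
(delete the unwanted ~40 TB from the clone via an NFS/CIFS client)
snowy-mgmt::> volume clone split start -vserver snowy -flexclone snowy_vol_001_clone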

What actually happened

snowy-mgmt::> volume clone split start -vserver snowy -flexclone snowy_vol_001_clone
Warning: Are you sure you want to split clone volume snowy_vol_001_clone in Vserver snowy ?
{y|n}: y
[Job 3325] Job is queued: Split snowy_vol_001_clone.
 
Several hours later:
snowy-mgmt::> volume clone split show
                                       Inodes                Blocks
                              --------------------- ---------------------
Vserver   FlexClone            Processed      Total    Scanned    Updated % Complete
--------- ------------------- ---------- ---------- ---------- ---------- ----------
snowy     snowy_vol_001_clone         55      65562    1532838    1531276          0
 
Two days later:
snowy-mgmt::> volume clone split show
                                       Inodes                Blocks
                              --------------------- ---------------------
Vserver   FlexClone            Processed      Total    Scanned    Updated % Complete
--------- ------------------- ---------- ---------- ---------- ---------- ----------
snowy     snowy_vol_001_clone        440      65562 1395338437 1217762917          0

This was a huge problem: after two days the scanner had processed only 440 of 65,562 inodes and still reported 0% complete. At that rate, the split operation would never finish in time.

What we found

We found the problem was with the way clone split works. Data ONTAP uses a background scanner to copy the shared blocks from the parent volume to the FlexClone volume. The scanner has only one active message at a time, processing a single inode, so a split tends to be faster on a volume with fewer inodes. The scanner also runs at low priority and can take a considerable amount of time to complete. This means that for a large volume with millions of inodes, the split operation can take an enormous amount of time.
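
Given that behaviour, it is worth checking the inode count on the parent volume before committing to a split; the files and files-used fields are standard volume show fields, and volume clone split estimate gives a rough idea of the space the split will consume:

snowy-mgmt::> volume show -vserver snowy -volume snowy_vol_001 -fields files,files-used
snowy-mgmt::> volume clone split estimate -vserver snowy -flexclone snowy_vol_001_clone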

Workaround

“volume move a clone”

snowy-mgmt::*> vol move start -vserver snowy -volume snowy_vol_001_clone -destination-aggregate snowy01_aggr_01
  (volume move start)
 
Warning: Volume will no longer be a clone volume after the move and any associated space efficiency savings will be lost. Do you want to proceed? {y|n}: y
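
Unlike the clone split scanner, the move job reports meaningful progress, which can be watched with:

snowy-mgmt::*> volume move show -vserver snowy -volume snowy_vol_001_clone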

Benefits of using vol move on a FlexClone:

  • Faster than a FlexClone split.
  • Data can be moved to a different aggregate or node.
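
Once the move completes, the volume is an ordinary, independent FlexVol. If the split-off data should then be presented under its own name and junction path, the follow-up could look like the sketch below; the new name and path are placeholders:

snowy-mgmt::> volume rename -vserver snowy -volume snowy_vol_001_clone -newname snowy_vol_002
snowy-mgmt::> volume mount -vserver snowy -volume snowy_vol_002 -junction-path /snowy_vol_002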

Reference

FAQ – FlexClone split
