18334231. CHUNKING AND DEDUPLICATION OF DATA USING ERROR CHECKING VALUES (VMware, Inc.)
CHUNKING AND DEDUPLICATION OF DATA USING ERROR CHECKING VALUES
Organization Name
VMware, Inc.
Inventor(s)
Abhay Kumar Jain of Santa Clara CA (US)
Wenguang Wang of Santa Clara CA (US)
Enning Xiang of San Jose CA (US)
This abstract first appeared for US patent application 18334231, titled 'CHUNKING AND DEDUPLICATION OF DATA USING ERROR CHECKING VALUES'.
Original Abstract Submitted
Chunks of data are identified and deduplication is performed on the chunks of data using associated cyclic redundancy check (CRC) values. A plurality of CRC values is obtained that is associated with consecutive data blocks stored in a payload data store. Cut point CRC values are identified in the plurality of CRC values and CRC chunks are identified based on those cut point CRC values, wherein each CRC chunk is bounded by two consecutive cut point CRC values. A CRC chunk hash value is generated for each CRC chunk. A pair of duplicate CRC chunks is identified using the CRC chunk hash values and a deduplication operation is performed in association with the identified pair of duplicate CRC chunks. Using existing CRC values during the identification of chunk cut points reduces the computing resource costs associated with performing that process using the data blocks.
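The steps in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application, and several details are assumptions, because the abstract does not state them: the cut point rule (here, a CRC whose low bits are zero under a hypothetical CUT_MASK), the chunk boundary convention (a chunk starts at one cut point and ends just before the next), and SHA-256 as the chunk hash. All function names are made up for illustration.

```python
import hashlib
import struct
from collections import defaultdict

# Hypothetical rule: a CRC is a cut point when its low bits are all zero.
CUT_MASK = 0x3


def find_cut_points(crcs, mask=CUT_MASK):
    """Return the indices of CRC values that qualify as cut points."""
    return [i for i, crc in enumerate(crcs) if crc & mask == 0]


def find_crc_chunks(crcs, mask=CUT_MASK):
    """Return (start, end) index pairs, one per CRC chunk.

    Assumption: a chunk runs from one cut point up to, but not including,
    the next one. CRCs before the first cut point, and from the last cut
    point onward, are left unchunked in this sketch.
    """
    cuts = find_cut_points(crcs, mask)
    return list(zip(cuts, cuts[1:]))


def chunk_hash(crcs, start, end):
    """Hash the 32-bit CRC values of one chunk (SHA-256 is an assumed choice)."""
    packed = struct.pack(f"<{end - start}I", *crcs[start:end])
    return hashlib.sha256(packed).hexdigest()


def find_duplicate_chunks(crcs, mask=CUT_MASK):
    """Group chunks by hash and return the groups with more than one member."""
    groups = defaultdict(list)
    for start, end in find_crc_chunks(crcs, mask):
        groups[chunk_hash(crcs, start, end)].append((start, end))
    return [ranges for ranges in groups.values() if len(ranges) > 1]


# Made-up CRC values for ten consecutive data blocks.
crcs = [4, 1, 2, 8, 5, 4, 1, 2, 8, 7]
duplicates = find_duplicate_chunks(crcs)
print(duplicates)  # [[(0, 3), (5, 8)]]
```

In this example the chunking only reads the small per-block CRC values, not the data blocks themselves, which is the cost saving the abstract describes. Because different data can share a CRC, a real system would presumably compare the underlying blocks before deduplicating a matched pair. That check is omitted here.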