US Patent Application 17658932. WARM START FILE COMPRESSION USING SEQUENCE ALIGNMENT simplified abstract

From WikiPatents

WARM START FILE COMPRESSION USING SEQUENCE ALIGNMENT

Organization Name

Dell Products L.P.


Inventor(s)

Ofir Ezrielev of Be'er Sheba (IL)


Ilan Buyum of Kfar-Aza (IL)


Jehuda Shemer of Kfar Saba (IL)


WARM START FILE COMPRESSION USING SEQUENCE ALIGNMENT - A simplified explanation of the abstract

  • This abstract appeared for US patent application number 17658932, titled 'WARM START FILE COMPRESSION USING SEQUENCE ALIGNMENT'

Simplified Explanation

The abstract describes a method for compressing files. The input file is first aligned, which involves splitting it into sequences that can be aligned against one another. The result is a compression matrix in which each row corresponds to part of the file. The same matrix can serve as a warm start if further compression is desired. Compression may be performed in stages: a first stage builds an initial compression matrix using larger letter sizes for alignment, and a second stage uses smaller letter sizes. A consensus sequence is determined from the compression matrix, and pointer pairs are generated from that consensus sequence. Each pointer pair identifies a subsequence of the consensus matrix. The compressed file includes the pointer pairs and the consensus sequence.
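
The last steps (consensus sequence, pointer pairs, compressed file) can be illustrated with a toy sketch. It is not the patented method. It assumes an already-aligned compression matrix given as equal-length strings, a hypothetical gap marker "-" that never occurs in the file, and that every column holds exactly one distinct letter apart from gaps. The function names are illustrative only.

```python
GAP = "-"  # hypothetical gap marker; assumed never to occur in the file itself

def consensus_sequence(matrix):
    """Take the single non-gap letter of each column as the consensus letter."""
    consensus = []
    for column in zip(*matrix):
        letters = {c for c in column if c != GAP}
        if len(letters) != 1:
            raise ValueError("each column must hold exactly one distinct letter")
        consensus.append(letters.pop())
    return "".join(consensus)

def pointer_pairs(row):
    """Return (start, end) pairs, end exclusive, for each run of non-gap cells."""
    pairs, start = [], None
    for i, c in enumerate(row):
        if c != GAP and start is None:
            start = i                      # a run of letters begins
        elif c == GAP and start is not None:
            pairs.append((start, i))       # the run ended at the previous cell
            start = None
    if start is not None:
        pairs.append((start, len(row)))    # run reaches the end of the row
    return pairs

def compress(matrix):
    """Compressed form: the consensus plus the pointer pairs of every row."""
    return consensus_sequence(matrix), [pointer_pairs(row) for row in matrix]

def decompress(consensus, pairs_per_row):
    """Rebuild the file by copying each referenced slice of the consensus."""
    return "".join(consensus[s:e] for pairs in pairs_per_row for s, e in pairs)

# Three aligned rows, each standing for one part of the file.
matrix = ["ABCD--GH",
          "AB--EFGH",
          "--CDEF--"]
consensus, pairs = compress(matrix)
print(consensus)                     # ABCDEFGH
print(pairs)                         # [[(0, 4), (6, 8)], [(0, 2), (4, 8)], [(2, 6)]]
print(decompress(consensus, pairs))  # ABCDGHABEFGHCDEF
```

In this toy case the 16-letter file is stored as an 8-letter consensus plus five pointer pairs. Any saving in practice depends on how much the aligned rows overlap.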


Original Abstract Submitted

Compressing files is disclosed. An input file to be compressed is first aligned. Aligning the file includes splitting the file into sequences that can be aligned. The result is a compression matrix, where each row of the matrix corresponds to part of the file. The compression matrix may also serve as a warm start if additional compression is desired. Compression may be performed in stages, where an initial compression matrix is generated in a first stage using larger letter sizes for alignment and then a second compression stage is performed using smaller letter sizes. A consensus sequence is determined from the compression matrix. Using the consensus sequence, pointer pairs are generated. Each pointer pair identifies a subsequence of the consensus matrix. The compressed file includes the pointer pairs and the consensus sequence.
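
The staged, warm-start idea can be sketched in the same hedged spirit. The abstract does not say how a first-stage matrix seeds the second stage, so the following is only one possible reading: a "letter" is assumed to be a fixed-size chunk of bytes, a gap is represented by None, and a coarse matrix row is refined by cutting each large letter into smaller ones while widening each gap to match. The names split_letters and refine_row are hypothetical.

```python
from typing import List, Optional

GAP = None  # hypothetical gap marker for matrix cells

def split_letters(data: bytes, size: int) -> List[bytes]:
    """Cut data into letters of `size` bytes (the last letter may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def refine_row(row: List[Optional[bytes]], coarse: int, fine: int) -> List[Optional[bytes]]:
    """Re-express a row aligned on `coarse`-byte letters using `fine`-byte letters.

    Every coarse cell becomes coarse // fine fine cells, so columns that were
    aligned in the first stage stay aligned in the starting matrix of the second.
    """
    if coarse % fine != 0:
        raise ValueError("coarse letter size must be a multiple of the fine size")
    factor = coarse // fine
    refined: List[Optional[bytes]] = []
    for cell in row:
        if cell is GAP:
            refined.extend([GAP] * factor)             # widen the gap
        else:
            pieces = split_letters(cell, fine)
            pieces += [GAP] * (factor - len(pieces))   # pad a short final letter
            refined.extend(pieces)
    return refined

# First stage: rows aligned with 4-byte letters.
stage1 = [
    [b"ABCD", b"EFGH", GAP],
    [b"ABCD", GAP, b"IJ"],
]
# Warm start for the second stage: the same alignment on 2-byte letters.
stage2_start = [refine_row(row, coarse=4, fine=2) for row in stage1]
for row in stage2_start:
    print(row)
```

A second-stage aligner would then start from stage2_start rather than from an unaligned file, and could look for matches that the coarser letters missed.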