US Patent Application 18137695. System And Method For Large-Scale Data Processing Using An Application-Independent Framework (simplified abstract)
System And Method For Large-Scale Data Processing Using An Application-Independent Framework
Organization Name
Inventor(s)
Jeffrey Dean of Palo Alto CA (US)
Sanjay Ghemawat of Mountain View CA (US)
System And Method For Large-Scale Data Processing Using An Application-Independent Framework - A simplified explanation of the abstract
This abstract first appeared for US patent application 18137695, titled 'System And Method For Large-Scale Data Processing Using An Application-Independent Framework'.
Simplified Explanation
- The patent application describes a method for processing large amounts of data in a distributed and parallel processing environment.
- The method uses application-independent map and reduce operations, which automatically handle data partitioning, parallelization of computations, and fault tolerance.
- Users can specify a map operation to read and write data, and a reduce operation to process intermediate data and produce final output data.
- The method executes map worker processes to read input files and store intermediate data values.
- The method also executes reduce worker processes to process intermediate data and produce final output data.
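The split between user-specified operations and the application-independent framework can be illustrated with a small word-count sketch. This is only an illustration, not the implementation claimed in the application: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and the driver runs sequentially in one process instead of distributing work.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# All names below are hypothetical and chosen only for illustration.


def map_fn(record: str) -> Iterator[Tuple[str, int]]:
    """User-specified map operation: emit (word, 1) for each word in a record."""
    for word in record.split():
        yield word.lower(), 1


def reduce_fn(key: str, values: Iterable[int]) -> int:
    """User-specified reduce operation: sum the counts collected for one key."""
    return sum(values)


def run_mapreduce(
    records: Iterable[str],
    map_op: Callable[[str], Iterator[Tuple[str, int]]],
    reduce_op: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Framework side: collect intermediate values by key, then reduce each group."""
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_op(record):
            intermediate[key].append(value)
    # Sorting the keys only makes the output order predictable.
    return {key: reduce_op(key, values) for key, values in sorted(intermediate.items())}


counts = run_mapreduce(["the quick fox", "The lazy dog"], map_fn, reduce_fn)
print(counts)  # {'dog': 1, 'fox': 1, 'lazy': 1, 'quick': 1, 'the': 2}
```

In this sketch the user writes only map_fn and reduce_fn. Everything in run_mapreduce is the part a framework would take over, including the partitioning, parallelization, and fault tolerance that this sequential version leaves out.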
Original Abstract Submitted
A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.
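The worker structure in the abstract can be sketched as follows. Each map worker reads one designated portion of the input and stores intermediate values, and each reduce worker reads its share of those values and produces output. The sketch makes several assumptions not found in the application: threads stand in for worker processes, in-memory dictionaries stand in for the intermediate data structures, keys are assigned to reduce workers by a CRC32 hash, and fault tolerance is not modeled. All function and variable names are hypothetical.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_REDUCERS = 2  # hypothetical number of reduce workers / partitions


def partition(key: str) -> int:
    """Deterministically assign a key to one reduce partition (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_REDUCERS


def map_worker(split):
    """Read one designated input split; store intermediate values per partition."""
    buckets = [defaultdict(list) for _ in range(NUM_REDUCERS)]
    for line in split:
        for word in line.split():  # user-specified map logic: emit (word, 1)
            buckets[partition(word)][word].append(1)
    return buckets


def reduce_worker(partition_id, map_outputs):
    """Read one partition from every map worker's output and sum values per key."""
    merged = defaultdict(list)
    for buckets in map_outputs:
        for key, values in buckets[partition_id].items():
            merged[key].extend(values)
    return {key: sum(values) for key, values in merged.items()}  # user reduce logic


def run_job(splits):
    """Run all map workers, then all reduce workers, and combine their outputs."""
    with ThreadPoolExecutor() as pool:
        map_outputs = list(pool.map(map_worker, splits))
        partials = list(
            pool.map(lambda r: reduce_worker(r, map_outputs), range(NUM_REDUCERS))
        )
    result = {}
    for partial in partials:  # partitions hold disjoint keys, so no collisions
        result.update(partial)
    return result


splits = [["a b a"], ["b c"], ["a c c"]]
result = run_job(splits)
print(sorted(result.items()))  # [('a', 3), ('b', 2), ('c', 3)]
```

Because every occurrence of a key hashes to the same partition, each reduce worker sees all the values for the keys it owns, which is what lets the reduce workers run independently of one another.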