US Patent Application 18449651. SYSTEMS AND METHODS FOR COMPUTING DOT PRODUCTS OF NIBBLES IN TWO TILE OPERANDS simplified abstract

From WikiPatents

SYSTEMS AND METHODS FOR COMPUTING DOT PRODUCTS OF NIBBLES IN TWO TILE OPERANDS

Organization Name

Intel Corporation

Inventor(s)

Raanan Sade of Portland OR (US)

Simon Rubanovich of Haifa (IL)

Amit Gradstein of Binyamina (IL)

Zeev Sperber of Zichron Yackov (IL)

Alexander Heinecke of San Jose CA (US)

Robert Valentine of Kiryat Tivon (IL)

Mark J. Charney of Lexington MA (US)

Bret Toll of Hillsboro OR (US)

Jesus Corbal of King City OR (US)

Elmoustapha Ould-ahmed-vall of Gilbert AZ (US)

SYSTEMS AND METHODS FOR COMPUTING DOT PRODUCTS OF NIBBLES IN TWO TILE OPERANDS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18449651, titled 'SYSTEMS AND METHODS FOR COMPUTING DOT PRODUCTS OF NIBBLES IN TWO TILE OPERANDS'.

Simplified Explanation

The patent application relates to computing dot products of nibbles in tile operands.

  • The processor includes decode circuitry to decode a tile dot product instruction.
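  • The instruction has fields for an opcode, a destination identifier (an M by N matrix), a first source identifier (an M by K matrix), and a second source identifier (a K by N matrix).
  • Each of the three matrices contains doubleword (32-bit) elements, so each element holds eight 4-bit nibbles.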
  • For each element (M,N) of the destination matrix, the execution circuitry performs the following flow K times, once per position along the shared K dimension.
  • Each pass generates eight products by multiplying each nibble of doubleword element (M,K) of the first source matrix by the corresponding nibble of doubleword element (K,N) of the second source matrix.
  • The eight products are then accumulated, with saturation, into the previous contents of doubleword element (M,N).
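
The flow described in the bullets above can be sketched in software. This is only an illustrative model, not the patented hardware or any real instruction's definition. The abstract does not say whether nibbles are signed or how saturation is bounded, so the sketch assumes unsigned nibbles, saturation to the signed 32-bit range, and saturation applied after each of the K passes. The function names (nibbles, saturate_int32, tile_nibble_dot) are hypothetical.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def nibbles(dword):
    """Split a 32-bit value into eight 4-bit nibbles, lowest nibble first.

    Assumption: nibbles are treated as unsigned (0..15).
    """
    return [(dword >> (4 * i)) & 0xF for i in range(8)]


def saturate_int32(value):
    """Clamp a value to the signed 32-bit range (an assumed saturation rule)."""
    return max(INT32_MIN, min(INT32_MAX, value))


def tile_nibble_dot(dst, src1, src2):
    """Accumulate nibble dot products of src1 (M x K) and src2 (K x N) into dst (M x N).

    Each matrix is a list of rows of 32-bit integers. dst is updated in place
    and also returned.
    """
    m_rows, k_dim, n_cols = len(src1), len(src2), len(src2[0])
    for m in range(m_rows):
        for n in range(n_cols):
            acc = dst[m][n]  # previous contents of element (M,N)
            for k in range(k_dim):  # the flow runs K times per destination element
                a = nibbles(src1[m][k])
                b = nibbles(src2[k][n])
                products = [x * y for x, y in zip(a, b)]  # eight products
                acc = saturate_int32(acc + sum(products))
            dst[m][n] = acc
    return dst
```

As a usage example, calling tile_nibble_dot([[5]], [[0x11111111]], [[0x22222222]]) multiplies eight nibble pairs of 1 and 2, adds the resulting 16 to the previous destination value 5, and leaves 21 in the single destination element.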


Original Abstract Submitted

Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).