20250217591. Lossless Lossy Large La (Dell Products L.P.)
LOSSLESS AND LOSSY LARGE LANGUAGE MODEL-BASED TEXT COMPRESSION VIA ARITHMETIC CODING
Abstract: One example method includes: receiving, by a large language model (LLM), input text to be compressed; defining a size of a rolling window of previous tokens, generated prior to receipt of the input text, that the LLM is permitted to consider in a conditional probability estimate; generating, by the LLM, tokenized text based on the input text, wherein the tokenized text comprises a sequence of tokens; obtaining, based on the previous tokens, a probability mass function of a next token of the sequence; providing the probability mass function as an input to an arithmetic coding (AC) scheme; and assigning, by the AC scheme, a respective binary code to the token with a highest probability as assigned by the LLM.
Inventor(s): David Burth Kurka, Renam Castro da Silva, Diego Vrague Noble, Rômulo Teixeira de Abreu Pinho, Vinicius Michel Gottin
CPC Classification: G06F40/284 (ELECTRIC DIGITAL DATA PROCESSING (computer systems based on specific computational models))
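Illustrative sketch (not part of the application record): the lossless pipeline outlined in the abstract, a rolling window of previous tokens feeding a probability mass function into an arithmetic coder, can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: `toy_pmf` is a hypothetical count-based stand-in for the LLM, `VOCAB` is a four-symbol toy vocabulary, `WINDOW` is an arbitrary window size, and exact rational arithmetic replaces the finite-precision coder a practical system would use. The lossy variant mentioned in the title is not covered.

```python
from collections import Counter
from fractions import Fraction
from math import ceil

VOCAB = ["a", "b", "c", "<eos>"]  # hypothetical toy vocabulary
WINDOW = 2                        # assumed rolling-window size (must be > 0)


def toy_pmf(context):
    """Hypothetical stand-in for the LLM: Laplace-smoothed counts over the window."""
    counts = Counter(context)
    total = len(context) + len(VOCAB)
    return {tok: Fraction(counts[tok] + 1, total) for tok in VOCAB}


def narrow(low, width, pmf, tok):
    """Shrink the interval [low, low + width) to the sub-interval owned by tok."""
    cum = Fraction(0)
    for cand in VOCAB:
        if cand == tok:
            break
        cum += pmf[cand]
    return low + width * cum, width * pmf[tok]


def encode(tokens):
    """Arithmetic-code a token list that ends with '<eos>' into a bit string."""
    low, width = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        pmf = toy_pmf(tokens[:i][-WINDOW:])  # only the last WINDOW tokens are visible
        low, width = narrow(low, width, pmf, tok)
    bits = 0
    while True:  # shortest binary fraction k / 2**bits inside the final interval
        bits += 1
        k = ceil(low * 2**bits)
        if Fraction(k, 2**bits) < low + width:
            return format(k, f"0{bits}b")


def decode(code):
    """Invert encode() by replaying the same model on the tokens decoded so far."""
    value = Fraction(int(code, 2), 2 ** len(code))
    low, width = Fraction(0), Fraction(1)
    out = []
    while not out or out[-1] != "<eos>":
        pmf = toy_pmf(out[-WINDOW:])
        cum = Fraction(0)
        for cand in VOCAB:
            if low + width * (cum + pmf[cand]) > value:  # value falls in cand's slice
                low, width = narrow(low, width, pmf, cand)
                out.append(cand)
                break
            cum += pmf[cand]
    return out


if __name__ == "__main__":
    tokens = list("abacab") + ["<eos>"]
    code = encode(tokens)
    print(len(code), "bits:", code)
    print(decode(code) == tokens)
```

In this sketch a token with higher model probability shrinks the interval less and therefore costs fewer output bits, which is the general reason a better predictor yields a shorter code. The decoder must run the same model with the same window size, since it rebuilds each probability mass function from the tokens it has already recovered.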