Hyundai Motor Company (20240338522). GRADIENT CONTROL DEVICE AND GRADIENT CONTROL METHOD OF LANGUAGE MODEL simplified abstract

From WikiPatents

GRADIENT CONTROL DEVICE AND GRADIENT CONTROL METHOD OF LANGUAGE MODEL

Organization Name

Hyundai Motor Company

Inventor(s)

Woojong Ryu of Seoul (KR)

Seongmin Lee of Incheon (KR)

Sungroh Yoon of Seoul (KR)

Sangwon Yu of Seoul (KR)

GRADIENT CONTROL DEVICE AND GRADIENT CONTROL METHOD OF LANGUAGE MODEL - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240338522, titled 'GRADIENT CONTROL DEVICE AND GRADIENT CONTROL METHOD OF LANGUAGE MODEL'.

The abstract describes a gradient control device and method for a language model. The method counts token occurrences in batch data, groups rare tokens, computes a gate tensor over the embedding vectors of those rare tokens, and scales the part of the gradient that pushes those embedding vectors away from certain feature vectors during a training step.

  • The device includes processors and memory storing instructions.
  • It calculates occurrences of tokens in batch data at each training step.
  • Rare tokens are grouped by comparing each token's occurrence count with a threshold value.
  • A gate tensor is calculated on embedding vectors of the grouped rare tokens.
  • The gradient part that pushes the embedding vectors away from certain feature vectors is scaled during the training step.

Potential Applications:

  • Natural language processing
  • Machine learning
  • Text generation

Problems Solved:

  • Improving the efficiency of language models
  • Enhancing the performance of token embeddings
  • Addressing issues with rare tokens in training data

Benefits:

  • Better accuracy in language modeling
  • Improved handling of rare tokens
  • Enhanced training efficiency

Commercial Applications: Enhanced Language Model Training Device for AI Applications

This technology can be used in various industries, such as:

  • Chatbots
  • Sentiment analysis
  • Text summarization

Questions about the technology:

  1. How does this gradient control device improve the training process of language models?
  2. What impact does grouping rare tokens have on the overall performance of the language model?

Frequently Updated Research: Stay updated on advancements in language model training techniques and token embedding optimization for improved natural language processing tasks.


Original Abstract Submitted

Provided are a gradient control device and a gradient control method of a language model. The gradient control device of a language model may include: one or more processors, and memory storing instructions. The instructions, when executed by the one or more processors, may cause the gradient control device to calculate a number of occurrences of each token, of a plurality of tokens, in batch data at each training step of a plurality of training steps ranging from a current training step to a set previous training step; group rare tokens based on a comparison of the calculated number of occurrences of each token, of the plurality of tokens, with a threshold value; calculate a gate tensor on embedding vectors of the grouped rare tokens; and scale a gradient part that pushes the embedding vectors of the grouped rare tokens away from feature vectors having relatively non-rare target tokens and feature vectors having relatively rare target tokens, among gradients of a loss function for the embedding vectors of the grouped rare tokens in a training step.
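As a rough illustration of the final scaling step, assuming the loss gradient for a rare token's embedding can be split into a component pulling it toward its target feature vectors and a component pushing it away from other feature vectors, a gate value would damp only the push-away component. The function name, gradient decomposition, and numbers below are hypothetical, not taken from the patent:

```python
# Hypothetical sketch: combine the two gradient components for a rare
# token's embedding, shrinking only the repulsive (push-away) part by a
# gate factor in [0, 1]. All values here are illustrative.
def scale_push_gradient(grad_pull, grad_push, gate):
    """Return the gated gradient: full pull component plus the
    push-away component scaled by `gate` (gate < 1 damps repulsion)."""
    return [p + gate * q for p, q in zip(grad_pull, grad_push)]

# With gate=0.1 the repulsive update is reduced to a tenth of its size.
scaled = scale_push_gradient([0.5, -0.2], [1.0, 2.0], gate=0.1)
```

In a deep-learning framework this kind of per-component scaling would typically be applied with a gradient hook on the embedding parameters rather than by hand.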