Patent Applications by DeepMind Technologies Limited on August 1st, 2024

DeepMind Technologies Limited: 5 patent applications

DeepMind Technologies Limited has applied for patents in the areas of G06N3/092 (3), G06N3/08 (1), G06N3/0895 (1), and G06N3/042 (1).

Keywords appearing in the patent application abstracts include: dataset, function, neural, network, current, augmented, environment, training, agent, and gradient.

Patent Applications by DeepMind Technologies Limited

20240256861. DETERMINING STATIONARY POINTS OF A LOSS FUNCTION USING CLIPPED AND UNBIASED GRADIENTS

Inventor(s): Marcus Hutter of London (GB) for DeepMind Technologies Limited, Bryn Hayeder Khalid Elesedy of Enfield (GB) for DeepMind Technologies Limited

IPC Code(s): G06N3/08

CPC Code(s): G06N3/08

Abstract: A method of optimizing a loss function defined by one or more numerical parameters is provided. The method comprises determining initial values of the parameters, and performing a plurality of training iterations. Each training iteration except the first comprises (i) determining a gradient of the loss function associated with the parameters, (ii) obtaining a clipped value generated in a previous training iteration, (iii) additively combining the gradient and the clipped value to generate a modified gradient, (iv) processing, using a clipping function based on a threshold value, the modified gradient to generate a clipped gradient, (v) updating the value of the one or more parameters based on the clipped gradient, and (vi) storing, as the clipped value for use in a next training iteration, a difference between the modified gradient and the clipped gradient.
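Steps (i) to (vi) of the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method as claimed: the abstract does not say which clipping function is used (Euclidean norm clipping is assumed here), nor how the parameters are updated (a plain gradient-descent step with a hypothetical learning rate `lr` is assumed), and the stored clipped value is assumed to start at zero for the first iteration. All function and parameter names are made up for the example.

```python
import math


def clip_by_norm(vec, threshold):
    """Assumed clipping function: rescale vec so its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]


def training_step(params, grad, carry, lr, threshold):
    """One iteration; `carry` is the clipped value stored by the previous iteration."""
    # (ii) + (iii): add the stored clipped value to the fresh gradient.
    modified = [g + c for g, c in zip(grad, carry)]
    # (iv): clip the modified gradient using the threshold.
    clipped = clip_by_norm(modified, threshold)
    # (v): update the parameters from the clipped gradient (assumed: gradient descent).
    new_params = [p - lr * u for p, u in zip(params, clipped)]
    # (vi): store what clipping removed, for use in the next iteration.
    new_carry = [m - u for m, u in zip(modified, clipped)]
    return new_params, new_carry


def run(grad_fn, params, lr=0.1, threshold=1.0, steps=100):
    """Repeat training_step; grad_fn(params) returns the gradient (step (i))."""
    params = list(params)
    carry = [0.0] * len(params)  # assumption: nothing stored before the first iteration
    for _ in range(steps):
        params, carry = training_step(params, grad_fn(params), carry, lr, threshold)
    return params, carry
```

In this sketch, whatever clipping removes from one update is added back in later iterations, so the sum of the applied clipped gradients plus the final stored value equals the sum of the raw gradients.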

20240256879. TRAINING A NEURAL NETWORK TO PERFORM AN ALGORITHMIC TASK USING A SELF-SUPERVISED LOSS

Inventor(s): Beatrice Bevilacqua of Lafayette IN (US) for DeepMind Technologies Limited, Petar Velickovic of Cambridge (GB) for DeepMind Technologies Limited, Jovana Mitrovic of London (GB) for DeepMind Technologies Limited, Kyriacos Nikiforou of Strovolos (CY) for DeepMind Technologies Limited, Ioana Bica of London (GB) for DeepMind Technologies Limited, Borja Ibarz Gabardos of London (GB) for DeepMind Technologies Limited

IPC Code(s): G06N3/0895

CPC Code(s): G06N3/0895

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network to perform an algorithmic task. According to one aspect, there is provided a method comprising: obtaining an input dataset; generating a first augmented dataset and a second augmented dataset, wherein for both the first augmented dataset and the second augmented dataset: applying the computational algorithm to the augmented dataset causes the same computational operations to be performed at a target computational step as would be performed by applying the computational algorithm to the input dataset; processing the first augmented dataset and the second augmented dataset using the neural network, comprising, for each augmented dataset: generating an intermediate representation of the augmented dataset at an intermediate layer of the neural network; and training the neural network on an objective function, wherein the objective function comprises a self-supervised loss term.

20240256882. REINFORCEMENT LEARNING BY DIRECTLY LEARNING AN ADVANTAGE FUNCTION

Inventor(s): Yunhao Tang of London (GB) for DeepMind Technologies Limited, Remi Munos of London (GB) for DeepMind Technologies Limited, Mark Daniel Rowland of London (GB) for DeepMind Technologies Limited, Michal Valko of Paris (FR) for DeepMind Technologies Limited

IPC Code(s): G06N3/092

CPC Code(s): G06N3/092

Abstract: A system and method, implemented by one or more computers, of controlling an agent to take actions in an environment to perform a task is provided. The method comprises maintaining a value function neural network and an advantage function neural network that is an estimate of a state-action advantage function representing a relative advantage of performing one possible action relative to the other possible actions. The method further comprises using the advantage function neural network to control the agent to take actions in the environment to perform the task. The method also comprises training the value function neural network and the advantage function neural network in a way that takes into account a behavior policy defined by a distribution of actions taken by the agent in training data.
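As a rough illustration of the quantities named in the abstract above, the Python sketch below estimates a behavior policy from logged actions, shifts a set of advantage estimates so that they average to zero under that policy, and picks the action with the largest advantage. The abstract does not state how the behavior policy enters training, so the centering step is an assumption made for the example, and plain lists stand in for the outputs of the neural networks. All names are hypothetical.

```python
def behavior_policy(actions_taken, num_actions):
    """Empirical distribution of the actions found in the training data for one state."""
    counts = [0] * num_actions
    for a in actions_taken:
        counts[a] += 1
    total = len(actions_taken)
    return [c / total for c in counts]


def center_advantages(advantages, mu):
    """Assumed constraint: shift advantages so their mean under mu is zero."""
    baseline = sum(m * a for m, a in zip(mu, advantages))
    return [a - baseline for a in advantages]


def greedy_action(advantages):
    """Control the agent by taking the action with the largest estimated advantage."""
    return max(range(len(advantages)), key=lambda i: advantages[i])


# Example: three possible actions, four logged actions in this state.
mu = behavior_policy([0, 0, 1, 2], num_actions=3)
centered = center_advantages([1.0, 3.0, -1.0], mu)
action = greedy_action(centered)
```

Subtracting a per-state baseline does not change which action has the largest advantage, so the centering affects what the networks are asked to represent rather than the greedy choice itself.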

20240256883. REINFORCEMENT LEARNING USING QUANTILE CREDIT ASSIGNMENT

Inventor(s): Thomas Mesnard of Paris (FR) for DeepMind Technologies Limited, Remi Munos of London (GB) for DeepMind Technologies Limited, Alaa Saade of Montreuil (FR) for DeepMind Technologies Limited, Yunhao Tang of London (GB) for DeepMind Technologies Limited, Mark Daniel Rowland of London (GB) for DeepMind Technologies Limited, Theophane Guillaume Weber of London (GB) for DeepMind Technologies Limited, Wenqi Chen of Cambridge MA (US) for DeepMind Technologies Limited

IPC Code(s): G06N3/092

CPC Code(s): G06N3/092

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions to be performed by an agent interacting with an environment. Implementations of the system can take into account a level of luck in the environment, and hence whilst learning can account for outcomes that were caused by external factors as well as those dependent on the actions of the agent.

20240256884. GENERATING ENVIRONMENT MODELS USING IN-CONTEXT ADAPTATION AND EXPLORATION

Inventor(s): Hado Philip van Hasselt of London (GB) for DeepMind Technologies Limited, Nan Ke of London (GB) for DeepMind Technologies Limited, Chentian Jiang of Edinburgh (GB) for DeepMind Technologies Limited

IPC Code(s): G06N3/092, G06N3/042

CPC Code(s): G06N3/092

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent interacting with an environment to perform a task. In one aspect, one of the methods includes: maintaining context data; receiving a current observation characterizing a current state of the environment; generating a current graph model that represents the environment; selecting, from a possible set of actions and using the current graph model, a current action to be performed by the agent in response to the current observation; controlling the agent to perform the selected current action to cause the environment to transition from the current state into a new state; and updating the context data to include (i) data identifying the selected current action and (ii) a new observation characterizing the new state of the environment.
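The control loop described in the abstract above can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy ring environment, the representation of the graph model as a dictionary of observed transitions rebuilt from the context data at every step, and the rule of preferring actions whose outcome is not yet in the graph. The abstract does not specify how the graph model is generated or how it is used to select actions.

```python
from collections import defaultdict


class ToyEnv:
    """Hypothetical deterministic environment: four states arranged in a ring."""
    ACTIONS = ("left", "right")

    def __init__(self):
        self.state = 0

    def observe(self):
        return self.state

    def step(self, action):
        delta = 1 if action == "right" else -1
        self.state = (self.state + delta) % 4
        return self.state


def build_graph_model(context):
    """Graph model: nodes are observations, edges map an action to the next observation."""
    graph = defaultdict(dict)
    obs = context["initial_observation"]
    for action, next_obs in context["transitions"]:
        graph[obs][action] = next_obs
        obs = next_obs
    return graph


def select_action(graph, obs, actions):
    """Assumed exploration rule: prefer an action not yet tried from this observation."""
    for action in actions:
        if action not in graph[obs]:
            return action
    return actions[0]


def run_episode(env, num_steps):
    obs = env.observe()
    # Maintain context data, starting from the first observation.
    context = {"initial_observation": obs, "transitions": []}
    for _ in range(num_steps):
        graph = build_graph_model(context)                # generate the current graph model
        action = select_action(graph, obs, env.ACTIONS)   # select an action using the model
        new_obs = env.step(action)                        # environment moves to a new state
        context["transitions"].append((action, new_obs))  # update context with (i) and (ii)
        obs = new_obs
    return context, build_graph_model(context)


context, graph = run_episode(ToyEnv(), num_steps=8)
```

Rebuilding the graph from the context at each step keeps the example short; a practical system would more likely update the model incrementally or generate it with a learned model conditioned on the context.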
