DeepMind Technologies Limited patent applications on July 11th, 2024

Patent Applications by DeepMind Technologies Limited on July 11th, 2024

DeepMind Technologies Limited: 3 patent applications

DeepMind Technologies Limited has applied for patents in the areas of G06N3/092 (2) and G06N3/0455 (1).

Keywords in the patent application abstracts include: network, training, agent, target, computer, data, action, embeddings, selection, and policy.



Patent Applications by DeepMind Technologies Limited

20240232580. GENERATING NEURAL NETWORK OUTPUTS BY CROSS ATTENTION OF QUERY EMBEDDINGS OVER A SET OF LATENT EMBEDDINGS (DeepMind Technologies Limited)

Inventor(s): Andrew Coulter Jaegle of London (GB) for DeepMind Technologies Limited, Jean-Baptiste Alayrac of London (GB) for DeepMind Technologies Limited, Sebastian Borgeaud Dit Avocat of London (GB) for DeepMind Technologies Limited, Catalin-Dumitru Ionescu of London (GB) for DeepMind Technologies Limited, Carl Doersch of London (GB) for DeepMind Technologies Limited, Fengning Ding of London (GB) for DeepMind Technologies Limited, Oriol Vinyals of London (GB) for DeepMind Technologies Limited, Olivier Jean Hénaff of London (GB) for DeepMind Technologies Limited, Skanda Kumar Koppula of London (GB) for DeepMind Technologies Limited, Daniel Zoran of London (GB) for DeepMind Technologies Limited, Andrew Brock of London (GB) for DeepMind Technologies Limited, Evan Gerard Shelhamer of London (GB) for DeepMind Technologies Limited, Andrew Zisserman of London (GB) for DeepMind Technologies Limited, Joao Carreira of St. Albans (GB) for DeepMind Technologies Limited

IPC Code(s): G06N3/0455

CPC Code(s): G06N3/0455



Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a network output using a neural network. In one aspect, a method comprises: obtaining: (i) a network input to a neural network, and (ii) a set of query embeddings; processing the network input using the neural network to generate a network output that comprises a respective dimension corresponding to each query embedding in the set of query embeddings, comprising: processing the network input using an encoder block of the neural network to generate a representation of the network input as a set of latent embeddings; and processing: (i) the set of latent embeddings, and (ii) the set of query embeddings, using a cross-attention block that generates each dimension of the network output by cross-attention of a corresponding query embedding over the set of latent embeddings.
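
Illustration (not part of the application): the encoder-then-cross-attention flow described in this abstract can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the attention uses the context vectors as both keys and values with no learned projections, the encoder is assumed to be a cross-attention from a small latent array over the input, and every name and number below (cross_attend, latent_init, the toy vectors) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product cross-attention of each query over the context.

    Assumption: context vectors serve as both keys and values,
    with no learned projection matrices.
    """
    dim = len(context[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        # One attention score per context vector.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted average of the context vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return outputs


# Hypothetical toy data.
network_input = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
latent_init = [[2.0, 0.0], [0.0, 2.0]]
query_embeddings = [[5.0, 0.0], [0.0, 5.0], [1.0, 1.0]]

# Encoder block (assumed design): compress the input into two latent embeddings.
latents = cross_attend(latent_init, network_input)

# Cross-attention block: one output row per query embedding.
network_output = cross_attend(query_embeddings, latents)
```

The size of network_output follows the number of query embeddings rather than the size of the input, which is the property the abstract emphasizes.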


20240232642. REINFORCEMENT LEARNING USING EPISTEMIC VALUE ESTIMATION (DeepMind Technologies Limited)

Inventor(s): Hado Philip van Hasselt of London (GB) for DeepMind Technologies Limited, Simon Schmitt of London (GB) for DeepMind Technologies Limited

IPC Code(s): G06N3/092

CPC Code(s): G06N3/092



Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing a control system for selecting an action to be performed by a reinforcement learning agent, based on an observation characterizing a current state of an environment. The control system is trained based on a distribution of neural network model parameters derived using a database of previous experiences in the environment.
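
Illustration (not part of the application): one common way to act on a distribution over model parameters is to sample a parameter vector and then choose the action that looks best under that sample. The sketch below assumes an independent Gaussian over the weights of a linear action-value model; the abstract does not say how the distribution is derived from past experience, so the mean and standard deviation are simply taken as given. All names and numbers are hypothetical.

```python
import random


def sample_parameters(mean, std, rng):
    """Draw one parameter vector from an independent Gaussian (assumed form)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def action_values(params, observation, num_actions):
    """Linear action values: one weight block of len(observation) per action."""
    n = len(observation)
    return [
        sum(params[a * n + i] * observation[i] for i in range(n))
        for a in range(num_actions)
    ]


def select_action(mean, std, observation, num_actions, rng):
    """Sample parameters, then act greedily with respect to the sample."""
    params = sample_parameters(mean, std, rng)
    values = action_values(params, observation, num_actions)
    return max(range(num_actions), key=lambda a: values[a])


# Hypothetical distribution over 2 actions x 2 observation features.
mean = [1.0, 0.0, 0.0, 2.0]
std = [0.5, 0.5, 0.5, 0.5]
observation = [1.0, 0.2]

rng = random.Random(0)
action = select_action(mean, std, observation, 2, rng)
```

With a wide distribution the sampled action varies from call to call, which produces exploration; as the standard deviations shrink toward zero the choice settles on the action favored by the mean parameters.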


20240232643. LEVERAGING OFFLINE TRAINING DATA AND AGENT COMPETENCY MEASURES TO IMPROVE ONLINE LEARNING (DeepMind Technologies Limited)

Inventor(s): Zheng Wen of Fremont CA (US) for DeepMind Technologies Limited, Benjamin Van Roy of Stanford CA (US) for DeepMind Technologies Limited, Rahul Anant Jain of Malibu CA (US) for DeepMind Technologies Limited, Botao Hao of Redwood City CA (US) for DeepMind Technologies Limited

IPC Code(s): G06N3/092

CPC Code(s): G06N3/092



Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a target action selection policy to control a target agent interacting with an environment. In one aspect, a method comprises: obtaining a set of offline training data, wherein the offline training data characterizes interaction of a baseline agent with an environment as the baseline agent performs actions selected in accordance with a baseline action selection policy; generating a set of online training data that characterizes interaction of the target agent with the environment as the target agent performs actions selected in accordance with the target action selection policy; and training the target action selection policy on both: (i) the offline training data, and (ii) the online training data, wherein the training of the target action selection policy on the offline training data is conditioned on a measure of competency of the baseline agent.
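
Illustration (not part of the application): a simple way to picture training on both data sets is to let the competency measure scale how strongly each offline sample moves the policy. The sketch below assumes a tabular policy of per-state action preferences, a competency value in [0, 1] used as a plain multiplicative weight, and (state, action, reward) tuples as the data format. The abstract does not specify how the conditioning is done, so all of these choices and names are hypothetical.

```python
def train_policy(offline_data, online_data, competency, num_actions, lr=0.1):
    """Update tabular action preferences from offline and online samples.

    Assumption: offline updates are scaled by the baseline agent's
    competency; online updates always have weight 1.
    """
    prefs = {}

    def update(state, action, reward, weight):
        row = prefs.setdefault(state, [0.0] * num_actions)
        row[action] += lr * weight * reward

    for state, action, reward in offline_data:
        update(state, action, reward, competency)
    for state, action, reward in online_data:
        update(state, action, reward, 1.0)
    return prefs


def greedy_action(prefs, state):
    """Pick the action with the highest preference in the given state."""
    row = prefs[state]
    return max(range(len(row)), key=lambda a: row[a])


# Hypothetical samples: (state, action, reward).
offline_data = [("s0", 1, 1.0), ("s0", 1, 1.0)]
online_data = [("s0", 0, 1.0)]

trusting = train_policy(offline_data, online_data, competency=1.0, num_actions=2)
skeptical = train_policy(offline_data, online_data, competency=0.0, num_actions=2)
```

When the baseline agent is treated as fully competent, its two offline samples outweigh the single online sample and the policy prefers action 1; with competency 0 the offline data has no effect and the policy follows the online sample instead.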

