DeepMind Technologies Limited patent applications on April 25th, 2024
DeepMind Technologies Limited: 3 patent applications
DeepMind Technologies Limited has applied for patents in the areas of G06N3/04 (2), G06N3/08 (1), G06N3/092 (1), G10L25/30 (1), G06N3/045 (1)
Keywords appearing in the patent application abstracts include: action, agent, training, data, target, time, current, computer, audio, and selection.
Patent Applications by DeepMind Technologies Limited
Inventor(s): Georg Ostrovski of London (GB) for DeepMind Technologies Limited, William Clinton Dabney of London (GB) for DeepMind Technologies Limited
IPC Code(s): G06N3/08, G06N3/04
CPC Code(s):
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
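The selection procedure in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of the mean as the measure of central tendency, the greedy (argmax) selection rule, and the toy quantile function standing in for a trained quantile function network are all assumptions made for illustration.

```python
import random


def select_action(observation, actions, quantile_fn, num_samples=8, rng=None):
    """Pick an action from sampled quantile estimates of its return.

    quantile_fn(action, observation, tau) is a hypothetical stand-in for a
    trained quantile function network.
    """
    rng = rng or random.Random()
    central = {}
    for action in actions:
        # Randomly sample probability values tau in [0, 1).
        taus = [rng.random() for _ in range(num_samples)]
        # Estimate the quantile of the return distribution at each tau.
        quantiles = [quantile_fn(action, observation, tau) for tau in taus]
        # Measure of central tendency (assumed here to be the mean).
        central[action] = sum(quantiles) / len(quantiles)
    # Assumed selection rule: act greedily on the central-tendency values.
    best = max(central, key=central.get)
    return best, central


# Toy example: returns are uniform on a fixed interval per action, so the
# quantile function is a straight line between the interval's endpoints.
RETURN_RANGES = {"left": (0.0, 1.0), "right": (2.0, 3.0)}


def toy_quantile_fn(action, observation, tau):
    low, high = RETURN_RANGES[action]
    return low + tau * (high - low)


action, central = select_action(
    None, ["left", "right"], toy_quantile_fn, rng=random.Random(0)
)
```

In this toy setup every sampled quantile for "right" exceeds every sampled quantile for "left", so the greedy rule selects "right" regardless of which probability values are drawn.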
Inventor(s): Zheng Wen of Fremont CA (US) for DeepMind Technologies Limited, Benjamin Van Roy of Stanford CA (US) for DeepMind Technologies Limited, Rahul Anant Jain of Malibu CA (US) for DeepMind Technologies Limited, Botao Hao of Redwood City CA (US) for DeepMind Technologies Limited
IPC Code(s): G06N3/092
CPC Code(s):
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a target action selection policy to control a target agent interacting with an environment. In one aspect, a method comprises: obtaining a set of offline training data, wherein the offline training data characterizes interaction of a baseline agent with an environment as the baseline agent performs actions selected in accordance with a baseline action selection policy; generating a set of online training data that characterizes interaction of the target agent with the environment as the target agent performs actions selected in accordance with the target action selection policy; and training the target action selection policy on both: (i) the offline training data, and (ii) the online training data, wherein the training of the target action selection policy on the offline training data is conditioned on a measure of competency of the baseline agent.
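One way to picture training on both data sources is the toy Python sketch below. The abstract does not say how the competency measure enters the training, so everything here is an assumption: a single-state (bandit-style) softmax policy, an imitation term on the baseline agent's actions scaled by a scalar `competency` in [0, 1], and a REINFORCE-style term on the target agent's own rewarded actions.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(logits, offline_actions, online_samples, competency, lr=0.5):
    """One gradient-ascent step on a softmax policy over discrete actions.

    offline_actions: actions chosen by the baseline agent (hypothetical format).
    online_samples:  (action, reward) pairs from the target agent.
    competency:      assumed scalar weight on the offline (imitation) term.
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    # Offline term: log-likelihood gradient of the baseline's actions,
    # scaled by how competent the baseline agent is taken to be.
    for a in offline_actions:
        for i in range(len(logits)):
            indicator = 1.0 if i == a else 0.0
            grad[i] += competency * (indicator - probs[i]) / len(offline_actions)
    # Online term: reward-weighted log-likelihood gradient (REINFORCE-style).
    for a, reward in online_samples:
        for i in range(len(logits)):
            indicator = 1.0 if i == a else 0.0
            grad[i] += reward * (indicator - probs[i]) / len(online_samples)
    return [l + lr * g for l, g in zip(logits, grad)]


# A trusted baseline pulls the policy toward its actions; an untrusted one
# leaves the policy to learn from online data alone.
trusted = train_step([0.0, 0.0], [1, 1, 1], [], competency=1.0)
ignored = train_step([0.0, 0.0], [1, 1, 1], [], competency=0.0)
```

With `competency=0.0` the offline data has no effect on the logits, while with `competency=1.0` the policy shifts toward the baseline agent's preferred action; intermediate values interpolate between the two.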
Inventor(s): Aaron Gerard Antonius van den Oord of London (GB) for DeepMind Technologies Limited, Sander Etienne Lea Dieleman of London (GB) for DeepMind Technologies Limited, Nal Emmerich Kalchbrenner of Amsterdam (NL) for DeepMind Technologies Limited, Karen Simonyan of London (GB) for DeepMind Technologies Limited, Oriol Vinyals of London (GB) for DeepMind Technologies Limited
IPC Code(s): G10L25/30, G06N3/04, G06N3/045, G06N3/048, G10L13/06
CPC Code(s):
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
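The per-time-step loop in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the patented network: the "convolutional subnetwork" is reduced to a single dilated causal filter with fixed weights, the output layer is a linear map followed by a softmax over eight hypothetical quantized amplitude levels, and the next sample is drawn from the resulting score distribution (a step the abstract does not spell out).

```python
import math
import random


def conv_subnetwork(current, kernel, dilation=1):
    """Toy dilated causal convolution evaluated for the next time step.

    Only samples that precede the time step are read; missing history is
    treated as zero. Returns a scalar 'alternative representation'.
    """
    feature = 0.0
    for k, w in enumerate(kernel):
        idx = len(current) - 1 - k * dilation
        if idx >= 0:
            feature += w * current[idx]
    return math.tanh(feature)


def output_layer(feature, weights, biases):
    """Map the representation to a score distribution over possible samples."""
    logits = [w * feature + b for w, b in zip(weights, biases)]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(num_steps, kernel, weights, biases, dilation=1, seed=0):
    rng = random.Random(seed)
    sequence = []  # audio samples generated so far (quantized level indices)
    for _ in range(num_steps):
        feature = conv_subnetwork(sequence, kernel, dilation)
        scores = output_layer(feature, weights, biases)
        # Assumption: the next sample is drawn from the score distribution.
        sample = rng.choices(range(len(scores)), weights=scores)[0]
        sequence.append(sample)
    return sequence


NUM_LEVELS = 8  # hypothetical number of quantized amplitude levels
audio = generate(
    num_steps=16,
    kernel=[0.3, -0.2, 0.1],
    weights=[0.5 * i for i in range(NUM_LEVELS)],
    biases=[0.0] * NUM_LEVELS,
    dilation=2,
)
```

Each generated sample is appended to the current sequence before the next step, so every step conditions only on samples that precede it in the output sequence.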