Google LLC patent applications published on September 21st, 2023
Optical Sensor in a Button of an Electronic Device (18186591)
Abstract
A computing device includes a housing defining an aperture extending therethrough. The computing device further includes a button that includes a switch and a body. The body is at least partially disposed within the aperture and movable relative to the housing between a first position and a second position to selectively actuate the switch to cause the computing device to perform a function. The button includes a printed circuit electrically coupled to one or more processors of the computing device. The button includes an optical sensor disposed within the interior of the body and configured to obtain biometric data for determining one or more biometrics of a user.
Inventor
Debanjan Mukherjee
EMISSIVE DISPLAY CONFIGURED WITH THROUGH-DISPLAY ZERO-DISTANCE PROXIMITY SENSOR (18006548)
In Simple Terms
This is a smart device with a screen that can sense when something is close to it. It can adjust the amount of light it sends out to detect objects and can turn off the screen or touchscreen when something is too close, and turn it back on when the object is at a sufficient distance. The device has a brain and memory that use instructions to decide when to turn the screen or touchscreen on or off, based on how much light comes back to the sensor.
Abstract
A mobile computing device includes an emissive display that includes a touchscreen and a proximity sensor. The proximity sensor includes a transmitter configured to transmit electromagnetic radiation through the display and a receiver of electromagnetic radiation configured to receive electromagnetic radiation transmitted by the transmitter, reflected off an object facing the emissive display and received through the emissive display. The proximity sensor is configured for generating a quantitative output signal based on an amount of the received electromagnetic radiation, and the transmitter is configured to transmit a first predetermined amount of light when a distance between the object and the display is greater than a near threshold distance between the object and the display and is configured to transmit a second predetermined amount of light when the distance between the object and the display is less than the near threshold distance. The second predetermined amount is greater than the first predetermined amount. A processor is configured for receiving the generated quantitative output signal, and memory stores instructions that, when executed by the processor, cause the processor to deactivate the touchscreen and/or the emissive display when the touchscreen and/or the emissive display is activated and when the quantitative output signal increases above a high threshold value and to activate the touchscreen and/or the emissive display when the touchscreen and/or the emissive display is deactivated and when the quantitative output signal is below a low threshold value, the low threshold value being less than the high threshold value.
Inventor
YungSheng Chang
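The two-threshold behavior described in the proximity-sensor abstract above (18006548) is a form of hysteresis. The following is a minimal sketch of that rule, not the applicant's implementation: the class name, function name, threshold values, and power levels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProximityController:
    """Hypothetical controller; the threshold values are illustrative only."""
    low_threshold: float = 20.0    # reactivate when the signal falls below this
    high_threshold: float = 80.0   # deactivate when the signal rises above this
    screen_on: bool = True

    def update(self, signal: float) -> bool:
        """Apply the two-threshold rule to one sensor reading."""
        if self.screen_on and signal > self.high_threshold:
            self.screen_on = False
        elif not self.screen_on and signal < self.low_threshold:
            self.screen_on = True
        return self.screen_on


def transmit_power(distance_mm: float, near_threshold_mm: float = 10.0,
                   first_amount: float = 1.0, second_amount: float = 2.0) -> float:
    """Pick the emitter output: the larger amount is used inside the near threshold."""
    return second_amount if distance_mm < near_threshold_mm else first_amount


controller = ProximityController()
states = [controller.update(s) for s in (10, 90, 50, 15, 50)]
print(states)  # [True, False, False, True, True]
```

Because the low threshold sits below the high threshold, a reading between the two (50 in the example) leaves the screen in whatever state it was already in, which avoids flicker near a single cutoff.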
COUPLING NARROWBAND PROJECTOR SOURCE LIGHT INTO DISPLAY WAVEGUIDES (17695995)
Abstract
A system includes a feedback loop that includes a light engine to generate light, a light engine controller to control operation of the light engine, a scanning device to scan a light beam across a range of scan angles to an incoupler of a waveguide, and a photo-sensor to measure an amount of light outcoupled through the incoupler of the waveguide at the range of incident angles. The light engine controller adjusts one or more of a pulse duration, a phase, or a pulse frequency of the scanned light, based on an incident angle of the scanned light and the measured amount of light.
Inventor
Shreyas Potnis
ACTIVE ACOUSTIC RIPPLE CANCELLATION FOR MEMS MIRRORS (17694845)
Abstract
Systems, devices, and methods are described for mitigating or eliminating distortion patterns (such as those caused by one or more high-volume audio sources) in a display system such as a laser projection system. Frequency components of an incoming sound that correspond to one or more resonant frequencies of an optical reflector of the display system are determined to exceed a defined volume threshold. Responsive to that determination, a magnitude and phase of one or more harmonic motions of the optical reflector are measured. Sound waves are generated to destructively interfere with at least one frequency component corresponding to the resonant frequencies of the optical reflector.
Inventor
Sangtak Park
Power Throttling Mechanism Using Instruction Rate Limiting in High Power Machine-Learning ASICs (18201974)
Abstract
A system contains a machine learning application specific integrated circuit (ASIC) and a power supply unit. The power supply unit and the ASIC are configured to be in data communication through dedicated pins on the ASIC and the power supply unit. The power supply unit detects a present power consumption of the ASIC. Upon determining that a threshold condition has been met, the power supply unit sends a digital signal to the ASIC. The ASIC contains a synchronizer which synchronizes the digital signal to be consistent with the ASIC's internal clock frequency. A chip manager uses the synchronized signal and other signals to generate a throttling mask. The throttling mask is sent to a sequencer of the ASIC, which then limits the instruction flow into the processing units of the ASIC based on the mask. This in turn limits the power being consumed by the ASIC.
Inventor
Houle Gan
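To make the throttling-mask idea in the abstract above (18201974) concrete, here is a small software sketch under stated assumptions. The abstract does not say how the mask is encoded; this example assumes a repeating per-cycle issue mask, and the function names, `period`, and `allowed` parameters are hypothetical.

```python
from itertools import cycle, islice


def make_throttle_mask(throttle: bool, period: int = 4, allowed: int = 1) -> list[bool]:
    """Return a repeating issue mask.

    When throttled, only `allowed` of every `period` slots may issue.
    """
    if not throttle:
        return [True] * period
    return [i < allowed for i in range(period)]


def issue(instructions: list[str], mask: list[bool], cycles: int) -> list[str]:
    """Issue at most one pending instruction per cycle, only in enabled slots."""
    pending = list(instructions)
    issued = []
    for enabled in islice(cycle(mask), cycles):
        if enabled and pending:
            issued.append(pending.pop(0))
    return issued


program = [f"i{n}" for n in range(8)]
full_rate = issue(program, make_throttle_mask(False), cycles=8)
throttled = issue(program, make_throttle_mask(True), cycles=8)
print(len(full_rate), len(throttled))  # 8 2
```

Fewer instructions issued per unit time means fewer active processing units per cycle, which is the mechanism the abstract relies on to cap power.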
PROVIDING COMPOSITE GRAPHICAL ASSISTANT INTERFACES FOR CONTROLLING VARIOUS CONNECTED DEVICES (18200979)
Abstract
Methods, apparatus, systems, and computer-readable media are provided for tailoring composite graphical assistant interfaces for interacting with multiple different connected devices. The composite graphical assistant interfaces can be generated in response to a user providing a request for an automated assistant to cause a connected device to perform a particular function. In response to the automated assistant receiving the request, the automated assistant can identify other connected devices, and other functions capable of being performed by the other connected devices. The other functions can then be mapped to various graphical control elements in order to provide a composite graphical assistant interface from which the user can interact with different connected devices. Each graphical control element can be arranged to reflect how each connected device is operating simultaneous to the presentation of the composite graphical assistant interface.
Inventor
Yuzhao Ni
ADAPTIVE CONTENT CONTROL AND DISPLAY FOR INTERNET MEDIA (18202055)
Abstract
This disclosure relates to adaptive content control and display for internet media. A playback component provides for playback of media content. An input component detects user inputs during playback of the content. In response to the user inputs being detected, a menu component displays a level of a pivot menu during playback of the content. The pivot menu is displayed on top, or in front, of a portion of the content during playback, and the pivot menu can be at least partially transparent to enable consumption of the content to continue without complete obstruction.
Inventor
Shivakumar Littoo Rajaraman
Activity-Dependent Audio Feedback Themes for Touch Gesture Inputs (18164694)
Abstract
Systems and methods that provide audio feedback in response to gesture validity can provide a more intuitive interface that can train users to correctly complete gestures. Moreover, systems and methods that provide line-specific audio feedback can provide more specific feedback that can allow a user to better understand what sensing line is being contacted. The systems and methods can further include basing the audio feedback at least in part on obtained activity data, such that invalid and valid feedbacks can provide different sounds dependent on the determined activity state.
Inventor
Daniel Lee Giles
VECTOR PROCESSING UNIT (18074990)
Abstract
A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memory includes memory banks configured to store data used by each of the processor units to perform the arithmetic operations. The processor units and the vector memory are tightly coupled within an area of the vector processing unit such that data communications are exchanged at a high bandwidth based on the placement of respective processor units relative to one another, and based on the placement of the vector memory relative to each processor unit.
Inventor
William Lacy
System For Live Migration of Virtual Machines With Assigned Peripheral Devices (18130652)
Abstract
Hardware transactions or other techniques, such as custom PCIe handling devices, are used to atomically move pages from one host's memory to another host's memory. The hosts are connected by one or two non-transparent bridges (NTBs), which make each host's memory and devices available to the other, while allowing each host to reboot independently.
Inventor
Benjamin Charles Serebrin
Memory Error Prevention By Proactive Memory Poison Recovery (17695406)
Abstract
The disclosed technology provides techniques, systems, and apparatus for proactively detecting, containing, and recovering from uncorrectable memory errors in a distributed computing environment. An aspect of the disclosed technology includes scanning, by a scanner of a host machine, memory of the host machine for errors. After the scanner detects an error, the scanner may generate an error notification. The scanner may transmit the error notification to one or more processors of the host machine to implement mitigation techniques.
Inventor
Jue Wang
EFFICIENTLY ALLOCATING MEMORY ON NEURAL NETWORK COMPUTE TILES (17919164)
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data indicating a neural network comprising a plurality of layers; for each layer in a subset of the plurality of layers: assigning a subset of the plurality of computing units to at least partially perform inference computations associated with the layer; determining a memory size and a common memory address for the respective addressable memory unit of each computing unit assigned for the layer; and generating a shared instruction comprising a memory allocation instruction that, when executed by each of the subset of the plurality of computing units, causes the computing unit to store a result of performing inference computations associated with the layer in the determined common memory address with the determined memory size in the addressable memory of the computing unit.
Inventor
Jack Liu
Hybrid and Hierarchical Multi-Trial and OneShot Neural Architecture Search on Datacenter Machine Learning Accelerators (17721873)
Abstract
According to various implementations, generally disclosed herein is a hybrid and hierarchical neural architecture search (NAS) approach. The approach includes performing a search space partitioning scheme to divide the search space into sub-search spaces. The approach further includes performing a first type of NAS, such as a Multi-trial NAS, to cover a search across the sub-search spaces. The approach also includes performing a second type of NAS, such as a One-Shot NAS, to cover each sub-search space. The approach further includes automatically stopping the second type of NAS based on one or more early stopping criteria.
Inventor
Sheng Li
Time Series Forecasting (18323766)
Abstract
A method for time series forecasting includes receiving a time series forecasting query from a user requesting the data processing hardware to perform a plurality of time series forecasts. Each time series forecast is a forecast of future data based on respective current data. Simultaneously, for each time series forecast of the plurality of time series forecasts requested by the time series forecasting query, the method includes training a plurality of models for the respective time series forecast. The method also includes determining which model of the plurality of models best fits the respective time series forecast and forecasting the future data based on the determined best fitting model and the respective current data. The method also includes returning, to the user, the forecasted future data for each of the plurality of time series forecasts requested by the time series forecasting query.
Inventor
Xi Cheng
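The train-several-models-then-pick-the-best flow in the time series abstract above (18323766) can be sketched as follows. This is an illustration only: the three candidate models (mean, naive, drift), the held-out-tail scoring, and the thread pool are assumptions, since the abstract does not name the models, the fit criterion, or the concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def mean_model(history, horizon):
    return [mean(history)] * horizon


def naive_model(history, horizon):
    return [history[-1]] * horizon


def drift_model(history, horizon):
    # Extend the straight line through the first and last observations.
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


MODELS = {"mean": mean_model, "naive": naive_model, "drift": drift_model}


def forecast_best(series, horizon=2, holdout=2):
    """Score each candidate on a held-out tail, then forecast with the winner."""
    train, test = series[:-holdout], series[-holdout:]

    def error(name):
        predicted = MODELS[name](train, holdout)
        return sum((p - t) ** 2 for p, t in zip(predicted, test))

    best = min(MODELS, key=error)
    return best, MODELS[best](series, horizon)


def forecast_many(all_series, horizon=2):
    """Run the per-series selection concurrently, one task per requested forecast."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda s: forecast_best(s, horizon), all_series))


results = forecast_many([[1, 2, 3, 4, 5, 6], [5, 5, 5, 5, 5, 5]])
print(results[0])  # ('drift', [7.0, 8.0])
```

For the constant series all three candidates tie, and `min` keeps the first one in dictionary order; a production system would need an explicit tie-breaking rule.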
WEB PAGE TRANSFORMER FOR STRUCTURE INFORMATION EXTRACTION (18200813)
Abstract
The technology provides a rich attention mechanism for structured information extraction of web pages and other electronic documents. An input layer of a model obtains information associated with the document, including field tokens representing respective fields to be extracted from the document, associated structured document type tokens, and text tokens from a text sequence in the document. An encoder connects the field tokens, the structured document type tokens and the text tokens according to a set of different attention patterns. The encoder generates an overall token representation based on the set of different attention patterns. An output layer of the model extracts a final text span for each of the respective fields from the set of text tokens. The extracted final text span for each of the respective fields is stored in memory, and can be produced in response to a search query, analytics evaluation or other request.
Inventor
Qifan Wang
Enclave Fork Support (18200648)
Abstract
A fork support is provided for duplicating an application running inside an enclave entity. In this regard, a request to duplicate an application running inside a first enclave may be received by one or more processors of a host computing device of the first enclave. A snapshot of the first enclave including the application may be generated. The snapshot may be encrypted with a snapshot key and copied to untrusted memory of the host. A second enclave may be generated. The snapshot key may be sent from the first enclave to the second enclave through a secure communication channel. The encrypted snapshot may be copied from the untrusted memory of the host into the second enclave. The encrypted snapshot may be decrypted inside the second enclave with the snapshot key.
Inventor
Keith Moyer
Systems and Methods for Machine-Learned Prediction of Semantic Similarity Between Documents (18321424)
Abstract
Systems and methods of the present disclosure are directed to a method for predicting semantic similarity between documents. The method can include obtaining a first document and a second document. The method can include parsing the first document into a plurality of first textual blocks and the second document into a plurality of second textual blocks. The method can include processing each of the plurality of first textual blocks and the second textual blocks with a machine-learned semantic document encoding model to obtain a first document encoding and a second document encoding. The method can include determining a similarity metric descriptive of a semantic similarity between the first document and the second document based on the first document encoding and the second document encoding.
Inventor
Liu Yang
Multi-Stage Machine Learning Model Synthesis for Efficient Inference (18007379)
Abstract
Example implementations of the present disclosure combine efficient model design and dynamic inference. With a standalone lightweight model, the unnecessary computation on easy examples is avoided, and the information extracted by the lightweight model also guides the synthesis of a specialist network from the basis models. With extensive experiments on ImageNet, it is shown that a proposed example BasisNet is particularly effective for image classification, and a BasisNet-MV3 achieves 80.3% top-1 accuracy with 290M MAdds without early termination.
Inventor
Li Zhang
METHODS AND APPARATUS FOR PERFORMING PHASE OPERATIONS (18181419)
Abstract
Methods, systems, and apparatus for performing phase operations. In one aspect, a method for performing a same phase operation on a first and second qubit using a third qubit prepared in a phased plus state includes: performing a first NOT operation on the third qubit; computing a controlled adder operation on the first, second and third qubit, comprising encoding the result of the controlled adder operation in a fourth qubit; performing a square of the phase operation on the fourth qubit; uncomputing the controlled adder operation on the first, second and third qubit; performing a CNOT operation between the first qubit and the third qubit, wherein the first qubit acts as the control; performing a CNOT operation between the second qubit and the third qubit, wherein the second qubit acts as the control; and performing a second NOT operation on the third qubit.
Inventor
Craig Gidney
Optimal Time-to-Event Modeling for Longitudinal Prediction of Open Entities (18183291)
Abstract
A method for optimal time-to-event (TTE) modeling includes obtaining a forecast request requesting performance of a TTE forecast forecasting an amount of time an event will occur after a starting point in time. The method includes obtaining a cutoff value representing an amount of time after the starting point in time that the event has not occurred. The method also includes forecasting, using an uncertainty forecasting model, the amount of time the event will occur after the starting point in time and updating the forecasted amount of time based on the cutoff value. The method also includes returning the updated forecasted amount of time the event will occur after the starting point in time.
Inventor
Jingtao Wang
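One plausible reading of "updating the forecasted amount of time based on the cutoff value" in the abstract above (18183291) is to condition the forecast on the event not having happened by the cutoff. The sketch below assumes the uncertainty forecasting model can be represented by sampled event times; the sample values, the function name, and the fallback rule are hypothetical and not taken from the application.

```python
from statistics import mean


def update_forecast(samples, cutoff):
    """Condition sampled event times on the event not having occurred by `cutoff`."""
    surviving = [t for t in samples if t > cutoff]
    if not surviving:
        # Every sampled time has already passed; fall back to the cutoff itself.
        return float(cutoff)
    return mean(surviving)


samples = [2.0, 4.0, 6.0, 8.0, 10.0]   # hypothetical draws from an uncertainty model
initial = mean(samples)
updated = update_forecast(samples, cutoff=5.0)
print(initial, updated)  # 6.0 8.0
```

Knowing that the event is still open at time 5 rules out the early samples, so the updated forecast moves later than the initial one.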
Systems and Methods for Generating Splat-Based Differentiable Two-Dimensional Renderings (18014307)
Abstract
Systems and methods of the present disclosure are directed to a method that can include obtaining a 3D mesh comprising polygons and texture/shading data. The method can include rasterizing the 3D mesh to obtain a 2D raster comprising pixels and coordinates respectively associated with a subset of pixels. The method can include determining an initial color value for the subset of pixels based on the coordinates of the pixel and the associated shading/texture data. The method can include constructing a splat at the coordinates of a respective pixel. The method can include determining an updated color value for a respective pixel based on a weighting of the subset of splats to generate a 2D rendering of the 3D mesh based on the coordinates of a pixel and a splat.
Inventor
Kyle Adam Genova
SELECTIVE BLACK LEVEL CONTROL IN ACTIVE MATRIX DISPLAYS (17910641)
Abstract
A method includes: (a) receiving initial image frame data to display an image frame on a display panel, a luminance of each pixel of the display corresponding to a gray level; (b) identifying dark pixels at or below a first threshold gray level; (c) identifying pixels to be modified as a subset of the dark pixels neighbored by at least one bright pixel exceeding a second threshold gray level; (d) increasing by an incremental amount the gray level of the pixels to be modified, providing modified image frame data composed of: (i) the dark pixels that are neighbored by at least one bright pixel having gray levels that have been increased by the incremental gray level amount, and (ii) other pixels that have gray levels from the initial image frame data; and (e) displaying the image frame using the modified image frame data.
Inventor
Sangmoo Choi
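Steps (b) through (d) of the black level abstract above (17910641) map directly onto a per-pixel pass over the frame. The sketch below is a simplified illustration: the threshold values, the increment, and the choice of an 8-connected neighborhood are assumptions, since the abstract does not specify them.

```python
def lift_dark_neighbors(frame, dark_max=0, bright_min=200, increment=4):
    """Raise dark pixels that have at least one bright 8-connected neighbor."""
    rows, cols = len(frame), len(frame[0])
    out = [row[:] for row in frame]          # other pixels keep their gray levels
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] > dark_max:
                continue                     # not a dark pixel
            has_bright = any(
                frame[nr][nc] > bright_min
                for nr in range(max(r - 1, 0), min(r + 2, rows))
                for nc in range(max(c - 1, 0), min(c + 2, cols))
                if (nr, nc) != (r, c)
            )
            if has_bright:
                out[r][c] = frame[r][c] + increment
    return out


frame = [
    [0, 0, 0, 0],
    [0, 0, 255, 0],
    [0, 0, 0, 0],
]
modified = lift_dark_neighbors(frame)
for row in modified:
    print(row)
# [0, 4, 4, 4]
# [0, 4, 255, 4]
# [0, 4, 4, 4]
```

Only the dark pixels bordering the bright one are lifted; the left column has no bright neighbor and stays at gray level 0.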
DISPLAY DEVICE WITH HARDWARE THAT DIMS PIXELS (17920453)
Abstract
An electronic device includes a display device that includes a plurality of pixels that form an active area of the display device, the active area of the display device defining a rounded edge portion, wherein multiple pixels that form at least part of the rounded edge portion have stepped relative brightness levels determined by hardware structures of the multiple pixels, such that a first pixel of the multiple pixels that is located at a first position in the rounded edge portion has a first relative brightness level defined by a first pixel hardware structure and a second pixel of the multiple pixels that is located at a second position has a second relative brightness level defined by a second pixel hardware structure, the first relative brightness level being different from the second relative brightness level, the first pixel hardware structure being different from the second pixel hardware structure.
Inventor
Sangmoo Choi
Deliberation by Text-Only and Semi-Supervised Training (18186157)
Abstract
A method of text-only and semi-supervised training for deliberation includes receiving training data including unspoken textual utterances that are each not paired with any corresponding spoken utterance of non-synthetic speech, and training a deliberation model that includes a text encoder and a deliberation decoder on the unspoken textual utterances. The method also includes receiving, at the trained deliberation model, first-pass hypotheses and non-causal acoustic embeddings. The first-pass hypotheses are generated by a recurrent neural network-transducer (RNN-T) decoder for the non-causal acoustic embeddings encoded by a non-causal encoder. The method also includes encoding, using the text encoder, the first-pass hypotheses generated by the RNN-T decoder, and generating, using the deliberation decoder attending to both the first-pass hypotheses and the non-causal acoustic embeddings, second-pass hypotheses.
Inventor
Ke Hu
Using Non-Parallel Voice Conversion for Speech Conversion Models (17660487)
Abstract
A method includes receiving a set of training utterances each including a non-synthetic speech representation of a corresponding utterance, and for each training utterance, generating a corresponding synthetic speech representation by using a voice conversion model. The non-synthetic speech representation and the synthetic speech representation form a corresponding training utterance pair. At each of a plurality of output steps for each training utterance pair, the method also includes generating, for output by a speech recognition model, a first probability distribution over possible non-synthetic speech recognition hypotheses for the non-synthetic speech representation and a second probability distribution over possible synthetic speech recognition hypotheses for the synthetic speech representation. The method also includes determining a consistent loss term for the corresponding training utterance pair based on the first and second probability distributions and updating parameters of the speech recognition model based on the consistent loss term.
Inventor
Andrew M. Rosenberg
4-bit Conformer with Accurate Quantization Training for Speech Recognition (18186774)
Abstract
A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes quantizing the trained ASR model to an integer target fixed-bit width. The quantized trained ASR model includes a plurality of weights. Each weight of the plurality of weights includes an integer with the target fixed-bit width. The method includes providing the quantized trained ASR model to a user device.
Inventor
Shaojin Ding
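The final quantization step in the 4-bit Conformer abstract above (18186774) can be illustrated with a symmetric per-tensor scheme. This is a generic sketch, not the method claimed in the application: it does not perform quantization aware training, and the scaling rule and function names are assumptions.

```python
def quantize(weights, bits=4):
    """Symmetric per-tensor quantization to signed integers of the given width."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1   # -8..7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs else 1.0
    quantized = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integer weights back to approximate real values."""
    return [q * scale for q in quantized]


weights = [0.7, -0.3, 0.12, 0.0]
q, scale = quantize(weights)
print(q)  # [7, -3, 1, 0]
```

Each stored weight is now an integer in the 4-bit range, and the single floating-point `scale` is enough to recover approximate values with `dequantize`.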
Rare Word Recognition with LM-aware MWER Training (18187222)
Abstract
A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum word error rate (MWER) training in the presence of the external language model.
Inventor
Weiran Wang
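The third likelihood score in the rare word abstract above (18187222) combines the decoder score and the external language model score using fusion weights. The sketch below assumes a weighted sum of log-likelihoods and uses fixed weights in place of the learnable fusion module; the hypotheses, scores, and weights are invented for illustration.

```python
def fuse(first_score, second_score, fusion_weights):
    """Combine decoder and language-model log-likelihoods with per-hypothesis weights."""
    w_first, w_second = fusion_weights
    return w_first * first_score + w_second * second_score


hypotheses = [
    # (transcription, decoder log-likelihood, external LM log-likelihood, fusion weights)
    ("call ana", -1.0, -6.0, (1.0, 0.5)),
    ("call anna", -1.5, -2.0, (1.0, 0.5)),
]
ranked = sorted(hypotheses, key=lambda h: fuse(h[1], h[2], h[3]), reverse=True)
best = ranked[0][0]
print(best)  # call anna
```

The decoder alone prefers the first hypothesis, but the language model score shifts the fused ranking toward the second, which is the effect that helps with rare words.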
Scalable Model Specialization Framework for Speech Model Personalization (18184630)
Abstract
A method for speech conversion includes obtaining a speech conversion model configured to convert input utterances of human speech directly into corresponding output utterances of synthesized speech. The method further includes receiving a speech conversion request including input audio data corresponding to an utterance spoken by a target speaker associated with atypical speech and a speaker identifier uniquely identifying the target speaker. The method includes activating, using the speaker identifier, a particular sub-model for biasing the speech conversion model to recognize a type of the atypical speech associated with the target speaker identified by the speaker identifier. The method includes converting, using the speech conversion model biased by the activated particular sub-model, the input audio data corresponding to the utterance spoken by the target speaker associated with atypical speech into output audio data corresponding to a synthesized canonical fluent speech representation of the utterance spoken by the target speaker.
Inventor
Fadi Biadsy
Freeze Words (18322149)
Abstract
A method for detecting freeze words includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device associated with the user. The method also includes processing, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation. The speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data. Before the predetermined duration of non-speech, the method includes detecting a freeze word in the audio data. In response to detecting the freeze word in the audio data, the method also includes triggering a hard microphone closing event at the user device. The hard microphone closing event prevents the user device from capturing any audio subsequent to the freeze word.
Inventor
Matthew Sharifi
End-to-End Streaming Keyword Spotting (18322207)
Abstract
A method for training hotword detection includes receiving a training input audio sequence including a sequence of input frames that define a hotword that initiates a wake-up process on a device. The method also includes feeding the training input audio sequence into an encoder and a decoder of a memorized neural network. Each of the encoder and the decoder of the memorized neural network includes sequentially-stacked singular value decomposition filter (SVDF) layers. The method further includes generating a logit at each of the encoder and the decoder based on the training input audio sequence. For each of the encoder and the decoder, the method includes smoothing each respective logit generated from the training input audio sequence, determining a max pooling loss from a probability distribution based on each respective logit, and optimizing the encoder and the decoder based on all max pooling losses associated with the training input audio sequence.
Inventor
Raziel Alvarez Guevara
Emotionally Intelligent Responses to Information Seeking Questions (17655544)
Abstract
A method for generating emotionally intelligent responses to information seeking questions includes receiving audio data corresponding to a query spoken by a user and captured by an assistant-enabled device associated with the user, and processing, using a speech recognition model, the audio data to determine a transcription of the query. The method also includes performing query interpretation on the transcription of the query to identify an emotional state of the user that spoke the query, and an action to perform. The method also includes obtaining a response preamble based on the emotional state of the user and performing the identified action to obtain information responsive to the query. The method further includes generating a response including the obtained response preamble followed by the information responsive to the query.
Inventor
Madelaine Plauché
SUGGESTING AN ALTERNATIVE INTERFACE WHEN ENVIRONMENTAL INTERFERENCE IS EXPECTED TO INHIBIT CERTAIN AUTOMATED ASSISTANT INTERACTIONS (18200518)
Abstract
Implementations set forth relate to suggesting an alternate interface modality when an automated assistant and/or a user is expected to not understand a particular interaction between the user and the automated assistant. In some instances, the automated assistant can pre-emptively determine that a forthcoming and/or ongoing interaction between a user and an automated assistant may experience interference. Based on this determination, the automated assistant can provide an indication that the interaction may not be successful and/or that the user should interact with the automated assistant through a different modality. For example, the automated assistant can render a keyboard interface at a portable computing device when the automated assistant determines that an audio interface of the portable computing device is experiencing interference.
Inventor
Matthew Sharifi
TRANSFERRING AN AUTOMATED ASSISTANT ROUTINE BETWEEN CLIENT DEVICES DURING EXECUTION OF THE ROUTINE (18201987)
Abstract
Transferring (e.g., automatically) an automated assistant routine between client devices during execution of the automated assistant routine. The automated assistant routine can correspond to a set of actions to be performed by one or more agents and/or one or more devices. While content, corresponding to an action of the routine, is being rendered at a particular device, the user may walk away from the particular device and toward a separate device. The automated assistant routine can be automatically transferred in response, and the separate device can continue rendering the content for the user.
Inventor
Yuzhao Ni
PERFORMING SUBTASK(S) FOR A PREDICTED ACTION IN RESPONSE TO A SEPARATE USER INTERACTION WITH AN AUTOMATED ASSISTANT PRIOR TO PERFORMANCE OF THE PREDICTED ACTION (18202236)
Abstract
Implementations herein relate to pre-caching data, corresponding to predicted interactions between a user and an automated assistant, using data characterizing previous interactions between the user and the automated assistant. An interaction can be predicted based on details of a current interaction between the user and an automated assistant. One or more predicted interactions can be initialized, and/or any corresponding data pre-cached, prior to the user commanding the automated assistant in furtherance of the predicted interaction. Interaction predictions can be generated using a user-parameterized machine learning model, which can be used when processing input(s) that characterize a recent user interaction with the automated assistant. The predicted interaction(s) can include action(s) to be performed by third-party application(s).
Inventor
Lucas Mirelmann
Hotphrase Triggering Based On A Sequence Of Detections (18323725)
Abstract
A method includes receiving audio data corresponding to an utterance spoken by a user and captured by a user device. The utterance includes a command for a digital assistant to perform an operation. The method also includes determining, using a hotphrase detector configured to detect each trigger word in a set of trigger words associated with a hotphrase, whether any of the trigger words in the set of trigger words are detected in the audio data during the corresponding fixed-duration time window. The method also includes identifying, in the audio data corresponding to the utterance, the hotphrase when each other trigger word in the set of trigger words was also detected in the audio data. The method also includes triggering an automated speech recognizer to perform speech recognition on the audio data when the hotphrase is identified in the audio data corresponding to the utterance.
Inventor
Victor Carbune
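The sequence-of-detections idea in the hotphrase abstract above can be illustrated with a minimal Python sketch. It is not the claimed method: the function name hotphrase_detected, the (word, timestamp) detection format, and the rule that each trigger word must follow the previous one within window_s seconds are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def hotphrase_detected(
    detections: Iterable[Tuple[str, float]],
    trigger_words: List[str],
    window_s: float,
) -> bool:
    """Return True if the trigger words occur in order, each within
    window_s seconds of the previous trigger word's detection.

    Assumption: detections are (word, time_in_seconds) pairs emitted by
    some per-word detector; the abstract does not specify this format.
    """
    next_idx = 0      # index of the trigger word we are waiting for
    last_time = None  # time at which the previous trigger word fired
    for word, t in sorted(detections, key=lambda d: d[1]):
        # The window for the awaited word has lapsed: start over.
        if next_idx > 0 and t - last_time > window_s:
            next_idx, last_time = 0, None
        if word == trigger_words[next_idx]:
            next_idx, last_time = next_idx + 1, t
            if next_idx == len(trigger_words):
                return True  # whole hotphrase seen: trigger the recognizer
        elif word == trigger_words[0]:
            # A fresh first word restarts the sequence.
            next_idx, last_time = 1, t
    return False
```

A caller would pass the detector's output together with the ordered trigger words, for example hotphrase_detected(events, ["play", "next", "song"], 1.0), and start speech recognition only when the function returns True.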
Optimizing Personal VAD for On-Device Speech Recognition (18123060)
Abstract
A computer-implemented method includes receiving a sequence of acoustic frames corresponding to an utterance and generating a reference speaker embedding for the utterance. The method also includes receiving a target speaker embedding for a target speaker and generating feature-wise linear modulation (FiLM) parameters including a scaling vector and a shifting vector based on the target speaker embedding. The method also includes generating an affine transformation output that scales and shifts the reference speaker embedding based on the FiLM parameters. The method also includes generating a classification output indicating whether the utterance was spoken by the target speaker based on the affine transformation output.
Inventor
Shaojin Ding
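The FiLM step in the personal VAD abstract above can be sketched in a few lines of plain Python. This is a simplified illustration, not the patented model: the single linear layers that produce the scaling and shifting vectors, the sigmoid classifier, and all function and parameter names are assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def linear(w: Matrix, b: Vector, x: Vector) -> Vector:
    """Plain matrix-vector product plus bias."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def film_parameters(
    target_emb: Vector, w_scale: Matrix, b_scale: Vector,
    w_shift: Matrix, b_shift: Vector,
) -> Tuple[Vector, Vector]:
    """Derive the scaling vector (gamma) and shifting vector (beta)
    from the target speaker embedding (assumed: one linear layer each)."""
    gamma = linear(w_scale, b_scale, target_emb)
    beta = linear(w_shift, b_shift, target_emb)
    return gamma, beta


def film_modulate(reference_emb: Vector, gamma: Vector, beta: Vector) -> Vector:
    """Affine transformation: scale and shift the reference embedding feature-wise."""
    return [g * r + b for g, r, b in zip(gamma, reference_emb, beta)]


def classify(modulated: Vector, w_out: Vector, b_out: float) -> float:
    """Assumed classifier head: probability that the target speaker spoke."""
    logit = sum(w * m for w, m in zip(w_out, modulated)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))
```

The point of the design is that the target speaker embedding never gets concatenated with the reference embedding; it only controls a per-feature scale and shift applied to it.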
Generalized Automatic Speech Recognition for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation (18171368)
Abstract
A method for training a generalized automatic speech recognition model for joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving a plurality of training utterances paired with corresponding training contextual signals. The training contextual signals include a training contextual noise signal including noise prior to the corresponding training utterance, a training reference audio signal, and a training speaker vector including voice characteristics of a target speaker that spoke the corresponding training utterance. The method also includes training, using a contextual signal dropout strategy, a contextual frontend processing model on the training utterances to learn how to predict enhanced speech features. Here, the contextual signal dropout strategy uses a predetermined probability to drop out each of the training contextual signals during training of the contextual frontend processing model.
Inventor
Tom O'Malley
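The contextual signal dropout strategy described in the abstract above can be sketched as follows. The abstract only states that each contextual signal is dropped with a predetermined probability; representing a dropped signal as all zeros, and the names drop_contextual_signals and drop_prob, are assumptions for illustration.

```python
import random
from typing import Dict, List


def drop_contextual_signals(
    signals: Dict[str, List[float]],
    drop_prob: float,
    rng: random.Random,
) -> Dict[str, List[float]]:
    """Independently drop each contextual signal with probability drop_prob.

    Assumption: a dropped signal is replaced by zeros of the same length,
    so the frontend model still receives inputs of a fixed shape.
    """
    out: Dict[str, List[float]] = {}
    for name, values in signals.items():
        if rng.random() < drop_prob:
            out[name] = [0.0] * len(values)  # signal withheld for this example
        else:
            out[name] = list(values)         # signal passed through unchanged
    return out
```

Applied once per training example to the contextual noise signal, the reference audio signal, and the speaker vector, this exposes the model to every combination of present and missing context, which is what lets a single model cope when some signals are unavailable at inference time.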
Microphone Array Configuration Invariant, Streaming, Multichannel Neural Enhancement Frontend for Automatic Speech Recognition (18171411)
Abstract
A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self-attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of self-attention blocks, a stacked input including the single channel cleaned input signal and a single channel noisy input signal, and generates, as output, from a final block of the stack of self-attention blocks, an un-masked output. The masking layer receives, as input, the single channel noisy input signal and the un-masked output, and generates, as output, enhanced input speech features corresponding to a target utterance.
Inventor
Joseph Caroselli
Systems and Methods for Improved Machine-Learned Compression (18008045)
Abstract
A computer-implemented method for compressing computer-readable data having improved efficiency can include obtaining, by a computing system including one or more computing devices, input data associated with the computing system; and encoding, by the computing system, the input data and added noise from a noisy channel to produce encoded data based at least in part on an encoder model, wherein encoding the input data and added noise includes additively combining the added noise and the input data to obtain noisy input data and rounding the noisy input data by a soft rounding function, the soft rounding function having a sharpness, to produce the encoded data, wherein the machine-learned encoder model is trained on training data, wherein the training data is encoded with the added noise from the noisy channel.
Inventor
Eirikur Thor Agustsson
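The encoding step in the compression abstract above (add noise, then apply a soft rounding function with a sharpness) can be sketched for a single scalar. The abstract does not give the form of the soft rounding function or the noise distribution; the tanh-based form below and the uniform noise on [-0.5, 0.5] are assumptions, and soft_round, encode, and alpha are illustrative names.

```python
import math
import random


def soft_round(y: float, alpha: float) -> float:
    """Differentiable approximation to rounding with sharpness alpha > 0.

    Assumed form: interpolates between the identity (alpha near 0) and
    hard rounding (alpha large). Integers are fixed points.
    """
    m = math.floor(y) + 0.5  # midpoint of the unit interval containing y
    r = y - m                # offset from that midpoint, in [-0.5, 0.5)
    return m + 0.5 * math.tanh(alpha * r) / math.tanh(alpha / 2.0)


def encode(x: float, alpha: float, rng: random.Random) -> float:
    """Additively combine noise with the input, then soft-round the result."""
    noise = rng.uniform(-0.5, 0.5)  # assumption: uniform noisy channel
    return soft_round(x + noise, alpha)
```

With a small alpha the function passes values through almost unchanged, and with a large alpha it behaves like ordinary rounding, so the sharpness controls how closely training with noise resembles quantization at test time.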
NEURAL NETWORK-BASED TRANSMISSION FEEDBACK IN A CELLULAR NETWORK (18016502)
Abstract
Two devices in wireless communication implement a soft transmission feedback scheme. A data-sending device wirelessly communicates a first transmission representing a data block and generated using one or more neural networks to a data-receiving device, which processes the first transmission using one or more neural networks to attempt to recover the data block, as well as to generate transmission feedback indicating a status of the recovery attempt. The feedback is used by one or more neural networks to generate a second transmission that is wirelessly communicated to the data-sending device. One or more neural networks process the second transmission to generate a retransmit control signal. One or more neural networks selectively include at least a portion of the data block for retransmission in a third transmission to the data-receiving device based on the retransmit control signal.
Inventor
Jibing Wang
QUANTUM NEURAL NETWORK (18117232)
Abstract
A quantum neural network architecture. In one aspect, a quantum neural network trained to perform a machine learning task includes: an input quantum neural network layer comprising (i) multiple qubits prepared in an initial quantum state encoding a machine learning task data input, and (ii) a target qubit; a sequence of intermediate quantum neural network layers, each intermediate quantum neural network layer comprising multiple quantum logic gates that operate on the multiple qubits and target qubit; and an output quantum neural network layer comprising a measurement quantum gate that operates on the target qubit and provides as output data representing a solution to the machine learning task.
Inventor
Hartmut Neven
LOCALIZED CRYPTOGRAPHIC TECHNIQUES FOR PRIVACY PROTECTION (17924599)
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for preserving user privacy when selecting content are described. In some aspects, a method includes receiving a data element identifying a set of candidate digital components and, for each candidate digital component, a set of distribution parameters for the candidate digital component. For each candidate digital component, encrypted selection data for the candidate digital component is provided as input to a cryptographic analysis application running in a trusted hardware module of a client device. The encrypted selection data represents the set of distribution parameters for the candidate digital component and is encrypted using a zero-knowledge proof protocol. The cryptographic analysis application is configured to determine a measure of match between the selection data and user attributes of a user of the client device.
Inventor
Dr. Christopher Schneider
ADAPTIVE RESIZING OF AUDIO JITTER BUFFER BASED ON CURRENT NETWORK CONDITIONS (18016493)
Abstract
In a streaming media system, a client device includes an adjustable-size jitter buffer to buffer audio packets of a stream. A buffer controller of the client device operates to determine a representation of a current condition of the network, such as through statistical analysis to generate a histogram or probability density function representative of measured differences between arrival times of successive audio packets at the client device, either since a start of a streaming session or over a sliding time window. The buffer controller then selects an updated size for the jitter buffer based on the representation of the current condition of the network and implements the updated size, either in one adjustment or over time at a size adaptation rate based on a programmable adjustment duration, so as to balance buffer latency and dropped packet rate in view of the current network conditions.
Inventor
Chiong Ching Lai
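The buffer controller described in the jitter buffer abstract above can be approximated with a short sketch. It assumes that the network condition is summarized by an empirical quantile of inter-arrival time differences and that the size change is spread evenly over the adjustment duration; the function names, the percentile parameter, and the fixed tick are hypothetical.

```python
from typing import List


def interarrival_deltas(arrival_ms: List[float]) -> List[float]:
    """Differences between arrival times of successive audio packets."""
    return [b - a for a, b in zip(arrival_ms, arrival_ms[1:])]


def target_buffer_ms(deltas: List[float], percentile: float) -> float:
    """Empirical quantile of the observed deltas (assumed sizing rule).

    A higher percentile tolerates more jitter at the cost of more latency.
    """
    ordered = sorted(deltas)
    idx = min(len(ordered) - 1, int(percentile * len(ordered)))
    return ordered[idx]


def adaptation_schedule(
    current_ms: float, target_ms: float,
    adjustment_duration_ms: float, tick_ms: float,
) -> List[float]:
    """Buffer sizes to apply on each tick so that the full change is
    spread over adjustment_duration_ms instead of applied at once."""
    steps = max(1, round(adjustment_duration_ms / tick_ms))
    delta = (target_ms - current_ms) / steps
    return [current_ms + delta * (i + 1) for i in range(steps)]
```

Passing only the most recent arrival times to interarrival_deltas gives the sliding-window variant; passing all of them gives the since-session-start variant mentioned in the abstract.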
BITRATE-ADAPTIVE SEGMENTATION FOR VIDEO TRANSCODING (17696760)
Abstract
Bitrate-adaptive segmentation is performed for transcoding a video stream uploaded to an online video platform for hosting and later playback to platform users. The video stream is segmented into chunks based on prediction-based bit costs determined for frames of the video stream rather than based on scene changes detected within the video stream. The bitrate-adaptive segmentation includes determining inter-prediction bit costs and intra-prediction bit costs for frames of the video stream based on information indicated within a pass log based on a first pass encoding of the video stream, determining chunk boundaries for segmenting the video stream into a chunk based on the inter-prediction bit costs and the intra-prediction bit costs for the frames, and transcoding the chunk to produce a transcoded video stream.
Inventor
Di Chen
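As a rough illustration of choosing chunk boundaries from bit costs rather than scene changes, as in the segmentation abstract above, the sketch below starts a new chunk at a frame whose inter-prediction cost approaches its intra-prediction cost. The abstract does not state the decision rule; the ratio threshold, the minimum chunk length, and the name chunk_boundaries are assumptions.

```python
from typing import List


def chunk_boundaries(
    inter_costs: List[float],
    intra_costs: List[float],
    ratio_threshold: float,
    min_chunk_frames: int,
) -> List[int]:
    """Return the frame indices at which chunks start.

    Assumed rule: a frame is a good boundary when predicting it from
    earlier frames costs nearly as many bits as coding it on its own,
    so starting a new chunk there wastes little.
    """
    boundaries = [0]  # the first chunk always starts at frame 0
    for i in range(1, len(inter_costs)):
        far_enough = i - boundaries[-1] >= min_chunk_frames
        costly_inter = inter_costs[i] >= ratio_threshold * intra_costs[i]
        if far_enough and costly_inter:
            boundaries.append(i)
    return boundaries
```

In this sketch the per-frame costs would be read from the first-pass log; each resulting chunk could then be transcoded independently.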
VERIFYING THE RENDERING OF VIDEO CONTENT AT CLIENT DEVICES USING TRUSTED PLATFORM MODULES (18200883)
Abstract
Systems and methods for verifying the rendering of video content on information resources are provided herein. A server can receive, from a target client device, a tracking message purporting to relate to delivery of a target content item; determine whether the tracking message contains an identifier of a sending device that sent the tracking message; determine whether the sending device and the target client device are the same device; if the sending client device and the target client device are the same device: recover, from the tracking message, information about at least a portion of a frame of a content item processed by a trusted platform module of the client device; and compare the at least a portion of the frame of the content item processed by a trusted platform module of the client device with a target content item.
Inventor
Oliver Woodman
RACH PROCEDURES FOR REQUESTING SLICE SUPPORT INFORMATION (18040772)
Abstract
In a method in a base station configured to communicate with a user device located in a coverage area of a cell associated with the base station, a first random access message of a RACH procedure is received from the user device. The method also includes determining, by processing hardware of the base station, that the first random access message is associated with a first network slice, at least in part by determining that the user device transmitted the first random access message using a RACH configuration associated with the first network slice. The method also includes transmitting to the user device a second random access message of the RACH procedure. The second random access message includes information regarding network support for the first network slice.
Inventor
Pavan Nuggehalli
Load Vectoring Heat Sink (17696127)
Abstract
A heat sink includes multiple load points and a plurality of load cells for each of the load points. Each of the load cells is configured to attach to a respective attachment point on a component and to create a tensile load between the respective attachment point of the component and a respective one of the load points of the heat sink. At least one of the load cells is configured to produce a different maximum tensile load than another load cell among the plurality of load cells.
Inventor
Ryan Tong