
18726881. Machine Learning Models Featuring Resolution-Flexible Multi-Axis Attention Blocks (GOOGLE LLC)


Machine Learning Models Featuring Resolution-Flexible Multi-Axis Attention Blocks

Organization Name

GOOGLE LLC

Inventor(s)

Yinxiao Li of Sunnyvale CA (US)

Zhengzhong Tu of Austin TX (US)

Hossein Talebi of San Jose CA (US)

Han Zhang of Sunnyvale CA (US)

Feng Yang of Sunnyvale CA (US)

Peyman Milanfar of Menlo Park CA (US)


This abstract first appeared for US patent application 18726881, titled 'Machine Learning Models Featuring Resolution-Flexible Multi-Axis Attention Blocks'.

Original Abstract Submitted

Provided are machine learning systems and models featuring resolution-flexible multi-axis attention blocks. In particular, the present disclosure provides example multi-axis MLP based architectures (example implementations of which can be generally referred to as MAXIM) that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks. In some implementations, MAXIM can use a UNet-shaped hierarchical structure and support long-range interactions enabled by spatially-gated MLPs. Specifically, some example implementations of MAXIM can contain two MLP-based building blocks: a multi-axis gated MLP that allows for efficient and scalable spatial mixing of local and global visual cues, and a cross-gating block, an alternative to cross-attention, which accounts for cross-feature mutual conditioning.
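
The abstract does not spell out how the multi-axis gated MLP mixes local and global cues, so the following is only a minimal NumPy sketch under stated assumptions, not the claimed implementation. It assumes the channels are split into two halves: one half is partitioned into non-overlapping b x b blocks and gated along the within-block axis (local), the other is partitioned into a d x d grid and gated along the grid axis (global). Because both mixing axes have a fixed length, the same weights can be applied at any resolution divisible by b and d. All function names, the parameter dictionary, and the default values of b and d are hypothetical, and the input/output projections, normalization, and residual connections a full model would need are omitted.

```python
import numpy as np

def block_partition(x, b):
    """(H, W, C) -> (num_blocks, b*b, C): non-overlapping b x b windows."""
    H, W, C = x.shape
    x = x.reshape(H // b, b, W // b, b, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, b * b, C)

def block_unpartition(t, H, W, b):
    """Inverse of block_partition."""
    C = t.shape[-1]
    x = t.reshape(H // b, W // b, b, b, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def grid_partition(x, d):
    """(H, W, C) -> (d*d, cell_size, C): a fixed d x d grid of cells."""
    H, W, C = x.shape
    x = x.reshape(d, H // d, d, W // d, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(d * d, -1, C)

def grid_unpartition(t, H, W, d):
    """Inverse of grid_partition."""
    C = t.shape[-1]
    x = t.reshape(d, d, H // d, W // d, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def spatial_gate(tokens, weight, bias, axis):
    """Split channels into (u, v); linearly mix v along `axis`; return u * (v + 1)."""
    u, v = np.split(tokens, 2, axis=-1)
    v = np.moveaxis(v, axis, -1) @ weight.T + bias  # mix along the chosen token axis
    v = np.moveaxis(v, -1, axis)
    return u * (v + 1.0)

def multi_axis_gated_mix(x, params, b=4, d=4):
    """Assumed two-branch mixing: local (block axis) and global (grid axis).

    x has shape (H, W, C) with C divisible by 4; the output has C // 2 channels.
    """
    H, W, _ = x.shape
    local, glob = np.split(x, 2, axis=-1)
    # Local branch: mix the b*b tokens inside each block (axis 1).
    lt = spatial_gate(block_partition(local, b),
                      params["w_local"], params["b_local"], axis=1)
    # Global branch: mix the d*d grid cells at each in-cell position (axis 0).
    gt = spatial_gate(grid_partition(glob, d),
                      params["w_global"], params["b_global"], axis=0)
    return np.concatenate(
        [block_unpartition(lt, H, W, b), grid_unpartition(gt, H, W, d)], axis=-1)

# Hypothetical parameters: their sizes depend only on b and d, not on H or W.
rng = np.random.default_rng(0)
b, d = 4, 4
params = {
    "w_local": rng.normal(size=(b * b, b * b)) * 0.02,
    "b_local": np.zeros(b * b),
    "w_global": rng.normal(size=(d * d, d * d)) * 0.02,
    "b_global": np.zeros(d * d),
}

small = multi_axis_gated_mix(rng.normal(size=(8, 8, 8)), params, b, d)
large = multi_axis_gated_mix(rng.normal(size=(16, 32, 8)), params, b, d)
print(small.shape, large.shape)
```

The same `params` dictionary is reused for the 8 x 8 and the 16 x 32 input, which is the sense in which this sketch is resolution-flexible: the cost of each branch grows linearly with the number of pixels, while the learned mixing matrices stay at a fixed size.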
