Claiming the Attention Layer: A Framework for Transformer-Based Bioinformatics Patents in the AI Space

By: Claire Baek, Ph.D.

The transformer architecture has reshaped modern artificial intelligence, powering large language models (LLMs), multimodal models, and bioinformatics platforms. Transformers rely on a defining computational operation: the attention mechanism. Understanding the attention mechanism is essential to drafting patentable claims involving transformer-based systems.

One approach in the existing landscape is to claim the operation of the transformer at a functional level, describing what the model does rather than how it does it. Functional claiming of this kind can be appropriate and may secure broad coverage in certain contexts. However, functionally drafted transformer claims may be exposed to design-arounds or validity challenges and, in some instances, may read on operations that earlier AI model architectures already performed in the same or a similar way.

This article proposes a complementary, structurally grounded approach centered on the attention mechanism. Rather than claiming the transformer’s outputs or end-to-end behavior, the proposed framework focuses on claiming the specific operations of the attention layer as the locus of inventive contribution. In many transformer-based inventions, the key design choices that determine what the model can do reside in how the attention mechanism is configured: what data it operates on, how it computes, and how its outputs are used downstream.
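To make those three design choices concrete, the following is a minimal sketch of the widely used scaled dot-product form of attention, written in plain Python. It is offered only as an illustration under stated assumptions and is not the configuration of any particular claimed invention. The function names (`softmax`, `attention`) and the toy inputs are hypothetical and chosen for readability.

```python
import math


def softmax(row):
    """Turn a list of raw scores into weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (illustrative only).

    queries, keys, values: the data the layer operates on.
    Returns (outputs, weights) so downstream use of both is visible.
    """
    d_k = len(keys[0])
    weights = []
    for q in queries:
        # How it computes: similarity of one query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights.append(softmax(scores))
    # How outputs are used downstream: each output is a weighted mix of the values.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights


# Toy example: one query attending over two positions.
outputs, weights = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

In this sketch, the query aligns more closely with the first key, so the first value receives the larger weight. Each of the three inputs, the scoring step, and the weighted combination corresponds to a point at which a specific implementation may differ, which is where a structurally grounded claim would direct its limitations.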