BLOG POST

Tech & Sourcing @ Morgan Lewis

TECHNOLOGY TRANSACTIONS, OUTSOURCING, AND COMMERCIAL CONTRACTS NEWS FOR LAWYERS AND SOURCING PROFESSIONALS

Structuring Rights to AI/ML Outputs, Insights, and Improvements When Customer Data Is Foundational

Contract Corner

How are intellectual property (IP) and data rights allocated when a particular dataset is key to unlocking a powerful new artificial intelligence/machine learning (AI/ML) model or use case? To find a balance, contracting parties may end up trading a black box for Pandora’s box.

As we previously discussed, businesses continue to rapidly evaluate and integrate AI tools, platforms, and other technology, and it is important to pay close attention to contract provisions relating to IP and data. Where a customer or its data underpins the development, training, or improvement of a particular AI technology, the customer may view itself more as a collaborator than a customer when the parties negotiate resulting IP rights.

When contracting parties are evaluating a prospective relationship involving AI technology, the first step is to assess the particular context and intended use cases. For example, IP rights may follow a “typical” pattern if:

the customer will be one of many customers using the AI technology on day one;
the customer’s data is not unique, proprietary, sensitive, or otherwise valuable;
the customer’s data is logically separated from other data and is not used for training the model;
the customer will not provide significant input into model development, customization (for a particular industry or repeatable use case), or improvement; and/or
the customizations for the AI technology are generally available to all of the vendor’s customers and are not unique to the customer.

Conversely, negotiations could be particularly challenging if, for example:

the customer is an early adopter or the first customer, and the vendor will not scale very quickly;
the customer and the customer’s data are heavily involved in product and model development;
the customer’s data is a significant portion of the data used to train the model or improve the service;
the customer intends to engage in customer-specific fine-tuning, supervised learning, or prompt engineering that the customer intends to use with other AI technologies; and/or
the customer’s data is unique, proprietary, or particularly valuable.

In this second and more complex scenario, the customer may claim that a market-ready AI technology would not exist without the customer’s data and collaboration. This argument can be especially relevant if the vendor is tailoring the model to a specific industry setting and the customer is contributing expertise, funds for development, or data that the vendor lacks. For example, if the ultimate objective is to improve efficiency and quality in healthcare services and digital health applications, then data and know-how related to patient care and monitoring could be instrumental to model training and fine-tuning.

Considerations under such circumstances include the following:

Inputs

If the customer’s raw data belongs to the customer, what (if any) transformations or derivatives of that data also belong to the customer?
The parties may agree that transformations of the customer’s data made for model input formatting purposes belong to the customer. Consider the possibility, though, that the vendor uses the customer’s data to generate a synthetic, roughly equivalent dataset, so that the vendor’s need for the customer’s data is largely diminished.
Note that the definition of “Customer Data” in the contract could include certain model outputs that the parties agree belong to the customer (see below).
Can the customer’s data be commingled with other data and used to develop, train, and/or improve the AI technology? If so, does the customer get the same rights in the model weights and insights that the customer would get in the customer data-only scenario?

Outputs

Are there any restrictions on how the customer may use the outputs (e.g., competitive development)? For example, what if the customer wants to use the output in a high-risk or regulated context, such as a medical device? In that situation, the vendor may be concerned that liabilities, regulatory burden, and/or reputational harm could flow back to the vendor. This is particularly notable if the use case (e.g., healthcare services) is outside of the vendor’s traditional expertise and operations (e.g., B2B software).
What constitutes “output”? For example, language like “resulting from use of the model” could be interpreted broadly to include downstream insights and analyses.
If the customer owns the output, for what purposes (if any) may the vendor use it? For any output owned by the vendor, the customer may want to determine whether it has sufficient license rights for its intended use.
If the customer owns the output, then the vendor could propose clarifying in the contract that independently generating output for a third party based on the third party’s inputs, without using the customer’s data, would be permissible even if the output is the same or very similar.

Base Technology

If the vendor owns the underlying AI technology (including the vendor’s proprietary or example prompts), what (if any) improvements to such technology belong to the vendor? And what constitutes the base technology?
For example, the parties may agree that, as between them, the vendor owns (1) the original model that was developed and trained prior to use of the customer’s data (pre-trained), including the associated hyperparameters and other aspects or components that are independent of training on the customer’s data; (2) any vendor interface to receive inputs and deliver outputs; and (3) any pre-trained model weights or other parameters. That allocation still leaves open a potentially lengthy discussion of who owns improvements to the pre-trained IP, especially if the customer’s data is viewed as a cornerstone of the model’s fine-tuning.
What if the customer anticipates needing to use the model, or its custom parameters, after the relationship ends? What will the customer actually be able to do from a practical perspective?
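
To make the split between base technology and improvements more concrete, here is a minimal sketch, not a description of any particular vendor’s architecture. It assumes that customer-specific fine-tuning can be stored as a separate set of parameter differences rather than merged into the pre-trained weights. All names (`base_weights`, `tuned_weights`, `customer_delta`, `compute_delta`, `apply_delta`) and all numbers are hypothetical and purely illustrative.

```python
# Hypothetical illustration: keep the vendor's pre-trained parameters and the
# customer-specific fine-tuning differences as two separate artifacts.
from typing import Dict, List

Weights = Dict[str, List[float]]


def compute_delta(base: Weights, tuned: Weights) -> Weights:
    """Return the per-parameter difference between tuned and base weights."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in base
    }


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Rebuild the tuned weights from the base weights plus a stored delta."""
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }


# Illustrative values only; real models have millions or billions of parameters.
base_weights: Weights = {"layer1": [1.0, 2.0], "layer2": [0.5]}    # pre-trained
tuned_weights: Weights = {"layer1": [1.5, 2.0], "layer2": [0.25]}  # after fine-tuning

# The artifact that reflects training on the customer's data.
customer_delta = compute_delta(base_weights, tuned_weights)

# The delta alone does not reproduce the tuned model; the base weights are needed too.
rebuilt = apply_delta(base_weights, customer_delta)
```

Under these assumptions, the contract could refer to two distinct artifacts. The sketch also shows a practical limit: the stored differences do nothing without the base weights, so a customer that owns its custom parameters may still need a license to the vendor’s pre-trained model after the relationship ends.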

Is It IP?

The above assumes that there are IP rights to be owned and allocated, but keep in mind our prior warning that IP ownership may be moot if AI is used to conceive or create the IP without significant human involvement in the creative or inventive steps.

If a new AI technology appears to be transformative and if a customer gets involved at the ground floor, the parties may need to dig deep into the model’s architecture and the parties’ respective roles and contributions in order to lay the proper foundation.