News & Updates

AI Act: Requirements for General-Purpose AI Models

This article was first published in the IAPP’s AI Act series. It is republished here with minor editorial edits to improve readability.

If you were to read the European Commission’s original AI Act proposal, published in April 2021, you would find it conspicuously devoid of references to general-purpose AI (“GPAI”). With the benefit of hindsight, this might seem a surprising omission. Yet, outside of the world of AI experts, few people had ever heard of GPAI at the time the proposal was published.

Fast-forward a little over a year: in November 2022, OpenAI released ChatGPT to an unsuspecting public, wowing them with its human-like, if sometimes unreliable, responses to their prompts. It quickly went viral, reportedly reaching 100 million users in just two months and becoming the fastest-adopted consumer application of all time.

As a result, terms like large language models, generative AI and GPAI began to enter the consciousness of European legislators, if not exactly the public consciousness. Clearly, the AI Act would need to regulate GPAI, but how?

This was not an easy question to answer. The proposed law worked by placing AI systems into prohibited, high and low risk buckets to decide which rules to apply. However, by its very nature, GPAI could be implemented across an unimaginably wide range of use cases that spanned the entire risk spectrum. The risks arising in any given scenario would necessarily depend on context, making it impossible to place GPAI into a single risk bucket.

Consequently, Europe’s legislators ultimately proposed an entirely new chapter of the AI Act dedicated specifically to regulating GPAI models: Chapter V (General-Purpose AI Models).

Distinguishing AI “models” from AI “systems”

To understand how the AI Act regulates GPAI, it is critical to understand the difference between AI models and AI systems.

Chapter V sets out rules that address the use of GPAI models. While the AI Act also defines the concept of a GPAI system – as an AI system based on a GPAI model – a GPAI system is simply a subset of the broader concept of an AI system, and GPAI systems are not addressed within Chapter V’s rules.

By specifying rules for GPAI models, Chapter V takes a different regulatory approach from the one taken generally throughout the AI Act. The rest of the Act regulates AI systems, not AI models, and GPAI systems are just one type of AI system. The rules applicable to any AI system (including GPAI systems) are determined by whether the system in question is prohibited, high or low risk.

This distinction is not accidental. According to Recital 97, “the notion of GPAI models should be clearly defined and set apart from the notion of AI systems to enable legal certainty.” Article 3(63) of the Act defines a GPAI model as “an AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications.”

Therefore, to understand the definition of a GPAI model, it is necessary first to understand what an “AI model” is and how it is different from an “AI system”.

The Act does not define the concept of an AI model, but IBM helpfully explains “an AI model is a program that has been trained on a set of data to recognize certain patterns or make certain decisions without further human intervention.” Recital 97 of the AI Act notes “AI models are essential components of AI systems” but “they do not constitute AI systems on their own.” This is because “AI models require the addition of further components, such as for example a user interface, to become AI systems. AI models are typically integrated into and form part of AI systems.”

An AI model can therefore be thought of as the program that powers the intelligence of an AI system, but it cannot be used on a standalone basis. Accordingly, an AI model must first be integrated with other software and/or hardware components, so users have a means to access and interact with the AI model via a user interface, such as using a dialogue box to submit prompts. The set of hardware and software components that integrate, and enable users to interact with, one or more AI models collectively comprise the AI system. For example, and in very generalised terms, an autonomous vehicle can be thought of as an AI system that integrates multiple AI models to enable it to steer the vehicle, manage fuel consumption, apply brakes and so on.
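
To make the model/system distinction concrete, here is a deliberately trivial sketch in Python. It is purely illustrative and not drawn from the Act: the bare inference function stands in for the AI model, while the dialogue loop around it (the user interface) is the further component that turns it into an AI system. All names and logic are invented.

    # Illustrative only: a trivial "model" versus the "system" built around it.
    def model_predict(prompt: str) -> str:
        # Stand-in for a trained AI model's inference step.
        return f"(model output for: {prompt!r})"

    def run_system() -> None:
        # The further components (here, a simple dialogue loop acting as the
        # user interface) are what turn the model into an AI system.
        while (prompt := input("You: ")) != "quit":
            print("AI:", model_predict(prompt))

    if __name__ == "__main__":
        run_system()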

What is a GPAI model?

The AI Act mostly applies to AI systems, not AI models – with the notable exception of GPAI models.

As explained above, a GPAI model:

  • Is an AI model, not an AI system, although it may be integrated into an AI system.
  • Is trained with a large amount of data using self-supervision at scale. For example, GPT-3 was reportedly trained on at least 570 gigabytes of data, or about 300 billion words.
  • Displays significant generality and is capable of competently performing a wide range of distinct tasks.

Further, the Act only regulates GPAI models that are placed on the EU market. “AI models that are used for research, development or prototyping activities before they are placed on the market” are excluded from the definition of a general-purpose AI model under Article 3(63), and from the scope of the Act under Article 2(8).

Types of GPAI models covered by the Act

Chapter V distinguishes between two types of GPAI models: those with and without “systemic risk”. This distinction reflects the need to have stricter regulatory controls for GPAI models with systemic risk, due to their potential for significant harmful effects if not closely regulated.

To this end, under Article 3(65) of the AI Act, systemic risk is defined as “a risk that is specific to the high-impact capabilities of GPAI models, having a significant impact on the Union market due to their reach, or due to actual or reasonably foreseeable negative effects on public health, safety, public security, fundamental rights, or the society as a whole, that can be propagated at scale across the value chain.”

At first glance, this definition appears circular. A GPAI model with systemic risk is one presenting risks that would have significant impact and are “specific to the high-impact capabilities of GPAI models.” However, the definition hints at the types of concerns the AI Act legislators believe general-purpose AI could present, namely “negative effects on public health, safety, public security, fundamental rights, or the society as a whole, that can be propagated at scale.”

As to what these “negative effects … propagated at scale” could include, Recital 110 lists “major accidents, disruptions of critical sectors and serious consequences to public health and safety; any actual or reasonably foreseeable negative effects on democratic processes, public and economic security; the dissemination of illegal, false, or discriminatory content.” It continues that these might result in “chemical, biological, radiological, and nuclear risks… offensive cyber capabilities… the capacity to control physical systems and interfere with critical infrastructure; risks from models of making copies of themselves or ‘self-replicating’ or training other models … harmful bias and discrimination … the facilitation of disinformation or harming privacy with threats to democratic values and human rights.”

How to identify a GPAI model with systemic risk

There are two ways for a GPAI model to be deemed to present a systemic risk under the AI Act.

  • First, under Article 51(1)(a), the GPAI model must have “high impact capabilities”, as evaluated by “appropriate technical tools and methodologies, including indicators and benchmarks.”

    For these purposes, under Article 51(2), a GPAI model is presumed to have high impact capabilities if the cumulative amount of computation used for training is greater than 10²⁵ floating point operations (FLOPs).

    To put this in human terms, according to some estimates, the computational power of the human brain is on the order of 10¹⁶ to 10¹⁷ floating point operations per second. However, this is a crude and imprecise comparison for all sorts of reasons, not least that, while considerably slower than a computer, the brain is capable of much greater parallel processing at much lower levels of energy consumption. Nevertheless, it does provide a simple way for non-engineers to picture the type of computing power concerned (a rough back-of-the-envelope illustration follows below).

  • Second, under Article 51(1)(b), a GPAI model can be determined to have high impact capabilities by the European Commission, either on its own initiative or following a qualified alert from the scientific panel of independent experts pursuant to Articles 68 and 90 of the Act. In reaching such a determination, the Commission must have regard to certain criteria set out in Annex XIII.

The Commission must publish a list of GPAI models with systemic risk per Article 52(6) and can adopt delegated legislation under Article 51(3) to amend and supplement the thresholds, benchmarks and indicators that determine what qualifies as high impact capabilities, to keep pace with evolving technological developments.
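
For readers who want a feel for the 10²⁵ FLOP threshold, the following is a minimal back-of-the-envelope sketch in Python. It assumes the widely used rule of thumb that training compute for a dense transformer is roughly 6 × parameters × training tokens; the Act does not prescribe any estimation method, and the model figures below are hypothetical.

    # Rough check against the AI Act's 10^25 FLOP presumption (Article 51(2)).
    # The "6 * parameters * tokens" rule of thumb is an approximation, not part of the Act.

    SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # Article 51(2) presumption

    def estimated_training_flops(parameters: float, training_tokens: float) -> float:
        """Approximate total training compute for a dense transformer."""
        return 6 * parameters * training_tokens

    def presumption_applies(parameters: float, training_tokens: float) -> bool:
        """True if the Article 51(2) high-impact presumption would be triggered."""
        return estimated_training_flops(parameters, training_tokens) >= SYSTEMIC_RISK_THRESHOLD_FLOPS

    # Hypothetical model: 70 billion parameters trained on 15 trillion tokens.
    print(f"{estimated_training_flops(70e9, 15e12):.1e}")  # 6.3e+24
    print(presumption_applies(70e9, 15e12))                # False: just under the threshold

Under these assumptions, such a model would fall just short of the presumption, which gives a sense of how the threshold targets only the very largest training runs.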

Obligations for providers of all GPAI models 

Providers of GPAI models with or without systemic risk must comply with the obligations set out in Article 53 and Article 54 of the AI Act. These primarily address technical documentation requirements, the provision of transparency information to providers of AI systems that integrate the GPAI models, compliance with EU copyright rules and the need for non-EU model providers to appoint an EU representative.

Providers of GPAI models without systemic risk have fewer obligations: they need only comply with Article 53 and Article 54, whereas providers of models with systemic risk have additional compliance responsibilities under Article 55.

The obligations that apply to all providers of GPAI models (with or without systemic risk) are to:

  • Prepare and maintain technical documentation about the GPAI model, including its training and testing process and evaluation results, containing the mandatory information set out in Annex XI, listed in Table 1 below. The European Commission’s AI Office and national competent authorities can require the GPAI model provider to provide this documentation on request. See also Article 91(1).
  • Make available certain information and documentation to providers of AI systems that integrate the GPAI model, so they have a good understanding of the capabilities and limitations of the model and can comply with their own obligations under the AI Act. This must include the mandatory information set out in Annex XII, listed in Table 2 below.
  • Put a policy in place to comply with EU rules on copyright and related rights. This should include a means to identify and comply, through state-of-the-art technologies, with any reservation of rights expressed by rights holders.
  • Prepare and make publicly available a detailed summary about the GPAI model’s training content, using a template provided by the AI Office that is not yet available as at the date of this article. This latter requirement has raised eyebrows among providers of GPAI models over concerns that it may force them to reveal trade secrets about their training content.

The first two points above do not apply to providers of open-source GPAI models (unless those models have systemic risk), provided the models can be used and adapted without restriction and that information about their parameters, including weights, model architecture and model usage, is made publicly available.

In addition, and with more than a passing nod to the EU representative requirements under the EU General Data Protection Regulation, non-EU providers of GPAI models must appoint an authorised representative in the EU per Article 54(1). This appointment must be made by way of a written mandate that authorises the representative to:

  • Verify that the GPAI model provider has prepared the required technical documentation and otherwise fulfilled its obligations under Article 53, as described above, and Article 55, if it provides a GPAI model with systemic risk, as described below.
  • Keep a copy of the GPAI model provider’s required technical documentation, together with the provider’s contact details, for a period of 10 years after the model is placed on the market, so that it is available to the European Commission’s AI Office and national competent authorities.
  • Provide the AI Office, upon request, with the information and documentation necessary to demonstrate the GPAI model provider’s compliance.
  • Cooperate with the AI Office and competent authorities upon request in any action they take in relation to the GPAI model, including when it is integrated into AI systems available in the EU.

Once again, this requirement does not ordinarily apply to providers of open-source general-purpose models, unless those models have systemic risk.

Obligations of providers of GPAI models with systemic risks

As already noted, providers of GPAI models with systemic risk are subject to additional obligations under Article 55 of the AI Act. In addition to the rules already described above, they must also:

  • Perform model evaluation in accordance with standardised protocols and tools reflecting the state of the art, including conducting and documenting adversarial testing of the model with a view to identifying and mitigating the systemic risks described above.
  • Assess and mitigate possible systemic risks at an EU level, including their sources, that may stem from the development, sale or use of GPAI models with systemic risk.
  • Keep track of, document and report relevant information about serious incidents without undue delay to the AI Office and, as appropriate, to national competent authorities, including possible corrective measures.
  • Ensure an adequate level of cybersecurity protection for the GPAI model with systemic risk and the physical infrastructure of the model.

Regarding the requirement to keep track of, document and report relevant information about serious incidents, a key question is how this requirement will be operationalised in practice, and further guidance would be welcome in this respect. However, it is clear that this requirement is distinct from the requirement for providers and deployers of high-risk AI systems to report serious incidents under Article 26(5) and Article 73.

Codes of Practice for GPAI

Pending the EU’s adoption of harmonised standards for GPAI pursuant to Article 40, providers of GPAI models with or without systemic risk can demonstrate their compliance by adhering to codes of practice, which are expected to be drawn up and finalised by the AI Office within nine months after the AI Act enters into force. This would follow consultation with the AI Board and national competent authorities, as well as industry, academic and civil society stakeholders, under Article 56.

The European AI Office launched a consultation for a first Code of Practice for GPAI models on 30 July 2024.

When does this take effect?

The AI Act’s rules for GPAI model providers come into effect in two phases under Articles 111(3) and 113.

Providers of older GPAI models, being those placed on the EU market before 2 August 2025, have up to three years from the Act’s entry into force to come into compliance (i.e. by 2 August 2027). Providers of newer GPAI models (that is, all other GPAI model providers) have only 12 months after the Act enters into force to come into compliance (i.e. by 2 August 2025).

Practical steps for GPAI

Any organization using GPAI will need to ask itself the following questions and implement compliance measures accordingly (a simple illustrative triage sketch follows the list):

  • Is the GPAI in question a “GPAI model” to which Chapter V applies or, instead, a “GPAI system” that must then be categorised as prohibited, high or low risk to determine which rules under the AI Act will apply?
  • Is the organization in question the provider of the GPAI model? Chapter V applies only to providers of GPAI models.
  • Does the GPAI model present systemic risk? If not, it will be subject only to the rules in Article 53 and Article 54. If so, it will be subject to additional rules in Article 55.
  • Has the AI Office produced any applicable codes of practice yet under Article 56? If so, consider alignment with these as a means of demonstrating compliance with the AI Act.
  • Is the GPAI model provider established outside of the EU? If so, it must appoint an authorised representative in the EU in accordance with Article 54.
  • Are you a provider of an older or newer GPAI model for the purposes of Article 111(3) and Article 113? This will determine when the AI Act’s rules apply to you, and when you need to come into compliance.
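
To pull these questions together, here is a minimal, purely illustrative triage sketch in Python. The article numbers and dates reflect the Act as described above, but the function, field names and structure are invented for illustration and are no substitute for legal analysis.

    # Illustrative triage of the Chapter V questions listed above.
    from dataclasses import dataclass

    @dataclass
    class GPAITriage:
        is_gpai_model: bool            # Chapter V covers GPAI models, not GPAI systems
        is_provider: bool              # Chapter V obligations fall on the provider
        has_systemic_risk: bool        # Article 51 designation or 10^25 FLOP presumption
        provider_outside_eu: bool      # triggers the Article 54 representative duty
        on_market_before_2_aug_2025: bool

    def applicable_rules(t: GPAITriage) -> list[str]:
        if not (t.is_gpai_model and t.is_provider):
            return ["Chapter V does not apply; classify the AI system as prohibited, high or low risk instead"]
        rules = ["Article 53: documentation, transparency, copyright policy, training-content summary"]
        if t.provider_outside_eu:
            rules.append("Article 54: appoint an EU authorised representative")
        if t.has_systemic_risk:
            rules.append("Article 55: model evaluation, risk mitigation, incident reporting, cybersecurity")
        deadline = "2 August 2027" if t.on_market_before_2_aug_2025 else "2 August 2025"
        rules.append(f"Compliance deadline: {deadline} (Articles 111(3) and 113)")
        return rules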

Annex

Table 1: Mandatory information to be included in technical documentation for GPAI models

Mandatory technical information for all GPAI models, with or without systemic risk

A general description of the GPAI model, including:

  • The tasks that the model is intended to perform and the type and nature of AI systems in which it can be integrated
  • The acceptable use policies applicable
  • The date of release and methods of distribution
  • The architecture and number of parameters
  • The modality (e.g. text, image) and format of inputs and outputs
  • The licence

A detailed description of the elements of the model referred to above, and relevant information on the process of its development, including the following elements:

  • The technical means required to integrate the GPAI model in AI systems (e.g. instructions of use, infrastructure, tools)
  • The design specifications of the model and training process, including training methodologies and techniques, the key design choices including the rationale and assumptions made; what the model is designed to optimise for and the relevance of the different parameters, as applicable
  • Information on the data used for training, testing and validation, where applicable, including the type and provenance of data and curation methodologies (e.g. cleaning, filtering etc.), the number of data points, their scope and main characteristics; how the data was obtained and selected as well as all other measures to detect the unsuitability of data sources and methods to detect identifiable biases, where applicable
  • The computational resources used to train the model (e.g. number of floating point operations), training time, and other relevant details related to the training
  • Known or estimated energy consumption of the model. When the energy consumption of the model is unknown, the energy consumption may be based on information about computational resources used.
Additional mandatory technical information for GPAI models with systemic risk

  • A detailed description of the evaluation strategies, including evaluation results, on the basis of available public evaluation protocols and tools or otherwise of other evaluation methodologies. Evaluation strategies shall include evaluation criteria, metrics and the methodology on the identification of limitations.
  • Where applicable, a detailed description of the measures put in place to conduct internal and/or external adversarial testing (e.g., red teaming) and model adaptations, including alignment and fine-tuning.
  • Where applicable, a detailed description of the system architecture, explaining how software components build or feed into each other and integrate into the overall processing.

Table 2: Mandatory transparency information for GPAI models

Mandatory transparency information for all GPAI models, with or without systemic risk

A general description of the GPAI model, including:

  • The tasks that the model is intended to perform and the type and nature of AI systems into which it can be integrated
  • The acceptable use policies applicable
  • The date of release and methods of distribution
  • How the model interacts, or can be used to interact, with hardware or software that is not part of the model itself, where applicable
  • The versions of relevant software related to the use of the GPAI model, where applicable
  • The architecture and number of parameters
  • The modality (e.g., text, image) and format of inputs and outputs
  • The licence for the model.

A description of the elements of the model and of the process for its development, including:

  • The technical means required to integrate the GPAI model into AI systems (e.g., instructions for use, infrastructure, tools)
  • The modality (e.g., text, image, etc.) and format of the inputs and outputs and their maximum size (e.g., context window length, etc.)
  • Information on the data used for training, testing and validation, where applicable, including the type and provenance of data and curation methodologies.