Tech Policy

Stanford: Most large AI systems don’t meet proposed EU requirements

Compliance with Europe’s draft AI Act may prove challenging for creators of large language models.

After the European Parliament passed a draft of the AI Act in June, key additions caught the eye of Stanford researchers. Namely, this version of the proposed law, in addition to regulating use cases, also includes requirements for the creators of “foundation models”—large models trained on broad data that can be adapted to a wide range of applications.

“It’s the only version so far that has specific separate requirements for foundation model providers,” Daniel Zhang, senior manager for policy initiatives at Stanford’s Institute for Human-Centered Artificial Intelligence, told Tech Brew.

Zhang is one of the co-authors of a report from the Institute’s Center for Research on Foundation Models, which conducted an evaluation of 10 major foundation model providers, grading them against 12 requirements from the European Parliament’s draft.

Graded against Stanford’s rubric, which scores each of the 12 requirements on a five-point (0–4) scale, most flagship foundation models, including those created by OpenAI, Google, and Meta, don’t currently meet the proposed EU requirements.

The highest possible score was 48, but none of the models came close. The top scorer was Bloom, an open-source model made by Hugging Face, with 36 points. Google’s PaLM 2 scored 27 points, GPT-4 scored 25, and Meta’s LLaMA scored 21.
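For a sense of how the arithmetic behind those totals works, here is a minimal Python sketch of tallying such a rubric. The requirement labels and per-requirement scores below are illustrative placeholders, not figures from the Stanford report; only the structure described in the article—12 requirements, each graded on a five-point (0–4) scale for a maximum of 48—is taken as given.

# Minimal sketch of tallying a rubric like Stanford's: 12 requirements,
# each scored 0-4, for a maximum total of 48. The requirement labels and
# the sample scores are hypothetical placeholders, not data from the report.

REQUIREMENTS = [
    "data sources", "data governance", "copyrighted data", "compute",
    "energy", "capabilities and limitations", "risks and mitigations",
    "evaluations", "testing", "machine-generated content",
    "member states", "downstream documentation",
]  # 12 illustrative labels standing in for the draft's requirements

MAX_PER_REQUIREMENT = 4                              # five-point scale: 0-4
MAX_TOTAL = MAX_PER_REQUIREMENT * len(REQUIREMENTS)  # 48

def total_score(scores: dict[str, int]) -> int:
    """Validate each per-requirement score against the 0-4 scale, then sum."""
    for requirement, score in scores.items():
        if requirement not in REQUIREMENTS:
            raise ValueError(f"unknown requirement: {requirement}")
        if not 0 <= score <= MAX_PER_REQUIREMENT:
            raise ValueError(f"{requirement}: {score} is outside 0-{MAX_PER_REQUIREMENT}")
    return sum(scores.values())

# A hypothetical provider scoring 2 on every requirement would total 24 of 48.
hypothetical = {name: 2 for name in REQUIREMENTS}
print(f"{total_score(hypothetical)} / {MAX_TOTAL}")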

Chart: Stanford Center for Research on Foundation Models (CRFM), Institute for Human-Centered Artificial Intelligence

The report identified four areas where models tended to receive low scores: copyrighted training data, emissions reporting, risk mitigation disclosures, and evaluation.

“Transparency should be the top priority for policymakers, because this is an essential precondition for rigorous science and effective regulation, and holding companies accountable,” Zhang said.

He pointed to a “clear divide” between open models and their closed or restricted-access counterparts. “Openly released models generally achieve higher scores on resource disclosure requirements, while closed and restricted-access models generally achieve stronger scores on deployment-related requirements,” he said.

Stanford’s scorecard could give foundation model providers a head start on compliance, Zhang added, but he cautioned that the report reflects an interpretation of the draft EU law, which doesn’t spell out specifics about enforcement or evaluation metrics.

That being said, there’s clearly a lot of work to do.

“We recognize that it’s a draft of the EU AI Act, but transparency, again, plays such an important factor in the public evaluation and the regulation of foundation models that I think policymakers around the world, researchers around the world, should pay attention to this particular topic and bring more transparency into the systems,” he said.
