UNESCO report says adaptations to how large language models are built and used can dramatically reduce energy consumption


New research published by UNESCO and UCL reveals that small changes to how large language models (LLMs) are built and used can dramatically reduce energy consumption without compromising performance.

The report suggests moving away from resource-heavy AI models in favour of more compact models.

Used together, the measures identified in the report can reduce energy consumption by up to 90%.

“Generative AI’s annual energy footprint is already equivalent to that of a low-income country, and it is growing exponentially. To make AI more sustainable, we need a paradigm shift in how we use it, and we must educate consumers about what they can do to reduce their environmental impact,” said Tawfik Jelassi, UNESCO assistant director-general for communication and information.

UNESCO has a mandate to support its 194 Member States in their digital transformations, providing them with insights to develop energy-efficient, ethical and sustainable AI policies.

In 2021, the organisation’s Member States unanimously adopted the UNESCO Recommendation on the Ethics of AI, a governance framework which includes a policy-oriented chapter on AI’s impact on the environment and ecosystems.

This new report calls on governments and industry to invest in sustainable AI research and development, as well as AI literacy, to empower users to better understand the environmental impact of their AI use and make more informed decisions.

Generative AI tools are now used by over 1 billion people daily, and each prompt consumes an estimated 0.34 watt-hours of electricity. This adds up to around 310 gigawatt-hours per year, equivalent to the annual electricity use of more than 3 million people in a low-income African country.
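As a rough consistency check, not part of the report itself, the two cited figures can be combined in a few lines of Python to see what daily prompt volume they imply:

```python
# Back-of-the-envelope arithmetic using only the figures quoted above:
# 310 GWh per year at 0.34 Wh per prompt implies a daily prompt volume.
wh_per_prompt = 0.34   # estimated watt-hours per prompt
gwh_per_year = 310     # cited annual consumption

prompts_per_year = (gwh_per_year * 1e9) / wh_per_prompt
prompts_per_day = prompts_per_year / 365
print(f"Implied volume: {prompts_per_day / 1e9:.1f} billion prompts per day")
# Prints roughly 2.5 billion prompts per day.
```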

For this report, a team of computer scientists at UCL carried out a series of original experiments on a range of open-source LLMs. They identified three innovations that enable substantial energy savings without compromising the accuracy of the results:

The first is that smaller, task-specific models can be just as accurate as large ones on the tasks they are built for. Currently, users rely on large, general-purpose models for all their needs. The research shows that using smaller models tailored to specific tasks, such as translation or summarisation, can cut energy use significantly without losing performance.

Each model should only be activated when needed to accomplish a specific task.
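A minimal sketch of that routing idea, assuming the Hugging Face transformers library (the model names are illustrative stand-ins, not the models tested in the report), might look like this:

```python
# Dispatch each request to a small, task-specific model rather than one
# large general-purpose model, loading each model lazily so that it is
# only activated when its task is actually requested.
from transformers import pipeline

TASK_MODELS = {
    "summarization": "sshleifer/distilbart-cnn-12-6",  # illustrative
    "translation_en_to_fr": "t5-small",                # illustrative
}
_loaded = {}  # cache of pipelines that have actually been needed

def run_task(task: str, text: str) -> str:
    if task not in TASK_MODELS:
        raise ValueError(f"no compact model registered for task: {task}")
    if task not in _loaded:
        _loaded[task] = pipeline(task, model=TASK_MODELS[task])
    first = _loaded[task](text)[0]
    # Pipelines return a list of dicts; the result key varies by task.
    return first.get("summary_text") or first.get("translation_text")

print(run_task("translation_en_to_fr", "Energy efficiency matters."))
```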

The second finding is that shorter, more concise prompts and responses can reduce energy use by over 50%.
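In practice, a developer can enforce part of this saving by bounding how many tokens a model may generate. A small sketch, again using the transformers library with an illustrative model (the specific cap is an arbitrary example, not a figure from the report):

```python
# Cap the response length so the model decodes fewer tokens; every
# generated token costs a forward pass, so a tight budget bounds the
# energy spent per reply.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")  # illustrative

prompt = "In one sentence, why do shorter replies save energy?"
reply = generator(prompt, max_new_tokens=40)  # hard limit on new tokens
print(reply[0]["generated_text"])
```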

The final innovation is that model compression can save up to 44% in energy. Reducing the size of models through techniques such as quantisation helps them use less energy while maintaining accuracy.
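To illustrate the idea, here is a minimal PyTorch sketch of one common compression technique, post-training dynamic quantisation; the model name is illustrative, and the report does not prescribe this exact recipe:

```python
# Store the weights of Linear layers as 8-bit integers instead of
# 32-bit floats, shrinking the model and the memory traffic needed
# to run it on CPU.
import os
import tempfile

import torch
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # illustrative
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

def size_on_disk_mb(m: torch.nn.Module) -> float:
    """Serialise the model and report the file size in megabytes."""
    fd, path = tempfile.mkstemp(suffix=".pt")
    os.close(fd)
    torch.save(m.state_dict(), path)
    size = os.path.getsize(path) / 1e6
    os.unlink(path)
    return size

print(f"fp32 model on disk: {size_on_disk_mb(model):.0f} MB")
print(f"int8 model on disk: {size_on_disk_mb(quantized):.0f} MB")
```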

The three techniques explored in the report are particularly valuable in low-resource settings, where energy and water are scarce: smaller models are far more practical to run in environments with limited connectivity and infrastructure.
