Research Paper | Researchia:202601.29042

Where Do the Joules Go? Diagnosing Inference Energy Consumption

Jae-Won Chung


Submitted: January 29, 2026
Subjects: Machine Learning

Description / Details

Energy is now a critical ML computing resource. While measuring energy consumption and observing trends is a valuable first step, accurately understanding and diagnosing why those differences occur is crucial for optimization. To that end, we begin by presenting a large-scale measurement study of inference time and energy across the generative AI landscape with 46 models, 7 tasks, and 1,858 different configurations on NVIDIA H100 and B200 GPUs. Our empirical findings span order-of-magnitude variations: LLM task type can lead to 25× energy differences, video generation sometimes consumes more than 100× the energy of images, and GPU utilization differences can result in 3–5× energy differences. Based on our observations, we present a framework for reasoning about the underlying mechanisms that govern time and energy consumption. The essence is that time and energy are determined by latent metrics like memory and utilization, which are in turn affected by various factors across the algorithm, software, and hardware layers. Our framework also extends directly to throughput per watt, a critical metric for power-constrained datacenters.
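
The relationship between time, energy, and throughput per watt mentioned in the abstract follows from unit arithmetic: energy is average power multiplied by time, and throughput per watt (tokens per second per watt) is the reciprocal of energy per token (joules per token). The following is a minimal sketch of that arithmetic only. The function names and all numbers are hypothetical illustrations and are not taken from the paper or its measurements.

```python
def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy (J) = average power (W) x elapsed time (s)."""
    return avg_power_watts * seconds


def energy_per_token(avg_power_watts: float, seconds: float, tokens: int) -> float:
    """Joules spent per generated token."""
    return energy_joules(avg_power_watts, seconds) / tokens


def throughput_per_watt(avg_power_watts: float, seconds: float, tokens: int) -> float:
    """Tokens per second per watt, which is the same unit as tokens per joule."""
    return (tokens / seconds) / avg_power_watts


# Hypothetical run: 2,000 tokens generated in 10 s at an average draw of 500 W.
power_w, elapsed_s, n_tokens = 500.0, 10.0, 2000

total_j = energy_joules(power_w, elapsed_s)
j_per_token = energy_per_token(power_w, elapsed_s, n_tokens)
tok_per_s_per_w = throughput_per_watt(power_w, elapsed_s, n_tokens)

print(f"total energy: {total_j} J")
print(f"energy per token: {j_per_token} J/token")
print(f"throughput per watt: {tok_per_s_per_w} tokens/s/W")
```

Because the two quantities are reciprocals, any factor that lowers energy per token at a fixed workload raises throughput per watt by the same ratio.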


Source: arXiv:2601.22076v1 (http://arxiv.org/abs/2601.22076v1)
PDF: https://arxiv.org/pdf/2601.22076v1
