Research Paper
Researchia:202601.29156

LAMP: Look-Ahead Mixed-Precision Inference of Large Language Models

Stanislav Budzinskiy

Submitted: January 29, 2026
Subjects: Mathematics; Numerical Analysis

Description / Details

Mixed-precision computations are a hallmark of the current stage of AI, driving the progress in large language models towards efficient, locally deployable solutions. This article addresses the floating-point computation of compositionally-rich functions, concentrating on transformer inference. Based on the rounding error analysis of a composition $f(g(\mathrm{x}))$, we provide an adaptive strategy that selects a small subset of components of $g(\mathrm{x})$ to be computed more accurately while all other computations can be carried out with lower accuracy. We then explain how this strategy can be applied to different compositions within a transformer and illustrate its overall effect on transformer inference. We study the effectiveness of this algorithm numerically on GPT-2 models and demonstrate that already very low recomputation rates allow for improvements of up to two orders of magnitude in accuracy.
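The selection idea described above can be illustrated with a minimal sketch, assuming $g$ is a matrix-vector product and $f$ is a softmax. The selection rule used here (recompute the `k` largest low-precision logits) is a hypothetical stand-in, not the paper's criterion, which this abstract does not state. It is motivated by the softmax derivative $\partial p_i / \partial z_j = p_i(\delta_{ij} - p_j)$, which scales the effect of an error in $z_j$ by $p_j$. All function names (`to_half`, `dot_low`, `dot_high`, `mixed_precision_softmax`) are illustrative, and low precision is emulated in pure Python by rounding to IEEE binary16 through the `struct` format `"e"`.

```python
import math
import random
import struct


def to_half(value: float) -> float:
    """Round a binary64 float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def dot_low(row, x):
    """Dot product with each product and partial sum rounded to binary16."""
    acc = 0.0
    for a, b in zip(row, x):
        acc = to_half(acc + to_half(a * b))
    return acc


def dot_high(row, x):
    """Accurate dot product: binary64 products, correctly rounded sum."""
    return math.fsum(a * b for a, b in zip(row, x))


def softmax(z):
    """Numerically stable softmax evaluated in binary64."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_precision_softmax(W, x, k):
    """Compute softmax(W @ x) with low-precision logits, then recompute
    the k largest logits accurately (assumed, hypothetical selection rule)."""
    z = [dot_low(row, x) for row in W]
    # Look ahead at f: softmax is most sensitive to the largest logits.
    chosen = sorted(range(len(z)), key=lambda i: z[i], reverse=True)[:k]
    for i in chosen:
        z[i] = dot_high(W[i], x)
    return softmax(z), chosen


if __name__ == "__main__":
    rng = random.Random(0)
    W = [[rng.uniform(-1, 1) for _ in range(64)] for _ in range(32)]
    x = [rng.uniform(-1, 1) for _ in range(64)]
    reference = softmax([dot_high(row, x) for row in W])
    for k in (0, 4, 32):
        approx, _ = mixed_precision_softmax(W, x, k)
        err = max(abs(p - q) for p, q in zip(approx, reference))
        print(f"k={k:2d}  max abs error = {err:.3e}")
```

With `k = 0` every logit stays in emulated binary16, and with `k = len(W)` the result coincides with the accurate reference, so `k` plays the role of a recomputation budget in this toy setting.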


Source: arXiv:2601.21623v1 (http://arxiv.org/abs/2601.21623v1)
PDF: https://arxiv.org/pdf/2601.21623v1
