Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert eip7667 into an rip #26

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions RIPS/rip-7667.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
---
rip: 7667

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rip: 7667
rip: 7724

Assigning next sequential EIP/ERC/RIP number.

Please also update the filename.

title: Raise gas costs of hash functions
description: Raise the gas costs of hash function opcodes and precompiles, to match prover expenses in ZK-EVMs
author: Vitalik Buterin (@vbuterin)
discussions-to: https://ethereum-magicians.org/t/eip-7667-raise-gas-costs-of-hash-functions/19446
status: Draft
type: Standards Track
category: Core
created: 2024-03-31
---

## Abstract

Raise the gas costs of opcodes and precompiles that involve hash functions.

## Motivation

Gas costs for hash function opcodes and precompiles were originally set based on the time that it takes to execute them on a regular CPU. Since then, however, there has emerged another equally important execution substrate that the EVM is executed on: zero knowledge proof (ZK-SNARK) systems. By that standard, these opcodes and precompiles are _very_ underpriced relative to other operations.

Blocks that are heavy with hash function executions take much longer to ZK-SNARK prove than average blocks. Today, many layer-2 protocols are using workarounds such as arbitrary limits and rules enforced by centralized sequencers to deal with this issue. These workarounds are often poorly designed and lead to unneeded security and centralization concerns.

Additionally, there is a design goal of eventually using ZK-SNARKs to prove the correctness of layer-1 Ethereum blocks. To do this, we need to establish very tight bounds on how long it takes to generate a ZK-SNARK proof of an Ethereum execution block's correctness. Today, the difference between average-case proving time and worst-case proving time is large, and hash function executions are the largest reason why.

Verkle trees solve the part of this problem that comes from Keccak hashing in the Merkle Patricia tree. Today, the theoretical worst case is a block with `30000000 / 2600 = 11538` account accesses (minus a small percent for EVM overhead), where proving each access would require proving `24000` bytes of code, plus a roughly `512 * log(500m) / log(16) = 3712` byte Merkle-Patricia proof: roughly `305 MB` of hashing spread between `83650` hash calls. Verkle trees solve this. However, using opcodes, a block execution can still require roughly the following amount of hashing:

| Hash | Cost per word | Maximum bytes from 30 million gas |
| --------------------- | -------------------- | --------------------------------- |
| `KECCAK` (opcode) | 6 | 160 million |
| `SHA256` (precompile) | 12 | 80 million |
| `RIPEMD` (precompile) | 120 | 8 million |
| `BLAKE2` (precompile) | 3 (12 per 128 bytes) | 320 million |
| `LOG*` (hashing data) | 256 (8 per byte) | 3.75 million |

These hash functions have been optimized for rapid CPU computation, so CPUs can compute them quickly: for example, this author's laptop can compute a 160-million byte keccak in less than half a second. In ZK-SNARKs, however, these operations are _very_ time-consuming and inefficient. This is mitigated in newer proof systems based on small fields, but hashes remain underpriced.

## Specification

| Parameter | Previous value | New value |
| ------------------ | -------------- | --------- |
| `KECCAK_BASE_COST` | 30 | 300 |
| `KECCAK_WORD_COST` | 6 | 60 |
| `SHA256_BASE_COST` | 60 | 300 |
| `SHA256_WORD_COST` | 12 | 60 |
| `RIPEMD_BASE_COST` | 600 | 600 |
| `RIPEMD_WORD_COST` | 120 | 120 |
| `BLAKE2_GFROUND` | 1 | 10 |
| `GLOGBYTE` | 8 | 10 |

Change the above parameters to their new values.

## Rationale

The above increases the gas costs of all opcodes and precompiles that can be used to require large amounts of hashing in the EVM. All hashing costs are increased to 300 per hash plus 60 per word (or kept the same if they are already higher than this).
'"
A possible alternative to this approach is to implement either multidimensional gas pricing (ie. a separate floating basefee and per-block target and limit for hashes) or a "total gas cost floor" similar to what [EIP-7623](eip-7623.md) does for calldata. However, this approach is much harder to implement for in-EVM gas costs such as hashes than it is for calldata and blobs, because EVM gas limits are set not just by users, for whom new transaction types that specify newly required limits, max-basefees and priority fees can easily be added, but also by contracts making sub-calls to other contracts.

## Backwards Compatibility

For most applications, a reasonable upper bound is that data that is getting hashed in the EVM is getting brought in as calldata. If the hashing being done is binary Merkle proof verification, every 32 bytes of data corresponds to a 64-byte (2-word) hash. A word of costs 512 gas. Under the new costs, the hashing per word in that situation would be `300 + 60 * 2 = 420` gas. Hence, this will increase gas consumption for that component of the application by less than 2x.

Concretely, a length-20 Keccak-based binary Merkle proof gas cost would increase from `(512 + 42) * 20 = 11080` to `(512 + 420) * 20 = 18640`. This is a small increase, especially taking into account other costs associated with such an application.

## Security Considerations

As no new operations are introduced or made cheaper, no security concerns were raised.

## Copyright

Copyright and related rights waived via [CC0](../LICENSE.md).