<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://samuelemarro.it/feed.xml" rel="self" type="application/atom+xml" /><link href="https://samuelemarro.it/" rel="alternate" type="text/html" hreflang="en" /><updated>2025-10-11T18:04:41+00:00</updated><id>https://samuelemarro.it/feed.xml</id><title type="html">blank</title><subtitle>Personal page for Samuele Marro</subtitle><entry><title type="html">A Protocol Sketch For LLM Communication</title><link href="https://samuelemarro.it/blog/2024/a-protocol-for-llm/" rel="alternate" type="text/html" title="A Protocol Sketch For LLM Communication" /><published>2024-04-07T07:12:00+00:00</published><updated>2024-04-07T07:12:00+00:00</updated><id>https://samuelemarro.it/blog/2024/a-protocol-for-llm</id><content type="html" xml:base="https://samuelemarro.it/blog/2024/a-protocol-for-llm/"><![CDATA[<h1 id="tldr">TL;DR</h1>

<ul>
  <li>Natural language is flexible, structured data is efficient</li>
  <li>A protocol that supports structured data but allows natural language as a default achieves the best of both worlds</li>
  <li>Machines can use documents to describe highly specific communication protocols, which can be handled by traditional code routines</li>
  <li>For everything else (including negotiating new protocols, handling code failures and writing new routines) we can use LLMs</li>
</ul>

<h1 id="introduction">Introduction</h1>

<p>About 1.5 years ago I wrote a blog post in which I argued that natural language was the natural choice for flexible, programmer-less interfacing between machines. In that time span, a lot happened! GPT-4 was released and made a lot of people question their definitions of general intelligence. Llama and Mistral showed that good models aren’t necessarily closed source. And, most importantly, there has been a decisive trend towards bigger, more expensive and more centralized models, although advances in quantization and MoE have somewhat slowed the growth.</p>

<p>At the time of writing, all open LLMs with performance comparable to the “big guys” (ChatGPT, Claude, Bard) are 70B models, whose requirements are way beyond what the average consumer can afford. For example, <a href="https://qwenlm.github.io/blog/qwen1.5/">Qwen-1.5-72B</a> (#9 in the <a href="https://chat.lmsys.org/">Chatbot Arena</a>) requires, even with quantization, <a href="https://qwen.readthedocs.io/en/latest/benchmark/hf_infer.html">two NVIDIA A100 80 GB</a>, which puts the cost of running Qwen locally at about 40k USD.</p>

<p>So, how can we make open source LLMs more available to the general public?</p>

<p>I think that the best tool to compete with massive LLMs is to use networks of (relatively) small LLMs. Having 10, 100 or 1000 LLMs talking to each other is definitely less efficient compared to just training a model that is 10, 100 or 1000 times larger, but it also means that you don’t need an entire data center just to do inference.</p>

<p>And so, while I’m here in Oxford for my research visit, I thought it would be a good chance to expand on the idea, with the goal of building a concrete first step towards networks of LLMs. And I’d argue that the first step towards a network is defining how its nodes communicate.
Specifically, what I’m interested in is a simple protocol to allow communication between (LLM-powered) machines. Rather than directly outline the protocol I have in mind, I’ll walk you through the reasoning that led me to this specific implementation.</p>

<h1 id="a-sample-task">A Sample Task</h1>

<p>We’re going to use a very simple task: querying the price of a stock. There are two machines, Alice and Bob, each having their own databases, with their own schemas and their own conventions:</p>

<p><img src="https://i.imgur.com/zuFelp4.png" /></p>

<p>Alice wants to query Bob to obtain the current price of MSFT, which at the time of writing sits at 425.52 USD.</p>

<p>We have two seemingly competing goals:</p>
<ul>
  <li>Flexibility: the machines should be able to adapt to changes in data, schema or goals with little effort;</li>
  <li>Efficiency: performing a task should be as fast and low-resource as possible, especially if it is performed multiple times.</li>
</ul>

<h1 id="level-0-manual-implementation">Level 0: Manual Implementation</h1>

<p>The standard approach for answering a query is for someone to expose an API for Bob, while someone else writes code for Alice to query the API, convert the response into a database-friendly schema and store it.</p>

<p>This is as efficient as it gets: there’s a reason why pretty much all machine-to-machine communication is performed over APIs. But it’s not flexible: if you want to use a different API, or if you want to query more information (e.g. the price-to-earnings ratio), or even if you just want to change the internal schema of your database, you need a human to implement the changes.</p>

<h1 id="level-1-natural-language">Level 1: Natural Language</h1>

<p>But it’s 2024, and LLMs are hot. Since LLMs can solve everything, let’s just build two LLMs that chat with each other!</p>

<p>Specifically, we add two LLMs that can interface with Alice’s and Bob’s databases, respectively. Both LLMs are capable of using natural language to communicate with each other:</p>

<p><img src="https://i.imgur.com/rc26z8X.png" /></p>

<p>This is as flexible as it gets: if Bob’s database changes schema, Bob’s LLM only needs to be informed of the new schema, while from the point of view of Alice nothing changed. If Alice wants to query the price-to-earnings ratio, it’s just a matter of changing the question. If Bob’s server goes down, Alice can start querying another natural language-friendly machine without any major disruptions.</p>

<p>At the same time, however, we are using two language models to send a floating point value between two databases. This is the computational equivalent of shooting a fly with a bazooka.</p>

<h1 id="level-2-reinventing-apis">Level 2: Reinventing APIs</h1>

<p>Wouldn’t it be just easier if we could have the LLMs agree on a way to send the data? For example, suppose that Alice’s query is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Alice: What's the price of MSFT? Send the data as a JSON with a single field, "price", containing the price
Bob: { "price" : 425.52 }
</code></pre></div></div>

<p>If Bob is sufficiently consistent with its replies, Alice can just write a simple routine to store the data:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const result = await Alice.queryPrice('MSFT')
const price = JSON.parse(result).price
await myDatabase.update({ 'MSFT' : price })
</code></pre></div></div>

<p>This means that we don’t need to use Alice’s LLM to store the result in the database: we can rely on good ol’ code, written by the LLM. Similarly, Bob can ask Alice to provide queries in a standard format, so that it can also use routines. So, the first communication might be something like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Alice: I want to establish a protocol between us. I want to know the price of a stock.
Bob: I can send you the price as an XML, something like &lt;xml&gt;The price of &lt;stockName&gt;MSFT&lt;/stockName&gt; is &lt;stockPrice&gt;425.52&lt;/stockPrice&gt;&lt;/xml&gt;
Alice: Can we use JSON instead?
Bob: Ok, how about you send me a JSON with the price query, e.g. { "priceQuery" : "MSFT" }, and I send you a JSON with the result, e.g. { "price" : 425.52 } ?
Alice: Sounds good!
</code></pre></div></div>

<p>After this natural-language negotiation, future communication might be something much more terse:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Alice: { "priceQuery" : "MSFT" }
Bob: { "price" : 425.52 }
</code></pre></div></div>

<p>In this case, Alice and Bob could just use the routines written by the LLMs, which means that the latter don’t need to be invoked.</p>

<p>To make sure that Alice and Bob agree on which protocol they’re using, we might add an identifier for the protocol, plus an ACK/NAK token that lets each party programmatically check that the other understood:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Alice: ALICE-BOB-STOCK-PRICING { "priceQuery" : "MSFT" }
Bob: ACK { "price" : 425.52 }
</code></pre></div></div>

<p>Congratulations, we’ve just reinvented APIs. However, this approach has several benefits over a regular API:</p>
<ul>
  <li>If either Alice or Bob needs to change the protocol, they can just negotiate a new one</li>
  <li>If Alice needs to make some unusual queries, it can just ask in natural language</li>
</ul>
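<p>The ACK/NAK check is straightforward to implement. Below is a minimal sketch of a receiver-side dispatcher; the routine table, the message format and the toy price lookup are assumptions made for the example, not part of any real protocol:</p>

```javascript
// Hypothetical sketch: dispatch an incoming message to a negotiated routine,
// falling back to NAK (and, in a full system, to the LLM) when the protocol
// identifier is unknown.
const routines = {
  // Routine previously written (e.g. by the LLM) for the negotiated protocol.
  'ALICE-BOB-STOCK-PRICING': (payload) => {
    const { priceQuery } = JSON.parse(payload);
    const prices = { MSFT: 425.52 }; // toy stand-in for Bob's real database
    return JSON.stringify({ price: prices[priceQuery] });
  }
};

function handleMessage(message) {
  // Messages look like: "<PROTOCOL-ID> <payload>"
  const spaceIndex = message.indexOf(' ');
  const protocolId = message.slice(0, spaceIndex);
  const payload = message.slice(spaceIndex + 1);
  const routine = routines[protocolId];
  if (!routine) {
    return 'NAK'; // unknown protocol: the LLMs take over from here
  }
  return `ACK ${routine(payload)}`;
}

// handleMessage('ALICE-BOB-STOCK-PRICING { "priceQuery" : "MSFT" }')
// → 'ACK {"price":425.52}'
```

<p>Note that the routine is plain code: the LLMs are only involved when the dispatcher answers NAK.</p>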

<p>In general, the cool thing about this approach is that natural language is a universally supported default for communication. If there’s a more efficient protocol, that’s great! But if there are issues or unexpected changes, natural language is still an option.</p>

<p><img src="https://i.imgur.com/b2wfxqF.png" /></p>

<h1 id="level-3-scaling-to-a-network">Level 3: Scaling to a Network</h1>

<p>Our protocol is perfectly fine for communication between two machines. But when we scale to a network of machines, there are three new problems:</p>
<ul>
  <li>Negotiating a protocol with each machine in the network is inefficient</li>
  <li>If a query is forwarded to a different machine, the latter might lack the context required to understand the query</li>
  <li>There might be two protocols with the same name, or one protocol with two different names</li>
</ul>

<p>Fortunately, we humans have created a tool to efficiently tell the rest of the world how to implement a protocol: we call them standards! Whether they’re RFCs, ISOs, ERCs or whatever, these documents provide everything two parties need to initiate a communication.</p>

<p>So, for our stock market info, we can just write a document:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The client sends a plaintext JSON document having a single field, "priceQuery", which is the ticker of the stock (in uppercase). The server replies by sending a JSON document having a single field, "price", which is the price (in USD) of the corresponding stock. This price is expressed as a floating-point value.
</code></pre></div></div>

<p>Note that:</p>
<ul>
  <li>Unlike a schema, this document also provides the semantics for the data, which means that a model that has never seen this document before can still understand the data that was received;</li>
  <li>The document can range from general-purpose communication formats to highly specialized data formats;</li>
  <li>LLMs can write these documents.</li>
</ul>
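<p>To make this concrete, here is a sketch of the routines that code (LLM-written or otherwise) could derive from such a document. The function names, and any validation beyond the document’s text, are my own assumptions:</p>

```javascript
// Hypothetical routines derived from the protocol document above.

// Client side: build the request described by the document
// (a single "priceQuery" field holding an uppercase ticker).
function buildPriceQuery(ticker) {
  return JSON.stringify({ priceQuery: ticker.toUpperCase() });
}

// Server side: validate an incoming request against the document's rules.
function isValidPriceQuery(message) {
  let parsed;
  try {
    parsed = JSON.parse(message);
  } catch {
    return false; // not even valid JSON
  }
  return Object.keys(parsed).length === 1
    && typeof parsed.priceQuery === 'string'
    && parsed.priceQuery === parsed.priceQuery.toUpperCase();
}
```

<p>Because the document also states the semantics (ticker, USD, floating point), a machine that receives it for the first time has everything it needs to write routines like these.</p>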

<p>Now we just need a way to assign a unique identifier to this document. The classic approach would be to rely on a standards agency to assign codes, but:</p>
<ul>
  <li>That defeats the whole point of having a decentralized system;</li>
  <li>No agency (even with the help of LLMs) could process and catalogue all the possible communication protocols that two LLMs might establish.</li>
</ul>

<p>There is, however, an alternative, inspired by the <a href="https://ipfs.tech/">IPFS Protocol</a>: using the hash of the document as a unique identifier. By definition, a document can only have one hash, and with a good enough hashing scheme the probability of collisions is so low (e.g. 1 over 2^160, if we’re using SHA-1) that they might as well be non-existent.</p>

<p>And that’s it! The final communication, assuming that we’re using Base64-encoded SHA-1 as the hashing scheme, is thus something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Alice: o2R8vS9V7BqfhHQFVWapyCVyqNs= { "priceQuery" : "MSFT" }
Bob: ACK { "price" : 425.52 }
</code></pre></div></div>

<p>All of this happens without the intervention of the LLMs: if, however, there are any issues with the communication, the LLMs can simply intervene. For example, if Bob doesn’t have a routine to handle the protocol, it can just send a NAK, which makes Alice send the corresponding protocol document (which is then read by Bob’s LLM). Similarly, if for some reason the routine fails, the LLM can take over. As a flowchart:</p>

<p><img src="https://i.imgur.com/fNvqXCD.png" /></p>
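<p>In code, Alice’s side of this fallback flow might look roughly like the sketch below. Every object and method here (<code>bob.send</code>, <code>llm.ask</code>, the <code>PROTOCOL-DOCUMENT</code> prefix) is a hypothetical placeholder, not a defined part of the protocol:</p>

```javascript
// Hypothetical sketch of the sender-side fallback flow: try the fast
// routine-based path first, then fall back to sending the protocol
// document, then to plain natural language handled by the LLM.
async function query(bob, id, payload, protocolDocument, llm) {
  let reply = await bob.send(`${id} ${payload}`);
  if (reply.startsWith('NAK')) {
    // Bob has no routine for this protocol: send him the document
    // so his LLM can read it (and possibly write a routine).
    reply = await bob.send(`PROTOCOL-DOCUMENT ${protocolDocument}\n${id} ${payload}`);
  }
  if (reply.startsWith('NAK')) {
    // Still no luck: natural language remains the universal default.
    reply = await llm.ask(bob, payload);
  }
  return reply;
}
```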

<h1 id="conclusion">Conclusion</h1>

<p>This pretty much covers the core of my idea for the communication protocol. Of course, this is far from a full-fledged proposal, as there are lots of open questions, such as:</p>
<ul>
  <li>How do we broadcast information about protocols to the rest of the network?</li>
  <li>How can a machine trust a previously-unknown protocol to be safe?</li>
  <li>How can we implement some common features of other communication protocols (e.g. authentication, retrying, message forwarding…)?</li>
  <li>Is there a preferable way to write a protocol document?</li>
  <li>Which hashing algorithm should be used?</li>
</ul>

<p>I hope to cover potential solutions to these challenges in the next posts. That said, if you have any opinions on the protocol, its strengths and its flaws, feel free to reach out! The more I think about LLM communication, the more it looks like building a proper ecosystem (protocols, software, know-how…) for networks of LLMs is something that will require the effort of an entire community. Fortunately, it looks like quite a lot of people seem to care about keeping Machine Learning decentralized.</p>

<p>Edit: It turns out that there are indeed lots of people who care about decentralized AI! In the meantime, I and some other folks at Oxford refined this idea and made <a href="https://agoraprotocol.org/">Agora</a>. Check out the <a href="https://arxiv.org/abs/2410.11905">paper</a> as well.</p>]]></content><author><name></name></author><category term="ai" /><summary type="html"><![CDATA[So, how do we make LLMs talk to each other?]]></summary></entry><entry><title type="html">Networks of Neural Networks</title><link href="https://samuelemarro.it/blog/2022/networks-of-nns/" rel="alternate" type="text/html" title="Networks of Neural Networks" /><published>2022-08-22T15:12:00+00:00</published><updated>2022-08-22T15:12:00+00:00</updated><id>https://samuelemarro.it/blog/2022/networks-of-nns</id><content type="html" xml:base="https://samuelemarro.it/blog/2022/networks-of-nns/"><![CDATA[<h1 id="tl-dr">TL; DR</h1>

<ul>
  <li>Natural language is a great universal language for machine-readable information</li>
  <li>We can create natural language models that formulate questions and talk with other AIs using natural language</li>
  <li>Such a network-of-neural-networks would provide a decentralized alternative to the more typical “supercomputer AI”</li>
</ul>

<h1 id="introduction">Introduction</h1>

<p>Humanity has a lot of information and computational power, and both have long proven to be very useful.
It is thus natural that entities that have the capacity to both collect information on large scales and process it in meaningful ways tend to have an advantage over those that don’t. At the same time, this advantage tends to benefit organizations that already have considerable resources, creating a potentially dangerous feedback loop.</p>

<p>That said, it’s not like computational power is inaccessible. The most powerful supercomputer in the world is the <a href="https://en.wikipedia.org/wiki/Frontier_(supercomputer)">Hewlett Packard Enterprise Frontier</a>, with roughly 1.1 exaFLOPS, corresponding to the computational power of ~1M smartphones. An impressive figure, but it also means that the citizens of <a href="https://en.wikipedia.org/wiki/Phoenix,_Arizona">Phoenix, Arizona</a> could theoretically outperform a supercomputer by downloading an app. Similarly, most information is either publicly available (e.g. Wikipedia) or being produced by regular users.</p>

<p>A more relevant asymmetry concerns <em>programming power</em>: a government or a large company can afford to pay programmers and employees to collect and process information in a centralized manner.
Open-source and grassroots projects tend, on the other hand, to be fragmented into countless independent systems, each with its own objective, approach and policy. In my opinion, that’s a feature, not a bug.</p>

<p>The downside is that a large portion of information and computational power is thus locked away: twenty NGOs could collect data on the trout population in different parts of the world, but unless someone manually collects the data from all their websites and standardizes them, you can’t run queries on the sum of their knowledge. They could also pool their computational resources to create an ML model of which factors influence the trout population, but convincing twenty organizations to spend money to set up such a system is going to be difficult at best.</p>

<p>Is there thus a way to pool information and computational resources without sacrificing decentralization?
Let’s consider two existing techniques that allow machines to share machine-readable information, and why they fall short of the task.</p>

<h1 id="ontologies">Ontologies</h1>

<p>In short, ontologies are standardized machine-readable descriptions of entities that are related to other entities. For example, let’s say that you want to formalize “Tim Berners-Lee was born in London”. To do so, you write something like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>_:a &lt;https://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;https://schema.org/Person&gt; .
_:a &lt;https://schema.org/name&gt; "Tim Berners-Lee" .
_:a &lt;https://schema.org/birthPlace&gt; &lt;https://www.wikidata.org/entity/Q80&gt; .
&lt;https://www.wikidata.org/entity/Q80&gt; &lt;https://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;https://schema.org/Place&gt; .
&lt;https://www.wikidata.org/entity/Q80&gt; &lt;https://schema.org/name&gt; "London" .
</code></pre></div></div>

<p>Here, “_:a” represents Tim Berners-Lee, while Q80 is WikiData’s ID for London. As you can see, everything is standardized and machine-readable. Even the relations between entities (i.e. name and birthPlace) are standardized, and these standards are accessible in machine-readable formats. There are standards for pretty much everything, from books to people to ideas. They can be used by servers and people alike to share information in a universal manner.</p>

<p>Ontologies suffer from two problems:</p>
<ul>
  <li>They don’t capture the actual meaning of concepts: a machine can easily read the birthPlace schema, but it cannot identify relations between concepts (e.g. birth and death) or use the collected data in other contexts, unless told so by a programmer</li>
  <li>They are standardized: if your goal is to formalize the entirety of human knowledge, any type of standard is going to be either too formal and complex, or too loose and devoid of nuance</li>
</ul>

<p>In practice, this makes working with ontologies both frustrating and mind-numbing.</p>

<h1 id="api-hell">API Hell</h1>

<p>APIs represent the most common way to allow machines to communicate between each other.
We can split the API usage process in four parts:</p>
<ul>
  <li>Figure out what the various API paths and arguments do</li>
  <li>Figure out how the data offered by the API can be used to obtain the information you want</li>
  <li>Write the code to pull data</li>
  <li>Write the code to convert the received data into something usable for your program</li>
</ul>

<p>The process of writing an API is similar:</p>
<ul>
  <li>Figure out what information you have</li>
  <li>Figure out how you can expose information as standardized data</li>
  <li>Write the code to convert your program’s internal representation of information into something standardized</li>
  <li>Write the code to provide data</li>
</ul>

<p>APIs are the basic tools of machine-to-machine communication, and yet both exposing information and retrieving it involve considerable work. Ontologies suffer from the same problem, with the only difference being that you’re exposing XML/JSON documents instead.</p>

<p>This type of work is well within the reach of motivated programmers, but can we say the same about regular folks? Would we be able to convince a rural doctor to provide medical statistics (e.g. to track the spread of diseases) <em>and</em> implement a standardized API <em>and</em> input the data in a standardized format?</p>

<h1 id="the-universal-language">The Universal Language</h1>

<p>If we want to create a communication system that allows machines to gather and share information, we need to follow some ground rules:</p>
<ol>
  <li>It must be flexible enough to capture concepts with both high and low levels of nuance</li>
  <li>Machines interacting with it must have some way to connect concepts beyond what is said in the document</li>
  <li>The system must not be difficult to learn or use</li>
</ol>

<p>There’s actually a language that (kinda) satisfies all three requirements: natural language.</p>

<p>While points 1 and 3 are both self-evident, it is only in recent years that natural language models have become able to reason beyond the immediate data. If you ask GPT-3 what’s a “colorful animal with wings that is capable of singing”, it will (on a good day) say that it’s a bird.
Combined with recent advances in neural programming, this suggests a way for organizations to pool data and information without any effort: have natural language models act as “middlemen”, with natural language being a flexible (if ambiguous) API.</p>

<h1 id="neural-crawlers">Neural Crawlers</h1>

<p>Imagine a Neural Network (or whatever ML model we will come up with) that is capable of performing three tasks:</p>
<ul>
  <li>Reading natural language documents (including web pages) and tabular data</li>
  <li>Coming up with relevant Google queries and questions</li>
  <li>Chatbot-level conversation</li>
</ul>

<p>All of these tasks have been studied intensively, although there’s little work on unifying them in a more general model.
Combined with existing search engines, crawling tools and translation software, we can create a Neural Crawler (NC), which works as follows:</p>
<ul>
  <li>The user asks a question in natural language (e.g. “What’s the average weight of a male trout in August 2022?”)</li>
  <li>The NC comes up with relevant queries (“fish weight database”, “common trout locations”, “north dakota trout data”…)</li>
  <li>The NC crawls the found websites, parsing the data whether it’s in an Excel file, a PDF document or a web page</li>
  <li>The NC pools the information and provides a final answer, in addition to a list of sources so that the user can check its correctness</li>
</ul>
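<p>The loop above can be sketched as an orchestration skeleton. Every helper function here is a hypothetical stand-in for a model or crawling component (query generation, search, parsing, summarization), not an existing API:</p>

```javascript
// Hypothetical skeleton of a Neural Crawler: turn a question into search
// queries, crawl and parse the results, then pool them into one answer.
async function neuralCrawler(question, { generateQueries, search, parsePage, summarize }) {
  const queries = await generateQueries(question); // e.g. "fish weight database"
  const sources = [];
  for (const q of queries) {
    for (const url of await search(q)) {
      // parsePage would handle Excel files, PDFs and web pages alike.
      sources.push({ url, content: await parsePage(url) });
    }
  }
  const answer = await summarize(question, sources);
  // Return the sources too, so the user can check correctness.
  return { answer, sources: sources.map((s) => s.url) };
}
```

<p>The point of the skeleton is that only the helpers are learned components; the glue between them is ordinary code.</p>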

<p>Google already offers a similar mechanism: if you try to search a question, it will attempt to provide an answer, although it is usually only taken from one source.</p>

<p>While a fully-supported Q&amp;A system would represent a massive improvement in user experience (and make all my Google-Fu skills outdated), the true benefit comes from allowing other AIs to interface with this system. A model with a natural language module would be able to access the entirety of humanity’s knowledge without the massive data and hardware requirements that general-purpose models currently face. Such “Google-Fu” models would then be able to make contributions by processing information in interesting ways or by combining private data (without exposing it).</p>

<h1 id="the-world-wide-web-of-ais">The World Wide Web of AIs</h1>

<p>The next logical step is to create a network of such models. If a model doesn’t have enough information, it can just query other models, which can then query other models and so on. Each AI would be able to combine sources, add new information or provide unexpected insight. Since the “interface” uses natural language, there’s not even a need to define a new standard: even regular emails would be a valid way to communicate between AIs (although there would probably be a slow transition towards a semi-defined standard, for example to simplify chatting). Note that this removes the need of relying on a search engine.</p>

<p>It’s possible that this system would need financial incentives in order to encourage meaningful contributions. For example, the models might agree on a price in exchange for the information or the computational power. Smart-contract blockchains seem to be the obvious decentralized choice for signing these agreements, although they would probably need to process transactions on the order of billions to handle high-speed communication between NNs. In order to filter out malicious or inaccurate sources, there could even be a web-of-trust system like in PGP.</p>

<p>This network of neural networks would be, for all intents and purposes, a macro-AI, capable of acting like a large-scale AI (although less efficiently) while still being decentralized. In a world where general models require millions of dollars to be trained and hosted, this represents an alternative way to reap the benefits of AI without giving up independence.</p>

<p>Of course, there are several obstacles to an actual implementation of this system, such as:</p>
<ul>
  <li>ML models are still fragile and tend to fail when dealing with unexpected data</li>
  <li>ML models struggle to understand if data is reliable, and can easily be misled (see: the entirety of the adversarial ML field)</li>
  <li>Natural language models are still way too big (or, conversely, hardware is not yet powerful enough) to run a good model on consumer hardware</li>
  <li>There’s no guarantee that people would agree on some form of payment protocol</li>
</ul>

<p>However, I believe that, with enough dedicated effort, it could be achieved without major leaps in AI or hardware technology.</p>

<h1 id="conclusion">Conclusion</h1>

<p><a href="https://en.wikipedia.org/wiki/Memex">Memex</a>, <a href="https://en.wikipedia.org/wiki/Project_Xanadu">Xanadu</a> and the <a href="https://en.wikipedia.org/wiki/World_Wide_Web">World Wide Web</a> all shared the same objective: simplifying access to information. However, AI-driven information is becoming more expensive to produce, and at the same time more powerful in the hands of those who have it. Even if Neural Crawlers and the Network of Neural Networks don’t turn out to be the solution, democratizing information is a key challenge for the 21st century, one which I hope will be solved by individuals motivated by the same ideals that moved the founders of the Internet.</p>]]></content><author><name></name></author><category term="ai" /><summary type="html"><![CDATA[Natural language models as a tool for decentralized computation]]></summary></entry><entry><title type="html">Revisiting GreenNFT, one year later</title><link href="https://samuelemarro.it/blog/2022/revisiting-greennft/" rel="alternate" type="text/html" title="Revisiting GreenNFT, one year later" /><published>2022-08-15T15:12:00+00:00</published><updated>2022-08-15T15:12:00+00:00</updated><id>https://samuelemarro.it/blog/2022/revisiting-greennft</id><content type="html" xml:base="https://samuelemarro.it/blog/2022/revisiting-greennft/"><![CDATA[<h1 id="tldr">TL;DR</h1>

<ul>
  <li>Our predictions were pretty correct</li>
  <li>Still, the data are outdated: emissions-per-dollar have decreased by ~40%</li>
  <li>The model diverges significantly from de Vries’, although there are several factors that can explain it</li>
</ul>

<h1 id="introduction">Introduction</h1>

<p>In May 2021, <a href="https://twitter.com/lucadonnoh">Luca Donno</a> and I published “Green NFTs: A Study on the Environmental Impact of Cryptoart Technologies”, a 16-page investigation into the actual greenhouse gas impact of NFTs. While at the time the impact of blockchains was already well studied (see <a href="https://www.sciencedirect.com/science/article/pii/S2542435118301776">de Vries’ landmark paper</a> on the topic), whether the NFT market played a role in emissions was still up for debate, with some even claiming that <a href="https://medium.com/superrare/no-cryptoartists-arent-harming-the-planet-43182f72fc61">NFTs have</a> <a href="https://www.artnews.com/art-news/news/nft-carbon-environmental-impact-1234589742/">no impact</a>.</p>

<p>According to our model, that couldn’t have been farther from the truth: spending 1 dollar on a transaction (either by minting an NFT, bidding or transferring it) causes an estimated emission of 1.305 kgCO2eq. To put that number in context, a typical NFT (using May 2021 data) would have emitted as much as driving 1862 km/1157 mi.</p>

<p>Our work (the paper + the companion website, now out of date) won a 3k USD award and is now cited by <a href="https://en.wikipedia.org/wiki/Non-fungible_token">Wikipedia’s article on NFTs</a> (which in my opinion still represents my greatest academic achievement).</p>

<p>The thing is, time flies very fast in the blockchain space, to the point where we were pretty sure that our results wouldn’t have been relevant for more than a couple months. Is this what actually happened?</p>

<h1 id="predictions-predictions">Predictions, Predictions</h1>

<p>The paper made four medium-term predictions:</p>
<ul>
  <li><a href="https://eips.ethereum.org/EIPS/eip-1559">EIP-1559</a>, a new fee auction mechanism, would reduce miners’ revenue (and thus global Ethereum emissions) by 30%</li>
  <li>EIP-1559 would reduce miners’ revenue from transactions by 98%</li>
  <li>Layer-2 blockchains, which are significantly less polluting than Ethereum, would become much more common</li>
  <li>The Merge, i.e. the transition of Ethereum to Proof-of-Stake (which we expected to happen in January 2022), would make Ethereum emissions negligible</li>
</ul>

<p>Most of them turned out to be fairly correct: the revenue (in ETH) of miners decreased by 30-40%, while the revenue (in ETH) from transactions decreased by 90% (note that the ETH price crash led to the USD revenue being even lower). Moreover, L2s went from a niche topic to a common tool for low-fee transactions, with Optimism and Arbitrum becoming well-known names and a growing interest in zero-knowledge rollups.</p>

<h1 id="what-didnt-we-predict">What Didn’t We Predict?</h1>

<p>May 2021 was a very different time: ETH/USD was inching ever closer to 4k, gas prices fluctuated between 50 and 150 Gwei, and “En Eff Tees” were just starting to leak into mainstream knowledge. While we did expect price fluctuations to affect miners’ revenue, there were four events that made our data outdated:</p>
<ul>
  <li>September 2021: the People’s Bank of China forbids all Chinese citizens from using cryptocurrency. Most Chinese miners were faced with three choices: closing shop, moving to other countries, or continuing their operations underground</li>
  <li>February 24th, 2022: Russia invades Ukraine. Natural gas prices skyrocket, followed by increases in the price of electricity as well. Mining suddenly becomes more expensive in many regions</li>
  <li>April 2022: Tim Beiko (Ethereum Foundation) announces a <a href="https://twitter.com/timbeiko/status/1513610106721603587">new delay of the Merge</a> to September</li>
  <li>May 2022: the $LUNA + $UST crash worsens an already strong bear run. The gas price, which was already declining, reaches an average below 20 Gwei within a couple of months</li>
</ul>

<h1 id="the-new-data">The New Data</h1>

<p>New circumstances call for new sources of data.</p>

<p>First, the original paper used Silva’s region estimate from April 2019, which is by now more than three years old. We therefore turned to mining pool statistics, which, despite their flaws (mainly lack of region reporting and no guarantee that a miner using a node from a certain region is actually in that region), still represent the best source on the geographical distribution of miners. Instead of the older AMD RX 590, we picked the RTX 3060 Ti as the reference GPU. We also used updated electricity prices, although the fact that some countries report average prices at the end of the year means that the electricity price is likely underestimated (which, in turn, means that the emissions are probably overestimated).</p>

<p>You can find all our (August 1st) data and calculations <a href="https://docs.google.com/spreadsheets/d/18MDF_jAqI217GSUl90akfEp7cCi1gRg5eStPoLwUKXw/edit?usp=sharing">here</a>.</p>

<p>Running the model on the new data gives us the following results:</p>
<ul>
  <li>A miner earning 1 USD corresponds to emitting 0.791 kgCO2eq (-40%)</li>
  <li>Only ~10% of a transaction fee goes to miners, which means that the individual impact of spending 1 USD on transaction fees is ~0.08 kgCO2eq (note that this doesn’t take into account the effects that transactions have on prices)</li>
  <li>The global impact of the Ethereum blockchain is 5.88 MtCO2eq per year</li>
</ul>
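<p>To make the arithmetic behind the second bullet explicit, here is the per-transaction calculation in code (the 0.791 kgCO2eq/USD intensity and the ~10% miner fee share are the estimates above):</p>

```javascript
// Emissions attributable to spending 1 USD on transaction fees:
// only the share of the fee that actually reaches miners translates
// into mining revenue, and thus into emissions.
const minerIntensityKgPerUsd = 0.791; // kgCO2eq per USD of miner revenue
const minerFeeShare = 0.10;           // ~10% of a transaction fee goes to miners

const txIntensityKgPerUsd = minerIntensityKgPerUsd * minerFeeShare; // ≈ 0.08 kgCO2eq
```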

<p>The last figure is significantly different from <a href="https://digiconomist.net/ethereum-energy-consumption">de Vries’ current estimate of 46.72 MtCO2eq</a>, which is based on a variant of his <a href="https://digiconomist.net/bitcoin-energy-consumption">Bitcoin model</a>. What can explain such a difference?</p>

<ul>
  <li>De Vries uses a fixed electricity price of 0.05 USD/kWh, which doesn’t take into account the recent increase in prices (our weighted average is 0.16 USD/kWh). The price is then doubled to 0.10 USD/kWh to reflect the lower efficiency of the Ethereum hashing algorithm; however, this fixed correction fails to capture the efficiency improvements over the years</li>
  <li>De Vries doesn’t provide information on how he estimates the geographical distribution of miners; it is reasonable to assume that, similarly to his Bitcoin model, he relies on <a href="https://ccaf.io/cbeci/mining_map">Cambridge’s Bitcoin estimates</a>, which might not correspond to the distribution of Ethereum miners</li>
  <li>For Bitcoin, de Vries assumes that electricity represents 99.5% of total costs for miners, compared to our ~90% figure (which we derived from our estimates of the cost of both hardware and electricity)</li>
</ul>

<h1 id="new-predictions">New Predictions</h1>

<p>In the spirit of the original paper, here are four new (and more cautious) predictions:</p>
<ul>
  <li>The impact of Ethereum will increase again, as the price of ETH rises again and the price of electricity stabilizes</li>
  <li>The Merge will happen within 6 months: the merging of the testnets has been completed successfully, and while I’m still a bit skeptical that the Merge won’t be delayed past September, I’m confident that the team behind it can pull it off in less than 6 months</li>
  <li>If the Merge actually succeeds, Ethereum emissions will decrease by at least 99%, since the entire concept of electricity-intensive computation will be abandoned</li>
  <li>If the Merge is delayed again, it will provide a boost to the growth of L2s, which means that at the very least a growing percentage of transactions within the Ethereum ecosystem will be on energy-efficient blockchains</li>
</ul>

<p>For now, we can only hope that the Merge will go smoothly as planned, and that it will bring Ethereum emissions to such low levels that the entire discussion regarding its impact (including our paper) will become irrelevant. And, to be fair, that’s not a bad reason for a work to be forgotten.</p>]]></content><author><name></name></author><category term="blockchain" /><summary type="html"><![CDATA[Has the model held up?]]></summary></entry></feed>