This is an ongoing collection of longer form thoughts on the design and philosophy behind the Coalescent Computer.

NLNet NGI Entrust February '23 Open Call

@jakintosh | February 1, 2023

As part of my effort to find more support for my work, I submitted a proposal to the NGI Zero Entrust open call from NLNet. This open call was focused on trustworthiness, privacy, and security. The Coalescent Computer is not specifically a privacy or security project, but it does provide many of those benefits as a side effect.

I submitted a proposal for work on the Coalescent Database (CDB) component of the project, as that piece will be critical infrastructure for the privacy and security benefits that can be enabled, and because the CDB has the broadest applicability outside of the Coalescent Computer (CC) ecosystem.

For transparency, below is the full text of my proposal.

Update (April 10th, 2023): My submission was not selected for this call.

NGI0 Entrust Grant Application

Submitted on January 31st, 2023

Abstract: Can you explain the whole project and its expected outcome(s). (1200 chars)

The proposed project, Coalescent Database (CDB), is a content-addressable graph database. It will conform to a deterministic binary data protocol called Coalescent Data (CODA) to enable content addressability. It will provide the ability to name and link data hashes. It will provide a simple query language to fetch and filter data objects and relationships. It will serve data in the native binary CODA format, and in JSON for cross-compatibility. Finally, it will allow for fetching and querying over the network with authorization.
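
To illustrate why a deterministic encoding enables content addressability, here is a minimal sketch in Python. It is not the CODA format or the CDB design: canonical JSON stands in for the binary protocol, SHA-256 is an assumed hash, and the names `encode`, `address`, and `Store` are hypothetical. The point is only that identical data always encodes to identical bytes, so it always receives the same address.

```python
import hashlib
import json


def encode(obj):
    """Stand-in for a deterministic encoding (NOT CODA):
    canonical JSON with sorted keys and no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def address(obj):
    """Content address: hash of the deterministic encoding.
    SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(encode(obj)).hexdigest()


class Store:
    """Toy in-memory content-addressed store (hypothetical)."""

    def __init__(self):
        self.objects = {}  # address -> encoded bytes

    def put(self, obj):
        key = address(obj)
        self.objects[key] = encode(obj)  # storing the same data twice is a no-op
        return key

    def get(self, key):
        return json.loads(self.objects[key])


store = Store()
a = store.put({"title": "note", "body": "hello"})
b = store.put({"body": "hello", "title": "note"})  # same data, different key order
```

Because the encoding sorts keys, `a` and `b` are the same address and the store holds a single object. A non-deterministic encoding would break that property, which is why the protocol has to fix the byte representation.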

The purpose and expected outcome of the CDB project is to replace the “filesystem” for a new paradigm of local-first, network-coalescent, data-centric computing. This means data primarily lives and changes locally, can easily move to other systems when authorized, and is a “first class citizen” (instead of applications).

CDB will be a foundational component of a larger project reimagining the networked personal computer (the “Coalescent Computer”; stay tuned for future proposals), but its design makes it useful even in existing technology stacks (HTTP + JSON) and aligns closely with the goals of NGI0 Entrust: data portability, data privacy, and user agency.

Have you been involved with projects or organisations relevant to this project before? And if so, can you tell us a bit about your contributions?

I have always been at the forefront of understanding and developing new technology paradigms, from mobile applications to head-mounted augmented reality. I’ve had professional roles as a game designer, a lead software engineer, and a data analytics infrastructure engineer.

Most relevant to this proposal, I have been working on the larger project of untangling and reimagining networked computing for just over two years, full time and self-funded. I spent the first 9 months untangling the issues with the information architecture of the World Wide Web (an example of my thoughts), before spending a long time researching and understanding most of the foundational architectural components of the internet by reinventing them from first principles. I have many technical prototypes and explorations of this project on my GitHub, and I have the in-progress reference implementations of the larger project on my SourceHut. Finally, depending on when you read this, I will have my historical developer logs, notes, and project component overview on the project website.

Explain what the requested budget will be used for? Does the project have other funding sources, both past and present? (If you want, you can in addition attach a budget at the bottom of the form)

The budget I’m asking for is 12,500 EUR. This covers labor only, based on an expectation of 3 months of work at an average of 25 hours per week, or about 325 total hours, which comes to just under 40 EUR per hour. Based on my cadence on related projects under this umbrella, the expected split of those hours is roughly 50/50 between continued research and actual implementation.

Compare your own project with existing or historical efforts.

The primary comparison you might reach for is IPFS, due to its content-addressability, focus on p2p computing, and its use of the term “file system” in the name. However, there are key differences.

  1. IPFS does not use a data protocol with a schema, while CDB uses CODA, which requires a schema. Deterministic schemas allow all data objects to be self-describing, which is a key requirement for “data-centric” computing. An arbitrary binary blob is not useful unless you have an external signal that tells you how to handle it.
  2. IPFS spends a lot of resources mimicking and integrating with a Unix filesystem, whereas CDB provides a clean slate. Providing hard links (data hashes inside data objects), soft links (separate data entries that create a relationship between two or more other data objects), and namespaces (mapping strings to data hashes) is all that is required to refer to and relate data. CDB replaces the file system with cleaner abstractions rather than directly imitating it.
  3. IPFS bakes the p2p and networking abstractions into the file system abstraction, while CDB respects the fact that these are two fundamentally separate components. Heavy abstractions do not make for good choices in foundational infrastructure, and IPFS’s global peer-mesh abstraction is fundamentally not how the internet works. It’s an incorrect assumption that breaks the fluidity of the system, much like blockchain’s insistence on global state consensus forces it to make tradeoffs on energy consumption or decentralization of power.
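
The three primitives named in the second point (hard links, soft links, and namespaces) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CDB design: the `put` and `links_to` helpers, the field names (`relation`, `from`, `to`), the example name `blog/latest`, and the use of canonical JSON with SHA-256 in place of CODA are all hypothetical.

```python
import hashlib
import json


def put(objects, obj):
    """Store obj under the hash of a deterministic encoding.
    Canonical JSON + SHA-256 are stand-ins chosen for this sketch."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    objects[key] = obj
    return key


def links_to(objects, target, relation):
    """Find the sources of soft links with the given relation pointing at target."""
    return [
        o["from"]
        for o in objects.values()
        if o.get("relation") == relation and o.get("to") == target
    ]


objects = {}  # data hash -> data object
names = {}    # namespace: string -> data hash

author = put(objects, {"name": "alice"})

# Hard link: the post object contains the author's hash directly.
post = put(objects, {"title": "Hello", "author": author})

# Soft link: a separate entry that relates two existing objects
# without modifying either of them.
reply = put(objects, {"title": "Re: Hello"})
link = put(objects, {"relation": "reply-to", "from": reply, "to": post})

# Namespace entry: a human-readable name mapped to a data hash.
names["blog/latest"] = post
```

Resolving `names["blog/latest"]` yields the post's hash, following its `author` field reaches the author object, and `links_to(objects, post, "reply-to")` finds the reply. No directory tree or path semantics are involved.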

Overall, CDB will be much more lightweight, and by accepting a single limitation in its root data representation (using CODA data, instead of arbitrary binary blobs) it ends up becoming less opinionated and more extensible throughout the remainder of the stack.

What are significant technical challenges you expect to solve during the project, if any?

The only real technical challenge I see might be around disk space optimization, though I honestly don’t think that will be a primary concern for an initial implementation of the project. The majority of the challenges will be research/conceptual, most likely working out the details of the query language and the CODA specification as it gets a real-world implementation and use. I’ve drafted a good portion of a preliminary CODA spec, but based on my experience in previous parts of this project, there will certainly be growing pains as it becomes concrete.

Describe the ecosystem of the project, and how you will engage with relevant actors and promote the outcomes?

As I mentioned a few times, the CDB is an important component of the larger project of the “Coalescent Computer”, a set of protocols and reference implementations for a local-first, data-centric “social computing” environment. Outside of this grant work, I have been building the execution environment, composed of the Coalescent Instruction Set (COINS), the Co programming language, and the Cohost virtual machine. CDB will integrate directly with the Cohost virtual machine, and play the role of a mass storage device to the virtual CPU. The COINS/Co/Cohost tools should be completed well before the projected completion of this grant proposal, and so the CDB implementation will serve to complete the local-only vision of the Coalescent Computer. Over the course of the year, I hope to build a community around this project as it begins to take a concrete shape, and CDB will take the early Cohost VM from a CPU + RAM system to a CPU + RAM + STORAGE system.

Outside of my hope to grow interest in the Coalescent Computer ecosystem, I believe that CDB will be of interest to many of the people who are exploring local-first and/or data-centric computing paradigms, including those in the growing “permacomputing” space. Organizations like Ink & Switch and communities like Merveilles may be particularly interested in the possibilities of a system like CDB.