For the April round of NLnet submissions, I applied to the new, and much better aligned, NGI Zero Core open call. To let the fund speak for itself:
We need your contributions to help reshape the state of play, and to help create an open, trustworthy and reliable internet for all. This includes developing alternatives and improvements to core internet hardware, software and protocols removes gatekeepers, choke points and surveillance capabilities. But also improving issues of security, privacy, interoperability, high availability and scalability of decentralised technologies to allow us to benefit from both 'local first' and from economies of scale without unnecessary centralisation.
This seemed like an almost exact match for the project, and so I submitted a proposal around building up the Co programming language tooling and its execution environment.
Update (June 8th, 2023): My submission advanced to the second round of the call.
Update (July 18th, 2023): My submission was not selected for funding.
For transparency, the full submission is below.
NGI0 Core Application
Project: Co - A content-addressable programming language, vISA, and VM
Website: http://coalescent.computer
Abstract: Can you explain the whole project and its expected outcome(s). (1200 chars)
This proposal is for the development of (1) a virtual instruction set architecture, (2) a content-addressable programming language that compiles to that instruction set, and (3) a virtual stack machine that loads those language symbols and executes the vISA. The distinguishing research factor of this project is that all three components are designed together and deeply intertwined to maximize the benefits of content-addressable code, all the way down to the machine instructions.
This concept, a content-addressable vISA and programming language, appears to be wildly under-explored. Essentially, I ask: what if all routines on a system were referenced by their hash? Benefits include the absolute deduplication of subroutines in memory and on disk, extreme ease in scanning entire systems for security vulnerabilities, "built-in" delta versioning for application updates, storage/energy efficiency, and many more that won't fit in this abstract. In short, it provides a system with many of the properties that NGI Zero Core seeks to foster.
The project is underway, but early, under the names (1) COINS, (2) Co, and (3) Cohost, with devlogs and source code available on the linked project website.
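To make the question "what if all routines on a system were referenced by their hash?" concrete, here is a minimal Python sketch of a content-addressed routine store. It is only an illustration and not the COINS, Co, or Cohost implementation: the `RoutineStore` class, the choice of SHA-256, and the placeholder routine bytes are all assumptions made for the example.

```python
import hashlib


class RoutineStore:
    """Toy content-addressed store: each routine body is kept once, keyed by its hash."""

    def __init__(self):
        self._by_hash = {}

    def put(self, body: bytes) -> str:
        # SHA-256 is an assumption; the proposal does not name a hash function.
        digest = hashlib.sha256(body).hexdigest()
        # Identical bodies map to the same key, so storing one twice is a no-op.
        self._by_hash.setdefault(digest, body)
        return digest

    def get(self, digest: str) -> bytes:
        return self._by_hash[digest]

    def __len__(self):
        return len(self._by_hash)


store = RoutineStore()
add_a = store.put(b"ADD2 RET")  # placeholder bytes for a routine shipped by application A
add_b = store.put(b"ADD2 RET")  # the same routine shipped by application B
mul = store.put(b"MUL2 RET")    # a different routine

print(add_a == add_b, len(store))
```

Both applications end up referring to the same stored routine, which is the deduplication property in its simplest form.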
Have you been involved with projects or organisations relevant to this project before? And if so, can you tell us a bit about your contributions?
I have always been at the forefront of understanding and developing new technology paradigms, from mobile applications to head-mounted augmented reality. I’ve had professional roles as a game designer, a lead software engineer, and a data analytics infrastructure engineer.
Most relevant to this proposal, I have been working on the larger project of untangling and reimagining networked computing for just over two years, full time and self-funded. I spent the first 9 months untangling the issues with information architecture of the World Wide Web (an example of my thoughts), before then spending a long time researching and understanding most of the foundational architectural components of the internet by reinventing them from first principles. I have many technical prototypes and explorations of this project on my GitHub, and I have the in-progress reference implementations of the larger project on my SourceHut. Finally, I have my historical developer logs, notes, and project component overview on the project website.
Explain what the requested budget will be used for? Does the project have other funding sources, both past and present?
The budget I’m asking for is 12,500 EUR. This covers labor only, based on an expectation of 3 months of work at an average of 25 hours per week, or about 325 total hours, which comes to just under 40 EUR per hour. Based on my cadence of related projects in this umbrella, the expected split of those hours is roughly 50/50 between continued research and actual implementation.
Compare your own project with existing or historical efforts.
Two modern comparisons come to mind: WebAssembly and the Unison Programming Language.
WebAssembly similarly seeks to create a new high-performance and space-efficient binary instruction set, but is not built explicitly around the notion of content addressability and so misses out on the majority of the benefits of Co/COINS. It is born out of practical concerns to expand the capability of modern web browsers (seemingly stumbling on its non-browser applicability after the fact), and likewise inherits a lot of constraints (and features) that are not necessary in a bedrock platform (like its relationship to JavaScript). Perhaps most overlooked, and very relevant to this call, is that the WebAssembly spec is not approachable; it is complicated, dense, and only intended to be implemented by a handful of large browser vendors. COINS is simple enough for an individual person to implement on their own in the language (and/or hardware) of their choice, which both democratizes the implementations of the system, and keeps it simple enough to remain approachable and understandable by regular people. We need to make sure our protocols are actually able to be implemented by many people (like HTTP), otherwise they will remain functionally centralized (like Email).
Unison Language is the opposite; its key feature is that it *is* content addressable, but it is a high-level application programming language, geared mostly towards solving problems of distributed systems programming. It has paved the way to show what is possible even at that high level, but Co pushes this exploration to the bottom of the stack by instead asking: what if our *assembly language* was content addressable? And again, answers like massively improved system-wide security, built-in software deltas, system-wide "scriptability", and deduplication of code on disk and memory make it worth exploring.
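Two of the benefits named above, built-in software deltas and system-wide security scanning, reduce to set operations once every routine is identified by its hash. The Python sketch below shows that idea under stated assumptions: an application version is modelled as a plain set of routine hashes, SHA-256 stands in for whatever hash the system would use, and the function names and byte strings are hypothetical. It does not reflect how Co or COINS actually represent programs.

```python
import hashlib


def routine_hash(body: bytes) -> str:
    # SHA-256 is an assumption made for this example.
    return hashlib.sha256(body).hexdigest()


def manifest(routines):
    """An application version, modelled as the set of hashes of its routines."""
    return {routine_hash(body) for body in routines}


def update_delta(installed, new_version):
    """Hashes that must be fetched to upgrade; everything else is already present."""
    return new_version - installed


def scan(installed, known_vulnerable):
    """Hashes present on the system that also appear on an advisory list."""
    return installed & known_vulnerable


v1 = manifest([b"parse v1", b"render", b"save"])
v2 = manifest([b"parse v2", b"render", b"save"])

to_fetch = update_delta(v1, v2)                   # only the changed routine
flagged = scan(v1, {routine_hash(b"parse v1")})  # advisory names the old parser
```

In this toy model an update only needs to ship the one routine whose hash changed, and a vulnerability check is an intersection between the installed hashes and an advisory list.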
What are significant technical challenges you expect to solve during the project, if any?
I have already worked through a first draft of the COINS ISA and Cohost Virtual Machine, and have it executing simple programs. The bulk of the work I would like to accomplish with this funding is extensive iteration on unknown technical challenges that will emerge as the full system comes together. I expect the ISA to need revisions and additions as more complex programs become possible to write, and for the VM to need performance or architecture changes as well.
However, here are some hunches on where points of challenge may be:
- One key difference between Co as an assembly language and COINS as a raw instruction set is that Co can export not only ROMs but also code modules that are loaded at runtime, where the system maps from content hash to the concrete address in memory. There's a lot of room for questions, hiccups, and optimization in this component.
- This ability to export code modules will make it very easy (conceptually) to share code across machines and between different users. This opens up a lot of room for emergent application of the system, but also for emergent technical challenges. I hope to at least begin exploring the problems that arise from code sharing before the time period of this grant is up.
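The first point above, mapping a content hash to a concrete address when a module is loaded at runtime, can be pictured with a small Python sketch. This is a guess at the general shape of the problem, not the Cohost design: the `Loader` class, the flat byte-array memory, and the use of SHA-256 are all assumptions for illustration.

```python
import hashlib


class Loader:
    """Toy runtime loader: places modules in a flat memory and records hash -> address."""

    def __init__(self):
        self.memory = bytearray()
        self.address_of = {}

    def load(self, body: bytes) -> int:
        digest = hashlib.sha256(body).digest()
        if digest in self.address_of:
            # Already resident: reuse the existing copy instead of loading again.
            return self.address_of[digest]
        address = len(self.memory)
        self.memory.extend(body)
        self.address_of[digest] = address
        return address

    def resolve(self, digest: bytes) -> int:
        """Translate a content hash into a concrete address; KeyError if not loaded."""
        return self.address_of[digest]


loader = Loader()
first = loader.load(b"\x01\x02\x03")   # placed at address 0
second = loader.load(b"\x04\x05")      # placed right after the first module
again = loader.load(b"\x01\x02\x03")   # same content, so same address as `first`
```

Even this toy version surfaces the open questions mentioned above, such as what should happen when a referenced hash is not yet resident, or how addresses stay valid if modules are ever unloaded.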
Describe the ecosystem of the project, and how you will engage with relevant actors and promote the outcomes?
As I mentioned a few times, Co (along with COINS and Cohost) is an important component of the larger project of the “Coalescent Computer”, a set of protocols and reference implementations for a local-first, data-centric “social computing” environment. There are other components to this project, such as a data layer that allows for self-describing binary data and a "Coalescent Database" that acts as a disk/filesystem for the environment, and a networking layer that enables flexible and evolvable communication and data transfer protocols. The execution environment defined in this proposal would be the first step towards a local-only experimental virtual computer, the data layer will then begin to enable data-centric programs, and the networking component will unleash the true social computing nature of the project.
Over the course of the year, I hope to build a community around this project as it begins to take more of a concrete shape, and expect hobbyists and enthusiasts to be excited about the possibilities of this new computing environment, experimenting with the language and perhaps even porting VMs to other devices.