ravenscott-blog/The Bulding Blocks of Peer to Peer.md at 5d1217ef3d9672dff8f0bf6d15edcb11158e1761

Raven Scott 5d1217ef3d update article

2025-02-17 05:04:31 -05:00

The Hyper Ecosystem: A Deep Dive into Peer-to-Peer Building Blocks

In this post, we’re going to explore in detail the libraries that make up the Hyper ecosystem. Whether you’re building distributed filesystems, real-time collaborative applications, or secure, append-only logs, these libraries offer the tools you need. We’ll cover everything from peer discovery to file replication, and from merging multiple writers’ inputs into a consistent log to providing a high-performance key/value store. This post walks step by step through HyperSwarm, Hyper-DHT, Autobase, Hyperdrive, Hyperbee, Hypercore, and HyperDB, with code examples and real-world scenarios.

HyperSwarm: Peer Discovery and Connection Simplified

At the forefront of any decentralized application is the ability to find and connect to other peers. HyperSwarm is a high-level API designed to abstract the complexities of peer discovery. It leverages a distributed hash table (DHT) under the hood to let you join a “swarm” of peers that are interested in a specific 32-byte topic.

Key Concepts

Topics:
Every topic must be exactly 32 bytes. They often represent a hashed value of a human-readable string.
Client and Server Modes:
- Server Mode: Announce your presence on the DHT and accept incoming connections.
- Client Mode: Actively search for peers that are announcing a particular topic and establish outgoing connections.
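
Since a topic is often the hash of a human-readable string, here is a minimal sketch of deriving one with Node's built-in crypto module. The helper name topicFromString and the choice of SHA-256 are assumptions for illustration; any hash that yields exactly 32 bytes would work.

```javascript
const crypto = require('node:crypto')

// Hypothetical helper: hash any string down to a 32-byte topic.
// SHA-256 is an assumption here; any 32-byte hash would do.
function topicFromString (name) {
  return crypto.createHash('sha256').update(name).digest()
}

const roomTopic = topicFromString('my-chat-room')
console.log(roomTopic.length) // 32
console.log(roomTopic.toString('hex'))
```

Every peer that hashes the same string ends up with the same topic, so they can find each other without agreeing on anything beyond the name.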

Getting Started

Install HyperSwarm using npm:

npm install hyperswarm

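Below is an example that demonstrates both client and server behavior. Like the other snippets in this post, it uses await at the top level, so wrap it in an async function if you run it as a CommonJS script: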

const Hyperswarm = require('hyperswarm')

// Create two separate swarm instances
const serverSwarm = new Hyperswarm()
const clientSwarm = new Hyperswarm()

// Server: Listen for incoming connections on a topic
serverSwarm.on('connection', (socket, info) => {
  console.log('Server: Connection established with', info.peer)
  socket.write('Hello from the server!')
  socket.end()
})

// Client: Handle data received from the server
clientSwarm.on('connection', (socket, info) => {
  socket.on('data', data => {
    console.log('Client: Received message:', data.toString())
  })
})

// Define a 32-byte topic (here, a simple repeated string)
const topic = Buffer.alloc(32, 'hello world')

// Server announces itself on the topic (server mode)
const serverDiscovery = serverSwarm.join(topic, { server: true, client: false })
await serverDiscovery.flushed() // Wait until fully announced

// Client searches for servers (client mode)
clientSwarm.join(topic, { server: false, client: true })
await clientSwarm.flush() // Wait until all connections are established

Advanced Features

PeerDiscovery Object:
The object returned by swarm.join() gives you control over the announcement or lookup lifecycle. Methods like flushed(), refresh(), and destroy() allow you to monitor or change discovery behavior on the fly.
Direct Peer Connection:
Use swarm.joinPeer(noisePublicKey) to establish a direct connection to a known peer. This is useful if you already have a peer’s public key and want to bypass the DHT lookup.
Events and Metadata:
Every connection event comes with a PeerInfo object, which includes details such as the peer’s Noise public key and the topics they are associated with. This information can be used to build custom user interfaces or manage reconnection strategies.

HyperSwarm makes it straightforward to integrate robust peer discovery into your P2P applications with minimal overhead.

Hyper-DHT: The Low-Level Networking Backbone

Beneath HyperSwarm lies Hyper-DHT, the distributed hash table that enables decentralized peer discovery and connection. Hyper-DHT is built on top of dht-rpc and uses a series of hole-punching techniques to connect peers even across restrictive NATs and firewalls.

Core Features

Public Key Identification:
Unlike traditional DHTs that rely on IP addresses, Hyper-DHT identifies peers using public keys. This makes it easy to connect to peers regardless of network changes.
Direct P2P Connections:
You can both create P2P servers and initiate connections to remote servers using direct public keys.
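
To give a feel for how a DHT decides which nodes are responsible for a key, here is a toy sketch of XOR distance, the metric used by Kademlia-style DHTs such as the one dht-rpc implements. This is not Hyper-DHT's code: the SHA-256 node IDs and the helper names id, compareDistance, and closest are assumptions made for the example.

```javascript
const crypto = require('node:crypto')

// Toy 32-byte IDs: in this sketch an ID is just the SHA-256 of a label.
const id = (label) => crypto.createHash('sha256').update(label).digest()

// Compare two IDs by their XOR distance to a target, byte by byte.
// Negative means `a` is closer to the target than `b`.
function compareDistance (target, a, b) {
  for (let i = 0; i < target.length; i++) {
    const da = a[i] ^ target[i]
    const db = b[i] ^ target[i]
    if (da !== db) return da - db
  }
  return 0
}

// Return the k known nodes closest to the target key.
function closest (target, nodes, k) {
  return [...nodes].sort((a, b) => compareDistance(target, a, b)).slice(0, k)
}

const knownNodes = ['node-a', 'node-b', 'node-c', 'node-d'].map(id)
const targetKey = id('some-topic')
const nearest = closest(targetKey, knownNodes, 2)
console.log(nearest.map((n) => n.toString('hex').slice(0, 8)))
```

A lookup repeatedly asks the closest known nodes for even closer ones, which is how a key can be located without any node knowing the whole network.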

Setting Up a Hyper-DHT Node

Install Hyper-DHT via npm:

npm install hyperdht

Below is a simple example that creates a new DHT node and demonstrates bootstrapping:

const DHT = require('hyperdht')

// Create a DHT node with default bootstrap servers
const node = new DHT()

// Generate a key pair for the node (or use an existing one)
const keyPair = DHT.keyPair()

console.log('DHT node created with public key:', keyPair.publicKey.toString('hex'))

Creating and Managing P2P Servers

Hyper-DHT provides methods to create P2P servers that accept encrypted connections using the Noise protocol.

// Create a server that accepts encrypted connections
const server = node.createServer({
  firewall: (remotePublicKey, remoteHandshakePayload) => {
    // Here you can validate incoming connections
    // Return false to accept, true to reject
    return false
  }
})

// Start listening on a key pair
await server.listen(keyPair)
console.log('Server is listening on:', server.address())

// Handle incoming connections
server.on('connection', (socket) => {
  console.log('New connection from', socket.remotePublicKey.toString('hex'))
})

Peer Discovery and Announcements

Hyper-DHT offers additional APIs for peer discovery:

Lookup:
Find the peers that have announced a particular topic.

const lookupStream = node.lookup(topic)
lookupStream.on('data', (data) => {
  console.log('Lookup response:', data)
})

Announce:
Announce that you’re listening on a particular topic. This is especially useful for servers.

const announceStream = node.announce(topic, keyPair)
// The stream returns details about nearby nodes
announceStream.on('data', (data) => {
  console.log('Announced to:', data)
})

Mutable/Immutable Records:
Store and retrieve records in the DHT with methods such as node.immutablePut() and node.mutableGet(). This is useful for decentralized record storage.
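
The difference between the two record types can be sketched with an in-memory Map standing in for the DHT. This is a toy model under stated assumptions, not Hyper-DHT's implementation: putImmutable and putMutable are hypothetical helpers, SHA-256 and Node's Ed25519 keys stand in for whatever primitives the real network uses, and the "seq:value" signing format is invented for the example.

```javascript
const crypto = require('node:crypto')

// Toy in-memory stand-in for the DHT's record storage.
const records = new Map()

// Immutable record: the key is the hash of the value, so it can never change.
function putImmutable (value) {
  const hash = crypto.createHash('sha256').update(value).digest('hex')
  records.set(hash, value)
  return hash
}

// The bytes that get signed for a mutable record (format invented here).
const signable = (seq, value) => Buffer.concat([Buffer.from(seq + ':'), value])

// Mutable record: keyed by a public key, and only replaced when the
// signature is valid and the sequence number is higher than the stored one.
function putMutable (publicKey, seq, value, signature) {
  const id = publicKey.export({ type: 'spki', format: 'der' }).toString('hex')
  if (!crypto.verify(null, signable(seq, value), publicKey, signature)) return false
  const current = records.get(id)
  if (current && current.seq >= seq) return false
  records.set(id, { seq, value })
  return true
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519')
const sign = (seq, value) => crypto.sign(null, signable(seq, value), privateKey)

const hash = putImmutable(Buffer.from('hello'))
const v1 = Buffer.from('v1')
const v2 = Buffer.from('v2')
const first = putMutable(publicKey, 1, v1, sign(1, v1))  // accepted
const stale = putMutable(publicKey, 1, v2, sign(1, v2))  // rejected: seq not higher
const forged = putMutable(publicKey, 2, v2, sign(1, v2)) // rejected: bad signature
console.log({ hash, first, stale, forged })
```

Immutable records are self-verifying because anyone can re-hash the value, while mutable records rely on the signature to prove that an update came from the key's owner.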

Hyper-DHT provides the low-level connectivity and data exchange mechanisms that are essential for building robust P2P networks.

Autobase: Merging Multiple Data Streams into One Linear Log

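Autobase is an experimental module designed to automatically rebase multiple causally-linked Hypercores into a single, linearized Hypercore. This functionality is crucial for collaborative applications where multiple writers need to merge their changes into a consistent view. The examples in this section use Autobase’s early API (inputs, localInput, createCausalStream()); later releases changed the constructor and options, so check the README of the version you install.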

Why Autobase?

Imagine a multi-user chat application where each user has their own log of messages. Without a central coordinator, reconciling these different logs into a single, chronological order is challenging. Autobase solves this problem by:

Automatically Rebasing Inputs:
It takes several input Hypercores and computes a deterministic, causal ordering over all the entries.
Producing a Linearized View:
The output is a Hypercore-like log that represents a merged view, which can be used by higher-level data structures such as Hyperbee or Hyperdrive.
Low-Friction Integration:
Autobase’s output adheres to the Hypercore API, so you can plug it into your existing pipelines with minimal changes.
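
To make "deterministic, causal ordering" concrete, here is a toy merge of two writers' logs using Lamport clocks. Autobase's real algorithm is more involved and this is not it; the Writer class, observe(), and linearize() are names invented for the sketch. It only shows the property that matters: every peer that sees the same entries computes the same order, and replies never sort before the messages they respond to.

```javascript
// Each writer keeps its own log; every entry carries a Lamport clock.
class Writer {
  constructor (id) {
    this.id = id
    this.clock = 0
    this.log = []
  }

  // Append locally: tick the clock and record the entry.
  append (value) {
    this.clock += 1
    this.log.push({ writer: this.id, clock: this.clock, value })
  }

  // "See" another writer's log: move our clock up to the highest clock in it.
  observe (other) {
    for (const entry of other.log) this.clock = Math.max(this.clock, entry.clock)
  }
}

// Deterministic merge: order by clock, break ties by writer id.
function linearize (writers) {
  const byWriter = (a, b) => (a.writer < b.writer ? -1 : a.writer > b.writer ? 1 : 0)
  return writers
    .flatMap((w) => w.log)
    .sort((a, b) => a.clock - b.clock || byWriter(a, b))
}

const alice = new Writer('alice')
const bob = new Writer('bob')
alice.append('hi')
bob.append('hello?')       // concurrent with alice's message
bob.observe(alice)
bob.append('oh, hi alice') // causally after alice's 'hi'

const merged = linearize([bob, alice]).map((e) => e.value)
console.log(merged) // [ 'hi', 'hello?', 'oh, hi alice' ]
```

The two concurrent messages could have gone in either order; the tie-break on writer id is what makes every peer pick the same one.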

Working with Autobase

Install Autobase with npm:

npm install autobase

Here’s an example demonstrating basic usage:

const Autobase = require('autobase')

// Assume we have multiple input Hypercores (inputCore1, inputCore2)
const base = new Autobase({
  inputs: [inputCore1, inputCore2],
  localInput: localCore,  // Use this core for local appends
  autostart: true         // Automatically create the linearized view
})

// Append a new entry (Autobase automatically embeds a causal clock)
await base.append('Hello from user A')

// Create a causal stream to view the deterministic ordering of entries
const causalStream = base.createCausalStream()
causalStream.on('data', node => {
  console.log('Linearized node:', node)
})

Customizing the Linearized View

Autobase allows you to customize how the merged log is processed by providing an apply function. This function can transform or filter the batch of nodes before they are appended to the output view.

base.start({
  async apply(batch) {
    // For example, uppercase all string messages before storing
    const transformed = batch.map(({ value }) =>
      Buffer.from(value.toString('utf-8').toUpperCase())
    )
    await base.view.append(transformed)
  }
})

With Autobase, you gain the flexibility to support multi-user collaborative workflows without having to build complex conflict resolution mechanisms from scratch.

Hyperdrive: A Distributed, Real-Time Filesystem

Hyperdrive is a secure, real-time distributed filesystem that simplifies P2P file sharing. It’s built on top of Hypercore and Hyperbee and is used in projects like Holepunch and distributed web applications.

The Core Idea

Hyperdrive abstracts away the complexity of distributed file storage and allows you to work with files and directories just like in a traditional filesystem. It uses Hyperbee for file metadata (such as file hierarchies, permissions, and timestamps) and Hypercore to replicate the actual file data (blobs).
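
The split between metadata and file contents can be pictured with a small in-memory model. This is a toy sketch, not Hyperdrive's implementation: ToyDrive and its fields are hypothetical, with a Map playing the role of the Hyperbee metadata index and an array playing the role of the append-only blob log.

```javascript
// Toy model of the split described above: one structure for metadata,
// one append-only list for raw file contents.
class ToyDrive {
  constructor () {
    this.metadata = new Map() // path -> { blobIndex, byteLength }
    this.blobs = []           // append-only list of file contents
  }

  // Writing appends a new blob and points the path at it.
  put (path, buffer) {
    const blobIndex = this.blobs.push(buffer) - 1
    this.metadata.set(path, { blobIndex, byteLength: buffer.length })
  }

  get (path) {
    const entry = this.metadata.get(path)
    return entry ? this.blobs[entry.blobIndex] : null
  }

  // In this model, deleting only removes the metadata entry.
  del (path) {
    this.metadata.delete(path)
  }

  // List every path under a folder prefix, in sorted order.
  list (folder) {
    return [...this.metadata.keys()].filter((p) => p.startsWith(folder)).sort()
  }
}

const toyDrive = new ToyDrive()
toyDrive.put('/docs/a.txt', Buffer.from('first draft'))
toyDrive.put('/docs/b.txt', Buffer.from('notes'))
toyDrive.put('/docs/a.txt', Buffer.from('second draft')) // overwrite
toyDrive.del('/docs/b.txt')

console.log(toyDrive.get('/docs/a.txt').toString()) // second draft
console.log(toyDrive.list('/docs/'))                // [ '/docs/a.txt' ]
console.log(toyDrive.blobs.length)                  // 3
```

Note that the blob list keeps growing even when files are overwritten or deleted; only the metadata decides what the filesystem currently looks like.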

Setting Up a Hyperdrive

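Install Hyperdrive along with Corestore, which manages the underlying Hypercores, and random-access-memory, which the in-memory example requires:

npm install hyperdrive corestore random-access-memory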

Below is a simple example using an in-memory store (for demonstration):

const Hyperdrive = require('hyperdrive')
const Corestore = require('corestore')
const ram = require('random-access-memory')

// Initialize a corestore using in-memory storage
const store = new Corestore(ram)
await store.ready()

// Create a new Hyperdrive instance
const drive = new Hyperdrive(store)
await drive.ready()

console.log('Hyperdrive ID:', drive.id)

File Operations

Hyperdrive provides a rich set of file operations similar to traditional filesystems:

Writing Files:

await drive.put('/hello.txt', Buffer.from('Hello, Hyperdrive!'))

Reading Files:

const fileBuffer = await drive.get('/hello.txt')
console.log('File content:', fileBuffer.toString('utf-8'))

Deleting and Updating Files (to update a file, call drive.put() again on the same path):
```
await drive.del('/hello.txt')
```
Directory Listing and Batch Operations:
You can list directory entries, watch folders for changes, and even perform atomic batch updates.

Replication and Networking

Hyperdrive integrates seamlessly with HyperSwarm and Hyper-DHT to replicate data between peers:

const Hyperswarm = require('hyperswarm')
const swarm = new Hyperswarm()

swarm.on('connection', (socket) => {
  // Replicate the drive over the incoming connection
  drive.replicate(socket)
})

// Use the drive's discovery key to join the swarm
swarm.join(drive.discoveryKey)
swarm.flush().then(() => {
  console.log('Hyperdrive replication is active')
})

Hyperdrive also includes tools like localdrive and mirrordrive to facilitate importing/exporting files between distributed and local environments.

Hyperbee: The Append-Only B-Tree for Key/Value Storage

Hyperbee transforms the basic append-only log of Hypercore into a high-level, sorted key/value store. It is especially useful when you need a database-like interface with features such as atomic batch updates, sorted iterators, and efficient diffing between versions.

Core Concepts

Append-Only B-Tree:
Data is stored in a structure that allows for sorted key iteration. New changes are appended, ensuring that the history of modifications is preserved.
Encoding Options:
Hyperbee supports customizable keyEncoding and valueEncoding (e.g., 'utf-8', 'json', or 'binary').
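
The "append-only, yet sorted and versioned" idea can be illustrated with a toy store that appends every change to a log and rebuilds any past version by replaying a prefix of it. This is only a sketch of the concept: ToyBee is a hypothetical class, and the real Hyperbee stores B-tree nodes in its blocks so that it does not have to replay the log.

```javascript
// Toy append-only key/value store: every change is appended to a log,
// and any past version can be rebuilt by replaying a prefix of it.
class ToyBee {
  constructor () {
    this.log = [] // entries: { type: 'put' | 'del', key, value }
  }

  // The version is simply the number of changes appended so far.
  get version () {
    return this.log.length
  }

  put (key, value) {
    this.log.push({ type: 'put', key, value })
  }

  del (key) {
    this.log.push({ type: 'del', key })
  }

  // Sorted [key, value] pairs as of a given version (defaults to latest).
  entries (version = this.version) {
    const state = new Map()
    for (const op of this.log.slice(0, version)) {
      if (op.type === 'put') state.set(op.key, op.value)
      else state.delete(op.key)
    }
    return [...state.entries()].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
  }
}

const bee = new ToyBee()
bee.put('b', 2)
bee.put('a', 1)
const checkpoint = bee.version
bee.del('b')
bee.put('c', 3)

console.log(bee.entries())           // [ [ 'a', 1 ], [ 'c', 3 ] ]
console.log(bee.entries(checkpoint)) // [ [ 'a', 1 ], [ 'b', 2 ] ]
```

Because nothing is ever overwritten, reading an old version is just a matter of ignoring the newer entries.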

Creating a Hyperbee Instance

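Assuming you already have a Hypercore (for example, one obtained from a Corestore), you can create a Hyperbee on top of it. Give the Hyperbee its own core rather than reusing drive.core, since Hyperdrive already stores its file metadata there:

const Hyperbee = require('hyperbee')

// Use a dedicated Hypercore from the Corestore created earlier
const beeCore = store.get({ name: 'my-bee' })
const db = new Hyperbee(beeCore, {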
  keyEncoding: 'utf-8',
  valueEncoding: 'json'
})

await db.ready()

// Put a key/value pair
await db.put('greeting', { message: 'Hello, Hyperbee!' })

// Retrieve the stored value
const { value } = await db.get('greeting')
console.log('Retrieved from Hyperbee:', value)

Advanced Operations

Batch Writes:
Hyperbee allows you to perform atomic batch operations for improved performance:

const batch = db.batch()
await batch.put('foo', 'bar')
await batch.put('lorem', 'ipsum')
await batch.flush()

Diff and History Streams:
You can generate streams that show the history of changes or compute diffs between versions:

// Create a history stream to see all modifications
const historyStream = db.createHistoryStream({ live: false })
historyStream.on('data', (entry) => {
  console.log('History entry:', entry)
})

// Generate a diff stream between the current version and an earlier one
// (previousVersion is a version number saved earlier, e.g. from db.version)
const diffStream = db.createDiffStream(previousVersion)
diffStream.on('data', diff => {
  console.log('Difference:', diff)
})

Hyperbee’s rich API makes it a great choice for building decentralized databases that require both performance and a clear history of changes.

Hypercore: The Foundational Append-Only Log

At the heart of the entire ecosystem lies Hypercore. It is a secure, distributed append-only log that forms the basis for all higher-level data structures. Hypercore is designed for performance and security, making it ideal for real-time data streaming and replication.

Understanding Hypercore

Append-Only Log:
Hypercore only allows data to be appended. Once written, data cannot be altered, ensuring an immutable history.
Security and Integrity:
The log is backed by a Merkle tree, and the writer signs the tree’s root with a private key, so peers can verify every block they receive. Because that private key should remain on a single machine, only the creator can append to the log.
Replication:
Hypercore includes robust replication capabilities, enabling peers to synchronize data over insecure networks using encrypted channels.
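
Here is a rough sketch of how a Merkle tree and a signature work together to protect a log. It is a simplified illustration, not Hypercore's actual scheme: SHA-256 and Node's Ed25519 keys are stand-ins chosen for the example, and merkleRoot is a hypothetical helper with its own simple tree layout.

```javascript
const crypto = require('node:crypto')

// Hash the concatenation of all parts.
const sha256 = (...parts) => {
  const h = crypto.createHash('sha256')
  for (const p of parts) h.update(p)
  return h.digest()
}

// Fold leaf hashes pairwise until a single root remains.
// An odd node at the end of a level is carried up unchanged.
function merkleRoot (blocks) {
  let level = blocks.map((b) => sha256(Buffer.from([0]), b))
  while (level.length > 1) {
    const next = []
    for (let i = 0; i < level.length; i += 2) {
      next.push(i + 1 < level.length
        ? sha256(Buffer.from([1]), level[i], level[i + 1])
        : level[i])
    }
    level = next
  }
  return level[0]
}

const writerKeys = crypto.generateKeyPairSync('ed25519')
const logBlocks = ['first', 'second', 'third'].map((s) => Buffer.from(s))

// The writer signs the root; readers only need the public key to check it.
const root = merkleRoot(logBlocks)
const rootSignature = crypto.sign(null, root, writerKeys.privateKey)

const tampered = [logBlocks[0], Buffer.from('SECOND'), logBlocks[2]]
const validOriginal = crypto.verify(null, merkleRoot(logBlocks), writerKeys.publicKey, rootSignature)
const validTampered = crypto.verify(null, merkleRoot(tampered), writerKeys.publicKey, rootSignature)
console.log({ validOriginal, validTampered }) // { validOriginal: true, validTampered: false }
```

Changing any single block changes the root, so one signature over the root is enough to cover the whole log.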

Basic API Usage

Install Hypercore via npm:

npm install hypercore

Create a new Hypercore instance using an in-memory storage backend (for demo purposes):

const Hypercore = require('hypercore')
const ram = require('random-access-memory')

const core = new Hypercore(ram, { valueEncoding: 'utf-8' })
await core.ready()

// Append data to the log
await core.append('This is the first block')
await core.append('Here is the second block')
console.log('Current core length:', core.length)

Advanced Operations

Reading Data:
You can read blocks individually or stream them:

// Get a specific block by index
const block = await core.get(0)
console.log('Block 0:', block)

// Create a read stream to iterate through all blocks
const readStream = core.createReadStream()
readStream.on('data', data => {
  console.log('Streamed block:', data)
})

Replication:
Replication can be done over any transport. For example, using a TCP connection:

// On the server side:
const net = require('net')
const server = net.createServer(socket => {
  socket.pipe(core.replicate(false)).pipe(socket)
})
server.listen(8000)

// On the client side (clone is a second Hypercore opened with the same key as core):
const clientSocket = net.connect(8000)
clientSocket.pipe(clone.replicate(true)).pipe(clientSocket)

Sessions and Snapshots:
Sessions allow you to create lightweight clones of a Hypercore, and snapshots provide a consistent, read-only view:
```
const session = core.session()
const snapshot = core.snapshot()
// Use session and snapshot as independent Hypercore instances
```
Truncation and Forking:
Hypercore supports truncating the log and intentionally forking the core using the truncate() method. This is particularly useful for applications that require resetting or pruning data.
```
// Truncate the core to a new length (and optionally set a fork ID)
await core.truncate(1)
console.log('Core length after truncation:', core.length)
```
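
The relationship between truncation and forking can be modeled with a toy log. This is a sketch under the assumption, as I understand Hypercore's behavior, that truncating bumps the fork ID by one unless you pass your own; ToyLog is a hypothetical class and not part of any library.

```javascript
// Toy log that mirrors the idea: shortening the log bumps a fork counter,
// so readers can tell that history was rewritten.
class ToyLog {
  constructor () {
    this.blocks = []
    this.fork = 0
  }

  get length () {
    return this.blocks.length
  }

  append (block) {
    this.blocks.push(block)
    return this.length
  }

  // Drop everything after newLength and record a new fork ID.
  truncate (newLength, fork = this.fork + 1) {
    if (newLength > this.length) throw new Error('Cannot truncate to a longer length')
    this.blocks.length = newLength
    this.fork = fork
  }
}

const toyLog = new ToyLog()
toyLog.append('a')
toyLog.append('b')
toyLog.append('c')
toyLog.truncate(1)
toyLog.append('b2')
console.log(toyLog.blocks, toyLog.fork) // [ 'a', 'b2' ] 1
```

A reader who previously saw length 3 at fork 0 can tell from the new fork ID that the entries after index 0 were replaced rather than extended.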

Merkle Tree Hashing and Clearing Blocks:
For verifying data integrity, you can compute the Merkle tree hash or clear stored blocks to reclaim space:

const treeHash = await core.treeHash()
console.log('Merkle tree hash:', treeHash.toString('hex'))

// Clear a block from local storage
await core.clear(0)

Hypercore’s robust API and design make it the cornerstone of a secure, decentralized data ecosystem.

HyperDB: Database Built for P2P and Local Indexing

HyperDB is a versatile database engine designed to serve both peer-to-peer applications and local indexing needs. Whether you’re looking to build a distributed, collaborative data store using Hyperbee or need a fast, local-only database backed by RocksDB, HyperDB provides a unified, high-level API that makes it easy to work with structured data.

Installation

Install HyperDB via npm:

npm install hyperdb

Usage

Before you start, generate your database definition using the builder. This definition specifies the schemas and collections you want to work with. (For now, see the example in the ./example directory of the HyperDB repository.)

You can then boot your database using the same definition for both fully P2P and local-only scenarios:

const HyperDB = require('hyperdb')

// Choose your engine:
// For a local-only database backed by RocksDB:
const db = HyperDB.rocks('./my-rocks.db', require('./my-definition'))

// Alternatively, for a P2P database backed by Hyperbee:
// const db = HyperDB.bee(hypercore, require('./my-definition'), [options])

console.log('Database initialized!')

It’s that simple.

API Overview

HyperDB exposes a rich set of methods to query, update, and manage your data:

Database Creation
- db = HyperDB.bee(hypercore, definition, [options])
  Create a P2P database powered by Hyperbee.
- db = HyperDB.rocks(path, definition, [options])
  Create a local-only database backed by RocksDB.
Querying
- Find Documents:
```
const queryStream = db.find(collectionOrIndex, query, [options])
```
  The query object follows this pattern:
```
{
  gt: { /* lower bound (exclusive) */ },
  gte: { /* lower bound (inclusive) */ },
  lt: { /* upper bound (exclusive) */ },
  lte: { /* upper bound (inclusive) */ }
}
```
  And options may include:
```
{
  limit,   // maximum number of results
  reverse  // whether to stream in reverse order
}
```
  Note: The query operates on a snapshot—any inserts or deletes made while the query stream is active will not affect the current results.
  - Stream Helpers:
    - all = await queryStream.toArray() — collect all entries.
    - one = await queryStream.one() — retrieve the last entry.
- Convenience Methods:
  - doc = await db.findOne(collectionOrIndex, query, [options])
    Alias for: await queryStream.one()
  - doc = await db.get(collection, query)
    Get a single document from a collection.
  - { count } = await db.stats(collectionOrIndex)
    Get statistics for a collection or index.
Modifying the Database
- Inserting and Deleting:
```
await db.insert(collection, doc)
await db.delete(collection, query)
```
  Note: You need to call await db.flush() later to persist these changes.
- Checking for Updates:
```
const updated = db.updated([collection], [query])
```
  Returns a boolean indicating whether the database (or a specific record) has been updated.
Snapshot and Transaction Management
- await db.flush() — persist all pending changes.
- db.reload() — reload the internal snapshot and clear memory state.
- const snapshot = db.snapshot() — create a read-only snapshot locked in time.
- const txn = db.transaction() — create a writable snapshot; flushing this snapshot updates the main instance.
- await db.close() — close the database (remember to close any snapshots you’ve created).
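
To make the query bounds and options from the Querying section concrete, here is a toy range query over a plain sorted array. It is a simplified sketch and not HyperDB code: in HyperDB the bounds are objects keyed by the fields of a collection or index, while this hypothetical rangeFind helper compares bare numbers.

```javascript
// Toy range query over plain values, mirroring the option names above.
function rangeFind (sortedValues, query = {}, options = {}) {
  const { gt, gte, lt, lte } = query
  const { limit = Infinity, reverse = false } = options

  // Keep only the values that satisfy every bound that was provided.
  let results = sortedValues.filter((v) =>
    (gt === undefined || v > gt) &&
    (gte === undefined || v >= gte) &&
    (lt === undefined || v < lt) &&
    (lte === undefined || v <= lte)
  )
  if (reverse) results = results.reverse()
  return results.slice(0, limit)
}

const ages = [18, 21, 30, 42, 65]
console.log(rangeFind(ages, { gte: 21, lt: 65 }))                     // [ 21, 30, 42 ]
console.log(rangeFind(ages, { gt: 18 }, { reverse: true, limit: 2 })) // [ 65, 42 ]
```

The limit is applied after reversing, so a reversed query with limit 2 returns the two largest matches rather than the two smallest.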

Builder API

The builder API lets you define the structure of your database. Each field in the definition file is an object that specifies its characteristics:

{
  name: 'field-name',
  type: 'uint',       // a compact-encoding type, or a reference to another encoding defined in the builder file
  required: true,     // set to false for optional fields
  array: false        // set to true to store an array of this type
}

This flexible schema definition allows you to easily declare collections, indices, and relationships tailored to your application’s needs.

HyperDB’s rich feature set makes it an excellent choice for building robust, scalable databases that work seamlessly in both decentralized and local environments. Experiment with its flexible API to build custom queries, manage snapshots, and implement transactions effortlessly.

Integrating the Ecosystem: Real-World Applications

The true power of the Hyper ecosystem is revealed when you combine these libraries to build real-world applications. Here are a few scenarios:

Distributed Collaborative Applications:
Use Autobase to merge multiple users’ Hypercores into a single, linearized log. With Hyperbee, index and query the merged log efficiently. Replicate the data using HyperSwarm and Hyper-DHT so that all participants stay in sync.
Decentralized Filesystems:
Hyperdrive leverages Hyperbee for file metadata and Hypercore for file content. With built-in replication and versioning, you can build a secure, real-time filesystem that works even under unreliable network conditions.
Secure Data Streams and Logs:
Hypercore provides the secure append-only log that can be used as the backbone for distributed messaging systems or real-time feeds. Its replication and snapshot capabilities allow for both live updates and historical consistency.
Peer Discovery and Direct Connections:
HyperSwarm and Hyper-DHT handle the complexities of NAT traversal, bootstrapping, and encrypted peer-to-peer connections, letting you focus on building the core logic of your application.

By understanding and leveraging these components, you can build highly resilient, decentralized applications that perform well at scale and provide a secure, tamper-proof data layer.

What do I think?

The Hyper ecosystem offers a modular yet deeply integrated set of tools for building the next generation of peer-to-peer applications. Whether you’re just starting out or looking to build a production-grade decentralized system, these libraries provide robust APIs, extensive customization options, and powerful features—from secure peer discovery with HyperSwarm and Hyper-DHT, to automatic conflict resolution and multi-user collaboration with Autobase, to distributed file sharing with Hyperdrive, and efficient key/value storage and append-only logs with Hyperbee and Hypercore.

By diving deep into these libraries, you gain the knowledge and flexibility to architect truly decentralized systems that are resilient, secure, and scalable. Experiment with the code examples, explore the API documentation on GitHub, and start building your own applications in this exciting ecosystem.
