
Marmot v2


What & Why?

Marmot v2 is a leaderless, distributed SQLite replication system built on a gossip-based protocol with distributed transactions and eventual consistency.

Key Features:

  • Leaderless Architecture: No single point of failure - any node can accept writes
  • MySQL Protocol Compatible: Connect with any MySQL client (DBeaver, MySQL Workbench, mysql CLI)
  • Distributed Transactions: Percolator-style write intents with conflict detection
  • Multi-Database Support: Create and manage multiple databases per cluster
  • DDL Replication: Distributed schema changes with automatic idempotency and cluster-wide locking
  • Production-Ready SQL Parser: Powered by rqlite/sql AST parser for MySQL→SQLite transpilation
  • CDC-Based Replication: Row-level change data capture for consistent replication

Quick Start

# Start a single-node cluster
./marmot-v2

# Connect with MySQL client
mysql -h localhost -P 3306 -u root

# Or use DBeaver, MySQL Workbench, etc.

Testing Replication

# Test DDL and DML replication across a 2-node cluster
./scripts/test-ddl-replication.sh

# This script will:
# 1. Start a 2-node cluster
# 2. Create a table on node 1 and verify it replicates to node 2
# 3. Insert data on node 1 and verify it replicates to node 2
# 4. Update data on node 2 and verify it replicates to node 1
# 5. Delete data on node 1 and verify it replicates to node 2

# Manual cluster testing
./examples/start-seed.sh              # Start seed node (port 8081, mysql 3307)
./examples/join-cluster.sh 2 localhost:8081  # Join node 2 (port 8082, mysql 3308)
./examples/join-cluster.sh 3 localhost:8081  # Join node 3 (port 8083, mysql 3309)

# Connect to any node and run queries
mysql --protocol=TCP -h localhost -P 3307 -u root
mysql --protocol=TCP -h localhost -P 3308 -u root

# Cleanup
pkill -f marmot-v2


Architecture

Marmot v2 uses a fundamentally different architecture from other SQLite replication solutions:

vs. rqlite/dqlite/LiteFS:

  • ❌ They require a primary node for all writes
  • ✅ Marmot allows writes on any node
  • ❌ They use leader election (Raft)
  • ✅ Marmot uses gossip protocol (no leader)
  • ❌ They require a proxy layer or page-level interception
  • ✅ Marmot uses MySQL protocol for direct database access

How It Works:

  1. Write Coordination: 2PC (Two-Phase Commit) with configurable consistency (ONE, QUORUM, ALL)
  2. Conflict Resolution: Last-Write-Wins (LWW) with HLC timestamps
  3. Cluster Membership: SWIM-style gossip with failure detection
  4. Data Replication: Full database replication - all nodes receive all data
  5. DDL Replication: Cluster-wide schema changes with automatic idempotency
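
For illustration, here is a minimal Go sketch of how a write coordinator could translate the consistency levels in step 1 into a required number of acknowledgements for an n-node cluster. The names are illustrative and do not mirror Marmot's internal API.

```go
package main

import "fmt"

// requiredAcks returns how many node acknowledgements a write needs before
// the coordinator reports success, for a cluster of n nodes.
func requiredAcks(level string, n int) int {
    switch level {
    case "ONE":
        return 1
    case "ALL":
        return n
    default: // QUORUM
        return n/2 + 1 // strict majority
    }
}

func main() {
    for _, level := range []string{"ONE", "QUORUM", "ALL"} {
        fmt.Printf("%-6s -> %d of 3 nodes must ACK\n", level, requiredAcks(level, 3))
    }
}
```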

DDL Replication

Marmot v2 supports distributed DDL (Data Definition Language) replication without requiring master election:

How It Works

  1. Cluster-Wide Locking: Each DDL operation acquires a distributed lock per database (default: 30-second lease)

    • Prevents concurrent schema changes on the same database
    • Locks automatically expire if a node crashes
    • Different databases can have concurrent DDL operations
  2. Automatic Idempotency: DDL statements are automatically rewritten for safe replay (see the sketch after this list)

    CREATE TABLE users (id INT)
    → CREATE TABLE IF NOT EXISTS users (id INT)
    
    DROP TABLE users
    → DROP TABLE IF EXISTS users
  3. Schema Version Tracking: Each database maintains a schema version counter

    • Incremented on every DDL operation
    • Exchanged via gossip protocol for drift detection
    • Used by delta sync to validate transaction applicability
  4. Quorum-Based Replication: DDL replicates like DML through the same 2PC mechanism

    • No special master node needed
    • Works with existing consistency levels (QUORUM, ALL, etc.)
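
As a rough illustration of the idempotent rewrite from step 2, the sketch below shows the transformation at the string level; Marmot performs the rewrite on the parsed AST, so treat this purely as a conceptual example.

```go
package main

import (
    "fmt"
    "regexp"
)

// Illustrative sketch only: CREATE gains IF NOT EXISTS and DROP gains
// IF EXISTS, so replaying the statement on a node that already applied
// it becomes a no-op.
var (
    createRe = regexp.MustCompile(`(?i)^\s*CREATE\s+(TABLE|INDEX|VIEW|TRIGGER)\s+`)
    dropRe   = regexp.MustCompile(`(?i)^\s*DROP\s+(TABLE|INDEX|VIEW|TRIGGER)\s+`)
    guardRe  = regexp.MustCompile(`(?i)\bIF\s+(NOT\s+)?EXISTS\b`)
)

func makeIdempotent(ddl string) string {
    if guardRe.MatchString(ddl) {
        return ddl // already guarded
    }
    if createRe.MatchString(ddl) {
        return createRe.ReplaceAllString(ddl, "CREATE $1 IF NOT EXISTS ")
    }
    if dropRe.MatchString(ddl) {
        return dropRe.ReplaceAllString(ddl, "DROP $1 IF EXISTS ")
    }
    return ddl
}

func main() {
    fmt.Println(makeIdempotent("CREATE TABLE users (id INT)"))
    // CREATE TABLE IF NOT EXISTS users (id INT)
    fmt.Println(makeIdempotent("DROP TABLE users"))
    // DROP TABLE IF EXISTS users
}
```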

Configuration

[ddl]
# DDL lock lease duration (seconds)
lock_lease_seconds = 30

# Automatically rewrite DDL for idempotency
enable_idempotent = true

Best Practices

  • Do: Execute DDL from a single connection/node at a time
  • Do: Use qualified table names (mydb.users instead of users)
  • ⚠️ Caution: ALTER TABLE is less idempotent - avoid replaying failed ALTER operations
  • Don't: Run concurrent DDL on the same database from multiple nodes

CDC-Based Replication

Marmot v2 uses Change Data Capture (CDC) for replication instead of SQL statement replay:

How It Works

  1. Row-Level Capture: Instead of replicating SQL statements, Marmot captures the actual row data changes (INSERT/UPDATE/DELETE)
  2. Binary Data Format: Row data is serialized as CDC messages with column values, ensuring consistent replication regardless of SQL dialect
  3. Deterministic Application: Row data is applied directly to the target database, avoiding parsing ambiguities
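
The sketch below shows the kind of information a row-level CDC record has to carry so it can be applied deterministically on any node. The struct and field names are illustrative, not Marmot's actual wire format.

```go
package main

import "fmt"

// Op identifies the type of row change captured.
type Op int

const (
    OpInsert Op = iota
    OpUpdate
    OpDelete
)

// RowChange is a hypothetical row-level CDC record.
type RowChange struct {
    Database string                 // target database
    Table    string                 // target table
    Op       Op                     // INSERT / UPDATE / DELETE
    Key      map[string]interface{} // primary key (or rowid) identifying the row
    Values   map[string]interface{} // full column values for INSERT/UPDATE
    HLC      uint64                 // hybrid logical clock timestamp for LWW
    NodeID   uint64                 // originating node, used as tie-breaker
}

func main() {
    c := RowChange{
        Database: "mydb", Table: "users", Op: OpInsert,
        Key:    map[string]interface{}{"id": int64(42)},
        Values: map[string]interface{}{"id": int64(42), "name": "alice"},
    }
    fmt.Printf("%+v\n", c)
}
```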

Benefits

  • Consistency: Same row data applied everywhere, no SQL parsing differences
  • Performance: Binary format is more efficient than SQL text
  • Reliability: No issues with SQL syntax variations between MySQL and SQLite

Row Key Extraction

For UPDATE and DELETE operations, Marmot automatically extracts row keys:

  • Uses PRIMARY KEY columns when available
  • Falls back to ROWID for tables without explicit primary key
  • Handles composite primary keys correctly
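
A minimal Go sketch of this lookup, assuming direct access to the SQLite schema via pragma_table_info; the helper name is hypothetical.

```go
package main

import (
    "database/sql"
    "fmt"

    _ "github.com/mattn/go-sqlite3"
)

// primaryKeyColumns returns the declared PRIMARY KEY columns of a table
// (including composite keys, in order), falling back to SQLite's implicit
// rowid when no explicit primary key exists.
func primaryKeyColumns(db *sql.DB, table string) ([]string, error) {
    rows, err := db.Query(`SELECT name, pk FROM pragma_table_info(?)`, table)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    // pk is the 1-based position of the column within the primary key
    // (0 = not part of the key), so composite keys come back in order.
    keys := map[int]string{}
    for rows.Next() {
        var name string
        var pk int
        if err := rows.Scan(&name, &pk); err != nil {
            return nil, err
        }
        if pk > 0 {
            keys[pk] = name
        }
    }
    if len(keys) == 0 {
        return []string{"rowid"}, nil // no explicit primary key
    }
    ordered := make([]string, len(keys))
    for pos, name := range keys {
        ordered[pos-1] = name
    }
    return ordered, nil
}

func main() {
    db, _ := sql.Open("sqlite3", ":memory:")
    db.Exec(`CREATE TABLE t (a INT, b INT, c TEXT, PRIMARY KEY (a, b))`)
    cols, _ := primaryKeyColumns(db, "t")
    fmt.Println(cols) // [a b]
}
```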

SQL Statement Compatibility

Marmot supports a wide range of MySQL/SQLite statements through its MySQL protocol server. The following table shows compatibility for different statement types:

| Statement Type | Support | Replication | Notes |
| --- | --- | --- | --- |
| DML - Data Manipulation | | | |
| INSERT / REPLACE | ✅ Full | ✅ Yes | Includes qualified table names (db.table) |
| UPDATE | ✅ Full | ✅ Yes | Includes qualified table names |
| DELETE | ✅ Full | ✅ Yes | Includes qualified table names |
| SELECT | ✅ Full | N/A | Read operations |
| LOAD DATA | ✅ Full | ✅ Yes | Bulk data loading |
| DDL - Data Definition | | | |
| CREATE TABLE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| ALTER TABLE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| DROP TABLE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| TRUNCATE TABLE | ✅ Full | ✅ Yes | |
| RENAME TABLE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| CREATE/DROP INDEX | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| CREATE/DROP VIEW | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| CREATE/DROP TRIGGER | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| Database Management | | | |
| CREATE DATABASE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| DROP DATABASE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| ALTER DATABASE | ✅ Full | ✅ Yes | Replicated with cluster-wide locking |
| SHOW DATABASES | ✅ Full | N/A | Metadata query |
| SHOW TABLES | ✅ Full | N/A | Metadata query |
| USE database | ✅ Full | N/A | Session state |
| Transaction Control | | | |
| BEGIN / START TRANSACTION | ✅ Full | N/A | Transaction boundary |
| COMMIT | ✅ Full | ✅ Yes | Commits distributed transaction |
| ROLLBACK | ✅ Full | ✅ Yes | Aborts distributed transaction |
| SAVEPOINT | ✅ Full | ✅ Yes | Nested transaction support |
| Locking | | | |
| LOCK TABLES | ✅ Parsed | ❌ No | Requires distributed locking coordination |
| UNLOCK TABLES | ✅ Parsed | ❌ No | Requires distributed locking coordination |
| Session Configuration | | | |
| SET statements | ✅ Parsed | ❌ No | Session-local, not replicated |
| XA Transactions | | | |
| XA START/END/PREPARE | ✅ Parsed | ❌ No | Marmot uses its own 2PC protocol |
| XA COMMIT/ROLLBACK | ✅ Parsed | ❌ No | Not compatible with Marmot's model |
| DCL - Data Control | | | |
| GRANT / REVOKE | ✅ Parsed | ❌ No | User management not replicated |
| CREATE/DROP USER | ✅ Parsed | ❌ No | User management not replicated |
| ALTER USER | ✅ Parsed | ❌ No | User management not replicated |
| Administrative | | | |
| OPTIMIZE TABLE | ✅ Parsed | ❌ No | Node-local administrative command |
| REPAIR TABLE | ✅ Parsed | ❌ No | Node-local administrative command |

Legend

  • ✅ Full: Fully supported and working
  • ✅ Parsed: Statement is parsed and recognized
  • ⚠️ Limited: Works but has limitations in distributed context
  • ❌ No: Not supported or not replicated
  • N/A: Not applicable (read-only or session-local)

Important Notes

  1. Schema Changes (DDL): DDL statements are fully replicated with cluster-wide locking and automatic idempotency. See the DDL Replication section for details.

  2. XA Transactions: Marmot has its own distributed transaction protocol based on 2PC. MySQL XA transactions are not compatible with Marmot's replication model.

  3. User Management (DCL): User and privilege management statements are local to each node. For production deployments, consider handling authentication at the application or proxy level.

  4. Table Locking: LOCK TABLES statements are recognized but not enforced across the cluster. Use application-level coordination for distributed locking needs.

  5. Qualified Names: Marmot fully supports qualified table names (e.g., db.table) in DML and DDL operations.

MySQL Protocol & Metadata Queries

Marmot includes a MySQL-compatible protocol server, allowing you to connect using any MySQL client (DBeaver, MySQL Workbench, mysql CLI, etc.).

Metadata Query Support

Marmot provides full support for MySQL metadata queries, enabling GUI tools like DBeaver to browse databases, tables, and columns:

  • SHOW Commands: SHOW DATABASES, SHOW TABLES, SHOW COLUMNS FROM table, SHOW CREATE TABLE, SHOW INDEXES
  • INFORMATION_SCHEMA: Queries against INFORMATION_SCHEMA.TABLES, INFORMATION_SCHEMA.COLUMNS, INFORMATION_SCHEMA.SCHEMATA, and INFORMATION_SCHEMA.STATISTICS
  • Type Conversion: Automatic SQLite-to-MySQL type mapping for compatibility

These metadata queries are powered by the rqlite/sql AST parser, providing production-grade MySQL query compatibility.
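
As an illustration of the type conversion mentioned above, the sketch below maps SQLite declared types to MySQL-friendly types by following SQLite's type-affinity rules. The exact mapping Marmot uses may differ.

```go
package main

import (
    "fmt"
    "strings"
)

// sqliteToMySQLType maps a SQLite declared column type to a MySQL type name
// so that GUI clients see familiar types. Illustrative only.
func sqliteToMySQLType(decl string) string {
    d := strings.ToUpper(decl)
    switch {
    case strings.Contains(d, "INT"):
        return "BIGINT"
    case strings.Contains(d, "CHAR"), strings.Contains(d, "CLOB"), strings.Contains(d, "TEXT"):
        return "TEXT"
    case strings.Contains(d, "BLOB"), d == "":
        return "BLOB"
    case strings.Contains(d, "REAL"), strings.Contains(d, "FLOA"), strings.Contains(d, "DOUB"):
        return "DOUBLE"
    default:
        return "DECIMAL" // SQLite's NUMERIC affinity bucket
    }
}

func main() {
    fmt.Println(sqliteToMySQLType("INTEGER"))      // BIGINT
    fmt.Println(sqliteToMySQLType("VARCHAR(100)")) // TEXT
    fmt.Println(sqliteToMySQLType("REAL"))         // DOUBLE
}
```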

Connecting with MySQL Clients

# Using mysql CLI
mysql -h localhost -P 3306 -u root

# Connection string for applications
mysql://root@localhost:3306/marmot

Recovery Scenarios

Marmot handles various failure and recovery scenarios automatically:

Network Partition (Split-Brain)

| Scenario | Behavior |
| --- | --- |
| Minority partition | Writes fail - cannot achieve quorum |
| Majority partition | Writes succeed - quorum achieved |
| Partition heals | Delta sync + LWW merges divergent data |

How it works:

  1. During partition, only the majority side can commit writes (quorum enforcement)
  2. When partition heals, nodes exchange transaction logs via StreamChanges RPC
  3. Conflicts are resolved using Last-Write-Wins (LWW) with HLC timestamps
  4. Higher node ID breaks ties for simultaneous writes

Node Failure & Recovery

| Scenario | Recovery Method |
| --- | --- |
| Brief outage | Delta sync - replay missed transactions |
| Extended outage | Snapshot transfer + delta sync |
| New node joining | Full snapshot from existing node |

Anti-Entropy Background Process:

Marmot v2 includes an automatic anti-entropy system that continuously monitors and repairs replication lag across the cluster:

  1. Lag Detection: Every 60 seconds (configurable), each node queries peers for their replication state
  2. Smart Recovery Decision:
    • Delta Sync if lag < 10,000 transactions AND < 1 hour: Streams missed transactions incrementally
    • Snapshot Transfer if lag exceeds thresholds: Full database file transfer for efficiency
  3. Gap Detection: Detects when transaction logs have been GC'd and automatically falls back to snapshot
  4. Multi-Database Support: Tracks and syncs each database independently
  5. GC Coordination: Garbage collection respects peer replication state - logs aren't deleted until all peers have applied them
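
The recovery decision in step 2 boils down to a small predicate. The sketch below uses the default thresholds documented in the Replication configuration (10,000 transactions / 1 hour) and illustrative names.

```go
package main

import (
    "fmt"
    "time"
)

// chooseRecovery picks a repair strategy: small, recent lag is healed with
// incremental delta sync; anything larger (or older) falls back to a full
// snapshot transfer.
func chooseRecovery(lagTxns int64, lagAge time.Duration) string {
    const (
        maxDeltaTxns = 10_000
        maxDeltaAge  = time.Hour
    )
    if lagTxns < maxDeltaTxns && lagAge < maxDeltaAge {
        return "delta-sync"
    }
    return "snapshot"
}

func main() {
    fmt.Println(chooseRecovery(250, 5*time.Minute))    // delta-sync
    fmt.Println(chooseRecovery(80_000, 2*time.Minute)) // snapshot
    fmt.Println(chooseRecovery(50, 3*time.Hour))       // snapshot
}
```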

Delta Sync Process:

  1. Lagging node queries last_applied_txn_id for each peer/database
  2. Requests transactions since that ID via StreamChanges RPC
  3. Gap Detection: Checks if first received txn_id has a large gap from requested ID
    • If the gap exceeds delta_sync_threshold_transactions, it indicates missing (GC'd) transactions
    • Automatically falls back to snapshot transfer to prevent data loss
  4. Applies changes using LWW conflict resolution
  5. Updates replication state tracking (per-database)
  6. Progress logged every 100 transactions
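
A minimal sketch of the gap check in step 3, with an illustrative function name and threshold:

```go
package main

import "fmt"

// hasGap reports whether the first transaction a peer streams back is so far
// ahead of the one we asked for that the intervening transactions must have
// been garbage-collected; in that case delta sync cannot produce a complete
// result and the node falls back to a snapshot.
func hasGap(requestedTxnID, firstReceivedTxnID, threshold uint64) bool {
    return firstReceivedTxnID > requestedTxnID &&
        firstReceivedTxnID-requestedTxnID > threshold
}

func main() {
    fmt.Println(hasGap(1_000, 1_050, 10_000))  // false: small gap, delta sync continues
    fmt.Println(hasGap(1_000, 60_000, 10_000)) // true: logs were GC'd, take a snapshot
}
```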

GC Coordination with Anti-Entropy:

  • Transaction logs are retained with a two-tier policy:
    • Min retention (2 hours): Must be >= delta sync threshold, respects peer lag
    • Max retention (24 hours): Force delete after this time to prevent unbounded growth
  • Config validation enforces: gc_min >= delta_threshold and gc_max >= 2x delta_threshold
  • Each database tracks replication progress per peer
  • GC queries minimum applied txn_id across all peers before cleanup
  • Gap detection prevents data loss if GC runs while nodes are offline
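
A sketch of the GC watermark idea: only transactions at or below the minimum applied ID reported by all peers become eligible for deletion (the time-based min/max retention above still applies). Names are illustrative, not Marmot's internals.

```go
package main

import "fmt"

// gcWatermark returns the highest transaction ID that every peer has already
// applied; logs at or below it may be considered for garbage collection.
func gcWatermark(peerApplied map[string]uint64) uint64 {
    var min uint64
    first := true
    for _, applied := range peerApplied {
        if first || applied < min {
            min = applied
            first = false
        }
    }
    if first {
        return 0 // no peers known: retain everything
    }
    return min
}

func main() {
    applied := map[string]uint64{"node-2": 4_200, "node-3": 3_950}
    fmt.Println(gcWatermark(applied)) // 3950
}
```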

Consistency Guarantees

| Write Consistency | Behavior |
| --- | --- |
| ONE | Returns after 1 node ACK (fast, less durable) |
| QUORUM | Returns after majority ACK (default, balanced) |
| ALL | Returns after all nodes ACK (slow, most durable) |

Conflict Resolution:

  • All conflicts resolved via LWW using HLC timestamps
  • Deterministic convergence - the later write always wins, so all nodes end up with the same value
  • Tie-breaker: higher node ID wins for equal timestamps
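
A minimal sketch of this comparison, assuming a version stamp made of an HLC timestamp plus the writer's node ID (field and method names are illustrative):

```go
package main

import "fmt"

// Version is a hypothetical stamp attached to each row change.
type Version struct {
    HLC    uint64 // hybrid logical clock timestamp
    NodeID uint64 // originating node
}

// wins reports whether version a should overwrite version b under LWW.
func (a Version) wins(b Version) bool {
    if a.HLC != b.HLC {
        return a.HLC > b.HLC // later write wins
    }
    return a.NodeID > b.NodeID // equal timestamps: higher node ID wins
}

func main() {
    local := Version{HLC: 100, NodeID: 1}
    remote := Version{HLC: 100, NodeID: 3}
    fmt.Println(remote.wins(local)) // true: same HLC, node 3 breaks the tie
}
```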

Limitations

  • Selective Table Watching: All tables in a database are replicated. Selective table replication is not supported.
  • WAL Mode Required: SQLite must use WAL mode for reliable multi-process changes.
  • Eventually Consistent: Rows may sync out of order. SERIALIZABLE transaction assumptions may not hold across nodes.
  • Concurrent DDL: Avoid running concurrent DDL operations on the same database from multiple nodes (protected by cluster-wide lock with 30s lease).

Compatibility

AUTO_INCREMENT and Integer Types

IMPORTANT: Marmot automatically converts INT AUTO_INCREMENT to BIGINT

This is a breaking change from standard MySQL/SQLite behavior. Marmot does not respect 32-bit INT for auto-increment columns - they are automatically promoted to BIGINT to support distributed ID generation.

Why?

In a distributed, leaderless system, each node must generate unique IDs independently without coordination. Marmot uses HLC-based (Hybrid Logical Clock) 64-bit IDs to ensure:

  1. Global Uniqueness: IDs are unique across all nodes without central coordination
  2. Monotonicity: IDs increase over time (within each node)
  3. No Collisions: Unlike auto-increment sequences, HLC IDs cannot collide between nodes
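
One possible way to pack such an ID is sketched below: physical milliseconds, a logical counter, and the node ID share the 64 bits. The bit layout here is an assumption chosen for illustration; Marmot's actual encoding may differ.

```go
package main

import (
    "fmt"
    "time"
)

// Hypothetical bit split: time gets the high bits so IDs stay roughly
// time-ordered, the counter disambiguates same-millisecond writes, and the
// node ID guarantees two nodes never mint the same value.
const (
    counterBits = 12
    nodeBits    = 10
)

func hlcID(nowMillis uint64, counter uint16, nodeID uint16) uint64 {
    return nowMillis<<(counterBits+nodeBits) |
        uint64(counter&(1<<counterBits-1))<<nodeBits |
        uint64(nodeID&(1<<nodeBits-1))
}

func main() {
    id := hlcID(uint64(time.Now().UnixMilli()), 7, 3)
    fmt.Println(id) // a 64-bit, roughly time-ordered, node-unique ID
}
```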

How It Works:

  1. DDL Transformation: When you create a table with AUTO_INCREMENT:

    CREATE TABLE users (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(100))
    -- Becomes internally:
    CREATE TABLE users (id BIGINT PRIMARY KEY, name TEXT)
  2. DML ID Injection: When inserting with 0 or NULL for an auto-increment column:

    INSERT INTO users (id, name) VALUES (0, 'alice')
    -- Becomes internally:
    INSERT INTO users (id, name) VALUES (7318624812345678901, 'alice')
  3. Explicit IDs Preserved: If you provide an explicit non-zero ID, it is used as-is:

    INSERT INTO users (id, name) VALUES (12345, 'bob')
    -- Remains:
    INSERT INTO users (id, name) VALUES (12345, 'bob')

Important Considerations:

| Aspect | Behavior |
| --- | --- |
| ID Range | 64-bit (up to 9.2 quintillion) instead of 32-bit (4.2 billion) |
| ID Format | HLC-based, not sequential integers |
| SQLite ROWID | Not used - Marmot manages IDs explicitly |
| Client Libraries | Ensure your client handles BIGINT correctly (some JSON serializers may lose precision) |
| Existing Data | Migrate existing INT columns to BIGINT before enabling Marmot |

Schema-Based Detection:

Marmot automatically detects auto-increment columns by querying SQLite schema directly. A column is considered auto-increment if:

  • It is a single-column INTEGER PRIMARY KEY (SQLite rowid alias), or
  • It is a single-column BIGINT PRIMARY KEY (Marmot's transformed columns)

This means:

  • No registration required - columns are detected from schema at runtime
  • Works across restarts - no need to re-execute DDL statements
  • Works with existing databases - tables created directly on SQLite work too
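
A sketch of this detection against a live SQLite connection, assuming pragma_table_info is queried directly; the helper is hypothetical.

```go
package main

import (
    "database/sql"
    "fmt"
    "strings"

    _ "github.com/mattn/go-sqlite3"
)

// isAutoIncrementPK treats a table as auto-increment managed when it has
// exactly one primary-key column declared INTEGER (SQLite's rowid alias)
// or BIGINT (Marmot's transformed columns).
func isAutoIncrementPK(db *sql.DB, table string) (bool, string, error) {
    rows, err := db.Query(`SELECT name, type, pk FROM pragma_table_info(?)`, table)
    if err != nil {
        return false, "", err
    }
    defer rows.Close()

    var pkCols []string
    var pkType string
    for rows.Next() {
        var name, typ string
        var pk int
        if err := rows.Scan(&name, &typ, &pk); err != nil {
            return false, "", err
        }
        if pk > 0 {
            pkCols = append(pkCols, name)
            pkType = strings.ToUpper(typ)
        }
    }
    if len(pkCols) != 1 {
        return false, "", nil // composite or missing key: not auto-increment
    }
    if pkType == "INTEGER" || pkType == "BIGINT" {
        return true, pkCols[0], nil
    }
    return false, "", nil
}

func main() {
    db, _ := sql.Open("sqlite3", ":memory:")
    db.Exec(`CREATE TABLE users (id BIGINT PRIMARY KEY, name TEXT)`)
    ok, col, _ := isAutoIncrementPK(db, "users")
    fmt.Println(ok, col) // true id
}
```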

Configuration

Marmot v2 uses a TOML configuration file (default: config.toml). All settings have sensible defaults.

Core Configuration

node_id = 0  # 0 = auto-generate
data_dir = "./marmot-data"

Transaction Manager

[transaction]
heartbeat_timeout_seconds = 10  # Transaction timeout without heartbeat
conflict_window_seconds = 10    # Conflict resolution window
lock_wait_timeout_seconds = 50  # Lock wait timeout (MySQL: innodb_lock_wait_timeout)

Note: Transaction log garbage collection is managed by the replication configuration to coordinate with anti-entropy. See replication.gc_min_retention_hours and replication.gc_max_retention_hours.

Connection Pool

[connection_pool]
pool_size = 4              # Number of SQLite connections
max_idle_time_seconds = 10 # Max idle time before closing
max_lifetime_seconds = 300 # Max connection lifetime (0 = unlimited)

gRPC Client

[grpc_client]
keepalive_time_seconds = 10    # Keepalive ping interval
keepalive_timeout_seconds = 3  # Keepalive ping timeout
max_retries = 3                # Max retry attempts
retry_backoff_ms = 100         # Retry backoff duration

Coordinator

[coordinator]
prepare_timeout_ms = 2000 # Prepare phase timeout
commit_timeout_ms = 2000  # Commit phase timeout
abort_timeout_ms = 2000   # Abort phase timeout

Cluster

[cluster]
grpc_bind_address = "0.0.0.0"
grpc_port = 8080
seed_nodes = []                # List of seed node addresses
cluster_secret = ""            # PSK for cluster authentication (see Security section)
gossip_interval_ms = 1000      # Gossip interval
gossip_fanout = 3              # Number of peers to gossip to
suspect_timeout_ms = 5000      # Suspect timeout
dead_timeout_ms = 10000        # Dead timeout

Security

Marmot supports Pre-Shared Key (PSK) authentication for cluster communication. This is strongly recommended for production deployments.

[cluster]
# All nodes in the cluster must use the same secret
cluster_secret = "your-secret-key-here"

Environment Variable (Recommended):

For production, use the environment variable to avoid storing secrets in config files:

export MARMOT_CLUSTER_SECRET="your-secret-key-here"
./marmot-v2

The environment variable takes precedence over the config file.

Generating a Secret:

# Generate a secure random secret
openssl rand -base64 32

Behavior:

  • If cluster_secret is empty and MARMOT_CLUSTER_SECRET is not set, authentication is disabled
  • A warning is logged at startup when authentication is disabled
  • All gRPC endpoints (gossip, replication, snapshots) are protected when authentication is enabled
  • Nodes with mismatched secrets will fail to communicate (connection rejected with "invalid cluster secret")
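
For a sense of how PSK authentication can work over gRPC, the sketch below attaches the secret to outgoing calls via a client interceptor. The metadata key name is an assumption, not Marmot's actual header, and the server-side check is omitted.

```go
package main

import (
    "context"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/metadata"
)

// secretKey is a hypothetical metadata key; Marmot's real key may differ.
const secretKey = "x-marmot-cluster-secret"

// pskUnaryInterceptor attaches the shared secret to every unary RPC so the
// receiving node can reject calls whose secret does not match.
func pskUnaryInterceptor(secret string) grpc.UnaryClientInterceptor {
    return func(ctx context.Context, method string, req, reply interface{},
        cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
        ctx = metadata.AppendToOutgoingContext(ctx, secretKey, secret)
        return invoker(ctx, method, req, reply, cc, opts...)
    }
}

func main() {
    // Dial a peer with the secret attached to every unary RPC.
    conn, err := grpc.Dial("localhost:8080",
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        grpc.WithUnaryInterceptor(pskUnaryInterceptor("your-secret-key-here")),
    )
    if err != nil {
        panic(err)
    }
    defer conn.Close()
}
```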

Cluster Membership Management

Marmot provides admin HTTP endpoints for managing cluster membership (requires cluster_secret to be configured):

Node Lifecycle:

  • New/restarted nodes auto-join via gossip - no manual intervention needed
  • Nodes marked REMOVED via admin API cannot auto-rejoin - must be explicitly allowed
  • This prevents decommissioned nodes from accidentally rejoining the cluster

# View cluster members and quorum info
curl -H "X-Marmot-Secret: your-secret" http://localhost:8080/admin/cluster/members

# Remove a node from the cluster (excludes from quorum, blocks auto-rejoin)
curl -X POST -H "X-Marmot-Secret: your-secret" http://localhost:8080/admin/cluster/remove/2

# Allow a removed node to rejoin (node must then restart to join)
curl -X POST -H "X-Marmot-Secret: your-secret" http://localhost:8080/admin/cluster/allow/2

See the Operations documentation for detailed usage and examples.

Replication

[replication]
default_write_consistency = "QUORUM"      # Write consistency level: ONE, QUORUM, ALL
default_read_consistency = "LOCAL_ONE"    # Read consistency level
write_timeout_ms = 5000                   # Write operation timeout
read_timeout_ms = 2000                    # Read operation timeout

# Anti-Entropy: Background healing for eventual consistency
# - Detects and repairs divergence between replicas
# - Uses delta sync for small lags, snapshot for large lags
# - Includes gap detection to prevent incomplete data after GC
enable_anti_entropy = true                 # Enable automatic catch-up for lagging nodes
anti_entropy_interval_seconds = 60         # How often to check for lag (default: 60s)
delta_sync_threshold_transactions = 10000  # Delta sync if lag < 10K txns
delta_sync_threshold_seconds = 3600        # Snapshot if lag > 1 hour

# Garbage Collection: Reclaim disk space by deleting old transaction records
# - gc_min must be >= delta_sync_threshold (validated at startup)
# - gc_max should be >= 2x delta_sync_threshold (recommended)
# - Set gc_max = 0 for unlimited retention
gc_min_retention_hours = 2   # Keep at least 2 hours (>= 1 hour delta threshold)
gc_max_retention_hours = 24  # Force delete after 24 hours

Anti-Entropy Tuning:

  • Small clusters (2-3 nodes): Use default settings (60s interval)
  • Large clusters (5+ nodes): Consider increasing interval to 120-180s to reduce network overhead
  • High write throughput: Increase delta_sync_threshold_transactions to 50000+
  • Long-running clusters: Keep gc_max_retention_hours at 24+ to handle extended outages

GC Configuration Rules (Validated at Startup):

  • gc_min_retention_hours must be >= delta_sync_threshold_seconds (in hours)
  • gc_max_retention_hours should be >= 2x delta_sync_threshold_seconds
  • Violating these rules will cause startup failure with helpful error messages
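
These rules boil down to a small validation step; here is a sketch under the assumption of a duration-based config (field and function names are hypothetical):

```go
package main

import (
    "fmt"
    "time"
)

// validateGC checks the retention rules described above: the minimum
// retention must cover the delta sync window, and the maximum retention
// (when not unlimited) should cover at least twice that window.
func validateGC(gcMin, gcMax, deltaThreshold time.Duration) error {
    if gcMin < deltaThreshold {
        return fmt.Errorf("gc_min_retention_hours (%v) must be >= delta_sync_threshold_seconds (%v)", gcMin, deltaThreshold)
    }
    if gcMax != 0 && gcMax < 2*deltaThreshold {
        return fmt.Errorf("gc_max_retention_hours (%v) should be >= 2x delta_sync_threshold_seconds (%v)", gcMax, deltaThreshold)
    }
    return nil
}

func main() {
    // Defaults: 2h min retention, 24h max retention, 1h delta sync threshold.
    if err := validateGC(2*time.Hour, 24*time.Hour, time.Hour); err != nil {
        panic(err)
    }
    fmt.Println("GC configuration OK")
}
```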

Query Pipeline

[query_pipeline]
transpiler_cache_size = 10000  # LRU cache for MySQL→SQLite transpilation
validator_pool_size = 8        # SQLite connection pool for validation

MySQL Protocol Server

[mysql]
enabled = true
bind_address = "0.0.0.0"
port = 3306
max_connections = 1000

Logging

[logging]
verbose = false          # Enable verbose logging
format = "console"       # Log format: console or json

Prometheus Metrics

[prometheus]
enabled = true  # Metrics served on gRPC port at /metrics endpoint

Accessing Metrics:

# Metrics are multiplexed with gRPC on the same port
curl http://localhost:8080/metrics

# Prometheus scrape config
scrape_configs:
  - job_name: 'marmot'
    static_configs:
      - targets: ['node1:8080', 'node2:8080', 'node3:8080']

See config.toml for complete configuration reference with detailed comments.

Benchmarks

Performance benchmarks on a local development machine (Apple M-series, 3-node cluster, single machine):

Test Configuration

| Parameter | Value |
| --- | --- |
| Nodes | 3 (ports 3307, 3308, 3309) |
| Threads | 16 |
| Batch Size | 10 ops/transaction |
| Consistency | QUORUM |

Load Phase (INSERT-only)

| Metric | Value |
| --- | --- |
| Throughput | 4,175 ops/sec |
| TX Throughput | 417 tx/sec |
| Records Loaded | 200,000 |
| Errors | 0 |

Mixed Workload

| Metric | Value |
| --- | --- |
| Throughput | 3,370 ops/sec |
| TX Throughput | 337 tx/sec |
| Duration | 120 seconds |
| Total Operations | 404,930 |
| Errors | 0 |
| Retries | 37 (0.09%) |

Operation Distribution:

  • READ: 20%
  • UPDATE: 30%
  • INSERT: 35%
  • DELETE: 5%
  • UPSERT: 10%

Latency (Mixed Workload)

| Percentile | Latency |
| --- | --- |
| P50 | 4.3ms |
| P90 | 14.0ms |
| P95 | 36.8ms |
| P99 | 85.1ms |

Replication Verification

All 3 nodes maintained identical row counts (346,684 rows) throughout the test, confirming consistent replication.

Note: These benchmarks are from a local development machine with all nodes on the same host. Production deployments across multiple machines will have different characteristics based on network latency.

FAQs & Community

  • For FAQs, visit this page
  • For community, join our Discord or the discussions on GitHub


No packages published