Engineering the VillageSQL Extension Framework (VEF)
Extending MySQL has traditionally meant a difficult choice: maintain a complex fork or accept the performance and functional limitations of the legacy UDF, plugin, or component systems. We built VillageSQL to provide a new path: a stable, versioned Application Binary Interface (ABI) for loading custom types and logic into MySQL without modifying the core engine. The VillageSQL Extension Framework (VEF) provides that interface. What we have released in alpha is the foundation—the type system and execution model—upon which we will continue to build more advanced hooks that enable greater extensibility of the database.
Limitations of Existing Mechanisms
Extending the open-source community edition of MySQL has historically meant navigating complex architectural trade-offs. Legacy User-Defined Functions (UDFs) offer a weakly typed interface, where string and binary UDFs rely on manual char* validation without charset guarantees, and each requires manual global state management—all without any guarantee of binary compatibility across MySQL versions. Stored Procedures, while easier to write, suffer from the overhead of interpreted SQL and lack the means to execute performance-critical logic in native compiled code.
The MySQL plugin system provides a formal API, but historically most plugins bypassed it to access server symbols directly, a pattern so common that MySQL's newer component system was designed specifically to address it. The component system improved matters by introducing a service-based registry, but both systems still require familiarity with low-level binary interfaces and lack the bundle-level orchestration that makes VEF practical. MySQL components store only a component URI, with no version or integrity record in the database. VEF records a version and SHA256 hash and verifies both on every startup, which prevents silent drift between what the database expects and the binary that is actually loaded; MySQL's component system has no mechanism to detect that drift. Agentic AI can lower the barrier to entry for navigating these complex internals, but it cannot compensate for the structural fragility of a system that lacks the strict ABI guarantees required for seamless binary compatibility across updates. For most developers, these hurdles make the cost of adding custom logic and specialized data types prohibitively high.
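The version-and-hash check mentioned above can be pictured as a comparison between the stored installation record and what the server finds on disk. The following is a minimal C++ sketch under stated assumptions: `InstallRecord`, `StartupCheck`, and `verify_on_startup` are hypothetical names rather than VEF's actual API, and the SHA256 digest is assumed to have been computed elsewhere.

```cpp
#include <cassert>
#include <string>

// Hypothetical installation record; the field names are assumptions.
struct InstallRecord {
    std::string name;
    std::string version;
    std::string sha256_hex;  // digest recorded when the extension was installed
};

enum class StartupCheck { Ok, VersionMismatch, HashMismatch };

// Compares the recorded state with what was found on disk at startup.
// The caller is assumed to have read the version and hashed the bundle already.
StartupCheck verify_on_startup(const InstallRecord& rec,
                               const std::string& found_version,
                               const std::string& found_sha256_hex) {
    if (rec.version != found_version) return StartupCheck::VersionMismatch;
    if (rec.sha256_hex != found_sha256_hex) return StartupCheck::HashMismatch;
    return StartupCheck::Ok;
}

// Usage: a swapped binary is reported rather than loaded silently.
void example_drift_check() {
    const InstallRecord rec{"my_ext", "1.0.0", "aaaa"};
    assert(verify_on_startup(rec, "1.0.0", "aaaa") == StartupCheck::Ok);
    assert(verify_on_startup(rec, "1.0.0", "bbbb") == StartupCheck::HashMismatch);
}
```

A URI-only record has nothing to compare against, which is why this kind of drift would go unnoticed without the stored version and digest.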
VillageSQL’s Philosophy
To understand VEF, it's important to distinguish it from existing MySQL extension mechanisms—plugins, components, and UDFs alike. All of these approaches treat the database as a fixed system you add tools to from the outside.
In a UDF-based approach, the database remains blind to your data. If you add a UUID function to MySQL, the core engine still sees a VARCHAR or BINARY string. It doesn't know what a "UUID" is; it only knows how to run a function that generates one.
Extending the Type System
Instead of just adding functions, VillageSQL lets you add new types to the core database. This "type-aware" approach shifts the responsibility for data integrity and optimization from the application to the database itself.
By moving from "extending functions" to "extending types," VEF transforms MySQL from a rigid data store into a programmable platform. When you run INSTALL EXTENSION extension_name;, you aren't just adding a utility; you are upgrading the database's engine to natively understand a new kind of data.
The Architecture: Decoupled Logic, Native Performance
VEF moves custom logic into independent, versioned shared libraries loaded at runtime. This allows for rapid iteration on domain-specific features while keeping the database core clean and maintainable.
1. The .veb Bundle System
To make extending the type system practical for developers, we needed a distribution format that was more than just a compiled library. We created .veb (VillageSQL Extension Bundle)—a system that pairs a self-contained binary package with the source-level tools needed for validation. A VillageSQL extension project includes:
- The Manifest: A JSON file (manifest.json) defining naming, versioning, and metadata, packaged inside the .veb.
- The Binary: The compiled .so (or .dll when we support Windows) object, built against VEF headers to ensure ABI compatibility. This is the core of the .veb bundle.
- Test Suite: MTR-compatible test cases that ensure the extension behaves correctly in your specific environment before you package and install the bundle. These are included in the extension's source distribution.
Unlike traditional plugins that require placing a platform-specific library file in a server directory and then manually registering it through multiple steps, VEF streamlines the process to a single self-contained bundle and one atomic command: INSTALL EXTENSION extension_name;. While MySQL's component system improved upon legacy plugins by introducing a service-based architecture, VEF moves toward full orchestration. Once the bundle is in place, a single command drives the entire process — extracting the archive, loading the binary, registering all custom types and functions through the VEF API, and committing the installation record.
Under the hood, when you run the install command, the server manages the entire lifecycle to ensure stability:
- Verification: The server calculates a SHA256 hash of the bundle to confirm consistency with its registered state before any code is loaded.
- The Registry: The archive is extracted from its staging directory (--veb-dir) into a structured, versioned directory: {veb_dir}/_expanded/{name}/{sha256}/. This keeps user extensions architecturally separate from core MySQL system plugins.
- Architectural Isolation: The shared library is designed for strict symbol isolation. This ensures that an extension's internal logic never interferes with the MySQL kernel or other extensions, addressing a stability pain point in traditional plugin development that was recently highlighted in the VLDB survey on DBMS extensibility.
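The directory layout described above can be sketched in a few lines of C++17. This is only an illustration of the {veb_dir}/_expanded/{name}/{sha256}/ pattern: the function name `expanded_dir` and its parameters are assumptions, not part of the VEF headers.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Builds {veb_dir}/_expanded/{name}/{sha256} from its three components.
// Hypothetical helper; the server's real extraction code is not shown in this post.
fs::path expanded_dir(const fs::path& veb_dir,
                      const std::string& name,
                      const std::string& sha256_hex) {
    return veb_dir / "_expanded" / name / sha256_hex;
}

// Usage: the digest is the last path component.
void example_expanded_dir() {
    const fs::path dir = expanded_dir("/var/lib/veb", "my_ext", "aaaa");
    assert(dir.generic_string() == "/var/lib/veb/_expanded/my_ext/aaaa");
}
```

Because the digest is part of the path, two builds of the same extension expand into different directories instead of overwriting each other.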
2. Type Registration and Storage
A significant architectural decision in VEF is how we handle data storage: Custom types map to VARBINARY storage (binary-collated variable strings). This design choice prioritizes the MySQL ecosystem’s greatest strengths: compatibility and stability.
Why VARBINARY Mapping?
By utilizing MySQL's existing variable-length storage infrastructure, VEF extensions inherit the performance and stability of the MySQL engine. This design decision prioritizes:
- Ecosystem Compatibility: Because custom types map to VARBINARY, VillageSQL is designed so data replicates seamlessly to older replicas and works out-of-the-box with existing tools like mysqldump. (Note: To ensure replication integrity, extensions must be installed on all replicas.)
- Standard Storage Overhead: Custom types use the standard MySQL 1-byte length prefix (for types with a defined storage size ≤ 255 bytes) or a 2-byte prefix for larger type sizes.
- Data Integrity: Custom types are stored as standard VARBINARY. This ensures that even if an extension is removed or fails to load, the data remains physically accessible on disk as a binary blob.
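The storage overhead rule above reduces to simple arithmetic. The sketch below follows the rule as stated in this post (1-byte prefix up to 255 bytes, 2-byte prefix beyond); the function names are illustrative, and it deliberately ignores any other per-row overhead the storage engine adds.

```cpp
#include <cassert>
#include <cstddef>

// Length-prefix bytes for a type, based on its declared maximum storage size.
std::size_t length_prefix_bytes(std::size_t max_storage_bytes) {
    return max_storage_bytes <= 255 ? 1 : 2;
}

// Bytes used by one stored value: the prefix plus the payload itself.
std::size_t stored_size(std::size_t max_storage_bytes, std::size_t payload_bytes) {
    assert(payload_bytes <= max_storage_bytes);  // payload must fit the declared size
    return length_prefix_bytes(max_storage_bytes) + payload_bytes;
}
```

Under this rule, a 16-byte type such as a binary UUID would occupy 17 bytes per value.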
How It Works: The Semantic Layer
While InnoDB sees a VARBINARY string, the VEF framework intercepts the database's logic at the semantic layer. When you register a new type, the framework relies on a set of C++ callbacks that define the type’s behavior:
- compare(): Tells the B-tree index how to order the data (e.g., ensuring time-ordered types like UUID v1 sort by time rather than raw bytes).
- encode()/decode(): Manages the transformation between the user's view and the on-disk representation.
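To make the UUID v1 example concrete, here is a minimal sketch of a comparison that orders by the embedded timestamp instead of by raw bytes. It assumes the RFC 4122 field layout for version-1 UUIDs; the names `Uuid`, `uuid_v1_timestamp`, and `compare_uuid_v1` are hypothetical, and the real VEF callback signature may look different.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Uuid = std::array<std::uint8_t, 16>;

// Reassembles the 60-bit timestamp of a version-1 UUID (RFC 4122 layout):
// bytes 0-3 hold time_low, bytes 4-5 time_mid, bytes 6-7 version + time_hi.
std::uint64_t uuid_v1_timestamp(const Uuid& u) {
    const auto b = [&u](int i) { return static_cast<std::uint64_t>(u[i]); };
    const std::uint64_t time_low = (b(0) << 24) | (b(1) << 16) | (b(2) << 8) | b(3);
    const std::uint64_t time_mid = (b(4) << 8) | b(5);
    const std::uint64_t time_hi = ((b(6) & 0x0F) << 8) | b(7);  // drop version bits
    return (time_hi << 48) | (time_mid << 32) | time_low;
}

// Returns -1, 0, or 1. Orders by timestamp first, then by raw bytes so that
// distinct values with the same timestamp still have a stable order.
int compare_uuid_v1(const Uuid& a, const Uuid& b) {
    const std::uint64_t ta = uuid_v1_timestamp(a);
    const std::uint64_t tb = uuid_v1_timestamp(b);
    if (ta != tb) return ta < tb ? -1 : 1;
    if (a < b) return -1;
    return b < a ? 1 : 0;
}

// Usage: raw byte order and time order disagree for these two values.
void example_uuid_order() {
    const Uuid earlier{{0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x10, 0x00}};
    const Uuid later{{0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x10, 0x00}};
    assert(earlier > later);                      // bytewise, "earlier" sorts last
    assert(compare_uuid_v1(earlier, later) < 0);  // by time, it sorts first
}
```

The low 32 bits of the timestamp come first in the byte layout, which is why a plain bytewise comparison does not follow creation time.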
3. The Execution Model
While traditional MySQL uses a legacy UDF (User-Defined Function) system, VEF introduces VillageSQL Defined Functions (VDF)—a modern API designed for performance and stability.
Key Features of the VDF Model:
- Typed Function Signatures: Unlike the weak typing of legacy UDFs, VDFs support native type-checking for up to 8 parameters. This removes the burden of manual input validation from the extension developer and ensures the database rejects bad data before the function even runs.
- Performance Optimization: VEF passes data between the engine and extensions using direct pointers rather than copying values. Extensions receive const pointers to input buffers and can return results in their own pre-allocated memory. Avoiding these redundant copies keeps data movement costs low and provides a foundation for high-performance extensions such as AI and UUID processing.
- Versioning: To solve the binary compatibility issues that often plague database extensibility, VEF incorporates a protocol handshake at the architectural level. By establishing this versioning infrastructure in the initial release, we ensure that as the framework evolves, the server can automatically negotiate ABI compatibility with every extension before it loads, ensuring each side communicates using a mutually supported protocol version.
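Two of the ideas above, typed signatures and version negotiation, can be sketched together. This is an illustration under stated assumptions rather than the VDF API: `ArgType`, `signature_accepts`, `ProtocolRange`, and `negotiate` are hypothetical names, and the negotiation rule (pick the highest version both sides support) is one plausible reading of "mutually supported protocol version".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical type tags; the real VEF type identifiers are not shown in this post.
enum class ArgType { Int, Real, String, Custom };

constexpr std::size_t kMaxParams = 8;  // parameter limit stated above

// True when the call's argument types match the declared signature exactly.
bool signature_accepts(const std::vector<ArgType>& declared,
                       const std::vector<ArgType>& actual) {
    if (declared.size() > kMaxParams) return false;
    return declared == actual;
}

// Inclusive range of protocol versions that one side can speak.
struct ProtocolRange {
    std::uint32_t oldest;
    std::uint32_t newest;
};

// Highest version supported by both sides, or nullopt if the ranges do not overlap.
std::optional<std::uint32_t> negotiate(ProtocolRange server, ProtocolRange extension) {
    const std::uint32_t lo = std::max(server.oldest, extension.oldest);
    const std::uint32_t hi = std::min(server.newest, extension.newest);
    if (lo > hi) return std::nullopt;
    return hi;
}

// Usage: a mistyped call is rejected, and an incompatible extension is not loaded.
void example_vdf_checks() {
    assert(!signature_accepts({ArgType::Custom}, {ArgType::String}));
    assert(!negotiate({1, 2}, {3, 4}).has_value());
}
```

In both cases the check runs before any extension code does, so a mismatch surfaces as a clean rejection rather than as undefined behavior inside the shared library.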
Reference Extensions
We validated the initial VEF by building several extensions that showcase how to extend MySQL's type system and add functionality.
VEF Surface Area
We are excited about VEF’s future capabilities. While scalar functions and types provide immediate utility, the hooks we are working on will unlock entirely new classes of applications. Below is what VEF currently supports as well as our near-term roadmap:
- Supported: Custom data types (storage + comparison), scalar functions, simple aggregates, and hash functions for joins.
- Roadmap:
  - Custom Indexes: We will improve the extension framework to support custom indexes. This is the prerequisite for vector search and specialized data types.
  - Variable-length custom types: We will support data types that don't have a fixed size, such as strings or binary data, to provide more flexibility for extension developers.
  - Query Hooks: We are designing query hooks to allow extensions to intercept and modify queries at various stages of the lifecycle (such as parsing, planning, and execution). This will unlock advanced capabilities like custom query rewriting, specialized auditing, and custom execution routing.
Build Your Own Extension
We’re building VillageSQL for the community. To start authoring extensions, fork the vsql-extension-template. It provides the C++17 scaffold, CMake configurations, and VEF headers needed to build a binary-compatible extension. Try building an extension and let us know how it goes. Keep us informed on what you build by sending us a note on Discord. We are compiling a list of community-built extensions and plan to publish a directory.