TL;DR: Apache Fory C++ is a fast, cross-language serialization framework. It supports polymorphic types, circular references, and schema evolution, and it interoperates with Java, Python, Go, Rust, and JavaScript through a modern C++17 API that needs no runtime reflection.
- 🐙 GitHub: https://github.com/apache/fory
- 📚 Docs: https://fory.apache.org/docs/guide/cpp
The C++ Serialization Problem
Every C++ developer working in a polyglot environment eventually hits the same wall. Existing options force a painful choice:
- IDL-first frameworks (Protocol Buffers, FlatBuffers): Require upfront schema compilation, lose native C++ type expressiveness, and carry significant integration friction. Every type change means regenerating code across all languages in lock-step.
- Template-based C++ libraries (Boost.Serialization, cereal): Designed for C++ only, with no cross-language wire format, and polymorphic types need explicit registration boilerplate. They work well within a single language but break down at system boundaries.
- Hand-rolled binary formats: Fast but brittle — any schema change risks silent corruption, and every new type requires manual encode/decode logic.
Apache Fory C++ eliminates this trade-off. It delivers performance competitive with the fastest C++ serialization libraries while providing first-class support for polymorphism, shared/circular references, schema evolution, and binary compatibility with Java, Python, Go, Rust, and JavaScript — through a clean C++17 API.
What Makes Apache Fory C++ Different?
Compile-Time Code Generation
Most serialization frameworks pay a runtime cost for flexibility, inspecting type information through virtual dispatch or hash maps on every call. Apache Fory takes a different approach: the FORY_STRUCT macro uses C++ template metaprogramming to generate the serialization logic at compile time. The result is type-specific code that the compiler can inline, with no runtime reflection and no virtual dispatch per field:
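#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

#include "fory/serialization/fory.h"
using namespace fory::serialization;

struct Person {
  std::string name;
  int32_t age;
  std::vector<std::string> hobbies;
  std::map<std::string, std::string> metadata;
  std::optional<std::string> nickname;
};
// Lists the fields to serialize; the serialization code is generated from this list.
FORY_STRUCT(Person, name, age, hobbies, metadata, nickname);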
That single macro generates compile-time field metadata, efficient serialization/deserialization code via ADL (Argument-Dependent Lookup), and type registration hooks. The macro can be placed inside the class body to access private fields, or at namespace scope for third-party types.
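To make the idea of compile-time field metadata concrete, here is a minimal sketch of how a static field list can drive serialization without any runtime type inspection. This is not Fory's implementation or wire format: `Buffer`, `write_field`, `write_struct`, the `fields()` member, and the byte layout are all hypothetical names and choices made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical sketch: NOT Fory's implementation or wire format.
using Buffer = std::vector<std::uint8_t>;

// Append a 32-bit integer in host byte order.
inline void write_field(Buffer& out, std::int32_t v) {
  std::uint8_t raw[sizeof v];
  std::memcpy(raw, &v, sizeof v);
  out.insert(out.end(), raw, raw + sizeof v);
}

// Append a string as a 32-bit length followed by its bytes.
inline void write_field(Buffer& out, const std::string& s) {
  assert(s.size() <= 0x7fffffff && "length must fit in 32 bits");
  write_field(out, static_cast<std::int32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

struct Account {
  std::string owner;
  std::int32_t balance = 0;

  // Assumption: the kind of field list a macro could expand to.
  static constexpr auto fields() {
    return std::make_tuple(&Account::owner, &Account::balance);
  }
};

// Visit every listed field in order. The overload of write_field is
// chosen per field at compile time, so there is no runtime type lookup.
template <typename T>
void write_struct(Buffer& out, const T& value) {
  std::apply(
      [&](auto... member) { (write_field(out, value.*member), ...); },
      T::fields());
}
```

Calling `write_struct(buffer, Account{"ann", 7})` appends the length-prefixed owner followed by the balance. Because the field list is a constant expression, the compiler sees the whole call chain and is free to inline it.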
Cross-Language Binary Protocol
Apache Fory C++ speaks the same binary wire format as Java, Python, Go, Rust, and JavaScript. Serialize a struct in C++, deserialize it in Python — no adaptation layer, no schema translation, no version negotiation needed. This is especially powerful for microservice architectures where different teams own different services in different languages:
// C++: Serialize
auto fory = Fory::builder().xlang(true).build();
fory.register_struct<Person>(100);
auto bytes = fory.serialize(person).value();
# Python: Deserialize (same binary format, same type ID)
fory = pyfory.Fory(xlang=True)
fory.register(Person, type_id=100) # Same ID as C++
person = fory.deserialize(data)
Cross-language interoperability has three core requirements: consistent type IDs or names across participating runtimes, matching canonical field names, and compatible field types for those names.
Polymorphism via Smart Pointers
Serializing polymorphic objects is notoriously difficult in C++. Most frameworks require manual type tagging or generate large amounts of boilerplate. Apache Fory handles it automatically: it detects polymorphic types via std::is_polymorphic<T> and preserves the full runtime type identity through std::shared_ptr and std::unique_ptr. When you deserialize a shared_ptr<Animal> that holds a Dog, you get a Dog back — no extra code required:
struct Animal { virtual ~Animal() = default; int32_t age = 0; };
FORY_STRUCT(Animal, age);
struct Dog : Animal { std::string breed; };
FORY_STRUCT(Dog, FORY_BASE(Animal), breed);
struct Cat : Animal { std::string color; };
FORY_STRUCT(Cat, FORY_BASE(Animal), color);
struct Shelter { std::vector<std::shared_ptr<Animal>> animals; };
FORY_STRUCT(Shelter, animals);
auto fory = Fory::builder().track_ref(true).build();
fory.register_struct<Shelter>(10);
fory.register_struct<Dog>(11);
fory.register_struct<Cat>(12);
Shelter s;
s.animals.push_back(std::make_shared<Dog>()); // Dog at runtime
s.animals.push_back(std::make_shared<Cat>()); // Cat at runtime
auto decoded = fory.deserialize<Shelter>(fory.serialize(s).value()).value();
assert(dynamic_cast<Dog*>(decoded.animals[0].get()) != nullptr); // Runtime type preserved!
Fory also supports std::unique_ptr for exclusive-ownership polymorphic fields, and collections of smart pointers (std::vector<std::shared_ptr<Base>>, std::map<K, std::unique_ptr<Base>>).
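If you are curious how a runtime type can survive a round trip at all, the usual technique is a registry that maps each concrete type to a numeric ID on the way out and maps the ID back to a factory on the way in. The sketch below shows that technique in isolation. It is not Fory's internals: `TypeRegistry`, `add`, `id_of`, `create`, and the `Shape` hierarchy are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical sketch: NOT Fory's internals.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

class TypeRegistry {
 public:
  // Associate a concrete type with a numeric ID and a factory.
  template <typename T>
  void add(std::uint32_t id) {
    ids_[std::type_index(typeid(T))] = id;
    factories_[id] = [] {
      return std::shared_ptr<Shape>(std::make_shared<T>());
    };
  }

  // Writer side: typeid on a polymorphic reference yields the dynamic type.
  std::uint32_t id_of(const Shape& s) const {
    auto it = ids_.find(std::type_index(typeid(s)));
    assert(it != ids_.end() && "type was never registered");
    return it->second;
  }

  // Reader side: rebuild an object of the concrete type named by the ID.
  std::shared_ptr<Shape> create(std::uint32_t id) const {
    auto it = factories_.find(id);
    assert(it != factories_.end() && "unknown type ID");
    return it->second();
  }

 private:
  std::unordered_map<std::type_index, std::uint32_t> ids_;
  std::unordered_map<std::uint32_t,
                     std::function<std::shared_ptr<Shape>()>> factories_;
};
```

This also suggests why both sides of an exchange must agree on type IDs: the ID is the only thing on the wire that identifies which concrete class to construct.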
Shared/Circular Reference Tracking
Many real-world data models contain shared objects or cycles: a parent node pointing to its children, which point back to the parent; an order referencing a customer who appears in multiple orders. Serializers without reference tracking either duplicate the data (wasting space) or recurse until the stack overflows when they encounter a cycle.
With track_ref(true), Fory tracks object identity across the entire graph. Shared objects are serialized exactly once; every subsequent reference is encoded as a back-reference. Cycles terminate naturally:
struct Node {
virtual ~Node() = default;
int32_t id = 0;
std::vector<std::shared_ptr<Node>> neighbors;
};
FORY_STRUCT(Node, id, neighbors);
auto fory = Fory::builder().track_ref(true).build();
fory.register_struct<Node>(200);
auto node1 = std::make_shared<Node>(); node1->id = 1;
auto node2 = std::make_shared<Node>(); node2->id = 2;
node1->neighbors.push_back(node2);
node2->neighbors.push_back(node1); // Cycle — handled correctly!
auto bytes = fory.serialize(node1).value();
// No stack overflow, no duplicate data — the cycle is preserved faithfully
auto decoded = fory.deserialize<std::shared_ptr<Node>>(bytes).value();
This makes Fory a natural fit for graph databases, entity-component systems, and any domain model with bidirectional relationships.
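The back-reference idea is easy to sketch. The writer below remembers every object it has already visited and emits an index instead of recursing a second time. It covers only the write side and leaves out details a real format needs (such as neighbor counts), and it is not Fory's wire format: `GraphNode`, `Token`, and `RefWriter` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: NOT Fory's wire format.
struct GraphNode {
  std::int32_t id = 0;
  std::vector<std::shared_ptr<GraphNode>> neighbors;
};

struct Token {
  enum Kind { kNewObject, kBackRef } kind;
  std::int32_t value;  // node id for kNewObject, object index for kBackRef
};

class RefWriter {
 public:
  void write(const std::shared_ptr<GraphNode>& node) {
    assert(node != nullptr);
    auto found = seen_.find(node.get());
    if (found != seen_.end()) {
      // Already written: emit its index instead of recursing again.
      tokens_.push_back({Token::kBackRef, found->second});
      return;
    }
    // Record the object BEFORE visiting its neighbors so cycles terminate.
    const auto index = static_cast<std::int32_t>(seen_.size());
    seen_.emplace(node.get(), index);
    tokens_.push_back({Token::kNewObject, node->id});
    for (const auto& next : node->neighbors) write(next);
  }

  const std::vector<Token>& tokens() const { return tokens_; }

 private:
  std::unordered_map<const GraphNode*, std::int32_t> seen_;
  std::vector<Token> tokens_;
};
```

For the two-node cycle from the example above, this writer produces three tokens: a new object for node 1, a new object for node 2, and a back-reference to index 0. The key design choice is registering the object before descending into its children; registering it afterwards would recurse forever on a cycle.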
Schema Evolution
In a microservice deployment, services update independently. A new version of the user service may add an email field while old consumers are still running. Without schema evolution support, this forces a coordinated, big-bang deployment. Apache Fory's compatible mode removes this constraint:
// Version 1
struct UserV1 { std::string name; int32_t age; };
FORY_STRUCT(UserV1, name, age);
// Version 2: adds an email field
struct UserV2 { std::string name; int32_t age; std::string email; };
FORY_STRUCT(UserV2, name, age, email);
auto fory_v1 = Fory::builder().compatible(true).xlang(true).build();
auto fory_v2 = Fory::builder().compatible(true).xlang(true).build();
fory_v1.register_struct<UserV1>(100);
fory_v2.register_struct<UserV2>(100); // Same type ID enables evolution
auto bytes = fory_v1.serialize(UserV1{"Alice", 30}).value();
auto v2 = fory_v2.deserialize<UserV2>(bytes).value();
assert(v2.name == "Alice" && v2.email == ""); // Default for missing field
In compatible mode, fields are matched by name rather than position. New fields receive C++ default values when missing; removed fields are safely skipped. This enables rolling upgrades and independent service deployments, as long as the fields that both versions share keep compatible types.
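Name-based matching can be sketched in a few lines. The reader below walks an encoded record of (name, value) pairs, fills the fields it knows, leaves the rest at their defaults, and skips names it does not recognize. It is only an illustration of the matching rule, not Fory's compatible-mode encoding: `Value`, `EncodedRecord`, `ProfileV2`, `expect`, and `read_profile_v2` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical sketch: NOT Fory's compatible-mode encoding.
// Each encoded field carries its name so the reader can match by name.
using Value = std::variant<std::int32_t, std::string>;
using EncodedRecord = std::vector<std::pair<std::string, Value>>;

struct ProfileV2 {
  std::string name;
  std::int32_t age = 0;
  std::string email;  // newer field; stays empty if the writer lacked it
};

// Shared fields must keep a compatible type across versions.
template <typename T>
const T& expect(const Value& v) {
  assert(std::holds_alternative<T>(v) && "field type changed incompatibly");
  return std::get<T>(v);
}

inline ProfileV2 read_profile_v2(const EncodedRecord& record) {
  ProfileV2 out;  // members start at their C++ defaults
  for (const auto& [field, value] : record) {
    if (field == "name") {
      out.name = expect<std::string>(value);
    } else if (field == "age") {
      out.age = expect<std::int32_t>(value);
    } else if (field == "email") {
      out.email = expect<std::string>(value);
    }
    // Any other field name is unknown to this reader and is skipped.
  }
  return out;
}
```

A record written by an older schema (no `email`) and a record written by a newer one (with an extra field this reader has never heard of) both decode without error, which is the property that makes rolling upgrades possible.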
Row Format: Zero-Copy Analytics
Beyond object graph serialization, Apache Fory C++ implements a row-based binary format designed for analytics workloads. The row format stores data in a contiguous memory layout with a null bitmap, fixed-size slots for primitives, and a variable-length section for strings and nested objects. This enables O(1) random field access by index — you can read a single field from a large struct without touching the rest of the data.
This is particularly valuable in data pipelines and OLAP workloads where only a small subset of fields is queried per record:
#include "fory/encoder/row_encoder.h"
using namespace fory::row::encoder;
struct SensorReading {
int32_t sensor_id; double temperature; std::string location;
FORY_STRUCT(SensorReading, sensor_id, temperature, location);
};
RowEncoder<SensorReading> encoder;
encoder.encode({42, 23.5, "rack-B"});
auto row = encoder.get_writer().to_row();
// Read any field in O(1) — no deserialization of unused fields
int32_t id = row->get_int32(0);
double temp = row->get_double(1);
auto loc = row->get_string(2);
For zero-copy access into an existing buffer (e.g., from a memory-mapped file or network receive buffer), Fory can point a Row directly at the memory without copying:
auto src = encoder.get_writer().to_row();
fory::row::Row view(src->schema());
view.point_to(src->buffer(), src->base_offset(), src->size_bytes()); // Zero-copy view
int32_t id = view.get_int32(0); // Reads directly from the original buffer
Use the row format for analytics, OLAP-style workloads, and partial field access. Use object graph serialization for full object round-trips with references and polymorphism.
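To see why this kind of layout gives O(1) field access, consider a simplified row with a null bitmap, one fixed 8-byte slot per field, and a trailing variable-length section. The sketch below follows the general shape described above but is not Fory's exact row layout: `TinyRow`, its methods, and the way a string slot packs offset and size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical sketch: NOT Fory's exact row layout.
// Layout: [8-byte null bitmap][num_fields x 8-byte slots][variable bytes]
class TinyRow {
 public:
  explicit TinyRow(std::size_t num_fields)
      : num_fields_(num_fields), bytes_(8 + 8 * num_fields, 0) {
    assert(num_fields <= 64 && "one 64-bit bitmap word in this sketch");
  }

  void set_null(std::size_t i) { bytes_[i / 8] |= std::uint8_t(1u << (i % 8)); }
  bool is_null(std::size_t i) const { return (bytes_[i / 8] >> (i % 8)) & 1u; }

  void set_int64(std::size_t i, std::int64_t v) {
    std::memcpy(&bytes_[slot(i)], &v, sizeof v);
  }
  std::int64_t get_int64(std::size_t i) const {
    std::int64_t v;
    std::memcpy(&v, &bytes_[slot(i)], sizeof v);
    return v;
  }

  // Strings live in the variable section; the slot stores (offset << 32 | size).
  void set_string(std::size_t i, std::string_view s) {
    const std::uint64_t offset = bytes_.size();
    bytes_.insert(bytes_.end(), s.begin(), s.end());
    set_int64(i, static_cast<std::int64_t>((offset << 32) | s.size()));
  }
  // The returned view points into the row and is invalidated by later writes.
  std::string_view get_string(std::size_t i) const {
    const auto packed = static_cast<std::uint64_t>(get_int64(i));
    return {reinterpret_cast<const char*>(bytes_.data()) + (packed >> 32),
            static_cast<std::size_t>(packed & 0xffffffffu)};
  }

 private:
  // A slot's position depends only on its index: this is the O(1) access.
  std::size_t slot(std::size_t i) const {
    assert(i < num_fields_);
    return 8 + 8 * i;
  }
  std::size_t num_fields_;
  std::vector<std::uint8_t> bytes_;
};
```

Reading field 2 never touches fields 0 and 1, and `get_string` hands back a view into the row's own bytes rather than a copy, which is the same property the zero-copy `Row` view relies on.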
Installation
Fory C++ requires a C++17-compatible compiler (GCC 7+, Clang 5+, MSVC 2017+) and supports both CMake and Bazel build systems.
CMake (FetchContent)
The simplest integration is via CMake's FetchContent module, which fetches and builds Fory as part of your project:
cmake_minimum_required(VERSION 3.16)
project(my_project LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 17)
include(FetchContent)
FetchContent_Declare(fory
GIT_REPOSITORY https://github.com/apache/fory.git
GIT_TAG v0.15.0
SOURCE_SUBDIR cpp)
FetchContent_MakeAvailable(fory)
add_executable(my_app main.cc)
target_link_libraries(my_app PRIVATE fory::serialization)
Bazel
For Bazel-based projects, add Fory as a module dependency:
# MODULE.bazel
bazel_dep(name = "fory", version = "0.15.0")
git_override(
    module_name = "fory",
    remote = "https://github.com/apache/fory.git",
    commit = "v0.15.0",
)

# BUILD
cc_binary(
    name = "my_app",
    srcs = ["main.cc"],
    deps = ["@fory//cpp/fory/serialization:fory_serialization"],
)
