Version: dev

Row Format

Apache Fory™ provides a high-performance row format for zero-copy deserialization.

Overview

Unlike traditional object serialization, which reconstructs entire objects in memory, the row format enables random access to individual fields directly from the binary data, without full deserialization.

Key benefits:

  • Zero-copy access: Read fields without allocating or copying data
  • Partial deserialization: Access only the fields you need
  • Memory-mapped files: Work with data larger than RAM
  • Cache-friendly: Sequential memory layout for better CPU cache utilization
  • Lazy evaluation: Defer expensive operations until field access

When to Use Row Format

  • Analytics workloads with selective field access
  • Large datasets where only a subset of fields is needed
  • Memory-constrained environments
  • High-throughput data pipelines
  • Reading from memory-mapped files or shared memory

Basic Usage

use fory::{to_row, from_row};
use fory::ForyRow;
use std::collections::BTreeMap;

#[derive(ForyRow)]
struct UserProfile {
    id: i64,
    username: String,
    email: String,
    scores: Vec<i32>,
    preferences: BTreeMap<String, String>,
    is_active: bool,
}

let profile = UserProfile {
    id: 12345,
    username: "alice".to_string(),
    email: "alice@example.com".to_string(),
    scores: vec![95, 87, 92, 88],
    preferences: BTreeMap::from([
        ("theme".to_string(), "dark".to_string()),
        ("language".to_string(), "en".to_string()),
    ]),
    is_active: true,
};

// Serialize to row format
let row_data = to_row(&profile);

// Zero-copy deserialization - no object allocation!
let row = from_row::<UserProfile>(&row_data);

// Access fields directly from binary data
assert_eq!(row.id(), 12345);
assert_eq!(row.username(), "alice");
assert_eq!(row.email(), "alice@example.com");
assert!(row.is_active());

// Access collections efficiently
let scores = row.scores();
assert_eq!(scores.size(), 4);
assert_eq!(scores.get(0), 95);
assert_eq!(scores.get(1), 87);

let prefs = row.preferences();
assert_eq!(prefs.keys().size(), 2);
assert_eq!(prefs.keys().get(0), "language");
assert_eq!(prefs.values().get(0), "en");
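
In the example above, `language` is read at index 0 even though `theme` was inserted first. That is consistent with `BTreeMap` iterating in sorted key order. The standard-library-only sketch below shows that ordering on its own; the helper `iteration_order` is a hypothetical name introduced for illustration, and the sketch assumes (as the assertions above suggest) that the row encoding preserves the map's iteration order.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: collects the entries in the order the map yields them.
fn iteration_order(map: &BTreeMap<String, String>) -> Vec<(&str, &str)> {
    map.iter().map(|(k, v)| (k.as_str(), v.as_str())).collect()
}

fn main() {
    // Same insertion order as the profile above: "theme" first, then "language".
    let prefs = BTreeMap::from([
        ("theme".to_string(), "dark".to_string()),
        ("language".to_string(), "en".to_string()),
    ]);

    // BTreeMap iterates by key order, not insertion order.
    assert_eq!(
        iteration_order(&prefs),
        [("language", "en"), ("theme", "dark")]
    );
}
```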

How It Works

  • Fields are encoded in a binary row, with primitives at fixed offsets
  • Variable-length data (strings, collections) is stored with offset pointers
  • A null bitmap tracks which fields are present
  • Nested structures are supported through recursive row encoding
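
To make these points concrete, here is a minimal sketch of a row with a null bitmap, fixed-width slots, and offset pointers for variable-length data. It is a toy layout chosen for illustration and is not Fory's actual wire format: the 8-byte bitmap, the 8-byte slots, the `(offset << 32) | length` pointer packing, and the `RowWriter` / `RowReader` types are all assumptions of this sketch.

```rust
// Toy layout, NOT Fory's wire format:
// [ 8-byte null bitmap | one 8-byte slot per field | variable-length bytes ]
const BITMAP_BYTES: usize = 8;
const SLOT_BYTES: usize = 8;

/// Hypothetical writer for a row with a fixed number of fields.
struct RowWriter {
    buf: Vec<u8>,
}

impl RowWriter {
    fn new(num_fields: usize) -> Self {
        // The bitmap and every slot start zeroed.
        RowWriter { buf: vec![0u8; BITMAP_BYTES + num_fields * SLOT_BYTES] }
    }

    fn slot_start(field: usize) -> usize {
        BITMAP_BYTES + field * SLOT_BYTES
    }

    /// Marks a field as null by setting its bit in the bitmap.
    fn set_null(&mut self, field: usize) {
        self.buf[field / 8] |= 1 << (field % 8);
    }

    /// Primitives are stored inline in the field's fixed slot.
    fn write_i64(&mut self, field: usize, value: i64) {
        let start = Self::slot_start(field);
        self.buf[start..start + SLOT_BYTES].copy_from_slice(&value.to_le_bytes());
    }

    /// Strings are appended after the slots; the slot holds (offset << 32) | length.
    fn write_str(&mut self, field: usize, value: &str) {
        let offset = self.buf.len() as u64;
        self.buf.extend_from_slice(value.as_bytes());
        let pointer = (offset << 32) | value.len() as u64;
        let start = Self::slot_start(field);
        self.buf[start..start + SLOT_BYTES].copy_from_slice(&pointer.to_le_bytes());
    }

    fn finish(self) -> Vec<u8> {
        self.buf
    }
}

/// Hypothetical reader that borrows the row bytes instead of copying them.
struct RowReader<'a> {
    buf: &'a [u8],
}

impl<'a> RowReader<'a> {
    fn slot(&self, field: usize) -> u64 {
        let start = BITMAP_BYTES + field * SLOT_BYTES;
        let mut raw = [0u8; SLOT_BYTES];
        raw.copy_from_slice(&self.buf[start..start + SLOT_BYTES]);
        u64::from_le_bytes(raw)
    }

    fn is_null(&self, field: usize) -> bool {
        (self.buf[field / 8] & (1 << (field % 8))) != 0
    }

    fn get_i64(&self, field: usize) -> i64 {
        self.slot(field) as i64
    }

    /// Returns a &str that points into the row buffer; no String is allocated.
    fn get_str(&self, field: usize) -> &'a str {
        let pointer = self.slot(field);
        let offset = (pointer >> 32) as usize;
        let len = (pointer & 0xFFFF_FFFF) as usize;
        std::str::from_utf8(&self.buf[offset..offset + len]).expect("valid UTF-8")
    }
}

fn main() {
    let mut writer = RowWriter::new(3);
    writer.write_i64(0, 12345);
    writer.write_str(1, "alice");
    writer.set_null(2);
    let bytes = writer.finish();

    let row = RowReader { buf: &bytes };
    assert_eq!(row.get_i64(0), 12345);
    assert_eq!(row.get_str(1), "alice");
    assert!(row.is_null(2));
    assert!(!row.is_null(0));
}
```

Because every slot sits at a position computed from the field index alone, reading field 1 never requires parsing field 0. That property is what the fixed-offset design is meant to provide.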

Performance Comparison

| Operation            | Object Format                 | Row Format                     |
|----------------------|-------------------------------|--------------------------------|
| Full deserialization | Allocates all objects         | Zero allocation                |
| Single field access  | Full deserialization required | Direct offset read             |
| Memory usage         | Full object graph in memory   | Only accessed fields in memory |
| Suitable for         | Small objects, full access    | Large objects, selective access |
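
The "direct offset read" case can be sketched with the standard library alone. The example below sums one field across many records by reading 4 bytes at a known offset in each record and skipping the rest. The 16-byte record layout, the constants, and the functions `encode` and `sum_scores` are hypothetical and only stand in for the idea; they do not reflect Fory's encoding or API.

```rust
// Hypothetical fixed-width record: [id: i64 | score: i32 | flags: u32] = 16 bytes.
const RECORD_BYTES: usize = 16;
const SCORE_OFFSET: usize = 8;

/// Packs (id, score, flags) tuples into one contiguous little-endian buffer.
fn encode(records: &[(i64, i32, u32)]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(records.len() * RECORD_BYTES);
    for &(id, score, flags) in records {
        buf.extend_from_slice(&id.to_le_bytes());
        buf.extend_from_slice(&score.to_le_bytes());
        buf.extend_from_slice(&flags.to_le_bytes());
    }
    buf
}

/// Sums only the `score` field; `id` and `flags` are never decoded.
fn sum_scores(buf: &[u8]) -> i64 {
    buf.chunks_exact(RECORD_BYTES)
        .map(|record| {
            let mut raw = [0u8; 4];
            raw.copy_from_slice(&record[SCORE_OFFSET..SCORE_OFFSET + 4]);
            i64::from(i32::from_le_bytes(raw))
        })
        .sum()
}

fn main() {
    let buf = encode(&[(1, 95, 0), (2, 87, 1), (3, 92, 0)]);
    assert_eq!(sum_scores(&buf), 274);
}
```

An object-format equivalent would first rebuild every record as a struct before summing, which is the cost the table attributes to full deserialization.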

ForyRow vs ForyObject

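| Feature         | #[derive(ForyRow)]   | #[derive(ForyObject)]      |
|-----------------|----------------------|----------------------------|
| Deserialization | Zero-copy, lazy      | Full object reconstruction |
| Field access    | Direct from binary   | Normal struct access       |
| Memory usage    | Minimal              | Full object                |
| Best for        | Analytics, large data | General serialization     |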