Skip to main content
Version: dev

Zero-Copy Serialization

Zero-copy serialization allows large binary data (byte arrays, numeric arrays) to be serialized out-of-band, avoiding memory copies and reducing serialization overhead.

When to Use Zero-Copy

Use zero-copy serialization when:

  • Serializing large byte arrays or binary blobs
  • Working with numeric arrays (int[], double[], etc.)
  • Transferring data over high-performance networks
  • Memory efficiency is critical

How It Works

  1. Serialization: Large buffers are extracted and returned separately via a callback
  2. Transport: The main serialized data and buffer objects are transmitted separately
  3. Deserialization: Buffers are provided back to reconstruct the original object

This avoids copying large data into the main serialization buffer.

Java

import org.apache.fory.*;
import org.apache.fory.config.*;
import org.apache.fory.serializer.BufferObject;
import org.apache.fory.memory.MemoryBuffer;

import java.util.*;
import java.util.stream.Collectors;

public class ZeroCopyExample {
public static void main(String[] args) {
Fory fory = Fory.builder().withLanguage(Language.XLANG).build();

// Data with large arrays
List<Object> list = List.of(
"str",
new byte[1000], // Large byte array
new int[100], // Large int array
new double[100] // Large double array
);

// Collect buffer objects during serialization
Collection<BufferObject> bufferObjects = new ArrayList<>();
byte[] bytes = fory.serialize(list, e -> !bufferObjects.add(e));

// Convert to buffers for transport
List<MemoryBuffer> buffers = bufferObjects.stream()
.map(BufferObject::toBuffer)
.collect(Collectors.toList());

// Deserialize with buffers
Object result = fory.deserialize(bytes, buffers);
System.out.println(result);
}
}

Python

import array
import pyfory
import numpy as np

fory = pyfory.Fory()

# Data with large arrays
data = [
"str",
bytes(bytearray(1000)), # Large byte array
array.array("i", range(100)), # Large int array
np.full(100, 0.0, dtype=np.double) # Large numpy array
]

# Collect buffer objects during serialization
serialized_objects = []
serialized_data = fory.serialize(data, buffer_callback=serialized_objects.append)

# Convert to buffers for transport
buffers = [obj.to_buffer() for obj in serialized_objects]

# Deserialize with buffers
result = fory.deserialize(serialized_data, buffers=buffers)
print(result)

Go

package main

import forygo "github.com/apache/fory/go/fory"
import "fmt"

func main() {
fory := forygo.NewFory()

// Data with large arrays
list := []any{
"str",
make([]byte, 1000), // Large byte array
}

buf := fory.NewByteBuffer(nil)
var bufferObjects []fory.BufferObject

// Collect buffer objects during serialization
fory.Serialize(buf, list, func(o fory.BufferObject) bool {
bufferObjects = append(bufferObjects, o)
return false
})

// Convert to buffers for transport
var buffers []*fory.ByteBuffer
for _, o := range bufferObjects {
buffers = append(buffers, o.ToBuffer())
}

// Deserialize with buffers
var newList []any
if err := fory.Deserialize(buf, &newList, buffers); err != nil {
panic(err)
}
fmt.Println(newList)
}

JavaScript

// Zero-copy support coming soon

Use Cases

High-Performance Data Transfer

When sending large datasets over the network:

// Sender
Collection<BufferObject> buffers = new ArrayList<>();
byte[] metadata = fory.serialize(dataObject, e -> !buffers.add(e));

// Send metadata and buffers separately
network.sendMetadata(metadata);
for (BufferObject buf : buffers) {
network.sendBuffer(buf.toBuffer());
}

// Receiver
byte[] metadata = network.receiveMetadata();
List<MemoryBuffer> buffers = network.receiveBuffers();
Object data = fory.deserialize(metadata, buffers);

Memory-Mapped Files

Zero-copy works well with memory-mapped files:

// Write
Collection<BufferObject> buffers = new ArrayList<>();
byte[] data = fory.serialize(largeObject, e -> !buffers.add(e));
writeToFile("data.bin", data);
for (int i = 0; i < buffers.size(); i++) {
writeToFile("buffer" + i + ".bin", buffers.get(i).toBuffer());
}

// Read
byte[] data = readFromFile("data.bin");
List<MemoryBuffer> buffers = readBufferFiles();
Object result = fory.deserialize(data, buffers);

Performance Considerations

  1. Threshold: Small arrays may not benefit from zero-copy due to callback overhead
  2. Network: Zero-copy is most beneficial when buffers can be sent without copying
  3. Memory: Reduces peak memory usage by avoiding buffer copies

See Also