Zero-Copy Serialization
Zero-copy serialization allows large binary data (byte arrays, numeric arrays) to be serialized out-of-band, avoiding memory copies and reducing serialization overhead.
When to Use Zero-Copy
Use zero-copy serialization when:
- Serializing large byte arrays or binary blobs
- Working with numeric arrays (int[], double[], etc.)
- Transferring data over high-performance networks
- Memory efficiency is critical
How It Works
- Serialization: Large buffers are extracted and returned separately via a callback
- Transport: The main serialized data and buffer objects are transmitted separately
- Deserialization: Buffers are provided back to reconstruct the original object
This avoids copying large data into the main serialization buffer.
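The three steps above can be sketched without any Fory dependency by using Python's standard-library `pickle` protocol 5, which exposes the same out-of-band shape (`buffer_callback` on serialize, `buffers` on deserialize). This is an analogy chosen for illustration, not Fory's implementation, and the variable names are made up for the example.

```python
import pickle

# A large, writable payload that should travel out-of-band.
payload = bytearray(b"\x07" * 1000)
data = ["str", pickle.PickleBuffer(payload)]

# 1. Serialization: the callback receives each large buffer. list.append
#    returns None (a false value), which tells pickle to keep the buffer
#    out of the main stream.
out_of_band = []
main_stream = pickle.dumps(data, protocol=5, buffer_callback=out_of_band.append)

# 2. Transport: main_stream and out_of_band would be sent separately.
#    In this sketch they simply stay in memory.

# 3. Deserialization: the buffers are handed back in the same order.
result = pickle.loads(main_stream, buffers=out_of_band)

print(len(main_stream), "bytes in the main stream")
print(len(out_of_band), "out-of-band buffer(s)")
```

The main stream stays far smaller than the payload because the 1000-byte buffer never enters it; only a placeholder does.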
Java
import org.apache.fory.*;
import org.apache.fory.config.*;
import org.apache.fory.serializer.BufferObject;
import org.apache.fory.memory.MemoryBuffer;
import java.util.*;
import java.util.stream.Collectors;
public class ZeroCopyExample {
public static void main(String[] args) {
Fory fory = Fory.builder().withLanguage(Language.XLANG).build();
// Data with large arrays
List<Object> list = List.of(
"str",
new byte[1000], // Large byte array
new int[100], // Large int array
new double[100] // Large double array
);
// Collect buffer objects during serialization.
// add() returns true, so !add(e) yields false, which marks the buffer as out-of-band.
Collection<BufferObject> bufferObjects = new ArrayList<>();
byte[] bytes = fory.serialize(list, e -> !bufferObjects.add(e));
// Convert to buffers for transport
List<MemoryBuffer> buffers = bufferObjects.stream()
.map(BufferObject::toBuffer)
.collect(Collectors.toList());
// Deserialize with buffers
Object result = fory.deserialize(bytes, buffers);
System.out.println(result);
}
}
Python
import array
import pyfory
import numpy as np
fory = pyfory.Fory()
# Data with large arrays
data = [
"str",
bytes(bytearray(1000)), # Large byte array
array.array("i", range(100)), # Large int array
np.full(100, 0.0, dtype=np.double) # Large numpy array
]
# Collect buffer objects during serialization
serialized_objects = []
serialized_data = fory.serialize(data, buffer_callback=serialized_objects.append)
# Convert to buffers for transport
buffers = [obj.to_buffer() for obj in serialized_objects]
# Deserialize with buffers
result = fory.deserialize(serialized_data, buffers=buffers)
print(result)
Go
package main
import forygo "github.com/apache/fory/go/fory"
import "fmt"
func main() {
fory := forygo.NewFory()
// Data with large arrays
list := []any{
"str",
make([]byte, 1000), // Large byte array
}
// Package-level names use the forygo alias; the local variable fory is the Fory instance
buf := forygo.NewByteBuffer(nil)
var bufferObjects []forygo.BufferObject
// Collect buffer objects during serialization
fory.Serialize(buf, list, func(o forygo.BufferObject) bool {
bufferObjects = append(bufferObjects, o)
return false
})
// Convert to buffers for transport
var buffers []*forygo.ByteBuffer
for _, o := range bufferObjects {
buffers = append(buffers, o.ToBuffer())
}
// Deserialize with buffers
var newList []any
if err := fory.Deserialize(buf, &newList, buffers); err != nil {
panic(err)
}
fmt.Println(newList)
}
JavaScript
// Zero-copy support coming soon
Use Cases
High-Performance Data Transfer
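When sending large datasets over the network, send the serialized metadata and the out-of-band buffers as separate messages (`network` below is a placeholder for your transport layer):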
// Sender
Collection<BufferObject> buffers = new ArrayList<>();
byte[] metadata = fory.serialize(dataObject, e -> !buffers.add(e));
// Send metadata and buffers separately
network.sendMetadata(metadata);
for (BufferObject buf : buffers) {
network.sendBuffer(buf.toBuffer());
}
// Receiver (runs in a separate process, so the variable names are reused)
byte[] metadata = network.receiveMetadata();
List<MemoryBuffer> buffers = network.receiveBuffers();
Object data = fory.deserialize(metadata, buffers);
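One way to send the metadata and the buffers separately over a single byte stream is length-prefixed framing. The sketch below uses a hypothetical wire layout: the `write_frames` and `read_frames` names and the header format are assumptions for illustration, not part of Fory. An `io.BytesIO` stands in for a socket.

```python
import io
import struct


def write_frames(stream, metadata, buffers):
    """Write a buffer count, then metadata and each buffer with a length prefix."""
    stream.write(struct.pack("<I", len(buffers)))
    for chunk in (metadata, *buffers):
        view = memoryview(chunk).cast("B")  # byte view; the payload is not copied here
        stream.write(struct.pack("<Q", view.nbytes))
        stream.write(view)


def read_exact(stream, size):
    """Read exactly size bytes or fail; a socket-based version would loop on short reads."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("stream ended before a full frame was read")
    return data


def read_frames(stream):
    """Return (metadata, buffers) in the order they were written."""
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    chunks = []
    for _ in range(count + 1):  # metadata first, then the buffers
        (size,) = struct.unpack("<Q", read_exact(stream, 8))
        chunks.append(read_exact(stream, size))
    return chunks[0], chunks[1:]


channel = io.BytesIO()  # stand-in for a network connection
write_frames(channel, b"meta", [bytes(1000), b"\x01\x02\x03"])
channel.seek(0)
received_metadata, received_buffers = read_frames(channel)
```

The receiver gets the buffers back in the order they were collected, which is the order deserialization expects.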
Memory-Mapped Files
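Out-of-band buffers can be written to separate files, which a reader can then memory-map instead of loading into the heap (`writeToFile`, `readFromFile`, and `readBufferFiles` below are placeholder helpers):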
// Write
List<BufferObject> buffers = new ArrayList<>(); // List, because get(i) is used below
byte[] data = fory.serialize(largeObject, e -> !buffers.add(e));
writeToFile("data.bin", data);
for (int i = 0; i < buffers.size(); i++) {
writeToFile("buffer" + i + ".bin", buffers.get(i).toBuffer());
}
// Read (separate scope or process, so the variable names are reused)
byte[] data = readFromFile("data.bin");
List<MemoryBuffer> buffers = readBufferFiles();
Object result = fory.deserialize(data, buffers);
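The read side of this pattern can be illustrated with Python's standard `mmap` module: a buffer file is mapped and exposed as a `memoryview`, so its contents are paged in on demand rather than copied up front. This is a minimal sketch with a hypothetical file name and payload. Whether a given Fory binding accepts such a view directly as a deserialization buffer is an assumption to verify against its API.

```python
import mmap
import os
import tempfile

payload = bytes(range(256)) * 16  # 4096 bytes standing in for one out-of-band buffer

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "buffer0.bin")
    with open(path, "wb") as f:
        f.write(payload)

    # Map the whole file read-only; the mapping stays valid after f is closed.
    with open(path, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    view = memoryview(mapped)  # read-only view backed by the mapped file
    try:
        mapped_size = view.nbytes
        is_readonly = view.readonly
        header = mapped[:4]  # slicing the mmap copies only these 4 bytes
    finally:
        view.release()  # release the view first, or close() raises BufferError
        mapped.close()

print(mapped_size, is_readonly, header)
```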
Performance Considerations
- Threshold: Small arrays may not benefit from zero-copy, because the per-buffer callback and bookkeeping overhead can outweigh the cost of copying a few bytes
- Network: Zero-copy is most beneficial when buffers can be sent without copying
- Memory: Reduces peak memory usage by avoiding buffer copies
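The threshold point can be made concrete with a size-aware callback. The sketch below again uses standard-library `pickle` protocol 5 as a stand-in: a callback that returns a true value keeps the buffer in-band, and a false value sends it out-of-band. The Java and Go examples in this guide likewise return false for out-of-band buffers, but the exact in-band behavior of Fory's callbacks should be checked against its API documentation. The `THRESHOLD` value is a hypothetical cutoff, not a recommendation.

```python
import pickle

THRESHOLD = 256  # hypothetical cutoff in bytes; tune it by measurement

out_of_band = []


def buffer_callback(buf):
    """Keep small buffers in-band (true value); send large ones out-of-band (false)."""
    if buf.raw().nbytes < THRESHOLD:
        return True
    out_of_band.append(buf)
    return False


small = bytearray(b"s" * 16)
large = bytearray(b"L" * 4096)
data = [pickle.PickleBuffer(small), pickle.PickleBuffer(large)]

main_stream = pickle.dumps(data, protocol=5, buffer_callback=buffer_callback)
restored = pickle.loads(main_stream, buffers=out_of_band)

print(len(out_of_band), "buffer(s) sent out-of-band")
```

Only the 4096-byte buffer is collected; the 16-byte one is written into the main stream, where copying it is cheaper than tracking it separately.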
See Also
- Serialization - Standard serialization examples
- Python Out-of-Band Guide - Python-specific zero-copy details