Fory Java Serialization Format
Spec overview
Fory Java Serialization is an automatic object serialization framework that supports references and polymorphism. Fory converts objects to and from the Fory Java serialization binary format. Fory has two core concepts for Java serialization:
- Fory Java Binary format
- Framework to convert object to/from Fory Java Binary format
The serialization format is a dynamic binary format. Its dynamic nature and its support for references and polymorphism make Fory flexible and much easier to use, but they also make the format more complex than that of static serialization frameworks.
Here is the overall format:
| fory header | object ref meta | object class meta | object value data |
Data is serialized in little-endian byte order overall. If byte swapping is costly for some object, Fory writes the byte order for that object into the data instead of converting it to little endian.
Fory header
The Fory header starts with one byte of flags, optionally followed by 4 bytes:
| 4 bits | 1 bit | 1 bit | 1 bit | 1 bit | optional 4 bytes |
+---------------+-------+-------+--------+-------+------------------------------------+
| reserved bits | oob | xlang | endian | null | unsigned int for meta start offset |
- null flag: 1 when object is null, 0 otherwise. If an object is null, other bits won't be set.
- endian flag: 1 when data is encoded in little endian, 0 for big endian.
- xlang flag: 1 when serialization uses xlang format, 0 when serialization uses Fory java format.
- oob flag: 1 when the passed `BufferCallback` is not null, 0 otherwise.
If meta share mode is enabled, an uncompressed unsigned int is appended to indicate the start offset of metadata.
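The flag byte described above can be sketched as plain bit operations. This is a minimal illustration, not Fory's implementation: the class name `ForyHeaderSketch` and its methods are hypothetical, and the bit positions assume the table lists bits from most significant (left) to least significant (right), which puts the null flag at bit 0. The optional 4-byte meta start offset is left out.

```java
final class ForyHeaderSketch {
    // Assumed bit positions, reading the header table from high bits to low bits.
    static final int NULL_FLAG = 1;               // bit 0
    static final int LITTLE_ENDIAN_FLAG = 1 << 1; // bit 1
    static final int XLANG_FLAG = 1 << 2;         // bit 2
    static final int OOB_FLAG = 1 << 3;           // bit 3

    /** Builds the one-byte header; the four reserved high bits stay zero. */
    static byte encode(boolean isNull, boolean littleEndian, boolean xlang, boolean oob) {
        if (isNull) {
            // For a null object no other bits are set.
            return (byte) NULL_FLAG;
        }
        int bits = 0;
        if (littleEndian) bits |= LITTLE_ENDIAN_FLAG;
        if (xlang) bits |= XLANG_FLAG;
        if (oob) bits |= OOB_FLAG;
        return (byte) bits;
    }

    static boolean isNull(byte header) { return (header & NULL_FLAG) != 0; }
    static boolean isLittleEndian(byte header) { return (header & LITTLE_ENDIAN_FLAG) != 0; }
    static boolean isXlang(byte header) { return (header & XLANG_FLAG) != 0; }
    static boolean isOob(byte header) { return (header & OOB_FLAG) != 0; }

    public static void main(String[] args) {
        // Little-endian, Fory Java format, no BufferCallback.
        byte header = encode(false, true, false, false);
        System.out.println(Integer.toBinaryString(header & 0xFF));
    }
}
```

A reader would check `isNull` first and stop if it is set, since the remaining bits carry no information in that case.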
Reference Meta
Reference tracking records whether an object is null and whether references to it are tracked, by writing the corresponding flags and maintaining internal state.
Reference flags:
| Flag | Byte Value | Description |
|---|---|---|
| NULL FLAG | -3 | This flag indicates the object is a null value. We don't use another byte to indicate REF, so that we can save one byte. |
| REF FLAG | -2 | This flag indicates the object was already serialized previously, so Fory will write a ref id in unsigned varint format instead of serializing it again. |
| NOT_NULL VALUE FLAG | -1 | This flag indicates the object is a non-null value and Fory doesn't track references for this type of object. |
| REF VALUE FLAG | 0 | This flag indicates the object is referenceable and is being serialized for the first time. |
When reference tracking is disabled globally, for specific types, or for certain types within a particular context (e.g., a field of a class), only the `NULL` and `NOT_NULL VALUE` flags will be used for reference meta.
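The flag rules above can be sketched as a small writer. This is an illustration under stated assumptions, not Fory's code: the class `RefFlagSketch` and its methods are hypothetical, ref ids are assumed to be assigned sequentially from 0 in write order, identity (not `equals`) is assumed to decide whether two objects are the same, and the unsigned varint is assumed to be LEB128-style (7 data bits per byte, low group first), which this section does not define.

```java
import java.io.ByteArrayOutputStream;
import java.util.IdentityHashMap;
import java.util.Map;

final class RefFlagSketch {
    static final byte NULL_FLAG = -3;
    static final byte REF_FLAG = -2;
    static final byte NOT_NULL_VALUE_FLAG = -1;
    static final byte REF_VALUE_FLAG = 0;

    // Internal state: objects already written, keyed by identity, mapped to their ref id.
    private final Map<Object, Integer> writtenIds = new IdentityHashMap<>();
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** Writes the reference meta; returns true if the caller must still write the value. */
    boolean writeRefOrNull(Object obj, boolean trackRef) {
        if (obj == null) {
            out.write(NULL_FLAG);
            return false;
        }
        if (!trackRef) {
            // With tracking disabled only NULL and NOT_NULL VALUE are used.
            out.write(NOT_NULL_VALUE_FLAG);
            return true;
        }
        Integer id = writtenIds.get(obj);
        if (id != null) {
            // Seen before: write the flag and the ref id instead of the value.
            out.write(REF_FLAG);
            writeVarUint(id);
            return false;
        }
        writtenIds.put(obj, writtenIds.size()); // assumed: ids grow from 0
        out.write(REF_VALUE_FLAG);
        return true;
    }

    // Assumed LEB128-style unsigned varint.
    private void writeVarUint(int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    byte[] toBytes() {
        return out.toByteArray();
    }
}
```

Writing the same instance twice with tracking enabled produces `REF VALUE FLAG` the first time and `REF FLAG` plus its id the second time, so the value bytes appear only once in the output.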
Class Meta
Fory supports registering a class with an optional id. Registration can be used for security checks and class identification.
If a class is registered, it will have a user-provided or auto-incremented unsigned int, i.e. `class_id`.
Depending on whether meta share mode and registration are enabled for the current class, Fory will write class meta differently.
Schema consistent
If schema consistent mode is enabled globally or enabled for current class, class meta will be written as follows:
- If the class is registered, it will be written as a Fory unsigned varint: `class_id << 1`.
- If the class is not registered:
  - If the class is not an array, Fory will write one byte `0bxxxxxxx1` first, then write the class name.
    - The lowest bit is `1`, which differs from the lowest bit `0` of an encoded class id. Fory uses this bit during deserialization to determine whether to read the class by class id.
  - If the class is an array, Fory will write one byte `dimensions << 1 | 1` first, then write the component class. This can reduce the array class name cost if the component class is or will be serialized.
  - By default, the class will be written as two enumerated Fory unsigned values: `package name` and `class name`. If meta share mode is enabled, the class will be written as an unsigned varint that points to an index in `MetaContext`.
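The low-bit tagging in the list above can be sketched as follows. This is an illustration only: `ClassHeaderSketch` and its methods are hypothetical, the unsigned varint is assumed to be LEB128-style (7 data bits per byte, low group first) so that the low bit of the first byte equals the low bit of the encoded value, and the unspecified `x` bits of the non-array tag byte are simply left at zero. Writing the class name, package name, and component class is omitted.

```java
import java.io.ByteArrayOutputStream;

final class ClassHeaderSketch {
    // Assumed LEB128-style unsigned varint.
    static void writeVarUint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Registered class: varint of class_id << 1, so the lowest bit is 0. */
    static byte[] registered(int classId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarUint(out, classId << 1);
        return out.toByteArray();
    }

    /** Unregistered non-array class: tag byte with lowest bit 1 (other bits zero here). */
    static byte unregisteredNonArray() {
        return 0b0000_0001;
    }

    /** Unregistered array class: dimensions << 1 | 1, so the lowest bit is 1. */
    static byte unregisteredArray(int dimensions) {
        return (byte) (dimensions << 1 | 1);
    }

    /** Reader side: lowest bit 0 means a class id follows in this varint. */
    static boolean isClassId(int firstByte) {
        return (firstByte & 1) == 0;
    }

    /** Reader side: recovers the dimension count from an array tag byte. */
    static int arrayDimensions(byte tag) {
        return (tag & 0xFF) >>> 1;
    }

    public static void main(String[] args) {
        byte[] id = registered(100);
        System.out.println(isClassId(id[0]));                  // class id path
        System.out.println(isClassId(unregisteredArray(2)));   // name path
    }
}
```

Because the shift frees the lowest bit, a reader can branch on the first byte alone before deciding whether to decode a class id or a class name.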
Schema evolution
If schema evolution mode is enabled globally or enabled for current class, class meta will be written as follows:
- If meta share mode is not enabled, class meta will be written as in schema consistent mode. Additionally, field meta such as field type and name will be written with the field value using a key-value-like layout.
- If meta share mode is enabled, class meta will be written as a meta-share encoded binary if the class hasn't been written before; otherwise, an unsigned varint id that references the previously written class meta will be written.