This content originally appeared on DEV Community and was authored by Pigeon Codeur
Serialization is fundamental to engine development and data-driven applications, enabling complex objects to be saved, transferred, and loaded easily. In my custom game engine, I needed a serializer that could manage various types of data structures while ensuring efficient and readable output. This post explores the design and functionality of the C++ serializer I built, capable of handling basic and custom data types with ease.
This guide will help you understand how to create a flexible serializer, including practical examples and test cases to ensure reliability.
Converting Complex Data Structures to a Readable Format
Serialization allows developers to convert complex C++ structs into a format that’s both human-readable and easy to restore. This is particularly useful in game engines where game objects, configurations, and states need to be saved and reloaded frequently.
For example, consider the following Texture2DComponent struct in a game engine:
struct Texture2DComponent : public Ctor
{
    std::string textureName;
    float opacity = 1.0f;
    constant::Vector3D overlappingColor = {0.0f, 0.0f, 0.0f};
    float overlappingColorRatio = 0.0f;
};
When serialized, this struct might look like this in YAML:
Texture2DComponent:
  textureName: "exampleTexture"
  opacity: 0.8
  overlappingColor: [1.0, 0.0, 0.5]
  overlappingRatio: 0.75
This transformation makes the data easy to read and edit manually, which can be a huge advantage during development. It also helps when transferring data across the internet. In this post, we’ll walk through how to build this serializer, exploring its design and extensibility.
Core Design and Structure of the Serializer
In this chapter, I’ll walk through the core design of the serializer, which includes three main classes:
- Archive: The core class responsible for constructing serialized data.
- UnserializedObject: A class for handling deserialized data, allowing easy access to serialized attributes and nested structures.
- Serializer: The main manager that coordinates file handling, saving, and loading operations.
Each of these classes plays a critical role in ensuring that data can be serialized and deserialized in a structured, reliable manner. Let’s dive into each component.
The Archive Class: Building the Serialized Data
The Archive class acts as a container for the serialized string. It manages formatting, indentation, and data flow, ensuring that the serialized output is both readable and parsable.
Key Design Features of Archive:
- Indentation Control: To keep the serialized output clean and readable, the Archive class uses an indentation level (indentLevel) that increases or decreases depending on the depth of the data structure. Each new line starts with the appropriate number of tabs based on indentLevel.
- End of Line Struct: The Archive class includes an EndOfLine helper struct that handles line breaks and resets formatting flags (requestNewline and requestComma). This struct ensures that each data entry is properly formatted, with commas and line breaks applied where necessary.
- Operator Overloading: Overloading the << operator allows Archive to handle multiple data types seamlessly. The template-based operator<< ensures that any type of data can be serialized, provided it has a matching serialize function or template specialization. This feature gives the serializer the flexibility to handle simple data types, custom components, and complex structures alike.
Here’s a snippet of the Archive class in action:
Archive& operator<<(const EndOfLine&)
{
    requestComma = true;
    requestNewline = true;
    return *this;
}

template <typename Type>
Archive& operator<<(const Type& rhs)
{
    if (requestNewline)
    {
        if (requestComma)
            container << ",";
        container << std::endl;
        container << std::string(*endOfLine.indentLevel, '\t');
        requestNewline = false;
    }
    container << rhs;
    return *this;
}
With these design choices, Archive keeps serialized data clean, ensuring that even complex structures are readable and formatted correctly.
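To make the deferred newline-and-comma idea concrete, here is a minimal, self-contained sketch. It is not the engine's Archive: the names MiniArchive, beginBlock, endBlock, and str are hypothetical and chosen only for illustration, and the flag handling is simplified.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for the article's Archive class.
class MiniArchive {
public:
    struct EndOfLine {};  // tag type meaning "the current entry is finished"

    // Streaming EndOfLine writes nothing; it only records that the next
    // value must be preceded by a comma and a line break.
    MiniArchive& operator<<(const EndOfLine&) {
        needNewline = true;
        needComma = true;
        return *this;
    }

    // Any streamable value: emit the pending separator first, then the value.
    template <typename T>
    MiniArchive& operator<<(const T& value) {
        flushPending();
        out << value;
        return *this;
    }

    void beginBlock(const std::string& name) {
        *this << name << " {";
        ++indentLevel;
        needNewline = true;   // the first entry starts on a new line...
        needComma = false;    // ...but without a comma after '{'
    }

    void endBlock() {
        if (indentLevel > 0)
            --indentLevel;
        needNewline = true;
        needComma = false;    // no comma before the closing brace
        *this << "}";
    }

    std::string str() const { return out.str(); }

private:
    void flushPending() {
        if (!needNewline)
            return;
        if (needComma)
            out << ",";
        out << "\n" << std::string(indentLevel, '\t');
        needNewline = false;
        needComma = false;
    }

    std::ostringstream out;
    std::size_t indentLevel = 0;
    bool needNewline = false;
    bool needComma = false;
};
```

Because the separator is only written when the next value arrives, the last entry of a block never gets a trailing comma, which is the main reason for deferring it.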
The UnserializedObject Class: Parsing Deserialized Data
The UnserializedObject class represents the deserialized data, allowing us to access fields by name and handle nested structures. It holds metadata such as object names and types, making it easier to retrieve individual fields from serialized data.
Key Features of UnserializedObject:
- Attribute Handling: UnserializedObject includes a helper method, getAsAttribute, that retrieves individual attributes within serialized data. This is particularly useful for objects containing a mix of fields, as each attribute is stored with a unique name for easy access.
- Error Checking and Logging: The UnserializedObject class performs checks to validate serialized data during deserialization, including checks for missing or mismatched braces, delimiters, and attribute names. Errors are logged to simplify debugging.
- Overloaded Operators for Field Access: Operator overloads, such as operator[], make it easy to access attributes by name or index. This approach simplifies the code for handling nested data, allowing for intuitive retrieval of deserialized objects.
Here’s an example of how UnserializedObject handles attribute retrieval:
const UnserializedObject& UnserializedObject::operator[](const std::string& key)
{
    // Take each child by const reference so the search does not copy whole subtrees
    auto isObjectName = [&key](const UnserializedObject& obj) { return obj.objectName == key; };
    auto it = std::find_if(children.begin(), children.end(), isObjectName);
    if (it != children.end())
        return *it;
    else
    {
        LOG_ERROR(DOM, "Requested child '" + key + "' not present in the object");
        // Fallback: this assumes children is not empty; indexing an empty vector is undefined behavior
        return children[0];
    }
}
With UnserializedObject, parsing serialized data becomes straightforward, supporting intuitive access to data fields while maintaining error handling and flexibility.
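One alternative to falling back on the first child when a key is missing is to let the caller see the absence directly. The sketch below is only an illustration of that idea on a simplified tree: Node and find are hypothetical names, not part of the engine's UnserializedObject.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified node type (not the article's UnserializedObject).
struct Node {
    std::string name;
    std::string value;
    std::vector<Node> children;

    // Returns a pointer to the first child named `key`, or nullptr if there is none.
    const Node* find(const std::string& key) const {
        auto it = std::find_if(children.begin(), children.end(),
                               [&key](const Node& child) { return child.name == key; });
        return it != children.end() ? &*it : nullptr;
    }
};
```

A caller can then write `if (const Node* x = root.find("x")) { ... }` and decide for itself whether a missing field is an error or simply means "use the default".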
The Serializer Class: Managing Files and Serialization Flow
The Serializer class orchestrates the entire serialization and deserialization process. It handles file input and output, reading and writing serialized data, and storing serialized objects in a serializedMap for easy retrieval.
Key Functions in Serializer:
- File Management: The Serializer can read from and write to files, supporting both direct paths and file objects. It also has an optional auto-save feature (autoSave) that automatically writes serialized data to file when the program exits.
- Version Control: Each serialized file includes a version header. This allows the Serializer to parse files according to different serialization formats if needed, supporting backward compatibility.
- Serializing and Deserializing Objects: The Serializer class includes methods for serializing (serializeObject) and deserializing (deserializeObject) various data types. These methods use Archive and UnserializedObject instances to manage the data flow, ensuring that objects are serialized and deserialized consistently.
Here’s how Serializer initializes a file read:
void Serializer::readFile(const std::string& data)
{
    LOG_THIS_MEMBER(DOM);
    if (data.empty())
    {
        LOG_MILE(DOM, "Reading an empty file");
        return;
    }
    std::string line;
    std::istringstream stream(data);
    // First line of the file should always be the version number
    std::getline(stream, line);
    version = line;
    auto stringData = gulp(stream);
    LOG_INFO(DOM, stringData);
    serializedMap = readData(version, stringData);
}
The Serializer’s ability to handle file-based storage and version control makes it ideal for game development, where serialized data needs to be saved, loaded, and versioned consistently.
The combination of Archive, UnserializedObject, and Serializer provides a powerful and flexible system for managing data in a structured, human-readable format. By controlling indentation, handling errors, and managing complex structures with ease, this serializer is a valuable tool for game development. It enables the efficient saving and loading of game state, assets, and configurations, making the development process smoother and more efficient.
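The version-header step in readFile can be tried in isolation. The following is a minimal sketch using only the standard library; VersionedData and splitVersion are hypothetical names, and reading the remainder with stream-buffer iterators is just one plausible way to do what the gulp helper appears to do.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical result type, for illustration only.
struct VersionedData {
    std::string version;  // first line of the input
    std::string body;     // everything after the first line
};

// Splits "version\n<serialized data>" into its header and body.
VersionedData splitVersion(const std::string& data) {
    VersionedData result;
    std::istringstream stream(data);

    // The first line is treated as the version string.
    std::getline(stream, result.version);

    // Tolerate Windows line endings in the header.
    if (!result.version.empty() && result.version.back() == '\r')
        result.version.pop_back();

    // Read whatever is left in the stream in one go.
    result.body.assign(std::istreambuf_iterator<char>(stream),
                       std::istreambuf_iterator<char>());
    return result;
}
```

Keeping the header split separate from parsing makes it straightforward to choose a parser based on the version before any of the body is interpreted.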
In the next chapters, we’ll dive deeper into each component, exploring specific features like attribute handling, error checking, and practical examples for custom components.
Data Types and Extensibility
This custom C++ serializer is built to handle both basic data types and complex structures, a feature that’s particularly useful in game engines where data persistence and readability are essential. In this chapter, we’ll explore how basic types are serialized and deserialized, and how this system can be extended to support custom types.
The serializer achieves flexibility through template specializations for each type, allowing us to control precisely how different types of data are stored and retrieved. Let’s go through the handling of basic types first, and then look at how the system can easily accommodate custom types like vectors and models.
Basic Type Serialization and Deserialization
Each basic type has its own serialize and deserialize template specialization. This allows the serializer to convert each type to a string representation with a label, which is then used to identify the type during deserialization.
Below are some examples:
- Boolean:
template <>
void serialize(Archive& archive, const bool& value) {
    LOG_THIS(DOM);
    std::string res = value ? "true" : "false";
    archive.setAttribute(res, "bool");
}

template <>
bool deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    auto attribute = serializedString.getAsAttribute();
    if (attribute.name != "bool") {
        LOG_ERROR(DOM, "Serialized string is not a bool (" << attribute.name << ")");
        return false;
    }
    return attribute.value == "true";
}
The bool serializer converts the value to "true" or "false", which can be easily checked during deserialization.
- Integer:
template <>
void serialize(Archive& archive, const int& value) {
    LOG_THIS(DOM);
    archive.setAttribute(std::to_string(value), "int");
}

template <>
int deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    int value = 0;
    auto attribute = serializedString.getAsAttribute();
    if (attribute.name != "int") {
        LOG_ERROR(DOM, "Serialized string is not an int (" << attribute.name << ")");
        return value;
    }
    std::stringstream sstream(attribute.value);
    sstream >> value;
    return value;
}
Here, integers are converted to strings, with type validation during deserialization to ensure that data remains consistent.
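The type label is validated above, but the numeric text itself is not: stream extraction accepts "12abc" as 12 and yields 0 for non-numeric input unless the stream state is checked. If stricter parsing is wanted, a sketch along the following lines is one option. It assumes C++17, and parseInt is a hypothetical helper, not part of the engine.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: parses the whole string as a base-10 int.
// Returns std::nullopt on empty input, trailing characters, or overflow.
std::optional<int> parseInt(const std::string& text) {
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();

    auto [ptr, ec] = std::from_chars(first, last, value);

    // ec reports invalid or out-of-range input; ptr != last means
    // that some characters after the number were left unparsed.
    if (ec != std::errc() || ptr != last)
        return std::nullopt;
    return value;
}
```

Returning an optional lets the deserializer log the malformed value and pick a default explicitly, instead of silently producing 0.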
- Floating Point Types:
template <>
void serialize(Archive& archive, const float& value) {
    LOG_THIS(DOM);
    archive.setAttribute(std::to_string(value), "float");
}

template <>
float deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    float value = 0;
    auto attribute = serializedString.getAsAttribute();
    if (attribute.name != "float") {
        LOG_ERROR(DOM, "Serialized string is not a float (" << attribute.name << ")");
        return value;
    }
    std::stringstream sstream(attribute.value);
    sstream >> value;
    return value;
}
The floating-point serializers handle both float and double (only the float version is shown here), converting these to strings for easy readability and conversion.
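One caveat with this approach: in the standard library implementations I am aware of, std::to_string formats floating-point values like printf's %f, with six digits after the decimal point, so very small values or values that need more digits may not survive a save and load cycle exactly. If exact round trips matter, a sketch like the one below is one way to get them. The helper names floatToText and textToFloat are hypothetical.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: writes a float with enough significant digits
// (max_digits10, which is 9 for IEEE 754 floats) to read back the same value.
std::string floatToText(float value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<float>::max_digits10) << value;
    return out.str();
}

// Hypothetical helper: reads a float back from its text form.
// Returns 0 if the text does not start with a number.
float textToFloat(const std::string& text) {
    float value = 0.0f;
    std::istringstream in(text);
    in >> value;
    return value;
}
```

The default stream format drops trailing zeros, so common values such as 0.5 stay short and readable while awkward values keep all the digits they need.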
- String:
template <>
void serialize(Archive& archive, const std::string& value) {
    LOG_THIS(DOM);
    archive.setAttribute(value, "string");
}

template <>
std::string deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    auto stringAttribute = serializedString.getAsAttribute();
    if (stringAttribute.name != "string") {
        LOG_ERROR(DOM, "String attribute name is not 'string' (" << stringAttribute.name << ")");
        return "";
    }
    return stringAttribute.value;
}
Strings are handled directly, stored without conversion, and labeled as "string" for consistency during deserialization.
Custom Type Support with Template Specializations
For game engines, custom data types like vectors and models are essential. Using template specializations, we can handle these types in the same way as basic types by creating specific serialize and deserialize functions for each custom type.
- Vector2D:
template <>
void serialize(Archive& archive, const constant::Vector2D& vec2D) {
    LOG_THIS(DOM);
    archive.startSerialization("Vector 2D");
    serialize(archive, "x", vec2D.x);
    serialize(archive, "y", vec2D.y);
    archive.endSerialization();
}

template <>
constant::Vector2D deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    auto x = deserialize<float>(serializedString["x"]);
    auto y = deserialize<float>(serializedString["y"]);
    return constant::Vector2D{x, y};
}
For Vector2D, each component (x and y) is serialized individually. During deserialization, each component is retrieved and used to reconstruct the vector.
- Vector3D and Vector4D: Similar functions are defined for Vector3D and Vector4D, with each component (x, y, z, and w) serialized as a separate attribute. Here’s an example for Vector3D:
template <>
void serialize(Archive& archive, const constant::Vector3D& vec3D) {
    LOG_THIS(DOM);
    archive.startSerialization("Vector 3D");
    serialize(archive, "x", vec3D.x);
    serialize(archive, "y", vec3D.y);
    serialize(archive, "z", vec3D.z);
    archive.endSerialization();
}

template <>
constant::Vector3D deserialize(const UnserializedObject& serializedString) {
    LOG_THIS(DOM);
    auto x = deserialize<float>(serializedString["x"]);
    auto y = deserialize<float>(serializedString["y"]);
    auto z = deserialize<float>(serializedString["z"]);
    return constant::Vector3D{x, y, z};
}
- ModelInfo: The ModelInfo struct represents a more complex custom type. Each attribute, such as vertices and indices, is serialized as a list. Here’s an example:
template <>
void serialize(Archive& archive, const constant::ModelInfo& modelInfo) {
    LOG_THIS(DOM);
    archive.startSerialization("Model Info");
    std::string attribute = "[ ";
    for (unsigned int i = 0; i < modelInfo.nbVertices; i++)
        attribute += std::to_string(modelInfo.vertices[i]) + " ";
    attribute += "]";
    archive.setAttribute(attribute, "Vertices");
    attribute = "[ ";
    for (unsigned int i = 0; i < modelInfo.nbIndices; i++)
        attribute += std::to_string(modelInfo.indices[i]) + " ";
    attribute += "]";
    archive.setAttribute(attribute, "Indices");
    archive.endSerialization();
}
This function serializes arrays as lists, which makes it easy to store and retrieve large datasets.
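The matching deserialization for ModelInfo is not shown in this post. As a rough sketch of how a "[ a b c ]" attribute could be read back, the hypothetical helper below (parseList is my name, not the engine's) extracts the text between the brackets and reads whitespace-separated values from it.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses "[ v0 v1 v2 ]" into a vector of T.
// Returns an empty vector when the brackets are missing or misordered;
// parsing stops at the first token that cannot be read as a T.
template <typename T>
std::vector<T> parseList(const std::string& text) {
    std::vector<T> values;

    const auto open = text.find('[');
    const auto close = text.rfind(']');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return values;

    // Only the part strictly between the brackets is parsed.
    std::istringstream in(text.substr(open + 1, close - open - 1));
    T value{};
    while (in >> value)
        values.push_back(value);
    return values;
}
```

With this, something like `parseList<float>(verticesText)` and `parseList<unsigned int>(indicesText)` would rebuild the two arrays, assuming those are the element types used by ModelInfo.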
Extending the Serializer with New Data Types
The template-based design of this serializer makes it highly extensible. To add support for new types:
- Define the serialize Function: Create a template specialization for serialize to store each field of the custom type.
- Define the deserialize Function: Create a corresponding specialization for deserialize, retrieving each field from the serialized data.
- Test the New Type: Once defined, new data types can be serialized and deserialized like any other, ensuring seamless integration.
This design allows new data types to be added with minimal changes to the core serializer, making it ideal for game engines where data structures evolve frequently.
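The steps above can be tried out on a toy scale. The sketch below is a simplified stand-in for the pattern, not the engine's API: toText and fromText play the roles of serialize and deserialize, and Color is a hypothetical type added purely for illustration.

```cpp
#include <sstream>
#include <string>

// Primary templates: declared once, then specialized for each supported type.
template <typename T>
std::string toText(const T& value);

template <typename T>
T fromText(const std::string& text);

// Hypothetical new type we want to support.
struct Color {
    int r = 0;
    int g = 0;
    int b = 0;
};

// Step 1: specialization that writes each field.
template <>
std::string toText<Color>(const Color& c) {
    std::ostringstream out;
    out << c.r << ' ' << c.g << ' ' << c.b;
    return out.str();
}

// Step 2: specialization that reads the fields back in the same order.
template <>
Color fromText<Color>(const std::string& text) {
    Color c;
    std::istringstream in(text);
    in >> c.r >> c.g >> c.b;
    return c;
}
```

Step 3 is then a round-trip check: convert a value to text, convert it back, and compare the fields. Because the primary templates are only declared, using an unsupported type fails at link time rather than silently doing the wrong thing.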
With specialized functions for both basic and custom types, this serializer can handle diverse data structures in a game engine. Its template-based extensibility allows new types to be added easily, and each serialized field remains labeled and readable. By supporting types from simple integers to complex vectors, the serializer is robust, flexible, and ready for any type of data management needs in game development.
Practical Example – Serializing a Texture2DComponent
Serialization becomes particularly valuable in scenarios where complex game components need to be stored, loaded, and modified easily. In this chapter, we’ll go through a detailed example of serializing and deserializing a Texture2DComponent, a struct in our game engine that represents a 2D texture with properties like opacity, color, and a texture name.
By using template specializations, we can define custom serialization and deserialization functions for Texture2DComponent, making it possible to save this component’s data in a readable format, such as YAML, and retrieve it as needed.
The Texture2DComponent Struct
Here’s the structure of Texture2DComponent that we want to serialize:
struct Texture2DComponent
{
    std::string textureName;                                  // Name of the texture
    float opacity = 1.0f;                                     // Opacity level
    constant::Vector3D overlappingColor = {0.0f, 0.0f, 0.0f}; // Overlapping color
    float overlappingColorRatio = 0.0f;                       // Intensity of the overlapping color
};
This struct includes various fields:
- textureName: The name of the texture as a string.
- opacity: A float representing the texture’s opacity level.
- overlappingColor: A custom Vector3D struct, representing RGB color values.
- overlappingColorRatio: A float that indicates the overlapping color's intensity.
To serialize and deserialize Texture2DComponent, we’ll create template specializations for each operation.
Step 1: Serializing Texture2DComponent
The serialize function for Texture2DComponent needs to convert each field into a human-readable format. Here’s the code for serializing Texture2DComponent:
template <>
void serialize(Archive& archive, const Texture2DComponent& value)
{
    LOG_THIS(DOM); // Logging for debugging and tracking

    // Start the serialization process, labeling the component type
    archive.startSerialization("Texture2DComponent");

    // Serialize each field by its name
    serialize(archive, "textureName", value.textureName);
    serialize(archive, "opacity", value.opacity);
    serialize(archive, "overlappingColor", value.overlappingColor);
    serialize(archive, "overlappingRatio", value.overlappingColorRatio);

    // End the serialization
    archive.endSerialization();
}
In this code:
- Start Serialization: archive.startSerialization("Texture2DComponent") begins the serialization process, identifying this block of data as a Texture2DComponent.
- Serialize Fields: Each field in Texture2DComponent is serialized with a label. For instance, textureName is serialized as "textureName" so that it can be easily identified during deserialization.
- End Serialization: We call archive.endSerialization() to mark the end of this component’s data.
The result is a structured, readable output that might look like this:
Texture2DComponent {
    textureName: "exampleTexture",
    opacity: 0.8,
    overlappingColor: {x: 1.0, y: 0.0, z: 0.5},
    overlappingRatio: 0.75
}
Each field is clearly labeled and indented, making it easy to understand and even edit directly if needed.
Step 2: Deserializing Texture2DComponent
Deserialization is the reverse process, where we reconstruct a Texture2DComponent from its serialized representation. Here’s the code for deserializing this component:
template <>
Texture2DComponent deserialize(const UnserializedObject& serializedString)
{
    LOG_THIS(DOM); // Logging for debugging

    // Check if the serialized object is valid
    if (serializedString.isNull()) {
        LOG_ERROR(DOM, "Element is null");
        return Texture2DComponent{""}; // Return an empty Texture2DComponent if null
    }

    LOG_INFO(DOM, "Deserializing a Texture2DComponent");

    // Extract each field from the serialized object by its name
    auto textureName = deserialize<std::string>(serializedString["textureName"]);
    auto opacity = deserialize<float>(serializedString["opacity"]);
    auto overlappingColor = deserialize<constant::Vector3D>(serializedString["overlappingColor"]);
    auto overlappingColorRatio = deserialize<float>(serializedString["overlappingRatio"]);

    // Construct and populate the Texture2DComponent
    Texture2DComponent texture{textureName};
    texture.opacity = opacity;
    texture.overlappingColor = overlappingColor;
    texture.overlappingColorRatio = overlappingColorRatio;
    return texture;
}
Here’s what each part of this function does:
- Null Check: if (serializedString.isNull()) checks if the serialized data is valid. If not, an empty Texture2DComponent is returned, and an error is logged.
- Retrieve Fields: Each field is extracted from serializedString using the field’s label, ensuring it matches the serialized format. This process allows the component’s data to be restored to its original values.
- Reconstruct the Component: After all fields are retrieved, they’re used to populate a new Texture2DComponent object, which is then returned.
This approach allows us to reconstitute the Texture2DComponent from its serialized form with minimal effort.
Sample Usage
Here’s a quick example of how you might use the serializer to handle a Texture2DComponent in code:
// Creating a Texture2DComponent instance
Texture2DComponent texture;
texture.textureName = "grass_texture";
texture.opacity = 0.9f;
texture.overlappingColor = {0.5f, 0.8f, 0.2f};
texture.overlappingColorRatio = 0.4f;
// Serializing the component
Archive archive;
serialize(archive, texture);
// Display the serialized output
std::cout << archive.container.str() << std::endl;
// Deserializing the component from serialized data
UnserializedObject unserializedObj(archive.container.str(), "Texture2DComponent");
Texture2DComponent deserializedTexture = deserialize<Texture2DComponent>(unserializedObj);
// Verifying the deserialized component
std::cout << "Deserialized texture name: " << deserializedTexture.textureName << std::endl;
std::cout << "Opacity: " << deserializedTexture.opacity << std::endl;
In this example:
- We create a Texture2DComponent, fill it with data, and serialize it.
- The serialized output can be printed, edited, or saved to a file.
- We then create an UnserializedObject from the serialized data and use deserialize to reconstruct the Texture2DComponent.
- Finally, we confirm the fields to ensure they match the original values.
Key Benefits of This Approach
Using this serializer for Texture2DComponent provides several advantages:
- Readability: The serialized format is organized and easy to understand.
- Modifiability: Fields are labeled, making it possible to edit values directly in serialized files.
- Consistency: The serializer enforces a structure that can be easily parsed back into C++ objects.
- Scalability: Adding new fields or even new types of components requires minimal code changes.
This example illustrates how custom template specializations allow complex components like Texture2DComponent to be serialized and deserialized easily. The process ensures data consistency and flexibility, making it simple to manage game components, configurations, and state information.
With this serializer, you have a powerful tool to handle everything from game assets to state data in a clear, readable format. The custom serializer is adaptable, handling both primitive and custom data types, and is a robust solution for data persistence in game development.
Conclusion and Final Thoughts
This custom serializer provides flexibility, readability, and efficiency, all of which are crucial for game development. By supporting both basic and custom types, it adapts to evolving game mechanics and data structures. Through rigorous testing, we ensure that the serializer can handle complex scenarios reliably.
If you’re building a game engine or data-driven application, this approach can streamline data management, improve debugging, and make saved data easily accessible.
The complete code is open-source; check it out here. Leave a star or a reaction if you found this post interesting, explore the examples, and consider contributing new features or improvements!
Pigeon Codeur | Sciencx (2024-11-06T01:07:36+00:00) Building a Custom C++ Serializer for Efficient Data Handling. Retrieved from https://www.scien.cx/2024/11/06/building-a-custom-c-serializer-for-efficient-data-handling/