Storage Architecture
Horde's storage platform is designed to support manipulating massive data structures consisting of interlinked blobs. Blobs are immutable, and consist of an arbitrary block of data and zero or more outward references to other blobs.
The entry point to any such data structure is a named reference, which maps a user-defined name to a blob at the root of the data structure. Any nodes that are not directly or indirectly referenced by a named reference are subject to garbage collection.
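The relationship between blobs, named references, and garbage collection can be sketched as a reachability problem. The following is a minimal conceptual illustration in Python, not Horde's implementation; all names are made up.

```python
# Conceptual sketch (not the Horde implementation): blobs are immutable
# nodes with outward references; named refs are GC roots, and any blob
# not reachable from a root is subject to collection.
from dataclasses import dataclass

@dataclass(frozen=True)
class Blob:
    payload: bytes
    references: tuple = ()  # identifiers of other blobs

blobs = {
    "root": Blob(b"manifest", ("a", "b")),
    "a": Blob(b"chunk-a"),
    "b": Blob(b"chunk-b", ("c",)),
    "c": Blob(b"chunk-c"),
    "orphan": Blob(b"unreferenced"),
}
named_refs = {"my-tree": "root"}  # user-defined name -> root blob

def live_set(blobs, named_refs):
    """Blobs reachable from any named ref; everything else is garbage."""
    live, stack = set(), list(named_refs.values())
    while stack:
        blob_id = stack.pop()
        if blob_id not in live:
            live.add(blob_id)
            stack.extend(blobs[blob_id].references)
    return live

garbage = set(blobs) - live_set(blobs, named_refs)
# "orphan" is not reachable from any named ref, so it can be collected.
```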
History
Horde's storage system can be seen as an evolution of the GitSync tool used to download binaries for Epic's GitHub repository; for each change committed to Epic's Perforce server, we upload matching binaries to AWS S3 and commit a manifest for those files to the Git repository. The GitSync tool uses the manifest to retrieve and unpack those files whenever a new commit is checked out, via a Git hook installed by running the Setup.bat file in the root of the repository.
One of the main design goals for GitSync was to offload the hosting of binary data to a proven, scalable third-party storage service (AWS S3) without having to maintain an active server capable of supporting many Unreal Engine developers. As such, we kept the content-addressing ideas used by Git but packed the content-addressed payloads into non-deterministic packages for more efficient downloads. At upload time, a heuristic decides whether to reference data in existing packages or re-pack it into new packages, trading download efficiency against expensive gather operations.
While clients can still model the data as a Git-like Merkle tree - reusing any locally cached data identified by a unique SHA-1 hash - we reduce chattiness and server-side compute load when negotiating which data to transfer by serving pre-made, static download packages, arranged to maximize coherency between blobs we anticipate will be requested together.
This model optimizes for streaming reads and writes while still accommodating point reads when necessary.
Blobs
A blob in Horde has the following attributes (see the `BlobData` class):
- Type: Represented by a GUID and integer version number, and used to distinguish between payloads that may contain the same data but use different serialization formats.
- Payload: A byte array. Blobs are meant to be fully read into memory, so payloads are typically limited to a few hundred kilobytes. Larger payloads can be split into smaller blobs using static or content-defined chunking via utility libraries.
- References: A set of references to other blobs.
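The static chunking mentioned above can be illustrated with a short sketch; `chunk_fixed` is a hypothetical helper for illustration, not part of Horde's utility libraries.

```python
# Conceptual sketch: splitting a large payload into smaller blobs along
# fixed boundaries ("static" chunking). Horde's utility libraries also
# support content-defined chunking, which is not shown here.
def chunk_fixed(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

payload = bytes(range(10)) * 100          # a 1000-byte payload
chunks = chunk_fixed(payload, 256)        # 256 + 256 + 256 + 232 bytes
```

Each chunk would then be stored as its own blob, with a parent blob referencing the sequence so the original payload can be reassembled.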
References to blobs are typically manipulated in memory using `IBlobHandle` instances. After flushing to storage, blob handles can be converted to and from a `BlobLocator`, an opaque, implementation-defined string identifier assigned by the storage system.
Horde abstracts the way that blobs are serialized to the underlying storage backend through the `IBlobWriter` interface; the programmer requests a buffer to serialize a blob into, writes the data and its references, and gets an `IBlobHandle` back that allows retrieving it at any point in the future. The implementation is left to decide how to store the data, handling compression, packing, buffering, and uploading as necessary. Multiple `IBlobWriter` instances can be created to write separate streams of related blobs.
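The write-and-get-a-handle pattern can be sketched with an in-memory stand-in. This is illustrative Python, not the actual `IBlobWriter` C# interface; the class and method names are assumptions.

```python
# Conceptual sketch of the writer pattern: the caller serializes a blob,
# records its references, and receives a handle; the writer decides how
# to buffer and store the data. Here the "handle" is simply an in-memory
# locator string.
class MemoryBlobWriter:
    def __init__(self):
        self._store = {}   # locator -> (payload, references)
        self._next_id = 0

    def write_blob(self, payload: bytes, references=()) -> str:
        """Store a blob and return an opaque handle for later retrieval."""
        locator = f"blob-{self._next_id}"
        self._next_id += 1
        self._store[locator] = (bytes(payload), tuple(references))
        return locator

    def read_blob(self, locator: str):
        """Return the (payload, references) pair for a handle."""
        return self._store[locator]

writer = MemoryBlobWriter()
leaf = writer.write_blob(b"leaf data")
root = writer.write_blob(b"root data", references=(leaf,))
payload, refs = writer.read_blob(root)
```

Because blobs are written leaves-first, every reference recorded by the writer already has a handle by the time its parent is serialized.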
Refs and Aliases
Refs and aliases provide entry points into the storage system. With either, you can assign a user-defined name to a particular blob and retrieve it later.
- Refs are strong references into the blob store and act as roots for the garbage collector. Refs can be set to expire at fixed times or after not being retrieved for a specific period of time - which can be useful for implementing caches.
- Aliases are weak references to blobs. Multiple aliases with the same name may exist, and users can query for one or more blobs with a particular alias. Aliases have an associated user-specified rank, and consumers can query aliases ordered by rank.
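The difference between the two can be sketched as follows; the data shapes and the `find_aliases` helper are illustrative assumptions, not Horde's API.

```python
# Conceptual sketch: refs map a unique name to a single blob and act as
# GC roots; aliases are weak, many-to-one names carrying a rank that is
# used to order query results.
refs = {"build/12345": "blob-root"}        # strong: one name, one blob

aliases = [                                 # weak: (name, blob, rank)
    ("shader-cache", "blob-a", 10),
    ("shader-cache", "blob-b", 5),
]

def find_aliases(aliases, name):
    """Return all blobs carrying an alias, highest rank first."""
    matches = [(rank, blob) for (n, blob, rank) in aliases if n == name]
    return [blob for rank, blob in sorted(matches, reverse=True)]

best_match = find_aliases(aliases, "shader-cache")[0]
```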
Content Addressing
Horde models blobs in a way that supports content addressing. While hashes are not exposed directly through `BlobLocator` strings, hashes can be encoded into a blob's payload, with a matching entry in the references array. Since references are stored separately from the blob payload, the unique identifier stored through a `BlobLocator` does not affect the payload's hash.
The implementation primarily uses `IoHash` for hashing blob data (a truncated 20-byte BLAKE3 hash), but decoupling the encoding of hashes in the payload from references in the storage system allows any other hashing algorithm to be used instead. The underlying storage system can reason about the topology of blob trees while still supporting a variety of hashing algorithms.
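The decoupling of hashes from locators can be made concrete with a small sketch. The standard library's SHA-1 (also 20 bytes) stands in for `IoHash` here; the payload layout is an assumption for illustration only.

```python
# Conceptual sketch: a parent blob encodes the hash of each child in its
# payload, with a matching entry in its references array. Locators are
# assigned by the storage system and kept outside the payload, so moving
# or repacking a blob never changes any hash.
import hashlib

def blob_hash(payload: bytes) -> bytes:
    """Content hash of a blob's payload (stand-in for IoHash)."""
    return hashlib.sha1(payload).digest()

child_payload = b"child data"
child_hash = blob_hash(child_payload)

# The parent's payload embeds the child's hash, so the parent's own hash
# pins down the whole tree (a Merkle tree), independent of locators.
parent_payload = b"parent header" + child_hash
parent_references = [child_hash]   # mirrors the hash encoded in the payload

hash_before = blob_hash(parent_payload)
locator = "repacked-bundle#pkt=0,4096&exp=0"  # storage reassigns the locator...
hash_after = blob_hash(parent_payload)        # ...the content hash is stable
```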
The `IBlobRef` interface extends the basic `IBlobHandle` interface with an `IoHash` of the target blob.
Implementation
Note: This section describes the storage system's current implementation details, which may change in future releases.
Layers
The storage system is implemented through several layers:
- The C# serialization library (`BlobSerializer`, `BlobConverter`, and so on).
- The logical storage model is declared through the `IStorageClient` interface, which is the primary method for interacting with the storage system. At this layer, blobs are manipulated via `IBlobHandle` objects.
  - `BundleStorageClient` is the standard implementation of `IStorageClient` and packs blobs into bundles.
  - `KeyValueStorageClient` implements a client that passes individual blobs to the underlying `IStorageBackend`.
- The physical storage model is declared through the `IStorageBackend` interface, which deals with sending data over the wire to storage service implementations.
  - `HttpStorageBackend` uploads data to the Horde Server over HTTP.
  - `FileStorageBackend` writes data directly to files on disk.
  - `MemoryStorageBackend` stores data in memory.
- The bulk data store is declared through the `IObjectStore` interface, which interacts with low-level storage services.
  - `FileObjectStore` writes data to files on disk.
  - `AwsObjectStore` reads and writes data from AWS S3.
  - `AzureObjectStore` reads and writes data from the Azure blob store.
Bundles
Blobs are designed to be used as a general-purpose storage primitive, so we try to efficiently accommodate blob types ranging from a handful of bytes to several hundred kilobytes (larger streams of data can be split up into smaller chunks along fixed boundaries or using content-defined slicing).
Blobs are packed together into bundles for storage in the underlying object store.
The implementation of bundles and their use within the storage system is mostly hidden from user code, though understanding how a stream of blobs is written to storage can help when reasoning about access patterns.
Each bundle consists of a sequence of compressed packets, each of which may contain several blobs. Each packet is self-contained, so it may be decoded from a single contiguous ranged read of the bundle data.
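The packet layout can be illustrated with a toy encoding. The format below (a count followed by length-prefixed blobs, compressed with zlib) is an assumption for illustration; Horde's actual bundle format differs.

```python
# Conceptual sketch: several blobs are packed into one compressed packet,
# and packets are concatenated into a bundle. Each packet is
# self-contained, so a single contiguous ranged read of
# (offset, length) within the bundle is enough to decode it.
import struct
import zlib

def build_packet(blobs: list[bytes]) -> bytes:
    """Pack blobs into one compressed, self-describing packet."""
    raw = struct.pack("<I", len(blobs))
    for blob in blobs:
        raw += struct.pack("<I", len(blob)) + blob
    return zlib.compress(raw)

def decode_packet(packet: bytes) -> list[bytes]:
    """Decode a packet back into its list of blobs."""
    raw = zlib.decompress(packet)
    (count,) = struct.unpack_from("<I", raw)
    pos, blobs = 4, []
    for _ in range(count):
        (size,) = struct.unpack_from("<I", raw, pos)
        blobs.append(raw[pos + 4 : pos + 4 + size])
        pos += 4 + size
    return blobs

# A bundle is just a sequence of packets; a locator records each
# packet's byte range so it can be fetched with one ranged read.
p0, p1 = build_packet([b"a", b"bb"]), build_packet([b"ccc"])
bundle = p0 + p1
offset, length = len(p0), len(p1)   # byte range of the second packet
second = decode_packet(bundle[offset : offset + length])
```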
Locators
Locators typically have the following form:
[path]#pkt=[offset],[length]&exp=[index]
- `[path]`: Path to an object within the underlying object store.
- `[offset]` and `[length]`: Byte range of the compressed packet data within a bundle.
- `[index]`: The index of an exported blob within the packet.
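A locator of this form could be decomposed as sketched below. Since the format is implementation-defined and opaque, client code should not normally parse locators itself; this is purely illustrative.

```python
# Conceptual sketch: decomposing a locator of the form
#   [path]#pkt=[offset],[length]&exp=[index]
# into its parts. Real locators are opaque to callers.
from urllib.parse import parse_qs

def parse_locator(locator: str):
    """Split a locator into (path, offset, length, index)."""
    path, _, fragment = locator.partition("#")
    params = parse_qs(fragment)
    offset, length = map(int, params["pkt"][0].split(","))
    index = int(params["exp"][0])
    return path, offset, length, index

parts = parse_locator("bundles/abc123#pkt=4096,16384&exp=2")
```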