AccelByte Blog: Insights on Game Development & Backend

New and Improved: ADT Smart Builds

Written by Phil Tossell | Nov 4, 2024 7:25:38 PM

Smart Builds is a revolutionary feature of the build distribution functionality in AccelByte Development Toolkit (ADT). It lets developers quickly and efficiently distribute only the incremental changes between builds instead of full builds.

Unlike most incremental build distribution methods, Smart Builds does not use a binary diffing approach but instead performs what we call ‘content aware’ diffing. What this means is that you get much smaller, much faster differentials between any two builds than is possible with binary diffing or other common methods. This not only saves developer time but also significantly cuts costs. Our ultimate goal with Smart Builds is to make it an industry standard for build distribution and an indispensable part of every developer's toolbox.

We launched the beta version of Smart Builds in Q2 2023 and have been steadily improving it since. Throughout the beta we listened to customers and gathered a lot of great feedback on usability, performance and reliability. We realized that some of this feedback could not be addressed with small incremental improvements and instead needed a more comprehensive rework of parts of the system. Driven by this feedback, we decided to invest the time to make these changes and take Smart Builds to the next level in performance and reliability. We are excited to finally share these improvements.

The latest version of Smart Builds, containing all these improvements, is now available in the newest version of ADT (2024.4). Existing customers will get this as an automatic update on October 24, 2024. If you’ve never tried ADT, you can sign up for a 30-day trial.

Reliability

When distributing builds, nothing is more important than reliability. While Smart Builds was over 99.9% reliable, we were unhappy that some customers observed infrequent and inconsistent build integrity failures.

Investigating these issues was challenging due to the variability of network conditions, build composition and local machine configurations. We collected a lot of data points and eventually we identified the following areas that needed to be improved:

  • Intermittent AWS internal server errors from S3 were causing failed integrity checks.
  • Smart Builds pushes resource limits to the maximum, and we discovered a little-known prefix limit in S3. This was causing contention in the S3 buckets.
  • There was variability depending on geographical and hardware configurations as well as build composition.

To build further robustness into the system we made the following changes:

  • Added exponential back-off retry when we receive the intermittent AWS internal server errors. 
  • Introduced randomization to the prefix request order to spread the load over time. 
  • Reduced the overall load on S3 by caching more in CloudFront, further reducing the load to the S3 origin.
  • Introduced internal company-wide testing to ensure we have tested in many more geographical locations.
  • Partnered up with select customers so we can test with real builds from real customers.
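
The first two changes above can be sketched in a few lines. This is only an illustration of the general techniques (exponential back-off with jitter, and shuffling the request order), not the Smart Builds implementation: `TransientServerError`, `retry_with_backoff`, `shuffled_request_order` and all the default values are hypothetical names and numbers chosen for the example.

```python
import random
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for an intermittent internal server error from storage."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, rng=random.random):
    """Call operation(); on a transient error, wait and retry.

    The wait cap doubles after every failed attempt (0.5s, 1s, 2s, ...) up to
    max_delay, and the actual wait is a random fraction of that cap so that
    many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientServerError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(cap * rng())


def shuffled_request_order(keys, seed=None):
    """Return the keys in random order so consecutive requests are not
    grouped under the same key prefix."""
    order = list(keys)
    random.Random(seed).shuffle(order)
    return order
```

A caller would wrap each storage request, for example `retry_with_backoff(lambda: fetch(key))` for every `key` in `shuffled_request_order(keys)`, where `fetch` is whatever function performs the request.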

With these changes in place, we have not seen a single integrity failure so far in any of our testing on the newest version of Smart Builds.

Performance 

Smart Builds pushes memory, disk and bandwidth to their limits at different parts of the process. This means that, depending on build composition and local configuration, bottlenecks can occur at different stages. To better understand the improvements we made, it’s useful to know that downloading a Smart Build is composed of three main steps:

  1. Preparing 
    • Downloading the build manifest from the server, which details exactly which files make up the build
    • Comparing the build manifest against the local cache to determine which files we do not have locally and therefore need to download from the server
  2. Downloading
    • Downloading the build files that are missing from the local cache and storing them in the local cache
  3. Staging
    • Reconstructing the full build from the local cache
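
To make the three steps concrete, here is a minimal sketch. It assumes a manifest that maps each file path to a SHA-256 hash of the file's contents and a local cache keyed by that hash. These are simplifications for illustration only: the real manifest format and the ‘content aware’ diffing are not described in this post, and `plan_download`, `download_missing`, `stage_build` and `fetch` are hypothetical names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Identify a file by the SHA-256 of its contents (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


def plan_download(manifest: dict[str, str], cached_hashes: set[str]) -> list[str]:
    """Preparing: list each hash the build needs that the cache lacks, once."""
    missing = []
    seen = set(cached_hashes)
    for path in sorted(manifest):
        file_hash = manifest[path]
        if file_hash not in seen:
            seen.add(file_hash)  # identical files are only downloaded once
            missing.append(file_hash)
    return missing


def download_missing(missing: list[str], fetch, cache: dict[str, bytes]) -> None:
    """Downloading: fetch each missing blob and store it in the local cache."""
    for file_hash in missing:
        cache[file_hash] = fetch(file_hash)


def stage_build(manifest: dict[str, str], cache: dict[str, bytes]) -> dict[str, bytes]:
    """Staging: reconstruct the full build (path -> contents) from the cache."""
    return {path: cache[file_hash] for path, file_hash in manifest.items()}
```

With this layout, a file that is unchanged between two builds is already in the cache and is never downloaded again, which is why only the changed content travels over the network.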

We systematically focused on each step using different setups to identify these bottlenecks and reduce them as much as possible.

Improvements to preparing

Preparing took an excessive amount of time, ranging from 1 to 3 minutes. This was caused by two factors:

  • The build manifest could get very large depending on the composition of the build. Builds with many small files would be most affected.
  • Comparing the hashes for determining which actual files had changed was slow due to the way we were storing the local cache.

We made two key changes:

  • Cached and served the build manifest from CloudFront instead of hitting our server and database with a large query. This speeds up the manifest download and reduces contention on the database.
  • Changed the local cache database format to greatly optimize the hash checks

Together, these changes made the preparing step 10-20x faster, to the point that the time it takes is now negligible (see images below).

Result from Unreal Engine 5 build with build size: 5.39 GB

Improvements to downloading

Downloading consists of downloading the actual changed files as well as introducing them into the local cache. We identified a number of areas for improvement:

  • Writing to the local cache was actually slower than downloading the files, especially when there were many small files in the build.
  • A lot of cache misses on CloudFront were causing the overall file download speeds to be much slower than expected.

To improve these, we:

  • Refactored the local cache database format to be much more efficient for the most frequent operations.
  • Introduced more effective multi-threading.
  • Worked together with AWS to understand the way we use CloudFront and achieve far fewer cache misses.
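
As a rough illustration of why the cache format and threading matter for builds with many small files, the sketch below fetches files on worker threads and writes them to the cache in batched transactions rather than one write per file. SQLite, the `blobs` table, `download_into_cache` and `fetch` are all assumptions made for this example; the actual Smart Builds cache format is not described here.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def open_cache(path=":memory:"):
    """Open a cache keyed by content hash; the primary key gives indexed lookups."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blobs (hash TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return db


def _flush(db, batch):
    """Write one batch of (hash, data) rows inside a single transaction."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR REPLACE INTO blobs (hash, data) VALUES (?, ?)", batch
        )
    return len(batch)


def download_into_cache(db, hashes, fetch, workers=8, batch_size=256):
    """Fetch blobs concurrently; write them from the calling thread in batches."""
    hashes = list(hashes)
    written = 0
    batch = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map runs fetch on worker threads and yields results in input order.
        for file_hash, data in zip(hashes, pool.map(fetch, hashes)):
            batch.append((file_hash, data))
            if len(batch) >= batch_size:
                written += _flush(db, batch)
                batch = []
    if batch:
        written += _flush(db, batch)
    return written
```

Keeping all database writes on one thread avoids sharing the SQLite connection across threads, and grouping many small files into one transaction avoids paying a commit per file.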

Overall this led to much more efficient use of the available bandwidth as well as fewer cache misses.

End-to-end Download to Play time comparison of a 5 GB build on a 45 Mbps connection.

Existing customers will get the improved Smart Builds as an automatic update on October 24. If you’ve never tried ADT, you can sign up for a 30-day trial.

The end result of these extensive changes is almost a Smart Builds v2, one that delivers a much faster and more reliable experience. We look forward to seeing developers use it and to continuing to improve Smart Builds in the future.