How We Reduced Latency in Our Platform by 50%

We have all experienced latency, whether waiting seconds for a web page to load or suffering through the sudden buffering of a movie we're streaming. Latency refers to the time it takes for a system to respond to instructions or transfer data.

At Unwrap.ai, we recognize the importance of delivering quality user experiences quickly. That's why we recently made a suite of improvements to minimize latency on our platform and deliver a smoother, more responsive experience. Here are some of the strategies we implemented to significantly decrease the average latency on our platform.

Manage Third-Party JavaScript

One method of reducing page latency in web applications involves removing any unnecessary or unused script files. Each script requires data to be downloaded and processed. Keeping only the essential scripts limits the data needed and decreases page load times.

Another strategy to minimize latency is loading scripts asynchronously. By default, the browser downloads and runs scripts in the order they appear in the HTML file, pausing parsing while it does so, which can delay your page's rendering when the scripts sit in the <head> tag. To counter this, we used the async or defer attributes on these script tags whenever possible. Both let the browser download the script without blocking parsing. With async, the script runs as soon as it has downloaded, independently of other scripts on the page. With defer, the script runs only after the document has been parsed, in the order the scripts appear.

By analyzing and tweaking how we load third-party JavaScript files on our site, we managed to shave approximately 300 milliseconds off each page load.

Run Functions In Parallel

Another easy way to reduce latency on a site is to run independent asynchronous functions in parallel rather than one after the other. To illustrate this, let's look at an example where we run two asynchronous functions sequentially using the await keyword:

Example of code running two asynchronous functions sequentially using the await keyword
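A minimal sketch of what such code might look like. The body of expensiveFunction is an assumption (a timer standing in for real work), and the ms parameter is a hypothetical addition so the delay can be adjusted:

```javascript
// Hypothetical stand-in for slow async work: resolves with `id` after `ms` milliseconds.
function expensiveFunction(id, ms = 2000) {
  return new Promise((resolve) => setTimeout(() => resolve(id), ms));
}

async function exampleFunction(ms = 2000) {
  // The second call is not even started until the first one has resolved.
  const first = await expensiveFunction(1, ms);
  const second = await expensiveFunction(2, ms);
  return [first, second];
}

// With the default 2000 ms delay, exampleFunction() resolves after roughly 4 seconds.
```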

Here, the asynchronous function expensiveFunction(1) is executed first, blocking the rest of exampleFunction’s execution until it is complete. The time taken for both functions to run is the sum of the time to execute each function.

On the other hand, starting multiple asynchronous functions together and waiting for them as a group, using something like Promise.all(), lets their waiting times overlap and is much faster.

Example of expensiveFunction(1) and expensiveFunction(2) being fired off almost simultaneously
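A minimal sketch of this pattern. The body of expensiveFunction is an assumption (a timer standing in for real work), and parallelExampleFunction and the ms parameter are hypothetical names used only for illustration:

```javascript
// Hypothetical stand-in for slow async work: resolves with `id` after `ms` milliseconds.
function expensiveFunction(id, ms = 2000) {
  return new Promise((resolve) => setTimeout(() => resolve(id), ms));
}

async function parallelExampleFunction(ms = 2000) {
  // Both calls are started before either is awaited, so their waits overlap.
  // Promise.all resolves with the results in the same order as the input array.
  const [first, second] = await Promise.all([
    expensiveFunction(1, ms),
    expensiveFunction(2, ms),
  ]);
  return [first, second];
}

// With the default 2000 ms delay, parallelExampleFunction() resolves after roughly 2 seconds.
```

Note that Promise.all rejects as soon as any one of its promises rejects, so error handling deserves a second look when converting sequential code.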

In this example, expensiveFunction(1) and expensiveFunction(2) are fired off almost simultaneously. The total execution time for this single line is equivalent to the time of the longest-running function.

Thus, if each run of expensiveFunction takes 2 seconds, the first example will take about 4 seconds to finish, whereas the second example will take only about 2 seconds.

While running functions in parallel can significantly decrease page load times, it is important to ensure the functions running in parallel do not depend on each other's results. If such dependencies exist, the dependent functions must be executed sequentially; otherwise, one function may start before the result it needs is available.
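Dependent and independent calls can also be mixed. In the sketch below, every name (fetchUser, fetchOrders, fetchAnnouncements, loadDashboard) is hypothetical and the timers stand in for real requests. The orders lookup needs the user's id, so those two calls stay sequential, while the unrelated announcements call runs alongside them:

```javascript
// Hypothetical helpers; each resolves with a canned value after `ms` milliseconds.
const delay = (value, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));
const fetchUser = (ms) => delay({ id: 42, name: "Ada" }, ms);
const fetchOrders = (userId, ms) => delay([`order-for-${userId}`], ms);
const fetchAnnouncements = (ms) => delay(["Welcome"], ms);

async function loadDashboard(ms = 2000) {
  // Dependent chain: fetchOrders cannot start until fetchUser has produced an id.
  const userAndOrders = (async () => {
    const user = await fetchUser(ms);
    const orders = await fetchOrders(user.id, ms);
    return { user, orders };
  })();

  // Independent call: started immediately, so it overlaps with the chain above.
  const [{ user, orders }, announcements] = await Promise.all([
    userAndOrders,
    fetchAnnouncements(ms),
  ]);
  return { user, orders, announcements };
}
```

Under these assumptions the total time is roughly that of the two-step chain rather than the sum of all three calls.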

Reduce Data Fetched

Another key strategy for improving performance involves reducing the data we fetch from the backend. Thanks to GraphQL, a technology we use to create our APIs, we can let the client (the web application) specify the exact data it needs rather than having the server dictate what gets returned, as with traditional REST APIs.

In practice, it looks like this:

If you have a user object with ten attributes and only need to display a user's name and email when loading a user’s profile page, you can request those fields instead of the entire user object. This cuts down significantly on the size of the data being transferred, which can help reduce latency and speed up your application.
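As a rough illustration, a client-side request for just those two fields might look like the sketch below. The schema names (user, name, email), the /graphql endpoint, and the fetchUserProfile helper are all assumptions made for the example, not details of our actual API:

```javascript
// Hypothetical query: only `name` and `email` are requested, even if the
// user type exposes many more fields.
const USER_PROFILE_QUERY = `
  query UserProfile($id: ID!) {
    user(id: $id) {
      name
      email
    }
  }
`;

// Sends the query as a JSON POST body, the common convention for GraphQL over HTTP.
// `endpoint` is an assumed path; `fetchImpl` defaults to the standard fetch API.
async function fetchUserProfile(id, endpoint = "/graphql", fetchImpl = fetch) {
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: USER_PROFILE_QUERY, variables: { id } }),
  });
  const { data, errors } = await response.json();
  if (errors) throw new Error(errors[0].message);
  return data.user; // contains only the requested fields
}
```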

Therefore, by fetching only the fields you actually need, you are effectively optimizing for performance and reducing unnecessary data transfer.

Latency Tracking

As you work to bring down the overall latency of your site, it's important to keep track of how your efforts affect its speed. To track how we're doing at Unwrap, we use AWS OpenSearch to collect, analyze, and visualize latency data, which we use to identify performance bottlenecks in our application. This data includes how long requests take to move across our platform, from start to finish.
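One way to capture this kind of timing data in application code is to wrap each operation in a small helper. The sketch below is illustrative only: withLatencyTracking and latencyRecords are hypothetical names, and the in-memory array stands in for whatever actually ships records to a store such as OpenSearch.

```javascript
// In-memory buffer standing in for a real log shipper (an assumption for this sketch).
const latencyRecords = [];

// Runs `fn`, measures how long it takes, and records the duration under `operation`.
async function withLatencyTracking(operation, fn) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // Recorded whether `fn` resolved or threw, so failed requests are measured too.
    latencyRecords.push({
      operation,
      durationMs: performance.now() - start,
      timestamp: new Date().toISOString(),
    });
  }
}
```

A call such as withLatencyTracking("loadProfile", () => loadProfile(id)), where loadProfile is whatever async function you want to measure, returns that function's result and appends one record per call.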

Using the strategies described above, we've reduced latency on key operations on our platform by 50%, bringing the average down from 2 seconds to 1 second.

Improving Latency Takes Work