Browser vs. server garbage collection in JavaScript

We've all heard how crucial garbage collection (GC) is in modern application development. In some languages, such as C, you manage memory yourself. In others, the process is so well hidden that many developers have no idea how it is done.

Garbage collection, by definition, is about freeing memory that is no longer in use. The strategies and mechanisms used to do this differ per language. JavaScript, for example, can take several interesting routes depending on whether it runs in a browser or on a Node.js server.

But have you ever thought about how this process works in the background? Let's have a look at how the JavaScript GC works in both the browser and on the server.

The memory life cycle

We need GC because of the many memory allocations a program makes. You create functions, objects, and so on, all of which take up space.

Compared to C, for example, the main benefit of JavaScript is that it handles memory allocation automatically. The life cycle is relatively straightforward and consists of three distinct steps: memory is allocated, the allocated memory is used (read and written), and the memory is released once it is no longer needed.
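As a rough sketch of that life cycle (the variable names are purely illustrative), allocation happens implicitly when a value is created, use is any read or write, and release is left to the engine once nothing references the value any longer:

```javascript
// 1. Allocate: creating the array makes the engine reserve memory for it.
let readings = [21.5, 22.1, 19.8];

// 2. Use: reading from and writing to the allocated memory.
readings.push(20.4);
const total = readings.reduce((sum, value) => sum + value, 0);
const average = total / readings.length;

// 3. Release: dropping the last reference makes the array unreachable.
// The engine may reclaim it later; there is no explicit "free" call.
readings = null;

console.log(average.toFixed(2));
```

Note that setting the variable to null does not free anything by itself. It only removes the last reference, which is what makes the array eligible for collection.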

Right, but where exactly does JavaScript store this data? There are essentially two places where JavaScript puts data: the memory heap and the stack.

The heap is another term most developers have heard. It is in charge of what is known as dynamic memory allocation. In other words, this space is set aside for JavaScript to store resources like objects and functions as needed, without their size having to be known in advance.

This is distinct from the stack, a data structure used to stack items such as primitive values and references to objects. Stack allocation is considered "safer" because the amount of memory each item needs is fixed and known in advance.

It's vital to remember that the limits on both regions differ from vendor to vendor, so keep that in mind when your application is expected to use a lot of memory.

As an example, consider the following code listing:

// heap and stack
const task = {
  name: 'Laundry',
  description: 'Call Mary to go with you...',
};

// stack
let name = 'Walk the dogs'; // 1
name = 'Walk; Feed the dogs'; // 2
const firstTask = name.slice(0, 5); // 3

When you create a new object in JavaScript, heap memory is allocated for it. The task reference, by contrast, is kept on the stack. In this simplified model, primitive values are treated as stack values too, although engines are free to store them differently (an object's own properties, for instance, are stored with the object on the heap).

Because primitives in JavaScript are immutable, the language always allocates a fresh value rather than modifying the previous memory slot.

Here is what happens at comments 1-3 in the code example above:

  • (1) A string value is assigned to a new primitive variable.
  • (2) The variable is overwritten with a new value. When this occurs, JavaScript allocates a new slot rather than changing the existing value.
  • (3) The same applies no matter how many times you do this, whether by direct assignment or as the result of a method such as slice: JavaScript always allocates a new value.
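One way to convince yourself that primitives are never modified in place is to keep a second variable holding the old value. This is only an illustrative check, and the variable names are made up for the example:

```javascript
let title = 'Walk the dogs';
const before = title;                // holds the current primitive value

title = 'Walk; Feed the dogs';       // rebinds `title` to a new string
const firstWord = title.slice(0, 5); // slice returns a new string as well

console.log(before);    // 'Walk the dogs' (the old value is untouched)
console.log(title);     // 'Walk; Feed the dogs'
console.log(firstWord); // 'Walk;'
```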

Garbage collection algorithms in JavaScript

We now understand how JavaScript allocates memory and where values go once they are allocated. But how is that memory released? The garbage collector (GC) handles it, and the idea is as straightforward as it sounds: when an object is no longer used, the GC releases its memory.

What is less straightforward is how JavaScript determines which objects are eligible for collection. This is where the algorithms come into play.

Reference-counting GC

This technique, as the name implies, looks through the resources allocated in memory for those that have no references pointing to them.

To have a better understanding, consider the following code snippet:

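// `let` rather than `const`, because the binding is reassigned below
let task = {
  name: 'Laundry',
  description: 'Call Mary to go with you...',
};

// The object literal above is no longer referenced by anything
task = 'Walk the dogs';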

Initially, task refers to an object with a couple of properties. Assume another developer decided that a task could simply be expressed as a primitive and reassigned the variable. As a result, the original object no longer has any references pointing to it, making it eligible for GC.
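JavaScript does not expose reference counts, so the bookkeeping can only be simulated. The toy model below shows the idea under that assumption. The Tracked class and its retain and release methods are hypothetical names invented for this sketch, not an engine API:

```javascript
// Toy model of reference counting; real engines do this internally, if at all.
class Tracked {
  constructor(label) {
    this.label = label;
    this.refCount = 0;
    this.freed = false;
  }

  retain() {
    this.refCount += 1;
    return this;
  }

  release() {
    this.refCount -= 1;
    if (this.refCount === 0) {
      this.freed = true; // a real collector would reclaim the memory here
    }
  }
}

// `laundry` is only a handle for inspecting the toy object afterwards.
const laundry = new Tracked('Laundry');

let task = laundry.retain(); // `task` references the object: count is 1
task.release();              // `task` is about to be reassigned: count is 0
task = 'Walk the dogs';

console.log(laundry.refCount, laundry.freed); // 0 true
```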

That sounds too simple to be robust, and it is indeed naive.

There is one specific edge case to be wary of: circular references. You may never have considered them, because modern JavaScript engines handle them for you. They typically take the following form:

function task(n, d) {
    // ...

    const reporter = { /* ... */ };
    const assignee = { /* ... */ };

    // Each object holds a reference to the other
    reporter.assignee = assignee;
    assignee.reporter = reporter;
}

const myTask = task('Laundry', 'Call Mary to go with you...');

This is unlikely to be a useful function in a real-world application, but it is enough to picture a situation in which two objects' properties reference each other.

This creates a cycle. Once the function returns, a reference-counting GC cannot determine that these two objects may be collected, since they still hold references to each other.
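To see why the counts never reach zero, the references can be tallied by hand. This is a simulation only: the counts map and the retain and release helpers are hypothetical names for this sketch, since engines do not expose such counters:

```javascript
// Toy reference counts, tracked by hand for illustration only.
const counts = new Map();
const retain = (obj) => counts.set(obj, (counts.get(obj) ?? 0) + 1);
const release = (obj) => counts.set(obj, counts.get(obj) - 1);

function task() {
  const reporter = { name: 'reporter' };
  const assignee = { name: 'assignee' };
  retain(reporter); // held by the local variable `reporter`
  retain(assignee); // held by the local variable `assignee`

  reporter.assignee = assignee;
  retain(assignee); // held by reporter.assignee
  assignee.reporter = reporter;
  retain(reporter); // held by assignee.reporter

  // The function is about to return, so both local variables go away.
  release(reporter);
  release(assignee);
}

task();
const remaining = [...counts.values()];
console.log(remaining); // [1, 1]: neither count reaches zero
```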

This is a typical situation that can easily result in memory leaks in real-world applications. To get around it, JavaScript engines rely on a second strategy.

The mark-and-sweep algorithm

The mark-and-sweep technique is well known for being used for garbage collection in various programming languages. In short, it relies on reachability: it checks whether a particular object can be reached from the root object.

If you're in a Node.js application, the root object is the global object; if you're in a browser, it's window.
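If you want code that refers to the root object in either environment, the standard globalThis binding resolves to window in browsers and to global in Node.js. A small check, assuming an ES2020-capable runtime (the pendingTasks property is just an example name):

```javascript
// Detect which global object we are running under.
const inBrowser = typeof window !== 'undefined' && globalThis === window;
const inNode = typeof global !== 'undefined' && globalThis === global;
console.log(inBrowser ? 'window' : inNode ? 'global' : 'other');

// Anything attached to the global object is reachable from the root,
// so it stays alive until the property is removed or overwritten.
globalThis.pendingTasks = ['Laundry'];

delete globalThis.pendingTasks; // the array is no longer reachable from the root
```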

The process starts at the root and works its way down the object graph, marking each object that can be reached (i.e., that is still referenced) and then sweeping the ones that cannot.

Can you see how, in the previous example, this GC will collect both the reporter and the assignee? Once task returns, neither object can be reached from the root, so both are swept even though they reference each other.
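The two phases can be sketched with a toy heap. This is a minimal illustration under simplifying assumptions, not how an engine implements it: the heap set, the allocate helper, and the refs and marked fields are all invented for the example:

```javascript
// Toy heap: every allocated object is registered here so we can sweep it.
const heap = new Set();
const allocate = (label) => {
  const obj = { label, refs: [], marked: false };
  heap.add(obj);
  return obj;
};

// Mark phase: flag everything reachable from the given object.
function mark(obj) {
  if (obj.marked) return; // already visited, which also stops on cycles
  obj.marked = true;
  obj.refs.forEach(mark);
}

// Sweep phase: drop unmarked objects and reset the marks on survivors.
function sweep() {
  const collected = [];
  for (const obj of heap) {
    if (obj.marked) {
      obj.marked = false;
    } else {
      heap.delete(obj);
      collected.push(obj.label);
    }
  }
  return collected;
}

const root = allocate('root');
const myTask = allocate('myTask');
const reporter = allocate('reporter');
const assignee = allocate('assignee');

root.refs.push(myTask);       // reachable from the root
reporter.refs.push(assignee); // the cycle from the earlier example:
assignee.refs.push(reporter); // nothing reachable points into it

mark(root);
const collected = sweep();
console.log(collected); // ['reporter', 'assignee']
```

The cycle makes no difference here, because the algorithm asks whether an object is reachable, not whether something still references it.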

The Scavenger

V8, the engine behind Node.js, divides the heap into a young generation (the new space) and an old generation (the old space). Freeing memory in the old space is the more expensive operation, and that is where the mark-and-sweep algorithm is used when necessary.

The scavenger, by contrast, only collects garbage from the young generation. Its strategy is to pick out the surviving objects and move them to a new page. V8 ensures that at least half of the young generation stays empty so that this step can take place; otherwise, it would run into memory issues.

The goal is to trace all references into the young generation without having to walk the entire old generation. To that end, the scavenger keeps a set of references from the old space that point to objects in the new space.

The operation then copies the surviving objects to the new page in chunks, continuing until the whole collection is complete. Finally, it updates the pointers to the objects that were moved.
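The copying idea can be sketched as a toy collector. To be clear, this is not V8's implementation, only a simplified illustration of copying survivors and updating pointers. The scavenge function, its parameters, and the refs field are hypothetical names for this example:

```javascript
// Toy copying collector for a "young generation" held in an array.
function scavenge(fromSpace, roots, rememberedSet) {
  const toSpace = [];
  const forwarded = new Map(); // original object -> its copy in toSpace

  const copy = (obj) => {
    if (!fromSpace.includes(obj)) return obj; // old-space object: leave it
    if (!forwarded.has(obj)) {
      const moved = { ...obj, refs: [...obj.refs] };
      forwarded.set(obj, moved);
      toSpace.push(moved);
    }
    return forwarded.get(obj);
  };

  // Start from the roots and from the old-space objects known to point into
  // the young generation, so the old space never has to be scanned in full.
  const newRoots = roots.map(copy);
  for (const oldObj of rememberedSet) {
    oldObj.refs = oldObj.refs.map(copy); // update old-to-new pointers
  }

  // Scan the copies in order, copying whatever they reference in turn.
  for (let i = 0; i < toSpace.length; i += 1) {
    toSpace[i].refs = toSpace[i].refs.map(copy);
  }

  return { toSpace, roots: newRoots };
}

const a = { label: 'a', refs: [] };
const b = { label: 'b', refs: [] };
const c = { label: 'c', refs: [] };
const garbage = { label: 'garbage', refs: [] };
const oldObject = { label: 'old', refs: [c] }; // lives in the old space
a.refs.push(b);

const result = scavenge([a, b, c, garbage], [a], [oldObject]);
console.log(result.toSpace.map((obj) => obj.label)); // ['a', 'c', 'b']
```

Everything left behind in the original array (here, only the garbage object) is simply abandoned, which is why this approach is cheap when most young objects die quickly.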

Conclusion

Of course, this was only a high-level overview of the GC strategies available in the JavaScript world. The topic is considerably more complex and deserves further reading. As supplementary material, I strongly recommend the well-known Mozilla GC documentation and V8's talk about the Orinoco garbage collector.

It's important to remember that, as with many other languages, we don't know when the GC will run. As of 2019, the periodic clean-up is entirely up to the GC, and you cannot trigger it yourself from standard JavaScript.
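What you can do, at least in Node.js, is observe the heap through process.memoryUsage(). The sketch below is Node-only, the numbers vary from run to run, and nothing in it forces a collection:

```javascript
const toMiB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

const before = process.memoryUsage().heapUsed;

// Allocate a noticeable amount of short-lived data.
let buffer = Array.from({ length: 200000 }, (_, i) => ({ id: i }));
const during = process.memoryUsage().heapUsed;

buffer = null; // now unreachable; the GC decides when to reclaim it

console.log(`before: ${toMiB(before)} MiB, during: ${toMiB(during)} MiB`);
```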

Aside from that, how you write your code has a significant influence on how much memory JavaScript allocates. That is why it is important to understand how memory is allocated and how the garbage collector releases it. Several open-source lint and hint tools can help you discover and analyze memory leaks, as well as other hazards in your code. Go for it!