Dibash Thapa

Strings in WebAssembly

Problem

I was working on compiling a small subset of JavaScript to WebAssembly. Representing numbers, if/else conditions, and for loops was intuitive and straightforward. However, because I didn't have any prior experience working with low-level instructions, I got stuck on strings.

Compiling a hello world program to WebAssembly was not as easy as I had thought.

Memory Layout

After researching for a bit, I found that WebAssembly has a data section, where we can initialize data. So I could put "Hello World" in the data section and print it by reading from its offset in memory.

(data (i32.const 0) "Hello World")

Unlike high-level languages, where strings are first-class citizens, WebAssembly treats them as raw bytes in memory. It follows a linear memory model, so, as with an array, we can store data in contiguous memory. This is how "Hello World" is stored in memory:

[Figure: Memory Layout]

You can learn more about linear memory from this blog post by Lin Clark.
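To make the "raw bytes" idea concrete, here is a minimal sketch that imitates the data segment from the JavaScript host side. It does not compile any WebAssembly: it assumes a standalone WebAssembly.Memory and copies the bytes in by hand, which is roughly what (data (i32.const 0) "Hello World") does at instantiation. The helper name readBytes is made up for this illustration.

```javascript
// One page of linear memory (64 KiB), viewed as plain bytes.
const memory = new WebAssembly.Memory({ initial: 1 });
const bytes = new Uint8Array(memory.buffer);

// Imitate the data segment: copy the string's bytes to offset 0.
bytes.set(new TextEncoder().encode("Hello World"), 0);

// Hypothetical helper: memory has no notion of "a string", so reading
// one back needs both an offset and a byte length.
function readBytes(offset, length) {
  return new TextDecoder().decode(bytes.subarray(offset, offset + length));
}

console.log(Array.from(bytes.subarray(0, 5))); // [ 72, 101, 108, 108, 111 ]
console.log(readBytes(0, 11)); // Hello World
```

Note that nothing in memory marks where the string ends, which is the problem the rest of this post deals with.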

Storing the Strings

For printing and storing the string, we need two offsets

We need to track these offsets in our compiler. Suppose we want to store two different strings, "hello" and "world", declared on different lines of the program:

1| var a = "hello"
....
18| var b = "world"

We can track the offsets by simply incrementing our last offset by the length of the string.

new_offset = last_offset + length of the string

For example

new_offset = 0 + "hello".length = 0+5 = 5

In WebAssembly, this translates to:

(data (i32.const 0) "hello")
(data (i32.const 5) "world")
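The bookkeeping above could look something like this inside the compiler. This is only a sketch, not the actual implementation: the function names layoutStrings and emitData are invented here, and it assumes the literals are plain ASCII with no quotes or backslashes, so no escaping is needed in the emitted text.

```javascript
// Hypothetical helper: give every string literal a byte offset.
function layoutStrings(literals) {
  const encoder = new TextEncoder();
  const segments = [];
  let offset = 0;
  for (const value of literals) {
    segments.push({ value, offset });
    // new_offset = last_offset + length of the string (in bytes)
    offset += encoder.encode(value).length;
  }
  return segments;
}

// Hypothetical helper: print one data segment per literal.
// Assumes ASCII literals without quotes or backslashes.
function emitData(segments) {
  return segments
    .map(({ value, offset }) => `(data (i32.const ${offset}) "${value}")`)
    .join("\n");
}

const segments = layoutStrings(["hello", "world"]);
console.log(emitData(segments));
// (data (i32.const 0) "hello")
// (data (i32.const 5) "world")
```

The length is measured in encoded bytes rather than with .length, because a JavaScript string length counts UTF-16 code units and would drift from the byte offset for non-ASCII text.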

Okay, but how do we print this now? We only know the offsets of the strings, so what about their lengths?

Length of the string

Then I found this blog, where the author stored the length of the string in the first byte. This technique is called length-prefixed encoding, and it is common in network protocols and even in programming languages like Pascal.

String: "\05hello"
Memory: [0x05]['h']['e']['l']['l']['o']
         ↑______↑
         length  data

So if my string is hello, it is stored together with its length as \05hello: the first byte is the length of the string, and the characters follow.

The escape \05 is stored as the single byte 0x05.
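As a rough sketch of how a compiler might produce that prefixed text, the hypothetical helper below (toLengthPrefixed is a name invented for this example) builds the two-hex-digit escape used by the WebAssembly text format. It assumes ASCII literals, and it rejects anything longer than 255 bytes, since a single byte cannot hold a larger length.

```javascript
// Hypothetical helper: build the data-segment text for one literal.
function toLengthPrefixed(value) {
  const length = new TextEncoder().encode(value).length;
  if (length > 255) {
    throw new RangeError("a one-byte prefix can only describe up to 255 bytes");
  }
  // WAT string escapes are a backslash followed by two hex digits.
  const prefix = "\\" + length.toString(16).padStart(2, "0");
  // size counts the prefix byte too, so the next string's offset
  // is last_offset + 1 + length rather than last_offset + length.
  return { data: prefix + value, size: 1 + length };
}

const hello = toLengthPrefixed("hello");
console.log(hello.data); // \05hello
console.log(hello.size); // 6
```

One consequence worth noticing: once the prefix is stored, each string occupies one byte more than its length, so the offset bookkeeping has to advance by size instead of by the string length alone.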

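I defined this function to get the length of the string:

(func $len (param $addr i32) (result i32)
  local.get $addr   ;; push the address of the string
  call $nullthrow   ;; helper not shown here; presumably traps on a null
                    ;; address and otherwise leaves the address on the stack
  i32.load8_u       ;; load the first byte (the length prefix) as an unsigned value
)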
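On the host side, printing then comes down to the same two steps: read the prefix byte, then decode that many bytes after it. The sketch below assumes the module exports or imports its memory so JavaScript can see it; readString and demoMemory are names made up for this example, and the bytes are written by hand instead of coming from a real data segment.

```javascript
// Hypothetical host helper: decode a length-prefixed string at addr.
function readString(memory, addr) {
  const bytes = new Uint8Array(memory.buffer);
  const length = bytes[addr]; // the same byte $len reads with i32.load8_u
  const start = addr + 1;     // the characters begin right after the prefix
  return new TextDecoder().decode(bytes.subarray(start, start + length));
}

// Stand-in for the module's memory, filled with the bytes of "\05hello".
const demoMemory = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(demoMemory.buffer).set([5, 104, 101, 108, 108, 111], 0);

console.log(readString(demoMemory, 0)); // hello
```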

End Notes

But we're not done yet. In my next post, we'll tackle the real challenge: dynamic strings that can grow, shrink, and be modified at runtime. We'll build our own memory allocator.