Short string optimisation trick

While reading a particular issue in lwan, a blog post caught my interest. It describes a way to store short ascii strings in the same area as their header data. Suppose the struct describing a heap allocated string looks like this:

typedef struct {
  uint32_t length;
  uint32_t capacity;
  uint32_t flags;
  char *buffer;
} string_t;

sizeof(string_t) // on 32-bits = 12

A clever cookie would realise that if the flag is moved to the end of the structure, you can set a bit to decide whether the string is short enough to live directly on the stack:

typedef struct {
  uint32_t capacity;
  char *buffer;
} string_info_t;

typedef struct {
  uint32_t length;
  union {
    struct {
      char buffer[sizeof(string_info_t)];
    } small;
    string_info_t info;
  }
  uint32_t flags;
} string_t;

This is a less optimal implementation than in the original blog post. A fascinating point the original post makes is that if we store the length at the end of the structure and negate the size of string_info_t from it when we store it, we can double it up as a null terminator, effectively allowing us to use the entire structure and removing the need for flag bits:

typedef struct {
  uint32_t capacity;
  char *buffer;
  uint32_t length;
} string_info_t;

typedef struct {
  char buffer[sizeof(string_info_t) - 1];
  uint8_t sizeofinfo_minus_length;
} small_string_t;

typedef union {
  string_info_t info;
  small_string_t small;
} string_t;

When sizeofinfo_minus_length is zero (buffer full), its bits will overlap with the upper bits of length, still allowing us to store lengths up to 2^24 but quickly indicating whether the small buffer is in use or not, as well as automatically terminating it, allowing the string to have an effective length of 16 bytes.

The original blog post is very interesting and explains it well!