Vertical label placement

Labels on a vertical axis may need to be moved up or down from their preferred positions to prevent overlaps. This page describes a fast (linear time) algorithm for vertical label placement that minimises the maximum absolute offset of any label from its preferred position, while respecting limits on how high or low labels may be placed.

Formal statement of the problem

Given:

A set of n preferred positions, P = { p₁, p₂, … , p_n }
A separation, s
Limits, p_min and p_max

Such that:

The preferred positions are sorted: p_i ≤ p_j if i < j
Separation is required: s > 0
The limits permit all labels to be placed: p_max − p_min ≥ (n − 1)s

Find a set of permitted positions Q = { q₁, q₂, … , q_n } such that:

Labels are separated: q_{i+1} − q_i ≥ s for all 1 ≤ i < n
Limits are respected: p_min ≤ q_i ≤ p_max for all i
The maximum absolute offset is minimised: max(|o_i|), where the offsets are defined by o_i = q_i − p_i, cannot be reduced by any alternative choice of Q

All p_i, s, p_min, p_max, and all q_i must be integers.
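
The constraints above can be checked mechanically. The following is a minimal Python sketch, not part of the original algorithm. The function names `is_valid_placement` and `max_abs_offset` are hypothetical helpers introduced here for illustration. It checks separation and limits for a candidate Q, and reports the quantity being minimised.

```python
def is_valid_placement(preferred, placed, s, p_min, p_max):
    """Check the separation and limit constraints for a candidate Q.

    `preferred` is P and `placed` is Q, both sorted lists of integers.
    This does not check optimality, only feasibility.
    """
    if len(preferred) != len(placed):
        return False
    # Labels are separated: q_{i+1} - q_i >= s for neighbouring labels.
    separated = all(b - a >= s for a, b in zip(placed, placed[1:]))
    # Limits are respected: p_min <= q_i <= p_max for every label.
    within_limits = all(p_min <= q <= p_max for q in placed)
    return separated and within_limits


def max_abs_offset(preferred, placed):
    """Return max(|q_i - p_i|), the quantity the algorithm minimises."""
    return max((abs(q - p) for p, q in zip(preferred, placed)), default=0)
```

For example, with P = [10, 20, 20], s = 8, and limits 0 and 100, the candidate Q = [8, 16, 24] is valid with a maximum absolute offset of 4, whereas Q = P is not valid because two labels coincide.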

The vertical label placement algorithm

The vertical label placement algorithm works by transforming the preferred positions into separated clusters, and then transforming these clusters into permitted positions.

Clusters

A cluster represents a set of neighbouring labels whose permitted positions are separated by exactly s. A cluster is represented by a data structure with four fields:

The start position, p_start
The end position, p_end
The minimum offset, o_min
The maximum offset, o_max

For example, if a set of preferred positions { 10, 20, 20 } is transformed into a cluster with permitted positions { 8, 16, 24 } due to a separation of s = 8, then the offsets are { −2, −4, 4 }, and hence p_start = 8, p_end = 24, o_min = −4, and o_max = 4.

The number of labels in a cluster does not need to be recorded as the permitted positions are just the set { p_start, p_start + s, p_start + 2s, … , p_end }.

In pseudocode, we will assume the existence of a function CREATE-CLUSTER(p_start, p_end, o_min, o_max) that creates a cluster.
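
As an illustration, the cluster data structure might be sketched in Python as below. The `Cluster` dataclass stands in for CREATE-CLUSTER. The helpers `cluster_from_positions` and `label_count` are hypothetical, added only to reproduce the worked example above and to show that the label count is recoverable from the fields.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    p_start: int  # position of the first label in the cluster
    p_end: int    # position of the last label in the cluster
    o_min: int    # smallest offset (permitted - preferred) in the cluster
    o_max: int    # largest offset (permitted - preferred) in the cluster


def cluster_from_positions(preferred, permitted):
    """Hypothetical helper: build a cluster from matched position lists."""
    offsets = [q - p for p, q in zip(preferred, permitted)]
    return Cluster(permitted[0], permitted[-1], min(offsets), max(offsets))


def label_count(cluster, s):
    """Recover the number of labels from the start, end, and separation."""
    return (cluster.p_end - cluster.p_start) // s + 1
```

Under these assumptions, `cluster_from_positions([10, 20, 20], [8, 16, 24])` gives `Cluster(p_start=8, p_end=24, o_min=-4, o_max=4)`, and `label_count` of that cluster with s = 8 is 3.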

Transforming clusters

The algorithm makes use of three simple functions for transforming clusters.

SHIFT moves an entire cluster by a chosen offset:

SHIFT(cluster, offset)

cluster[p_start] = cluster[p_start] + offset
cluster[p_end] = cluster[p_end] + offset
cluster[o_min] = cluster[o_min] + offset
cluster[o_max] = cluster[o_max] + offset

BALANCE shifts a cluster to minimise the absolute value of the sum of o_min and o_max, which is equivalent to minimising the maximum absolute offset within the cluster:

BALANCE(cluster)

imbalance = ROUND-TOWARDS-ZERO((cluster[o_min] + cluster[o_max]) ÷ 2)

SHIFT(cluster, −imbalance)

Note that rounding towards zero, rather than rounding down, ensures that a cluster with o_min = −m and o_max = m − 1 isn't needlessly shifted so that o_min = 1 − m and o_max = m, which would not reduce the maximum absolute offset.
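
SHIFT and BALANCE could be sketched in Python as follows, assuming a mutable `Cluster` dataclass with the four fields described earlier. Python's `//` operator rounds down rather than towards zero, so the hypothetical helper `half_towards_zero` handles negative sums explicitly.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    p_start: int
    p_end: int
    o_min: int
    o_max: int


def shift(cluster, offset):
    """Move the whole cluster, and therefore every offset, by `offset`."""
    cluster.p_start += offset
    cluster.p_end += offset
    cluster.o_min += offset
    cluster.o_max += offset


def half_towards_zero(total):
    """Return total / 2 rounded towards zero, using integer arithmetic."""
    return -((-total) // 2) if total < 0 else total // 2


def balance(cluster):
    """Shift the cluster so that o_min and o_max are as symmetric as possible."""
    imbalance = half_towards_zero(cluster.o_min + cluster.o_max)
    shift(cluster, -imbalance)
```

With this sketch, balancing `Cluster(12, 20, -8, 0)` yields `Cluster(16, 24, -4, 4)`, while `Cluster(0, 8, -3, 2)` is left unchanged because its imbalance rounds to zero.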

LIMIT shifts a cluster to respect the limits:

LIMIT(cluster, p_min, p_max)

if cluster[p_start] < p_min
        SHIFT(cluster, p_min − cluster[p_start])

if cluster[p_end] > p_max
        SHIFT(cluster, p_max − cluster[p_end])
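
A Python sketch of LIMIT is given below, again assuming a mutable `Cluster` dataclass. Shifting to respect a limit also changes the recorded offsets, which is how later merges account for the constraint.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    p_start: int
    p_end: int
    o_min: int
    o_max: int


def shift(cluster, offset):
    """Move the whole cluster, and therefore every offset, by `offset`."""
    cluster.p_start += offset
    cluster.p_end += offset
    cluster.o_min += offset
    cluster.o_max += offset


def limit(cluster, p_min, p_max):
    """Shift the cluster up or down so that it lies within the limits."""
    if cluster.p_start < p_min:
        shift(cluster, p_min - cluster.p_start)
    if cluster.p_end > p_max:
        shift(cluster, p_max - cluster.p_end)
```

For instance, with limits 0 and 100, `Cluster(-4, 4, -4, 4)` becomes `Cluster(0, 8, 0, 8)`, and `Cluster(96, 104, -4, 4)` becomes `Cluster(92, 100, -8, 0)`.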

Merging clusters

Using the functions for transforming clusters, we can now define MERGE, which creates a new cluster by merging two neighbouring clusters:

MERGE(cluster₁, cluster₂, s)

SHIFT(cluster₁, cluster₂[p_start] − cluster₁[p_end] − s)

cluster = CREATE-CLUSTER(
        cluster₁[p_start],
        cluster₂[p_end],
        MIN(cluster₁[o_min], cluster₂[o_min]),
        MAX(cluster₁[o_max], cluster₂[o_max])
)

BALANCE(cluster)

return cluster

MERGE works by shifting cluster₁ so that its end position is separated from the start position of cluster₂ by exactly s, fusing the two clusters end-to-end, and then balancing the resulting cluster.
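
The three steps of MERGE can be sketched in Python as below. This is an illustrative translation of the pseudocode, with `Cluster`, `shift`, and `balance` repeated so that the block stands alone.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    p_start: int
    p_end: int
    o_min: int
    o_max: int


def shift(cluster, offset):
    cluster.p_start += offset
    cluster.p_end += offset
    cluster.o_min += offset
    cluster.o_max += offset


def balance(cluster):
    total = cluster.o_min + cluster.o_max
    # Halve with rounding towards zero (Python's // rounds down).
    imbalance = -((-total) // 2) if total < 0 else total // 2
    shift(cluster, -imbalance)


def merge(first, second, s):
    """Fuse two neighbouring clusters into one balanced cluster."""
    # Step 1: move `first` so it ends exactly s before `second` starts.
    shift(first, second.p_start - first.p_end - s)
    # Step 2: fuse end-to-end, combining the offset ranges.
    merged = Cluster(
        first.p_start,
        second.p_end,
        min(first.o_min, second.o_min),
        max(first.o_max, second.o_max),
    )
    # Step 3: rebalance the fused cluster.
    balance(merged)
    return merged
```

Merging two single-label clusters at position 20 with s = 8 gives `Cluster(16, 24, -4, 4)`. Merging a single-label cluster at position 10 with that result gives `Cluster(8, 24, -4, 4)`, which matches the worked example in the Clusters section.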

The cluster list

The cluster list is a data structure used to store clusters. It is used as a stack, but must also support iteration to transform the clusters into a set of permitted positions. Most programming languages have a vector type that supports this functionality, while also allowing the required memory to be allocated on initialisation.

In pseudocode, we will assume the existence of a function CREATE-LIST that creates an empty list, and PUSH, POP, PEEK, and IS-EMPTY functions that operate on it as a stack.

Popping from the cluster list

POP-IF-NOT-SEPARATE pops and returns the last cluster from a cluster list if it is not sufficiently separated from a new cluster, and otherwise returns a NONE value and leaves the cluster list unchanged. Most programming languages would represent this with either an option type or a nullable type.

POP-IF-NOT-SEPARATE(list, cluster, s)

if not IS-EMPTY(list) and PEEK(list)[p_end] + s > cluster[p_start]
        return POP(list)
else
        return NONE

Transforming the cluster list into positions

POSITIONS transforms a cluster list into a list of permitted positions:

POSITIONS(list, s)

positions = CREATE-LIST()

for each cluster_i in list
        position = cluster_i[p_start]
        while position ≤ cluster_i[p_end]
                PUSH(positions, position)
                position = position + s

return positions

Completing the algorithm

We can now complete the algorithm by defining PLACE:

PLACE(P, s, p_min, p_max)

list = CREATE-LIST()

for each p_i in P
        cluster = CREATE-CLUSTER(p_i, p_i, 0, 0)
        LIMIT(cluster, p_min, p_max)

        while previous = POP-IF-NOT-SEPARATE(list, cluster, s)
                cluster = MERGE(previous, cluster, s)
                LIMIT(cluster, p_min, p_max)

        PUSH(list, cluster)

return POSITIONS(list, s)

PLACE works by iterating over the preferred positions, converting each to a single-position cluster, repeatedly merging this cluster with a cluster popped from the cluster list if they are not sufficiently separated, and then pushing the resulting cluster onto the cluster list.

The limits are applied each time a cluster is created. If limits are not required, the algorithm can easily be modified by removing the p_min and p_max arguments and the two calls to LIMIT.
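
Putting the pieces together, the whole algorithm might look like this in Python. This is a sketch that follows the pseudocode rather than the reference implementation. A plain Python list serves as the cluster list, and POP-IF-NOT-SEPARATE and POSITIONS are inlined into `place`.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    p_start: int
    p_end: int
    o_min: int
    o_max: int


def shift(cluster, offset):
    cluster.p_start += offset
    cluster.p_end += offset
    cluster.o_min += offset
    cluster.o_max += offset


def balance(cluster):
    total = cluster.o_min + cluster.o_max
    # Halve with rounding towards zero (Python's // rounds down).
    imbalance = -((-total) // 2) if total < 0 else total // 2
    shift(cluster, -imbalance)


def limit(cluster, p_min, p_max):
    if cluster.p_start < p_min:
        shift(cluster, p_min - cluster.p_start)
    if cluster.p_end > p_max:
        shift(cluster, p_max - cluster.p_end)


def merge(first, second, s):
    shift(first, second.p_start - first.p_end - s)
    merged = Cluster(first.p_start, second.p_end,
                     min(first.o_min, second.o_min),
                     max(first.o_max, second.o_max))
    balance(merged)
    return merged


def place(preferred, s, p_min, p_max):
    """Return permitted positions for sorted integer preferred positions."""
    clusters = []  # the cluster list, used as a stack
    for p in preferred:
        cluster = Cluster(p, p, 0, 0)
        limit(cluster, p_min, p_max)
        # Inlined POP-IF-NOT-SEPARATE: merge while the top cluster is too close.
        while clusters and clusters[-1].p_end + s > cluster.p_start:
            cluster = merge(clusters.pop(), cluster, s)
            limit(cluster, p_min, p_max)
        clusters.append(cluster)
    # Inlined POSITIONS: expand each cluster into evenly spaced positions.
    positions = []
    for cluster in clusters:
        positions.extend(range(cluster.p_start, cluster.p_end + 1, s))
    return positions
```

With limits 0 and 100 and s = 8, `place([10, 20, 20], 8, 0, 100)` returns `[8, 16, 24]`, `place([0, 0, 0], 8, 0, 100)` returns `[0, 8, 16]`, and `place([100, 100], 8, 0, 100)` returns `[92, 100]`.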

Asymptotic running time

As the cluster for each p_i may be merged with up to i − 1 previous clusters, it may appear the algorithm has quadratic asymptotic running time — O(n²). However, each merger pops a cluster from the cluster list, reducing the number of clusters available for future mergers. As only n clusters are ever pushed to the list, at most n − 1 mergers can occur, giving the algorithm linear asymptotic running time — O(n).
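
The counting argument can be written out as follows. The symbols m (the number of mergers), k (the number of clusters left on the list when the loop ends), and the constants c₁ to c₄ are introduced here for illustration and do not appear in the pseudocode. The bound assumes n ≥ 1.

```latex
\begin{aligned}
\text{pushes} &= n, \qquad \text{pops} = m, \qquad k = n - m \ge 1
  \;\Longrightarrow\; m \le n - 1 \\
T(n) &\le \underbrace{c_1 n}_{\text{create, limit, push}}
      + \underbrace{c_2 m}_{\text{merge, limit}}
      + \underbrace{c_3 n}_{\text{failed separation checks}}
      + \underbrace{c_4 n}_{\text{POSITIONS output}} \\
     &\le (c_1 + c_2 + c_3 + c_4)\, n = O(n)
\end{aligned}
```

Each pass through the outer loop ends with exactly one failed separation check, which is why that term is proportional to n rather than to m.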

Rust implementation

A Rust crate containing a reference implementation of the vertical label placement algorithm is available on GitHub.