ocaml • geospatial

Antonin Guttman first proposed R-Trees in 1984.

In order to handle spatial data efficiently, as required in computer aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations...

There are lots of excellent resources for understanding more about R-Trees and all of the wonderful variations there of (e.g. R*-trees). This blog post seems to be the most popular introduction.

In the Energy and Environment Group we are working on various geospatial tools, one of which is written in OCaml and needed a geospatial index to speedup spatial queries over a dataset. Searching Github revealed an implementation in pure OCaml. After reaching out to the author, they were more than happy for their work to be maintained in the Geocaml organisation, updated to the latest OCaml tools and hopefully released on opam.

The dunification went smoothly whilst fast-forwarding 15 OCaml compiler versions with no changes or problems (just dune unused variable warnings).

An R-Tree in OCaml is created from an envelope (the kind of bounding box) and a value (for storing in the tree). The value depends on the envelope as you have to provide a function to calculate an envelope from a value. Here is one way we might define a line using the provided two-dimensional `Rectangle`

envelope.

```
module Line = struct
type t =
{ p0 : float * float; p1 : float * float }
let t =
let open Repr in
record "line" (fun p0 p1 -> { p0; p1 })
|+ field "p0" (pair float float) (fun t -> t.p0)
|+ field "p1" (pair float float) (fun t -> t.p1)
|> sealr
type envelope = Rtree.Rectangle.t
let envelope { p0 = (x1, y1); p1 = (x2, y2) } =
let x0 = Float.min x1 x2 in
let x1 = Float.max x1 x2 in
let y0 = Float.min y1 y2 in
let y1 = Float.max y1 y2 in
Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1
end
```

In addition to the `envelope`

function, we also need to provide a runtime representation using `Repr`

. In the example, we've constructed it by hand but I recommend using ppx_repr to derive it for you. From here we can create our index.

```
module R = Rtree.Make (Rtree.Rectangle) (Line)
```

There are two ways to insert values into the index -- either by inserting them one-by-one or by bulk loading them. The latter is the recommended approach where possible as the resulting layout of the R-Tree is much better.

```
# let r =
let i = R.empty 4 in
R.insert i Line.{ p0 = (1., 1.); p1 = (2., 2.) };;
val r : R.t = <abstr>
```

As you might expect, bulk loading takes a list of values instead.

```
# let r =
let lines =
Line.[
{ p0 = (0., 0.); p1 = (1., 1.) };
{ p0 = (1., 1.); p1 = (2., 2.) };
{ p0 = (2., 2.); p1 = (3., 3.) };
{ p0 = (3., 3.); p1 = (4., 4.) };
]
in
R.load ~max_node_load:4 lines;;
val r : R.t = <abstr>
```

The lines we loaded into the last index are really just four segments of the line `y = x`

. Let's spatially query for the first two segments.

```
# R.find r (Rtree.Rectangle.v ~x0:0. ~y0:0. ~x1:2. ~y1:2.);;
- : Line.t list =
[{Line.p0 = (0., 0.); p1 = (1., 1.)}; {Line.p0 = (1., 1.); p1 = (2., 2.)};
{Line.p0 = (2., 2.); p1 = (3., 3.)}]
```

You can see the query actually returned the first three, that's because all bound checks are inclusive.

The source repository includes a fairly simple example of rendering an R-Tree using Vg. This can be done outside of the repository using the `tree`

representation.

```
# #show R.tree;;
val tree : R.t -> R.tree
type nonrec tree =
R.tree =
Node of (Line.envelope * R.tree) list
| Leaf of (Line.envelope * Line.t) list
| Empty
```

Here is an example rendering from the repository showing some random lines and the envelopes of the R-tree.