This document describes a matching algorithm that provides all the functionality needed for topic matching and most of the functionality needed for content-based routing. This document was originally written by Pieter Hintjens of iMatix Corporation as part of the OpenAMQ specifications and the techniques described here form the basis of OpenAMQ's topic matching engine. Later on, the document was updated to cover optimalised topic routing algorithms and message filtering issues.
Matching messages to requests is a typical bottleneck in a message oriented middleware server. The standard mechanism is the "selector", an SQL-like clause that is interpreted to give a true/false result. Selectors are the basic tool for "content-based routing". Given a high volume of messages and requests, the cost of selectors grows exponentially. The "topic" mechanism used by middleware servers provides a faster matching algorithm based on a hierarchical naming system, but this still acts as a bottleneck in high-volume scenarios.
We define a "request" as consisting of one or more "criteria", and a message as providing one or more "fields". The request specifies the desired criteria in terms of fields: for example, a field having a certain value, or matching a certain pattern.
Topic matching is based on patterns (e.g. "forex.*") while content-based routing is based on field values ("currency = USD or GBP").
The most obvious matching technique is to compare each message with each request. If the cost of such a comparison is C, then the cost of matching one message is:(1)
Where R is the number of requests. The cost of C is proportional to the complexity of the request - i.e. the average number of criteria per request. In a low-volume scenario, R might be 1-10. In a high volume scenario we might have:
- 10,000 active requests (R = 10,000)
- a matching cost of 100 microseconds (C = 100)
Giving a cost of 10000 * 100 microseconds per message, or 1 second per message.
We can improve this by remarking that many requests are identical. If we assume that the maximum value for R will be around 100, we may reduce the cost per message to 0.01 seconds per message, giving us a maximum throughput of 100 messages per second.
Our goal is to get a matching cost of under 10 microseconds per message, for a potential throughput of as much as 100,000 messages per second.
The Inverted Bitmap Technique
In 1980-81, working for Lockheed, Leif Svalgaard and Bo Tveden built the Directory Assistance System for the New York Telephone company. The system consisted of 20 networked computers serving 2000 terminals, handling more than 2 million lookups per day. In 1982 Svalgaard and Tveden adapted the system for use in the Pentagon (Defense Telephone Service). This system is still in operation.
Svalgaard and Tveden invented the concept of "inverted bitmaps" to enable rapid matching of requests with names in the directory.
The inverted bitmap technique is based on these principles:
- We change data rarely, but we search frequently.
- The number of possible searches is finite and can be represented by a large, sparse array of items against criteria, with a 1 in each position where an item matches a criteria and 0 elsewhere.
The indexing technique works as follows:
- We number each searchable item from 0 upwards.
- We analyse each item to give a set of "criteria" name and value tuples.
- We store the criteria tuple in a table indexed by the name and value.
- For each criteria tuple we store a long bitmap representing each item that matches it.
The searching technique works as follows:
- We analyse the search request to give a set of criteria name and value tuples.
- We look up each criteria tuple in the table, giving a set of bitmaps.
- We AND or OR each bitmap to give a final result bitmap.
- Each 1 bit in the result bitmap represents a matching item.
Note that the bitmaps can be huge, representing millions of items, and are usually highly compressible. Much of the art in using inverted bitmaps comes from:
- Deriving accurate criteria tuples from items and search requests.
- Careful compression techniques on the large sparse bitmaps.
- Post-filtering search results to discard false positives.
Today's web search engines use such techniques. We (Hintjens et al) have built several search engines using these techniques (from STAR in 1990, to sms@ in 2001).
Application to Message Matching
The inverted bitmap technique thus works by pre-indexing a set of searchable items so that a search request can be resolved with a minimal number of operations.
It is efficient if and only if the set of searchable items is relatively stable with respect to the number of search requests. Otherwise the cost of re-indexing is excessive.
When we apply the inverted bitmap technique to message matching, we may be confused into thinking that the message is the "searchable item". This seems logical except that message matching requests are relatively stable with respect to messages.
So, we must invert the roles so that:
- The "searchable item" is the matching request, which we will call a "subscription" for the purposes of discussion.
- The "search request" is the message.
The indexing process now works as follows:
- We number each match request from 0 upwards.
- We analyse each match request to give a set of criteria tuples.
- We store the criteria tuples in a table indexed by name and value.
- For each criteria tuple we store a long bitmap representing each match request that asks for it.
The message matching process works as follows:
- We analyse the message to give a set of criteria tuples.
- We look up each tuple in the table, giving a set of bitmaps.
- We accumulate the bitmaps to give a final result bitmap.
- Each 1 bit in the result bitmap represents a matching subscription.
Topic Matching Example
Imagine we have these topics:
forex forex.gbp forex.eur forex.usd trade trade.usd trade.jpy
Imagine these subscriptions, where * matches one topic name section.
0 = "forex.*" matches: forex, forex.gbp, forex.eur, forex.usd 1 = "*.usd" matches: forex.usd, trade.usd 2 = "*.eur" matches: forex.eur 3 = "*" matches: forex, trade
When we index the matches for each subscription we get these bitmaps:
Now let us examine in detail what happens for a series of messages:
Message A -> "forex.eur" forex.eur 1 0 1 0 => Subscriptions 0, 2 Message B -> "forex" forex 1 0 0 1 => Subscriptions 0, 3 Message C -> "trade.jpy" trade.jpy 0 0 0 0 => No subscription
Field Matching Example
For our example we will allow matching on field value and/or presence. That is, a subscription can specify a precise value for a field or ask that the field be present.
Imagine these subscriptions:
|Nbr||Criteria||Number of criteria|
When we index the field name/value tuples for each subscription we get these bitmaps:
Now let us examine in detail what happens for a series of messages:
Message A: "currency=JPY, market=forex, slow"
Message B: "currency=JPY, urgent"
Message C: "market=forex, currency=EUR"
Note that the hit count is:
- zero for subscriptions that do not match.
- 1 or greater for subscriptions that have at least one matching criteria (a logical OR match).
- Equal to the criteria count when ALL criteria match (a logical AND match).
Optimised topic routing algorithm
While routing on topics as presented above is extremely fast, it tends to consume too much memory in large stock trading scenarios (millions of distinct topics, thousands of subscriptions). This section introduces optimised topic routing algorithm with modest memory requirements.
First of all, let's break individual subscriptions into their constituent parts. There's a 1st part of the topic, 2nd part of the topic etc. The longest subscription determines maximal number of parts we're going to take into account. If there's a message with more parts that the maximal number we've computed, it's safe to assume it doesn't match any subscription. We can drop such a message straight away without doing any additional processing.
If the topic doesn't exceed maximal size, we'll match each individual constituent with corresponding constituents of the subscriptions using inverted bitmap technique described above.
Say we have following subscriptions "trade, forex.eur, forex.*". Maximal number of constituents is 2:
Inverted bitmap for the first constituent:
Inverted bitmap for second constituent:
Note the "null" row in each bitmap. "null" means that the particular has no corresponding constituent. For instance, "trade" has no second constituent, thus it is assumed to be equal to "null".
Also note the "other" row. This is a way to limit the number of rows in each bitmap to match the number of possible alternatives for the constituent as they appear in subscriptions. There's no need to add new rows for values that don't appear in subscriptions. Say, the message "forex.usd" contains second constituent "usd", however, as it is not mentioned in any subscription we don't need a separate line in the bitmap for it. We'll simply treat it as "other" value.
The matching algorithm itself is easy: Break the topic of the message into its constituents, find a corresponding row in each inverted bitmap and perform logical AND on the selected rows. Note that this means that at most N rows are combined, N being maximal number of constituents in any subscription. This is quite a low number, presumably less than 10.
Additionally, it's not always necessary to perform logical conjunction on all the rows. Once you get an empty bitmap after processing couple of rows, there's no point and doing AND for subsequent rows - the topic does not match any subscription anyway.
Here are few examples of how the matching works:
Message has "forex.eur" topic. Get the "forex" row from the first consituent bitmap, "eur" row from the second constituent bitmap and perform logical AND on the rows:
|1 = forex||0||1||1|
|2 = eur||0||1||1|
"forex.eur" and "forex.*" subscriptions match.
Now the same thing for a message with "forex.jpy" topic:
|1 = forex||0||1||1|
|2 = other||0||0||1|
"forex.*" subscriptions matches.
One more example. This time message with single-constituent topic ("trade"):
|1 = trade||1||0||0|
|2 = null||1||0||0|
"trade.*" subscriptions matches.
There's one important optimisation that haven't been mentioned so far. If the topic of the message contains a sequence of nulls (e.g. "forex.null.null.null.null.null") there's no need to worry about constituents corresponding to the second and any subsequent nulls. in case of "forex.null.null.null.null.null" algorithm has to AND the row from first constituent bitmap and the row from second constituent bitmap, however, it can safely ignore the rows for third to sixth constituent. (The proof of correctness of the optimisation is left as an exercise for the reader.)
The above optimisation is especially important in the case the topic tree is unbalanced. For instance if most topics consist of three constituents, however, there's a single topic with 100 constituents, subscribing for that topic causes inverted bitmaps to have 100 rows. Thus, without above optimisation, each message matching requires 100 bitmap conjunctions, while in 99% of cases 3 conjunctions are enough to perform the matching.
Filtering is similar to matching, the only distinction being that you are not interested in which particular subscriptions match, rather what you want to know whether at least one subscription is matches.
Filtering is especially useful in multicast scenarios where all messages are transferred over the wire and it's receiver's responsibility to filter out those that don't match any local subscription.
While exactly the same inverted bitmap algorithm is used, it is possible to optimise it even further for the filtering use case.
With filtering we are basically interested in whether there is at least one bit set in the resulting bitmap. Thus, if we find out that the first bit is 1, there's no need to evaluate the remaining bits. Consequently, the algorithm can evaluate individual subscriptions from left to right and stop when it encounters the first one that matches.
Consider the "forex.eur" example above:
|1 = forex||0||1||1|
|2 = eur||0||1||1|
Note that AND is not done for the third subscription. Although this may seem to be of little help, with large bitmaps consisting of thousands of subscriptions it can save a lot of processing.
Obviously, doing logical AND on per-bit basis is terribly inefficient. In real-world implementation AND should be done from left to right in batches containing 32 or 64 subscriptions, depending on the processor data bus width or even batches of 128 or more with dedicated SIMD instructions.
There are several ways to speed the matching algorithm even more, trading more work to do on subscribing/unubscribing for faster message matching and filtering.
One possibility is to lower the number of subscriptions by subsuming them. In the filtering use case, if there's "forex.*" subscription, there's no need for "forex.eur" subscription, because the former is a superset of the latter.
Another possibility is to move the subscriptions that we expect to match the most messages to the left of the bitmap while leaving those that have only small probability of succeeding on the right. In combination with left-to-right evaluation of logical conjunction in the filtering use case, we can get an algorithm that will in most cases succeed early on, hopefully after evaluating first few columns of the inverted bitmap.
There's even an obvious heuristic to determine whether a particular subscription is likely to match: The more wildcards (*) there are in the subscription the larger set of messages it is able to match and thus more likely it is that it'll succeed. That way, moving the subscriptions with large number of wildcards to the left while keeping those with no or little wildcards on the right is likely to give a very efficient algorithm.
We believe that application of the above algorithms can give a system that will be able to match or filter a single message in the range of nanoseconds or couple of microseconds even it the case of large amount of different topics and subscriptions.
Written: 07 Aug 2007 10:19
Revised: 17 Apr 2012 11:55
If you found this page useful, please rate it up so others will find it.
- 0MQ Termination
- 0MQ/3.0 pubsub
- Background to AMQP
- Broker vs. Brokerless
- Internal Architecture of libzmq
- Market Analysis
- Measuring jitter
- Measuring messaging performance
- Messaging enabled network
- Multithreading Magic
- ØMQ Lightweight Messaging Kernel (v0.1)
- ØMQ Lightweight Messaging Kernel (v0.2)
- ØMQ Lightweight Messaging Kernel (v0.3)
- ØMQ Lightweight Messaging Kernel (v0.4)
- ØMQ Lightweight Messaging Kernel (v0.5)
- ØMQ Lightweight Messaging Kernel (v0.6)
- Routing (early experience)
- Switch or Broker
- Traffic Monitoring
- Whalesharks - an Open Source Data Plant
Who's watching this page?pieterh
... and more