Finally solving the performance problem
Posted in Technology on 2008-07-06 14:49
Lamps, Wittelsbach Residence, München
I wrote about the performance problems the tmphoto application had suffered from, and my failed attempts to fix them. I decided that the problem was that I'd set out to solve the problems without really knowing what the cause was, since reducing traffic by 55% obviously wasn't going to rescue the site. So the solution must be to do some proper research first, and then try to solve it.
Finding the problem
The application is written using the OKS, and uses tolog queries to extract information from the topic map. An obvious place to start was to profile the tolog queries used by the application to see if one of them might be the source of the problem. I'd already adapted the SQL query profiler in our product so that it also profiled tolog queries, so this was easy to do.
I figured it would also be a good idea to try to approximate the traffic that the real server was seeing, so I wrote a 10-line Python script to just sequentially replay requests from the server logs. I let this run for some thousand requests, then looked at the profile report. That was pretty unambiguous: two queries took up 97.8% of the time the application had spent on queries.
The two queries were the queries on the photo page that find the previous and next photo. To see what the effect would be of improving these, I went back to the Python script and made it compute the number of requests it got through per second. 3.1 per second, it reported, which isn't much. So I commented out the previous and next buttons, reloaded in the browser (to recompile the JSP) and ran it again. This time the script managed 48 requests per second. So clearly this was the place to start looking.
The two queries are actually rather complex, but the complexity comes from filtering out hidden photos for people who have not logged in. If this complexity is taken out, the queries basically look like this:
last-modified-at($PHOTO, $DATE), $DATE > %time% /* more code to remove hidden photos */ order by $DATE limit 1?
tolog works pretty much like Prolog (and even more like Datalog), but variables are indicated with a $ prefix. So the first line says to find all combinations of photos and the datetime they were taken (because both parameters are variables). The second line compares the retrieved datetime with the parameter time (which is the time of the current photo) and keeps only the ones that are later. Then it sorts by datetime and keeps only the first. In other words: the query finds the photo with the lowest datetime that's bigger than that of the current photo. So it finds the next photo.
The trouble is that when we do a trace of the query (I've set time to "2007-00-00"), we find that it actually performs the query in the way described above:
ENTER (1): lastmod($PHOTO, $DATE) LEAVE (0.011, 8572) ENTER (8572): >($DATE, 2007-00-00) LEAVE (0.0030, 4092) /* filtering performed here, taking lots of time, because it's chewing thousands of rows */ ENTER order by LEAVE (0.0050)
Fountain, Altes Hackerhaus, München
So first it finds all photo/date combinations, of which there are 8572. Then it cuts those earlier than 2007, leaving 4092. Then it pumps 4092 rows through the complex filtering, finally sorting them, and taking the first one. It's clear that this is never going to be very fast. Unfortunately, this is the only way to express the query in tolog, so there is no way to improve the query itself.
However, I wrote the query engine, and I realize there is a way to make the engine do this faster. We have an index on property values, so I can look up every topic with a particular datetime, for example. This index is a TreeMap, so I should be able to look up the current value and then walk forwards through the next values from there. This can be used to implement a different execution strategy for the query where we just examine enough values to get us the next photo that satisfies the filters, and then stop. In many cases, this means looking at one photo will be enough.
Now, I've been programming for long enough to know that this is not the moment to start implementing a complex query optimization. First, I need to double-check that TreeMap will indeed do what I need to do. I see that I can call the headMap method to get a submap starting at the current value and extending to the end of the map, and then use keySet().iterator() to get the keys in order. Fine. And there is a corresponding tailMap. This doesn't include the current value for some reason, which may be a complication, but I decide to work that out later. (A small aside here: if the purpose of a sorted map is to allow you to traverse keys in sorted order, why is the API for it seemingly geared towards producing submaps instead? And why is the API for traversal so convoluted and indirect? Strange.)
Anyway, I go on to implement the optimization itself. There is already an optimization framework in the query processor, so all I need to do is to implement a query optimizer class that modifies the query into the new query I want. The first step is to make it recognize the sort of query for which this optimization will work. This is a lot more complex than it sounds. Basically, the query has to satisfy about 10 different conditions, all of which have to be tested for explicitly. This actually comes to about 60 lines of code.
Once that's done, the optimization can be applied. This is done by rewriting the query to a new fake predicate replacing the last-modified-at. This predicate picks out a batch of the next key values from the index, looks up the corresponding objects, and then pumps all of this through a lightly rewritten version of the rest of the query. If that produces enough values to meet the limit it stops, leaving the ORDER BY and the LIMIT to do the rest of the job. If it's not enough, it doubles the batch size and tries again. This way we don't look at any more photos than we have to get the next one, and we never look at earlier photos at all.
I implement all of this over about one working day, and then discover a problem when I try to go backwards, which is needed for the query finding the previous photo. Yes, I can get a tailMap, but its keySet is of course ordered the wrong way: starting with the lowest value and ending with the highest. You can get the highest value, but there's no way to walk backwards from it.
Well, strictly speaking there is a way. You can call tailMap and then lastKey to get the previous value, then repeat ad infinitum, but this creates a new SortedMap object for each value you retrieve, which is a massive waste. Or you can call keySet on the first tailMap, put all the keys in to a list, reverse it, and then iterate. This could be a lot of values, so this isn't good, either. Another way to solve it is to build two TreeMaps, sorted in opposite directions. This, of course, neatly doubles the memory and time cost of building and maintaining the index, so this isn't an interesting option.
Death, Asamkirche, München
What's amazing is that the TreeMap is a red-black tree, so implementing the backwards walking takes just a few lines of code, but studying this a bit it becomes clear that the problem here is the design of the SortedMap interface, which just does not provide those operations. So why does it exist, then? I really can't say.
The "good" news is that in Java 1.6 there is the NavigableMap interface, which not only has the keys in sorted order, but also lets you traverse them in both directions. Amazing. Why didn't they call it SensibleMap? Still, this is no good for me, as we need to support Java 1.4.
I try subclassing TreeMap to add the backwards traversal that way, but most of the methods and attributes I need are private, so that doesn't work. I then look at the Apache Commons Collections, which does have a sorted map implementation with bidirectional traversal, but unfortunately this is sorted on both keys and values, making it a double tree, with the same disadvantages as above.
The eventual hack I settle for is to copy the source code of TreeMap and AbstractMap into our own source tree, then adding the necessary code for backwards traversal and a couple of iterator implementations to access that from outside. It comes to 70 lines of code, the meat of it being the 17 lines that implement the backwards traversal. Unfortunately, this probably means I've violated the Sun copyrights. So most likely I'll have to implement my own red-black tree in order to be able to get backwards traversal. (Thank you, Sun!)
Anyway, with my own "stolen" ReversibleTreeMap in place I manage to finish the optimization, then write test cases for it, and see that everything works as it should. But is it any faster? I deploy the web application again to get the new OKS/tolog code underneath it, but with the web application unchanged. Then I run the log-replay script again. 40 requests/second. So I've speeded it up by a factor of 13. That's good enough for me. I deployed it about 10 days ago. Since then, no more performance problems.
The lesson? Before optimizing, make absolutely sure you are optimizing in the right place. Otherwise, you are most likely both wasting your time and introducing unnecessary bugs (most non-trivial optimizations introduce bugs).
I realized quite a while ago (a year ago, maybe) that it's possible to implement SPARQL on top of tolog without too much effort
Read | 2005-12-15 23:03
We've known for a long time that sooner or later we'd have to start supporting data types (numbers, dates, ...) in tolog, but so far we haven't done it
Read | 2006-05-10 23:17
Holger Hoffstätte - 2008-07-06 09:36:57
You could just use the classes from the concurrency backport lib (http://backport-jsr166.sourceforge.net/); it contains everything you need.