h : hsqldb-developers@lists.sourceforge.net 13 August 2009 • 11:11PM -0400

[Hsqldb-developers] Performance enhancement for com.
by Frost, Gary





I was profiling some code which is using HSQLDB (I profiled with
oprofile on Linux) and discovered (looking at the generated x64 code)
that the org.hsqldb.Like.compareAt() method was consuming a lot of CPU
cycles.



So referring to ...



http://hsqldb.cvs.sourceforge.net/viewvc/hsqldb/hsqldb/src/org/hsqldb/Like.java?revision=1.4&view=markup



Even though compareAt() is a clean recursive solution to wildcard
matching, unfortunately some recursive patterns don't get optimized
particularly well by the JIT. In this case the code is recursive and
accesses fields of the instance, and (as we may know) field accesses are
slower than stack accesses.



I may try to refactor this into a non-recursive solution (one that also
avoids field accesses); however, my first experiment yielded a 10%
improvement on the application I was using, so I figured I would pass it
along as a suggestion.



If we widen the compareAt() API so that the instance fields are passed as
arguments, the JIT optimizer is likely to assign the array references to
registers, which can then be kept there throughout the call sequence.
Because the fields are never modified and the JVM no longer has to keep
reloading them from memory, this yields an improvement in performance,
especially as the code recurses deeper into the match.



So my suggested change is to change



private boolean compareAt(String s, int i, int j, int jLen)



to



private boolean compareAt(String s, int i, int j, int jLen, char[] cLike, int[] iType)





and then to widen the call site (in compare(String)) from



return compareAt(s, 0, 0, s.length());

to

return compareAt(s, 0, 0, s.length(), cLike, iType);
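
To make the shape of the change concrete, here is a minimal sketch of a recursive wildcard matcher written in the widened form. It is not the actual Like.java: the class name LikeSketch, the three type codes and the simplified matching loop are assumptions made for illustration (there is no escape handling, for instance). Only the compareAt() signature and the compare() call site follow the change suggested above.

```java
// Illustrative sketch only; not the HSQLDB implementation.
final class LikeSketch {

    // Hypothetical type codes; the real values in Like.java may differ.
    static final int LITERAL         = 0;
    static final int UNDERSCORE_CHAR = 1; // '_' matches exactly one character
    static final int PERCENT_CHAR    = 2; // '%' matches any run of characters

    private final char[] cLike; // pattern characters
    private final int[]  iType; // type code for each pattern character

    LikeSketch(String pattern) {
        cLike = pattern.toCharArray();
        iType = new int[cLike.length];
        for (int k = 0; k < cLike.length; k++) {
            if (cLike[k] == '_') {
                iType[k] = UNDERSCORE_CHAR;
            } else if (cLike[k] == '%') {
                iType[k] = PERCENT_CHAR;
            } else {
                iType[k] = LITERAL;
            }
        }
    }

    boolean compare(String s) {
        // The fields are read once here and handed down as arguments.
        return compareAt(s, 0, 0, s.length(), cLike, iType);
    }

    // i indexes the pattern, j indexes the string. The arrays arrive as
    // parameters, so the recursion never has to go back to the instance.
    private boolean compareAt(String s, int i, int j, int jLen,
                              char[] cLike, int[] iType) {
        for (; i < cLike.length; i++) {
            switch (iType[i]) {
                case LITERAL:
                    if (j >= jLen || cLike[i] != s.charAt(j++)) {
                        return false;
                    }
                    break;
                case UNDERSCORE_CHAR:
                    if (j++ >= jLen) {
                        return false;
                    }
                    break;
                case PERCENT_CHAR:
                    if (++i >= cLike.length) {
                        return true; // trailing '%' matches the rest
                    }
                    // Try every possible split point, including the empty tail.
                    for (; j <= jLen; j++) {
                        if (compareAt(s, i, j, jLen, cLike, iType)) {
                            return true;
                        }
                    }
                    return false;
            }
        }
        return j == jLen; // pattern used up: match only if the string is too
    }
}
```

Whether the JIT actually keeps the two references in registers depends on the JVM and the hardware, so this is something to measure rather than assume.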


Note that the code body of compareAt() is not changed: we are relying on
the fact that the parameters cLike and iType hide the fields of the same
name (I was too lazy to rename everything).

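For anyone unsure why the unchanged body picks up the arguments, the small sketch below shows the Java name-resolution rule being relied on. The class ShadowSketch and its two methods are hypothetical names invented for this example: an unqualified name resolves to the parameter, and the field stays reachable only through this.

```java
// Illustrative sketch only; the names here are made up for the example.
final class ShadowSketch {

    private final char[] cLike = {'f', 'i', 'e', 'l', 'd'};

    // The unqualified name cLike refers to the parameter, not the field.
    char firstUnqualified(char[] cLike) {
        return cLike[0];
    }

    // The field is still reachable, but only when qualified with "this".
    char firstQualified(char[] cLike) {
        return this.cLike[0];
    }
}
```

So a method body that only ever uses the unqualified names switches from the fields to the arguments as soon as the parameters are added.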
On some local microbenchmarks on a 24-core machine (it's nice working
at AMD ;) ) I have observed a 57% improvement using this code
transformation. On my laptop I see a few percent, but of course every
little bit helps.

If you guys have a performance benchmark/regression suite that you use
to measure performance regressions I would be interested in hearing what
kind of performance delta you observe.

Of course I would welcome comments/suggestions.

Gary  













