[CLN-list] Does someone see a way to speed this up
bruno at clisp.org
Sat Dec 11 00:50:47 CET 2010
> I am using a lot of function calls, would passing constant pointers speed up
> my program significantly? ...
> I don't expect people to read the code, just look at the way I am using cln.
> Is this an efficient way?
Based on past experience, not on your program in particular, here's a list
of suggestions, ordered from the most promising to the least promising.
1) Use a profiler that displays the bottlenecks and relative consumption
of CPU time. On Linux/x86, the tool of choice is valgrind with its
callgrind tool <http://valgrind.org/docs/manual/cl-manual.html>, together
with kcachegrind for the visualization of the results.
2) Are you using excess precision in large parts of the computations?
Often, in order to get 20 digits of a result, you need 40 digits
throughout the computation if you choose a constant precision.
But sometimes you can actually do part of the computation with
22 digits and part with 40 digits, and the result will still have
20 digits of accuracy.
For analyzing this, you need to have a certain understanding of the
numerical stability of your algorithm. For example, when you are
evaluating integrals along a path around a pole, you use 40
digits in the computation and get 20 digits accuracy in the result;
but when you are evaluating an integral along a path around a
region with no singularity, then 40 digits in the computation normally
lead to slightly less than 40 digits accuracy in the result.
3) When computing integrals, use higher-order integration methods than
simple averaged summation, if the integrand has continuous derivatives.
If your integrand is infinitely differentiable, then Runge-Kutta
(the formula with weights 1/6, 4/6, 1/6, which for a plain integral is
Simpson's rule) is a big win, and higher-order Runge-Kutta methods are
even better.
4) Pull calls to transcendental functions out of loops if possible.
In CLN, elementary operations and sqrt are fast, whereas sin, exp,
log, etc. are significantly slower.
5) Pull common computations out of loops. The compiler won't do it
for you. Things like cl_float(2,prec)*PI are quite fast to compute,
but you may save a couple of percent by moving all these "trivial"
common subexpressions outside the loops.
6) Only then start to worry about whether you pass numbers by
"copy" (e.g. cl_N x) or by reference (e.g. const cl_N& x).
CLN numbers are references anyway, so by doing this you can save
the increment and decrement of a reference count, nothing else.
If you had many function calls and many small-sized bignums
and floats, you could save maybe at most 20% by using this trick.
It's more a micro-optimization than the other five points.
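To make point 3 concrete, here is a minimal sketch comparing simple
averaged summation (the trapezoid rule) with the 1/6, 4/6, 1/6 formula.
It is an illustration only: it uses plain double instead of CLN floats
so that it compiles without CLN, and the function names trapezoid and
simpson are made up for this example, not part of CLN.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Composite trapezoid rule on [a, b] with n panels ("simple averaged
// summation"). The error shrinks like h^2 for a smooth integrand.
double trapezoid(const std::function<double(double)>& f,
                 double a, double b, int n) {
    assert(n > 0);
    const double h = (b - a) / n;       // loop-invariant, computed once
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i)
        sum += f(a + i * h);
    return sum * h;
}

// Composite rule with weights 1/6, 4/6, 1/6 on the left end, midpoint
// and right end of each panel. The error shrinks like h^4 for a smooth
// integrand, and cubic polynomials are integrated exactly up to rounding.
double simpson(const std::function<double(double)>& f,
               double a, double b, int n) {
    assert(n > 0);
    const double h = (b - a) / n;       // loop-invariant, computed once
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x0 = a + i * h;
        sum += (f(x0) + 4.0 * f(x0 + 0.5 * h) + f(x0 + h)) / 6.0;
    }
    return sum * h;
}
```

Integrating sin over [0, pi] (exact value 2) with n = 8 panels, the
trapezoid sum is off by a few percent of a unit, while the 1/6, 4/6, 1/6
formula is off by roughly 1e-5. Written this way the second formula
evaluates the integrand about twice as often per panel, but at high
precision the faster convergence should outweigh that cost by far.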