Fix the compliation error *for real* ... and restore performance
Commit 8bf0597dde55e4c94a2ff39f1d8130902e3d7a9b (titled as 'Fixed the parser
such that it can read in user defined classes again.') made the parser a bit
slower, especially if the input contains many terms of user-defined type.
The reason for that is quite simple: we throw and catch an exception every
time we construct an object of user-defined type:
// dirty hack to distinguish between serial numbers of functions and real
// pointers.
GiNaC::function* f = NULL;
try {
unsigned serial = (unsigned)(unsigned long)(void *)(reader->second);
f = new GiNaC::function(serial, args);
}
catch ( std::runtime_error ) {
if ( f ) delete f;
ex ret = reader->second(args);
return ret;
}
Fortunately functions are aligned and we can use much more efficient
technique to distinguish between serial and pointers to functions.
This patch is an ugly hack that does the same as the commit f38cbcd651246fb5c1294705d29399f3cbfddaf5
but without changing the ABI (so it can be used in ginac_1-5).
Added include for cstdio header.
gcc 4.0 does not include cstdio as part of other headers like string
anymore. As a result, gcc 4.0 complained about EOF being undefined.
Due to the strange (although permitted by the standard) behaviour of C++
RTTI on woe32 calchash() returns different hash values for equal objects.
As a result automatic evaluation gets spectacularly broken:
examining clifford objects.....({1+t,2+x,3+y,4+z}) - ({1+t,2+x,3+y,4+z}) erroneously returned -{1+t,2+x,3+y,4+z}+{1+t,2+x,3+y,4+z} instead of 0
({1+t,2+x,3+y,4+z}) - ({1+t,2+x,3+y,4+z}) erroneously returned -{1+t,2+x,3+y,4+z}+{1+t,2+x,3+y,4+z} instead of 0
.({1+t,2+x,3+y,4+z}) - ({1+t,2+x,3+y,4+z}) erroneously returned -{1+t,2+x,3+y,4+z}+{1+t,2+x,3+y,4+z} instead of 0
({1+t,2+x,3+y,4+z}) - ({1+t,2+x,3+y,4+z}) erroneously returned {1+t,2+x,3+y,4+z}-{1+t,2+x,3+y,4+z} instead of 0
[skipped]
.......FAIL: exam_clifford.exe
This patch works around `features' of woe32 RTTI, so calchash() works
properly.
Univariate GCD timing: use sr_gcd when appropriate.
The benchmark consists of two parts:
1) timing of different GCD calculation methods (i.e. subresultant PRS,
heuristic, chinese remaindering).
2) timing of different implementations of the same method. The purpose
is to find out how (in)efficient GiNaC::ex is as a representation
of (univariate) polynomials (as a side note, the result is a bit
depressing -- using coefficient vector instead of GiNaC:ex makes
GCD calculation 50x -- 1000x faster).
Now GiNaC uses modular (chinese remaindering) GCD by default, so part 2)
got broken, i.e. instead of (intended) timings
a) (heuristic, GiNaC::ex) versus (heuristic, coefficient vector)
b) (PRS, GiNaC::ex) versus (PRS, coefficient vector)
one gets
a') (heuristic, GiNaC::ex) versus (heuristic, coefficient vector)
b') (chinese remaindering, GiNaC::ex) versus (PRS, coefficient vector)
Set the gcd_options::use_sr_gcd to restore the meaning of the benchmark.
Note: this patch does not affect the library proper.
Jens Vollinga [Fri, 6 Feb 2009 15:07:10 +0000 (16:07 +0100)]
Prettified source code.
- Added copyright and GPL licencing to new files.
- Increased year to 2009.
- Changed guarding macros in header to uniform pattern without leading or
trailing __ (double underscores).
- Put includes of system wide header consistently below own includes (help
a tiny bit to clarify dependencies).
Implement modular multivariate GCD (based on chinese remaindering algorithm).
Both algorithm and implementation are a bit inefficient:
* algorithm does not make use of sparseness of inputs (and most of
multivariate polynomials are quite sparse)
* GiNaC's expressions are essentially immutable. Thus some simple
operations (i.e. multiplying a polynomial by an element of the base ring)
are prohibitively expensive.
* All numbers (i.e. GiNaC::numeric objects) are heap allocated.
Still it's much faster (~5x on bivariate polynomials, ~100x on 3-variate
ones) than (subresultant) PRS algorithm, so gcd() uses modular algorithm
by default.
Jens Vollinga [Thu, 18 Dec 2008 10:10:50 +0000 (11:10 +0100)]
Fixed bug in multivariate factorization. Fractional numerical content was not
extracted and led to a not so finite running behavior. Added new checks for that
bug.
Added static keywords to hide debugging output operators.
The libtool naming scheme cannot guarantee that on all systems, the numbering
is consecutive. It only guarantees that it is increasing. This doesn't matter,
though: there is not incurred cost for numbers that are omitted, except for
shrinking the available space of leftover numbers. Not something we need to
worry about yet. ;-) On the other hand, tricks which we were using to make
version numbers consecutive on GNU/Linux break versioning on other OSes.
Jens Vollinga [Thu, 13 Nov 2008 10:13:05 +0000 (11:13 +0100)]
Fixed lots of bugs in factor_multivariate().
factor_univariate() takes now the content of the polynomial into accout when
choosing a prime. It also returns the chosen prime. This supports the
factor_multivariate() code.
Polynomials are expanded before calling factor_sqrfree(). This avoids problems
with some input polynomials.
Jens Vollinga [Mon, 10 Nov 2008 12:38:11 +0000 (13:38 +0100)]
Added modular square free factorization.
Completed distinct degree factorization.
Univariate polynomial factorization uses now upoly.
Merged class Partition and function split into class factor_partition.
Jens Vollinga [Wed, 5 Nov 2008 10:22:19 +0000 (11:22 +0100)]
Changed code from using cl_UP_MI to using umodpoly. The cl_UP_MI interface was
inconvenient to use and caused several very difficult bugs (some were still
unresolved before changing to umodpoly!).
Fixed severe bug in multivariate factorization. The condition that all modular
factors should be relatively prime in the base ring was violated. This caused
the factorization sometimes to go into an infinite loop.
symbol: remove return_type_tinfo() and return_type() (shrink symbol by 8 bytes)
Bloating symbol to the state sizeof(symbol) == 64 is not appreciated at all.
If someone needs/wants non-commutative symbols or other fancy stuff, please
derive from class symbol and do whatever you want.
return_type_tinfo() is used in order to distingish between non-commutative
objects of different type. However, often it's necessary to distingish
between non-commutative objects of the same type, for example, between
gamma matrices with different representation label. return_type_tinfo()
does not provide any clean way to do that. Hence, introduce return_type_t
type which holds representation label along with type information, and
use it for return_type_tinfo().
calchash(): use type_info::name() instead of tinfo().
Custom RTTI considered harmful, part 4.
The hash value of the object of different types should be different whenever
possible. Hence calcash() needs a unique per type number. Previously we used
tinfo_key for this. typeinfo::name() (a *pointer* to implementation dependent
string) is also unique for each class, so it's just as good as tinfo() is.
basic, expairseq: use standard C++ RTTI in compare(), is_equal(), match(), etc.
Custom RTTI considered harmful, part 2.
* basic.cpp: use standard C++ RTTI in compare(), is_equal(), operator=(),
match().
* expairseq.cpp: use standard C++ RTTI in match(), make_flat(),
construct_from_2_ex().
is_exactly_a: use typeid() to check the type of expression.
Custom RTTI considered harmful, part 1.
Custom run time type information (RTTI) system implemented in GiNaC have
several serious drawbacks, such as
1. It makes writing GiNaC classes unnecessary cumbersome.
2. It wastes sizeof(void *) bytes per object, for small objects like
symbol, numeric, etc. this overhead is considerable.
It turns out that GiNaC's RTTI is not any faster than the standard one
(at least on GNU/Linux and Solaris with GNU C++ library and compiler).
So, GiNaC RTTI have no reasons to exit any more.
ptr.h: use unsigned int to store reference counts (in order to save memory).
We'll hardly ever get more than 2^32 references to the same object. So using
size_t to store reference count only wastes 4 bytes per object (on 64-bit
architecture). Use unsigned int instead.
1. Bad performance. Parsing a sum seems to be O(N^{2 + a}) (a > 0).
For example, parsing a sum (actually, a univariate polynomial) of 32768
terms takes about 90 sec. on my box, parsing a sum of 10^6 terms takes
"eternity".
2. The user is expected to provide list of all symbols in the expression.
Often this is very annoying (and useless), sometimes it is not possible
at all.
3. Parser is not reentrant (bison *can* produce reentrant parsers, but that
won't solve other problems).
4. Parser is difficult to extend.
Since the new parser handles almost everything (useful) as the old one, let's
remove the latter.
check: time_parser.cpp: don't run the same benchmark twice.
Since ex(const string&, lst&) ctor uses the new parser now comparing its
performance (and result) with one of direct invocation of the parser is
pointless.
parser: add necessary checks to operator() to stop accepting nonsense.
Since the parser is recursive parse_* methods (in particular,
parse_binop_rhs()) does NOT check if the last unparsed token is valid.
Thus parse_expression() stops if it encounters an unknown token. That's
why parser::operator() needs to make sure nothing is left in the input
stream.
parser: change order of the constructor optional arguments.
Functions/methods having multiple optional arguments are not very convenient
to use in C++ (to put it mildly). Shuffle parser ctor arguments so not so
frequently used argument is the last one.
parser: allow read/write access to symbol table and strictness.
Intended usage:
parser reader;
ifstream input_file1, input_file2;
// read the first file...
ex e1 = reader(input_file1);
// ... add extra entry into the symbol table used by parser
symbol x;
parser.get_syms()["x"] = x;
// Disable the parser to introduce new symbols, e.g. to ensure the expression
// in input_file2 contains the same symbols as e1 (read from input_file1).
parser.strict = true;
ex e2;
try {
e2 = reader(input_file2);
} catch (...) {
abort();
}
G_numeric: use cl_N and int to manipulate numbers (instead of ex).
Convert helper functions G_do_hoelder and G_do_trafo in the same manner,
update all call sites. This should speed up numerical calculation of
multiple polylogarithms a little bit, and reduce the memory usage.
G_numeric: put convergence/acceleration transofrmations into helper functions.
This is simple code move, everything else should be considered a bug.
The helper functions (as well as G_numeric itself) will be improved by
subsequent patches.