[CLN-list] Linking against cln fails when built with link-time-optimization

Alexey Sheplyakov asheplyakov at yandex.ru
Sat Sep 26 21:26:22 CEST 2020


Hello,

22.09.2020, 03:53, "Atri Bhattacharya" <badshah400 at opensuse.org>:

> To follow up on this, we found that the problem was us building the cln
> library with "-fvisibility-inlines-hidden" in combination with LTO.
> Turning off the "-fvisibility-inlines-hidden" flag has fixed this issue
> (even with LTO enabled). 

Not exactly. The problem is *us* breaking the standard (ISO C++98) for performance reasons.

Consider include/cln/rational.h

19
20 extern const cl_I denominator (const cl_RA& r);
21

and src/rational/cl_RA.h

149 inline const cl_I denominator (const cl_RA& r)
150 {
151         if (integerp(r))
152                 return 1;
153         else
154                 return TheRatio(r)->denominator;
155 }

Notice that the same function is declared non-inline in the public header, however
it's implemented as inline. That's because we don't want to expose too many
implementation details, and yet we want to inline that function in CLN itself.

With '-fvisibility-inlines-hidden' the compiler can assume that those inline versions
are not used outside of the CLN. Thus if LTO is enabled the compiler can find out all
call sites, inline the function eliminate the out-of-the-line copy. As a result
`cln::denominator` won't be exported from libcln.so

ISO C++ 98 standard demands (in 7.1.2.4) that "If a function with external linkage
is declared inline in one translation unit, it shall be declared inline in all translation
units in which it appears; no diagnostic is required." So our approach (having
an inline and non-inline function with the same name and argument(s)) violates
the standard and has caused quite a number of linking failures in the past.
The issue has been partially addressed by commit ce250e91fb8d16bc6b35be0add1896fc64f31ec1.
However there are still lots of non-conforming code, in particular cln::denominator.


In general the need to inline a function in some places and not in others came
from three situations:

1) Some functions, like cl_SF_zerop or cl_FF_zerop, are just a single
    machine instruction when inlined. Putting their definitions into a public
    header would expose too many internals of the library, leading violation
    of abstraction and increased compilation times. Still, it would be nice
    to use the inline version of these functions in the library itself.

2) Some functions, like cl_{SF,FF,DF,LF}_idecode, are usually only
    invoked through a dispatcher cl_F_idecode that does nothing but dispatch
    the call to the right function. Here inlining is used, regardless of
    the size of the inlined functions, because it removes one function call
   from the chain of function calls. A compiler cannot know that this
   caller is the main caller for the 4 inlined functions (unless LTO is used)
 
3) Similarly, cl_I_from_NDS would be a bottleneck if not inlined: every
    creation of a new cl_I goes through this function. A compiler cannot
    know a priori the bottlenecks (unless LTO is used)

To solve the problem properly we should either

1) Stop playing tricks and use the LTO. The bug report being discussed kind of proves that
    LTO actually works these days.
2) Introduce `denominator_inline` and `denominator` and use the former in CLN itself
    (similarly to ce250e91fb8d16bc6b35be0add1896fc64f31ec1).


Best regards,
      Alexey



More information about the CLN-list mailing list