Performance testing shared vs. static libs
2013-01-22 17:20
I wrote these tests to see how much slower it is to call into a shared library than into a static library. My intuition says that shared is slower, of course, but by how much?

These timings call a routine that simply increments an integer (it's only 8 instructions total), so the differences shown here are pretty much worst-case. For example, if my test says that calling a shared library is 10% slower, but your routine is 80 instructions long (10x larger than mine), then for you, calling a shared library routine would only be about a 1% performance hit.

The tests shown below were run on Linux 2.6, gcc 3.3.3 on an Athlon 1800. You can retrieve the test code from funcptrs-0.2.tar.gz on:

    http://www.rinspin.com/bronson/code/gcc/

COMPILED RESULTS

Shared vs. static:

Calling a shared library routine is 1-20% slower, with the typical performance hit somewhere around 15%:

    direct to shared library:   -O0=5%,  -O1=20%, -O2=20%, -O3=1%,  -Os=10% slower
    indirect to shared library: -O0=10%, -O1=15%, -O2=15%, -O3=15%, -Os=10% slower

PIC (position-independent code) vs. position-dependent code:

-fPIC doesn't affect the speed of the direct function call at all (except for -Os, where it's 5% slower). However, when calling via function pointer, -fPIC causes a 0 to 30% performance hit over position-dependent code:

    -O0=0%, -O1=10%, -O2=20%, -O3=30%, -Os=15% slower

Static linking vs. compiling directly:

As you would expect, statically linking to a routine in a library usually gives exactly the same performance as compiling the routine directly into your program. There are some exceptions, however: -O1 is 5% slower and -Os is 5% faster. This seems really weird to me. Why would it be any different at all?

CONCLUSION

Yes, calling a shared library routine is slower. But not much. For trivial functions, it might be 30% slower worst case, 15% typical, depending on your code and optimization level.
For real-world functions, as long as they're not used in the innermost loops, the delay caused by calling into a shared library is negligible.

    - Scott

DATA:

For -O0:

Directly calling a shared library routine is 5% slower than calling it statically.
Indirect function call (5% slower than direct): calling into a shared library is 10% slower than calling statically (i.e. calling a shared library function indirectly is 15% slower than calling a static library function directly).

              null: min=0.35936 max=0.37501 avg=0.36353   44.917%
            direct: min=0.80486 max=0.82058 avg=0.80934  100.000%
          dirshare: min=0.82967 max=0.87794 avg=0.84628  104.563%
         dirstatic: min=0.80114 max=0.80580 avg=0.80432   99.379%
      dirpicstatic: min=0.80263 max=0.82554 avg=0.80936  100.002%
          indirect: min=0.83391 max=0.85256 avg=0.84431  104.320%
        indirshare: min=0.91188 max=0.95370 avg=0.93296  115.273%
       indirstatic: min=0.83172 max=0.86983 avg=0.84892  104.889%
    indirpicstatic: min=0.83234 max=0.87370 avg=0.84546  104.462%

For -O1:

Calling a shared library routine is 20% slower than calling it statically (for both direct and indirect).
Directly calling a static library routine is 5% slower than calling a routine that has been directly compiled in (?!). This is reproducible.
For some reason, PIC causes a 10% performance hit in the indirect call, but not in the direct call!

              null: min=0.24641 max=0.25326 avg=0.24962   48.134%
            direct: min=0.51562 max=0.51982 avg=0.51859  100.000%
          dirshare: min=0.62994 max=0.63484 avg=0.63222  121.911%
         dirstatic: min=0.54484 max=0.54833 avg=0.54584  105.254%
      dirpicstatic: min=0.54667 max=0.55310 avg=0.55051  106.156%
          indirect: min=0.53638 max=0.55393 avg=0.54647  105.377%
        indirshare: min=0.63173 max=0.64108 avg=0.63423  122.299%
       indirstatic: min=0.51710 max=0.52236 avg=0.51930  100.137%
    indirpicstatic: min=0.57687 max=0.57940 avg=0.57802  111.460%

For -O2:

Calling a shared library routine is 20% slower than calling it statically (for both direct and indirect).
Unlike at -O1, directly calling a static library routine takes the same time as calling a routine that has been directly compiled in (dirstatic is 99.991% of direct).
For some reason, PIC causes a 20% performance hit in the indirect call, but not in the direct call!

              null: min=0.25051 max=0.25293 avg=0.25157   48.603%
            direct: min=0.51552 max=0.52036 avg=0.51761  100.000%
          dirshare: min=0.63215 max=0.63767 avg=0.63372  122.432%
         dirstatic: min=0.51592 max=0.51873 avg=0.51756   99.991%
      dirpicstatic: min=0.51541 max=0.52058 avg=0.51775  100.027%
          indirect: min=0.51572 max=0.52056 avg=0.51831  100.136%
        indirshare: min=0.60469 max=0.60900 avg=0.60630  117.135%
       indirstatic: min=0.51512 max=0.51920 avg=0.51735   99.950%
    indirpicstatic: min=0.63046 max=0.64098 avg=0.63390  122.466%

For -O3:

It doesn't matter whether you're calling a shared library, a static library, or your own code, with -fPIC or without: all direct calls are within about 1% of each other.
We do see that indirectly calling your own code or a static library goes 10% faster than direct (as found in the previous battery of tests).
Indirectly calling a shared library takes 15% longer than indirectly calling a static library (and 5% longer than directly calling either static or shared).
PIC continues to cause a 20% performance hit for indirect.

              null: min=0.24947 max=0.25187 avg=0.25082   43.595%
            direct: min=0.57265 max=0.57676 avg=0.57534  100.000%
          dirshare: min=0.58120 max=0.58617 avg=0.58403  101.509%
         dirstatic: min=0.57705 max=0.59201 avg=0.58076  100.942%
      dirpicstatic: min=0.57274 max=0.57839 avg=0.57535  100.001%
          indirect: min=0.51557 max=0.51962 avg=0.51780   89.999%
        indirshare: min=0.60617 max=0.60948 avg=0.60813  105.699%
       indirstatic: min=0.51572 max=0.51917 avg=0.51813   90.056%
    indirpicstatic: min=0.62970 max=0.63533 avg=0.63271  109.970%

For -Os:

Calling a shared library takes 10% longer than static (both direct and indirect).
Calling a static library is 5% faster than calling direct code?!?!
-fPIC causes a 5% performance hit direct and a 15% performance hit indirect.

              null: min=0.25008 max=0.25290 avg=0.25153   43.712%
            direct: min=0.57253 max=0.57886 avg=0.57543  100.000%
          dirshare: min=0.62959 max=0.63425 avg=0.63178  109.793%
         dirstatic: min=0.54431 max=0.54895 avg=0.54653   94.977%
      dirpicstatic: min=0.57430 max=0.58078 avg=0.57699  100.270%
          indirect: min=0.57590 max=0.57842 avg=0.57706  100.283%
        indirshare: min=0.63282 max=0.63675 avg=0.63517  110.382%
       indirstatic: min=0.57224 max=0.57681 avg=0.57524   99.966%
    indirpicstatic: min=0.65924 max=0.66185 avg=0.66071  114.819%
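
As a rough illustration of the arithmetic behind these numbers, the C sketch below shows how the percentage column can be derived (each average divided by the "direct" average for the same optimization level) and how the 8-instruction vs. 80-instruction rule of thumb from the introduction works out. The function names are hypothetical and are not taken from funcptrs-0.2.tar.gz. The scaling rule assumes the call overhead is a fixed cost and that run time grows linearly with instruction count, which is only an approximation.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage column of the DATA tables: a timing expressed relative to
 * the "direct" row of the same optimization level (direct = 100%). */
double relative_pct(double avg, double direct_avg)
{
    assert(direct_avg > 0.0);
    return 100.0 * avg / direct_avg;
}

/* Rule of thumb from the introduction. ASSUMPTION: the call overhead is
 * a fixed cost and run time is proportional to instruction count, so the
 * relative hit shrinks as the called routine grows. */
double scaled_overhead_pct(double measured_pct, double test_instrs,
                           double your_instrs)
{
    assert(your_instrs > 0.0);
    return measured_pct * test_instrs / your_instrs;
}

/* Prints two worked examples using figures quoted in this post. */
void print_examples(void)
{
    /* -O0 dirshare average vs. -O0 direct average */
    printf("dirshare at -O0: %.1f%% of direct\n",
           relative_pct(0.84628, 0.80934));
    /* 10% hit measured on an 8-instruction routine, scaled to 80 */
    printf("80-instruction routine: %.1f%% hit\n",
           scaled_overhead_pct(10.0, 8.0, 80.0));
}
```

Calling print_examples() from your own main() reproduces the -O0 dirshare figure to within rounding (the tables were presumably computed from unrounded averages) and the 1% estimate from the introduction.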