Using a profiler and studying its results can show you how a few lines of code may drain a significant amount of performance. The following figure compares buffer-clearing approaches using callgrind and kcachegrind. As can be seen, the manual short int operation (7.46%) takes nearly twice as long as the integer operation (4.34%). This is a result of natural memory boundaries. Besides, the highly optimized std::memset (0.19%) beats both by a wide margin. The results are somewhat skewed by making the code observable, since it was built with debug information to enable callgrind, but the gain from std::memset is beyond question. A priori, I expected std::memset to dominate, but not to this extent. Avoiding BufferClear altogether (replacing the call with memset) will save even more by eliminating the method call. In such poorly written old code, five minutes of work may save you more than 5% of execution time.

[Figure: profile_result]
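The original BufferClear code is not shown here, so the following is only a minimal sketch of what the three compared variants might look like. The function names and signatures are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the profiled code; the real BufferClear
// is not shown in this post.

// Variant 1: clear the buffer one short int at a time.
void clear_as_short(short* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] = 0;
    }
}

// Variant 2: clear the buffer one int at a time.
// For the same number of bytes this loop typically runs fewer iterations,
// because an int is usually wider than a short.
void clear_as_int(int* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] = 0;
    }
}

// Variant 3: hand the whole job to std::memset.
// Note that the size is given in bytes, not in elements.
void clear_with_memset(void* buf, std::size_t bytes) {
    assert(buf != nullptr || bytes == 0);
    if (bytes != 0) {
        std::memset(buf, 0, bytes);
    }
}
```

A call such as `clear_with_memset(samples, sizeof samples)` on a plain array replaces the whole loop. Keep in mind that an optimizing compiler may turn such loops into a memset call on its own, so the gap measured in a debug build can shrink or disappear at higher optimization levels.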

Another surprise is the performance of a naive replacement with the std::fill() template.

[Figure: profile_fill_result]
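For reference, a naive std::fill() replacement might look like the sketch below. The function name is hypothetical, and the element type is assumed to be short to match the earlier comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical drop-in replacement for the clearing loop, using std::fill.
// The element type (short) is an assumption, not taken from the profiled code.
void clear_with_fill(short* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    // Assigns 0 to every element in the half-open range [buf, buf + count).
    std::fill(buf, buf + count, short{0});
}
```

Unlike std::memset, std::fill works in elements rather than bytes and assigns a value of the element type, so it stays correct for non-zero fill values and non-trivial types. How it performs depends on the standard library and on the optimization level of the build.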
