Introduction
After implementing a basic dummy pass in Stage 1, and exploring the different concepts that will be involved in Stage 2, it’s time to experiment with the pass. This post will tackle accomplishing the first requirement, identifying function clones.
The goal: Identify function clones, by comparing base names.
Understanding the Flow
Looking at the dummy pass from Stage 1, I can already see there is a difference in how the pass should operate with the new requirements. While the dummy pass simply iterated through the current function, analyzing it, to achieve the goals of this stage, the pass will require knowledge of all the functions that the pass applied to. This implies the use of a static
method or variable, that will keep track of all the data that needs to be persisted between functions.
// Static structure to hold function clone names
static std::unordered_map<std::string, std::vector<std::string>> function_clones;
This structure will hold the function’s base name as the key, and a vector of function names as the value. Each variant’s full name will be stored in that collection, attached to the cooresponding base name.
The current function being analyzed can be accessed through the fun
argument, from the execute
method. The current plan is as follows:
- Get the function’s base name.
- Check if the base name has a cooresponding clone in
function_clones
. - If a clone is already recorded, output a “clone found” message.
- Add the new variant to the structure.
Implementation Attempt
To extract the function name from the function * fun
argument passed to the execute()
method of the pass, I used the method shown by professor Chris Tyler, Seneca Polytechnic.
std::string name = IDENTIFIER_POINTER(DECL_NAME(fun->decl));
This gets the function’s declaration, extracts its identifier node with DECL_NAME
, then converts it to a string with IDENTIFIER_POINTER
.
Now, I need to extract the base name. A function’s name structure looks something like this: base.variant
. The base name is simply the substring of the name, up to (not including) the first period (if it exists, as some functions don’t have a “variant”).
std::string base = name.find_first_of(".") != std::string::npos
? name.substr(0, name.find_first_of("."))
: name;
Now I will use my base name to query my function_clones
unordered_map, and if there are existing records with the same base – I will output a message. In any scenario, I will record the function on the structure.
auto it = function_clones.find(base_name);
// If this base name already has registered clones, print them
if (it != function_clones.end()) {
for (const auto& variant : it->second)
fprintf(dump_file, "CLONE IDENTIFIED: %s\n", variant.c_str());
fprintf(dump_file, "CURRENT: %s\n\n", name.c_str());
}
// Add this function name to the list of variants for the base function_clones[base_name].push_back(name);
After implementing these functionallties, I realized I missed an important step:
resolver
functions shouldn’t be treated as “clones”, or “variants”, even if the full name looks like one:function.resolver
.To account for that case, I added :
if (name.find(".resolver") != std::string::npos) return 0;
Here is the full pass, implementing all the logic outlined above, with added diagnostic print statements:
/* efagin pass - identifies function clones
Emily Fagin - Seneca Polytechnic College
This file is part of GCC
GCC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#include <unordered_map>
#include <string>
#include <vector>
#include <stdlib.h>
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-iterator.h"
#include "internal-fn.h"
#include "cgraph.h"
#include "function.h"
#include "basic-block.h"
namespace {
// Pass metadata
const pass_data pass_data_efagin =
{
GIMPLE_PASS, /* type */
"efagin", /* name */
OPTGROUP_NONE, /* optinfo_flags */
TV_NONE, /* tv_id */
PROP_cfg, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
0, /* todo_flags_finish */
};
/* Currently, this pass detects function clones during compilation.
It identifies variants using the base name of the function, and outputs the analysis.
*/
class pass_efagin : public gimple_opt_pass
{
public:
pass_efagin (gcc::context *ctxt)
: gimple_opt_pass (pass_data_efagin, ctxt)
{}
/* gate function: always execute */
bool gate (function *) final override { return 1; }
unsigned int execute (function *) final override;
private:
/* Stores function names to identify clones*/
static std::unordered_map<std::string, std::vector<std::string>> function_clones;
};
// Initialize static member
std::unordered_map<std::string, std::vector<std::string>> pass_efagin::function_clones;
unsigned int pass_efagin::execute(function* fun)
{
// If dump file not enabled, exit
if (!dump_file || !fun)
return 0;
// Get the full name of the function
std::string name = IDENTIFIER_POINTER(DECL_NAME(fun->decl));
fprintf(dump_file, "-------------Examining Func: %s--------\n", name.c_str());
// Skip resolver functions
if (name.find(".resolver") != std::string::npos) {
fprintf(dump_file, "-------------End of Diagnostic (resolver function)-------------\n\n");
return 0;
}
// Extract base function name (remove everything after first dot)
std::string base_name = name.find_first_of(".") != std::string::npos
? name.substr(0, name.find_first_of("."))
: name;
auto it = function_clones.find(base_name);
// If this base name already has registered clones, print them
if (it != function_clones.end()) {
for (const auto& variant : it->second)
fprintf(dump_file, "CLONE IDENTIFIED: %s\n", variant.c_str());
fprintf(dump_file, "CURRENT: %s\n\n", name.c_str());
}
// Add this function name to the list of variants for the base function
function_clones[base_name].push_back(name);
fprintf(dump_file, "-------------End of Diagnostic-------------\n\n");
return 0;
}
} // anonymous namespace
gimple_opt_pass *
make_pass_efagin (gcc::context *ctxt)
{
return new pass_efagin (ctxt);
}
After modifying the tree-efagin.cc
pass file, I re-built the gcc compiler, and installed it.
cd ~/gcc-build-001
(time make -j 24) |& tee build.log
# ...
# real 0m43.384s
# user 1m3.901s
# sys 0m21.201s
make install
# ...
# Libraries have been installed in:
# /home/efagin/gcc-test-001/lib/../lib64
# ...
To test my pass, I will use the tester modules provided by professor Chris Tyler (Seneca Polytechnic). The tgz
file is located in the public
folder. I unpacked it in my home directory:
tar xzvf ~/../../public/spo600-test-clone.tgz -C ./
The README.md file reads like this:
This file will build on Linux systems with
GCC 14 or higher and x86_64 or aarch64
architectures.
To build the test cases, run 'make'
The output will be:
test-clone-ARCH-prune
This file has two implementations
of the scale_samples function, which
are effectively identical.
test-clone-ARCH-noprune
This binary has two implementations
of the scale_sample function, which
are significantly different.
In both cases, ARCH will be replaced with
the string 'aarch64' or 'x86' according to
the system on which the build was performed.
To enable GCC diagnostic dumps, define the
DUMP_ALL variable to have a non-null value.
See the comments in the Makefile.
-Chris Tyler 2024-11-20 - Version 0.1
Following the instructions specified here, I enabled the dump file for my pass by adding -fdump-tree-efagin
to the CFLAGS. Now all that’s left is to make
in the ./spo600/examples/test-clone
directory, and examine
# Go to test directory
cd ./spo600/examples/test-clone
# Build test cases
make
# View output for both identical and non-identical clone tests
cat *noprune*efagin*
cat *-prune*efagin*
Analyzing Output
At this point in the project, the output for both test cases is similar. I will show the output for the “prune” test case, but it does work for “noprune” as well. The full dump file can be accessed here. I summarized the output (900 lines),
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)
-------------Examining Func: scale_samples--------
-------------End of Diagnostic-------------
__attribute__((target ("default"), target_clones ("default", "popcnt")))
void scale_samples (int16_t * in, int16_t * out, int cnt, int volume)
{
sizetype ivtmp.33;
sizetype ivtmp.28;
# .... (361 lines)
return;
}
;; Function scale_samples.popcnt (scale_samples.popcnt, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)
-------------Examining Func: scale_samples.popcnt--------
CLONE IDENTIFIED: scale_samples
CURRENT: scale_samples.popcnt
-------------End of Diagnostic-------------
__attribute__((target ("popcnt"), target_clones ("default", "popcnt")))
void scale_samples.popcnt (int16_t * in, int16_t * out, int cnt, int volume)
{
sizetype ivtmp.67;
sizetype ivtmp.63;
# ... (361 lines)
return;
}
;; Function sum_sample (sum_sample, funcdef_no=22, decl_uid=3942, cgraph_uid=23, symbol_order=22)
-------------Examining Func: sum_sample--------
-------------End of Diagnostic-------------
int sum_sample (int16_t * buff, size_t samples)
{
unsigned long ivtmp.88;
vector(4) int vect_t_11.81;
# ... (53 lines)
return _28;
}
;; Function scale_samples.resolver (scale_samples.resolver, funcdef_no=27, decl_uid=3994, cgraph_uid=32, symbol_order=30)
-------------Examining Func: scale_samples.resolver--------
-------------End of Diagnostic (resolver function)-------------
__attribute__((returns_nonnull))
void * scale_samples.resolver ()
{
int _3;
# ... (12 lines)
return scale_samples;
}
;; Function main (main, funcdef_no=24, decl_uid=3961, cgraph_uid=25, symbol_order=24) (executed once)
-------------Examining Func: main--------
-------------End of Diagnostic-------------
int main ()
{
unsigned long ivtmp.109;
vector(4) int vect_t_16.102;
# ... (82 lines)
return 0;
}
As we see in the output, the pass correctly identified scale_samples.popcnt
as a clone of the scale_samples
function.
Testing on aarch64
While the pass was developed on the x86
machine, it was also tested in the aarch64
environment. I used git to apply and sync changes across the different environments. Aside from execution time, there were differences in how the pass behaved.
When attempting to compile the tester code with the pass, the following warnings were produced:
gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default","rng") ))'\
-march=armv8-a -g -O3 -fno-lto -ftree-vectorize -fdump-tree-efagin \
clone-test-core.c:28:6: warning: Function Multi Versioning support is experimental, and the behavior is likely to change [-Wexperimental-fmv-target]
28 | void scale_samples(int16_t *in, int16_t *out, int cnt, int volume) {
| ^~~~~~~~~~~~~
In order to create the tests, make
was configured to artificially create clones of specific version, using the target_clones
attribute. rng
and sve2
are ARM
-specific features, but Function Multi-Versioning (FMV) is only experimentally supported on aarch64
. Regardless, these warnings are not critical, and don’t interrupt execution.
Conclusion
The pass now successfully identifies function clones! The next step is to analyze the clones to find out if they are significantly different from eachother. Aside from adding new logic to support this functionallity, the function_clones
structure might change to hold different information. Analyzing and comparing the output of the “prune” and “noprune” tests will also be useful in identifying next steps.
Leave a Reply