Stage 2.2: Identifying Clones

Introduction

After implementing a basic dummy pass in Stage 1, and exploring the different concepts that will be involved in Stage 2, it’s time to experiment with the pass. This post will tackle accomplishing the first requirement, identifying function clones.

The goal: Identify function clones, by comparing base names.

Understanding the Flow

Looking at the dummy pass from Stage 1, I can already see there is a difference in how the pass should operate with the new requirements. While the dummy pass simply iterated through the current function, analyzing it, to achieve the goals of this stage, the pass will require knowledge of all the functions that the pass applied to. This implies the use of a static method or variable, that will keep track of all the data that needs to be persisted between functions.

// Static structure to hold function clone names
static std::unordered_map<std::string, std::vector<std::string>> function_clones;

This structure will hold the function’s base name as the key, and a vector of function names as the value. Each variant’s full name will be stored in that collection, attached to the cooresponding base name.

The current function being analyzed can be accessed through the fun argument, from the execute method. The current plan is as follows:

  1. Get the function’s base name.
  2. Check if the base name has a cooresponding clone in function_clones.
  3. If a clone is already recorded, output a “clone found” message.
  4. Add the new variant to the structure.

Implementation Attempt

To extract the function name from the function * fun argument passed to the execute() method of the pass, I used the method shown by professor Chris Tyler, Seneca Polytechnic.

std::string name = IDENTIFIER_POINTER(DECL_NAME(fun->decl));

This gets the function’s declaration, extracts its identifier node with DECL_NAME, then converts it to a string with IDENTIFIER_POINTER.

Now, I need to extract the base name. A function’s name structure looks something like this: base.variant. The base name is simply the substring of the name, up to (not including) the first period (if it exists, as some functions don’t have a “variant”).

std::string base = name.find_first_of(".") != std::string::npos 
  ? name.substr(0, name.find_first_of(".")) 
  : name;

Now I will use my base name to query my function_clones unordered_map, and if there are existing records with the same base – I will output a message. In any scenario, I will record the function on the structure.

auto it = function_clones.find(base_name);

// If this base name already has registered clones, print them
if (it != function_clones.end()) {
  for (const auto& variant : it->second)
    fprintf(dump_file, "CLONE IDENTIFIED: %s\n", variant.c_str());
  fprintf(dump_file, "CURRENT: %s\n\n", name.c_str());
}
  
// Add this function name to the list of variants for the base   function_clones[base_name].push_back(name);

After implementing these functionallties, I realized I missed an important step: resolver functions shouldn’t be treated as “clones”, or “variants”, even if the full name looks like one: function.resolver.

To account for that case, I added :

if (name.find(".resolver") != std::string::npos)
  return 0;

Here is the full pass, implementing all the logic outlined above, with added diagnostic print statements:

/* efagin pass - identifies function clones
   Emily Fagin - Seneca Polytechnic College

This file is part of GCC

GCC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.

GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

#include <unordered_map>
#include <string>
#include <vector>  
#include <stdlib.h>
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-iterator.h"
#include "internal-fn.h"
#include "cgraph.h"
#include "function.h"
#include "basic-block.h"

namespace {

// Pass metadata
const pass_data pass_data_efagin =
{
  GIMPLE_PASS,    /* type */
  "efagin",       /* name */
  OPTGROUP_NONE,  /* optinfo_flags */
  TV_NONE,        /* tv_id */
  PROP_cfg,       /* properties_required */
  0,              /* properties_provided */
  0,              /* properties_destroyed */
  0,              /* todo_flags_start */
  0,              /* todo_flags_finish */
};

/* Currently, this pass detects function clones during compilation. 
   It identifies variants using the base name of the function, and outputs the analysis.
*/
class pass_efagin : public gimple_opt_pass
{
public:
  pass_efagin (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_efagin, ctxt)
  {}

  /* gate function: always execute */
  bool gate (function *) final override { return 1; }

  unsigned int execute (function *) final override;
	
private:  
  /* Stores function names to identify clones*/
  static std::unordered_map<std::string, std::vector<std::string>> function_clones;
	
};

// Initialize static member
std::unordered_map<std::string, std::vector<std::string>> pass_efagin::function_clones;

unsigned int pass_efagin::execute(function* fun)
{
  // If dump file not enabled, exit
  if (!dump_file || !fun)
    return 0;

  // Get the full name of the function
  std::string name = IDENTIFIER_POINTER(DECL_NAME(fun->decl));
  fprintf(dump_file, "-------------Examining Func: %s--------\n", name.c_str());
	
  // Skip resolver functions
  if (name.find(".resolver") != std::string::npos) {
    fprintf(dump_file, "-------------End of Diagnostic (resolver function)-------------\n\n");
    return 0;
  }

  // Extract base function name (remove everything after first dot)
  std::string base_name = name.find_first_of(".") != std::string::npos 
    ? name.substr(0, name.find_first_of(".")) 
    : name;
  
  auto it = function_clones.find(base_name);
 
  // If this base name already has registered clones, print them
  if (it != function_clones.end()) {
    for (const auto& variant : it->second)
      fprintf(dump_file, "CLONE IDENTIFIED: %s\n", variant.c_str());
    fprintf(dump_file, "CURRENT: %s\n\n", name.c_str());
  }
  
  // Add this function name to the list of variants for the base function
  function_clones[base_name].push_back(name);
  fprintf(dump_file, "-------------End of Diagnostic-------------\n\n");

  return 0;
}

} // anonymous namespace

gimple_opt_pass *
make_pass_efagin (gcc::context *ctxt)
{
  return new pass_efagin (ctxt);
}

After modifying the tree-efagin.cc pass file, I re-built the gcc compiler, and installed it.

cd ~/gcc-build-001

(time make -j 24) |& tee build.log
# ...
# real    0m43.384s
# user    1m3.901s
# sys     0m21.201s

make install
# ...
# Libraries have been installed in:
#    /home/efagin/gcc-test-001/lib/../lib64
# ...

To test my pass, I will use the tester modules provided by professor Chris Tyler (Seneca Polytechnic). The tgz file is located in the public folder. I unpacked it in my home directory:

tar xzvf ~/../../public/spo600-test-clone.tgz -C ./

The README.md file reads like this:

This file will build on Linux systems with
GCC 14 or higher and x86_64 or aarch64
architectures.

To build the test cases, run 'make'

The output will be:

  test-clone-ARCH-prune

    This file has two implementations
    of the scale_samples function, which
    are effectively identical.

  test-clone-ARCH-noprune

    This binary has two implementations
    of the scale_sample function, which
    are significantly different.

In both cases, ARCH will be replaced with
the string 'aarch64' or 'x86' according to
the system on which the build was performed.

To enable GCC diagnostic dumps, define the
DUMP_ALL variable to have a non-null value.
See the comments in the Makefile.

-Chris Tyler 2024-11-20 - Version 0.1

Following the instructions specified here, I enabled the dump file for my pass by adding -fdump-tree-efagin to the CFLAGS. Now all that’s left is to make in the ./spo600/examples/test-clone directory, and examine

# Go to test directory
cd ./spo600/examples/test-clone 
# Build test cases
make 

# View output for both identical and non-identical clone tests
cat *noprune*efagin*
cat *-prune*efagin*

Analyzing Output

At this point in the project, the output for both test cases is similar. I will show the output for the “prune” test case, but it does work for “noprune” as well. The full dump file can be accessed here. I summarized the output (900 lines),

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)

-------------Examining Func: scale_samples--------
-------------End of Diagnostic-------------

__attribute__((target ("default"), target_clones ("default", "popcnt")))
void scale_samples (int16_t * in, int16_t * out, int cnt, int volume)
{
  sizetype ivtmp.33;
  sizetype ivtmp.28;
  
  # .... (361 lines)
  
  return;

}


;; Function scale_samples.popcnt (scale_samples.popcnt, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)

-------------Examining Func: scale_samples.popcnt--------
CLONE IDENTIFIED: scale_samples
CURRENT: scale_samples.popcnt

-------------End of Diagnostic-------------

__attribute__((target ("popcnt"), target_clones ("default", "popcnt")))
void scale_samples.popcnt (int16_t * in, int16_t * out, int cnt, int volume)
{
  sizetype ivtmp.67;
  sizetype ivtmp.63;

  # ... (361 lines)
	
  return;

}


;; Function sum_sample (sum_sample, funcdef_no=22, decl_uid=3942, cgraph_uid=23, symbol_order=22)

-------------Examining Func: sum_sample--------
-------------End of Diagnostic-------------

int sum_sample (int16_t * buff, size_t samples)
{
  unsigned long ivtmp.88;
  vector(4) int vect_t_11.81;
 
  # ... (53 lines)
 
  return _28;

}


;; Function scale_samples.resolver (scale_samples.resolver, funcdef_no=27, decl_uid=3994, cgraph_uid=32, symbol_order=30)

-------------Examining Func: scale_samples.resolver--------
-------------End of Diagnostic (resolver function)-------------

__attribute__((returns_nonnull))
void * scale_samples.resolver ()
{
  int _3;

  # ... (12 lines)
	
  return scale_samples;

}


;; Function main (main, funcdef_no=24, decl_uid=3961, cgraph_uid=25, symbol_order=24) (executed once)

-------------Examining Func: main--------
-------------End of Diagnostic-------------

int main ()
{
  unsigned long ivtmp.109;
  vector(4) int vect_t_16.102;
  
  # ... (82 lines)
	
  return 0;

}

As we see in the output, the pass correctly identified scale_samples.popcnt as a clone of the scale_samples function.

Testing on aarch64

While the pass was developed on the x86 machine, it was also tested in the aarch64 environment. I used git to apply and sync changes across the different environments. Aside from execution time, there were differences in how the pass behaved.

When attempting to compile the tester code with the pass, the following warnings were produced:

gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default","rng") ))'\
        -march=armv8-a -g -O3 -fno-lto  -ftree-vectorize -fdump-tree-efagin \
clone-test-core.c:28:6: warning: Function Multi Versioning support is experimental, and the behavior is likely to change [-Wexperimental-fmv-target]
   28 | void scale_samples(int16_t *in, int16_t *out, int cnt, int volume) {
      |      ^~~~~~~~~~~~~

In order to create the tests, make was configured to artificially create clones of specific version, using the target_clones attribute. rng and sve2 are ARM-specific features, but Function Multi-Versioning (FMV) is only experimentally supported on aarch64. Regardless, these warnings are not critical, and don’t interrupt execution.

Conclusion

The pass now successfully identifies function clones! The next step is to analyze the clones to find out if they are significantly different from eachother. Aside from adding new logic to support this functionallity, the function_clones structure might change to hold different information. Analyzing and comparing the output of the “prune” and “noprune” tests will also be useful in identifying next steps.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *