A Step By Step Guide For Adding A New Algorithm Type (With Example)

Joana Be edited this page Aug 19, 2019 · 8 revisions

Goal of this Guide

Do you want to add a new Algorithm Type but don't know where to start? Look no further: this guide walks you through all the classes and interfaces you have to touch in order to add a new Algorithm Type that is displayed in the list of algorithms on the Metanome front page. In this guide, we'll add the new type Conditional Inclusion Dependency to the project. If you want a reference, see the corresponding pull request, but be careful: some of its code may be outdated due to refactorings. Always check a file's current state in the master branch.

Creating a new Maven Project and Setting up the Folder Structure

Since Metanome is a giant Maven project in which every algorithm is its own smaller Maven project, we first have to ensure that our algorithm is included in the build.

  1. Clone the metanome-algorithms-repository
  2. Add a new folder to the root directory. Its name should be the first letter of each word of your Algorithm Type. So for `Conditional Inclusion Dependency` we'll add a folder named `cid`.
  3. Inside it, create a pom file named `pom.xml` and a folder named `src`. The folder will contain the source code and the tests, while the pom file is responsible for building our algorithm.
  4. For the contents of pom.xml, please refer to this wiki page.
At the end, our structure should look like this:
 metanome-algorithms/
 ├── cid/
 │   ├── pom.xml
 │   ├── target/ (after initial Maven build)
 │   └── src/
 │       ├── test/
 │       │   └── java/
 │       └── main/
 │           ├── java/
 │           └── resources/

Implementing the Algorithm Type itself

Pre-steps

Every Algorithm Type implements the Algorithm interface and produces a Result which a ResultReceiver receives and processes further. Before we can implement the algorithm itself, we have to extend these interfaces with our own implementations. To do so, go to the main Metanome repository, which you should clone if you haven't already. Instead of flooding the guide with code for our Conditional Inclusion Dependency example, the corresponding example files are linked at the start of each step.

  1. We'll start by implementing our Result. An instance of this class is returned after an algorithm has been executed. Go to de.metanome.algorithm_integration.results and add your own Result class which implements the Result interface. Name the class after the Algorithm Type you want to add, e.g. ConditionalInclusionDependency. Specify all the variables the result of your algorithm should contain. Implement sendResultTo by calling resultReceiver.receiveResult(this). Make sure the value of the JsonTypeName annotation matches the name of your class.
  2. The result is sent to an OmniscientResultReceiver in de.metanome.algorithm_integration.result_receiver, which extends the ResultReceivers of all Algorithm Types. Add your own, yet-to-be-written ResultReceiver to the interfaces it extends. Its name is the name of your Algorithm Type plus the suffix ResultReceiver, e.g. ConditionalInclusionDependencyResultReceiver.
  3. Write your ResultReceiver interface in the same package. It should at least specify the methods shown in the example so you can receive and evaluate a result.
  4. Add a new interface to de.metanome.algorithm_integration.algorithm_types which extends the Algorithm interface from de.metanome.algorithm_integration, e.g. ConditionalInclusionDependencyAlgorithm. Define the method shown in the example so that an implementation can be given its ResultReceiver.
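
The four pre-steps can be sketched as follows. This is a minimal, self-contained illustration and not the actual Metanome code: `Result`, `Algorithm` and `OmniscientResultReceiver` are simplified stand-ins for the real interfaces, the fields `dependant`, `referenced` and `condition` are hypothetical, and the setter name `setResultReceiver` is an assumption. Checked exceptions and the JsonTypeName annotation are left out to keep the sketch free of external dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins; the real types live in de.metanome.algorithm_integration.
final class PreStepsSketch {

  /** Stand-in for the Result interface. */
  interface Result {
    void sendResultTo(OmniscientResultReceiver resultReceiver);
  }

  /** Stand-in for the Algorithm interface. */
  interface Algorithm {
    void execute();
  }

  /** Step 1: the result class, named after the Algorithm Type (fields are hypothetical). */
  static final class ConditionalInclusionDependency implements Result {
    private final String dependant;
    private final String referenced;
    private final String condition;

    ConditionalInclusionDependency(String dependant, String referenced, String condition) {
      this.dependant = dependant;
      this.referenced = referenced;
      this.condition = condition;
    }

    public String getDependant() { return dependant; }
    public String getReferenced() { return referenced; }
    public String getCondition() { return condition; }

    @Override
    public void sendResultTo(OmniscientResultReceiver resultReceiver) {
      // Overload resolution picks the receiveResult variant for this result type.
      resultReceiver.receiveResult(this);
    }
  }

  /** Step 3: the receiver interface for the new result type. */
  interface ConditionalInclusionDependencyResultReceiver {
    void receiveResult(ConditionalInclusionDependency result);
  }

  /** Step 2: the omniscient receiver extends every type-specific receiver. */
  interface OmniscientResultReceiver extends ConditionalInclusionDependencyResultReceiver {
  }

  /** Step 4: the algorithm type interface (setter name is an assumption). */
  interface ConditionalInclusionDependencyAlgorithm extends Algorithm {
    void setResultReceiver(ConditionalInclusionDependencyResultReceiver resultReceiver);
  }

  /** Hypothetical receiver that simply collects what it is sent. */
  static final class CollectingReceiver implements OmniscientResultReceiver {
    final List<ConditionalInclusionDependency> received = new ArrayList<>();

    @Override
    public void receiveResult(ConditionalInclusionDependency result) {
      received.add(result);
    }
  }

  public static void main(String[] args) {
    CollectingReceiver receiver = new CollectingReceiver();
    new ConditionalInclusionDependency("orders.customer_id", "customers.id", "country = 'DE'")
        .sendResultTo(receiver);
    System.out.println(receiver.received.size()); // prints 1
  }
}
```

The result calls back into the receiver instead of the receiver inspecting the result, so adding a new result type only requires one more `receiveResult` overload per receiver.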

Algorithm implementation

Now that everything we need is set up, go back to your folder in metanome-algorithms and create a class named after the abbreviation of your Algorithm Type which implements at least the algorithm interface you just added. This is where the algorithm's logic goes. execute() can be seen as the main method of the class and is called when the algorithm is run.
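
As a rough sketch of what such a class could look like, the example below implements a toy `CID` class. The two interfaces are stand-ins for the ones added in the pre-steps, the result is reduced to a plain `String`, and the in-memory input (`rows`, `referencedValues`) is purely hypothetical; a real algorithm would obtain its input through the configuration interfaces of de.metanome.algorithm_integration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AlgorithmSketch {

  /** Stand-in for the result receiver interface from the pre-steps. */
  interface ConditionalInclusionDependencyResultReceiver {
    void receiveResult(String result);
  }

  /** Stand-in for the algorithm type interface from the pre-steps. */
  interface ConditionalInclusionDependencyAlgorithm {
    void setResultReceiver(ConditionalInclusionDependencyResultReceiver resultReceiver);

    void execute();
  }

  /** The algorithm class, named after the abbreviation of the Algorithm Type. */
  static final class CID implements ConditionalInclusionDependencyAlgorithm {
    private ConditionalInclusionDependencyResultReceiver resultReceiver;
    // Hypothetical input: each row is {condition value, dependant value}.
    private final List<String[]> rows;
    private final Set<String> referencedValues;

    CID(List<String[]> rows, Set<String> referencedValues) {
      this.rows = rows;
      this.referencedValues = referencedValues;
    }

    @Override
    public void setResultReceiver(ConditionalInclusionDependencyResultReceiver resultReceiver) {
      this.resultReceiver = resultReceiver;
    }

    @Override
    public void execute() {
      // For each condition value, track whether every dependant value seen so far
      // is contained in the referenced values.
      Map<String, Boolean> holds = new LinkedHashMap<>();
      for (String[] row : rows) {
        boolean included = referencedValues.contains(row[1]);
        holds.merge(row[0], included, Boolean::logicalAnd);
      }
      // Report only the conditions under which the inclusion holds.
      for (Map.Entry<String, Boolean> entry : holds.entrySet()) {
        if (entry.getValue()) {
          resultReceiver.receiveResult("condition=" + entry.getKey());
        }
      }
    }
  }

  public static void main(String[] args) {
    List<String[]> rows = List.of(
        new String[] {"DE", "1"}, new String[] {"DE", "2"}, new String[] {"US", "9"});
    CID cid = new CID(rows, Set.of("1", "2"));
    List<String> results = new ArrayList<>();
    cid.setResultReceiver(results::add);
    cid.execute();
    System.out.println(results); // prints [condition=DE]
  }
}
```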

Don't forget your tests!

As any good programmer would, you shouldn't forget about testing your code. So, go into src/test and write your tests using the JUnit framework. Consider naming your tests like test<Method Name><Input><Expected Outcome> for convenience when debugging. When done, go back to your algorithm's pom file and ensure that you've added

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>

in the <dependencies> tag. If you've copied the pom file layout from the previous step, it is already included. To run your tests, either run mvn test in your algorithm's root folder (metanome-algorithms/cid in our example) or run them directly in the IDE of your choice.

Adding Your Algorithm To The Backend

Now that we've implemented the algorithm itself, we have to add it to Metanome's backend so that the server can execute it as well. The required steps fall into two parts: one that extends existing constructs and is rather copy-paste-ish, and another that requires you to implement additional logic for your Algorithm Type. For the former, just use the same structure as the already existing Algorithm Types and change the names.

Extending and Adjusting Pre-Existing Structures

For the first few points, go to the package de.metanome.backend.results_db. As the name suggests, this package defines the database for the results of an Algorithm Type, such as the one we defined earlier. Since a result depends on its Algorithm Type, we have to specify which algorithm produces which result.

  1. AlgorithmType:
    Add your Algorithm Type to the enumeration. By convention, the item is named after the abbreviation of your Algorithm Type in all caps, e.g. CID.
  2. ResultType:
    Add your result type to the enumeration. By convention, the item is named after the abbreviation of your Algorithm Type in all caps, e.g. CID.
  3. Algorithm:
    At the beginning of the class under OUTPUT, add a protected boolean named after your Algorithm Type. In the constructor Algorithm(String fileName, Set<Class<?>> algorithmInterfaces) under INPUT, set that boolean by checking if algorithmInterfaces contains your Algorithm Type's interface. Add a getter and setter for your boolean. To ensure correct mapping from the set to your variable, adjust the setter to look like the other ones. Additionally, add an if-statement in setAlgorithmTypeProperty(AlgorithmType algorithmType, boolean hasAlgorithmType).
  4. de.metanome.backend.algorithm_execution.AlgorithmExecutor:
    In executeAlgorithm(...) add an if-statement for your Algorithm Type.
  5. de.metanome.backend.resources.AlgorithmResource:
    Import your Algorithm Type. Add the method list<YourAlgorithmTypeName>s() and adjust the annotations.
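
Steps 1 and 3 of this list could look roughly like the sketch below. The enum and the `Algorithm` class are reduced stand-ins for the real ones in de.metanome.backend.results_db, the `FD` entries only represent an already existing Algorithm Type, and the setter is written as a plain setter because the shape of the real setters is not shown in this guide.

```java
import java.util.HashSet;
import java.util.Set;

final class ResultsDbSketch {

  /** Stand-ins for the algorithm type interfaces. */
  interface FunctionalDependencyAlgorithm {}
  interface ConditionalInclusionDependencyAlgorithm {}

  /** Stand-in for the AlgorithmType enumeration: one abbreviation per type. */
  enum AlgorithmType { FD, CID }

  /** Reduced stand-in for the Algorithm entity. */
  static class Algorithm {
    protected String fileName;
    protected boolean fd;
    protected boolean cid; // new flag for the new Algorithm Type

    Algorithm(String fileName, Set<Class<?>> algorithmInterfaces) {
      this.fileName = fileName;
      this.fd = algorithmInterfaces.contains(FunctionalDependencyAlgorithm.class);
      // New: derive the flag from the interfaces the algorithm implements.
      this.cid = algorithmInterfaces.contains(ConditionalInclusionDependencyAlgorithm.class);
    }

    public boolean isFd() { return fd; }

    public boolean isCid() { return cid; }

    public void setCid(boolean cid) { this.cid = cid; }

    public void setAlgorithmTypeProperty(AlgorithmType algorithmType, boolean hasAlgorithmType) {
      if (algorithmType == AlgorithmType.FD) {
        this.fd = hasAlgorithmType;
      }
      // New: one more branch for the new Algorithm Type.
      if (algorithmType == AlgorithmType.CID) {
        setCid(hasAlgorithmType);
      }
    }
  }

  public static void main(String[] args) {
    Set<Class<?>> interfaces = new HashSet<>();
    interfaces.add(ConditionalInclusionDependencyAlgorithm.class);
    Algorithm algorithm = new Algorithm("cid.jar", interfaces);
    System.out.println(algorithm.isCid()); // prints true
  }
}
```
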
For the last bit, go to de.metanome.backend.result_receiver. This package is responsible for outputting an Algorithm Type's result.
  1. ResultReceiver:
    Add the overloaded function acceptResult(YourAlgorithmType result).
  2. ResultCache:
    Add the overloaded function receiveResult(YourAlgorithmType yourAlgorithmType).
  3. ResultCounter:
    Add the overloaded function receiveResult(YourAlgorithmType yourAlgorithmType).
  4. ResultPrinter:
    Add the overloaded function receiveResult(YourAlgorithmType yourAlgorithmType).
  5. ResultReader:
    In convertString(...) add an if-statement for your Algorithm Type. You can either use your own format or use JSON. Using JSON might be slower.
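
The overloads in this list all follow the same pattern: one `receiveResult` method per result type. The sketch below illustrates it with a toy counter. It is not the real ResultCounter; the internal map, the `getCount` method and the empty result classes are assumptions made for the example.

```java
import java.util.EnumMap;
import java.util.Map;

final class ResultCounterSketch {

  /** Empty stand-ins for two result classes. */
  static final class FunctionalDependency {}
  static final class ConditionalInclusionDependency {}

  /** Stand-in for the ResultType enumeration. */
  enum ResultType { FD, CID }

  /** Toy counter with one receiveResult overload per result type. */
  static final class ResultCounter {
    private final Map<ResultType, Integer> counts = new EnumMap<>(ResultType.class);

    public void receiveResult(FunctionalDependency functionalDependency) {
      increment(ResultType.FD);
    }

    // New: the overload for the new Algorithm Type's result.
    public void receiveResult(ConditionalInclusionDependency conditionalInclusionDependency) {
      increment(ResultType.CID);
    }

    private void increment(ResultType type) {
      counts.merge(type, 1, Integer::sum);
    }

    public int getCount(ResultType type) {
      return counts.getOrDefault(type, 0);
    }
  }

  public static void main(String[] args) {
    ResultCounter counter = new ResultCounter();
    counter.receiveResult(new ConditionalInclusionDependency());
    counter.receiveResult(new ConditionalInclusionDependency());
    System.out.println(counter.getCount(ResultType.CID)); // prints 2
  }
}
```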

Adding Additional Logic AKA The Post Processing

In the post processing, you can define additional information that should be computed about the result after execution, e.g. the dependant column ratio. Our example adds a few values to show what this could look like. If you don't want any, you might want to check out the post processing classes of Matching Dependency. If you want many, we recommend looking at Functional Dependency. For more information about post processing, go here.

  1. de.metanome.backend.result_postprocessing.result_comparator:
    Add your own class which extends ResultComparator<YourAlgorithmTypeResult>. Define the variables for the information you want to calculate as static final Strings. Add the required constructor and the compare method which should compare every attribute of the parameters on equality.
  2. de.metanome.backend.result_postprocessing.results:
    Add your own class which extends RankingResult. Here you specify accessors for your previously defined variables. Add an empty constructor and one which initializes your variables from an instance of your Algorithm Type's result, and add getters and setters. Overriding equals(...) and hashCode() is not required, but good practice.
  3. de.metanome.backend.result_postprocessing.result_analyzer:
    Add your own class which extends ResultAnalyzer<YourAlgorithmType, YourAlgorithmTypeResult>. Override analyzeResultsDataIndependent(...), analyzeResultsDataDependent(...) and convertResults(...). If you're not doing any analysis at all, you can do it just like in the example. Otherwise, we recommend looking into FunctionalDependencyResultAnalyzer.
  4. de.metanome.backend.result_postprocessing.result_store:
    Add your own class which extends ResultStore<YourAlgorithmTypeResult>. Override getResultComparator(...) by creating and returning a new YourAlgorithmResultComparator from the given parameters.
  5. de.metanome.backend.result_postprocessing.result_ranking:
    Add your own class which extends Ranking. Add a full constructor, override calculateIndependentRankings(...) and calculateDependentRankings(...).
  6. de.metanome.backend.result_postprocessing.ResultPostProcessor:
    Add an if-statement in analyzeAndStoreResults(...). Its structure can be taken from the existing Algorithm Types and adjusted to yours.
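
To give an idea of step 1, here is a sketch of a comparator with its property names defined as static final Strings. It assumes that the comparator orders results by one named property at a time. The `ResultComparator` base class and the result class are simplified stand-ins, and the property names and fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PostProcessingSketch {

  /** Stand-in for the ranking result class of step 2 (fields are hypothetical). */
  static final class ConditionalInclusionDependencyResult {
    final String dependant;
    final String referenced;
    final double dependantColumnRatio;

    ConditionalInclusionDependencyResult(
        String dependant, String referenced, double dependantColumnRatio) {
      this.dependant = dependant;
      this.referenced = referenced;
      this.dependantColumnRatio = dependantColumnRatio;
    }
  }

  /** Stand-in base class: compares by a named property, ascending or descending. */
  abstract static class ResultComparator<T> implements Comparator<T> {
    private final String sortProperty;
    private final boolean ascending;

    ResultComparator(String sortProperty, boolean ascending) {
      this.sortProperty = sortProperty;
      this.ascending = ascending;
    }

    @Override
    public int compare(T first, T second) {
      int order = compare(first, second, sortProperty);
      return ascending ? order : -order;
    }

    protected abstract int compare(T first, T second, String sortProperty);
  }

  /** The comparator for the new Algorithm Type's results. */
  static final class ConditionalInclusionDependencyResultComparator
      extends ResultComparator<ConditionalInclusionDependencyResult> {

    static final String DEPENDANT_COLUMN = "dependant";
    static final String REFERENCED_COLUMN = "referenced";
    static final String DEPENDANT_COLUMN_RATIO = "dependant_column_ratio";

    ConditionalInclusionDependencyResultComparator(String sortProperty, boolean ascending) {
      super(sortProperty, ascending);
    }

    @Override
    protected int compare(ConditionalInclusionDependencyResult first,
        ConditionalInclusionDependencyResult second, String sortProperty) {
      switch (sortProperty) {
        case DEPENDANT_COLUMN:
          return first.dependant.compareTo(second.dependant);
        case REFERENCED_COLUMN:
          return first.referenced.compareTo(second.referenced);
        case DEPENDANT_COLUMN_RATIO:
          return Double.compare(first.dependantColumnRatio, second.dependantColumnRatio);
        default:
          return 0; // unknown property: treat all results as equal
      }
    }
  }

  public static void main(String[] args) {
    List<ConditionalInclusionDependencyResult> results = new ArrayList<>();
    results.add(new ConditionalInclusionDependencyResult("a", "x", 0.2));
    results.add(new ConditionalInclusionDependencyResult("b", "y", 0.9));
    results.sort(new ConditionalInclusionDependencyResultComparator(
        ConditionalInclusionDependencyResultComparator.DEPENDANT_COLUMN_RATIO, false));
    System.out.println(results.get(0).dependant); // prints b
  }
}
```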

Writing Tests For The Backend

To test all the additions you just made, we also have to extend existing test classes. For that, we first have to create an example algorithm for your Algorithm Type on which the tests are executed.

  1. Go to Metanome/testing_algorithms and add a folder for your Algorithm Type. Make sure to also include it in the root pom.xml.
  2. In your subdirectory, set up a new Maven Project by creating your own pom.xml. Its contents can mostly be the same as the example's - just make sure to adjust <artifactId>, <name> and <Algorithm-Bootstrap-Class> to your Algorithm Type.
  3. Add a .gitignore and pom.xml.versionsBackup to your subdirectory. These can be copied from the example.
  4. In src/main, write an ExampleAlgorithm class which implements your Algorithm Type's algorithm interface and all the interfaces from de.metanome.algorithm_integration your Algorithm Type depends on.
  5. In src/test, write an ExampleAlgorithmTest class which should at least test that your example algorithm executes and that its configuration requirements are set properly.
  6. After that, build the Maven Project of your subdirectory.
  7. Go to /target/pom.xml and add your example algorithm in <plugin> and dependencies.
Now you can write the tests themselves and initialize your Algorithm as the jar you just added. For the tests, you should at least include the following:
  1. de.metanome.backend.algorithm_execution.AlgorithmExecutorTest:
    Add a method testExecuteYourAlgorithmTypeAlgorithm which tests whether your algorithm is executed at all.
  2. de.metanome.backend.resources.AlgorithmResourceTest:
    Add a method testListYourAlgorithmTypeAlgorithms which tests de.metanome.backend.resources.AlgorithmResource getAll().
  3. de.metanome.backend.result_receiver.ResultPrinterTest:
    Add a method testWriteYourAlgorithmType which tests the functionality of your Algorithm Type's ResultPrinter.
  4. Add classes to the de.metanome.backend.result_postprocessing-packages if your algorithm performs post processing. Since the example doesn't, it's left out.

Last but not least: Displaying your Algorithm on the Frontend

To display your Algorithm Type on the website, we have to modify a few files in the frontend. If you haven't already, set the frontend up as a submodule, then go to src/app. Again, most of what you have to do is copy the structure of the existing Algorithm Types and adjust it to yours.

  1. /history/history.controller.js:
    In loadExecution() add an if-statement for your Algorithm Type and push it to $scope.content.
  2. /new/new.controller.js:
    In initializeAlgorithmList() add a list element for your Algorithm Type to algorithmCategoryNames. In openNewAlgorithm() add a map element for your Algorithm Type to obj. In executeAlgorithm() add a string concatenation for your Algorithm Type to url.
  3. /result/result.controller.js:
    At the start of the class under VARIABLE DEFINITIONS add an element for your Algorithm Type to $scope and define its properties. Write a method loadYourAlgorithmType() which you call by adding an if-statement in init(). Write a method onPageYourAlgorithmType'sAbbreviation().
  4. /result/result.html:
    Add an md-card for your Algorithm Type to add it as a new element in the table.

You Did It, Now Build It!

After all these steps, your Algorithm Type is successfully integrated into the Metanome project. Congrats! 🎉 All that's left to do is build the Maven project and see it appear on the front page.