Meta Mining

The figure below illustrates how the workflows designed by the AI planner are ranked based on meta-models built by three workflow assessment components. Meta-models are generated in offline mode (right side of the figure) using meta-data from the DM experiment repository (DMER), such as input data characteristics computed by the DCTool, descriptions of workflows, and performance results. For the meta-miner, these meta-data are parsed and structured into a Data Mining Experiment Database (DMEX-DB) using concepts from the Data Mining Optimization Ontology (DMOP).

Figure: The meta mining architecture.

The three meta-level components use these meta-data in different ways to build workflow assessment models. First, an ontology-based meta-miner builds a predictive model by correlating dataset and workflow/algorithm characteristics with observed performance in past experiments. A second component uses expert rules to build a qualitative workflow assessment model, and a third, the Time and Memory Analyser, builds a model that estimates the time and memory consumption of candidate workflows.

In production mode, the AI Planner typically generates a large number of workflows that need to be ranked. First, the meta-mined model is applied to score the candidate workflows, and only the k top-ranked workflows are further assessed by the non-functional (qualitative and time/memory-based) models. Finally, the probabilistic ranker aggregates the scores produced by the three meta-models using predefined weights and delivers a final ranking of the preselected workflows.
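
The two-stage ranking described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation: the function name `rank_workflows`, the scorer arguments, the weight keys, and the toy scores are all hypothetical, and the weighted sum is only one assumed way the ranker might aggregate the three scores, since the text does not specify the aggregation formula.

```python
from typing import Callable, Dict, List, Tuple

# A scorer maps a workflow identifier to a score (higher is better).
# All names here are hypothetical and chosen for illustration only.
Scorer = Callable[[str], float]


def rank_workflows(
    candidates: List[str],
    meta_mined: Scorer,
    qualitative: Scorer,
    resource: Scorer,
    weights: Dict[str, float],
    k: int,
) -> List[Tuple[str, float]]:
    """Preselect the k best candidates, then rank them by a weighted score."""
    # Stage 1: score every candidate with the meta-mined model, keep the top k.
    preselected = sorted(candidates, key=meta_mined, reverse=True)[:k]

    # Stage 2: assess only the preselected workflows with the two
    # non-functional models and combine all three scores.
    # ASSUMPTION: aggregation is a plain weighted sum.
    ranked = []
    for wf in preselected:
        final = (
            weights["meta_mined"] * meta_mined(wf)
            + weights["qualitative"] * qualitative(wf)
            + weights["resource"] * resource(wf)
        )
        ranked.append((wf, final))

    # Final ranking: highest aggregated score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy scores standing in for the three meta-models (invented values).
mm_scores = {"wf_a": 0.9, "wf_b": 0.8, "wf_c": 0.2}
ql_scores = {"wf_a": 0.1, "wf_b": 0.9, "wf_c": 1.0}
rs_scores = {"wf_a": 0.5, "wf_b": 0.5, "wf_c": 1.0}

ranked = rank_workflows(
    candidates=list(mm_scores),
    meta_mined=mm_scores.__getitem__,
    qualitative=ql_scores.__getitem__,
    resource=rs_scores.__getitem__,
    weights={"meta_mined": 0.5, "qualitative": 0.25, "resource": 0.25},
    k=2,
)
print(ranked)
```

With these toy values, `wf_c` is dropped in the preselection step despite its high non-functional scores, and `wf_b` ends up ranked above `wf_a` once the qualitative score is factored in. This shows why the choice of k matters: the non-functional models can only reorder workflows that survive the first stage.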