Analysis and Optimization of Decision and Inhibitory Trees, Rules, and Tests (initiative project)
Overview
Design and analysis of greedy algorithms for construction of decision and inhibitory trees, rules and tests. Development of extensions of dynamic programming approach for study of trees and rules (counting the number of optimal trees and rules, multi-stage optimization, construction of the set of Pareto optimal points, study of relationships between two cost functions and between cost and uncertainty). Applications to test optimization. Experimental study of trees and systems of rules as algorithms, as classifiers, and as ways for knowledge representation.
Details
This is main project of the group. The aims of project are study of extensions of dynamic programming for machine learning and discrete optimization including theory, algorithms, software implementation, and applications.
Usual dynamic programming approach allows finding one optimal solution. The considered extensions allow;
- to describe all or an essential part of optimal solutions (depending on the considered cost function);
- make sequential optimization relative to different cost functions;
- study relationships between two cost functions or between a cost function and an uncertainty measure (if we consider not only exact but also approximate solutions).
To use such approaches we need to find sub-problems of the initial problem and have an efficient procedure to create optimal solutions for bigger sub-problems based on optimal solutions for smaller sub-problems.
In the case of decision tables (which represent problems in machine learning), as sub-problems we consider so-called separable sub-tables of the initial table that are given by systems of conditions of the kind “attribute = value”. We study mainly exact and approximate decision trees and rules for usual decision tables (with one-valued decisions). The problem of construction of all separable sub-tables is complicated since the number of such sub-tables can be exponential depending on the size of the table. The situation can be improved essentially if we consider not exact but approximate trees and rules. In this case, it is enough to construct only part of separable sub-tables. If the required set of sub-tables is constructed in the form of Directed Acyclic Graph (DAG), it is easily to “propagate” optimal trees or rules from the smallest sub-tables from DAG up to the initial table.
We created a part of software system Dagger which allows;
- build DAG for a given decision table (or a partial DAG if we study approximate trees or rules);
- make sequential optimization for decision trees relative to depth, average depth, number of nodes, etc;
- study relationships between two cost functions or between cost and uncertainty of decision trees;
- make sequential optimization for decision rules relative to length, coverage, number of misclassifications;
- study relationships between length and coverage of decision rules;
- construct trees and rules by various greedy algorithms.
Dagger allows also working with inhibitory rules for usual tables, and decision rules for decision tables with many-valued decisions. In both cases, some greedy heuristics are implemented as well as sequential optimization of rules.
The considered extensions of dynamic programming can be used not only to optimize trees and rules but in various discrete optimization problems that are solvable by dynamic programming in polynomial time and have at least two criterions of optimization (cost functions). This functionality is added to Dagger and adapted for four optimization problems: global sequence alignment, optimal paths in directed graph, chain matrix multiplication, and binary search trees.
Dagger is mainly a tool for design and analysis of optimal decision and inhibitory trees and rules. It will be used in various applications including machine learning (comparison of optimal relative to some cost function classifiers with classifiers constructed by heuristics, feature selection based on new approach to evaluation of attribute importance), comparison of greedy heuristics for optimization of trees and rules, study of relationships among time complexity, space complexity and accuracy of decision trees, optimization of algorithms for fault diagnosis, etc.
We considered the first applications connected with optimization relative to average depth of decision trees for corner point detection problem, and comparative analysis of 16 greedy algorithms for decision tree construction.