ink.miner.rulemining.RuleSetMiner

class ink.miner.rulemining.RuleSetMiner(support=10, max_rules=100000000000000.0, max_len_rule_set=5, max_iter=10, chains=1000, forest_size=1000, criteria='precision', rule_complexity=2, propose_threshold=0.1, verbose=False)

Bases: object

The INK RuleSetMiner: a class that can mine both task-specific and task-agnostic rules.

Parameters
  • support (int) – Support measure; only rules with at least this level of support are taken into account.

  • max_rules (int) – Maximal number of rules which can be mined.

  • max_len_rule_set (int) – Maximal number of rules used to separate the classes during task-specific mining.

  • max_iter (int) – Maximal number of iterations used for the task-specific miner.

  • chains (int) – Maximal number of chains used for the task-specific miner.

  • forest_size (int) – Maximal size of the forest used within the classifier for the task-specific miner.

  • criteria (str) – Criterion used to screen the generated rules. Possible criteria are precision, specificity, sensitivity, mcc (Matthews correlation coefficient) or cross-entropy. The docstring marks cross-entropy as the default, but the signature above defaults to 'precision'.

  • propose_threshold (float) – Threshold used to propose new combinations of possible rules for the task-specific mining.

  • verbose (bool) – Whether to show the tqdm progress tracker (default False).
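The named screening criteria are standard confusion-matrix scores. The following is a minimal sketch of how such scores can be computed for a single candidate rule. It assumes a rule is represented as a 0/1 column (1 where the rule fires) compared against 0/1 labels. The helper names `confusion_counts` and `rule_scores` are hypothetical and are not part of the INK API. The library's internal computation, including its cross-entropy criterion, may differ and is not shown here.

```python
import math


def confusion_counts(rule_hits, labels):
    """Count TP, FP, TN, FN for a binary rule column against binary labels."""
    pairs = list(zip(rule_hits, labels))
    tp = sum(1 for h, y in pairs if h == 1 and y == 1)
    fp = sum(1 for h, y in pairs if h == 1 and y == 0)
    tn = sum(1 for h, y in pairs if h == 0 and y == 0)
    fn = sum(1 for h, y in pairs if h == 0 and y == 1)
    return tp, fp, tn, fn


def rule_scores(rule_hits, labels):
    """Return precision, sensitivity, specificity and MCC for one rule."""
    tp, fp, tn, fn = confusion_counts(rule_hits, labels)
    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Toy data (made up for illustration): 8 samples, one candidate rule.
rule_hits = [1, 1, 1, 0, 0, 0, 1, 0]
labels = [1, 1, 0, 0, 0, 1, 1, 0]
scores = rule_scores(rule_hits, labels)
print(scores)
```

With this toy data the rule has 3 true positives, 1 false positive, 3 true negatives and 1 false negative, so precision, sensitivity and specificity are all 0.75 and the MCC is 0.5.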

__init__(support=10, max_rules=100000000000000.0, max_len_rule_set=5, max_iter=10, chains=1000, forest_size=1000, criteria='precision', rule_complexity=2, propose_threshold=0.1, verbose=False)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__([support, max_rules, …])

Initialize self.

exec_chain(t)

Function to execute chaining in parallel.

fit(data[, label])

Fit function to train the classifier or generate task-agnostic rules.

precompute(y)

Precompute values based on the given labels.

predict(data)

Predict function used to predict new data against the learned task-specific rules.

print_rules(rules)

Function to represent the rules in a human-readable format.

screen_rules(X_trans, y)

Function to pre-screen the generated rules based on the enabled criteria.

set_parameters(X)

Function to set some initial parameters based on the data.

exec_chain(t)

Function to execute chaining in parallel.

Parameters
  • t (tuple) – Tuple with number of rules, split, the RMatrix, y, T0 and chain indicator.

Returns
  Chaining results

Return type
  list

fit(data, label=None)

Fit function to train the classifier or generate task-agnostic rules.

Parameters
  • data (tuple) – Tuple value containing 1) a sparse binary representation, 2) list of indices, 3) column features.

  • label – List containing the labels for each index (task-specific) or None (task-agnostic).

Returns
  Rules
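To make the three-part layout of the data tuple concrete, here is a minimal sketch that assembles one from a toy mapping of node identifiers to active features. It is only an illustration of the layout: the helper `build_data_tuple`, the node names and the feature names are all hypothetical, and a plain coordinate list stands in for the sparse binary representation. The real library presumably expects an actual sparse matrix object, which is an assumption not confirmed by this page.

```python
def build_data_tuple(node_features):
    """Build (nonzero cells, row indices, column features) from a dict.

    node_features maps a node identifier to the list of binary features
    that are active (value 1) for that node.
    """
    indices = sorted(node_features)
    columns = sorted({f for feats in node_features.values() for f in feats})
    col_pos = {name: j for j, name in enumerate(columns)}
    # Coordinate list of the cells equal to 1; a stand-in for a sparse matrix.
    nonzero = sorted(
        (i, col_pos[f])
        for i, node in enumerate(indices)
        for f in node_features[node]
    )
    return nonzero, indices, columns


# Made-up toy input: three nodes, three binary features.
node_features = {"n1": ["a", "b"], "n2": ["b"], "n3": ["a", "c"]}
data = build_data_tuple(node_features)
nonzero, indices, columns = data
print(indices)
print(columns)
print(nonzero)
```

For task-specific mining, the label list passed alongside such a tuple would have one entry per element of the index list, in the same order.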

precompute(y)

Precompute values based on the given labels.

Parameters
  • y – List of labels.

predict(data)

Predict function used to predict new data against the learned task-specific rules.

Parameters
  • data (tuple) – Tuple value containing 1) a sparse binary representation, 2) list of indices, 3) column features.

Returns
  Predicted labels

Return type
  list
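One common way to apply a learned rule set to new data is to treat each rule as a conjunction of binary features and to label a sample positive when any rule is fully satisfied. The sketch below illustrates that idea only. Whether RuleSetMiner.predict works this way is an assumption, and `predict_with_rules` together with its set-based rule format is hypothetical rather than the library's actual representation.

```python
def predict_with_rules(rows, rules):
    """Label each row 1 if at least one rule is fully contained in it.

    rows:  list of sets holding the active feature names of each sample.
    rules: list of sets; each set is a conjunction of required features.
    """
    return [int(any(rule <= row for rule in rules)) for row in rows]


# Made-up rule set: (a AND b) OR (c).
rules = [{"a", "b"}, {"c"}]
rows = [{"a", "b"}, {"b"}, {"a", "c"}, set()]
predictions = predict_with_rules(rows, rules)
print(predictions)  # [1, 0, 1, 0]
```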

print_rules(rules)

Function to represent the rules in a human-readable format.

Parameters
  • rules (list) – Output generated from the task-specific fit function.

screen_rules(X_trans, y)

Function to pre-screen the generated rules based on the enabled criteria.

Parameters
  • X_trans – Binary data frame.

  • y – Label list.

Returns
  RMatrix
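Pre-screening can be pictured as a filter followed by a ranking: drop candidate rules that cover too few samples, score the rest, and keep the best ones. The sketch below is a hedged illustration of that pattern, not the library's implementation. It assumes support is a minimum count of covered samples (the library may interpret it differently, for example as a percentage), uses precision as the score, and returns rule names rather than an RMatrix. The function `screen` and the parameter `top_k` are hypothetical.

```python
def screen(columns, labels, support, top_k):
    """Keep the top_k rules by precision among those with enough coverage.

    columns: dict mapping a rule name to its 0/1 column over all samples.
    labels:  0/1 label list of the same length.
    """
    kept = []
    for name, hits in columns.items():
        covered = sum(hits)
        # Skip rules below the support threshold (and empty rules).
        if covered < support or covered == 0:
            continue
        tp = sum(1 for h, y in zip(hits, labels) if h and y)
        kept.append((tp / covered, name))
    # Highest precision first; ties broken by rule name for determinism.
    kept.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in kept[:top_k]]


# Made-up candidate rules over five samples.
labels = [1, 1, 0, 0, 1]
columns = {
    "r1": [1, 1, 0, 0, 0],  # covers 2, precision 1.0
    "r2": [1, 1, 1, 1, 0],  # covers 4, precision 0.5
    "r3": [0, 0, 0, 0, 1],  # covers 1, dropped when support is 2
    "r4": [1, 0, 1, 0, 1],  # covers 3, precision 2/3
}
selected = screen(columns, labels, support=2, top_k=2)
print(selected)  # ['r1', 'r4']
```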

set_parameters(X)

Function to set some initial parameters based on the data.

Parameters
  • X (tuple) – Tuple value containing 1) a sparse binary representation, 2) list of indices, 3) column features.