

2007-08-12 18:22


9. How to mine block-structured workflows? A data mining approach

The last approach discussed in this paper is tailored towards mining block-structured workflows. It differs from the approaches presented in the preceding four sections in two notable ways. First, only block-structured workflow patterns are considered. Second, the mining algorithm is based on rewriting techniques rather than graph-based techniques. In addition, the objective of this approach is to mine complete and minimal models: complete in the sense that all recorded cases are covered by the extracted model, minimal in the sense that only recorded cases are covered. To achieve this goal, the approach uses a stronger notion of completeness than, for example, the completeness notion based on direct successor (cf. Section 5).
Before we can mine a workflow model from event-based data it is necessary to determine what kind of model the output should be, i.e., the workflow language being used or the class of workflow models considered. Different languages/classes of models have different meta-models. We distinguish two major groups of workflow meta-models: graph-oriented meta-models and block-oriented meta-models. This approach is based on a block-oriented meta-model. Models of this meta-model (i.e., block-structured workflows) are always well-formed and sound.
Block-structured models are made up of nested blocks. These building blocks can be differentiated into operators and constants. Operators build the process flow, while constants are the tasks or sub-workflows that are embedded inside the process flow. We build a block-structured model in a top-down fashion by setting one operator as the starting point of the workflow and nesting other operators until we get the desired flow structure. At the bottom of this structure we embed constants into operators, which terminates the nesting process. A block-structured workflow model is therefore a tree whose leaves are always operands (the constants).
Besides the tree representation of block-structured models, we can specify them as a set of terms. Let S denote the operator sequence, P the operator parallel, and a, b, c three different tasks. The term S(a,P(b,c)), for example, represents a workflow that performs task a completely before tasks b and c are performed in parallel. Because of the model’s block structure, each term is always well-formed. Furthermore, we can specify an algebra that consists of axioms for commutativity, distributivity, associativity, etc. These axioms form the basis for term rewriting systems we can use for mining workflows. A detailed description of the meta-model can be found in [54].
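To make the term notation concrete, the following Python sketch represents a block-structured model as a small tree and enumerates the task orderings it permits. The names Op, show, and orderings are illustrative only and do not come from [54]. As a simplifying assumption, every task is treated as atomic, whereas the approach itself works with separate start and complete events.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Op:
    kind: str                       # "S" = sequence, "P" = parallel
    children: Tuple["Term", ...]


Term = Union[str, Op]               # a plain string is a task (a leaf)


def S(*children):
    return Op("S", children)


def P(*children):
    return Op("P", children)


def show(term):
    """Render a term in the S(a,P(b,c)) notation."""
    if isinstance(term, str):
        return term
    return f"{term.kind}({','.join(show(c) for c in term.children)})"


def interleavings(x, y):
    """All merges of tuples x and y that keep each tuple's internal order."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + rest for rest in interleavings(x[1:], y)}
            | {(y[0],) + rest for rest in interleavings(x, y[1:])})


def orderings(term):
    """Set of task orderings allowed by a term (tasks assumed atomic)."""
    if isinstance(term, str):
        return {(term,)}
    result = {()}
    for child in term.children:
        runs = orderings(child)
        if term.kind == "S":
            # Sequence: the child runs strictly after what came before.
            result = {r + c for r in result for c in runs}
        elif term.kind == "P":
            # Parallel: the child may interleave freely with its siblings.
            result = {m for r in result for c in runs
                      for m in interleavings(r, c)}
        else:
            raise ValueError(f"unknown operator {term.kind!r}")
    return result


model = S("a", P("b", "c"))
print(show(model))
print(sorted(orderings(model)))
```

For S(a,P(b,c)) the enumeration yields exactly two orderings, a b c and a c b. This is also a convenient way to read the completeness and minimality goals: a model is complete if every recorded ordering is in this set, and minimal if the set contains nothing else.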
Based on the block-structured meta-model a process mining procedure extracts workflow models from event-based data. The procedure consists of the following five steps that are performed in sequential order.
First, the procedure reads event-based data that belongs to a certain process and builds a trace for each process instance from this data. A trace is a data structure that contains all start and complete events of a process instance in correct chronological order. Once the traces are built, they are condensed into trace groups on the basis of their sequence of start and complete events. Each trace group constitutes a path in the process schema.
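The first step can be sketched in a few lines of Python. The event layout used here, a tuple of case identifier, task name, event kind, and timestamp, is an assumption made for illustration. It is not the XML format from Section 4, and build_traces and group_traces are hypothetical helper names.

```python
from collections import defaultdict


def build_traces(events):
    """Collect events per process instance and order them by timestamp.

    events: iterable of (case_id, task, kind, timestamp) tuples, where
    kind is "start" or "complete" (an assumed layout).
    """
    by_case = defaultdict(list)
    for case_id, task, kind, timestamp in events:
        by_case[case_id].append((timestamp, task, kind))
    return {
        case_id: tuple((task, kind)
                       for _, task, kind in sorted(rows, key=lambda r: r[0]))
        for case_id, rows in by_case.items()
    }


def group_traces(traces):
    """Condense traces: instances with the same event sequence share a group."""
    groups = defaultdict(list)
    for case_id, trace in traces.items():
        groups[trace].append(case_id)
    return dict(groups)


events = [
    (1, "a", "start", 0), (1, "a", "complete", 2),
    (1, "b", "start", 3), (1, "b", "complete", 5),
    (2, "a", "start", 1), (2, "a", "complete", 4),
    (2, "b", "start", 6), (2, "b", "complete", 7),
    (3, "b", "start", 0), (3, "b", "complete", 1),
    (3, "a", "start", 2), (3, "a", "complete", 3),
]
groups = group_traces(build_traces(events))
for trace, cases in groups.items():
    print(cases, trace)
```

Cases 1 and 2 differ in their timestamps but share the same sequence of start and complete events, so they fall into one trace group. Case 3 forms a second group.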
Second, a time-forward algorithm constructs an initial process model from all trace groups. This model is in a special form called disjunctive normal form (DNF). A process model in this form starts with an alternative operator and enumerates inside this block all possible paths of execution as blocks that are built up without any alternative operator. For each trace group such a block is constructed by the algorithm and added to the alternative operator that builds the root of the model.
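The source does not spell out the time-forward algorithm, so the following Python sketch is only one plausible simplification and not the procedure from [55, 56]. It walks each trace forward in time, puts tasks whose execution intervals overlap into a parallel block, chains the resulting blocks in a sequence, and collects one such block per trace group under a root alternative operator, written A here. The letter A and the function names are assumptions, and each task is assumed to occur at most once per trace.

```python
def block_for(trace):
    """Build an alternative-free block for one trace group.

    trace: tuple of (task, kind) pairs in chronological order, with kind
    being "start" or "complete". Tasks that are open at the same time are
    placed in one parallel block (a simplifying assumption).
    """
    clusters, current, open_tasks = [], [], set()
    for task, kind in trace:
        if kind == "start":
            open_tasks.add(task)
            current.append(task)
        else:
            open_tasks.discard(task)
            if not open_tasks:          # nothing running: the cluster is closed
                clusters.append(current)
                current = []
    parts = [c[0] if len(c) == 1 else "P(" + ",".join(sorted(c)) + ")"
             for c in clusters]
    return parts[0] if len(parts) == 1 else "S(" + ",".join(parts) + ")"


def initial_model(trace_groups):
    """Enumerate one block per trace group under a root alternative."""
    blocks = sorted({block_for(trace) for trace in trace_groups})
    return "A(" + ",".join(blocks) + ")"


t1 = (("a", "start"), ("a", "complete"),
      ("b", "start"), ("c", "start"), ("b", "complete"), ("c", "complete"))
t2 = (("a", "start"), ("a", "complete"), ("d", "start"), ("d", "complete"))
print(initial_model([t1, t2]))
```

In this sketch, tasks that merely happen to run one after another end up in a sequence, even when no real precedence relation exists between them.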
The next step deals with relations between tasks that result from the random order of performing tasks without a real precedence relation between them. These pseudo precedence relations have to be identified and then removed from the model. In order to identify pseudo precedence relations, the model is transformed by a term rewriting system into a form that enumerates all sequences of tasks inside parallel operators embedded into the overall alternative. Then, a searching algorithm determines which of these sequences are pseudo precedence relations. This is determined by finding the smallest subset of sequences that completely explains the corresponding blocks in the initial model. All sequences outside this subset are pseudo precedence relations and are therefore removed. At the end of this step, the initial transformation is reversed by a term rewriting system.
Because the process model was built in DNF, it is necessary to split the model’s overall alternative and to move the partial alternatives as near as possible to the point in time where a decision cannot be postponed any longer. This is done by a transformation step using another term rewriting system. It is based on distributivity axioms and merges blocks while shifting alternative operators towards later points in time. It also leads to a condensed form of the model.
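As an illustration of how a distributivity axiom shifts an alternative towards a later point in time, the Python sketch below applies a single rewrite rule: A(S(x,r1),S(x,r2)) becomes S(x,A(r1,r2)). Terms are encoded as nested tuples such as ("S", "a", "b"). This encoding and the function name factor_common_prefix are assumptions made for the example, and the actual rewriting system contains more rules than this one.

```python
def factor_common_prefix(term):
    """Rewrite A(S(x, r1), S(x, r2), ...) into S(x, A(r1, r2, ...)).

    Terms are nested tuples: ("A", ...) is an alternative, ("S", ...) a
    sequence, and plain strings are tasks. Any other term is returned
    unchanged.
    """
    if isinstance(term, str) or term[0] != "A":
        return term
    # Treat a bare task or non-sequence branch as a one-element sequence.
    branches = [b if not isinstance(b, str) and b[0] == "S" else ("S", b)
                for b in term[1:]]
    prefix = []
    for heads in zip(*(b[1:] for b in branches)):
        if all(h == heads[0] for h in heads):
            prefix.append(heads[0])
        else:
            break
    # Every branch must keep at least one element after factoring.
    limit = min(len(b) - 1 for b in branches) - 1
    prefix = prefix[:limit]
    if not prefix:
        return term

    def unwrap(items):
        return items[0] if len(items) == 1 else ("S", *items)

    rests = tuple(unwrap(b[1 + len(prefix):]) for b in branches)
    return ("S", *prefix, ("A", *rests))


dnf = ("A", ("S", "a", "b"), ("S", "a", "c"))
print(factor_common_prefix(dnf))
```

In the rewritten model the choice between b and c is made only after a has been performed, which is the latest point at which the decision can be taken. The model is also shorter, since a now appears once.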
The last step is an optional decision-mining step that is based on decision tree induction. In this step an induction is performed for each decision point of the model. In order to perform this step we need data about the workflow context for each trace. From these data a tree induction algorithm builds decision trees. These trees are transformed into rules and then attached to the particular alternative operators.
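The core move of decision tree induction, choosing the context attribute that best separates the branches taken at a decision point, can be sketched with the standard library alone. The example below builds only a single-level tree using information gain and turns it into rules. It is not the induction algorithm used by the approach, and the attribute names, branch labels, and function names are invented for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy of a list of branch labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def partition(rows, labels, attr):
    """Group the branch labels by the value each case has for attr."""
    parts = defaultdict(list)
    for row, label in zip(rows, labels):
        parts[row[attr]].append(label)
    return parts


def best_split(rows, labels):
    """Pick the context attribute with the highest information gain."""
    base = entropy(labels)

    def gain(attr):
        parts = partition(rows, labels, attr)
        remainder = sum(len(p) / len(labels) * entropy(p)
                        for p in parts.values())
        return base - remainder

    return max(sorted(rows[0]), key=gain)


def rules_for(rows, labels):
    """One rule per attribute value: (attribute, value) -> predicted branch."""
    attr = best_split(rows, labels)
    return {(attr, value): Counter(part).most_common(1)[0][0]
            for value, part in partition(rows, labels, attr).items()}


# Hypothetical context data for one decision point, one row per trace.
rows = [
    {"amount": "high", "region": "north"},
    {"amount": "high", "region": "south"},
    {"amount": "low", "region": "north"},
    {"amount": "low", "region": "south"},
]
labels = ["check", "check", "skip", "skip"]   # branch taken in each trace
print(rules_for(rows, labels))
```

A full induction algorithm would recurse on each partition until the branches are pure or no attributes remain. The resulting rules are what gets attached to the corresponding alternative operator.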
After all steps have been performed, the output is a block-structured model that is complete and minimal. The process mining procedure is reported in more detail in [55, 56].
The approach to mining block-structured models is supported by a tool named Process Miner. This tool can read event-based workflow data from databases or from files in the XML format presented in Section 4. It then automatically performs the complete process mining procedure on this data. The decision-mining step is omitted if no context data are provided.
Process Miner comes with a graphical user interface (see Fig. 12). It displays the output model in a graphical editor in the form of a diagram and a tree. Additionally, it allows the user to edit a model and to export it for further use. It also contains a workflow simulation component. A description of Process Miner can be found in [57].
