JOURNAL ARTICLE

Programming knowledge discovery workflows in service‐oriented distributed systems

Eugenio Cesario, Marco Lackovic, Domenico Talia, Paolo Trunfio

Year: 2012 Journal: Concurrency and Computation: Practice and Experience Vol: 25 (10) Pages: 1482-1504 Publisher: Wiley

Abstract

In several scientific and business domains, very large data repositories are generated. To find interesting and useful information in those repositories, efficient data mining techniques and knowledge discovery processes must be used. In science, data mining helps scientists form hypotheses and supports their scientific practice, whereas in industry it turns existing data sources into real value for companies that can take advantage of the knowledge extracted from them. Data mining tasks are often composed of multiple stages that may be linked to each other to form various execution flows. Moreover, data mining tasks are often distributed because they involve data and tools located across geographically distributed environments. Therefore, it is fundamental to exploit effective paradigms, such as services and workflows, to model data mining tasks that are both multi‐staged and distributed. This paper discusses data mining services and workflows for analyzing scientific data in high‐performance distributed environments such as Grids and Clouds. We discuss how basic and complex services can be defined to support distributed data mining tasks in Grids. We also present a workflow formalism and a service‐oriented programming framework, named DIS3GNO, for designing and running distributed knowledge discovery processes in the Knowledge Grid system. DIS3GNO supports all the phases of a knowledge discovery process, including composition, execution, and results visualization. After introducing DIS3GNO, we discuss some relevant use cases implemented with it and a performance evaluation of the system.

Copyright © 2012 John Wiley & Sons, Ltd.

Keywords:
Computer science, Workflow, Exploit, Knowledge extraction, Distributed knowledge, Data science, Data discovery, Process mining, Grid, Data stream mining, Data mining, Grid computing, Software mining, Business process discovery, Distributed computing, Business process, Database, World Wide Web, Business process management, Metadata, Business process modeling, Software, Work in process, Knowledge management, Software development

Metrics

Cited By: 15
FWCI (Field Weighted Citation Impact): 3.03
Refs: 27
Citation Normalized Percentile: 0.91

Topics

Distributed and Parallel Computing Systems
Physical Sciences →  Computer Science →  Computer Networks and Communications
Scientific Computing and Data Management
Social Sciences →  Decision Sciences →  Information Systems and Management
Reservoir Engineering and Simulation Methods
Physical Sciences →  Engineering →  Ocean Engineering

Related Documents

BOOK-CHAPTER

Service-Oriented Architectures for Distributed and Mobile Knowledge Discovery

Domenico Talia, Paolo Trunfio

Chapman & Hall/CRC data mining and knowledge discovery series Year: 2008

DISSERTATION

Service-Oriented workflows for distributed data mining applications

Marco Lackovic, Domenico Talia, Luigi Palopoli

University: Archive of Doctoral Theses and Digital Collections (University of Calabria) Year: 2023

JOURNAL ARTICLE

A Service-Oriented Programming Approach for Dynamic Distributed Manufacturing Systems

Udayanto Dwi Atmojo, Zoran Salčić, Kevin I‐Kai Wang, Valeriy Vyatkin

Journal: IEEE Transactions on Industrial Informatics Year: 2019 Vol: 16 (1) Pages: 151-160

BOOK-CHAPTER

DNS-Based Discovery System in Service Oriented Programming

Maurizio Giordano

Lecture notes in computer science Year: 2005 Pages: 840-850