The scientific journal WIREs Data Mining and Knowledge Discovery has accepted a paper from the CADC.
The paper, “Evaluation and Comparison of Open Source Software Suites for Data Mining and Knowledge Discovery”, was written by Miguel Angel Vallejo, a CADC data scientist, in collaboration with other international researchers. WIREs Data Mining and Knowledge Discovery, published by Wiley, is a reference journal in its field, with an impact factor of 1.579.
The study evaluates 19 open source data mining tools and offers companies and the scientific community an extensive comparison based on a wide set of characteristics that any data mining tool should possess. The tools are assessed from both a subjective point of view (tool comparison) and an objective one (whether or not a tool has a specific characteristic).
The results show that RapidMiner, KNIME, and WEKA are the tools that have the greatest number of these characteristics.
The growing interest in extracting useful knowledge from data has favored the emergence of several data mining tools. The scientific community is aware of how important open source data mining software is in facilitating the diffusion of new algorithms. The availability of these free-of-charge tools, along with the possibility of understanding the different approaches by examining the source code, provides a great opportunity to refine and improve algorithms.
Value and applications for companies and the scientific community:
Value
Applications
When a particular functionality is needed, it is possible to look for the tools that offer it and study their source code.