The proliferation of sensor technology, information appliances, and imaging modalities has enabled us to monitor the interconnected physical planet and the interrelated social world. Big data with large volumes and wide variety, spatial vs. temporal and structured vs. unstructured, are being generated dynamically from monitoring sensors and embedded devices. On the one hand, data-centric algorithms and systems such as regression, inference, classification, clustering, association rules, neural networks, SVM, decision trees, k-NN, naïve Bayes, and genetic algorithms have been developed and widely used in machine learning and data mining. On the other hand, data and information obtained from heterogeneous sources and diverse systems such as information appliances, geographic mapping, social networks, multimedia, scoring, and ranking have to be explored, combined, and analyzed. For these huge amounts of data to be meaningful and useful, significant patterns have to be identified, information from various sources and systems has to be fused, and useful knowledge must be extracted for decision making and valuable action.
Information fusion and data mining are fundamental in the scientific discovery process of data acquisition, information integration, and knowledge discovery. Although methods for information fusion and data mining have been used for hundreds of years, it remains a challenging problem to understand when, what, and how to optimally mine data, fuse information and discover knowledge. Among others, the DIMACS Workshop on Algorithmic Information Fusion and Data Mining (WAIFDM) will address the following two types of problems:
Given a complex problem in a data-rich environment, how to extract variables and how to perform variable selection and combination? Here "variable" includes feature, attribute, cue, indicator, and parameter.
Given two machine learning or data mining systems A and B, when and how to best combine A and B? Given many possible decision systems for a solution, how to best select and combine a subset of these systems?
Results in several contexts have shown that fusion of two systems A and B can be better than each of the individual systems only if the individual systems perform relatively well and are cognitively diverse. However, to understand when and why this might happen, the concept of diversity has to be well defined. Another issue of great importance is the performance variation between score combination and rank combination.
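To make the distinction between score combination and rank combination concrete, the following is a minimal sketch rather than a method prescribed by the workshop. It assumes each system assigns a numeric score to the same list of items, that a higher score is better, and that the two systems are fused by simple averaging. The function names and the score values are hypothetical and chosen only for illustration.

```python
def ranks_from_scores(scores):
    """Convert scores to ranks: rank 1 goes to the highest score.

    Ties are broken by input order (an arbitrary choice for this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def score_combination(scores_a, scores_b):
    """Fuse two systems by averaging their raw scores (higher is better)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


def rank_combination(scores_a, scores_b):
    """Fuse two systems by averaging their ranks (lower is better)."""
    ranks_a = ranks_from_scores(scores_a)
    ranks_b = ranks_from_scores(scores_b)
    return [(ra + rb) / 2 for ra, rb in zip(ranks_a, ranks_b)]


# Hypothetical scores from systems A and B for three items.
system_a = [1.0, 0.3, 0.2]
system_b = [0.1, 0.3, 0.2]

fused_scores = score_combination(system_a, system_b)
fused_ranks = rank_combination(system_a, system_b)

# Index of the top item under each fusion rule.
best_by_score = max(range(len(fused_scores)), key=lambda i: fused_scores[i])
best_by_rank = min(range(len(fused_ranks)), key=lambda i: fused_ranks[i])

print("top item by score combination:", best_by_score)
print("top item by rank combination:", best_by_rank)
```

In this toy data the two rules pick different top items: averaging scores lets the one very large score from system A dominate, while averaging ranks discards score magnitudes and favors the item both systems place consistently high.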
The Workshop WAIFDM will provide a forum for researchers and practitioners to mingle with and learn from each other regarding the design, analysis, and implementation of algorithms for information fusion and data mining. This workshop will gather and enable multidisciplinary researchers and domain experts to conceive new ideas and create novel solutions. WAIFDM will address, among others, the following topics and directions: