About
While the machine learning community has primarily focused on analysing the output of a single data source, there have been relatively few attempts to develop a general framework, or heuristics, for analysing several data sources in terms of a shared dependency structure. Learning from multiple data sources (or alternatively, the data fusion problem) is a timely research area. Due to the increasing availability and sophistication of data recording techniques and advances in data analysis algorithms, there exist many scenarios in which it is necessary to model multiple, related data sources, for example in fields such as bioinformatics, multi-modal signal processing, information retrieval and sensor networks.
The open question is how to analyse data that consists of more than one set of observations (or view) of the same phenomenon. In general, existing methods use a discriminative approach, where a set of features for each data set is found in order to explicitly optimise some dependency criterion. However, a discriminative approach may result in an ad hoc algorithm, may require regularisation to ensure that erroneous shared features are not discovered, and makes it difficult to incorporate prior knowledge about the shared information. A possible solution to these problems is a generative probabilistic approach, which models each data stream as a sum of a shared component and a private component that models the within-set variation.
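As an illustration only, the shared-plus-private idea can be sketched with a linear Gaussian model. The linear form, the Gaussian latents, the dimensions, and the names `sample_two_views`, `W` and `B` are assumptions made for this sketch and are not taken from the workshop material. Under these assumptions each view is generated as a shared latent times a view-specific loading, plus a private latent times a private loading, plus noise, so the covariance between the two views depends only on the shared loadings.

```python
import numpy as np


def sample_two_views(n, d_shared=2, d_private=3, dims=(5, 4), noise=0.1, seed=0):
    """Sample two views that share one latent variable (hypothetical helper).

    Each view is x = z @ W + z_private @ B + noise, where z is common to
    both views and z_private is drawn separately for each view.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, d_shared))  # shared component, same for both views
    views, shared_loadings = [], []
    for d in dims:
        W = rng.standard_normal((d_shared, d))        # shared loading for this view
        B = rng.standard_normal((d_private, d))       # private loading for this view
        z_private = rng.standard_normal((n, d_private))  # within-view variation
        x = z @ W + z_private @ B + noise * rng.standard_normal((n, d))
        views.append(x)
        shared_loadings.append(W)
    return views, shared_loadings


(x1, x2), (W1, W2) = sample_two_views(n=100_000)

# Empirical cross-covariance between the two views.
cross_cov = (x1 - x1.mean(axis=0)).T @ (x2 - x2.mean(axis=0)) / (len(x1) - 1)

# Under the assumed model the private parts and the noise are independent
# across views, so the cross-covariance should approach W1.T @ W2.
max_gap = np.abs(cross_cov - W1.T @ W2).max()
print("largest deviation from W1.T @ W2:", max_gap)
```

In this sketch the private components inflate the variance within each view but leave the between-view covariance untouched, which is the property that lets a generative model separate shared from within-set variation.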
In practice, related data sources may exhibit complex co-variation (for instance, audio and visual streams related to the same video) and therefore it is necessary to develop models that impose structured variation within and between data sources, rather than assuming a so-called 'flat' data structure. Additional methodological challenges include determining what the 'useful' information to extract from the multiple data sources is, and building models for predicting one data source given the others. Finally, as well as learning from multiple data sources in an unsupervised manner, there is the closely related problem of multitask learning, or transfer learning, where a task is learned from other related tasks.
More information about the workshop: http://web.mac.com/davidrh/LMSworkshop08/
Videos

Multiview Clustering via Canonical Correlation Analysis
Dec 20, 2008 · 7329 views

The Double-Barrelled LASSO (Sparse Canonical Correlation Analysis)
Dec 20, 2008 · 4845 views

Discussion & Future Directions
Dec 20, 2008 · 3151 views

GP-LVM for Data Consolidation
Dec 20, 2008 · 5307 views

Multi-View Dimensionality Reduction via Canonical Correlation Analysis
Dec 20, 2008 · 5765 views

Regression Canonical Correlation Analysis
Dec 20, 2008 · 5948 views

Probabilistic Models for Data Combination in Recommender Systems
Dec 20, 2008 · 9331 views

Learning from Multiple Sources by Matching Their Distributions
Dec 20, 2008 · 616 views

Learning Shared and Separate Features of Two Related Data Sets using GPLVMs
Dec 20, 2008 · 4771 views

Multiple kernel learning for multiple sources
Dec 20, 2008 · 9346 views

Multiview Fisher Discriminant Analysis
Dec 20, 2008 · 6399 views

Selective Multitask Learning by Coupling Common and Private Representations
Dec 20, 2008 · 3232 views

Two-level infinite mixture for multi-domain data
Dec 20, 2008 · 3032 views