- Assume we have a common unit of information from which we can observe two states (or modalities).
- Each modality is an incomplete view of the actual information there.
- Also, each observed modality is corrupted or noisy.
- Modalities are not independent, they have relationships, dependences, joint probabilities.
- Use multimodal fusion to complement the representation of the true information unit. To make it more accurate with respect to the original content. To reconstruct the missed information.
These are some ideas that we were discussing with prof. Fabio this morning. I think they make perfect sense from a global perspective, even though they require some formalization yet.