Foundation model for sensor fusion
Because a foundation model can represent language-visual and language-audio associations like those seen in human behaviour, and outputs text or embeddings in a homogeneous format, it could be used to fuse sensor readings (possibly multimodal) into a single representation (the embedding).
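A minimal sketch of this idea follows, under stated assumptions. Each sensor reading is serialized to a short text description, encoded into a fixed-size vector, and the vectors are mean-pooled into one fused embedding. The names `embed_text`, `describe`, `fuse`, and `EMBED_DIM` are hypothetical, and `embed_text` is only a deterministic hash-based stand-in for a real foundation-model encoder, so its vectors carry no semantic meaning. Mean pooling is one possible fusion choice, not something prescribed above.

```python
import hashlib
import math

EMBED_DIM = 16  # hypothetical embedding size (at most 32 for this stand-in)


def embed_text(text, dim=EMBED_DIM):
    """Stand-in for a foundation-model encoder (assumption, not a real model).

    Maps text to a deterministic unit-length vector using a SHA-256 digest.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 bytes
    vec = [digest[i] / 255.0 - 0.5 for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def describe(reading):
    """Serialize one sensor reading into text the encoder can consume."""
    return (
        f"{reading['modality']} sensor '{reading['name']}' "
        f"reports {reading['value']} {reading['unit']}"
    )


def fuse(readings, dim=EMBED_DIM):
    """Fuse any number of readings into one embedding via mean pooling."""
    if not readings:
        raise ValueError("at least one sensor reading is required")
    vectors = [embed_text(describe(r), dim) for r in readings]
    # Average each dimension across all per-sensor vectors.
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    norm = math.sqrt(sum(v * v for v in mean))
    # Re-normalize so the fused embedding has unit length when possible.
    return [v / norm for v in mean] if norm > 0 else mean


# Hypothetical readings from three different modalities.
readings = [
    {"modality": "thermal", "name": "t0", "value": 21.5, "unit": "C"},
    {"modality": "audio", "name": "mic0", "value": 62.0, "unit": "dB"},
    {"modality": "visual", "name": "cam0", "value": 340, "unit": "lux"},
]

fused = fuse(readings)
print(len(fused))
```

The fused vector has the same dimension regardless of how many sensors contribute or which modalities they cover, which is the homogeneous-format property the note relies on. In practice, `embed_text` would be replaced by a call to an actual multimodal encoder.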