Latest 4 news

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. (arXiv:2005.03545v1 [cs.CL])

By arxiv.org
Published On ::

Multimodal Sentiment Analysis is an active area of research that leverages multimodal signals for affective understanding of user-generated videos. The predominant approach, addressing this task, has been to develop sophisticated fusion techniques. However, the heterogeneous nature of the signals creates distributional modality gaps that pose significant challenges. In this paper, we aim to learn effective modality representations to aid the process of fusion. We propose a novel framework, MISA, which projects each modality to two distinct subspaces. The first subspace is modality invariant, where the representations across modalities learn their commonalities and reduce the modality gap. The second subspace is modality-specific, which is private to each modality and captures their characteristic features. These representations provide a holistic view of the multimodal data, which is used for fusion that leads to task predictions. Our experiments on popular sentiment analysis benchmarks, MOSI and MOSEI, demonstrate significant gains over state-of-the-art models. We also consider the task of Multimodal Humor Detection and experiment on the recently proposed UR_FUNNY dataset. Here too, our model fares better than strong baselines, establishing MISA as a useful multimodal framework.

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. (arXiv:2005.03545v1 [cs.CL])

p for political: Participation Without Agency Is Not Enough. (arXiv:2005.03534v1 [cs.HC])

Sunny Pointer: Designing a mouse pointer for people with peripheral vision loss. (arXiv:2005.03504v1 [cs.HC])

Computing with bricks and mortar: Classification of waveforms with a doped concrete blocks. (arXiv:2005.03498v1 [cs.ET])

Subquadratic-Time Algorithms for Normal Bases. (arXiv:2005.03497v1 [cs.SC])

Text Recognition in the Wild: A Survey. (arXiv:2005.03492v1 [cs.CV])

Algorithmic Averaging for Studying Periodic Orbits of Planar Differential Systems. (arXiv:2005.03487v1 [cs.SC])

Anonymized GCN: A Novel Robust Graph Embedding Method via Hiding Node Position in Noise. (arXiv:2005.03482v1 [cs.LG])

Brain-like approaches to unsupervised learning of hidden representations -- a comparative study. (arXiv:2005.03476v1 [cs.NE])

Bundle Recommendation with Graph Convolutional Networks. (arXiv:2005.03475v1 [cs.IR])

Ensuring Fairness under Prior Probability Shifts. (arXiv:2005.03474v1 [cs.LG])

Indexing Metric Spaces for Exact Similarity Search. (arXiv:2005.03468v1 [cs.DB])

Predictions and algorithmic statistics for infinite sequence. (arXiv:2005.03467v1 [cs.IT])

High Performance Interference Suppression in Multi-User Massive MIMO Detector. (arXiv:2005.03466v1 [cs.OH])

How Can CNNs Use Image Position for Segmentation?. (arXiv:2005.03463v1 [eess.IV])

ExpDNN: Explainable Deep Neural Network. (arXiv:2005.03461v1 [cs.LG])

AIBench: Scenario-distilling AI Benchmarking. (arXiv:2005.03459v1 [cs.PF])

NTIRE 2020 Challenge on NonHomogeneous Dehazing. (arXiv:2005.03457v1 [cs.CV])

Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture. (arXiv:2005.03454v1 [cs.LG])

A combination of 'pooling' with a prediction model can reduce by 73% the number of COVID-19 (Corona-virus) tests. (arXiv:2005.03453v1 [cs.LG])

Lifted Regression/Reconstruction Networks. (arXiv:2005.03452v1 [cs.LG])

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])

Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences. (arXiv:2005.03436v1 [cs.CL])

Parametrized Universality Problems for One-Counter Nets. (arXiv:2005.03435v1 [cs.FL])

Dirichlet spectral-Galerkin approximation method for the simply supported vibrating plate eigenvalues. (arXiv:2005.03433v1 [math.NA])

The Perceptimatic English Benchmark for Speech Perception Models. (arXiv:2005.03418v1 [cs.CL])

Kunster -- AR Art Video Maker -- Real time video neural style transfer on mobile devices. (arXiv:2005.03415v1 [cs.CV])

NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image. (arXiv:2005.03412v1 [eess.IV])

Detection and Feeder Identification of the High Impedance Fault at Distribution Networks Based on Synchronous Waveform Distortions. (arXiv:2005.03411v1 [eess.SY])

AutoSOS: Towards Multi-UAV Systems Supporting Maritime Search and Rescue with Lightweight AI and Edge Computing. (arXiv:2005.03409v1 [cs.RO])

Joint Prediction and Time Estimation of COVID-19 Developing Severe Symptoms using Chest CT Scan. (arXiv:2005.03405v1 [eess.IV])

A LiDAR-based real-time capable 3D Perception System for Automated Driving in Urban Domains. (arXiv:2005.03404v1 [cs.RO])

Datom: A Deformable modular robot for building self-reconfigurable programmable matter. (arXiv:2005.03402v1 [cs.RO])

Scheduling with a processing time oracle. (arXiv:2005.03394v1 [cs.DS])

Playing Minecraft with Behavioural Cloning. (arXiv:2005.03374v1 [cs.AI])

Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT])

DMCP: Differentiable Markov Channel Pruning for Neural Networks. (arXiv:2005.03354v1 [cs.CV])

Error estimates for the Cahn--Hilliard equation with dynamic boundary conditions. (arXiv:2005.03349v1 [math.NA])

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

Arranging Test Tubes in Racks Using Combined Task and Motion Planning. (arXiv:2005.03342v1 [cs.RO])

Scene Text Image Super-Resolution in the Wild. (arXiv:2005.03341v1 [cs.CV])

Global Distribution of Google Scholar Citations: A Size-independent Institution-based Analysis. (arXiv:2005.03324v1 [cs.DL])

Boosting Cloud Data Analytics using Multi-Objective Optimization. (arXiv:2005.03314v1 [cs.DB])

Safe Data-Driven Distributed Coordination of Intersection Traffic. (arXiv:2005.03304v1 [math.OC])

Expressing Accountability Patterns using Structural Causal Models. (arXiv:2005.03294v1 [cs.SE])

Continuous maximal covering location problems with interconnected facilities. (arXiv:2005.03274v1 [math.OC])

Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV])

Coding for Optimized Writing Rate in DNA Storage. (arXiv:2005.03248v1 [cs.IT])

DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC])

Enhancing Software Development Process Using Automated Adaptation of Object Ensembles. (arXiv:2005.03241v1 [cs.SE])

Will LEED v4 Ever Be Usable?

Wagner Meters rolls out Rapid RH 4.0

24 hours in Copenhagen

Black Friday, Cyber Monday, Travel Tuesday 2024

The Carpet and Rug Institute Presents the 2024 Joseph J.Smrekar Memorial Award

WHATSAPP: +1(443) 720-5561 Buy fake usd/aud/cad/JPY/CNY/GBP/euros/pounds

Sports Drink Maker Electrolit to Build $400 Million Facility in Texas

Top 2024 Advances in Alternative Protein

EXHIBIT: Voices for the Environment: A Century of Bay Area Activism, Nov. 14

EXHIBIT: Voices for the Environment: A Century of Bay Area Activism, Nov. 14

Cosplay Break: Bask in the Charm of Costumed Fans at Kumoricon 2024

Season’s Reelings: Your 2024 Holiday Movie Guide

THE TRASH REPORT: 2024—the Year in TRASH

Parents with kids under 18 swung to Trump in 2024 election: exit polling

Shark fisherman accused of embezzling over $194K from Kentucky church

Subscribe To Our Newsletter

Season’s Reelings: Your 2024 Holiday Movie Guide