Latest for news

for

Mutli-task Learning with Alignment Loss for Far-field Small-Footprint Keyword Spotting. (arXiv:2005.03633v1 [eess.AS])

By arxiv.org
Published On ::

In this paper, we focus on the task of small-footprint keyword spotting under the far-field scenario. Far-field environments are commonly encountered in real-life speech applications, and it causes serve degradation of performance due to room reverberation and various kinds of noises. Our baseline system is built on the convolutional neural network trained with pooled data of both far-field and close-talking speech. To cope with the distortions, we adopt the multi-task learning scheme with alignment loss to reduce the mismatch between the embedding features learned from different domains of data. Experimental results show that our proposed method maintains the performance on close-talking speech and achieves significant improvement on the far-field test set.

Mutli-task Learning with Alignment Loss for Far-field Small-Footprint Keyword Spotting. (arXiv:2005.03633v1 [eess.AS])

Learning Robust Models for e-Commerce Product Search. (arXiv:2005.03624v1 [cs.CL])

Technical Report of "Deductive Joint Support for Rational Unrestricted Rebuttal". (arXiv:2005.03620v1 [cs.AI])

A Local Spectral Exterior Calculus for the Sphere and Application to the Shallow Water Equations. (arXiv:2005.03598v1 [math.NA])

GeoLogic -- Graphical interactive theorem prover for Euclidean geometry. (arXiv:2005.03586v1 [cs.LO])

A Reduced Basis Method For Fractional Diffusion Operators II. (arXiv:2005.03574v1 [math.NA])

Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. (arXiv:2005.03572v1 [cs.CV])

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. (arXiv:2005.03545v1 [cs.CL])

p for political: Participation Without Agency Is Not Enough. (arXiv:2005.03534v1 [cs.HC])

Faceted Search of Heterogeneous Geographic Information for Dynamic Map Projection. (arXiv:2005.03531v1 [cs.HC])

CounQER: A System for Discovering and Linking Count Information in Knowledge Bases. (arXiv:2005.03529v1 [cs.IR])

Practical Perspectives on Quality Estimation for Machine Translation. (arXiv:2005.03519v1 [cs.CL])

An asynchronous distributed and scalable generalized Nash equilibrium seeking algorithm for strongly monotone games. (arXiv:2005.03507v1 [cs.GT])

Sunny Pointer: Designing a mouse pointer for people with peripheral vision loss. (arXiv:2005.03504v1 [cs.HC])

Heidelberg Colorectal Data Set for Surgical Data Science in the Sensor Operating Room. (arXiv:2005.03501v1 [cs.CV])

Computing with bricks and mortar: Classification of waveforms with a doped concrete blocks. (arXiv:2005.03498v1 [cs.ET])

Subquadratic-Time Algorithms for Normal Bases. (arXiv:2005.03497v1 [cs.SC])

Algorithmic Averaging for Studying Periodic Orbits of Planar Differential Systems. (arXiv:2005.03487v1 [cs.SC])

Indexing Metric Spaces for Exact Similarity Search. (arXiv:2005.03468v1 [cs.DB])

Predictions and algorithmic statistics for infinite sequence. (arXiv:2005.03467v1 [cs.IT])

High Performance Interference Suppression in Multi-User Massive MIMO Detector. (arXiv:2005.03466v1 [cs.OH])

How Can CNNs Use Image Position for Segmentation?. (arXiv:2005.03463v1 [eess.IV])

Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture. (arXiv:2005.03454v1 [cs.LG])

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])

Parametrized Universality Problems for One-Counter Nets. (arXiv:2005.03435v1 [cs.FL])

Dirichlet spectral-Galerkin approximation method for the simply supported vibrating plate eigenvalues. (arXiv:2005.03433v1 [math.NA])

The Perceptimatic English Benchmark for Speech Perception Models. (arXiv:2005.03418v1 [cs.CL])

Detection and Feeder Identification of the High Impedance Fault at Distribution Networks Based on Synchronous Waveform Distortions. (arXiv:2005.03411v1 [eess.SY])

A LiDAR-based real-time capable 3D Perception System for Automated Driving in Urban Domains. (arXiv:2005.03404v1 [cs.RO])

Datom: A Deformable modular robot for building self-reconfigurable programmable matter. (arXiv:2005.03402v1 [cs.RO])

Semantic Signatures for Large-scale Visual Localization. (arXiv:2005.03388v1 [cs.CV])

2kenize: Tying Subword Sequences for Chinese Script Conversion. (arXiv:2005.03375v1 [cs.CL])

Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT])

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

Quantum correlation alignment for unsupervised domain adaptation. (arXiv:2005.03355v1 [quant-ph])

DMCP: Differentiable Markov Channel Pruning for Neural Networks. (arXiv:2005.03354v1 [cs.CV])

Error estimates for the Cahn--Hilliard equation with dynamic boundary conditions. (arXiv:2005.03349v1 [math.NA])

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV])

Crop Aggregating for short utterances speaker verification using raw waveforms. (arXiv:2005.03329v1 [eess.AS])

Bitvector-aware Query Optimization for Decision Support Queries (extended version). (arXiv:2005.03328v1 [cs.DB])

Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR])

Interval type-2 fuzzy logic system based similarity evaluation for image steganography. (arXiv:2005.03310v1 [cs.MM])

Knowledge Enhanced Neural Fashion Trend Forecasting. (arXiv:2005.03297v1 [cs.IR])

Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

YANG2UML: Bijective Transformation and Simplification of YANG to UML. (arXiv:2005.03292v1 [cs.SE])

Data selection for multi-task learning under dynamic constraints. (arXiv:2005.03270v1 [eess.SY])

Online Proximal-ADMM For Time-varying Constrained Convex Optimization. (arXiv:2005.03267v1 [eess.SY])

Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV])

Quda: Natural Language Queries for Visual Data Analytics. (arXiv:2005.03257v1 [cs.CL])

The Finish Line: A (Faux) Monument for the Ages

The Finish Line: Right Solutions for the Right Problems

Building Product Transparency— Be Careful What You Ask For

An Energy Label for Buildings

Green Advocacy vs. Informed Consent

Companies' 'Green' Efforts Include Products’ Material Content

Securitas Technology Partners with K9s United in Support of Law Enforcement Canines

Incomplete information can fuel misjudgment: study

FHWA rule updates protections for workers and drivers in work zones

New Standards for Real-Life Dough Testing

Tuftex, Anderson partner for Color Coordinates selling system

TISE 2025 Opens Entries for Best of Awards

EXHIBIT: Voices for the Environment: A Century of Bay Area Activism, Nov. 14

The Power of Black Excellence: HBCUs and the Fight for American Democracy, Nov. 19

EXHIBIT: Voices for the Environment: A Century of Bay Area Activism, Nov. 14

Subscribe To Our Newsletter