A Note on Cores and Quasi Relative Interiors in Partially Finite Convex Programming. (arXiv:2005.03265v1 [math.FA])
The problem of minimizing an entropy functional subject to linear constraints is a useful example of partially finite convex programming. In the 1990s, Borwein and Lewis provided broad and easy-to-verify conditions that guarantee strong duality for such problems. Their approach is to construct a function in the quasi-relative interior of the relevant infinite-dimensional set, which assures the existence of a point in the core of the relevant finite-dimensional set. We revisit this problem and provide an alternative proof by directly appealing to the definition of the core, rather than by relying on any properties of the quasi-relative interior. Our approach admits a minor relaxation of the linear independence requirements in Borwein and Lewis' framework, which allows us to work with certain piecewise-defined moment functions precluded by their conditions. We provide a computed example illustrating how this relaxation may be used to tame the Gibbs phenomenon observed when the underlying data is discontinuous. The relaxation illustrates the understanding we may gain by tackling partially finite problems from both the finite-dimensional and infinite-dimensional sides. The comparison of these two approaches is informative, as both proofs are constructive.
The Congruence Subgroup Problem for finitely generated Nilpotent Groups. (arXiv:2005.03263v1 [math.GR])
The congruence subgroup problem for a finitely generated group $\Gamma$ and $G\leq Aut(\Gamma)$ asks whether the map $\hat{G} \to Aut(\hat{\Gamma})$ is injective, or more generally, what is its kernel $C\left(G,\Gamma\right)$? Here $\hat{X}$ denotes the profinite completion of $X$. In the case $G=Aut(\Gamma)$ we denote $C\left(\Gamma\right)=C\left(Aut(\Gamma),\Gamma\right)$. Let $\Gamma$ be a finitely generated group, $\bar{\Gamma}=\Gamma/[\Gamma,\Gamma]$, and $\Gamma^{*}=\bar{\Gamma}/tor(\bar{\Gamma})\cong\mathbb{Z}^{(d)}$. Denote $Aut^{*}(\Gamma)=\textrm{Im}(Aut(\Gamma) \to Aut(\Gamma^{*}))\leq GL_{d}(\mathbb{Z})$. In this paper we show that when $\Gamma$ is nilpotent, there is a canonical isomorphism $C\left(\Gamma\right)\simeq C(Aut^{*}(\Gamma),\Gamma^{*})$. In other words, $C\left(\Gamma\right)$ is completely determined by the solution to the classical congruence subgroup problem for the arithmetic group $Aut^{*}(\Gamma)$. In particular, in the case where $\Gamma=\Psi_{n,c}$ is a finitely generated free nilpotent group of class $c$ on $n$ elements, we get that $C(\Psi_{n,c})=C(\mathbb{Z}^{(n)})=\{e\}$ whenever $n\geq3$, and $C(\Psi_{2,c})=C(\mathbb{Z}^{(2)})=\hat{F}_{\omega}$, the free profinite group on a countable number of generators.
An Issue Raised in 1978 by a Then-Future Editor-in-Chief of the Journal "Order": Does the Endomorphism Poset of a Finite Connected Poset Tell Us That the Poset Is Connected? (arXiv:2005.03255v1 [math.CO])
In 1978, Dwight Duffus (editor-in-chief of the journal "Order" from 2010 to 2018 and chair of the Mathematics Department at Emory University from 1991 to 2005) wrote that "it is not obvious that $P$ is connected and $P^P$ isomorphic to $Q^Q$ implies that $Q$ is connected," where $P$ and $Q$ are finite non-empty posets. We show that, indeed, under these hypotheses $Q$ is connected and $P\cong Q$.
A Chance Constraint Predictive Control and Estimation Framework for Spacecraft Descent with Field Of View Constraints. (arXiv:2005.03245v1 [math.OC])
Recent studies of optimization methods and GNC for spacecraft near small bodies, focusing on descent, landing, and rendezvous with key safety constraints such as line-of-sight conic zones and soft landings, have shown promising results. This paper considers descent missions to an asteroid surface with a field-of-view constraint linking an onboard camera to asteroid surface markers, handled by a stochastic convex MPC law. A measurement model inspired by undermodeled asteroid gravity and current spacecraft technology is established to develop the constraint. A computationally light stochastic linear-quadratic MPC strategy is then presented to keep the surface markers within a satisfactory field of view during trajectory tracking, employing chance-based constraints and up-to-date estimation uncertainty from navigation. The estimation uncertainty giving rise to the tightened constraints is addressed in particular. Results suggest robust tracking performance across a variety of trajectories.
Non-relativity of Kähler manifolds and complex space forms. (arXiv:2005.03208v1 [math.CV])
We study the non-relativity for two real analytic Kähler manifolds and complex space forms of three types. The first one is a Kähler manifold whose polarization of the local Kähler potential is a Nash function in a local coordinate. The second one is the Hartogs domain equipped with two canonical metrics whose polarizations of the Kähler potentials are the diastatic functions.
New constructions of strongly regular Cayley graphs on abelian groups. (arXiv:2005.03183v1 [math.CO])
In this paper, we give new constructions of strongly regular Cayley graphs on abelian groups as generalizations of a series of known constructions: the construction of covering extended building sets in finite fields by Xia (1992), the product construction of Menon-Hadamard difference sets by Turyn (1984), and the construction of Paley type partial difference sets by Polhill (2010). Then, we obtain new large families of strongly regular Cayley graphs of Latin square type or negative Latin square type.
Exponential decay for negative feedback loop with distributed delay. (arXiv:2005.03136v1 [math.DS])
We derive sufficient conditions for exponential decay of solutions of the delay negative feedback equation with distributed delay. The conditions are written in terms of exponential moments of the distribution. Our method only uses elementary tools of calculus and is robust towards possible extensions to more complex settings, in particular, systems of delay differential equations. We illustrate the applicability of the method to particular distributions - Dirac delta, Gamma distribution, uniform and truncated normal distributions.
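Since the decay criteria are phrased in terms of exponential moments of the delay distribution, a small numerical check of such a moment can be useful. The sketch below is only an illustration, not the paper's actual criterion: it computes $E[e^{s\tau}]$ for a Gamma-distributed delay numerically and compares it with the closed-form moment generating function. The parameter values are arbitrary assumptions.

```python
import numpy as np
from scipy import integrate, stats

def exponential_moment(dist, s, upper=np.inf):
    """Numerically compute E[exp(s * tau)] for a delay distribution `dist`."""
    integrand = lambda t: np.exp(s * t) * dist.pdf(t)
    value, _ = integrate.quad(integrand, 0.0, upper)
    return value

# Gamma-distributed delay with shape k and rate lam (scale = 1/lam).
k, lam = 2.0, 3.0
dist = stats.gamma(a=k, scale=1.0 / lam)

s = 0.5  # candidate decay rate; must satisfy s < lam for the moment to be finite
mgf_numeric = exponential_moment(dist, s)
mgf_closed = (lam / (lam - s)) ** k  # closed-form MGF of the Gamma distribution
print(mgf_numeric, mgf_closed)
```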
Continuation of relative equilibria in the $n$-body problem to spaces of constant curvature. (arXiv:2005.03114v1 [math.DS])
We prove that all non-degenerate relative equilibria of the planar Newtonian $n$-body problem can be continued to spaces of constant curvature $\kappa$, positive or negative, for small enough values of this parameter. We also compute the extension of some classical relative equilibria to curved spaces using numerical continuation. In particular, we extend Lagrange's triangle configuration with different masses to both positive and negative curvature spaces.
A note on Tonelli Lagrangian systems on $\mathbb{T}^2$ with positive topological entropy on high energy level. (arXiv:2005.03108v1 [math.DS])
In this work we study the dynamical behavior of Tonelli Lagrangian systems defined on the tangent bundle of the torus $\mathbb{T}^2=\mathbb{R}^2 / \mathbb{Z}^2$. We prove that the Lagrangian flow restricted to a high energy level $E_L^{-1}(c)$ (i.e., $c > c_0(L)$) has positive topological entropy if the flow satisfies the Kupka-Smale property in $E_L^{-1}(c)$ (i.e., all closed orbits with energy $c$ are hyperbolic or elliptic and all heteroclinic intersections are transverse on $E_L^{-1}(c)$). The proof requires the use of well-known results from Aubry-Mather theory.
Quantization of Lax integrable systems and Conformal Field Theory. (arXiv:2005.03053v1 [math-ph])
We present the correspondence between Lax integrable systems with spectral parameter on a Riemann surface and Conformal Field Theories, in a quite general setup suggested earlier by the author. This correspondence turns out to give a prequantization of the integrable systems in question.
Modeling nanoconfinement effects using active learning. (arXiv:2005.02587v2 [physics.app-ph] UPDATED)
Predicting the spatial configuration of gas molecules in nanopores of shale formations is crucial for fluid flow forecasting and hydrocarbon reserves estimation. The key challenge in these tight formations is that the majority of the pore sizes are less than 50 nm. At this scale, the fluid properties are affected by nanoconfinement effects due to the increased fluid-solid interactions. For instance, gas adsorption to the pore walls could account for up to 85% of the total hydrocarbon volume in a tight reservoir. Although there are analytical solutions that describe this phenomenon for simple geometries, they are not suitable for describing realistic pores, where surface roughness and geometric anisotropy play important roles. To describe these, molecular dynamics (MD) simulations are used since they consider fluid-solid and fluid-fluid interactions at the molecular level. However, MD simulations are computationally expensive, and are not able to simulate scales larger than a few connected nanopores. We present a method for building and training physics-based deep learning surrogate models to carry out fast and accurate predictions of molecular configurations of gas inside nanopores. Since training deep learning models requires extensive databases that are computationally expensive to create, we employ active learning (AL). AL reduces the overhead of creating comprehensive sets of high-fidelity data by determining where the model uncertainty is greatest, and running simulations on the fly to minimize it. The proposed workflow enables nanoconfinement effects to be rigorously considered at the mesoscale where complex connected sets of nanopores control key applications such as hydrocarbon recovery and CO2 sequestration.
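A generic pool-based active-learning loop of the kind described above can be sketched as follows. The surrogate ensemble, the uncertainty measure (ensemble variance), and the batch size are assumptions for illustration; `run_md_simulation` stands in for the expensive high-fidelity MD call.

```python
import numpy as np

def ensemble_uncertainty(models, pool):
    """Disagreement (variance) across an ensemble of surrogate models."""
    preds = np.stack([m.predict(pool) for m in models], axis=0)  # (n_models, n_candidates)
    return preds.var(axis=0)

def active_learning_loop(train_ensemble, run_md_simulation, pool, seed_X, seed_y,
                         n_rounds=10, batch=4):
    """Pool-based active learning: query the simulator where the surrogate is least certain."""
    X, y = list(seed_X), list(seed_y)
    for _ in range(n_rounds):
        models = train_ensemble(np.array(X), np.array(y))
        scores = ensemble_uncertainty(models, pool)
        picked = np.argsort(scores)[-batch:]          # most uncertain pore configurations
        for i in picked:
            X.append(pool[i])
            y.append(run_md_simulation(pool[i]))      # expensive high-fidelity MD run
        pool = np.delete(pool, picked, axis=0)
    return train_ensemble(np.array(X), np.array(y))
```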
The Cascade Transformer: an Application for Efficient Answer Sentence Selection. (arXiv:2005.02534v2 [cs.CL] UPDATED)
Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference. In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers. Each ranker is used to prune a subset of candidates in a batch, thus dramatically increasing throughput at inference time. Partial encodings from the transformer model are shared among rerankers, providing further speed-up. When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy, as measured on two English Question Answering datasets.
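A minimal sketch of the cascading idea follows: cheap rankers run first on the full batch, and each stage keeps only a fraction of the candidates, so the expensive final ranker sees a much smaller set. The ranker interface and keep fractions are assumptions for illustration; in the paper the stages are partial-depth transformer classifiers that share encodings, which is not reproduced here.

```python
import torch

def cascade_rank(candidates, rankers, keep_fractions):
    """Cascade of rankers: each stage scores the surviving candidates and keeps
    only the top fraction, so later (more expensive) stages see smaller batches.

    candidates:     list of (question, answer) pairs
    rankers:        callables mapping such a list to a 1-D score tensor
    keep_fractions: fraction of candidates kept after each stage but the last
    """
    surviving = list(candidates)
    for ranker, keep in zip(rankers[:-1], keep_fractions):
        scores = ranker(surviving)                     # cheap stages first
        k = max(1, int(keep * len(surviving)))
        top = torch.topk(scores, k).indices.tolist()
        surviving = [surviving[i] for i in top]
    final_scores = rankers[-1](surviving)              # full model on the pruned batch
    order = torch.argsort(final_scores, descending=True).tolist()
    return [surviving[i] for i in order]
```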
Temporal Event Segmentation using Attention-based Perceptual Prediction Model for Continual Learning. (arXiv:2005.02463v2 [cs.CV] UPDATED)
Temporal event segmentation of a long video into coherent events requires a high-level understanding of activities' temporal features. The event segmentation problem has been tackled by researchers in an offline training scheme, either by providing full or weak supervision through manually annotated labels or by self-supervised epoch-based training. In this work, we present a continual learning perceptual prediction framework (influenced by cognitive psychology) capable of temporal event segmentation through understanding of the underlying representation of objects within individual frames. Our framework also outputs attention maps which effectively localize and track event-causing objects in each frame. The model is tested on a wildlife monitoring dataset in a continual training manner, resulting in an $80\%$ recall rate at $20\%$ false positive rate for frame-level segmentation. Activity-level testing yielded an $80\%$ activity recall rate with one false activity detection every 50 minutes.
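The perceptual-prediction idea can be summarized as: predict the next frame's representation and declare an event boundary when the prediction error spikes. The following sketch is a simplified, non-attentional version of that loop; the feature extractor, the predictor, and the threshold rule are placeholders, not the paper's model.

```python
import numpy as np

def segment_by_prediction_error(features, predictor, threshold):
    """Mark an event boundary whenever the prediction of the next frame's features
    deviates strongly from what is actually observed.

    features:  array of shape (T, D) with per-frame feature vectors
    predictor: callable mapping features[:t] to a predicted vector for frame t
    threshold: error level above which a boundary is declared (assumed; in practice
               it could be a running mean plus a few standard deviations)
    """
    boundaries, errors = [], []
    for t in range(1, len(features)):
        predicted = predictor(features[:t])
        err = np.linalg.norm(features[t] - predicted)
        errors.append(err)
        if err > threshold:
            boundaries.append(t)
    return boundaries, np.array(errors)
```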
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations. (arXiv:2005.01348v2 [cs.CL] UPDATED)
Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues.
Prediction of Event Related Potential Speller Performance Using Resting-State EEG. (arXiv:2005.01325v3 [cs.HC] UPDATED)
Event-related potential (ERP) spellers can be utilized for device control and communication by locked-in or severely injured patients. However, problems such as inter-subject performance instability and ERP-illiteracy are still unresolved. Therefore, it is necessary to predict classification performance before performing an ERP speller in order to use it efficiently. In this study, we investigated correlations with ERP speller performance using resting-state EEG recorded before the ERP speller session. Specifically, we used spectral power and functional connectivity across four brain regions and five frequency bands. As a result, the delta power in the frontal region and functional connectivity in the delta, alpha, and gamma bands are significantly correlated with ERP speller performance. We also predicted ERP speller performance using EEG features from the resting state. These findings may contribute to investigating ERP-illiteracy and to considering appropriate alternatives for each user.
Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment. (arXiv:2005.00165v3 [cs.CL] UPDATED)
A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e. is a grammatical sentence more probable than an ungrammatical sentence). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e. typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all.
Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images. (arXiv:2004.14487v2 [cs.CV] UPDATED)
The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first-of-its-kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.
Self-Attention with Cross-Lingual Position Representation. (arXiv:2004.13310v2 [cs.CL] UPDATED)
Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g. machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with cross-lingual position representations to model the bilingually aware latent structure for the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English$\Rightarrow$German, WAT'17 Japanese$\Rightarrow$English, and WMT'17 Chinese$\Leftrightarrow$English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information.
SPECTER: Document-level Representation Learning using Citation-informed Transformers. (arXiv:2004.07180v3 [cs.CL] UPDATED)
Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter-document relatedness, which limits their document-level representation power. For applications on scientific documents, such as classification and recommendation, the embeddings power strong performance on end tasks. We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning. Additionally, to encourage further research on document-level models, we introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction, to document classification and recommendation. We show that SPECTER outperforms a variety of competitive baselines on the benchmark.
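The citation signal enters training through a triplet-style objective in which a cited paper serves as the positive example for a query paper. A minimal sketch of such a loss is given below, with the document encoder left abstract; the margin value and the paper's hard-negative sampling strategy are not reproduced here.

```python
import torch
import torch.nn.functional as F

def citation_triplet_loss(query_emb, pos_emb, neg_emb, margin=1.0):
    """Triplet margin loss: pull a paper's embedding toward a paper it cites
    (positive) and push it away from an uncited paper (negative)."""
    d_pos = torch.norm(query_emb - pos_emb, dim=-1)
    d_neg = torch.norm(query_emb - neg_emb, dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

# Usage with a hypothetical document encoder (e.g. a transformer over title+abstract):
# loss = citation_triplet_loss(encode(query_docs), encode(cited_docs), encode(uncited_docs))
```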
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus. (arXiv:2004.06295v2 [cs.CL] UPDATED)
Much research effort is devoted to semantic role labeling (SRL), which is crucial for natural language understanding. Supervised approaches have achieved impressive performance when large-scale corpora are available for resource-rich languages such as English. For low-resource languages with no annotated SRL dataset, however, it is still challenging to obtain competitive performance. Cross-lingual SRL is one promising way to address the problem, and it has achieved great advances with the help of model transfer and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on the Universal Proposition Bank show that the translation-based method is highly effective, and the automatic pseudo datasets can improve the target-language SRL performance significantly.
Watching the World Go By: Representation Learning from Unlabeled Videos. (arXiv:2003.07990v2 [cs.CV] UPDATED)
Recent single image unsupervised representation learning techniques show remarkable success on a variety of tasks. The basic principle in these works is instance discrimination: learning to differentiate between two augmented versions of the same image and a large batch of unrelated images. Networks learn to ignore the augmentation noise and extract semantically meaningful representations. Prior work uses artificial data augmentation techniques such as cropping and color jitter, which can only affect the image in superficial ways and are not aligned with how objects actually change, e.g., through occlusion, deformation, or viewpoint change. In this paper, we argue that videos offer this natural augmentation for free. Videos can provide entirely new views of objects, show deformation, and even connect semantically similar but visually distinct concepts. We propose Video Noise Contrastive Estimation, a method for using unlabeled video to learn strong, transferable single image representations. We demonstrate improvements over recent unsupervised single image techniques, as well as over fully supervised ImageNet pretraining, across a variety of temporal and non-temporal tasks. Code and the Random Related Video Views dataset are available at https://www.github.com/danielgordon10/vince
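A minimal sketch of a noise-contrastive objective over video frames is shown below: embeddings of two frames from the same video are treated as a positive pair against the rest of the batch. This InfoNCE-style form is an assumption for illustration and is not necessarily the exact VINCE objective; the frame encoder is left abstract.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchors, positives, temperature=0.07):
    """Contrastive loss: each anchor frame embedding should match the embedding of
    another frame from the same video, against all other items in the batch.

    anchors, positives: tensors of shape (B, D); row i of each comes from video i.
    """
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature                    # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)  # matching row is the positive
    return F.cross_entropy(logits, targets)
```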
Eccentricity terrain of $\delta$-hyperbolic graphs. (arXiv:2002.08495v2 [cs.DM] UPDATED)
A graph $G=(V,E)$ is $\delta$-hyperbolic if for any four vertices $u,v,w,x$, the two larger of the three distance sums $d(u,v)+d(w,x)$, $d(u,w)+d(v,x)$, and $d(u,x)+d(v,w)$ differ by at most $2\delta \geq 0$. Recent work shows that many real-world graphs have small hyperbolicity $\delta$. This paper describes the eccentricity terrain of a $\delta$-hyperbolic graph. The eccentricity function $e_G(v)=\max\{d(v,u) : u \in V\}$ partitions the vertex set of $G$ into eccentricity layers $C_{k}(G) = \{v \in V : e(v)=rad(G)+k\}$, $k \in \mathbb{N}$, where $rad(G)=\min\{e_G(v): v\in V\}$ is the radius of $G$. The paper studies the eccentricity layers of vertices along shortest paths, identifying such terrain features as hills, plains, valleys, terraces, and plateaus. It introduces the notion of $\eta$-pseudoconvexity, which implies Gromov's $\epsilon$-quasiconvexity, and illustrates the abundance of pseudoconvex sets in $\delta$-hyperbolic graphs. In particular, it shows that all sets $C_{\leq k}(G)=\{v\in V : e_G(v) \leq rad(G) + k\}$, $k\in \mathbb{N}$, are $(2\delta-1)$-pseudoconvex. Additionally, several bounds on the eccentricity of a vertex are obtained which yield a few approaches to efficiently approximating all eccentricities. An $O(\delta |E|)$ time eccentricity approximation $\hat{e}(v)$, for all $v\in V$, is presented that uses distances to two mutually distant vertices and satisfies $e_G(v)-2\delta \leq \hat{e}(v) \leq e_G(v)$. It also shows existence of two eccentricity approximating spanning trees $T$, one constructible in $O(\delta |E|)$ time and the other in $O(|E|)$ time, which satisfy $e_G(v) \leq e_T(v) \leq e_G(v)+4\delta+1$ and $e_G(v) \leq e_T(v) \leq e_G(v)+6\delta$, respectively. Thus, the eccentricity terrain of a tree gives a good approximation (up to an additive error $O(\delta)$) of the eccentricity terrain of a $\delta$-hyperbolic graph.
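As a concrete illustration of the approximation that uses distances to two mutually distant vertices, here is a minimal sketch. It assumes an unweighted, connected graph given as an adjacency dictionary and uses the common two-sweep heuristic to pick the pair $(a,b)$; the paper's exact procedure and its $e_G(v)-2\delta \leq \hat{e}(v) \leq e_G(v)$ guarantee are stated for $\delta$-hyperbolic graphs.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source`; `adj` maps each vertex to an iterable of neighbours."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def approximate_eccentricities(adj, start):
    """e_hat(v) = max(d(v,a), d(v,b)) for a pair (a, b) found by two farthest-point
    sweeps: a is a vertex farthest from `start`, b is a vertex farthest from a.
    Assumes a connected graph."""
    dist0 = bfs_distances(adj, start)
    a = max(dist0, key=dist0.get)
    dist_a = bfs_distances(adj, a)
    b = max(dist_a, key=dist_a.get)
    dist_b = bfs_distances(adj, b)
    return {v: max(dist_a[v], dist_b[v]) for v in adj}
```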
Lake Ice Detection from Sentinel-1 SAR with Deep Learning. (arXiv:2002.07040v2 [eess.IV] UPDATED)
Lake ice, as part of the Essential Climate Variable (ECV) lakes, is an important indicator to monitor climate change and global warming. The spatio-temporal extent of lake ice cover, along with the timings of key phenological events such as freeze-up and break-up, provide important cues about the local and global climate. We present a lake ice monitoring system based on the automatic analysis of Sentinel-1 Synthetic Aperture Radar (SAR) data with a deep neural network. In previous studies that used optical satellite imagery for lake ice monitoring, frequent cloud cover was a main limiting factor, which we overcome thanks to the ability of microwave sensors to penetrate clouds and observe the lakes regardless of the weather and illumination conditions. We cast ice detection as a two class (frozen, non-frozen) semantic segmentation problem and solve it using a state-of-the-art deep convolutional network (CNN). We report results on two winters (2016-17 and 2017-18) and three alpine lakes in Switzerland. The proposed model reaches mean Intersection-over-Union (mIoU) scores >90% on average, and >84% even for the most difficult lake. Additionally, we perform cross-validation tests and show that our algorithm generalises well across unseen lakes and winters.
Toward Improving the Evaluation of Visual Attention Models: a Crowdsourcing Approach. (arXiv:2002.04407v2 [cs.CV] UPDATED)
Human visual attention is a complex phenomenon. A computational model of this phenomenon must take into account where people look in order to evaluate which are the salient locations (spatial distribution of the fixations), when they look at those locations to understand the temporal development of the exploration (temporal order of the fixations), and how they move from one location to another with respect to the dynamics of the scene and the mechanics of the eyes (dynamics). State-of-the-art models focus on learning saliency maps from human data, a process that only takes into account the spatial component of the phenomenon and ignores its temporal and dynamical counterparts. In this work we focus on the evaluation methodology of models of human visual attention. We underline the limits of the current metrics for saliency prediction and scanpath similarity, and we introduce a statistical measure for the evaluation of the dynamics of the simulated eye movements. While deep learning models achieve astonishing performance in saliency prediction, our analysis shows their limitations in capturing the dynamics of the process. We find that unsupervised gravitational models, despite their simplicity, outperform all competitors. Finally, exploiting a crowd-sourcing platform, we present a study aimed at evaluating how strongly the scanpaths generated with the unsupervised gravitational models appear plausible to naive and expert human observers.
Provenance for the Description Logic ELHr. (arXiv:2001.07541v2 [cs.LO] UPDATED)
We address the problem of handling provenance information in ELHr ontologies. We consider a setting recently introduced for ontology-based data access, based on semirings and extending classical data provenance, in which ontology axioms are annotated with provenance tokens. A consequence inherits the provenance of the axioms involved in deriving it, yielding a provenance polynomial as an annotation. We analyse the semantics for the ELHr case and show that the presence of conjunctions poses various difficulties for handling provenance, some of which are mitigated by assuming multiplicative idempotency of the semiring. Under this assumption, we study three problems: ontology completion with provenance, computing the set of relevant axioms for a consequence, and query answering.
Hardware Implementation of Neural Self-Interference Cancellation. (arXiv:2001.04543v2 [eess.SP] UPDATED)
In-band full-duplex systems can transmit and receive information simultaneously on the same frequency band. However, due to the strong self-interference caused by the transmitter to its own receiver, the use of non-linear digital self-interference cancellation is essential. In this work, we describe a hardware architecture for a neural network-based non-linear self-interference (SI) canceller and we compare it with our own hardware implementation of a conventional polynomial based SI canceller. In particular, we present implementation results for a shallow and a deep neural network SI canceller as well as for a polynomial SI canceller. Our results show that the deep neural network canceller achieves a hardware efficiency of up to $312.8$ Msamples/s/mm$^2$ and an energy efficiency of up to $0.9$ nJ/sample, which is $2.1\times$ and $2\times$ better than the polynomial SI canceller, respectively. These results show that NN-based methods applied to communications are not only useful from a performance perspective, but can also be a very effective means to reduce the implementation complexity.
SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images. (arXiv:1912.09121v2 [cs.CV] UPDATED)
High-resolution remote sensing images (HRRSIs) contain substantial ground object information, such as texture, shape, and spatial location. Semantic segmentation, which is an important task for element extraction, has been widely used in processing mass HRRSIs. However, HRRSIs often exhibit large intraclass variance and small interclass variance due to the diversity and complexity of ground objects, thereby bringing great challenges to a semantic segmentation task. In this paper, we propose a new end-to-end semantic segmentation network, which integrates lightweight spatial and channel attention modules that can refine features adaptively. We compare our method with several classic methods on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method can achieve better semantic segmentation results. The source codes are available at https://github.com/lehaifeng/SCAttNet.
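A lightweight channel and spatial attention pair can be sketched as follows, in the spirit of CBAM-style modules; the paper's exact module design may differ. Each module produces multiplicative weights that refine the feature maps adaptively.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Pool global context per channel and rescale the feature maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):                               # x: (B, C, H, W)
        avg = x.mean(dim=(2, 3))                        # global average pooling
        mx = x.amax(dim=(2, 3))                         # global max pooling
        weights = torch.sigmoid(self.mlp(avg) + self.mlp(mx))
        return x * weights[:, :, None, None]

class SpatialAttention(nn.Module):
    """Weight each spatial location using channel-pooled statistics."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))
```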
SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval. (arXiv:1912.05891v2 [cs.IR] UPDATED)
In learning-to-rank for information retrieval, a ranking model is automatically learned from the data and then utilized to rank the sets of retrieved documents. Therefore, an ideal ranking model would be a mapping from a document set to a permutation on the set, and should satisfy two critical requirements: (1) it should have the ability to model cross-document interactions so as to capture local context information in a query; (2) it should be permutation-invariant, which means that any permutation of the inputted documents would not change the output ranking. Previous studies on learning-to-rank either design uni-variate scoring functions that score each document separately, and thus fail to model cross-document interactions, or construct multivariate scoring functions that score documents sequentially, which inevitably sacrifices the permutation invariance requirement. In this paper, we propose a neural learning-to-rank model called SetRank which directly learns a permutation-invariant ranking model defined on document sets of any size. SetRank employs a stack of (induced) multi-head self attention blocks as its key component for learning the embeddings for all of the retrieved documents jointly. The self-attention mechanism not only helps SetRank to capture the local context information from cross-document interactions, but also to learn permutation-equivariant representations for the inputted documents, therefore achieving a permutation-invariant ranking model. Experimental results on three large scale benchmarks showed that SetRank significantly outperformed the baselines, including traditional learning-to-rank models and state-of-the-art neural IR models.
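The core requirement, scoring all retrieved documents jointly while keeping the ranking independent of input order, can be sketched with a self-attention encoder that uses no positional encoding, as below. Dimensions and layer counts are placeholders, and the induced multi-head attention blocks of the actual SetRank model are simplified to a standard transformer encoder.

```python
import torch
import torch.nn as nn

class SetScorer(nn.Module):
    """Score a set of documents jointly with self-attention. With no positional
    encoding, self-attention is permutation-equivariant, so permuting the input
    documents only permutes the scores and leaves the induced ranking unchanged."""
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.score = nn.Linear(d_model, 1)

    def forward(self, doc_features):        # (batch, n_docs, d_model)
        return self.score(self.encoder(doc_features)).squeeze(-1)

# ranking = torch.argsort(SetScorer()(doc_features), dim=-1, descending=True)
```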
Novel Deep Learning Framework for Wideband Spectrum Characterization at Sub-Nyquist Rate. (arXiv:1912.05255v2 [eess.SP] UPDATED)
Introduction of spectrum sharing in 5G and subsequent-generation networks demands base stations with the capability to characterize a wideband spectrum spanning licensed, shared and unlicensed non-contiguous frequency bands. Spectrum characterization involves the identification of vacant bands along with the center frequency and parameters (energy, modulation, etc.) of occupied bands. Such characterization at Nyquist sampling is area- and power-hungry due to the need for high-speed digitization. Though sub-Nyquist sampling (SNS) offers an excellent alternative when the spectrum is sparse, it suffers from poor performance at low signal-to-noise ratio (SNR) and demands careful design and integration of digital reconstruction, a tunable channelizer and characterization algorithms. In this paper, we propose a novel deep-learning framework via a single unified pipeline to accomplish two tasks: 1) reconstruct the signal directly from sub-Nyquist samples, and 2) characterize the wideband spectrum. The proposed approach eliminates the need for complex signal conditioning between reconstruction and characterization and does not need complex tunable channelizers. We extensively compare the performance of our framework for a wide range of modulation schemes, SNRs and channel conditions. We show that the proposed framework outperforms existing SNS-based approaches, and its characterization performance approaches that of the Nyquist sampling-based framework as SNR increases. Ease of design and integration, along with a single unified deep learning framework, makes the proposed architecture a good candidate for reconfigurable platforms.
Global Locality in Biomedical Relation and Event Extraction. (arXiv:1909.04822v2 [cs.CL] UPDATED)
Due to the exponential growth of biomedical literature, event and relation extraction are important tasks in biomedical text mining. Most work focuses only on relation extraction and detects a single entity-pair mention in a short span of text, which is not ideal given the long sentences that appear in biomedical contexts. We propose an approach to both relation and event extraction for simultaneously predicting relationships between all mention pairs in a text. We also perform an empirical study to discuss different network setups for this purpose. The best performing model includes a set of multi-head attentions and convolutions, an adaptation of the transformer architecture, which offers self-attention the ability to strengthen dependencies among related elements and models the interaction between features extracted by multiple attention heads. Experiment results demonstrate that our approach outperforms the state of the art on a set of benchmark biomedical corpora, including the BioNLP 2009, 2011, 2013 and BioCreative 2017 shared tasks.
A Shift Selection Strategy for Parallel Shift-Invert Spectrum Slicing in Symmetric Self-Consistent Eigenvalue Computation. (arXiv:1908.06043v2 [math.NA] UPDATED)
The central importance of large scale eigenvalue problems in scientific computation necessitates the development of massively parallel algorithms for their solution. Recent advances in dense numerical linear algebra have enabled the routine treatment of eigenvalue problems with dimensions on the order of hundreds of thousands on the world's largest supercomputers. In cases where dense treatments are not feasible, Krylov subspace methods offer an attractive alternative due to the fact that they do not require storage of the problem matrices. However, demonstration of scalability of either of these classes of eigenvalue algorithms on computing architectures capable of expressing massive parallelism is non-trivial due to communication requirements and serial bottlenecks, respectively. In this work, we introduce the SISLICE method: a parallel shift-invert algorithm for the solution of the symmetric self-consistent field (SCF) eigenvalue problem. The SISLICE method drastically reduces the communication requirement of current parallel shift-invert eigenvalue algorithms through various shift selection and migration techniques based on density of states estimation and k-means clustering, respectively. This work demonstrates the robustness and parallel performance of the SISLICE method on a representative set of SCF eigenvalue problems and outlines research directions which will be explored in future work.
Deterministic Sparse Fourier Transform with an ell_infty Guarantee. (arXiv:1903.00995v3 [cs.DS] UPDATED)
In this paper we revisit the deterministic version of the Sparse Fourier Transform problem, which asks to read only a few entries of $x \in \mathbb{C}^n$ and design a recovery algorithm such that the output of the algorithm approximates $\hat x$, the Discrete Fourier Transform (DFT) of $x$. The randomized case has been well-understood, while the main work in the deterministic case is that of Merhi et al. (J Fourier Anal Appl 2018), which obtains $O(k^2 \log^{-1}k \cdot \log^{5.5}n)$ samples and a similar runtime with the $\ell_2/\ell_1$ guarantee. We focus on the stronger $\ell_{\infty}/\ell_1$ guarantee and the closely related problem of incoherent matrices. We list our contributions as follows. 1. We find a deterministic collection of $O(k^2 \log n)$ samples for the $\ell_\infty/\ell_1$ recovery in time $O(nk \log^2 n)$, and a deterministic collection of $O(k^2 \log^2 n)$ samples for the $\ell_\infty/\ell_1$ sparse recovery in time $O(k^2 \log^3 n)$. 2. We give new deterministic constructions of incoherent matrices that are row-sampled submatrices of the DFT matrix, via a derandomization of Bernstein's inequality and bounds on exponential sums considered in analytic number theory. Our first construction matches a previous randomized construction of Nelson, Nguyen and Woodruff (RANDOM'12), where there was no constraint on the form of the incoherent matrix. Our algorithms are nearly sample-optimal, since a lower bound of $\Omega(k^2 + k \log n)$ is known, even for the case where the sensing matrix can be arbitrarily designed. A similar lower bound of $\Omega(k^2 \log n/ \log k)$ is known for incoherent matrices.
Asymptotic expansions of eigenvalues by both the Crouzeix-Raviart and enriched Crouzeix-Raviart elements. (arXiv:1902.09524v2 [math.NA] UPDATED)
Asymptotic expansions are derived for eigenvalues produced by both the Crouzeix-Raviart element and the enriched Crouzeix-Raviart element. The expansions are optimal in the sense that extrapolated eigenvalues based on them admit fourth-order convergence provided that the exact eigenfunctions are smooth enough. The major challenge in establishing the expansions comes from two facts: the canonical interpolations of both nonconforming elements lack a crucial superclose property, and both elements are nonconforming. The main idea is to employ the relation between the lowest-order mixed Raviart-Thomas element and the two nonconforming elements, and consequently make use of the superclose property of the canonical interpolation of the lowest-order mixed Raviart-Thomas element. To overcome the difficulty caused by the nonconformity, the commuting property of the canonical interpolation operators of both nonconforming elements is further used, which turns the consistency error problem into an interpolation error problem. Then, a series of new results are obtained to show the final expansions.
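The expansions are what justify extrapolating eigenvalue approximations computed on two nested meshes. As a generic illustration (not the paper's specific formula), the sketch below applies Richardson extrapolation under the assumed expansion $\lambda_h \approx \lambda + C h^2$, which removes the leading error term.

```python
def richardson_extrapolate(lam_h, lam_h2, order=2):
    """Eliminate the leading O(h**order) error term from two eigenvalue
    approximations computed on meshes of size h and h/2, assuming
    lam_h ~ lam + C * h**order + higher-order terms."""
    factor = 2 ** order
    return (factor * lam_h2 - lam_h) / (factor - 1)

# Example with a hypothetical second-order eigenvalue approximation of lam = 1:
lam_h, lam_h2 = 1.0 + 0.04, 1.0 + 0.01        # errors C*h**2 and C*(h/2)**2
print(richardson_extrapolate(lam_h, lam_h2))  # -> 1.0, leading error removed
```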
Using hierarchical matrices in the solution of the time-fractional heat equation by multigrid waveform relaxation. (arXiv:1706.07632v3 [math.NA] UPDATED)
This work deals with the efficient numerical solution of the time-fractional heat equation discretized on non-uniform temporal meshes. Non-uniform grids are essential to capture the singularities of "typical" solutions of time-fractional problems. We propose an efficient space-time multigrid method based on the waveform relaxation technique, which accounts for the nonlocal character of the fractional differential operator. To maintain an optimal complexity, which can be obtained for the case of uniform grids, we approximate the coefficient matrix corresponding to the temporal discretization by its hierarchical matrix ($\mathcal{H}$-matrix) representation. In particular, the proposed method has a computational cost of $\mathcal{O}(k N M \log(M))$, where $M$ is the number of time steps, $N$ is the number of spatial grid points, and $k$ is a parameter which controls the accuracy of the $\mathcal{H}$-matrix approximation. The efficiency and the good convergence of the algorithm, which can be theoretically justified by a semi-algebraic mode analysis, are demonstrated through numerical experiments in both one- and two-dimensional spaces.
Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity. (arXiv:1706.02205v4 [math.NA] UPDATED)
Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{ x_{i} \}_{1 \leq i \leq N} \subset \mathbb{R}^{d}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to identify a subset $S \subset \{ 1, \dots, N \}^2$, with $\# S = O ( N \log (N) \log^{d} ( N /\epsilon ) )$, such that the zero fill-in incomplete Cholesky factorisation of the sparse matrix $\Theta_{ij} 1_{( i, j ) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorisation can provably be obtained in complexity $O ( N \log( N ) \log^{d}( N /\epsilon) )$ in space and $O ( N \log^{2}( N ) \log^{2d}( N /\epsilon) )$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_{i}$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity $O ( N \log^{d}( N /\epsilon) )$ in space and $O ( N \log^{2d}( N /\epsilon) )$ in time, improving upon the state of the art for general elliptic operators.
Multi-task Learning with Alignment Loss for Far-field Small-Footprint Keyword Spotting. (arXiv:2005.03633v1 [eess.AS])
In this paper, we focus on the task of small-footprint keyword spotting under the far-field scenario. Far-field environments are commonly encountered in real-life speech applications, and they cause severe degradation of performance due to room reverberation and various kinds of noise. Our baseline system is built on a convolutional neural network trained with pooled data of both far-field and close-talking speech. To cope with the distortions, we adopt a multi-task learning scheme with an alignment loss to reduce the mismatch between the embedding features learned from different domains of data. Experimental results show that our proposed method maintains the performance on close-talking speech and achieves significant improvement on the far-field test set.
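A minimal sketch of the multi-task objective is given below: the keyword classification loss on far-field speech is combined with an alignment term that pulls the far-field embedding toward the embedding of the paired close-talking utterance. The use of MSE for alignment and the weighting factor are assumptions for illustration, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def kws_multitask_loss(logits_far, labels, emb_far, emb_close, align_weight=0.1):
    """Keyword-spotting classification loss plus an alignment term that pulls the
    embedding of a far-field utterance toward the embedding of its close-talking
    counterpart (MSE used here as one possible alignment loss)."""
    cls_loss = F.cross_entropy(logits_far, labels)
    align_loss = F.mse_loss(emb_far, emb_close.detach())
    return cls_loss + align_weight * align_loss
```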
The Zhou Ordinal of Labelled Markov Processes over Separable Spaces. (arXiv:2005.03630v1 [cs.LO])
There exist two notions of equivalence of behavior between states of a Labelled Markov Process (LMP): state bisimilarity and event bisimilarity. The first one can be considered as an appropriate generalization to continuous spaces of Larsen and Skou's probabilistic bisimilarity, while the second one is characterized by a natural logic. C. Zhou expressed state bisimilarity as the greatest fixed point of an operator $\mathcal{O}$, and thus introduced an ordinal measure of the discrepancy between it and event bisimilarity. We call this ordinal the "Zhou ordinal" of $\mathbb{S}$, $\mathfrak{Z}(\mathbb{S})$. When $\mathfrak{Z}(\mathbb{S})=0$, $\mathbb{S}$ satisfies the Hennessy-Milner property. The second author proved the existence of an LMP $\mathbb{S}$ with $\mathfrak{Z}(\mathbb{S}) \geq 1$ and Zhou showed that there are LMPs having an infinite Zhou ordinal. In this paper we show that there are LMPs $\mathbb{S}$ over separable metrizable spaces having arbitrarily large countable $\mathfrak{Z}(\mathbb{S})$, and that it is consistent with the axioms of $\mathit{ZFC}$ that there is such a process with an uncountable Zhou ordinal.
Learning Robust Models for e-Commerce Product Search. (arXiv:2005.03624v1 [cs.CL])
Showing items that do not match search query intent degrades the customer experience in e-commerce. These mismatches result from counterfactual biases of the ranking algorithms toward noisy behavioral signals such as clicks and purchases in the search logs. Mitigating the problem requires a large labeled dataset, which is expensive and time-consuming to obtain. In this paper, we develop a deep, end-to-end model that learns to effectively classify mismatches and to generate hard mismatched examples to improve the classifier. We train the model end-to-end by introducing a latent variable into the cross-entropy loss that alternates between using the real and generated samples. This not only makes the classifier more robust but also boosts the overall ranking performance. Our model achieves a relative gain over baselines of more than 26% in F-score and more than 17% in area under the PR curve. On live search traffic, our model shows significant improvements in multiple countries.
Delayed approximate matrix assembly in multigrid with dynamic precisions. (arXiv:2005.03606v1 [cs.MS])
The accurate assembly of the system matrix is an important step in any code that solves partial differential equations on a mesh. We either explicitly set up a matrix, or we work in a matrix-free environment where we have to be able to quickly return matrix entries upon demand. Either way, the construction can become costly due to non-trivial material parameters entering the equations, multigrid codes requiring cascades of matrices that depend upon each other, or dynamic adaptive mesh refinement that necessitates the recomputation of matrix entries or the whole equation system throughout the solve. We propose that these constructions can be performed concurrently with the multigrid cycles. Initial geometric matrices and low accuracy integrations kickstart the multigrid, while improved assembly data is fed to the solver as and when it becomes available. The time to solution is improved as we eliminate an expensive preparation phase traditionally delaying the actual computation. We eliminate algorithmic latency. Furthermore, we desynchronise the assembly from the solution process. This anarchic increase of the concurrency level improves the scalability. Assembly routines are notoriously memory- and bandwidth-demanding. As we work with iteratively improving operator accuracies, we finally propose the use of a hierarchical, lossy compression scheme such that the memory footprint is brought down aggressively where the system matrix entries carry little information or are not yet available with high accuracy.
A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer's Type. (arXiv:2005.03593v1 [cs.CL])
In recent years there has been a burgeoning interest in the use of computational methods to distinguish between elicited speech samples produced by patients with dementia, and those from healthy controls. The difference between perplexity estimates from two neural language models (LMs) - one trained on transcripts of speech produced by healthy participants and the other trained on transcripts from patients with dementia - as a single feature for diagnostic classification of unseen transcripts has been shown to produce state-of-the-art performance. However, little is known about why this approach is effective, and on account of the lack of case/control matching in the most widely-used evaluation set of transcripts (DementiaBank), it is unclear if these approaches are truly diagnostic, or are sensitive to other variables. In this paper, we interrogate neural LMs trained on participants with and without dementia using synthetic narratives previously developed to simulate progressive semantic dementia by manipulating lexical frequency. We find that perplexity of neural LMs is strongly and differentially associated with lexical frequency, and that a mixture model resulting from interpolating control and dementia LMs improves upon the current state-of-the-art for models trained on transcript text exclusively.
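Two quantities drive the approach: the difference of perplexities assigned by the control-trained and dementia-trained LMs (used as a single classification feature), and an interpolation of the two LMs' probabilities. A minimal sketch of both follows, with the language-model scoring left abstract; the mixing weight is an assumed tunable, not a value from the paper.

```python
import math

def perplexity(neg_log_likelihood, n_tokens):
    """Perplexity from a summed negative log-likelihood (natural log) over n_tokens."""
    return math.exp(neg_log_likelihood / n_tokens)

def perplexity_difference_feature(nll_control, nll_dementia, n_tokens):
    """Single diagnostic feature: PPL under the control LM minus PPL under the dementia LM."""
    return perplexity(nll_control, n_tokens) - perplexity(nll_dementia, n_tokens)

def interpolated_token_prob(p_control, p_dementia, alpha=0.5):
    """Token probability under a mixture of the two LMs; alpha is a tunable assumption."""
    return alpha * p_control + (1 - alpha) * p_dementia
```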
Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. (arXiv:2005.03572v1 [cs.CV])
Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains of average precision (AP) and average recall (AR), without the sacrifice of inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss for better distinguishing difficult regression cases. The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to the widely adopted $\ell_n$-norm loss and IoU-based losses. Furthermore, we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires fewer iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT) and object detection (e.g., YOLO v3, SSD and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains of +1.7 AP and +6.2 AR$_{100}$ for object detection, and +0.9 AP and +3.5 AR$_{100}$ for instance segmentation, with 27.1 FPS on one NVIDIA GTX 1080Ti GPU. All the source code and trained models are available at https://github.com/Zzh-tju/CIoU
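The three geometric factors have a standard closed form in the CIoU literature: the loss adds to $1-\mathrm{IoU}$ a normalized squared center distance and an aspect-ratio term $\alpha v$ with $v=\frac{4}{\pi^2}(\arctan\frac{w^{gt}}{h^{gt}}-\arctan\frac{w}{h})^2$. The sketch below follows that published formulation; it is an independent re-implementation for illustration, not the authors' released code (which is linked above).

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """Complete-IoU loss for axis-aligned boxes given as (x1, y1, x2, y2) tensors
    of shape (N, 4): overlap (IoU) + normalized center distance + aspect-ratio term."""
    # Intersection and union
    x1 = torch.max(pred[:, 0], target[:, 0]); y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2]); y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Normalized distance between box centers (relative to enclosing-box diagonal)
    cxp = (pred[:, 0] + pred[:, 2]) / 2; cyp = (pred[:, 1] + pred[:, 3]) / 2
    cxt = (target[:, 0] + target[:, 2]) / 2; cyt = (target[:, 1] + target[:, 3]) / 2
    ex1 = torch.min(pred[:, 0], target[:, 0]); ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2]); ey2 = torch.max(pred[:, 3], target[:, 3])
    center_dist = (cxp - cxt) ** 2 + (cyp - cyt) ** 2
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps

    # Aspect-ratio consistency term
    wp = pred[:, 2] - pred[:, 0]; hp = pred[:, 3] - pred[:, 1]
    wt = target[:, 2] - target[:, 0]; ht = target[:, 3] - target[:, 1]
    v = (4 / math.pi ** 2) * (torch.atan(wt / (ht + eps)) - torch.atan(wp / (hp + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return (1 - iou + center_dist / diag + alpha * v).mean()
```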
Heidelberg Colorectal Data Set for Surgical Data Science in the Sensor Operating Room. (arXiv:2005.03501v1 [cs.CV])
Image-based tracking of medical instruments is an integral part of many surgical data science applications. Previous research has addressed the tasks of detecting, segmenting and tracking medical instruments based on laparoscopic video data. However, the methods proposed still tend to fail when applied to challenging images and do not generalize well to data they have not been trained on. This paper introduces the Heidelberg Colorectal (HeiCo) data set - the first publicly available data set enabling comprehensive benchmarking of medical instrument detection and segmentation algorithms with a specific emphasis on robustness and generalization capabilities of the methods. Our data set comprises 30 laparoscopic videos and corresponding sensor data from medical devices in the operating room for three different types of laparoscopic surgery. Annotations include surgical phase labels for all frames in the videos as well as instance-wise segmentation masks for surgical instruments in more than 10,000 individual frames. The data has successfully been used to organize international competitions in the scope of the Endoscopic Vision Challenges (EndoVis) 2017 and 2019.
Anonymized GCN: A Novel Robust Graph Embedding Method via Hiding Node Position in Noise. (arXiv:2005.03482v1 [cs.LG])
Graph convolutional networks (GCNs) have achieved state-of-the-art performance on node prediction tasks over graph-structured data. However, as graph attack methods have multiplied, there has been little research on the robustness of GCNs. In this paper we design a robust GCN method for node prediction tasks. A graph structure contains two types of information, node information and connection information, and attackers usually modify the connection information to interfere with node predictions. We therefore first propose a method that hides the connection information inside a generator, named Anonymized GCN (AN-GCN). By hiding the connection information of the graph structure in the generator through adversarial training, accurate node prediction can be completed using only the node number rather than its specific position in the graph. Specifically, we first identify the key to determining the embedding of a specific node: the row of the eigenvector matrix of the Laplacian matrix corresponding to that node. By targeting it as the output of the generator, we design a method that hides the node number in noise. Taking the corresponding noise as input, we obtain the connection structure of the node instead of observing it directly. The encoder and decoder are then spliced together in the discriminator, so that after adversarial training the generator and discriminator can cooperate to encode and decode the graph and complete the node prediction. Finally, all node positions can be generated from noise at the same time; that is, the generator hides all the connection information of the graph structure. The evaluation shows that we only need the initial features and node numbers of the nodes to complete the node prediction, and the accuracy did not decrease but increased by 0.0293.
A combination of 'pooling' with a prediction model can reduce by 73% the number of COVID-19 (Corona-virus) tests. (arXiv:2005.03453v1 [cs.LG])
We show that by combining a prediction model (based on neural networks) with a new method of test pooling called 'Grid' (better than the original Dorfman method and better than double pooling), we can reduce the number of COVID-19 tests by 73%.
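The 'Grid' pooling scheme and the neural predictor themselves are not reproduced here, but the classical Dorfman baseline the abstract compares against has a simple expected-cost formula, which the following sketch evaluates and minimizes over the pool size (the prevalence value is an arbitrary assumption).

```python
def dorfman_tests_per_person(prevalence, pool_size):
    """Expected number of tests per person under classical Dorfman pooling:
    one pooled test per group, plus individual retests when a pool is positive."""
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    return 1 / pool_size + p_pool_positive

def best_pool_size(prevalence, max_size=32):
    """Pool size minimizing the expected number of tests per person."""
    return min(range(2, max_size + 1),
               key=lambda s: dorfman_tests_per_person(prevalence, s))

p = 0.01                                   # assumed prevalence
s = best_pool_size(p)
print(s, dorfman_tests_per_person(p, s))   # roughly pool size 11, ~0.2 tests per person
```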
An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])
We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.
el The Perceptimatic English Benchmark for Speech Perception Models. (arXiv:2005.03418v1 [cs.CL]) By arxiv.org Published On :: We present the Perceptimatic English Benchmark, an open experimental benchmark for evaluating quantitative models of speech perception in English. The benchmark consists of ABX stimuli along with the responses of 91 American English-speaking listeners. The stimuli test the discrimination of a large number of English and French phonemic contrasts. They are extracted directly from corpora of read speech, making them appropriate for evaluating statistical acoustic models (such as those used in automatic speech recognition) trained on typical speech data sets. We show that listeners' phone discrimination is correlated with several types of models, and we give recommendations for researchers seeking easily calculated norms of acoustic distance on experimental stimuli. We show that DeepSpeech, a standard English speech recognizer, is more specialized for English phoneme discrimination than English listeners are, and is poorly correlated with their behaviour, even though it yields a low error on the decision task given to humans. Full Article
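A minimal sketch of how an ABX trial is typically scored from model representations; this is not the benchmark's official evaluation code. A trial counts as correct when the model's distance d(X, A) is smaller than d(X, B), where A and X share a phonemic category. The random feature matrices are placeholders for whatever representation a model produces, and the frame-wise DTW distance is one common but assumed choice.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic time warping distance between two (frames, dims) feature matrices."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-wise Euclidean cost
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def abx_correct(A: np.ndarray, B: np.ndarray, X: np.ndarray) -> bool:
    """X is from the same category as A; the trial is correct if X is closer to A."""
    return dtw_distance(X, A) < dtw_distance(X, B)

rng = np.random.default_rng(0)
A, B, X = (rng.normal(size=(20, 13)) for _ in range(3))   # placeholder features
print(abx_correct(A, B, X))
```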
el Joint Prediction and Time Estimation of COVID-19 Developing Severe Symptoms using Chest CT Scan. (arXiv:2005.03405v1 [eess.IV]) By arxiv.org Published On :: With the rapid worldwide spread of Coronavirus disease (COVID-19), it is of great importance to conduct early diagnosis of COVID-19 and to predict the time at which patients might convert to the severe stage, in order to design effective treatment plans and reduce clinicians' workloads. In this study, we propose a joint classification and regression method to determine whether a patient will develop severe symptoms later on and, if so, to predict the time the patient will take to convert to the severe stage. To do this, the proposed method takes into account 1) a weight for each sample, to reduce the influence of outliers and address the problem of imbalanced classification, and 2) a weight for each feature via a sparsity regularization term, to remove redundant features of the high-dimensional data and to learn the information shared between the classification task and the regression task. To our knowledge, this is the first work to predict both the disease progression and the conversion time, which could help clinicians deal with potentially severe cases in time or even save patients' lives. Experimental analysis was conducted on a real data set from two hospitals with 422 chest computed tomography (CT) scans, in which 52 cases converted to severe after an average of 5.64 days and 34 cases were severe at admission. Results show that our method achieves the best classification performance (e.g., 85.91% accuracy) and regression performance (e.g., a correlation coefficient of 0.462) compared to all comparison methods. Moreover, our proposed method yields 76.97% accuracy in predicting the severe cases, a correlation coefficient of 0.524, and a 0.55-day error for the conversion time. Full Article
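A minimal sketch of the kind of joint objective the abstract describes, not the authors' exact formulation: a sample-weighted classification loss, a sample-weighted regression loss on the conversion time, and a row-wise L2,1 sparsity penalty on a shared feature-weight matrix. The specific loss terms, the shared-weight layout, and the regularization strength are all illustrative assumptions.

```python
import numpy as np

def joint_loss(W, X, y_cls, y_reg, sample_w, lam=0.1):
    """W: (features, 2) shared weights; column 0 drives classification, column 1 regression."""
    scores = X @ W[:, 0]
    probs = 1.0 / (1.0 + np.exp(-scores))
    # Sample-weighted logistic loss (down-weights outliers, rebalances classes).
    cls_term = -np.sum(sample_w * (y_cls * np.log(probs + 1e-12)
                                   + (1 - y_cls) * np.log(1 - probs + 1e-12)))
    # Sample-weighted squared error on the conversion time.
    reg_term = np.sum(sample_w * (X @ W[:, 1] - y_reg) ** 2)
    # L2,1 norm over rows of W encourages both tasks to select the same features.
    l21_term = np.sum(np.linalg.norm(W, axis=1))
    return cls_term + reg_term + lam * l21_term

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
W = rng.normal(size=(5, 2))
print(joint_loss(W, X, y_cls=rng.integers(0, 2, 8), y_reg=rng.normal(size=8),
                 sample_w=np.ones(8)))
```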
el Datom: A Deformable modular robot for building self-reconfigurable programmable matter. (arXiv:2005.03402v1 [cs.RO]) By arxiv.org Published On :: Moving a module in a modular robot is a complex and error-prone process. Unlike in a swarm, in the modular robots we target the moving module must keep a connection to at least one other module. To miniaturize each module down to a few millimeters, we previously proposed a design based on electrostatic actuators. However, the resulting movement is composed of several attachment and detachment steps, and each small step can fail, causing a module to break its connection. The idea developed in this paper is to create a new kind of deformable module whose movement preserves the connection between the moving module and the fixed modules. We detail the geometry and the practical constraints encountered in the design of this new module. We then validate that a module can move within an existing configuration. This requires the cooperation of some of the modules placed along the path, and we show in simulation that there exists a motion process to reach every free position on the surface for a given configuration. Full Article
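A minimal sketch of the reachability question the last sentence raises, not the paper's motion planner or module geometry: a breadth-first search over free lattice cells that checks which surface positions a single moving module could occupy while always staying attached to at least one fixed module. The square lattice, 4-neighbour adjacency, and the toy configuration are simplifying assumptions.

```python
from collections import deque

FIXED = {(0, 0), (1, 0), (2, 0)}                  # occupied (fixed) module cells
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # square-lattice adjacency

def adjacent_to_fixed(cell):
    """True if the cell touches at least one fixed module (connection is kept)."""
    x, y = cell
    return any((x + dx, y + dy) in FIXED for dx, dy in NEIGHBOURS)

def reachable(start):
    """Free cells reachable from `start` while staying attached to the structure."""
    seen, queue = {start}, deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in NEIGHBOURS:
            nxt = (x + dx, y + dy)
            if nxt not in FIXED and nxt not in seen and adjacent_to_fixed(nxt):
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable((0, 1))))
```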
el Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation. (arXiv:2005.03393v1 [cs.CL]) By arxiv.org Published On :: In encoder-decoder neural models, multiple encoders are generally used to represent contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder not only encodes the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder approaches in context-aware translation: some of the improvements come from robust training. We compare several methods that introduce noise and/or a well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. We also establish a new state of the art on the IWSLT Fr-En task by careful use of noise generation and dropout methods. Full Article
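A minimal sketch of one way to act on the "context encoder as noise generator" finding, not the paper's exact setup: perturb the context representation with Gaussian noise during training before combining it with the sentence representation. The tensor shapes, noise scale, and concatenation rule below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def combine(sentence_enc: np.ndarray, context_enc: np.ndarray,
            noise_std: float = 0.1, training: bool = True) -> np.ndarray:
    """Concatenate the sentence representation with a (noised, at training time) context one."""
    if training and noise_std > 0:
        context_enc = context_enc + rng.normal(scale=noise_std, size=context_enc.shape)
    return np.concatenate([sentence_enc, context_enc], axis=-1)

sentence_enc = rng.normal(size=(7, 512))   # (source tokens, hidden) placeholder
context_enc = rng.normal(size=(7, 512))    # aligned context representation, same length for simplicity
print(combine(sentence_enc, context_enc).shape)   # (7, 1024)
```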
el Energy-efficient topology to enhance the wireless sensor network lifetime using connectivity control. (arXiv:2005.03370v1 [cs.NI]) By arxiv.org Published On :: Wireless sensor networks have attracted much attention because of their many applications in industry, the military, medicine, agriculture, and education, and a large body of research has aimed to expand their applications and improve their efficiency. However, many challenges remain in different parts of such networks, and one of the most important is improving the network lifetime. Since sensor nodes are generally battery-powered, the key issue in these networks is to reduce the power consumption of the nodes so that the network lifetime is extended to an acceptable level. The contribution of this paper is to use topology control, a threshold on the remaining energy of the nodes, and two meta-heuristics, SA (Simulated Annealing) and VNS (Variable Neighbourhood Search), to preserve the energy remaining in the sensors. Moreover, a low-cost spanning tree is used to establish appropriate connectivity control among the nodes and thereby increase the network lifetime. Simulation results show that the proposed method improves the sensor lifetime and reduces the energy consumed. Full Article
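A minimal sketch of the "low-cost spanning tree" ingredient, not the paper's full algorithm (which also involves the energy threshold, SA, and VNS): Prim's algorithm over sensor nodes, with link costs taken to grow with squared distance as a common proxy for transmission energy. The node positions and the cost model are illustrative assumptions.

```python
# Hypothetical sensor positions on a 2D plane.
nodes = {0: (0.0, 0.0), 1: (1.0, 0.2), 2: (2.1, 0.1), 3: (1.0, 1.5), 4: (2.0, 1.6)}

def link_cost(a, b):
    (x1, y1), (x2, y2) = nodes[a], nodes[b]
    return (x1 - x2) ** 2 + (y1 - y2) ** 2      # transmission energy ~ distance^2

def prim_mst(root=0):
    """Grow a minimum-cost spanning tree from `root` (Prim's algorithm)."""
    in_tree, edges = {root}, []
    while len(in_tree) < len(nodes):
        u, v = min(((u, v) for u in in_tree for v in nodes if v not in in_tree),
                   key=lambda e: link_cost(*e))
        in_tree.add(v)
        edges.append((u, v, round(link_cost(u, v), 3)))
    return edges

print(prim_mst())   # spanning-tree edges used for connectivity control
```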