date Mirror Symmetry for Non-Abelian Landau-Ginzburg Models. (arXiv:1812.06200v3 [math.AG] UPDATED) By arxiv.org Published On :: We consider Landau-Ginzburg models stemming from groups comprised of non-diagonal symmetries, and we describe a rule for the mirror LG model. In particular, we present the non-abelian dual group, which serves as the appropriate choice of group for the mirror LG model. We also describe an explicit mirror map between the A-model and the B-model state spaces for two examples. Further, we prove that this mirror map is an isomorphism between the untwisted broad sectors and the narrow diagonal sectors for Fermat type polynomials. Full Article
date Bernoulli decomposition and arithmetical independence between sequences. (arXiv:1811.11545v2 [math.NT] UPDATED) By arxiv.org Published On :: In this paper we study the following set \[A=\{p(n)+2^n d \bmod 1 : n\geq 1\}\subset [0,1],\] where $p$ is a polynomial with at least one irrational coefficient on non-constant terms, $d$ is any real number and, for $a\in [0,\infty)$, $a \bmod 1$ is the fractional part of $a$. By a Bernoulli decomposition method, we show that the closure of $A$ must have full Hausdorff dimension. Full Article
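To make the set $A$ from the item above concrete, the following minimal sketch lists its first few points. The choices $p(n)=\sqrt{2}\,n$ and $d=1/3$ are assumptions picked only for illustration, and floating-point arithmetic limits this to small $n$.

```python
import math

def frac(x):
    """Fractional part of a non-negative real number."""
    return x - math.floor(x)

def points_of_A(p, d, n_max):
    """Return the points p(n) + 2**n * d mod 1 for n = 1..n_max."""
    return [frac(p(n) + (2 ** n) * d) for n in range(1, n_max + 1)]

# Hypothetical example: p has the irrational coefficient sqrt(2) on its linear term.
pts = points_of_A(lambda n: math.sqrt(2) * n, 1 / 3, 20)
```

This only enumerates finitely many points; it says nothing about the Hausdorff dimension of the closure, which is the actual result of the paper.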
date Optimal construction of Koopman eigenfunctions for prediction and control. (arXiv:1810.08733v3 [math.OC] UPDATED) By arxiv.org Published On :: This work presents a novel data-driven framework for constructing eigenfunctions of the Koopman operator geared toward prediction and control. The method leverages the richness of the spectrum of the Koopman operator away from attractors to construct a rich set of eigenfunctions such that the state (or any other observable quantity of interest) is in the span of these eigenfunctions and hence predictable in a linear fashion. The eigenfunction construction is optimization-based with no dictionary selection required. Once a predictor for the uncontrolled part of the system is obtained in this way, the incorporation of control is done through a multi-step prediction error minimization, carried out by a simple linear least-squares regression. The predictor so obtained is in the form of a linear controlled dynamical system and can be readily applied within the Koopman model predictive control framework of [12] to control nonlinear dynamical systems using linear model predictive control tools. The method is entirely data-driven and based purely on convex optimization, with no reliance on neural networks or other non-convex machine learning tools. The novel eigenfunction construction method is also analyzed theoretically, proving rigorously that the family of eigenfunctions obtained is rich enough to span the space of all continuous functions. In addition, the method is extended to construct generalized eigenfunctions that also give rise to Koopman invariant subspaces and hence can be used for linear prediction. Detailed numerical examples with code available online demonstrate the approach, both for prediction and feedback control. Full Article
date On $p$-groups with automorphism groups related to the exceptional Chevalley groups. (arXiv:1810.08365v3 [math.GR] UPDATED) By arxiv.org Published On :: Let $\hat G$ be the finite simply connected version of an exceptional Chevalley group, and let $V$ be a nontrivial irreducible module, of minimal dimension, for $\hat G$ over its field of definition. We explore the overgroup structure of $\hat G$ in $\mathrm{GL}(V)$, and the submodule structure of the exterior square (and sometimes the third Lie power) of $V$. When $\hat G$ is defined over a field of odd prime order $p$, this allows us to construct the smallest (with respect to certain properties) $p$-groups $P$ such that the group induced by $\mathrm{Aut}(P)$ on $P/\Phi(P)$ is either $\hat G$ or its normaliser in $\mathrm{GL}(V)$. Full Article
date Exotic Springer fibers for orbits corresponding to one-row bipartitions. (arXiv:1810.03731v2 [math.RT] UPDATED) By arxiv.org Published On :: We study the geometry and topology of exotic Springer fibers for orbits corresponding to one-row bipartitions from an explicit, combinatorial point of view. This includes a detailed analysis of the structure of the irreducible components and their intersections as well as the construction of an explicit affine paving. Moreover, we compute the ring structure of cohomology by constructing a CW-complex homotopy equivalent to the exotic Springer fiber. This homotopy equivalent space admits an action of the type C Weyl group inducing Kato's original exotic Springer representation on cohomology. Our results are described in terms of the diagrammatics of the one-boundary Temperley-Lieb algebra (also known as the blob algebra). This provides a first step in generalizing the geometric versions of Khovanov's arc algebra to the exotic setting. Full Article
date On the rationality of cycle integrals of meromorphic modular forms. (arXiv:1810.00612v3 [math.NT] UPDATED) By arxiv.org Published On :: We derive finite rational formulas for the traces of cycle integrals of certain meromorphic modular forms. Moreover, we prove the modularity of a completion of the generating function of such traces. The theoretical framework for these results is an extension of the Shintani theta lift to meromorphic modular forms of positive even weight. Full Article
date Twisted Sequences of Extensions. (arXiv:1808.07936v3 [math.RT] UPDATED) By arxiv.org Published On :: Gabber and Joseph introduced a ladder diagram between two natural sequences of extensions. Their diagram is used to produce a 'twisted' sequence that is applied to old and new results on extension groups in category $\mathcal{O}$. Full Article
date A Forward-Backward Splitting Method for Monotone Inclusions Without Cocoercivity. (arXiv:1808.04162v4 [math.OC] UPDATED) By arxiv.org Published On :: In this work, we propose a simple modification of the forward-backward splitting method for finding a zero in the sum of two monotone operators. Our method converges under the same assumptions as Tseng's forward-backward-forward method, namely, it does not require cocoercivity of the single-valued operator. Moreover, each iteration only requires one forward evaluation rather than two as is the case for Tseng's method. Variants of the method incorporating a linesearch, relaxation and inertia, or a structured three operator inclusion are also discussed. Full Article
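The abstract above does not spell out the iteration, so the following is only a sketch under stated assumptions. It assumes an update of the form $x_{k+1} = J_{\lambda A}\big(x_k - 2\lambda B(x_k) + \lambda B(x_{k-1})\big)$ with step size $\lambda < 1/(2L)$, which reuses the previous forward evaluation so that each iteration needs only one new call to $B$; this form should be checked against the paper. The operators are hypothetical: $B$ is a skew rotation (monotone and 1-Lipschitz, but not cocoercive) and $A = 0$, so the resolvent is the identity and the unique zero is the origin.

```python
def B(x):
    """Skew linear operator (x1, x2) -> (x2, -x1): monotone, 1-Lipschitz, not cocoercive."""
    return (x[1], -x[0])

def resolvent_A(x):
    """Resolvent of A. Assumption for this sketch: A = 0, so it is the identity."""
    return x

def reflected_forward_backward(x0, lam, iters):
    """Iterate x <- J_A(x - 2*lam*B(x) + lam*B(x_prev)), starting with x_prev = x0."""
    x = x0
    b_prev = B(x0)
    for _ in range(iters):
        b = B(x)  # the only new forward evaluation in this iteration
        y = tuple(xi - 2 * lam * bi + lam * bpi for xi, bi, bpi in zip(x, b, b_prev))
        x, b_prev = resolvent_A(y), b
    return x

# Step size 0.3 < 1/(2L) with L = 1.
x_final = reflected_forward_backward((1.0, 1.0), 0.3, 500)
```

On this toy problem the plain forward step $x - \lambda B(x)$ increases the norm at every iteration, which is why a skew operator is a convenient test case for methods that avoid the cocoercivity assumption.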
date On the Total Curvature and Betti Numbers of Complex Projective Manifolds. (arXiv:1807.11625v2 [math.DG] UPDATED) By arxiv.org Published On :: We prove an inequality between the sum of the Betti numbers of a complex projective manifold and its total curvature, and we characterize the complex projective manifolds whose total curvature is minimal. These results extend the classical theorems of Chern and Lashof to complex projective space. Full Article
date The 2d-directed spanning forest converges to the Brownian web. (arXiv:1805.09399v3 [math.PR] UPDATED) By arxiv.org Published On :: The two-dimensional directed spanning forest (DSF) introduced by Baccelli and Bordenave is a planar directed forest whose vertex set is given by a homogeneous Poisson point process $\mathcal{N}$ on $\mathbb{R}^2$. If the DSF has direction $-e_y$, the ancestor $h(u)$ of a vertex $u \in \mathcal{N}$ is the nearest Poisson point (in the $L_2$ distance) having strictly larger $y$-coordinate. This construction induces complex geometrical dependencies. In this paper we show that the collection of DSF paths, properly scaled, converges in distribution to the Brownian web (BW). This verifies a conjecture made by Baccelli and Bordenave in 2007. Full Article
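The ancestor rule in the item above can be sketched on a finite point set. The sketch follows the rule exactly as worded there (nearest point in Euclidean distance with strictly larger $y$-coordinate); the four points are made up for illustration and are not a sample from a Poisson process.

```python
import math

def ancestor(u, points):
    """Nearest point (Euclidean distance) with strictly larger y-coordinate, or None."""
    higher = [v for v in points if v[1] > u[1]]
    if not higher:
        return None  # in a finite window the topmost point has no ancestor
    return min(higher, key=lambda v: math.dist(u, v))

# Hypothetical finite configuration standing in for the point process.
points = [(0.0, 0.0), (1.0, 1.0), (0.1, 2.0), (5.0, 0.5)]
edges = {u: ancestor(u, points) for u in points}
```

Following `edges` repeatedly from any vertex traces one path of the forest inside the window; the scaling limit of such paths is what the paper studies.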
date Effective divisors on Hurwitz spaces. (arXiv:1804.01898v3 [math.AG] UPDATED) By arxiv.org Published On :: We prove the effectiveness of the canonical bundle of several Hurwitz spaces of degree k covers of the projective line from curves of genus 13<g<20. Full Article
date Conservative stochastic 2-dimensional Cahn-Hilliard equation. (arXiv:1802.04141v2 [math.PR] UPDATED) By arxiv.org Published On :: We consider the stochastic 2-dimensional Cahn-Hilliard equation which is driven by the derivative in space of a space-time white noise. We use two different approaches to study this equation. First we prove that there exists a unique solution $Y$ to the shifted equation (see (1.4) below), then $X:=Y+{Z}$ is the unique solution to the stochastic Cahn-Hilliard equation, where ${Z}$ is the corresponding O-U process. Moreover, we use the Dirichlet form approach in \cite{Albeverio:1991hk} to construct the probabilistically weak solution to the original equation (1.1) below. By clarifying the precise relation between the solutions obtained by the Dirichlet form approach and $X$, we can also get the restricted Markov uniqueness of the generator and the uniqueness of martingale solutions to the equation (1.1). Full Article
date Extremal values of the Sackin balance index for rooted binary trees. (arXiv:1801.10418v5 [q-bio.PE] UPDATED) By arxiv.org Published On :: Tree balance plays an important role in different research areas like theoretical computer science and mathematical phylogenetics. For example, it has long been known that under the Yule model, a pure birth process, imbalanced trees are more likely than balanced ones. Therefore, different methods to measure the balance of trees were introduced. The Sackin index is one of the most frequently used measures for this purpose. In many contexts, statements about the minimal and maximal values of this index have been discussed, but formal proofs have never been provided. Moreover, while the number of trees with maximal Sackin index as well as the number of trees with minimal Sackin index when the number of leaves is a power of 2 are relatively easy to understand, the number of trees with minimal Sackin index for all other numbers of leaves was completely unknown. In this manuscript, we fully characterize trees with minimal and maximal Sackin index and also provide formulas to explicitly calculate the number of such trees. Full Article
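As a small illustration of the index discussed in the item above, the sketch below assumes the standard definition of the Sackin index as the sum of the depths of all leaves (number of edges from the root). The nested-tuple tree encoding is a hypothetical choice made for brevity.

```python
def sackin(tree, depth=0):
    """Sum of leaf depths. A leaf is any non-tuple value; an inner node is a tuple of children."""
    if not isinstance(tree, tuple):
        return depth
    return sum(sackin(child, depth + 1) for child in tree)

# Two rooted binary trees on four leaves.
caterpillar = ((("a", "b"), "c"), "d")   # maximally imbalanced shape
balanced = (("a", "b"), ("c", "d"))      # fully balanced shape

s_cat = sackin(caterpillar)   # leaf depths 3, 3, 2, 1
s_bal = sackin(balanced)      # leaf depths 2, 2, 2, 2
```

For four leaves the caterpillar gives 9 and the balanced tree gives 8, so a larger value corresponds to a less balanced tree.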
date Expansion of Iterated Stratonovich Stochastic Integrals of Arbitrary Multiplicity Based on Generalized Iterated Fourier Series Converging Pointwise. (arXiv:1801.00784v9 [math.PR] UPDATED) By arxiv.org Published On :: The article is devoted to the expansion of iterated Stratonovich stochastic integrals of arbitrary multiplicity $k$ $(k\in\mathbb{N})$ based on the generalized iterated Fourier series. The case of Fourier-Legendre series as well as the case of trigonometric Fourier series are considered in detail. The obtained expansion provides a possibility to represent the iterated Stratonovich stochastic integral in the form of iterated series of products of standard Gaussian random variables. Convergence in the mean of degree $2n$ $(n\in \mathbb{N})$ of the expansion is proved. Some modifications of the mentioned expansion were derived for the case $k=2$. One of them is based on multiple trigonometric Fourier series converging almost everywhere in the square $[t, T]^2$. The results of the article can be applied to the numerical solution of Ito stochastic differential equations. Full Article
date Local Moduli of Semisimple Frobenius Coalescent Structures. (arXiv:1712.08575v3 [math.DG] UPDATED) By arxiv.org Published On :: We extend the analytic theory of Frobenius manifolds to semisimple points with coalescing eigenvalues of the operator of multiplication by the Euler vector field. We clarify which freedoms, ambiguities and mutual constraints are allowed in the definition of monodromy data, in view of their importance for conjectural relationships between Frobenius manifolds and derived categories. Detailed examples and applications are taken from singularity and quantum cohomology theories. We explicitly compute the monodromy data at points of the Maxwell Stratum of the A3-Frobenius manifold, as well as at the small quantum cohomology of the Grassmannian G(2,4). In the latter case, we analyse in detail the action of the braid group on the monodromy data. This proves that these data can be expressed in terms of characteristic classes of mutations of Kapranov's exceptional 5-block collection, as conjectured by one of the authors. Full Article
date High dimensional expanders and coset geometries. (arXiv:1710.05304v3 [math.CO] UPDATED) By arxiv.org Published On :: The study of high dimensional expanders is a vibrant emerging field. Nevertheless, the only known construction of bounded degree high dimensional expanders is based on Ramanujan complexes, whereas one dimensional bounded degree expanders are abundant. In this work, we construct new families of bounded degree high dimensional expanders obeying the local spectral expansion property. This property has a number of important consequences, including geometric overlapping, fast mixing of high dimensional random walks, agreement testing and agreement expansion. Our construction also yields new families of expander graphs which are close to the Ramanujan bound, i.e., their spectral gap is close to optimal. The construction is quite elementary and it is presented in a self-contained manner; this is in contrast to the highly involved previously known construction of the Ramanujan complexes. The construction is also very symmetric (such symmetry properties are not known for Ramanujan complexes); the symmetry of the construction could be used, for example, in order to obtain good symmetric LDPC codes that were previously based on Ramanujan graphs. The main tool that we use is the theory of coset geometries. Coset geometries arose as a tool for studying finite simple groups. Here, we show that coset geometries arise in a very natural manner for groups of elementary matrices over any finitely generated algebra over a commutative unital ring. In other words, we show that such groups act simply transitively on the top dimensional face of a pure, partite, clique complex. Full Article
date Simulation of Integro-Differential Equation and Application in Estimation of Ruin Probability with Mixed Fractional Brownian Motion. (arXiv:1709.03418v6 [math.PR] UPDATED) By arxiv.org Published On :: In this paper, we are concerned with the numerical solution of one type of integro-differential equation by a probability method based on the fundamental martingale of mixed Gaussian processes. As an application, we will try to simulate the estimation of ruin probability with an unknown parameter driven not by the classical Lévy process but by the mixed fractional Brownian motion. Full Article
date Local mollification of Riemannian metrics using Ricci flow, and Ricci limit spaces. (arXiv:1706.09490v2 [math.DG] UPDATED) By arxiv.org Published On :: We use Ricci flow to obtain a local bi-Hölder correspondence between Ricci limit spaces in three dimensions and smooth manifolds. This is more than a complete resolution of the three-dimensional case of the conjecture of Anderson-Cheeger-Colding-Tian, describing how Ricci limit spaces in three dimensions must be homeomorphic to manifolds, and we obtain this in the most general, locally non-collapsed case. The proofs build on results and ideas from recent papers of Hochard and the current authors. Full Article
date The classification of Rokhlin flows on C*-algebras. (arXiv:1706.09276v6 [math.OA] UPDATED) By arxiv.org Published On :: We study flows on C*-algebras with the Rokhlin property. We show that every Kirchberg algebra carries a unique Rokhlin flow up to cocycle conjugacy, which confirms a long-standing conjecture of Kishimoto. We moreover present a classification theory for Rokhlin flows on C*-algebras satisfying certain technical properties, which hold for many C*-algebras covered by the Elliott program. As a consequence, we obtain the following further classification theorems for Rokhlin flows. Firstly, we extend the statement of Kishimoto's conjecture to the non-simple case: Up to cocycle conjugacy, a Rokhlin flow on a separable, nuclear, strongly purely infinite C*-algebra is uniquely determined by its induced action on the prime ideal space. Secondly, we give a complete classification of Rokhlin flows on simple classifiable $KK$-contractible C*-algebras: Two Rokhlin flows on such a C*-algebra are cocycle conjugate if and only if their induced actions on the cone of lower-semicontinuous traces are affinely conjugate. Full Article
date Categorification via blocks of modular representations for sl(n). (arXiv:1612.06941v3 [math.RT] UPDATED) By arxiv.org Published On :: Bernstein, Frenkel, and Khovanov have constructed a categorification of tensor products of the standard representation of $\mathfrak{sl}_2$, where they use singular blocks of category $\mathcal{O}$ for $\mathfrak{sl}_n$ and translation functors. Here we construct a positive characteristic analogue using blocks of representations of $\mathfrak{sl}_n$ over a field $\textbf{k}$ of characteristic $p$ with zero Frobenius character, and singular Harish-Chandra character. We show that the aforementioned categorification admits a Koszul graded lift, which is equivalent to a geometric categorification constructed by Cautis, Kamnitzer, and Licata using coherent sheaves on cotangent bundles to Grassmannians. In particular, the latter admits an abelian refinement. With respect to this abelian refinement, the stratified Mukai flop induces a perverse equivalence on the derived categories for complementary Grassmannians. This is part of a larger project to give a combinatorial approach to Lusztig's conjectures for representations of Lie algebras in positive characteristic. Full Article
date A Class of Functional Inequalities and their Applications to Fourth-Order Nonlinear Parabolic Equations. (arXiv:1612.03508v3 [math.AP] UPDATED) By arxiv.org Published On :: We study a class of fourth order nonlinear parabolic equations which include the thin-film equation and the quantum drift-diffusion model as special cases. We investigate these equations by first developing functional inequalities of the type $\int_\Omega u^{2\gamma-\alpha-\beta}\Delta u^\alpha \Delta u^\beta \, dx \geq c\int_\Omega |\Delta u^\gamma|^2 \, dx$, which seem to be of interest in their own right. Full Article
date On the zeros of the Riemann zeta function, twelve years later. (arXiv:0806.2361v7 [math.GM] UPDATED) By arxiv.org Published On :: The paper proves the Riemann Hypothesis. Full Article
date GraCIAS: Grassmannian of Corrupted Images for Adversarial Security. (arXiv:2005.02936v2 [cs.CV] UPDATED) By arxiv.org Published On :: Input transformation based defense strategies fall short in defending against strong adversarial attacks. Some successful defenses adopt approaches that either increase the randomness within the applied transformations, or make the defense computationally intensive, making it substantially more challenging for the attacker. However, it limits the applicability of such defenses as a pre-processing step, similar to computationally heavy approaches that use retraining and network modifications to achieve robustness to perturbations. In this work, we propose a defense strategy that applies random image corruptions to the input image alone, constructs a self-correlation based subspace followed by a projection operation to suppress the adversarial perturbation. Due to its simplicity, the proposed defense is computationally efficient as compared to the state-of-the-art, and yet can withstand huge perturbations. Further, we develop proximity relationships between the projection operator of a clean image and of its adversarially perturbed version, via bounds relating geodesic distance on the Grassmannian to matrix Frobenius norms. We empirically show that our strategy is complementary to other weak defenses like JPEG compression and can be seamlessly integrated with them to create a stronger defense. We present extensive experiments on the ImageNet dataset across four different models namely InceptionV3, ResNet50, VGG16 and MobileNet models with perturbation magnitude set to $\epsilon = 16$. Unlike state-of-the-art approaches, even without any retraining, the proposed strategy achieves an absolute improvement of ~ 4.5% in defense accuracy on ImageNet. Full Article
date A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus. (arXiv:2005.02911v2 [quant-ph] UPDATED) By arxiv.org Published On :: Quantum computing has evolved quickly in recent years and is showing significant benefits in a variety of fields. Malware analysis is one of those fields that could also take advantage of quantum computing. The combination of software used to locate the most frequent hashes and $n$-grams between benign and malicious software (KiloGram) and a quantum search algorithm could be beneficial, by loading the table of hashes and $n$-grams into a quantum computer, and thereby speeding up the process of mapping $n$-grams to their hashes. The first phase will be to use KiloGram to find the top-$k$ hashes and $n$-grams for a large malware corpus. From here, the resulting hash table is then loaded into a quantum machine. A quantum search algorithm is then used to search among every permutation of the entangled key and value pairs to find the desired hash value. This prevents one from having to re-compute hashes for a set of $n$-grams, which can take on average $O(MN)$ time, whereas the quantum algorithm could take $O(\sqrt{N})$ in the number of table lookups to find the desired hash values. Full Article
date Multi-Resolution POMDP Planning for Multi-Object Search in 3D. (arXiv:2005.02878v2 [cs.RO] UPDATED) By arxiv.org Published On :: Robots operating in household environments must find objects on shelves, under tables, and in cupboards. Previous work often formulates the object search problem as a POMDP (Partially Observable Markov Decision Process), yet constrains the search space to 2D. We propose a new approach that enables the robot to efficiently search for objects in 3D, taking occlusions into account. We model the problem as an object-oriented POMDP, where the robot receives a volumetric observation from a viewing frustum and must produce a policy to efficiently search for objects. To address the challenge of large state and observation spaces, we first propose a per-voxel observation model which drastically reduces the observation size necessary for planning. Then, we present a novel octree-based belief representation which captures beliefs at different resolutions and supports efficient exact belief update. Finally, we design an online multi-resolution planning algorithm that leverages the resolution layers in the octree structure as levels of abstractions to the original POMDP problem. Our evaluation in a simulated 3D domain shows that, as the problem scales, our approach significantly outperforms baselines without resolution hierarchy by 25%-35% in cumulative reward. We demonstrate the practicality of our approach on a torso-actuated mobile robot searching for objects in areas of a cluttered lab environment where objects appear on surfaces at different heights. Full Article
date Modeling nanoconfinement effects using active learning. (arXiv:2005.02587v2 [physics.app-ph] UPDATED) By arxiv.org Published On :: Predicting the spatial configuration of gas molecules in nanopores of shale formations is crucial for fluid flow forecasting and hydrocarbon reserves estimation. The key challenge in these tight formations is that the majority of the pore sizes are less than 50 nm. At this scale, the fluid properties are affected by nanoconfinement effects due to the increased fluid-solid interactions. For instance, gas adsorption to the pore walls could account for up to 85% of the total hydrocarbon volume in a tight reservoir. Although there are analytical solutions that describe this phenomenon for simple geometries, they are not suitable for describing realistic pores, where surface roughness and geometric anisotropy play important roles. To describe these, molecular dynamics (MD) simulations are used since they consider fluid-solid and fluid-fluid interactions at the molecular level. However, MD simulations are computationally expensive, and are not able to simulate scales larger than a few connected nanopores. We present a method for building and training physics-based deep learning surrogate models to carry out fast and accurate predictions of molecular configurations of gas inside nanopores. Since training deep learning models requires extensive databases that are computationally expensive to create, we employ active learning (AL). AL reduces the overhead of creating comprehensive sets of high-fidelity data by determining where the model uncertainty is greatest, and running simulations on the fly to minimize it. The proposed workflow enables nanoconfinement effects to be rigorously considered at the mesoscale where complex connected sets of nanopores control key applications such as hydrocarbon recovery and CO2 sequestration. Full Article
date Multi-task pre-training of deep neural networks for digital pathology. (arXiv:2005.02561v2 [eess.IV] UPDATED) By arxiv.org Published On :: In this work, we investigate multi-task learning as a way of pre-training models for classification tasks in digital pathology. It is motivated by the fact that many small and medium-size datasets have been released by the community over the years whereas there is no large scale dataset similar to ImageNet in the domain. We first assemble and transform many digital pathology datasets into a pool of 22 classification tasks and almost 900k images. Then, we propose a simple architecture and training scheme for creating a transferable model and a robust evaluation and selection protocol in order to evaluate our method. Depending on the target task, we show that our models used as feature extractors either improve significantly over ImageNet pre-trained models or provide comparable performance. Fine-tuning improves performance over feature extraction and is able to recover the lack of specificity of ImageNet features, as both pre-training sources yield comparable performance. Full Article
date The Cascade Transformer: an Application for Efficient Answer Sentence Selection. (arXiv:2005.02534v2 [cs.CL] UPDATED) By arxiv.org Published On :: Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference. In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers. Each ranker is used to prune a subset of candidates in a batch, thus dramatically increasing throughput at inference time. Partial encodings from the transformer model are shared among rerankers, providing further speed-up. When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy, as measured on two English Question Answering datasets. Full Article
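The pruning idea in the item above can be sketched generically: each stage scores the surviving candidates and drops the lowest-scoring part of the batch before the next, more expensive stage runs. The function name, the `keep_fraction` parameter and the toy scoring functions are all hypothetical, and the sketch does not model the sharing of partial transformer encodings between stages.

```python
def cascade(candidates, rankers, keep_fraction):
    """Apply rankers in order, keeping only the top fraction of candidates after each one."""
    survivors = list(candidates)
    for score in rankers:
        survivors.sort(key=score, reverse=True)          # best candidates first
        keep = max(1, int(len(survivors) * keep_fraction))
        survivors = survivors[:keep]                      # prune before the next stage
    return survivors

# Toy stand-ins for increasingly accurate rankers.
rankers = [lambda c: c, lambda c: -abs(c - 12)]
top = cascade(range(16), rankers, keep_fraction=0.5)
```

With two stages and `keep_fraction=0.5`, the second ranker scores only half of the original batch, which is where the throughput gain of a cascade comes from.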
date On the list recoverability of randomly punctured codes. (arXiv:2005.02478v2 [math.CO] UPDATED) By arxiv.org Published On :: We show that a random puncturing of a code with good distance is list recoverable beyond the Johnson bound. In particular, this implies that there are Reed-Solomon codes that are list recoverable beyond the Johnson bound. It was previously known that there are Reed-Solomon codes that do not have this property. As an immediate corollary to our main theorem, we obtain better degree bounds on unbalanced expanders that come from Reed-Solomon codes. Full Article
date Temporal Event Segmentation using Attention-based Perceptual Prediction Model for Continual Learning. (arXiv:2005.02463v2 [cs.CV] UPDATED) By arxiv.org Published On :: Temporal event segmentation of a long video into coherent events requires a high level understanding of activities' temporal features. The event segmentation problem has been tackled by researchers in an offline training scheme, either by providing full, or weak, supervision through manually annotated labels or by self-supervised epoch based training. In this work, we present a continual learning perceptual prediction framework (influenced by cognitive psychology) capable of temporal event segmentation through understanding of the underlying representation of objects within individual frames. Our framework also outputs attention maps which effectively localize and track event-causing objects in each frame. The model is tested on a wildlife monitoring dataset in a continual training manner resulting in $80\%$ recall rate at $20\%$ false positive rate for frame level segmentation. Activity level testing has yielded $80\%$ activity recall rate for one false activity detection every 50 minutes. Full Article
date Differential Machine Learning. (arXiv:2005.02347v2 [q-fin.CP] UPDATED) By arxiv.org Published On :: Differential machine learning (ML) extends supervised learning, with models trained on examples of not only inputs and labels, but also differentials of labels to inputs. Differential ML is applicable in all situations where high quality first order derivatives with respect to training inputs are available. In the context of financial derivatives risk management, pathwise differentials are efficiently computed with automatic adjoint differentiation (AAD). Differential ML, combined with AAD, provides extremely effective pricing and risk approximations. We can produce fast pricing analytics in models too complex for closed form solutions, extract the risk factors of complex transactions and trading books, and effectively compute risk management metrics like reports across a large number of scenarios, backtesting and simulation of hedge strategies, or capital regulations. The article focuses on differential deep learning (DL), arguably the strongest application. Standard DL trains neural networks (NN) on punctual examples, whereas differential DL teaches them the shape of the target function, resulting in vastly improved performance, illustrated with a number of numerical examples, both idealized and real world. In the online appendices, we apply differential learning to other ML models, like classic regression or principal component analysis (PCA), with equally remarkable results. This paper is meant to be read in conjunction with its companion GitHub repo https://github.com/differential-machine-learning, where we posted a TensorFlow implementation, tested on Google Colab, along with examples from the article and additional ones. We also posted appendices covering many practical implementation details not covered in the paper, mathematical proofs, application to ML models besides neural networks and extensions necessary for a reliable implementation in production. Full Article
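The idea of training on labels and on their differentials can be sketched without a neural network. The toy below is not the authors' TensorFlow implementation: it swaps in a two-parameter model $f(x) = a x + b x^2$ and minimises, in closed form, the sum of squared value errors plus `lam` times the sum of squared derivative errors. The model, the weight `lam` and the data are all assumptions made for illustration.

```python
def fit_differential(xs, ys, dys, lam=1.0):
    """Least squares for f(x) = a*x + b*x**2 on values ys and derivatives dys.

    Minimises sum (f(x)-y)**2 + lam * sum (f'(x)-dy)**2 via the 2x2 normal equations.
    """
    g11 = g12 = g22 = h1 = h2 = 0.0
    for x, y, dy in zip(xs, ys, dys):
        # One row for the value (features x, x^2) and one for the derivative (features 1, 2x).
        for r1, r2, target, w in ((x, x * x, y, 1.0), (1.0, 2 * x, dy, lam)):
            g11 += w * r1 * r1
            g12 += w * r1 * r2
            g22 += w * r2 * r2
            h1 += w * r1 * target
            h2 += w * r2 * target
    det = g11 * g22 - g12 * g12
    a = (h1 * g22 - g12 * h2) / det   # Cramer's rule for the 2x2 system
    b = (g11 * h2 - g12 * h1) / det
    return a, b

# Noise-free data from the hypothetical target y = 3x - 2x^2, with dy/dx = 3 - 4x.
xs = [0.0, 0.5, 1.0]
a, b = fit_differential(xs, [3 * x - 2 * x * x for x in xs], [3 - 4 * x for x in xs])
```

Each training point contributes two equations instead of one, which is the sense in which derivative labels add information about the shape of the target function.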
date Automata Tutor v3. (arXiv:2005.01419v2 [cs.FL] UPDATED) By arxiv.org Published On :: Computer science class enrollments have rapidly risen in the past decade. With current class sizes, standard approaches to grading and providing personalized feedback are no longer possible and new techniques become both feasible and necessary. In this paper, we present the third version of Automata Tutor, a tool for helping teachers and students in large courses on automata and formal languages. The second version of Automata Tutor supported automatic grading and feedback for finite-automata constructions and has already been used by thousands of users in dozens of countries. This new version of Automata Tutor supports automated grading and feedback generation for a greatly extended variety of new problems, including problems that ask students to create regular expressions, context-free grammars, pushdown automata and Turing machines corresponding to a given description, and problems about converting between equivalent models - e.g., from regular expressions to nondeterministic finite automata. Moreover, for several problems, this new version also enables teachers and students to automatically generate new problem instances. We also present the results of a survey run on a class of 950 students, which shows very positive results about the usability and usefulness of the tool. Full Article
date The Sensitivity of Language Models and Humans to Winograd Schema Perturbations. (arXiv:2005.01348v2 [cs.CL] UPDATED) By arxiv.org Published On :: Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues. Full Article
date Prediction of Event Related Potential Speller Performance Using Resting-State EEG. (arXiv:2005.01325v3 [cs.HC] UPDATED) By arxiv.org Published On :: Event-related potential (ERP) spellers can be used for device control and communication by locked-in or severely injured patients. However, problems such as inter-subject performance instability and ERP-illiteracy remain unresolved. It is therefore necessary to predict classification performance before running an ERP speller in order to use it efficiently. In this study, we investigated correlations between ERP speller performance and resting-state EEG recorded before the speller session. Specifically, we used spectral power and functional connectivity across four brain regions and five frequency bands. As a result, the delta power in the frontal region and functional connectivity in the delta, alpha, and gamma bands are significantly correlated with ERP speller performance. We also predicted ERP speller performance using resting-state EEG features. These findings may contribute to investigating ERP-illiteracy and to considering appropriate alternatives for each user. Full Article
date Quantum arithmetic operations based on quantum Fourier transform on signed integers. (arXiv:2005.00443v2 [cs.IT] UPDATED) By arxiv.org Published On :: The quantum Fourier transform (QFT) brings efficiency in many respects, especially resource usage, to most operations on quantum computers. In this study, the existing QFT-based and non-QFT-based quantum arithmetic operations are examined. The capabilities of QFT-based addition and multiplication are improved with some modifications. The proposed operations are compared with the nearest existing quantum arithmetic operations. Furthermore, novel QFT-based subtraction and division operations are presented. The proposed arithmetic operations can perform non-modular operations on all signed numbers without any limitation while using fewer resources. In addition, novel quantum circuits for two's complement, absolute value, and comparison operations are presented, built on the proposed QFT-based addition and subtraction operations. Full Article
date On-board Deep-learning-based Unmanned Aerial Vehicle Fault Cause Detection and Identification. (arXiv:2005.00336v2 [eess.SP] UPDATED) By arxiv.org Published On :: With the increasing use of Unmanned Aerial Vehicles (UAVs)/drones, it is important to detect and identify causes of failure in real time for proper recovery from a potential crash-like scenario or for post-incident forensic analysis. The cause of a crash could be a fault in the sensor/actuator system, physical damage or attack, or a cyber attack on the drone's software. In this paper, we propose novel architectures based on deep Convolutional and Long Short-Term Memory Neural Networks (CNNs and LSTMs) to detect (via Autoencoder) and classify drone mis-operations based on sensor data. The proposed architectures are able to learn high-level features automatically from the raw sensor data and learn the spatial and temporal dynamics in the sensor data. We validate the proposed deep-learning architectures via simulations and experiments on a real drone. Empirical results show that our solution detects drone mis-operations with over 90% accuracy and classifies various types of mis-operation with about 99% accuracy on simulation data and up to 88% accuracy on experimental data. Full Article
date Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment. (arXiv:2005.00165v3 [cs.CL] UPDATED) By arxiv.org Published On :: A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e. is a grammatical sentence more probable than an ungrammatical sentence). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e. typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all. Full Article
date Generative Adversarial Networks in Digital Pathology: A Survey on Trends and Future Potential. (arXiv:2004.14936v2 [eess.IV] UPDATED) By arxiv.org Published On :: Image analysis in the field of digital pathology has recently gained increased popularity. The use of high-quality whole slide scanners enables the fast acquisition of large amounts of image data, showing extensive context and microscopic detail at the same time. Simultaneously, novel machine learning algorithms have boosted the performance of image analysis approaches. In this paper, we focus on a particularly powerful class of architectures, called Generative Adversarial Networks (GANs), applied to histological image data. Besides improving performance, GANs also enable application scenarios in this field, which were previously intractable. However, GANs could exhibit a potential for introducing bias. Hereby, we summarize the recent state-of-the-art developments in a generalizing notation, present the main applications of GANs and give an outlook of some chosen promising approaches and their possible future applications. In addition, we identify currently unavailable methods with potential for future applications. Full Article
date Towards Embodied Scene Description. (arXiv:2004.14638v2 [cs.RO] UPDATED) By arxiv.org Published On :: Embodiment is an important characteristic for all intelligent agents (creatures and robots), while existing scene description tasks mainly focus on analyzing images passively and the semantic understanding of the scenario is separated from the interaction between the agent and the environment. In this work, we propose the Embodied Scene Description, which exploits the embodiment ability of the agent to find an optimal viewpoint in its environment for scene description tasks. A learning framework with the paradigms of imitation learning and reinforcement learning is established to teach the intelligent agent to generate corresponding sensorimotor activities. The proposed framework is tested on both the AI2Thor dataset and a real world robotic platform demonstrating the effectiveness and extendability of the developed method. Full Article
date Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images. (arXiv:2004.14487v2 [cs.CV] UPDATED) By arxiv.org Published On :: The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first of its kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property. Full Article
date When Hearing Defers to Touch. (arXiv:2004.13462v2 [q-bio.NC] UPDATED) By arxiv.org Published On :: Hearing is often believed to be more sensitive than touch. This assertion is based on a comparison of sensitivities to weak stimuli. The respective stimuli, however, are not easily comparable since hearing is gauged using acoustic pressure and touch using skin displacement. We show that under reasonable assumptions the auditory and tactile detection thresholds can be reconciled on a level playing field. The results indicate that the capacity of touch and hearing to detect weak stimuli varies according to the size of a sensed object as well as to the frequency of its oscillations. In particular, touch is found to be more effective than hearing at detecting small and slow objects. Full Article
date Self-Attention with Cross-Lingual Position Representation. (arXiv:2004.13310v2 [cs.CL] UPDATED) By arxiv.org Published On :: Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g. machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with \emph{cross-lingual position representations} to model the bilingually aware latent structure for the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English$\Rightarrow$German, WAT'17 Japanese$\Rightarrow$English, and WMT'17 Chinese$\Leftrightarrow$English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information. Full Article
date Optimal Adjacent Vertex-Distinguishing Edge-Colorings of Circulant Graphs. (arXiv:2004.12822v2 [cs.DM] UPDATED) By arxiv.org Published On :: A k-proper edge-coloring of a graph G is called adjacent vertex-distinguishing if any two adjacent vertices are distinguished by the set of colors appearing in the edges incident to each vertex. The smallest value k for which G admits such a coloring is denoted by $\chi'_a(G)$. We prove that $\chi'_a(G) = 2R + 1$ for most circulant graphs $C_n([[1, R]])$. Full Article
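The definition in the entry above can be checked mechanically on small graphs. The sketch below is only an illustration: the helper names `is_avd_edge_coloring` and `cycle_coloring` are hypothetical, and the cycle is used as the simplest circulant graph, assuming [[1, R]] denotes the integer interval from 1 to R (so R = 1 gives a cycle).

```python
from collections import defaultdict

def is_avd_edge_coloring(colored_edges):
    """Check that an edge coloring is proper and adjacent vertex-distinguishing.

    colored_edges: dict mapping an edge (u, v) to its color.
    """
    incident = defaultdict(list)
    for (u, v), color in colored_edges.items():
        incident[u].append(color)
        incident[v].append(color)
    # Proper: no color appears twice on the edges at a vertex.
    if any(len(cs) != len(set(cs)) for cs in incident.values()):
        return False
    # Distinguishing: the endpoints of every edge see different color sets.
    return all(set(incident[u]) != set(incident[v]) for (u, v) in colored_edges)

def cycle_coloring(n, k):
    """Color edge (i, i+1 mod n) of the n-cycle with color i mod k."""
    return {(i, (i + 1) % n): i % k for i in range(n)}

print(is_avd_edge_coloring(cycle_coloring(6, 3)))  # 3 colors on the 6-cycle
print(is_avd_edge_coloring(cycle_coloring(4, 2)))  # proper, but every vertex sees {0, 1}
```

The 6-cycle with three repeating colors passes the check, which matches 2R + 1 = 3 for R = 1 under the reading above; the 4-cycle with two alternating colors is proper but fails, because adjacent vertices see the same color set.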
date Jealousy-freeness and other common properties in Fair Division of Mixed Manna. (arXiv:2004.11469v2 [cs.GT] UPDATED) By arxiv.org Published On :: We consider a fair division setting where indivisible items are allocated to agents. Each agent in the setting has strictly negative, zero or strictly positive utility for each item. We, thus, make a distinction between items that are good for some agents and bad for other agents (i.e. mixed), good for everyone (i.e. goods) or bad for everyone (i.e. bads). For this model, we study axiomatic concepts of allocations such as jealousy-freeness up to one item, envy-freeness up to one item and Pareto-optimality. We obtain many new possibility and impossibility results in regard to combinations of these properties. We also investigate new computational tasks related to such combinations. Thus, we advance the state-of-the-art in fair division of mixed manna. Full Article
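The entry above names envy-freeness up to one item without defining it. The sketch below assumes additive utilities and the formulation commonly used for mixed items: agent i is satisfied with respect to agent j if i does not envy j, or if removing one item from either bundle removes the envy. That formulation, and the names `utility` and `is_ef1`, are assumptions for illustration and may differ from the paper's exact definitions; jealousy-freeness is not modeled here.

```python
def utility(u, bundle):
    """Additive utility of a bundle under one agent's item utilities (assumption)."""
    return sum(u[item] for item in bundle)

def is_ef1(utilities, allocation):
    """Check envy-freeness up to one item for goods, bads, and mixed items.

    utilities: dict agent -> dict item -> utility (may be negative, zero, or positive).
    allocation: dict agent -> list of items assigned to that agent.
    """
    for i in allocation:
        for j in allocation:
            if i == j:
                continue
            u = utilities[i]
            own, other = set(allocation[i]), set(allocation[j])
            if utility(u, own) >= utility(u, other):
                continue  # i does not envy j
            # Envy must vanish after removing one item from either bundle.
            if not any(
                utility(u, own - {o}) >= utility(u, other - {o})
                for o in own | other
            ):
                return False
    return True
```

For example, with two agents "a" and "b" and items "x", "y", "z" valued 3, -2, 1 by "a" and 1, -1, 1 by "b", giving "a" the items "x" and "y" and "b" the item "z" passes the check, while giving "a" only "y" does not.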
date Warwick Image Forensics Dataset for Device Fingerprinting In Multimedia Forensics. (arXiv:2004.10469v2 [cs.CV] UPDATED) By arxiv.org Published On :: Device fingerprints like sensor pattern noise (SPN) are widely used for provenance analysis and image authentication. Over the past few years, the rapid advancement in digital photography has greatly reshaped the image capture pipeline on consumer-level mobile devices. The flexibility of camera parameter settings and the emergence of multi-frame photography algorithms, especially high dynamic range (HDR) imaging, bring new challenges to device fingerprinting. Further study of these topics requires a new, purpose-built image dataset. In this paper, we present the Warwick Image Forensics Dataset, an image dataset of more than 58,600 images captured using 14 digital cameras with various exposure settings. Special attention to the exposure settings allows the images to be adopted by different multi-frame computational photography algorithms and for subsequent device fingerprinting. The dataset is released as open source, free to use by the digital forensic community. Full Article
date On the regularity of De Bruijn multigrids. (arXiv:2004.10128v2 [cs.DM] UPDATED) By arxiv.org Published On :: In this paper we prove that any odd multigrid with non-zero rational offsets is regular, which means that its dual is a rhombic tiling. To prove this result we use a result on trigonometric diophantine equations. Full Article
date SPECTER: Document-level Representation Learning using Citation-informed Transformers. (arXiv:2004.07180v3 [cs.CL] UPDATED) By arxiv.org Published On :: Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter-document relatedness, which limits their document-level representation power. For applications on scientific documents, such as classification and recommendation, the embeddings power strong performance on end tasks. We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning. Additionally, to encourage further research on document-level models, we introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction, to document classification and recommendation. We show that SPECTER outperforms a variety of competitive baselines on the benchmark. Full Article
date The growth rate over trees of any family of set defined by a monadic second order formula is semi-computable. (arXiv:2004.06508v3 [cs.DM] UPDATED) By arxiv.org Published On :: Monadic second order logic can be used to express many classical notions of sets of vertices of a graph, for instance dominating sets, induced matchings, perfect codes, independent sets or irredundant sets. Bounds on the number of sets of any such family of sets are interesting from a combinatorial point of view and have algorithmic applications. Many such bounds on different families of sets over different classes of graphs are already provided in the literature. In particular, Rote recently showed that the number of minimal dominating sets in trees of order $n$ is at most $95^{\frac{n}{13}}$ and that this bound is asymptotically sharp up to a multiplicative constant. We build on his work to show that what he did for minimal dominating sets can be done for any family of sets definable by a monadic second order formula. We first show that, for any monadic second order formula over graphs that characterizes a given kind of subset of its vertices, the maximal number of such sets in a tree can be expressed as the \textit{growth rate of a bilinear system}. This mostly relies on well known links between monadic second order logic over trees and tree automata and basic tree automata manipulations. Then we show that this "growth rate" of a bilinear system can be approximated from above. We then use our implementation of this result to provide bounds on the number of independent dominating sets, total perfect dominating sets, induced matchings, maximal induced matchings, minimal perfect dominating sets, perfect codes and maximal irredundant sets on trees. We also solve a question from D. Y. Kang et al. regarding $r$-matchings and improve a bound from Górska and Skupień on the number of maximal matchings on trees. Note that this approach is easily generalizable to graphs of bounded tree width or clique width (or any similar class of graphs where tree automata are meaningful). Full Article
date Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus. (arXiv:2004.06295v2 [cs.CL] UPDATED) By arxiv.org Published On :: Much research effort is devoted to semantic role labeling (SRL), which is crucial for natural language understanding. Supervised approaches have achieved impressive performance when large-scale corpora are available, as for resource-rich languages such as English. For low-resource languages with no annotated SRL dataset, however, it is still challenging to obtain competitive performance. Cross-lingual SRL is one promising way to address the problem, and it has achieved great advances with the help of model transfer and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on Universal Proposition Bank show that the translation-based method is highly effective, and the automatic pseudo datasets can improve target-language SRL performance significantly. Full Article
date Transfer Learning for EEG-Based Brain-Computer Interfaces: A Review of Progress Made Since 2016. (arXiv:2004.06286v3 [cs.HC] UPDATED) By arxiv.org Published On :: A brain-computer interface (BCI) enables a user to communicate with a computer directly using brain signals. Electroencephalograms (EEGs) used in BCIs are weak, easily contaminated by interference and noise, non-stationary for the same subject, and varying across different subjects and sessions. Therefore, it is difficult to build a generic pattern recognition model in an EEG-based BCI system that is optimal for different subjects, during different sessions, for different devices and tasks. Usually, a calibration session is needed to collect some training data for a new subject, which is time consuming and user unfriendly. Transfer learning (TL), which utilizes data or knowledge from similar or relevant subjects/sessions/devices/tasks to facilitate learning for a new subject/session/device/task, is frequently used to reduce the amount of calibration effort. This paper reviews journal publications on TL approaches in EEG-based BCIs in the last few years, i.e., since 2016. Six paradigms and applications -- motor imagery, event-related potentials, steady-state visual evoked potentials, affective BCIs, regression problems, and adversarial attacks -- are considered. For each paradigm/application, we group the TL approaches into cross-subject/session, cross-device, and cross-task settings and review them separately. Observations and conclusions are made at the end of the paper, which may point to future research directions. Full Article