
Optimal construction of Koopman eigenfunctions for prediction and control. (arXiv:1810.08733v3 [math.OC] UPDATED)

This work presents a novel data-driven framework for constructing eigenfunctions of the Koopman operator geared toward prediction and control. The method leverages the richness of the spectrum of the Koopman operator away from attractors to construct a rich set of eigenfunctions such that the state (or any other observable quantity of interest) is in the span of these eigenfunctions and hence predictable in a linear fashion. The eigenfunction construction is optimization-based with no dictionary selection required. Once a predictor for the uncontrolled part of the system is obtained in this way, the incorporation of control is done through a multi-step prediction error minimization, carried out by a simple linear least-squares regression. The predictor so obtained is in the form of a linear controlled dynamical system and can be readily applied within the Koopman model predictive control framework of [12] to control nonlinear dynamical systems using linear model predictive control tools. The method is entirely data-driven and based purely on convex optimization, with no reliance on neural networks or other non-convex machine learning tools. The novel eigenfunction construction method is also analyzed theoretically, proving rigorously that the family of eigenfunctions obtained is rich enough to span the space of all continuous functions. In addition, the method is extended to construct generalized eigenfunctions that also give rise to Koopman invariant subspaces and hence can be used for linear prediction. Detailed numerical examples with code available online demonstrate the approach, both for prediction and feedback control.





On $p$-groups with automorphism groups related to the exceptional Chevalley groups. (arXiv:1810.08365v3 [math.GR] UPDATED)

Let $\hat G$ be the finite simply connected version of an exceptional Chevalley group, and let $V$ be a nontrivial irreducible module, of minimal dimension, for $\hat G$ over its field of definition. We explore the overgroup structure of $\hat G$ in $\mathrm{GL}(V)$, and the submodule structure of the exterior square (and sometimes the third Lie power) of $V$. When $\hat G$ is defined over a field of odd prime order $p$, this allows us to construct the smallest (with respect to certain properties) $p$-groups $P$ such that the group induced by $\mathrm{Aut}(P)$ on $P/\Phi(P)$ is either $\hat G$ or its normaliser in $\mathrm{GL}(V)$.





Exotic Springer fibers for orbits corresponding to one-row bipartitions. (arXiv:1810.03731v2 [math.RT] UPDATED)

We study the geometry and topology of exotic Springer fibers for orbits corresponding to one-row bipartitions from an explicit, combinatorial point of view. This includes a detailed analysis of the structure of the irreducible components and their intersections as well as the construction of an explicit affine paving. Moreover, we compute the ring structure of cohomology by constructing a CW-complex homotopy equivalent to the exotic Springer fiber. This homotopy equivalent space admits an action of the type C Weyl group inducing Kato's original exotic Springer representation on cohomology. Our results are described in terms of the diagrammatics of the one-boundary Temperley-Lieb algebra (also known as the blob algebra). This provides a first step in generalizing the geometric versions of Khovanov's arc algebra to the exotic setting.





On the rationality of cycle integrals of meromorphic modular forms. (arXiv:1810.00612v3 [math.NT] UPDATED)

We derive finite rational formulas for the traces of cycle integrals of certain meromorphic modular forms. Moreover, we prove the modularity of a completion of the generating function of such traces. The theoretical framework for these results is an extension of the Shintani theta lift to meromorphic modular forms of positive even weight.





Twisted Sequences of Extensions. (arXiv:1808.07936v3 [math.RT] UPDATED)

Gabber and Joseph introduced a ladder diagram between two natural sequences of extensions. Their diagram is used to produce a 'twisted' sequence that is applied to old and new results on extension groups in category $\mathcal{O}$.





A Forward-Backward Splitting Method for Monotone Inclusions Without Cocoercivity. (arXiv:1808.04162v4 [math.OC] UPDATED)

In this work, we propose a simple modification of the forward-backward splitting method for finding a zero in the sum of two monotone operators. Our method converges under the same assumptions as Tseng's forward-backward-forward method, namely, it does not require cocoercivity of the single-valued operator. Moreover, each iteration only requires one forward evaluation rather than two as is the case for Tseng's method. Variants of the method incorporating a linesearch, relaxation and inertia, or a structured three operator inclusion are also discussed.
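
The abstract does not spell out the iteration, so the following is only a minimal sketch under stated assumptions: it takes the update to have the form $x_{k+1} = J_{\lambda A}(x_k - 2\lambda B(x_k) + \lambda B(x_{k-1}))$ with step size $\lambda < 1/(2L)$, and it uses a hypothetical toy problem ($A = 0$, $B$ a rotation by 90 degrees, which is monotone and 1-Lipschitz but not cocoercive). The names `B`, `resolvent_A`, `forward_backward` and `forward_reflected_backward` are illustrative and not taken from the paper.

```python
import math

def B(z):
    """Toy single-valued operator: monotone and 1-Lipschitz, but not cocoercive."""
    x, y = z
    return (y, -x)

def resolvent_A(z, lam):
    """Resolvent of A = 0 is the identity (assumption for this toy problem)."""
    return z

def forward_backward(z0, lam, iters):
    """Classical forward-backward step z - lam * B(z), shown for comparison."""
    z = z0
    for _ in range(iters):
        b = B(z)
        z = resolvent_A((z[0] - lam * b[0], z[1] - lam * b[1]), lam)
    return z

def forward_reflected_backward(z0, lam, iters):
    """Assumed modified step: z - 2*lam*B(z) + lam*B(z_prev), then the resolvent."""
    z = z0
    b_prev = B(z0)  # B at the previous iterate is stored, not recomputed
    for _ in range(iters):
        b = B(z)  # the only new forward evaluation in this iteration
        step = (z[0] - 2 * lam * b[0] + lam * b_prev[0],
                z[1] - 2 * lam * b[1] + lam * b_prev[1])
        z, b_prev = resolvent_A(step, lam), b
    return z

z0 = (1.0, 1.0)
fb = forward_backward(z0, 0.3, 500)
frb = forward_reflected_backward(z0, 0.3, 500)
print("forward-backward distance to the zero:", math.hypot(*fb))
print("modified method distance to the zero:", math.hypot(*frb))
```

On this toy problem the only zero is the origin. The plain forward-backward iterates move away from it, while the modified iterates approach it, which is consistent with the claim that cocoercivity is not needed.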





On the Total Curvature and Betti Numbers of Complex Projective Manifolds. (arXiv:1807.11625v2 [math.DG] UPDATED)

We prove an inequality between the sum of the Betti numbers of a complex projective manifold and its total curvature, and we characterize the complex projective manifolds whose total curvature is minimal. These results extend the classical theorems of Chern and Lashof to complex projective space.





The 2d-directed spanning forest converges to the Brownian web. (arXiv:1805.09399v3 [math.PR] UPDATED)

The two-dimensional directed spanning forest (DSF) introduced by Baccelli and Bordenave is a planar directed forest whose vertex set is given by a homogeneous Poisson point process $\mathcal{N}$ on $\mathbb{R}^2$. If the DSF has direction $-e_y$, the ancestor $h(u)$ of a vertex $u \in \mathcal{N}$ is the nearest Poisson point (in the $L_2$ distance) having strictly larger $y$-coordinate. This construction induces complex geometrical dependencies. In this paper we show that the collection of DSF paths, properly scaled, converges in distribution to the Brownian web (BW). This verifies a conjecture made by Baccelli and Bordenave in 2007.
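
The ancestor rule stated in the abstract can be sketched in a few lines. This is an illustration only: a fixed number of uniform points in a bounded box stands in for the Poisson process on the whole plane (so boundary effects are ignored), and the function name `dsf_ancestors` is hypothetical.

```python
import math
import random

def dsf_ancestors(points):
    """Map each point index to the index of the nearest point with strictly larger y.

    The point with the largest y-coordinate has no ancestor in a finite sample (None).
    """
    ancestors = {}
    for i, (xi, yi) in enumerate(points):
        best, best_dist = None, math.inf
        for j, (xj, yj) in enumerate(points):
            if yj > yi:  # only points strictly above are candidates
                dist = math.hypot(xj - xi, yj - yi)  # Euclidean (L_2) distance
                if dist < best_dist:
                    best, best_dist = j, dist
        ancestors[i] = best
    return ancestors

rng = random.Random(0)
# Assumption: 200 uniform points in a 10 x 10 window approximate the Poisson sample.
points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
anc = dsf_ancestors(points)
print(sum(a is None for a in anc.values()), "point without an ancestor")
```

Because the $y$-coordinate strictly increases along every edge, following ancestors from any point can never revisit a point, so the resulting graph is a forest.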





Effective divisors on Hurwitz spaces. (arXiv:1804.01898v3 [math.AG] UPDATED)

We prove the effectiveness of the canonical bundle of several Hurwitz spaces of degree k covers of the projective line from curves of genus 13<g<20.





Conservative stochastic 2-dimensional Cahn-Hilliard equation. (arXiv:1802.04141v2 [math.PR] UPDATED)

We consider the stochastic 2-dimensional Cahn-Hilliard equation which is driven by the derivative in space of a space-time white noise. We use two different approaches to study this equation. First we prove that there exists a unique solution $Y$ to the shifted equation (see (1.4) below), then $X:=Y+{Z}$ is the unique solution to the stochastic Cahn-Hilliard equation, where ${Z}$ is the corresponding O-U process. Moreover, we use the Dirichlet form approach in \cite{Albeverio:1991hk} to construct the probabilistically weak solution to the original equation (1.1) below. By clarifying the precise relation between the solutions obtained by the Dirichlet form approach and $X$, we can also get the restricted Markov uniqueness of the generator and the uniqueness of martingale solutions to the equation (1.1).





Extremal values of the Sackin balance index for rooted binary trees. (arXiv:1801.10418v5 [q-bio.PE] UPDATED)

Tree balance plays an important role in different research areas like theoretical computer science and mathematical phylogenetics. For example, it has long been known that under the Yule model, a pure birth process, imbalanced trees are more likely than balanced ones. Therefore, different methods to measure the balance of trees were introduced. The Sackin index is one of the most frequently used measures for this purpose. In many contexts, statements about the minimal and maximal values of this index have been discussed, but formal proofs have never been provided. Moreover, while the number of trees with maximal Sackin index as well as the number of trees with minimal Sackin index when the number of leaves is a power of 2 are relatively easy to understand, the number of trees with minimal Sackin index for all other numbers of leaves was completely unknown. In this manuscript, we fully characterize trees with minimal and maximal Sackin index and also provide formulas to explicitly calculate the number of such trees.
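
As a small illustration, and assuming the standard definition of the Sackin index as the sum of the depths of all leaves (the abstract does not restate it), the extremal values can be found by brute force for a small number of leaves. The tree encoding (a leaf is `None`, an inner vertex is a pair of subtrees) and all function names are choices made for this sketch.

```python
def sackin(tree, depth=0):
    """Sum of leaf depths; a leaf is None, an inner vertex is a (left, right) pair."""
    if tree is None:
        return depth
    left, right = tree
    return sackin(left, depth + 1) + sackin(right, depth + 1)

def all_trees(n):
    """All rooted binary trees with n leaves, enumerated as ordered (left, right) pairs."""
    if n == 1:
        return [None]
    trees = []
    for k in range(1, n):
        for left in all_trees(k):
            for right in all_trees(n - k):
                trees.append((left, right))
    return trees

def caterpillar(n):
    """Tree in which every inner vertex has at least one leaf child."""
    tree = None
    for _ in range(n - 1):
        tree = (tree, None)
    return tree

values = [sackin(t) for t in all_trees(5)]
print("n = 5: minimal index", min(values), "maximal index", max(values))
print("caterpillar with 5 leaves:", sackin(caterpillar(5)))
```

Exhaustive enumeration only scales to a handful of leaves, since the number of ordered trees grows like the Catalan numbers; it serves here purely as a sanity check on small cases.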





Expansion of Iterated Stratonovich Stochastic Integrals of Arbitrary Multiplicity Based on Generalized Iterated Fourier Series Converging Pointwise. (arXiv:1801.00784v9 [math.PR] UPDATED)

The article is devoted to the expansion of iterated Stratonovich stochastic integrals of arbitrary multiplicity $k$ $(k\in\mathbb{N})$ based on the generalized iterated Fourier series. The case of Fourier-Legendre series as well as the case of trigonometric Fourier series are considered in detail. The obtained expansion provides a possibility to represent the iterated Stratonovich stochastic integral in the form of iterated series of products of standard Gaussian random variables. Convergence in the mean of degree $2n$ $(n\in \mathbb{N})$ of the expansion is proved. Some modifications of the mentioned expansion were derived for the case $k=2$. One of them is based on multiple trigonometric Fourier series converging almost everywhere in the square $[t, T]^2$. The results of the article can be applied to the numerical solution of Ito stochastic differential equations.





Local Moduli of Semisimple Frobenius Coalescent Structures. (arXiv:1712.08575v3 [math.DG] UPDATED)

We extend the analytic theory of Frobenius manifolds to semisimple points with coalescing eigenvalues of the operator of multiplication by the Euler vector field. We clarify which freedoms, ambiguities and mutual constraints are allowed in the definition of monodromy data, in view of their importance for conjectural relationships between Frobenius manifolds and derived categories. Detailed examples and applications are taken from singularity and quantum cohomology theories. We explicitly compute the monodromy data at points of the Maxwell Stratum of the A3-Frobenius manifold, as well as at the small quantum cohomology of the Grassmannian G(2,4). In the latter case, we analyse in detail the action of the braid group on the monodromy data. This proves that these data can be expressed in terms of characteristic classes of mutations of Kapranov's exceptional 5-block collection, as conjectured by one of the authors.





High dimensional expanders and coset geometries. (arXiv:1710.05304v3 [math.CO] UPDATED)

High dimensional expanders is a vibrant emerging field of study. Nevertheless, the only known construction of bounded degree high dimensional expanders is based on Ramanujan complexes, whereas one dimensional bounded degree expanders are abundant.

In this work, we construct new families of bounded degree high dimensional expanders obeying the local spectral expansion property. This property has a number of important consequences, including geometric overlapping, fast mixing of high dimensional random walks, agreement testing and agreement expansion. Our construction also yields new families of expander graphs which are close to the Ramanujan bound, i.e., their spectral gap is close to optimal.

The construction is quite elementary and it is presented in a self-contained manner. This is in contrast to the highly involved previously known construction of the Ramanujan complexes. The construction is also very symmetric (such symmetry properties are not known for Ramanujan complexes). The symmetry of the construction could be used, for example, in order to obtain good symmetric LDPC codes that were previously based on Ramanujan graphs.

The main tool that we use is the theory of coset geometries. Coset geometries arose as a tool for studying finite simple groups. Here, we show that coset geometries arise in a very natural manner for groups of elementary matrices over any finitely generated algebra over a commutative unital ring. In other words, we show that such groups act simply transitively on the top dimensional face of a pure, partite, clique complex.





Simulation of Integro-Differential Equation and Application in Estimation of Ruin Probability with Mixed Fractional Brownian Motion. (arXiv:1709.03418v6 [math.PR] UPDATED)

In this paper, we are concerned with the numerical solution of one type of integro-differential equation by a probability method based on the fundamental martingale of mixed Gaussian processes. As an application, we will try to simulate the estimation of ruin probability with an unknown parameter driven not by the classical Lévy process but by the mixed fractional Brownian motion.





Local mollification of Riemannian metrics using Ricci flow, and Ricci limit spaces. (arXiv:1706.09490v2 [math.DG] UPDATED)

We use Ricci flow to obtain a local bi-Hölder correspondence between Ricci limit spaces in three dimensions and smooth manifolds. This is more than a complete resolution of the three-dimensional case of the conjecture of Anderson-Cheeger-Colding-Tian, describing how Ricci limit spaces in three dimensions must be homeomorphic to manifolds, and we obtain this in the most general, locally non-collapsed case. The proofs build on results and ideas from recent papers of Hochard and the current authors.





The classification of Rokhlin flows on C*-algebras. (arXiv:1706.09276v6 [math.OA] UPDATED)

We study flows on C*-algebras with the Rokhlin property. We show that every Kirchberg algebra carries a unique Rokhlin flow up to cocycle conjugacy, which confirms a long-standing conjecture of Kishimoto. We moreover present a classification theory for Rokhlin flows on C*-algebras satisfying certain technical properties, which hold for many C*-algebras covered by the Elliott program. As a consequence, we obtain the following further classification theorems for Rokhlin flows. Firstly, we extend the statement of Kishimoto's conjecture to the non-simple case: Up to cocycle conjugacy, a Rokhlin flow on a separable, nuclear, strongly purely infinite C*-algebra is uniquely determined by its induced action on the prime ideal space. Secondly, we give a complete classification of Rokhlin flows on simple classifiable $KK$-contractible C*-algebras: Two Rokhlin flows on such a C*-algebra are cocycle conjugate if and only if their induced actions on the cone of lower-semicontinuous traces are affinely conjugate.





Categorification via blocks of modular representations for sl(n). (arXiv:1612.06941v3 [math.RT] UPDATED)

Bernstein, Frenkel, and Khovanov have constructed a categorification of tensor products of the standard representation of $\mathfrak{sl}_2$, where they use singular blocks of category $\mathcal{O}$ for $\mathfrak{sl}_n$ and translation functors. Here we construct a positive characteristic analogue using blocks of representations of $\mathfrak{sl}_n$ over a field $\textbf{k}$ of characteristic $p$ with zero Frobenius character, and singular Harish-Chandra character. We show that the aforementioned categorification admits a Koszul graded lift, which is equivalent to a geometric categorification constructed by Cautis, Kamnitzer, and Licata using coherent sheaves on cotangent bundles to Grassmannians. In particular, the latter admits an abelian refinement. With respect to this abelian refinement, the stratified Mukai flop induces a perverse equivalence on the derived categories for complementary Grassmannians. This is part of a larger project to give a combinatorial approach to Lusztig's conjectures for representations of Lie algebras in positive characteristic.





A Class of Functional Inequalities and their Applications to Fourth-Order Nonlinear Parabolic Equations. (arXiv:1612.03508v3 [math.AP] UPDATED)

We study a class of fourth order nonlinear parabolic equations which include the thin-film equation and the quantum drift-diffusion model as special cases. We investigate these equations by first developing functional inequalities of the type $ \int_\Omega u^{2\gamma-\alpha-\beta}\Delta u^\alpha\Delta u^\beta dx \geq c\int_\Omega|\Delta u^\gamma |^2dx $, which seem to be of interest in their own right.





On the zeros of the Riemann zeta function, twelve years later. (arXiv:0806.2361v7 [math.GM] UPDATED)

The paper proves the Riemann Hypothesis.





Word problems for finite nilpotent groups. (arXiv:2005.03634v1 [math.GR])

Let $w$ be a word in $k$ variables. For a finite nilpotent group $G$, a conjecture of Amit states that $N_w(1) \ge |G|^{k-1}$, where $N_w(1)$ is the number of $k$-tuples $(g_1,...,g_k)\in G^{(k)}$ such that $w(g_1,...,g_k)=1$. Currently, this conjecture is known to be true for groups of nilpotency class 2. Here we consider a generalized version of Amit's conjecture, and prove that $N_w(g) \ge |G|^{k-2}$, where $g$ is a $w$-value in $G$, for finite groups $G$ of odd order and nilpotency class 2. If $w$ is a word in two variables, we further show that $N_w(g) \ge |G|$, where $g$ is a $w$-value in $G$ for finite groups $G$ of nilpotency class 2. In addition, for $p$ a prime, we show that finite $p$-groups $G$, with two distinct irreducible complex character degrees, satisfy the generalized Amit conjecture for words $w_k =[x_1,y_1]...[x_k,y_k]$ with $k$ a natural number; that is, for $g$ a $w_k$-value in $G$ we have $N_{w_k}(g) \ge |G|^{2k-1}$.

Finally, we discuss the related group properties of being rational and chiral, and show that every finite group of nilpotency class 2 is rational.
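
The counting statements above can be checked by brute force on a small example. The sketch below is an illustration under stated assumptions rather than anything from the paper: it uses the Heisenberg group over the field with 3 elements (order 27, nilpotency class 2) and the two-variable commutator word $w(x,y)=x^{-1}y^{-1}xy$. The encoding of group elements as triples and the function names are choices made for this sketch.

```python
from collections import Counter
from itertools import product

P = 3  # odd prime; the group below has order P**3 and nilpotency class 2

def mul(x, y):
    """Product of upper unitriangular 3x3 matrices over Z/P, stored as triples (a, b, c)."""
    a, b, c = x
    d, e, f = y
    return ((a + d) % P, (b + e) % P, (c + f + a * e) % P)

def inv(x):
    """Inverse of (a, b, c) under mul."""
    a, b, c = x
    return (-a % P, -b % P, (a * b - c) % P)

def commutator(x, y):
    """The word w(x, y) = x^-1 y^-1 x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))

G = list(product(range(P), repeat=3))
# counts[g] is N_w(g): the number of pairs (x, y) with w(x, y) = g.
counts = Counter(commutator(x, y) for x in G for y in G)
for g, n in sorted(counts.items()):
    print(g, n, n >= len(G))
```

In this example every $w$-value $g$ satisfies $N_w(g) \ge |G| = 27$, in line with the two-variable bound stated in the abstract. A single small group is of course only a sanity check and not evidence for the general statement.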





A survey of Hardy type inequalities on homogeneous groups. (arXiv:2005.03614v1 [math.FA])

In this review paper, we survey Hardy type inequalities from the point of view of Folland and Stein's homogeneous groups. Particular attention is paid to Hardy type inequalities on stratified groups which give a special class of homogeneous groups. In this environment, the theory of Hardy type inequalities becomes intricately intertwined with the properties of sub-Laplacians and more general subelliptic partial differential equations. Particularly, we discuss the Badiale-Tarantello conjecture and a conjecture on the geometric Hardy inequality in a half-space of the Heisenberg group with a sharp constant.





On products of groups and indices not divisible by a given prime. (arXiv:2005.03608v1 [math.GR])

Let the group $G = AB$ be the product of subgroups $A$ and $B$, and let $p$ be a prime. We prove that $p$ does not divide the conjugacy class size (index) of each $p$-regular element of prime power order $x\in A\cup B$ if and only if $G$ is $p$-decomposable, i.e. $G=O_p(G) \times O_{p'}(G)$.





Groups up to congruence relation and from categorical groups to c-crossed modules. (arXiv:2005.03601v1 [math.CT])

We introduce a notion of c-group, which is a group up to congruence relation and consider the corresponding category. Extensions, actions and crossed modules (c-crossed modules) are defined in this category and the semi-direct product is constructed. We prove that each categorical group gives rise to c-groups and to a c-crossed module, which is a connected, special and strict c-crossed module in the sense defined by us. The results obtained here will be applied in the proof of an equivalence of the categories of categorical groups and connected, special and strict c-crossed modules.





A reducibility problem for even Unitary groups: The depth zero case. (arXiv:2005.03386v1 [math.RT])

We study a problem concerning parabolic induction in certain p-adic unitary groups. More precisely, for $E/F$ a quadratic extension of p-adic fields the associated unitary group $G=\mathrm{U}(n,n)$ contains a parabolic subgroup $P$ with Levi component $L$ isomorphic to $\mathrm{GL}_n(E)$. Let $\pi$ be an irreducible supercuspidal representation of $L$ of depth zero. We use Hecke algebra methods to determine when the parabolically induced representation $\iota_P^G \pi$ is reducible.





Evaluating the phase dynamics of coupled oscillators via time-variant topological features. (arXiv:2005.03343v1 [physics.data-an])

The characterization of phase dynamics in coupled oscillators offers insights into fundamental phenomena in complex systems. To describe the collective dynamics in the oscillatory system, order parameters are often used but are insufficient for identifying more specific behaviors. We therefore propose a topological approach that constructs quantitative features describing the phase evolution of oscillators. Here, the phase data are mapped into a high-dimensional space at each time point, and topological features describing the shape of the data are subsequently extracted from the mapped points. We extend these features to time-variant topological features by considering the evolution time, which serves as an additional dimension in the topological-feature space. The resulting time-variant features provide crucial insights into the time evolution of phase dynamics. We combine these features with the machine learning kernel method to characterize the multicluster synchronized dynamics at a very early stage of the evolution. Furthermore, we demonstrate the usefulness of our method for qualitatively explaining chimera states, which are states of stably coexisting coherent and incoherent groups in systems of identical phase oscillators. The experimental results show that our method is generally better than those using order parameters, especially if only data on the early-stage dynamics are available.





The Congruence Subgroup Problem for finitely generated Nilpotent Groups. (arXiv:2005.03263v1 [math.GR])

The congruence subgroup problem for a finitely generated group $\Gamma$ and $G\leq Aut(\Gamma)$ asks whether the map $\hat{G}\to Aut(\hat{\Gamma})$ is injective, or more generally, what is its kernel $C\left(G,\Gamma\right)$? Here $\hat{X}$ denotes the profinite completion of $X$. In the case $G=Aut(\Gamma)$ we denote $C\left(\Gamma\right)=C\left(Aut(\Gamma),\Gamma\right)$.

Let $\Gamma$ be a finitely generated group, $\bar{\Gamma}=\Gamma/[\Gamma,\Gamma]$, and $\Gamma^{*}=\bar{\Gamma}/tor(\bar{\Gamma})\cong\mathbb{Z}^{(d)}$. Denote $Aut^{*}(\Gamma)=\textrm{Im}(Aut(\Gamma)\to Aut(\Gamma^{*}))\leq GL_{d}(\mathbb{Z})$. In this paper we show that when $\Gamma$ is nilpotent, there is a canonical isomorphism $C\left(\Gamma\right)\simeq C(Aut^{*}(\Gamma),\Gamma^{*})$. In other words, $C\left(\Gamma\right)$ is completely determined by the solution to the classical congruence subgroup problem for the arithmetic group $Aut^{*}(\Gamma)$.

In particular, in the case where $\Gamma=\Psi_{n,c}$ is a finitely generated free nilpotent group of class $c$ on $n$ elements, we get that $C(\Psi_{n,c})=C(\mathbb{Z}^{(n)})=\{e\}$ whenever $n\geq3$, and $C(\Psi_{2,c})=C(\mathbb{Z}^{(2)})=\hat{F}_{\omega}$ = the free profinite group on a countable number of generators.





New constructions of strongly regular Cayley graphs on abelian groups. (arXiv:2005.03183v1 [math.CO])

In this paper, we give new constructions of strongly regular Cayley graphs on abelian groups as generalizations of a series of known constructions: the construction of covering extended building sets in finite fields by Xia (1992), the product construction of Menon-Hadamard difference sets by Turyn (1984), and the construction of Paley type partial difference sets by Polhill (2010). Then, we obtain new large families of strongly regular Cayley graphs of Latin square type or negative Latin square type.





Generalized Cauchy-Kovalevskaya extension and plane wave decompositions in superspace. (arXiv:2005.03160v1 [math-ph])

The aim of this paper is to obtain a generalized CK-extension theorem in superspace for the bi-axial Dirac operator. In the classical commuting case, this result can be written as a power series of Bessel type of certain differential operators acting on a single initial function. In the superspace setting, novel structures appear in the cases of negative even superdimensions. In these cases, the CK-extension depends on two initial functions on which two power series of differential operators act. These series are not only of Bessel type but they give rise to an additional structure in terms of Appell polynomials. This pattern also is present in the structure of the Pizzetti formula, which describes integration over the supersphere in terms of differential operators. We make this relation explicit by studying the decomposition of the generalized CK-extension into plane waves integrated over the supersphere. Moreover, these results are applied to obtain a decomposition of the Cauchy kernel in superspace into monogenic plane waves, which shall be useful for inverting the super Radon transform.





Irreducible representations of Braid Group $B_n$ of dimension $n+1$. (arXiv:2005.03105v1 [math.GR])

We prove that there are no irreducible representations of $B_n$ of dimension $n+1$ for $n\geq 10$.





GraCIAS: Grassmannian of Corrupted Images for Adversarial Security. (arXiv:2005.02936v2 [cs.CV] UPDATED)

Input transformation based defense strategies fall short in defending against strong adversarial attacks. Some successful defenses adopt approaches that either increase the randomness within the applied transformations, or make the defense computationally intensive, making it substantially more challenging for the attacker. However, it limits the applicability of such defenses as a pre-processing step, similar to computationally heavy approaches that use retraining and network modifications to achieve robustness to perturbations. In this work, we propose a defense strategy that applies random image corruptions to the input image alone, constructs a self-correlation based subspace followed by a projection operation to suppress the adversarial perturbation. Due to its simplicity, the proposed defense is computationally efficient as compared to the state-of-the-art, and yet can withstand huge perturbations. Further, we develop proximity relationships between the projection operator of a clean image and of its adversarially perturbed version, via bounds relating geodesic distance on the Grassmannian to matrix Frobenius norms. We empirically show that our strategy is complementary to other weak defenses like JPEG compression and can be seamlessly integrated with them to create a stronger defense. We present extensive experiments on the ImageNet dataset across four different models, namely InceptionV3, ResNet50, VGG16 and MobileNet, with perturbation magnitude set to $\epsilon = 16$. Unlike state-of-the-art approaches, even without any retraining, the proposed strategy achieves an absolute improvement of ~ 4.5% in defense accuracy on ImageNet.





A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus. (arXiv:2005.02911v2 [quant-ph] UPDATED)

Quantum computing has evolved quickly in recent years and is showing significant benefits in a variety of fields. Malware analysis is one of those fields that could also take advantage of quantum computing. The combination of software used to locate the most frequent hashes and $n$-grams between benign and malicious software (KiloGram) and a quantum search algorithm could be beneficial, by loading the table of hashes and $n$-grams into a quantum computer, and thereby speeding up the process of mapping $n$-grams to their hashes. The first phase will be to use KiloGram to find the top-$k$ hashes and $n$-grams for a large malware corpus. From here, the resulting hash table is then loaded into a quantum machine. A quantum search algorithm is then used to search among every permutation of the entangled key and value pairs to find the desired hash value. This prevents one from having to re-compute hashes for a set of $n$-grams, which can take on average $O(MN)$ time, whereas the quantum algorithm could take $O(\sqrt{N})$ in the number of table lookups to find the desired hash values.
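
To make the $O(\sqrt{N})$ lookup count concrete, here is a minimal classical simulation of a Grover-style search over $N$ table entries. The abstract only says "a quantum search algorithm", so the choice of Grover's iteration is an assumption, as are the function name and the toy table size; loading the hash table into a quantum machine is not modeled at all.

```python
import math

def grover_success_probability(n_items, marked, iterations):
    """Simulate the amplitudes of a Grover-style search on a classical machine.

    Returns the probability of measuring the marked index after the given
    number of oracle-plus-diffusion rounds.
    """
    amp = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]             # oracle: flip the sign of the marked entry
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]      # diffusion: inversion about the mean
    return amp[marked] ** 2

N = 16                                          # hypothetical table size
rounds = math.floor(math.pi / 4 * math.sqrt(N))  # about (pi/4) * sqrt(N) oracle calls
p = grover_success_probability(N, marked=5, iterations=rounds)
print(rounds, "oracle calls, success probability", round(p, 4))
```

With 16 entries, three oracle calls already make the marked entry the overwhelmingly likely measurement outcome, whereas a classical scan would need 8 lookups on average.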





Multi-Resolution POMDP Planning for Multi-Object Search in 3D. (arXiv:2005.02878v2 [cs.RO] UPDATED)

Robots operating in household environments must find objects on shelves, under tables, and in cupboards. Previous work often formulates the object search problem as a POMDP (Partially Observable Markov Decision Process), yet constrains the search space to 2D. We propose a new approach that enables the robot to efficiently search for objects in 3D, taking occlusions into account. We model the problem as an object-oriented POMDP, where the robot receives a volumetric observation from a viewing frustum and must produce a policy to efficiently search for objects. To address the challenge of large state and observation spaces, we first propose a per-voxel observation model which drastically reduces the observation size necessary for planning. Then, we present a novel octree-based belief representation which captures beliefs at different resolutions and supports efficient exact belief update. Finally, we design an online multi-resolution planning algorithm that leverages the resolution layers in the octree structure as levels of abstractions to the original POMDP problem. Our evaluation in a simulated 3D domain shows that, as the problem scales, our approach significantly outperforms baselines without resolution hierarchy by 25%-35% in cumulative reward. We demonstrate the practicality of our approach on a torso-actuated mobile robot searching for objects in areas of a cluttered lab environment where objects appear on surfaces at different heights.





Modeling nanoconfinement effects using active learning. (arXiv:2005.02587v2 [physics.app-ph] UPDATED)

Predicting the spatial configuration of gas molecules in nanopores of shale formations is crucial for fluid flow forecasting and hydrocarbon reserves estimation. The key challenge in these tight formations is that the majority of the pore sizes are less than 50 nm. At this scale, the fluid properties are affected by nanoconfinement effects due to the increased fluid-solid interactions. For instance, gas adsorption to the pore walls could account for up to 85% of the total hydrocarbon volume in a tight reservoir. Although there are analytical solutions that describe this phenomenon for simple geometries, they are not suitable for describing realistic pores, where surface roughness and geometric anisotropy play important roles. To describe these, molecular dynamics (MD) simulations are used since they consider fluid-solid and fluid-fluid interactions at the molecular level. However, MD simulations are computationally expensive, and are not able to simulate scales larger than a few connected nanopores. We present a method for building and training physics-based deep learning surrogate models to carry out fast and accurate predictions of molecular configurations of gas inside nanopores. Since training deep learning models requires extensive databases that are computationally expensive to create, we employ active learning (AL). AL reduces the overhead of creating comprehensive sets of high-fidelity data by determining where the model uncertainty is greatest, and running simulations on the fly to minimize it. The proposed workflow enables nanoconfinement effects to be rigorously considered at the mesoscale where complex connected sets of nanopores control key applications such as hydrocarbon recovery and CO2 sequestration.





Multi-task pre-training of deep neural networks for digital pathology. (arXiv:2005.02561v2 [eess.IV] UPDATED)

In this work, we investigate multi-task learning as a way of pre-training models for classification tasks in digital pathology. It is motivated by the fact that many small and medium-size datasets have been released by the community over the years whereas there is no large scale dataset similar to ImageNet in the domain. We first assemble and transform many digital pathology datasets into a pool of 22 classification tasks and almost 900k images. Then, we propose a simple architecture and training scheme for creating a transferable model and a robust evaluation and selection protocol in order to evaluate our method. Depending on the target task, we show that our models used as feature extractors either improve significantly over ImageNet pre-trained models or provide comparable performance. Fine-tuning improves performance over feature extraction and is able to recover the lack of specificity of ImageNet features, as both pre-training sources yield comparable performance.





The Cascade Transformer: an Application for Efficient Answer Sentence Selection. (arXiv:2005.02534v2 [cs.CL] UPDATED)

Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference. In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers. Each ranker is used to prune a subset of candidates in a batch, thus dramatically increasing throughput at inference time. Partial encodings from the transformer model are shared among rerankers, providing further speed-up. When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy, as measured on two English Question Answering datasets.
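The pruning idea can be sketched as a generic cascade of scoring functions. This is an illustrative sketch only: the two toy rankers, the `keep_fraction` parameter, and the example sentences are assumptions, and the sharing of partial transformer encodings between stages is not modeled.

```python
def cascade_rank(candidates, rankers, keep_fraction=0.5):
    """Score candidates stage by stage, dropping the lowest-scored ones each time."""
    survivors = list(candidates)
    for ranker in rankers:
        # Rank the remaining candidates with the current stage.
        scored = sorted(survivors, key=ranker, reverse=True)
        # Keep only the top fraction (at least one) for the next stage.
        keep = max(1, int(len(scored) * keep_fraction))
        survivors = scored[:keep]
    return survivors


question = "what is the capital of france"
answers = [
    "paris is the capital of france",
    "france is in europe",
    "the capital of spain is madrid",
    "bananas are yellow",
]


def overlap(candidate):
    """Cheap first-stage score: number of words shared with the question."""
    return len(set(candidate.split()) & set(question.split()))


def overlap_ratio(candidate):
    """Second-stage score: shared words normalized by candidate length."""
    return overlap(candidate) / len(candidate.split())


best = cascade_rank(answers, [overlap, overlap_ratio])
```

Later stages see fewer candidates, which is where the throughput gain comes from when later stages are the more expensive ones.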





On the list recoverability of randomly punctured codes. (arXiv:2005.02478v2 [math.CO] UPDATED)

We show that a random puncturing of a code with good distance is list recoverable beyond the Johnson bound. In particular, this implies that there are Reed-Solomon codes that are list recoverable beyond the Johnson bound. It was previously known that there are Reed-Solomon codes that do not have this property. As an immediate corollary to our main theorem, we obtain better degree bounds on unbalanced expanders that come from Reed-Solomon codes.





Temporal Event Segmentation using Attention-based Perceptual Prediction Model for Continual Learning. (arXiv:2005.02463v2 [cs.CV] UPDATED)

Temporal event segmentation of a long video into coherent events requires a high-level understanding of the temporal features of activities. The event segmentation problem has been tackled by researchers in an offline training scheme, either by providing full or weak supervision through manually annotated labels or by self-supervised epoch-based training. In this work, we present a continual learning perceptual prediction framework (influenced by cognitive psychology) capable of temporal event segmentation through understanding of the underlying representation of objects within individual frames. Our framework also outputs attention maps which effectively localize and track event-causing objects in each frame. The model is tested on a wildlife monitoring dataset in a continual training manner, resulting in an $80\%$ recall rate at a $20\%$ false positive rate for frame-level segmentation. Activity-level testing has yielded an $80\%$ activity recall rate with one false activity detection every 50 minutes.





Differential Machine Learning. (arXiv:2005.02347v2 [q-fin.CP] UPDATED)

Differential machine learning (ML) extends supervised learning, with models trained on examples of not only inputs and labels, but also differentials of labels to inputs.

Differential ML is applicable in all situations where high-quality first-order derivatives with respect to training inputs are available. In the context of financial derivatives risk management, pathwise differentials are efficiently computed with automatic adjoint differentiation (AAD). Differential ML, combined with AAD, provides extremely effective pricing and risk approximations. We can produce fast pricing analytics in models too complex for closed-form solutions, extract the risk factors of complex transactions and trading books, and effectively compute risk management metrics like reports across a large number of scenarios, backtesting and simulation of hedge strategies, or capital regulations.

The article focuses on differential deep learning (DL), arguably the strongest application. Standard DL trains neural networks (NN) on punctual examples, whereas differential DL teaches them the shape of the target function, resulting in vastly improved performance, illustrated with a number of numerical examples, both idealized and real world. In the online appendices, we apply differential learning to other ML models, like classic regression or principal component analysis (PCA), with equally remarkable results.

This paper is meant to be read in conjunction with its companion GitHub repo https://github.com/differential-machine-learning, where we posted a TensorFlow implementation, tested on Google Colab, along with examples from the article and additional ones. We also posted appendices covering many practical implementation details not covered in the paper, mathematical proofs, application to ML models besides neural networks and extensions necessary for a reliable implementation in production.
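As a rough illustration of training on both labels and differentials, the sketch below fits a cubic polynomial by least squares on values and derivatives together. It is not taken from the companion repository: the target function, the sample points, the `weight` parameter, and the helper names are all assumptions chosen for the example. Three samples of values alone cannot determine four coefficients, but adding the three derivative labels makes the fit well posed.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_cubic(xs, ys, dys, weight=1.0):
    """Minimize sum (p(x)-y)^2 + weight * sum (p'(x)-dy)^2 over cubic p."""
    scale = weight ** 0.5
    rows, targets = [], []
    for x, y in zip(xs, ys):
        rows.append([1.0, x, x ** 2, x ** 3])        # value residual
        targets.append(y)
    for x, dy in zip(xs, dys):
        rows.append([0.0, scale, 2 * scale * x, 3 * scale * x ** 2])  # derivative residual
        targets.append(scale * dy)
    # Normal equations of the stacked least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(4)]
    return solve(ata, aty)


# Assumed toy target f(x) = x^3 - x with derivative f'(x) = 3x^2 - 1.
xs = [-1.0, 0.5, 2.0]
ys = [x ** 3 - x for x in xs]
dys = [3 * x ** 2 - 1 for x in xs]
coefficients = fit_cubic(xs, ys, dys)   # ordered as [c0, c1, c2, c3]
```

The same combined loss carries over to neural networks, where the derivative of the prediction with respect to the inputs would come from automatic differentiation instead of a closed-form row.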





Automata Tutor v3. (arXiv:2005.01419v2 [cs.FL] UPDATED)

Computer science class enrollments have rapidly risen in the past decade. With current class sizes, standard approaches to grading and providing personalized feedback are no longer possible and new techniques become both feasible and necessary. In this paper, we present the third version of Automata Tutor, a tool for helping teachers and students in large courses on automata and formal languages. The second version of Automata Tutor supported automatic grading and feedback for finite-automata constructions and has already been used by thousands of users in dozens of countries. This new version of Automata Tutor supports automated grading and feedback generation for a greatly extended variety of new problems, including problems that ask students to create regular expressions, context-free grammars, pushdown automata and Turing machines corresponding to a given description, and problems about converting between equivalent models - e.g., from regular expressions to nondeterministic finite automata. Moreover, for several problems, this new version also enables teachers and students to automatically generate new problem instances. We also present the results of a survey run on a class of 950 students, which shows very positive results about the usability and usefulness of the tool.





The Sensitivity of Language Models and Humans to Winograd Schema Perturbations. (arXiv:2005.01348v2 [cs.CL] UPDATED)

Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues.





Prediction of Event Related Potential Speller Performance Using Resting-State EEG. (arXiv:2005.01325v3 [cs.HC] UPDATED)

Event-related potential (ERP) spellers can be used for device control and communication by locked-in or severely injured patients. However, problems such as inter-subject performance instability and ERP-illiteracy are still unresolved. Therefore, it is necessary to predict classification performance before running an ERP speller session in order to use it efficiently. In this study, we investigated how resting-state EEG recorded before an ERP speller session correlates with ERP speller performance. Specifically, we used spectral power and functional connectivity across four brain regions and five frequency bands. As a result, the delta power in the frontal region and functional connectivity in the delta, alpha, and gamma bands are significantly correlated with ERP speller performance. We also predicted ERP speller performance using EEG features from the resting state. These findings may contribute to investigating ERP-illiteracy and to considering appropriate alternatives for each user.





Quantum arithmetic operations based on quantum Fourier transform on signed integers. (arXiv:2005.00443v2 [cs.IT] UPDATED)

The quantum Fourier transform (QFT) brings efficiency in many respects, especially resource usage, for most operations on quantum computers. In this study, the existing QFT-based and non-QFT-based quantum arithmetic operations are examined. The capabilities of QFT-based addition and multiplication are improved with some modifications. The proposed operations are compared with the nearest existing quantum arithmetic operations. Furthermore, novel QFT-based subtraction and division operations are presented. The proposed arithmetic operations can perform non-modular operations on all signed numbers without any limitation while using fewer resources. In addition, novel quantum circuits for two's complement, absolute value, and comparison operations are presented, built on the proposed QFT-based addition and subtraction operations.
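To make the idea of QFT-based addition on signed integers concrete, here is a small classical state-vector simulation of the standard phase-rotation adder with two's complement encoding. It is a sketch under stated assumptions, not the circuits proposed in the paper: the function names are hypothetical, the transform is computed by direct summation instead of gates, and the result is only free of wrap-around when the caller picks `n_bits` large enough to hold the sum.

```python
import cmath


def qft(state, inverse=False):
    """Apply the (inverse) quantum Fourier transform by direct summation."""
    n = len(state)
    sign = -1.0 if inverse else 1.0
    return [
        sum(state[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        / n ** 0.5
        for k in range(n)
    ]


def qft_add(a, b, n_bits):
    """Add signed integers a and b through phase rotations in the Fourier basis."""
    size = 1 << n_bits
    state = [0j] * size
    state[b % size] = 1.0 + 0j               # basis state |b> in two's complement
    state = qft(state)
    # Adding a is a phase exp(2*pi*i*a*k/size) on Fourier component k.
    state = [
        amp * cmath.exp(2j * cmath.pi * (a % size) * k / size)
        for k, amp in enumerate(state)
    ]
    state = qft(state, inverse=True)
    # The state is now (numerically) a single basis state; read it out.
    result = max(range(size), key=lambda i: abs(state[i]))
    # Decode two's complement: the upper half of the range is negative.
    return result - size if result >= size // 2 else result


total = qft_add(3, -5, n_bits=4)
```

Subtraction follows from the same routine by negating the rotated operand, since in two's complement `a - b` equals `a + (-b)`.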





On-board Deep-learning-based Unmanned Aerial Vehicle Fault Cause Detection and Identification. (arXiv:2005.00336v2 [eess.SP] UPDATED)

With the increasing use of Unmanned Aerial Vehicles (UAVs)/drones, it is important to detect and identify causes of failure in real time for proper recovery from a potential crash-like scenario or for post-incident forensic analysis. The cause of a crash could be a fault in the sensor/actuator system, physical damage or attack, or a cyber attack on the drone's software. In this paper, we propose novel architectures based on deep Convolutional and Long Short-Term Memory Neural Networks (CNNs and LSTMs) to detect (via an autoencoder) and classify drone mis-operations based on sensor data. The proposed architectures are able to learn high-level features automatically from the raw sensor data and learn the spatial and temporal dynamics in the sensor data. We validate the proposed deep-learning architectures via simulations and experiments on a real drone. Empirical results show that our solution is able to detect drone mis-operations with over 90% accuracy and classify various types of mis-operations with about 99% accuracy on simulation data and up to 88% accuracy on experimental data.





Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment. (arXiv:2005.00165v3 [cs.CL] UPDATED)

A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e., whether a grammatical sentence is more probable than an ungrammatical one). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e., typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all.





Generative Adversarial Networks in Digital Pathology: A Survey on Trends and Future Potential. (arXiv:2004.14936v2 [eess.IV] UPDATED)

Image analysis in the field of digital pathology has recently gained increased popularity. The use of high-quality whole slide scanners enables the fast acquisition of large amounts of image data, showing extensive context and microscopic detail at the same time. Simultaneously, novel machine learning algorithms have boosted the performance of image analysis approaches. In this paper, we focus on a particularly powerful class of architectures, called Generative Adversarial Networks (GANs), applied to histological image data. Besides improving performance, GANs also enable application scenarios in this field that were previously intractable. However, GANs could also introduce bias. Here, we summarize the recent state-of-the-art developments in a generalizing notation, present the main applications of GANs, and give an outlook on some promising approaches and their possible future applications. In addition, we identify currently unavailable methods with potential for future applications.





Towards Embodied Scene Description. (arXiv:2004.14638v2 [cs.RO] UPDATED)

Embodiment is an important characteristic of all intelligent agents (creatures and robots), yet existing scene description tasks mainly focus on analyzing images passively, and the semantic understanding of the scenario is separated from the interaction between the agent and the environment. In this work, we propose Embodied Scene Description, which exploits the embodiment ability of the agent to find an optimal viewpoint in its environment for scene description tasks. A learning framework combining the paradigms of imitation learning and reinforcement learning is established to teach the intelligent agent to generate corresponding sensorimotor activities. The proposed framework is tested on both the AI2Thor dataset and a real-world robotic platform, demonstrating the effectiveness and extendability of the developed method.





Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images. (arXiv:2004.14487v2 [cs.CV] UPDATED)

The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first of its kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.





When Hearing Defers to Touch. (arXiv:2004.13462v2 [q-bio.NC] UPDATED)

Hearing is often believed to be more sensitive than touch. This assertion is based on a comparison of sensitivities to weak stimuli. The respective stimuli, however, are not easily comparable since hearing is gauged using acoustic pressure and touch using skin displacement. We show that under reasonable assumptions the auditory and tactile detection thresholds can be reconciled on a level playing field. The results indicate that the capacity of touch and hearing to detect weak stimuli varies according to the size of a sensed object as well as to the frequency of its oscillations. In particular, touch is found to be more effective than hearing at detecting small and slow objects.





Self-Attention with Cross-Lingual Position Representation. (arXiv:2004.13310v2 [cs.CL] UPDATED)

Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g. machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with cross-lingual position representations to model the bilingually aware latent structure for the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English$\Rightarrow$German, WAT'17 Japanese$\Rightarrow$English, and WMT'17 Chinese$\Leftrightarrow$English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information.