cl MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. (arXiv:2005.03545v1 [cs.CL]) By arxiv.org Published On :: Multimodal Sentiment Analysis is an active area of research that leverages multimodal signals for affective understanding of user-generated videos. The predominant approach, addressing this task, has been to develop sophisticated fusion techniques. However, the heterogeneous nature of the signals creates distributional modality gaps that pose significant challenges. In this paper, we aim to learn effective modality representations to aid the process of fusion. We propose a novel framework, MISA, which projects each modality to two distinct subspaces. The first subspace is modality invariant, where the representations across modalities learn their commonalities and reduce the modality gap. The second subspace is modality-specific, which is private to each modality and captures their characteristic features. These representations provide a holistic view of the multimodal data, which is used for fusion that leads to task predictions. Our experiments on popular sentiment analysis benchmarks, MOSI and MOSEI, demonstrate significant gains over state-of-the-art models. We also consider the task of Multimodal Humor Detection and experiment on the recently proposed UR_FUNNY dataset. Here too, our model fares better than strong baselines, establishing MISA as a useful multimodal framework. Full Article
cl The Danish Gigaword Project. (arXiv:2005.03521v1 [cs.CL]) By arxiv.org Published On :: Danish is a North Germanic/Scandinavian language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. However, from a technological perspective, the Danish language has received relatively little attention and, as a result, Danish language technology is hard to develop, in part due to a lack of large or broad-coverage Danish corpora. This paper describes the Danish Gigaword project, which aims to construct a freely-available one billion word corpus of Danish text that represents the breadth of the written language. Full Article
cl Practical Perspectives on Quality Estimation for Machine Translation. (arXiv:2005.03519v1 [cs.CL]) By arxiv.org Published On :: Sentence level quality estimation (QE) for machine translation (MT) attempts to predict the translation edit rate (TER) cost of post-editing work required to correct MT output. We describe our view on sentence-level QE as dictated by several practical setups encountered in the industry. We find consumers of MT output, whether human or algorithmic, to be primarily interested in a binary quality metric: is the translated sentence adequate as-is or does it need post-editing? Motivated by this we propose a quality classification (QC) view on sentence-level QE whereby we focus on maximizing recall at precision above a given threshold. We demonstrate that, while classical QE regression models fare poorly on this task, they can be re-purposed by replacing the output regression layer with a binary classification one, achieving 50-60% recall at 90% precision. For a high-quality MT system producing 75-80% correct translations, this promises a significant reduction in post-editing work. Full Article
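The "recall at precision above a given threshold" objective in the entry above can be illustrated with a short sketch. This is not the authors' code: the function name `max_recall_at_precision`, the score convention (higher means "adequate as-is"), and the toy data are all assumptions made for illustration.

```python
def max_recall_at_precision(scores, labels, min_precision):
    """Return the highest recall reachable while precision >= min_precision.

    scores: hypothetical classifier confidence that a translation is adequate as-is.
    labels: 1 if the translation is truly adequate, 0 if it needs post-editing.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Items with identical scores are accepted together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision:
            best = max(best, recall)
        i = j
    return best


# Toy data (made up): five sentences, three of them truly adequate.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
recall_at_90 = max_recall_at_precision(scores, labels, 0.9)
```

On this toy data only the two top-scored sentences can be accepted without precision falling below 0.9, so recall is 2/3.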
cl Computing with bricks and mortar: Classification of waveforms with a doped concrete blocks. (arXiv:2005.03498v1 [cs.ET]) By arxiv.org Published On :: We present results showing the capability of a concrete-based information processing substrate in the signal classification task, in accordance with the in materio computing paradigm. As Reservoir Computing is a suitable model for describing embedded in materio computation, we propose that this type of basic construction unit can be used as a source for the "reservoir of states" necessary for simple tuning of the readout layer. From that perspective, buildings constructed from computing concrete could function as a highly parallel information processor for smart architecture. We present an electrical characterization of a set of samples with different additive concentrations, followed by a dynamical analysis of selected specimens showing fingerprints of memfractive properties. Moreover, on the basis of the obtained parameters, classification of signal waveform shapes can be performed in scenarios explicitly tuned for a given device terminal. Full Article
cl Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences. (arXiv:2005.03436v1 [cs.CL]) By arxiv.org Published On :: The patterns in which the syntax of different languages converges and diverges are often used to inform work on cross-lingual transfer. Nevertheless, little empirical work has been done on quantifying the prevalence of different syntactic divergences across language pairs. We propose a framework for extracting divergence patterns for any language pair from a parallel corpus, building on Universal Dependencies. We show that our framework provides a detailed picture of cross-language divergences, generalizes previous approaches, and lends itself to full automation. We further present a novel dataset, a manually word-aligned subset of the Parallel UD corpus in five languages, and use it to perform a detailed corpus study. We demonstrate the usefulness of the resulting analysis by showing that it can help account for performance patterns of a cross-lingual parser. Full Article
cl The Perceptimatic English Benchmark for Speech Perception Models. (arXiv:2005.03418v1 [cs.CL]) By arxiv.org Published On :: We present the Perceptimatic English Benchmark, an open experimental benchmark for evaluating quantitative models of speech perception in English. The benchmark consists of ABX stimuli along with the responses of 91 American English-speaking listeners. The stimuli test discrimination of a large number of English and French phonemic contrasts. They are extracted directly from corpora of read speech, making them appropriate for evaluating statistical acoustic models (such as those used in automatic speech recognition) trained on typical speech data sets. We show that phone discrimination is correlated with several types of models, and give recommendations for researchers seeking easily calculated norms of acoustic distance on experimental stimuli. We show that DeepSpeech, a standard English speech recognizer, is more specialized on English phoneme discrimination than English listeners, and is poorly correlated with their behaviour, even though it yields a low error on the decision task given to humans. Full Article
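As a rough illustration of how an ABX discrimination task can be scored for a model, here is a minimal sketch. It assumes each stimulus has already been reduced to a fixed-length feature vector and uses plain Euclidean distance; both are simplifying assumptions, and `abx_accuracy` is a hypothetical helper, not part of the benchmark.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def abx_accuracy(triplets, distance=euclidean):
    """Fraction of ABX triplets where X is closer to A than to B.

    triplets: list of (a, b, x), where x belongs to the same category as a.
    Ties count as half correct.
    """
    correct = 0.0
    for a, b, x in triplets:
        d_ax, d_bx = distance(a, x), distance(b, x)
        if d_ax < d_bx:
            correct += 1.0
        elif d_ax == d_bx:
            correct += 0.5
    return correct / len(triplets)


# Two made-up triplets: the first is discriminated correctly, the second is not.
triplets = [
    ((0.0, 0.0), (1.0, 1.0), (0.1, 0.0)),
    ((0.0, 0.0), (1.0, 1.0), (0.9, 1.0)),
]
score = abx_accuracy(triplets)
```

Real speech stimuli vary in length, so a practical distance would need to align frame sequences first; that step is left out here.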
cl Scheduling with a processing time oracle. (arXiv:2005.03394v1 [cs.DS]) By arxiv.org Published On :: In this paper we study a single machine scheduling problem on a set of independent jobs whose execution time is not known, but guaranteed to be either short or long, for two given processing times. At every time step, the scheduler can either test a job by querying a processing time oracle, which reveals its processing time and occupies one time unit on the schedule, or execute a job, whether previously tested or not. The objective value is the total completion time over all jobs, and is compared with the objective value of an optimal schedule, which does not need to test. The resulting competitive ratio measures the price of hidden processing time. Two models are studied in this paper. In the non-adaptive model, the algorithm needs to decide beforehand which jobs to test, and which jobs to execute untested. In the adaptive model, however, the algorithm can make these decisions adaptively based on the outcomes of the job tests. In both models we provide optimal polynomial time two-phase algorithms, which consist of a first phase where jobs are tested, and a second phase where jobs are executed untested. Experiments give strong evidence that optimal algorithms have this structure. Proving this property is left as an open problem. Full Article
cl Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation. (arXiv:2005.03393v1 [cs.CL]) By arxiv.org Published On :: In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder does not only encode the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder in context-aware translation: some of the improvements come from robust training. We compare several methods that introduce noise and/or well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. Also, we establish a new state-of-the-art on the IWSLT Fr-En task by careful use of noise generation and dropout methods. Full Article
cl 2kenize: Tying Subword Sequences for Chinese Script Conversion. (arXiv:2005.03375v1 [cs.CL]) By arxiv.org Published On :: Simplified Chinese to Traditional Chinese character conversion is a common preprocessing step in Chinese NLP. Despite this, current approaches have poor performance because they do not take into account that a simplified Chinese character can correspond to multiple traditional characters. Here, we propose a model that can disambiguate between mappings and convert between the two scripts. The model is based on subword segmentation, two language models, as well as a method for mapping between subword sequences. We further construct benchmark datasets for topic classification and script conversion. Our proposed method outperforms previous Chinese Character conversion approaches by 6 points in accuracy. These results are further confirmed in a downstream application, where 2kenize is used to convert pretraining dataset for topic classification. An error analysis reveals that our method's particular strengths are in dealing with code-mixing and named entities. Full Article
cl Playing Minecraft with Behavioural Cloning. (arXiv:2005.03374v1 [cs.AI]) By arxiv.org Published On :: The MineRL 2019 competition challenged participants to train sample-efficient agents to play Minecraft, using a dataset of human gameplay and a limited number of steps in the environment. We approached this task with behavioural cloning by predicting what actions human players would take, and reached fifth place in the final ranking. Despite this being a simple algorithm, we observed that the performance of such an approach can vary significantly, based on when the training is stopped. In this paper, we detail our submission to the competition, run further experiments to study how performance varied over training and study how different engineering decisions affected these results. Full Article
cl JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL]) By arxiv.org Published On :: Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese-English and News Commentary Japanese-Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods, indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks. Full Article
cl DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL]) By arxiv.org Published On :: Despite recent progress in computer vision and natural language processing, video understanding intelligence is still hard to achieve due to the intrinsic difficulty of story in video. Moreover, there is no theoretical metric for evaluating the degree of video understanding. In this paper, we propose a novel video question answering (Video QA) task, DramaQA, for a comprehensive understanding of the video story. DramaQA focuses on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations to model local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and it contains 16,191 QA pairs from 23,928 various length video clips, with each QA pair belonging to one of four difficulty levels. We provide 217,308 annotated images with rich character-centered annotations, including visual bounding boxes, behaviors, and emotions of main characters, and coreference resolved scripts. Additionally, we provide analyses of the dataset as well as a Dual Matching Multistream model which effectively learns character-centered representations of video to answer questions about the video. We are planning to release our dataset and model publicly for research purposes and expect that our work will provide a new perspective on video story understanding research. Full Article
cl Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV]) By arxiv.org Published On :: Convolutional Neural Networks (CNNs) are generally prone to noise interruptions, i.e., small image noise can cause drastic changes in the output. To suppress the effect of noise on the final prediction, we enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets such as Haar, Daubechies, and Cohen, and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into the low-frequency and high-frequency components during the down-sampling. The low-frequency component stores main information including the basic object structures, which is transmitted into the subsequent layers to extract robust high-level features. The high-frequency components, containing most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions. Full Article
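The down-sampling idea in the entry above, keeping the low-frequency band and dropping the high-frequency ones, can be sketched for the simplest case: a one-level 2D Haar transform on a single feature map. This is an illustrative pure-Python sketch, not the WaveCNets implementation; the function name `haar_lowpass_downsample` is hypothetical and the orthonormal Haar normalization is an assumption.

```python
def haar_lowpass_downsample(feature_map):
    """Keep only the low-frequency (LL) band of a one-level 2D Haar DWT.

    feature_map: list of rows (lists of numbers) with even height and width.
    Returns a map with half the height and half the width.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    out = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            block_sum = (
                feature_map[r][c] + feature_map[r][c + 1]
                + feature_map[r + 1][c] + feature_map[r + 1][c + 1]
            )
            # Orthonormal Haar low-pass filter (1/sqrt(2), 1/sqrt(2)) applied
            # along rows and then columns scales each 2x2 block sum by 1/2.
            row.append(block_sum / 2.0)
        out.append(row)
    return out


smooth = [[1, 1], [1, 1]]          # pure low-frequency content
checkerboard = [[1, -1], [-1, 1]]  # pure high-frequency content
kept = haar_lowpass_downsample(smooth)
dropped = haar_lowpass_downsample(checkerboard)
```

The smooth block survives as a single value, while the checkerboard pattern, which lives entirely in the high-frequency bands, maps to zero.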
cl Boosting Cloud Data Analytics using Multi-Objective Optimization. (arXiv:2005.03314v1 [cs.DB]) By arxiv.org Published On :: Data analytics in the cloud has become an integral part of enterprise businesses. Big data analytics systems, however, still lack the ability to take user performance goals and budgetary constraints for a task, collectively referred to as task objectives, and automatically configure an analytic job to achieve these objectives. This paper presents a data analytics optimizer that can automatically determine a cluster configuration with a suitable number of cores as well as other system parameters that best meet the task objectives. At the core of our work is a principled multi-objective optimization (MOO) approach that computes a Pareto optimal set of job configurations to reveal tradeoffs between different user objectives, recommends a new job configuration that best explores such tradeoffs, and employs novel optimizations to enable such recommendations within a few seconds. We present efficient incremental algorithms based on the notion of a Progressive Frontier for realizing our MOO approach and implement them into a Spark-based prototype. Detailed experiments using benchmark workloads show that our MOO techniques provide a 2-50x speedup over existing MOO methods, while offering good coverage of the Pareto frontier. When compared to Ottertune, a state-of-the-art performance tuning system, our approach recommends configurations that yield a 26%-49% reduction in the running time of the TPCx-BB benchmark while adapting to different application preferences on multiple objectives. Full Article
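To make "Pareto optimal set of job configurations" concrete, the sketch below filters a small list of candidate configurations by brute force. It is not the Progressive Frontier algorithm from the entry above; the objectives (latency and cost, both minimized), the function name `pareto_front`, and the sample numbers are assumptions for illustration only.

```python
def pareto_front(configs):
    """Return the configurations that no other configuration dominates.

    configs: list of (name, latency, cost) tuples; both objectives are minimized.
    A configuration is dominated if another one is no worse on both objectives
    and strictly better on at least one.
    """
    front = []
    for name, latency, cost in configs:
        dominated = any(
            other_latency <= latency
            and other_cost <= cost
            and (other_latency < latency or other_cost < cost)
            for _, other_latency, other_cost in configs
        )
        if not dominated:
            front.append((name, latency, cost))
    return front


# Made-up candidates: (name, latency in seconds, cost in arbitrary units).
candidates = [("a", 10, 5), ("b", 8, 7), ("c", 12, 6), ("d", 8, 9)]
front = pareto_front(candidates)
```

Here "c" is dominated by "a" and "d" by "b", leaving "a" and "b" as the tradeoff between lower cost and lower latency. The quadratic scan is fine for a handful of candidates but would not meet the few-second recommendation budget the entry describes for large configuration spaces.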
cl Nakdan: Professional Hebrew Diacritizer. (arXiv:2005.03312v1 [cs.CL]) By arxiv.org Published On :: We present a system for automatic diacritization of Hebrew text. The system combines modern neural models with carefully curated declarative linguistic knowledge and comprehensive manually constructed tables and dictionaries. Besides providing state of the art diacritization accuracy, the system also supports an interface for manual editing and correction of the automatic output, and has several features which make it particularly useful for preparation of scientific editions of Hebrew texts. The system supports Modern Hebrew, Rabbinic Hebrew and Poetic Hebrew. The system is freely accessible for all use at this http URL Full Article
cl Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV]) By arxiv.org Published On :: Chest computed tomography (CT) has become an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using computer-aided diagnosis techniques for COVID-19 classification based on CT images could largely alleviate the burden on clinicians. In this paper, we propose an Adaptive Feature Selection guided Deep Forest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with the relatively small-scale data, we leverage a deep forest model to learn high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection could be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on a COVID-19 dataset with 1495 patients of COVID-19 and 1027 patients of community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE) and AUC achieved by our method are 91.79%, 93.05%, 89.95% and 96.35%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods. Full Article
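For readers less familiar with the reported metrics, the sketch below shows the standard definitions of accuracy, sensitivity, and specificity from confusion-matrix counts. The counts used are made up for illustration and are not taken from the paper; `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive cases (e.g. COVID-19) classified correctly/incorrectly.
    tn/fp: negative cases (e.g. CAP) classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

With these counts accuracy is 170/200, sensitivity 90/100, and specificity 80/100. AUC is not covered here because it needs the ranked scores rather than a single confusion matrix.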
cl Quda: Natural Language Queries for Visual Data Analytics. (arXiv:2005.03257v1 [cs.CL]) By arxiv.org Published On :: Visualization-oriented natural language interfaces (V-NLIs) have been explored and developed in recent years. One challenge faced by V-NLIs is in the formation of effective design decisions that usually requires a deep understanding of user queries. Learning-based approaches have shown potential in V-NLIs and reached state-of-the-art performance in various NLP tasks. However, because of the lack of sufficient training samples that cater to visual data analytics, cutting-edge techniques have rarely been employed to facilitate the development of V-NLIs. We present a new dataset, called Quda, to help V-NLIs understand free-form natural language. Our dataset contains 14,035 diverse user queries annotated with 10 low-level analytic tasks that assist in the deployment of state-of-the-art techniques for parsing complex human language. We achieve this goal by first gathering seed queries with data analysts who are target users of V-NLIs. Then we employ extensive crowd force for paraphrase generation and validation. We demonstrate the usefulness of Quda in building V-NLIs by creating a prototype that makes effective design decisions for free-form user queries. We also show that Quda can be beneficial for a wide range of applications in the visualization community by analyzing the design tasks described in academic publications. Full Article
cl Multi-Target Deep Learning for Algal Detection and Classification. (arXiv:2005.03232v1 [cs.CV]) By arxiv.org Published On :: Water quality has a direct impact on industry, agriculture, and public health. Algae species are common indicators of water quality. This is because algal communities are sensitive to changes in their habitats, giving valuable knowledge on variations in water quality. However, water quality analysis requires professional inspection of algal detection and classification under microscopes, which is very time-consuming and tedious. In this paper, we propose a novel multi-target deep learning framework for algal detection and classification. Extensive experiments were carried out on a large-scale colored microscopic algal dataset. Experimental results demonstrate that the proposed method leads to promising performance on algal detection, class identification and genus identification. Full Article
cl Conley's fundamental theorem for a class of hybrid systems. (arXiv:2005.03217v1 [math.DS]) By arxiv.org Published On :: We establish versions of Conley's (i) fundamental theorem and (ii) decomposition theorem for a broad class of hybrid dynamical systems. The hybrid version of (i) asserts that a globally-defined "hybrid complete Lyapunov function" exists for every hybrid system in this class. Motivated by mechanics and control settings where physical or engineered events cause abrupt changes in a system's governing dynamics, our results apply to a large class of Lagrangian hybrid systems (with impacts) studied extensively in the robotics literature. Viewed formally, these results generalize those of Conley and Franks for continuous-time and discrete-time dynamical systems, respectively, on metric spaces. However, we furnish specific examples illustrating how our statement of sufficient conditions represents merely an early step in the longer project of establishing what formal assumptions can and cannot endow hybrid systems models with the topologically well characterized partitions of limit behavior that make Conley's theory so valuable in those classical settings. Full Article
cl A Dynamical Perspective on Point Cloud Registration. (arXiv:2005.03190v1 [cs.CV]) By arxiv.org Published On :: We provide a dynamical perspective on the classical problem of 3D point cloud registration with correspondences. A point cloud is considered as a rigid body consisting of particles. The problem of registering two point clouds is formulated as a dynamical system, where the dynamic model point cloud translates and rotates in a viscous environment towards the static scene point cloud, under forces and torques induced by virtual springs placed between each pair of corresponding points. We first show that the potential energy of the system recovers the objective function of the maximum likelihood estimation. We then adopt Lyapunov analysis, particularly the invariant set theorem, to analyze the rigid body dynamics and show that the system globally asymptotically tends towards the set of equilibrium points, in which the globally optimal registration solution lies. We conjecture that, besides the globally optimal equilibrium point, the system has either three or infinitely many "spurious" equilibrium points, and these spurious equilibria are all locally unstable. The case of three spurious equilibria corresponds to a generic shape of the point cloud, while the case of infinitely many spurious equilibria happens when the point cloud exhibits symmetry. Therefore, simulating the dynamics with random perturbations is guaranteed to yield the globally optimal registration solution. Numerical experiments support our analysis and conjecture. Full Article
cl Fact-based Dialogue Generation with Convergent and Divergent Decoding. (arXiv:2005.03174v1 [cs.CL]) By arxiv.org Published On :: Fact-based dialogue generation is the task of generating a human-like response based on both dialogue context and factual texts. Various methods have been proposed that focus on effectively generating informative words that contain facts. However, previous works implicitly assume that a dialogue stays on one topic and usually converse passively, so these systems have difficulty generating diverse responses that proactively provide meaningful information. This paper proposes an end-to-end fact-based dialogue system augmented with the ability of convergent and divergent thinking over both context and facts, which can converse about the current topic or introduce a new topic. Specifically, our model incorporates a novel convergent and divergent decoding that can generate informative and diverse responses considering not only the given inputs (context and facts) but also input-related topics. Both automatic and human evaluation results on the DSTC7 dataset show that our model significantly outperforms state-of-the-art baselines, indicating that our model can generate more appropriate, informative, and diverse responses. Full Article
cl Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting. (arXiv:2005.03119v1 [cs.CL]) By arxiv.org Published On :: Unsupervised machine translation (MT) has recently achieved impressive results with monolingual corpora only. However, it is still challenging to associate source-target sentences in the latent space. As people who speak different languages biologically share similar visual systems, the potential of achieving better alignment through visual content is promising yet under-explored in unsupervised multimodal MT (MMT). In this paper, we investigate how to utilize visual content for disambiguation and promoting latent space alignment in unsupervised MMT. Our model employs multimodal back-translation and features pseudo visual pivoting in which we learn a shared multilingual visual-semantic embedding space and incorporate visually-pivoted captioning as additional weak supervision. The experimental results on the widely used Multi30K dataset show that the proposed model significantly improves over the state-of-the-art methods and generalizes well when the images are not available at testing time. Full Article
cl AIOps for a Cloud Object Storage Service. (arXiv:2005.03094v1 [cs.DC]) By arxiv.org Published On :: With the growing reliance on the ubiquitous availability of IT systems and services, these systems become more global, scaled, and complex to operate. To maintain business viability, IT service providers must put in place reliable and cost efficient operations support. Artificial Intelligence for IT Operations (AIOps) is a promising technology for alleviating operational complexity of IT systems and services. AIOps platforms utilize big data, machine learning and other advanced analytics technologies to enhance IT operations with proactive actionable dynamic insight. In this paper we share our experience applying the AIOps approach to a production cloud object storage service to get actionable insights into the system's behavior and health. We describe a real-life production cloud scale service and its operational data, present the AIOps platform we have created, and show how it has helped us resolve operational pain points. Full Article
cl Diagnosing the Environment Bias in Vision-and-Language Navigation. (arXiv:2005.03086v1 [cs.CL]) By arxiv.org Published On :: Vision-and-Language Navigation (VLN) requires an agent to follow natural-language instructions, explore the given environments, and reach the desired target locations. These step-by-step navigational instructions are crucial when the agent is navigating new environments about which it has no prior knowledge. Most recent works that study VLN observe a significant performance drop when tested on unseen environments (i.e., environments not used in training), indicating that the neural agent models are highly biased towards training environments. Although this issue is considered as one of the major challenges in VLN research, it is still under-studied and needs a clearer explanation. In this work, we design novel diagnosis experiments via environment re-splitting and feature replacement, looking into possible reasons for this environment bias. We observe that neither the language nor the underlying navigational graph, but the low-level visual appearance conveyed by ResNet features directly affects the agent model and contributes to this environment bias in results. According to this observation, we explore several kinds of semantic representations that contain less low-level visual information, hence the agent learned with these features could be better generalized to unseen testing environments. Without modifying the baseline agent model and its training method, our explored semantic features significantly decrease the performance gaps between seen and unseen on multiple datasets (i.e. R2R, R4R, and CVDN) and achieve competitive unseen results to previous state-of-the-art models. Our code and features are available at: https://github.com/zhangybzbo/EnvBiasVLN Full Article
cl Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality. (arXiv:2005.03074v1 [cs.CL]) By arxiv.org Published On :: We develop a categorical compositional distributional semantics for Lambek Calculus with a Relevant Modality !L*, which has a limited edition of the contraction and permutation rules. The categorical part of the semantics is a monoidal biclosed category with a coalgebra modality, very similar to the structure of a Differential Category. We instantiate this category to finite dimensional vector spaces and linear maps via "quantisation" functors and work with three concrete interpretations of the coalgebra modality. We apply the model to construct categorical and concrete semantic interpretations for the motivating example of !L*: the derivation of a phrase with a parasitic gap. The effectiveness of the concrete interpretations are evaluated via a disambiguation task, on an extension of a sentence disambiguation dataset to parasitic gap phrase one, using BERT, Word2Vec, and FastText vectors and Relational tensors. Full Article
cl I Always Feel Like Somebody's Sensing Me! A Framework to Detect, Identify, and Localize Clandestine Wireless Sensors. (arXiv:2005.03068v1 [cs.CR]) By arxiv.org Published On :: The increasing ubiquity of low-cost wireless sensors in smart homes and buildings has enabled users to easily deploy systems to remotely monitor and control their environments. However, this raises privacy concerns for third-party occupants, such as a hotel room guest who may be unaware of deployed clandestine sensors. Previous methods focused on specific modalities such as detecting cameras but do not provide a generalizable and comprehensive method to capture arbitrary sensors which may be "spying" on a user. In this work, we seek to determine whether one can walk in a room and detect any wireless sensor monitoring an individual. As such, we propose SnoopDog, a framework to not only detect wireless sensors that are actively monitoring a user, but also classify and localize each device. SnoopDog works by establishing causality between patterns in observable wireless traffic and a trusted sensor in the same space, e.g., an inertial measurement unit (IMU) that captures a user's movement. Once causality is established, SnoopDog performs packet inspection to inform the user about the monitoring device. Finally, SnoopDog localizes the clandestine device in a 2D plane using a novel trial-based localization technique. We evaluated SnoopDog across several devices and various modalities and were able to detect causality 96.6% of the time, classify suspicious devices with 100% accuracy, and localize devices to a sufficiently reduced sub-space. Full Article
cl Weakly-Supervised Neural Response Selection from an Ensemble of Task-Specialised Dialogue Agents. (arXiv:2005.03066v1 [cs.CL]) By arxiv.org Published On :: Dialogue engines that incorporate different types of agents to converse with humans are popular. However, conversations are dynamic in the sense that a selected response will change the conversation on-the-fly, influencing the subsequent utterances in the conversation, which makes the response selection a challenging problem. We model the problem of selecting the best response from a set of responses generated by a heterogeneous set of dialogue agents by taking into account the conversational history, and propose a Neural Response Selection method. The proposed method is trained to predict a coherent set of responses within a single conversation, considering its own predictions via a curriculum training mechanism. Our experimental results show that the proposed method can accurately select the most appropriate responses, thereby significantly improving the user experience in dialogue systems. Full Article
cl Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches. (arXiv:2005.03035v1 [cs.CL]) By arxiv.org Published On :: An interesting and frequent type of multi-word expression (MWE) is the headless MWE, for which there are no true internal syntactic dominance relations; examples include many named entities ("Wells Fargo") and dates ("July 5, 2020") as well as certain productive constructions ("blow for blow", "day after day"). Despite their special status and prevalence, current dependency-annotation schemes require treating such flat structures as if they had internal syntactic heads, and most current parsers handle them in the same fashion as headed constructions. Meanwhile, outside the context of parsing, taggers are typically used for identifying MWEs, but taggers might benefit from structural information. We empirically compare these two common strategies--parsing and tagging--for predicting flat MWEs. Additionally, we propose an efficient joint decoding algorithm that combines scores from both strategies. Experimental results on the MWE-Aware English Dependency Corpus and on six non-English dependency treebanks with frequent flat structures show that: (1) tagging is more accurate than parsing for identifying flat-structure MWEs, (2) our joint decoder reconciles the two different views and, for non-BERT features, leads to higher accuracies, and (3) most of the gains result from feature sharing between the parsers and taggers. Full Article
cl Evaluating text coherence based on the graph of the consistency of phrases to identify symptoms of schizophrenia. (arXiv:2005.03008v1 [cs.CL]) By arxiv.org Published On :: Different state-of-the-art methods of the detection of schizophrenia symptoms based on the estimation of text coherence have been analyzed. The analysis of a text at the level of phrases has been suggested. The method based on the graph of the consistency of phrases has been proposed to evaluate the semantic coherence and the cohesion of a text. The semantic coherence, cohesion, and other linguistic features (lexical diversity, lexical density) have been taken into account to form feature vectors for the training of a model-classifier. The training of the classifier has been performed on the set of English-language interviews. According to the retrieved results, the impact of each feature on the output of the model has been analyzed. The results obtained can indicate that the proposed method based on the graph of the consistency of phrases may be used in the different tasks of the detection of mental illness. Full Article
cl How Biofuels Can Cool Our Climate and Strengthen Our Ecosystems By feedproxy.google.com Published On :: Wed, 24 Feb 2016 18:37:59 +0000 By Evan H. DeLucia Courtesy of EOS Critics of biofuels like ethanol argue they are an unsustainable use of land. But with careful management, next-generation grass-based biofuels can net climate savings and improve their ecosystems. As the world seeks strategies … Continue reading → Full Article Biomass biofuels carbon sinks Climate Change ecosystems greenhouse gases
cl Closure of Diablo Canyon Nuclear Plant By feedproxy.google.com Published On :: Wed, 22 Jun 2016 10:19:38 +0000 By Lauren McCauley Common Dreams In landmark agreement, California’s last remaining nuclear plant will be replaced by greenhouse-gas-free energy sources A plan to shutter the last remaining nuclear power plant in California and replace it with renewable energy is being … Continue reading → Full Article ET News Nuclear
cl Docker Image for ASK and AWS CLI By dzone.com Published On :: Wed, 29 Apr 2020 18:06:12 GMT The purpose of this container is to be able to use the Amazon ASK CLI and Amazon AWS CLI in a Docker container in DevOps pipelines. Note: This is a fork from the martindsouza image with these changes: Full Article tutorial web dev aws node alexa aws cli ask cli
cl Day Clock ( vue.js ) By codepen.io Published On :: 2020-05-01T07:34:56-07:00 Do you sometimes lose track of what day it is? This Day Clock shows the progress of the current day and of the week. Click the center clock knot to toggle display. Double click the letter D to activate the debug mode. Also available at github.com/kunukn/dayclock This Pen uses: HTML, CSS, and Vue Full Article
cl Clint Eastwood's true-life drama Richard Jewell takes aim at big targets, and misses By www.inlander.com Published On :: Thu, 12 Dec 2019 01:30:00 -0800 Once upon a time, Clint Eastwood, a notoriously outspoken conservative in supposedly liberal Hollywood, had no problem at all with cops who employed their own unconventional extra-legal brand of law enforcement (see: Dirty Harry). Today, in Richard Jewell, he really doesn't like the FBI.… Full Article Film/Film News
cl Spokane musician Eliza Johnson brought her quirky style — and tinned fish — to American Idol Sunday night. Watch the clip By www.inlander.com Published On :: Mon, 24 Feb 2020 11:27:00 -0800 Back in November, we wrote about local singer-songwriter Eliza Johnson's musical project Eliza Catastrophe and her new album You, which she released on pre-loaded MP3 players. One thing we weren't able to mention in our interview — for contractual reasons — is that she had only a couple months prior auditioned for American Idol, and her performance finally aired on the ABC reality competition show Sunday night.… Full Article Music/Music News
cl It's no Pixar classic, but Onward continues the studio's penchant for intelligent, original animated entertainment By www.inlander.com Published On :: Thu, 05 Mar 2020 04:01:00 -0800 What am I supposed to say here?… Full Article Film/Film News
cl The Fox Theater cancels all events, including Spokane Symphony concerts, through April 10 By www.inlander.com Published On :: Thu, 12 Mar 2020 16:22:00 -0700 As the threat of the Coronavirus spreads throughout the country, public events everywhere are being canceled and postponed for public safety concerns. The Fox Theater is the latest venue to follow suit, closing its doors and canceling all events through April 10.… Full Article Music News
cl Spokane Symphony launches Musicians' Relief Fund to help local classical stars survive the pandemic By www.inlander.com Published On :: Thu, 23 Apr 2020 16:09:00 -0700 You might not know it from the fancy attire they wear on stage at the Fox Theater, but for the musicians in the Spokane Symphony, it's a part-time gig. It's a prestigious gig, to be sure, but like most artists, for the musicians, it's just one piece of a puzzle full of hustle they have to solve to make a living.… Full Article Arts & Culture
cl Spokane Comedy Club bringing the laughs from Dan Cummins, Spokane's Kelsey Cook and more right to your computer this weekend By www.inlander.com Published On :: Fri, 24 Apr 2020 15:07:18 -0700 The Spokane Comedy Club might be quiet right now, but there are still laughs to be had on Zoom, and not just from watching your co-workers try to navigate the online meeting platform. Saturday night, and again next Saturday, the comedy club is hosting Comedians Doing Comedy: A Virtual Comedy Show.… Full Article Arts & Culture
cl 5 ways to entertain yourself online, from concerts and art shows to painting classes and story times By www.inlander.com Published On :: Wed, 29 Apr 2020 15:48:56 -0700 Here are a few ways to keep yourself entertained, and maybe even educate yourself a bit, while you're stuck at home:… Full Article Arts & Culture
cl How climate change is contributing to skyrocketing rates of infectious disease By www.inlander.com Published On :: Fri, 08 May 2020 17:27:54 -0700 A catastrophic loss in biodiversity, reckless destruction of wildland and warming temperatures have allowed disease to explode. Ignoring the connection between climate change and pandemics would be “dangerous delusion,” one scientist said. The scientists who study how diseases emerge in a changing environment knew this moment was coming.… Full Article News/Nation & World
cl Regain control of your closet with some simple steps By www.inlander.com Published On :: Wed, 08 Apr 2020 18:30:00 -0700 As this issue goes to press we are all staying home to battle the coronavirus.… Full Article Home
cl North Idaho's Best Golf Course: Circling Raven By www.inlander.com Published On :: Thu, 19 Mar 2020 01:30:00 -0700 For people who love playing, a day on the worst possible golf course is better than any day not swinging the clubs.… Full Article Recreation
cl With ridership declining, we hop on the bus with one big question in mind: Where is the STA headed? By www.inlander.com Published On :: Thu, 05 Mar 2020 04:05:00 -0800 Before my car broke down, I didn't ride the bus.… Full Article News/Local News
cl Combinatorial synthesis of libraries of macrocyclic compounds useful in drug discovery By www.freepatentsonline.com Published On :: Tue, 28 Apr 2015 08:00:00 EDT A library of macrocyclic compounds of the formula (I) where part (A) is a bivalent radical, a —(CH2)y— bivalent radical or a covalent bond; where part (B) is a bivalent radical, a —(CH2)z— bivalent radical, or a covalent bond; where part (C) is a bivalent radical, a —(CH2)t— bivalent radical, or a covalent bond; and where part (T) is a —Y-L-Z— radical wherein Y is CH2 or CO, Z is NH or O and L is a bivalent radical. These compounds are useful for carrying out screening assays or as intermediates for the synthesis of other compounds of pharmaceutical interest. A process for the preparation of these compounds in a combinatorial manner, is also disclosed. Full Article
cl Compound and organic light-emitting device including the same By www.freepatentsonline.com Published On :: Tue, 12 May 2015 08:00:00 EDT A compound represented by Formula 1 below and an organic light-emitting device including an organic layer containing the compound of Formula 1: wherein R1 to R4, X and Y, a and b, and m and n are defined as in the specification. Full Article
cl Process for the conversion of aliphatic cyclic amines to aliphatic diamines By www.freepatentsonline.com Published On :: Tue, 19 May 2015 08:00:00 EDT A process for conversion of aliphatic bicyclic amines to aliphatic diamines including contacting one or more bicyclic amines selected from the group consisting of 3-azabicyclo[3.3.1]nonane and azabicyclo[3.3.1]non-2-ene with ammonia and hydrogen, and alcohols in the presence of heterogeneous metal based catalyst systems, a metal selected from the group consisting of Co, Ni, Ru, Fe, Cu, Re, Pd, and their oxides at a temperature from 140° C. to 200° C. and a pressure from 1540 to 1735 psig for at least one hour reactor systems; forming a product mixture comprising aliphatic diamine(s), bicyclic amine(s), ammonia, hydrogen, and alcohol(s); removing said product mixture from the reactor system; removing at least some of the ammonia, hydrogen, water, alcohols, bicyclic amines from said product mixture; thereby separating the aliphatic diamines from said product mixture. Full Article
cl Techniques for evaluation, building and/or retraining of a classification model By www.freepatentsonline.com Published On :: Tue, 12 May 2015 08:00:00 EDT Techniques for evaluation and/or retraining of a classification model built using labeled training data. In some aspects, a classification model having a first set of weights is retrained by using unlabeled input to reweight the labeled training data to have a second set of weights, and by retraining the classification model using the labeled training data weighted according to the second set of weights. In some aspects, a classification model is evaluated by building a similarity model that represents similarities between unlabeled input and the labeled training data and using the similarity model to evaluate the labeled training data to identify a subset of the plurality of items of labeled training data that is more similar to the unlabeled input than a remainder of the labeled training data. Full Article
cl Classifying unclassified samples By www.freepatentsonline.com Published On :: Tue, 19 May 2015 08:00:00 EDT A system and method for classifying unclassified samples. The method includes detecting a number of classes including training samples in training data sets. The method includes, for each class, determining a vector for each training sample based on a specified number of nearest neighbor distances between the training sample and neighbor training samples, and determining a class distribution based on the vectors. The method also includes detecting an unclassified sample in a data set and, for each class, determining a vector for the unclassified sample based on the specified number of nearest neighbor distances between the unclassified sample and nearest neighbor training samples within the class, and determining a probability that the unclassified sample is a member of the class based on the vector and the class distribution. The method further includes classifying the unclassified sample based on the probabilities. Full Article
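The steps in the abstract above can be sketched roughly as follows. This is an illustration only, not the patented method: the abstract does not say what form the class distribution takes, so the sketch assumes an independent Gaussian over each component of the nearest-neighbor distance vector, and all function names (`knn_distances`, `fit_class`, `classify`) are hypothetical.

```python
import math

def knn_distances(point, others, k):
    """Sorted distances from point to its k nearest neighbours in others."""
    return sorted(math.dist(point, o) for o in others)[:k]

def fit_class(samples, k):
    """Summarise one class by the mean and std of its k-NN distance vectors.

    Assumption: an independent Gaussian per vector component stands in for
    the unspecified "class distribution". Needs at least k + 1 samples.
    """
    vectors = [
        knn_distances(s, samples[:i] + samples[i + 1:], k)
        for i, s in enumerate(samples)
    ]
    n = len(vectors)
    means = [sum(v[j] for v in vectors) / n for j in range(k)]
    stds = [
        # Floor the std so a perfectly regular class does not divide by zero.
        max(math.sqrt(sum((v[j] - means[j]) ** 2 for v in vectors) / n), 1e-3)
        for j in range(k)
    ]
    return means, stds

def log_likelihood(vector, means, stds):
    """Log density of a distance vector under the per-component Gaussians."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
        for x, m, s in zip(vector, means, stds)
    )

def classify(sample, training, k=2):
    """Return (best_label, probabilities) for an unclassified sample."""
    scores = {}
    for label, samples in training.items():
        means, stds = fit_class(samples, k)
        vector = knn_distances(sample, samples, k)
        scores[label] = log_likelihood(vector, means, stds)
    # Normalise the log scores into probabilities, shifting by the maximum
    # for numerical stability.
    top = max(scores.values())
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    probs = {label: w / total for label, w in weights.items()}
    return max(probs, key=probs.get), probs

training = {
    "a": [(0, 0), (0, 1), (1, 0), (2, 1)],
    "b": [(10, 10), (10, 11), (11, 10), (12, 11)],
}
label, probs = classify((0.5, 0.5), training, k=2)
```

In this toy data, class "b" is a shifted copy of class "a", so both classes have the same fitted distribution and the decision rests entirely on how typical the sample's neighbor distances are for each class.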
cl Multiple two-state classifier output fusion system and method By www.freepatentsonline.com Published On :: Tue, 19 May 2015 08:00:00 EDT A system and method for providing more than two levels of classification distinction of a user state are provided. The first and second general states of a user are sensed. The first general state is classified as either a first state or a second state, and the second general state is classified as either a third state or a fourth state. The user state of the user is then classified as one of at least three different classification states. Full Article
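As a rough illustration of the idea in the abstract above, two two-state classifier outputs can be combined into three levels. The abstract does not give the fusion rule, so the vote-counting rule below, the `FusedState` names, and the `fuse` function are all assumptions made for the sketch.

```python
from enum import Enum

class FusedState(Enum):
    """Hypothetical three-level user state produced by the fusion step."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

def fuse(first_is_second_state: bool, second_is_fourth_state: bool) -> FusedState:
    """Fuse two binary classifier outputs into one of three states.

    Assumed rule: count how many classifiers report their "elevated" state
    (the second state for classifier one, the fourth for classifier two).
    """
    votes = int(first_is_second_state) + int(second_is_fourth_state)
    return [FusedState.LOW, FusedState.MEDIUM, FusedState.HIGH][votes]
```

For example, `fuse(True, False)` yields `FusedState.MEDIUM` under this rule, a level that neither two-state classifier could express alone.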