Conference Papers

sMapReduce: A programming pattern forwireless sensor networks

Gupta V., Tovar E., Pinho L.M., Kim J., Lakshmanan K., Rajkumar R.

Proceedings - International Conference on Software Engineering

2011

Abstract:

Wireless Sensor Networks (WSNs) are increasingly used in various application domains like home-automation, agriculture, industries and infrastructure monitoring. As applications tend to leverage larger geographical deployments of sensor networks, the availability of an intuitive and user friendly programming abstraction becomes a crucial factor in enabling faster and more efficient development, and reprogramming of applications. We propose a programming pattern named sMapReduce, inspired by the Google MapReduce framework, for mapping application behaviors on to a sensor network and enabling complex data aggregation. The proposed pattern requires a user to create a network-level application in two functions: sMap and Reduce, in order to abstract away from the low-level details without sacrificing the control to develop complex logic. Such a two-fold division of programming logic is a natural-fit to typical sensor networking operation which makes sensing and topological modalities accessible to the user.

SMARTcycling: Assessing cyclists’ driving experience

Vieira P., Costeira J.P., Brandao S., Marques M.

IEEE Intelligent Vehicles Symposium, Proceedings

2016

Abstract:

Due to economic and environmental issues, bicycles have been regaining their significance as a transportation vehicle in urban scenarios. To further drive this desirable trend, policy makers must have the tools to access current bicycle infrastructures and road safety concerns. Fundamental for this assessment is a deeper understanding of how cyclists use current infrastructures, if the cycling experience results in stressful events, and the conditions of the current infrastructure. We here introduce a new platform, SMARTcycling, that, by taking advantage of the mobile power available to a smartphone, captures and stores data from several sensors, namely an action camera, a cardio signal acquisition belt, and smartphone’s Global Positioning System (GPS) coordinates. The data is further processed and, through visual cues, we access the cyclist driving events and road condition cues. SMARTcycling also detects the cyclist stress using the electrocardiograms (ECG) from the belt. We further contribute by making available a dataset containing the sensors data from 10 paths over two cities in Portugal. On this dataset, we show our initial promising results on event detection, road condition identification and stress assessment.

SMT-based model checking for recursive programs

Komuravelli A., Gurfinkel A., Chaki S

Formal Methods in System Design volume 48

2016

Abstract:

We present an SMT-based symbolic model checking algorithm for safety verification of recursive programs. The algorithm is modular and analyzes procedures individually. Unlike other SMT-based approaches, it maintains both over- and under-approximations of procedure summaries. Under-approximations are used to analyze procedure calls without inlining. Over-approximations are used to block infeasible counterexamples and detect convergence to a proof. We show that for programs and properties over a decidable theory, the algorithm is guaranteed to find a counterexample, if one exists. However, efficiency depends on an oracle for quantifier elimination (QE). For Boolean programs, the algorithm is a polynomial decision procedure, matching the worst-case bounds of the best BDD-based algorithms. For Linear Arithmetic (integers and rationals), we give an efficient instantiation of the algorithm by applying QE lazily. We use existing interpolation techniques to over-approximate QE and introduce Model Based Projection to under-approximate QE. Empirical evaluation on SV-COMP benchmarks shows that our algorithm improves significantly on the state-of-the-art.

SOAR: A Synthesis Approach for Data Science API Refactoring

Ni A., Ramos D., Yang A., Lynce I., Manquinho V., Martins R., Goues C

ICSE

2021

Abstract:

With the growth of the open-source data science community, both the number of data science libraries and the number of versions for the same library are increasing rapidly. To match the evolving APIs from those libraries, open-source organizations often have to exert manual effort to refactor the APIs used in the code base. Moreover, due to the abundance of similar open-source libraries, data scientists working on a certain application may have an abundance of libraries to choose, maintain and migrate between. The manual refactoring between APIs is a tedious and error-prone task. Although recent research efforts were made on performing automatic API refactoring between different languages, previous work relies on statistical learning with collected pairwise training data for the API matching and migration. Using large statistical data for refactoring is not ideal because such training data will not be available for a new library or a new version of the same library. We introduce Synthesis for Open-Source API Refactoring (SOAR), a novel technique that requires no training data to achieve API migration and refactoring. SOAR relies only on the documentation that is readily available at the release of the library to learn API representations and mapping between libraries. Using program synthesis, SOAR automatically computes the correct configuration of arguments to the APIs and any glue code that is required to invoke those APIs. SOAR also uses the error messages from the interpreter when running refactored code to generate logical constraints that can be used to prune the search space. Our empirical evaluation shows that SOAR can successfully refactor 80% of our benchmarks, corresponding to deep learning models with up to 44 layers, with an average run time of 97.23 seconds, and 90% of the benchmark set for data wrangling tasks with an average run time of 17.31 seconds.

Sparse and Continuous Attention Mechanisms

Martins A., Farinhas A., Treviso M., Niculae V., Aguiar P., Figueiredo M.

NeurIPS 2020

2020

Abstract:

Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation). Distributions in each of these families have fixed support. In contrast, for finite domains, there has been recent work on sparse alternatives to softmax (e.g. sparsemax and alpha-entmax), which have varying support, being able to assign zero probability to irrelevant categories. This paper expands that work in two directions: first, we extend alpha-entmax to continuous domains, revealing a link with Tsallis statistics and deformed exponential families. Second, we introduce continuous-domain attention mechanisms, deriving efficient gradient backpropagation algorithms for alpha in {1,2}. Experiments on attention-based text classification, machine translation, and visual question answering illustrate the use of continuous attention in 1D and 2D, showing that it allows attending to time intervals and compact regions.

Sparse and Structured Visual Attention

Martins, P.H.; Niculae, V.; Marinho, Z.; Martins, A.

arXiv.org

2020

Abstract:

Visual attention mechanisms are widely used in multimodal tasks, such as image captioning and visual question answering (VQA). One drawback of softmax-based attention mechanisms is that they assign probability mass to all image regions, regardless of their adjacency structure and of their relevance to the text. In this paper, to better link the image structure with the text, we replace the traditional softmax attention mechanism with two alternative sparsity-promoting transformations: sparsemax, which is able to select the relevant regions only (assigning zero weight to the rest), and a newly proposed Total-Variation Sparse Attention (TVmax), which further encourages the joint selection of adjacent spatial locations. Experiments in image captioning and VQA, using both LSTM and Transformer architectures, show gains in terms of human-rated caption quality, attention relevance, and VQA accuracy, with improved interpretability.

Sparse Text Generation

Martins P.; Marinho Z.; Martins A.

Proc. EMNLP 2020

2020

Abstract:

Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance. However, to avoid degenerate text, they require sampling from a modified softmax, via temperature parameters or ad-hoc truncation techniques, as in top-k or nucleus sampling. This creates a mismatch between training and testing conditions. In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. The result is a text generator with favorable performance in terms of fluency and consistency, fewer repetitions, and n-gram diversity closer to human text. In order to evaluate our model, we propose three new metrics for comparing sparse or truncated distributions: 𝜖-perplexity, sparsemax score, and Jensen-Shannon divergence. Human-evaluated experiments in story completion and dialogue generation show that entmax sampling leads to more engaging and coherent stories and conversations.

Sparse Text Generation

Martins, P.H.; Marinho, Z.; Martins, A.

arXiv.org

2020

Abstract:

Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance. However, to avoid degenerate text, they require sampling from a modified softmax, via temperature parameters or ad-hoc truncation techniques, as in top-k or nucleus sampling. This creates a mismatch between training and testing conditions. In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. The result is a text generator with favorable performance in terms of fluency and consistency, fewer repetitions, and n-gram diversity closer to human text. In order to evaluate our model, we propose three new metrics for comparing sparse or truncated distributions: ϵ-perplexity, sparsemax score, and Jensen-Shannon divergence. Human-evaluated experiments in story completion and dialogue generation show that entmax sampling leads to more engaging and coherent stories and conversations.

Spectral analysis in a thin domain with periodically oscillating characteristics

Ferreira R., Mascarenhas L.M., Piatnitski A.

ESAIM - Control, Optimisation and Calculus of Variations

2012

Abstract:

The paper deals with a Dirichlet spectral problem for an elliptic operator with ε-periodic coefficients in a 3D bounded domain of small thickness δ. We study the asymptotic behavior of the spectrum as ε and δ tend to zero. This asymptotic behavior depends crucially on whether ε and δ are of the same order (δ ≈ ε), or ε is much less than δ(δ = ετ, τ 1). We consider all three cases.