DSpace Angular :: Browsing LIAAD - Other Publications by Title

Item

Accelerating Recommender Systems using GPUs

(2015) André Valente Rodrigues ; Alípio Jorge ; Inês Dutra

Item

Active Mining of Parallel Video Streams

(2014) Samaneh Khoshrou ; Jaime Cardoso ; Luís Filipe Teixeira

Item

Adaptive model rules from data streams

(2013) Ezilda Duarte Almeida ; Carlos Ferreira ; João Gama

Decision rules are one of the most expressive languages for machine learning. In this paper we present Adaptive Model Rules (AMRules), the first streaming rule learning algorithm for regression problems. In AMRules the antecedent of a rule is a conjunction of conditions on the attribute values, and the consequent is a linear combination of attribute values. Each rule uses a Page-Hinkley test to detect changes in the process generating data and react to changes by pruning the rule set. In the experimental section we report the results of AMRules on benchmark regression problems, and compare the performance of our system with other streaming regression algorithms. © 2013 Springer-Verlag.
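The change-detection step mentioned in this abstract can be illustrated with a minimal Page-Hinkley test in Python. This is only a sketch: the class name, the parameters `delta` and `threshold`, and their default values are assumptions chosen for illustration, and the abstract does not say how AMRules configures the test or what quantity it monitors (a per-rule prediction error is a plausible input).

```python
class PageHinkley:
    """Minimal Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold, often written lambda (assumed value)
        self.n = 0                  # number of observations seen
        self.mean = 0.0             # running mean of the observations
        self.cum = 0.0              # cumulative deviation from the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


# Usage: a stream whose level jumps from 0 to 5 after 200 observations.
detector = PageHinkley()
stream = [0.0] * 200 + [5.0] * 200
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
```

A learner that receives an alarm for a rule could then react, for example by removing that rule, which is the pruning behaviour the abstract describes.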

Item

Avoiding Anomalies in Data Stream Learning

(2013) João Gama ; Kosina, P ; Ezilda Duarte Almeida

The presence of anomalies in data compromises data quality and can reduce the effectiveness of learning algorithms. Standard data mining methodologies treat data cleaning as a pre-processing step before the learning task. The problem of data cleaning is exacerbated when learning in the computational model of data streams. In this paper we present a streaming algorithm for learning classification rules that is able to detect contextual anomalies in the data. Contextual anomalies are surprising attribute values in the context defined by the conditional part of the rule. For each example we compute the degree of anomaliness based on the probability of the attribute values given the conditional part of the rule covering the example. Examples with a high degree of anomaliness are signaled to the user and not used to train the classifier. The experimental evaluation on real-world data sets shows the ability to discover anomalous examples in the data. The main advantage of the proposed method is the ability to report the context and explain why the anomaly occurs.

Item

A biased random key genetic algorithm for 2D and 3D bin packing problems

(2013) José Fernando Gonçalves ; Resende, MGC

In this paper we present a novel biased random-key genetic algorithm (BRKGA) for 2D and 3D bin packing problems. The approach uses a maximal-space representation to manage the free spaces in the bins. The proposed algorithm hybridizes a novel placement procedure with a genetic algorithm based on random keys. The BRKGA is used to evolve the order in which the boxes are packed into the bins and the parameters used by the placement procedure. Two new placement heuristics are used to determine the bin and the free maximal space where each box is placed. A novel fitness function that significantly improves solution quality is also developed. The new approach is extensively tested on 858 problem instances and compared with other approaches published in the literature. The computational results demonstrate that the new approach consistently equals or outperforms the other approaches, and the statistical analysis confirms that it is significantly better than all of them.
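The general random-key mechanics behind a BRKGA can be sketched in a few lines of Python. This illustrates only the generic idea (a chromosome of real-valued keys that is decoded into a packing order, and a crossover biased towards an elite parent); it is not the decoder, placement procedure or fitness function of the paper. The function names and the inheritance probability `rho` are assumptions for illustration.

```python
import random


def decode_order(keys):
    """Sort box indices by their random keys to obtain a packing order."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def biased_crossover(elite, other, rho=0.7, rng=random):
    """Inherit each gene from the elite parent with probability rho,
    otherwise from the other parent (rho > 0.5 gives the bias)."""
    return [e if rng.random() < rho else o for e, o in zip(elite, other)]


# Usage: four boxes, one key per box; smaller keys are packed first.
elite_keys = [0.42, 0.07, 0.93, 0.55]
other_keys = [0.10, 0.80, 0.30, 0.60]
child_keys = biased_crossover(elite_keys, other_keys)
order = decode_order(child_keys)  # a permutation of [0, 1, 2, 3]
```

Because any vector of keys decodes to a valid permutation, crossover never produces an infeasible ordering, which is the usual motivation for the random-key encoding.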

Item

Binary recommender systems: Introduction, an application and outlook

(2013) Alípio Jorge

Recommender Systems are a hot application area these days, made popular by well-known web sites. The problem of predicting user preferences is very demanding from the data mining algorithm design point of view, but it also poses challenges to evaluation and monitoring. Moreover, there is a lot of information that can be exploited, from clickstreams and background information to musical content and social interaction. As data grows and recommendation requests must be answered in a split second, online and agile solutions must be implemented. In this talk we will give a brief introduction to binary recommender systems, describe a particular hybrid application to music recommendation, from algorithm to online evaluation, and refer to context-aware and online recommender algorithms. © 2013 ACM.

Item

Computational Models for Social and Technical Interactions

(2017) João Gama ; Oliveira, E ; Cardoso, HL

Item

Concave minimum cost network flow problems solved with a colony of ants

(2013) Monteiro, MSR ; Dalila Fontes ; Fontes, FACC

In this work we address the Single-Source Uncapacitated Minimum Cost Network Flow Problem with concave cost functions. This problem is NP-Hard, therefore we propose a hybrid heuristic to solve it. Our goal is not only to apply an ant colony optimization (ACO) algorithm to such a problem, but also to provide insight into how the parameters affect the performance of the algorithm. The performance of the ACO algorithm is improved by hybridizing it with a local search (LS) procedure. The core ACO procedure is used mainly to explore the search space, while the LS is incorporated to further exploit the best solutions found. The method we have developed has proven to be very efficient in solving both small and large problem instances. The problems we have used to test the algorithm were previously solved by other authors using other population-based heuristics. Our algorithm was able to improve upon some of their results in terms of solution quality, showing that the hybrid ACO (HACO) algorithm is a very good alternative approach to solve these problems. In addition, our algorithm is substantially faster at achieving these improved solutions. Furthermore, the magnitude of the reduction of the computational requirements grows with problem size.

Item

Contextual Anomalies in Medical Data

(2013) Vasco, D ; Pedro Pereira Rodrigues ; João Gama

Anomalies in data can cause a lot of problems in the data analysis processes. Thus, it is necessary to improve data quality by detecting and eliminating errors and inconsistencies in the data, known as the data cleaning process [1]. Since detection and correction of anomalies requires detailed domain knowledge, the involvement of experts in the field is essential to the success of the process of cleaning the data. However, considering the size of data to be processed, this process should be as automatic as possible so as to minimize the time spent [1]. © 2013 IEEE.

Item

Data Stream Clustering: A Survey

(2013) Silva, JA ; Faria, ER ; Barros, RC ; Hruschka, ER ; de Carvalho, ACPLF ; João Gama

Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering imposes several challenges to be addressed, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires the development of algorithms capable of performing fast and incremental processing of data objects, suitably addressing time and memory limitations. In this article, we present a survey of data stream clustering algorithms, providing a thorough discussion of the main design components of state-of-the-art algorithms. In addition, this work addresses the temporal aspects involved in data stream clustering, and presents an overview of the usually employed experimental methodologies. A number of references are provided that describe applications of data stream clustering in different domains, such as network intrusion detection, sensor networks, and stock market analysis. Information regarding software packages and data repositories is also provided to help researchers and practitioners. Finally, some important issues and open questions that can be the subject of future research are discussed.

Item

Data stream mining: The bounded rationality

(2013) João Gama

Developments in information and communication technologies have dramatically changed data collection and processing methods. Data mining is now moving to the era of bounded rationality. In this work we discuss the implications of the resource constraints imposed by the data stream computational model on the design of learning algorithms. We analyze the behavior of stream mining algorithms and present future research directions including ubiquitous stream mining and self-adaption models.

Item

Difference Equations, Discrete Dynamical Systems and Applications

(2016) Alsedà i Soler, L ; Cushing, JM ; Elaydi, S ; Alberto Pinto

Item

Dimensions as Virtual Items: Improving the predictive ability of top-N recommender systems

(2013) Domingues, MA ; Alípio Jorge ; Carlos Manuel Soares

Traditionally, recommender systems for the web deal with applications that have two dimensions, users and items. Based on access data that relate these dimensions, a recommendation model can be built and used to identify a set of N items that will be of interest to a certain user. In this paper we propose a multidimensional approach, called DaVI (Dimensions as Virtual Items), that consists of inserting contextual and background information as new user-item pairs. The main advantage of this approach is that it can be applied in combination with several existing two-dimensional recommendation algorithms. To evaluate its effectiveness, we used the DaVI approach with two different top-N recommender algorithms, Item-based Collaborative Filtering and Association Rules based, and ran an extensive set of experiments on three different real-world data sets. In addition, we have also compared our approach to the previously introduced combined reduction and weight post-filtering approaches. The empirical results strongly indicate that our approach enables the application of existing two-dimensional recommendation algorithms to multidimensional data, exploiting the useful information in these data to improve the predictive ability of top-N recommender systems.
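The core idea stated in this abstract, turning extra dimensions into additional user-item pairs, can be sketched in Python as follows. This is a minimal illustration under assumptions: the record layout, the field names (`user`, `item`, `day`) and the `dimension=value` naming of virtual items are hypothetical and are not taken from the paper.

```python
def add_virtual_items(accesses, dimensions):
    """Return user-item pairs in which each selected dimension value
    also appears as a virtual item for the same user."""
    pairs = []
    for record in accesses:
        user = record["user"]
        pairs.append((user, record["item"]))  # the ordinary user-item pair
        for dim in dimensions:
            if dim in record:
                # Encode the dimension value as a virtual item (naming is assumed).
                pairs.append((user, f"{dim}={record[dim]}"))
    return pairs


# Usage: one access record with a hypothetical "day" dimension.
accesses = [{"user": "u1", "item": "page_a", "day": "Mon"}]
pairs = add_virtual_items(accesses, dimensions=["day"])
```

The resulting list has the same two-column shape as the original access data, so it could be passed to an existing two-dimensional recommender without changing that algorithm.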

Item

Disambiguating implicit temporal queries for temporal information retrieval applications

(2013) Ricardo Campos

Item

A Distributed Computing Solution for Privacy-Preserving Genome-Wide Association Studies

(2024) Pedro Gabriel Ferreira ; Cláudia Vanessa Brito ; 7497 ; 7516

Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing unprecedented biological insights and new therapeutic applications. However, analyzing such large amounts of sensitive data raises key concerns regarding data privacy, specifically when the information is outsourced to third-party infrastructures for data storage and processing (e.g., cloud computing). Current solutions for data privacy protection resort to centralized designs or cryptographic primitives that impose considerable computational overheads, limiting their applicability to large-scale genomic analysis. We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. Unlike in previous work, Gyosa follows a distributed processing design that enables handling larger amounts of genomic data in a scalable and efficient fashion. Further, by leveraging trusted execution environments (TEEs), namely Intel SGX, Gyosa allows users to confidentially delegate their GWAS analysis to untrusted third-party infrastructures. To overcome the memory limitations of SGX, we implement a computation partitioning scheme within Gyosa. This scheme reduces the number of operations done inside the TEEs while safeguarding the users’ genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees. Further, the results show that, by distributing GWASes computations, one can achieve a practical and usable privacy-preserving solution.

Item

Dos Projectos às Regiões Digitais. Principais desafios. [From Projects to Digital Regions. Main challenges.]