DSpace Angular :: Browsing LIAAD - Other Publications by Issue Date

Item

Dos Projectos às Regiões Digitais. Principais desafios.

( 2008) Oliveira Manuel ; Maria Simões ; Domingos Santos ; Jan Wolf ; Ricardo Campos

Item

Ubiquitous Knowledge Discovery

( 2011) Michael May ; João Gama

Item

A biased random key genetic algorithm for 2D and 3D bin packing problems

( 2013) José Fernando Gonçalves ; Resende,MGC

In this paper we present a novel biased random-key genetic algorithm (BRKGA) for 2D and 3D bin packing problems. The approach uses a maximal-space representation to manage the free spaces in the bins. The proposed algorithm hybridizes a novel placement procedure with a genetic algorithm based on random keys. The BRKGA is used to evolve the order in which the boxes are packed into the bins and the parameters used by the placement procedure. Two new placement heuristics are used to determine the bin and the free maximal space where each box is placed. A novel fitness function that significantly improves the solution quality is also developed. The new approach is extensively tested on 858 problem instances and compared with other approaches published in the literature. The computational results demonstrate that the new approach consistently equals or outperforms the other approaches, and the statistical analysis confirms that the approach is significantly better than all the other approaches.
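
The random-key idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard BRKGA conventions (a chromosome is a vector of keys in [0, 1), sorted to obtain an order, and each gene is inherited from the elite parent with a fixed probability). The function names and the value `rho_e=0.7` are hypothetical, and the sketch covers only the packing order, not the placement-procedure parameters or the fitness function described in the paper.

```python
import random

def decode_order(keys):
    """Map a random-key vector to a packing order: boxes are taken by ascending key."""
    return sorted(range(len(keys)), key=lambda i: keys[i])

def biased_crossover(elite, other, rho_e=0.7, rng=random):
    """Parameterized uniform crossover: each gene comes from the elite parent
    with probability rho_e (an assumed value), otherwise from the other parent."""
    return [e if rng.random() < rho_e else o for e, o in zip(elite, other)]

# Four boxes; the keys decide the order in which they are packed.
print(decode_order([0.42, 0.07, 0.93, 0.15]))  # [1, 3, 0, 2]
```

Because any vector of keys decodes to a valid permutation, crossover never produces an infeasible packing order, which is the usual motivation for this encoding.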

Item

Real-time Augmented Reality shopping platform for studying consumer cognitive experiences

( 2013) Stoyanova,J ; Goncalves,R ; António Coelho ; Pedro Brito

Augmented Reality (AR) is a technology that produces a synthesis between computer-generated data and the physical world of a viewer while establishing 3D registration and real-time interaction. Among the wide range of applications of AR, its use in advertising shopping experiences has recently been embraced by advertisers due to its novelty and engaging potential. As part of a wider research effort aimed at understanding the impact of AR on consumer psychology, this paper presents a demo platform application developed for a real-time shopping experience for shoes and attempts to define a basis for subsequent marketing research in the field. In order to fully evaluate consumer experiences and compare them with the main AR platform, two other shopping applications were designed: a marker-based one and a static one. The platform will assist in exploring the antecedents of consumer purchase intention and in defining metrics for measuring shopping experiences with AR.

Item

Data Stream Clustering: A Survey

( 2013) Silva,JA ; Faria,ER ; Barros,RC ; Hruschka,ER ; de Carvalho,ACPLF ; João Gama

Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering imposes several challenges to be addressed, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires the development of algorithms capable of performing fast and incremental processing of data objects, suitably addressing time and memory limitations. In this article, we present a survey of data stream clustering algorithms, providing a thorough discussion of the main design components of state-of-the-art algorithms. In addition, this work addresses the temporal aspects involved in data stream clustering, and presents an overview of the usually employed experimental methodologies. A number of references are provided that describe applications of data stream clustering in different domains, such as network intrusion detection, sensor networks, and stock market analysis. Information regarding software packages and data repositories is also available for helping researchers and practitioners. Finally, some important issues and open questions that can be the subject of future research are discussed.

Item

Concave minimum cost network flow problems solved with a colony of ants

( 2013) Monteiro,MSR ; Dalila Fontes ; Fontes,FACC

In this work we address the Single-Source Uncapacitated Minimum Cost Network Flow Problem with concave cost functions. This problem is NP-hard; therefore we propose a hybrid heuristic to solve it. Our goal is not only to apply an ant colony optimization (ACO) algorithm to such a problem, but also to provide insight into how the parameters affect the performance of the algorithm. The performance of the ACO algorithm is improved by hybridizing it with a local search (LS) procedure. The core ACO procedure is used mainly to deal with the exploration of the search space, while the LS is incorporated to further cope with the exploitation of the best solutions found. The method we have developed has proven to be very efficient while solving both small and large problem instances. The problems we have used to test the algorithm were previously solved by other authors using other population-based heuristics. Our algorithm was able to improve upon some of their results in terms of solution quality, proving that the hybrid ACO (HACO) algorithm is a very good alternative approach to solve these problems. In addition, our algorithm is substantially faster at achieving these improved solutions. Furthermore, the magnitude of the reduction of the computational requirements grows with problem size.

Item

Evaluation methodology for multiclass novelty detection algorithms

( 2013) Faria,ER ; Goncalves,IJCR ; João Gama ; Carvalho,ACPLF

Novelty detection is a useful ability for learning systems, especially in data stream scenarios, where new concepts can appear, known concepts can disappear and concepts can evolve over time. There are several studies in the literature investigating the use of machine learning classification techniques for novelty detection in data streams. However, there is no consensus regarding how to evaluate the performance of these techniques, particularly for multiclass problems. In this study, we propose a new evaluation approach for multiclass novelty detection problems in data streams. This approach is able to deal with: i) multiclass problems, ii) a confusion matrix with a column representing the unknown examples, iii) a confusion matrix that increases over time, iv) unsupervised learning, which generates novelties without an association with the problem classes, and v) representation of the evaluation measures over time. We evaluate the performance of the proposed approach using known novelty detection algorithms with artificial and real data sets. © 2013 IEEE.

Item

Predicting Taxi-Passenger Demand Using Streaming Data

( 2013) Luís Moreira Matias ; João Gama ; Michel Ferreira ; João Mendes Moreira ; Damas,L

Informed driving is increasingly becoming a key feature for increasing the sustainability of taxi companies. The sensors that are installed in each vehicle are providing new opportunities for automatically discovering knowledge, which, in return, delivers information for real-time decision making. Intelligent transportation systems for taxi dispatching and for finding time-saving routes are already exploring these sensing data. This paper introduces a novel methodology for predicting the spatial distribution of taxi-passengers for a short-term time horizon using streaming data. First, the information was aggregated into a histogram time series. Then, three time-series forecasting techniques were combined to originate a prediction. Experimental tests were conducted using the online data that are transmitted by 441 vehicles of a fleet running in the city of Porto, Portugal. The results demonstrated that the proposed framework can provide effective insight into the spatiotemporal distribution of taxi-passenger demand for a 30-min horizon.

Item

Rule Induction for Sentence Reduction

( 2013) João Cordeiro ; Dias,G ; Pavel Brazdil

Sentence Reduction has recently received great attention from the Automatic Text Summarization research community. Sentence Reduction consists of eliminating sentence components such as words, part-of-speech tag sequences or chunks without severely degrading the information contained in the sentence or its grammatical correctness. In this paper, we present an unsupervised scalable methodology for learning sentence reduction rules. Paraphrases are first discovered within a collection of automatically crawled Web News Stories and then textually aligned in order to extract interchangeable text fragment candidates, in particular reduction cases. As only positive examples exist, Inductive Logic Programming (ILP) provides an interesting learning paradigm for the extraction of sentence reduction rules. As a consequence, reduction cases are transformed into first-order logic clauses to supply a massive set of suitable learning instances, and an ILP learning environment is defined within the context of the Aleph framework. Experiments show good results in terms of irrelevancy elimination, syntactical correctness and reduction rate in a real-world environment, as opposed to other methodologies proposed so far.

Item

On evaluating stream learning algorithms

( 2013) João Gama ; Raquel Sebastião ; Pedro Pereira Rodrigues

Most streaming decision models evolve continuously over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet convincingly addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of prequential error with forgetting mechanisms to provide reliable error estimators. We prove that, in stationary data and for consistent learning algorithms, the holdout estimator, the prequential error and the prequential error estimated over a sliding window or using fading factors all converge to the Bayes error. The use of prequential error with forgetting mechanisms proves to be advantageous in assessing performance and in comparing stream learning algorithms. It is also worthwhile to use the proposed methods for hypothesis testing and for change detection. In a set of experiments in drift scenarios, we evaluate the ability of a standard change detection algorithm to detect change using three prequential error estimators. These experiments point out that the use of forgetting mechanisms (sliding windows or fading factors) is required for fast and efficient change detection. In comparison to sliding windows, fading factors are faster and memoryless, both important requirements for streaming applications. Overall, this paper is a contribution to a discussion on best practice for performance assessment when learning is a continuous process and the decision models are dynamic and evolve over time.
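
The two forgetting mechanisms named in the abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the function names, the window size, and the fading factor `alpha=0.99` are assumptions. Each function takes the per-example losses of a test-then-train loop (for instance 0 for a correct prediction and 1 for an error) and returns the running error estimate after every example.

```python
from collections import deque

def sliding_prequential_error(losses, window=1000):
    """Prequential error averaged over the most recent `window` losses."""
    recent = deque(maxlen=window)  # must store the last `window` losses
    estimates = []
    for loss in losses:
        recent.append(loss)
        estimates.append(sum(recent) / len(recent))
    return estimates

def fading_prequential_error(losses, alpha=0.99):
    """Prequential error with a fading factor: older losses are discounted by alpha.
    Only two running sums are kept, so no past losses need to be stored."""
    faded_sum = faded_count = 0.0
    estimates = []
    for loss in losses:
        faded_sum = loss + alpha * faded_sum
        faded_count = 1.0 + alpha * faded_count
        estimates.append(faded_sum / faded_count)
    return estimates
```

With `alpha=1.0` the fading estimator reduces to the plain prequential error (the mean of all losses seen so far), while smaller values forget the past faster. The constant-memory update is what the abstract refers to when it calls fading factors memoryless.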

Item

Adaptive model rules from data streams

( 2013) Ezilda Duarte Almeida ; Carlos Ferreira ; João Gama

Decision rules are one of the most expressive languages for machine learning. In this paper we present Adaptive Model Rules (AMRules), the first streaming rule learning algorithm for regression problems. In AMRules the antecedent of a rule is a conjunction of conditions on the attribute values, and the consequent is a linear combination of attribute values. Each rule uses a Page-Hinkley test to detect changes in the process generating data and react to changes by pruning the rule set. In the experimental section we report the results of AMRules on benchmark regression problems, and compare the performance of our system with other streaming regression algorithms. © 2013 Springer-Verlag.
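
The Page-Hinkley test mentioned in the abstract can be sketched in its textbook form for detecting an increase in the mean of a monitored signal, such as a rule's prediction error. The class name, the parameter defaults, and the method name below are assumptions made for illustration; how AMRules configures the test and prunes the rule set is not shown in this listing.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change (assumed default)
        self.threshold = threshold  # alarm threshold, often written lambda (assumed default)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold
```

In a rule learner of the kind described, one detector per rule would be fed that rule's error on each covered example, and an alarm would mark the rule as a candidate for removal.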

Item

SMOTE for regression

( 2013) Luís Torgo ; Rita Paula Ribeiro ; Pfahringer,B ; Paula Oliveira Branco

Several real-world prediction problems involve forecasting rare values of a target variable. When this variable is nominal we have a problem of class imbalance that has already been studied thoroughly within machine learning. For regression tasks, where the target variable is continuous, few works exist addressing this type of problem. Still, important application areas involve forecasting rare extreme values of a continuous target variable. This paper describes a contribution to this type of task. Namely, we propose to address such tasks by sampling approaches. These approaches change the distribution of the given training data set to decrease the problem of imbalance between the rare target cases and the most frequent ones. We present a modification of the well-known SMOTE algorithm that allows its use on these regression tasks. In an extensive set of experiments we provide empirical evidence for the superiority of our proposals for these particular regression tasks. The proposed SmoteR method can be used with any existing regression algorithm, turning it into a general tool for addressing problems of forecasting rare extreme values of a continuous target variable. © 2013 Springer-Verlag.
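
The core step of a SMOTE-style over-sampler adapted to regression can be sketched as below. This is an assumption-laden illustration rather than the published SmoteR procedure: it handles numeric features only, takes the two seed cases as given (selecting rare cases and their nearest neighbours is omitted), and sets the synthetic target by the same linear interpolation used for the features, which is equivalent to weighting each seed's target inversely to its distance from the new point. The function name is hypothetical.

```python
import random

def synthesize_case(x_a, y_a, x_b, y_b, rng=random):
    """Create one synthetic (features, target) pair on the segment between two seed cases."""
    t = rng.random()  # position along the segment from seed a (t=0) to seed b (t=1)
    x_new = [a + t * (b - a) for a, b in zip(x_a, x_b)]
    # The closer the new point is to a seed, the more that seed's target counts.
    y_new = (1.0 - t) * y_a + t * y_b
    return x_new, y_new
```

Unlike classification SMOTE, where the synthetic example simply inherits the minority class label, a regression variant has to generate a target value as well, which is the part this sketch highlights.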

Item

Random rules from data streams

( 2013) Ezilda Duarte Almeida ; Kosina,P ; João Gama

Existing works suggest that random inputs and random features produce good results in classification. In this paper we study the problem of generating random rule sets from data streams. One of the most interpretable and flexible models for data stream mining prediction tasks is the Very Fast Decision Rules learner (VFDR). In this work we extend the VFDR algorithm using random rules from data streams. The proposed algorithm generates several sets of rules. Each rule set is associated with a set of N_att attributes. The proposed algorithm maintains all properties required when learning from stationary data streams: online and any-time classification, processing each example once. Copyright 2013 ACM.

Item

On recommending urban hotspots to find our next passenger

( 2013) Luís Moreira Matias ; Fernandes,R ; João Gama ; Michel Ferreira ; João Mendes Moreira ; Damas,L

Rising fuel costs are ruling out random cruising strategies for passenger finding. Here, a recommendation model to suggest the most passenger-profitable urban area/stand is presented. This framework is able to combine 1) the underlying historical patterns of passenger demand and 2) the current network status to decide which is the best zone to head to at each moment. The major contribution of this work is on how to combine well-known methods for learning from data streams (such as the historical GPS traces) as an approach to solve this particular problem. The results were promising: 395,361 of the 506,873 services dispatched were correctly predicted. The experiments also highlighted that a fleet equipped with such a framework surpassed a fleet that was not: it experienced an average waiting time to pick up a passenger 5% lower than its competitor. © 2013 IJCAI.

Item

WIPS: The WiSARD indoor positioning system

( 2013) Cardoso,DO ; João Gama ; De Gregorio,M ; Franca,FMG ; Giordano,M ; Lima,PMV

In this paper, we present a WiSARD-based system addressing the problem of Indoor Positioning (IP) by taking advantage of pervasively available infrastructure (WiFi Access Points, APs). The goal is to develop a system to be used to position users in indoor environments such as museums, malls, factories, and offshore platforms. Based on the fingerprint approach, we show how the proposed weightless neural system provides very good results in terms of performance and positioning resolution. Both the approach to the problem and the system will be presented through two correlated experiments.

Item

Data stream mining: The bounded rationality

( 2013) João Gama

Developments in information and communication technologies have dramatically changed data collection and processing methods. Data mining is now moving to the era of bounded rationality. In this work we discuss the implications of the resource constraints imposed by the data stream computational model on the design of learning algorithms. We analyze the behavior of stream mining algorithms and present future research directions, including ubiquitous stream mining and self-adaption models.

Item

Dynamics of human decisions

( 2013) Renato Araújo Soeiro ; Mousa,A ; Oliveira,TR ; Alberto Pinto

Item

Learning model rules from high-speed data streams

( 2013) Ezilda Duarte Almeida ; Carlos Ferreira ; João Gama

Decision rules are one of the most expressive languages for machine learning. In this paper we present Adaptive Model Rules (AMRules), the first streaming rule learning algorithm for regression problems. In AMRules the antecedent of a rule is a conjunction of conditions on the attribute values, and the consequent is a linear combination of attribute values. Each rule in AMRules uses a Page-Hinkley test to detect changes in the process generating data and react to changes by pruning the rule set. In the experimental section we report the results of AMRules on benchmark regression problems, and compare the performance of our algorithm with other streaming regression algorithms. © 2013 IJCAI.

Item

Binary recommender systems: Introduction, an application and outlook

( 2013) Alípio Jorge

Recommender Systems are a hot application area these days, made popular by well-known web sites. The problem of predicting user preferences is very demanding from the data mining algorithm design point of view, but it also poses challenges to evaluation and monitoring. Moreover, there is a lot of information that can be exploited, from clickstreams and background information to musical content and social interaction. As data grows and recommendation requests must be answered in a split second, online and agile solutions must be implemented. In this talk we will give a brief introduction to binary recommender systems, describe a particular hybrid application to music recommendation, from algorithm to online evaluation, and refer to context-aware and online recommender algorithms. © 2013 ACM.

Item

Novelty detection algorithm for data streams multi-class problems

( 2013) Faria,ER ; João Gama ; Carvalho,APLF

Novelty detection has been presented in the literature as a one-class problem. In this case, new examples are classified as either belonging to the target class or not. The examples not explained by the model are detected as belonging to a class named novelty. However, novelty detection is much more general, especially in data stream scenarios, where the number of classes might be unknown before learning and new classes can appear at any time. In this case, the novelty concept is composed of different classes. This work presents a new algorithm to address novelty detection in multi-class data stream problems, the MINAS algorithm. Moreover, we also present a new experimental methodology to evaluate novelty detection methods in multi-class problems. The data used in the experiments include artificial and real data sets. Experimental results show that MINAS is able to discover novelties in multi-class problems. Copyright 2013 ACM.
