DSpace Angular :: Browsing CTM by Author "1107"

Item

Audiovisual annotation in the study of physics

(2018) Marçal,J ; Borges,MM ; Carvalho,P ; Paula Viana ; 1107

The support of video in the learning environment is nowadays used to many ends, whether for demonstration, research or sharing. It is intended to reinforce the space before and after class and introduce a new dynamic and interaction in the classroom itself. Pedagogical innovation may be achieved by different approaches to motivate students and obtain better results. This paper presents a review of the literature about the potential of using video annotation in the education context, specifically in the domain of Physics, using an open source annotation tool. The creation of audiovisual references, either for quick access to parts of organized video annotated content by the teacher, knowledge building or revision by and for other students, is analyzed. This study is complemented with a testbed, showing the potential of using audiovisual annotated content within a K-12 context. Students were invited to select video content, annotate, organize and publish the annotations, which could support the learning process in the domain of Physics. Results show that most of the aspects under analysis received a positive evaluation. The only exception relates to the capacity of the approach to motivate students to study Physics, as most of the students did not see this methodology as a motivating means. © 2018 ACM

Item

Automatic TV Logo Identification for Advertisement Detection without Prior Data

(2021) Pedro Miguel Carvalho ; Américo José Pereira ; Paula Viana ; 1107 ; 4358 ; 6078

Advertisements are often inserted in multimedia content, and this is particularly relevant in TV broadcasting as they have a key financial role. In this context, the flexible and efficient processing of TV content to identify advertisement segments is highly desirable as it can benefit different actors, including the broadcaster, the contracting company, and the end user. Detecting the presence of the channel logo has been seen in the state of the art as a good indicator. However, the difficulty of this challenging process increases as less prior data is available to help reduce uncertainty. As a result, the proposals in the literature that achieve the best results typically rely on prior knowledge or pre-existent databases. This paper proposes a flexible method for processing TV broadcasting content aiming at detecting channel logos, and consequently advertising segments, without using prior data about the channel or content. The final goal is to enable stream segmentation identifying advertisement slices. The proposed method was assessed over available state-of-the-art datasets as well as additional and more challenging stream captures. Results show that the proposed method surpasses the state of the art.

Item

Enhancing Photography Management Through Automatically Extracted Metadata

(2022) Pedro Miguel Carvalho ; Freitas,D ; Machado,T ; Paula Viana ; 1107 ; 4358

The tremendous increase in photographs that are captured each day by common users has been favoured by the availability of high quality devices at accessible costs, such as smartphones and digital cameras. However, the quantity of captured photos raises new challenges regarding the access and management of image repositories. This paper describes a lightweight distributed framework intended to help overcome these problems. It uses image metadata in EXIF format, already widely added to images by digital acquisition devices, and automatic facial recognition to provide management and search functionalities. Moreover, a visualization functionality using a graph-based strategy was integrated, enabling an enhanced and more interactive navigation through search results and the corresponding relations.

Item

From a Visual Scene to a Virtual Representation: A Cross-Domain Review

(2023) Pedro Miguel Carvalho ; Paula Viana ; Nuno Alexandre Pereira ; Américo José Pereira ; Luís Corte Real ; 4358 ; 1107 ; 7023 ; 6078 ; 243

The widespread use of smartphones and other low-cost equipment as recording devices, the massive growth in bandwidth, and the ever-growing demand for new applications with enhanced capabilities have made visual data a must in several scenarios, including surveillance, sports, retail, entertainment, and intelligent vehicles. Despite significant advances in analyzing and extracting data from images and video, there is a lack of solutions able to analyze and semantically describe the information in the visual scene so that it can be efficiently used and repurposed. Scientific contributions have focused on individual aspects or on addressing specific problems and application areas, and no cross-domain solution is available to implement a complete system that enables information passing between cross-cutting algorithms. This paper analyses the problem from an end-to-end perspective, i.e., from the visual scene analysis to the representation of information in a virtual environment, including how the extracted data can be described and stored. A simple processing pipeline is introduced to set up a structure for discussing challenges and opportunities in different steps of the entire process, making it possible to identify current gaps in the literature. The work reviews various technologies specifically from the perspective of their applicability to an end-to-end pipeline for scene analysis and synthesis, along with an extensive analysis of datasets for relevant tasks.

Item

Improving Audiovisual Content Annotation Through a Semi-automated Process Based on Deep Learning

(2018) Paula Viana ; Maria Teresa Andrade ; Pedro Miguel Carvalho ; Vilaça,L ; 1107 ; 4358 ; 400

Over the last few years, Deep Learning has become one of the most popular research fields of Artificial Intelligence. Several approaches have been developed to address conventional challenges of AI. In computer vision, these methods provide the means to solve tasks like image classification, object identification and extraction of features. In this paper, some approaches to face detection and recognition are presented and analyzed, in order to identify the one with the best performance. The main objective is to automate the annotation of a large dataset and to avoid the costly and time-consuming process of content annotation. The approach follows the concept of incremental learning and an R-CNN model was implemented. Tests were conducted with the objective of detecting and recognizing one personality within image and video content. Results coming from this initial automatic process are then made available to an auxiliary tool that enables further validation of the annotations prior to uploading them to the archive. Tests show that, even with a small dataset, the results obtained are satisfactory. © 2020, Springer Nature Switzerland AG.

Item

Photo2Video: Semantic-Aware Deep Learning-Based Video Generation from Still Content

(2022) Paula Viana ; Maria Teresa Andrade ; Pedro Miguel Carvalho ; Luís Miguel Salgado ; Inês Filipa Teixeira ; Tiago André Costa ; Jonker,P ; 400 ; 1107 ; 4358 ; 5363 ; 7420 ; 7514

Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far addresses essentially textual content or the identification and selection of specific types of content. A wealth of possibilities is yet to be explored by bringing the use of ML into the multimedia creative process, allowing the knowledge inferred by the former to influence automatically how new multimedia content is created. The work presented in this article provides contributions in three distinct ways towards this goal: firstly, it proposes a methodology to re-train popular neural network models to identify new thematic concepts in static visual content and attach meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be automatically called upon to apply such effects to a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow by offering the user the possibility of manipulating the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The final result strongly contrasts with current standard approaches of creating random movements, by implementing an intelligent content- and context-aware video.

Item

Semantic Storytelling Automation: A Context-Aware and Metadata-Driven Approach

(2020) Paula Viana ; Pedro Miguel Carvalho ; Maria Teresa Andrade ; Jonker,PP ; Papanikolaou,V ; Teixeira,IN ; Vilaça,L ; Pinto,JP ; Tiago André Costa ; 4358 ; 5363 ; 400 ; 1107

Item

YouTube timed metadata enrichment using a collaborative approach

(2019) Paula Viana ; José Pedro Pinto ; 5865 ; 1107

Although the growth of video content in online platforms has been happening for some time, searching and browsing these assets is still very inefficient, as rich contextual data that describes the content is still not available. Furthermore, any available descriptions are usually not linked to timed moments of content. In this paper, we present an approach for making social web videos available on YouTube more accessible, searchable and navigable. By using the concept of crowdsourcing to collect the metadata, our proposal can help to easily enhance content uploaded to the YouTube platform. Metadata, collected through a collaborative annotation game, is added to the content as time-based information in the form of descriptions and captions using the YouTube API. This contributes to enriching video content and enabling navigation through temporal links. © Springer Nature Switzerland AG 2019.
