CTM - Indexed Articles in Journals

Recent Submissions

Now showing 1 - 5 of 219
  • Item
    Cognition inspired format for the expression of computer vision metadata
    (2016) Pedro Miguel Carvalho; Hélder Fernandes Castro; João Pedro Monteiro; Américo José Pereira
    Over the last decade, noticeable progress has occurred in the automated computer interpretation of visual information. Computers running artificial intelligence algorithms are increasingly capable of extracting perceptual and semantic information from images and registering it as metadata. There is also a growing body of manually produced image annotation data. All of this data is of great importance for scientific purposes as well as for commercial applications. Optimizing the usefulness of this manually or automatically produced information implies expressing it precisely and adequately at its different logical levels, making it easily accessible, manipulable, and shareable. It also implies the development of associated manipulation tools. However, the expression and manipulation of computer vision results has received less attention than the actual extraction of such results, and has hence advanced less. Existing metadata tools are poorly structured in logical terms, as they intermix the declaration of visual detections with that of the observed entities, events, and surrounding context. This poor structuring renders such tools rigid, limited, and cumbersome to use. Moreover, they are unprepared to deal with more advanced situations, such as the coherent expression of the information extracted from, or annotated onto, multi-view video resources. The work presented here comprises the specification of an advanced XML-based syntax for the expression and processing of computer-vision-relevant metadata. This proposal takes inspiration from the natural cognition process for the adequate expression of the information, with a particular focus on scenarios with varying numbers of sensory devices, notably multi-view video.
  • Item
    Efficient CIEDE2000-Based Color Similarity Decision for Computer Vision
    (2020) Américo José Pereira; Pedro Miguel Carvalho; Luís Corte Real
    Color and color differences are critical aspects in many image processing and computer vision applications. A paradigmatic example is object segmentation, where color distances can greatly influence the performance of the algorithms. Metrics for color difference have been proposed in the literature, including standards such as CIEDE2000, which quantifies the perceived difference between two given colors. This standard has been recommended for industrial computer vision applications, but the benefits of its application have been impaired by the complexity of the formula. This paper proposes a new strategy that improves the usability of the CIEDE2000 metric when a maximum acceptable distance can be imposed. We argue that, for applications where a maximum value above which colors are considered different can be established, it is possible to reduce the number of metric calculations by preemptively analyzing the color features. This methodology retains the benefits of the metric while overcoming its computational limitations, thus broadening the range of computer vision applications in which CIEDE2000 is practical within given computational resource requirements.
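The thresholded decision described in this abstract can be sketched as follows. The `ciede2000` function implements the standard formula; `same_color` is a hypothetical helper illustrating the early-exit idea with a cheap lightness-only lower bound (a minimal sketch of the general strategy, not the paper's actual pre-analysis of color features):

```python
import math

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 colour difference between two CIELAB (L, a, b) triplets."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    Cbar = (C1 + C2) / 2.0
    G = 0.5 * (1.0 - math.sqrt(Cbar**7 / (Cbar**7 + 25.0**7)))
    a1p, a2p = (1.0 + G) * a1, (1.0 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0
    dLp, dCp = L2 - L1, C2p - C1p
    if C1p * C2p == 0.0:
        dhp = 0.0
    else:
        dhp = h2p - h1p
        if dhp > 180.0:
            dhp -= 360.0
        elif dhp < -180.0:
            dhp += 360.0
    dHp = 2.0 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp) / 2.0)
    Lbp, Cbp = (L1 + L2) / 2.0, (C1p + C2p) / 2.0
    if C1p * C2p == 0.0:
        hbp = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        hbp = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        hbp = (h1p + h2p + 360.0) / 2.0
    else:
        hbp = (h1p + h2p - 360.0) / 2.0
    T = (1.0 - 0.17 * math.cos(math.radians(hbp - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * hbp))
         + 0.32 * math.cos(math.radians(3.0 * hbp + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * hbp - 63.0)))
    dtheta = 30.0 * math.exp(-(((hbp - 275.0) / 25.0) ** 2))
    RC = 2.0 * math.sqrt(Cbp**7 / (Cbp**7 + 25.0**7))
    SL = 1.0 + 0.015 * (Lbp - 50.0) ** 2 / math.sqrt(20.0 + (Lbp - 50.0) ** 2)
    SC = 1.0 + 0.045 * Cbp
    SH = 1.0 + 0.015 * Cbp * T
    RT = -math.sin(math.radians(2.0 * dtheta)) * RC
    tL, tC, tH = dLp / (kL * SL), dCp / (kC * SC), dHp / (kH * SH)
    return math.sqrt(tL * tL + tC * tC + tH * tH + RT * tC * tH)

def same_color(lab1, lab2, max_dist, kL=1.0):
    """Early-exit decision (hypothetical helper, not the paper's method).

    Because |dL'/(kL*SL)| is a lower bound on dE00, a large enough
    lightness gap settles "different" without the full formula.
    """
    Lbp = (lab1[0] + lab2[0]) / 2.0
    SL = 1.0 + 0.015 * (Lbp - 50.0) ** 2 / math.sqrt(20.0 + (Lbp - 50.0) ** 2)
    if abs(lab2[0] - lab1[0]) / (kL * SL) > max_dist:
        return False  # provably farther apart than max_dist; skip full metric
    return ciede2000(lab1, lab2) <= max_dist
```

The cheap pre-check pays off in the common case where most compared pixel pairs are clearly different, which is the regime the abstract targets.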
  • Item
    Misalignment-Resilient Propagation Model for Underwater Optical Wireless Links
    (2023) João Henrique Araújo; Joana Santos Tavares; Marques, V. M.; Henrique Salgado; Luís Manuel Pessoa
    This paper proposes a multiple-lens receiver scheme to increase the misalignment tolerance of an underwater optical wireless communications link between an autonomous underwater vehicle (AUV) and a sensor plane. An accurate model of photon propagation based on Monte Carlo simulation is presented, which accounts for photon refraction at the lens(es) of the sensor interface and for angular misalignment between the emitter and receiver. The results show that the ideal divergence of the emitter beam is around 15° for a 1 m transmission length, increasing to 22° for a shorter distance of 0.5 m, while being independent of the water turbidity. In addition, it is concluded that a seven-lens scheme is approximately three times more tolerant to offset than a single lens. A random forest machine learning algorithm is also assessed for its suitability to estimate, in real time, the offset and angle of the AUV relative to the fixed sensor, based on the power distribution across the lenses. The algorithm is able to estimate the offset and angular misalignment with a mean square error of 5 mm (6 mm) and 0.157 rad (0.174 rad) for a transmitter-receiver distance of 1 m (0.5 m), respectively.
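A Monte Carlo photon model of this kind can be illustrated with a deliberately simplified sketch that tracks only ballistic (unscattered) photons under Beer-Lambert attenuation, with no lens or refraction model; all names and parameters are hypothetical, and the paper's full simulation is far more detailed:

```python
import math
import random

def simulate(num_photons, distance_m, divergence_deg, attenuation_per_m,
             receiver_radius_m, seed=0):
    """Fraction of emitted photons reaching a circular receiver.

    Ballistic-only sketch: each photon either survives Beer-Lambert
    attenuation along its straight path or is lost; multiple scattering
    and lens refraction are not modelled.
    """
    rng = random.Random(seed)
    half_angle = math.radians(divergence_deg) / 2.0
    received = 0
    for _ in range(num_photons):
        # emission angle drawn uniformly within the divergence cone (2-D slice)
        theta = rng.uniform(-half_angle, half_angle)
        path = distance_m / math.cos(theta)
        # photon lost with probability 1 - exp(-c * path)
        if rng.random() > math.exp(-attenuation_per_m * path):
            continue
        # lateral offset where the straight ray crosses the receiver plane
        if abs(distance_m * math.tan(theta)) <= receiver_radius_m:
            received += 1
    return received / num_photons
```

Sweeping `divergence_deg` in such a simulation is how an optimal beam divergence (15° to 22° in the paper, depending on range) would be located.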
  • Item
    Texture collinearity foreground segmentation for night videos
    (2020) Martins, I.; Pedro Miguel Carvalho; Luís Corte Real; Luis Alba Castro, J. L.
    One of the most difficult scenarios for unsupervised segmentation of moving objects is found in nighttime videos, where the main challenges are poor illumination conditions resulting in low visibility of objects, very strong lights, surface-reflected light, a great variance of light intensity, sudden illumination changes, hard shadows, camouflaged objects, and noise. This paper proposes a novel method, coined COLBMOG (COLlinearity Boosted MOG), devised specifically for foreground segmentation in nighttime videos, that overcomes some of the limitations of state-of-the-art methods while still performing well in daytime scenarios. It is a texture-based classification method, using local texture modeling, complemented by a color-based classification method. The local texture at the pixel neighborhood is modeled as an N-dimensional vector. For a given pixel, the classification is based on the collinearity between this feature in the input frame and in the reference background frame. For this purpose, a multimodal temporal model of the collinearity between texture vectors of background pixels is maintained. COLBMOG was objectively evaluated on the Night Videos category of the ChangeDetection.net (CDnet) 2014 benchmark, where it ranks first among all unsupervised methods. A detailed analysis of the results revealed the superior performance of the proposed method compared to the best-performing state-of-the-art methods in this category, particularly evident in the most complex situations, where all algorithms tend to fail. © 2020 Elsevier Inc.
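The core collinearity test underlying such a method can be sketched as follows; this is only the basic vector comparison, not COLBMOG's multimodal temporal model, and the patch shapes and threshold are hypothetical:

```python
import numpy as np

def collinearity(u, v):
    # cosine of the angle between two texture vectors;
    # 1.0 means perfectly collinear (same texture up to a gain factor)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0.0 or nv == 0.0:
        return 1.0 if nu == nv else 0.0
    return float(np.dot(u, v) / (nu * nv))

def is_background(input_patch, background_patch, threshold=0.98):
    # a pixel whose neighbourhood texture stays collinear with the
    # background reference is classified as background
    u = np.asarray(input_patch, dtype=float).ravel()
    v = np.asarray(background_patch, dtype=float).ravel()
    return collinearity(u, v) >= threshold
```

Collinearity rather than plain distance makes the test robust to multiplicative illumination changes, which is precisely what nighttime lighting variation tends to produce.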
  • Item
    Automatic TV Logo Identification for Advertisement Detection without Prior Data
    (2021) Pedro Miguel Carvalho; Américo José Pereira; Paula Viana
    Advertisements are often inserted in multimedia content, and this is particularly relevant in TV broadcasting, where they have a key financial role. In this context, flexible and efficient processing of TV content to identify advertisement segments is highly desirable, as it can benefit different actors, including the broadcaster, the contracting company, and the end user. Detecting the presence of the channel logo has been seen in the state of the art as a good indicator. However, the difficulty of this challenging process increases as less prior data is available to help reduce uncertainty. As a result, the proposals in the literature that achieve the best results typically rely on prior knowledge or pre-existing databases. This paper proposes a flexible method for processing TV broadcasting content to detect channel logos, and consequently advertising segments, without using prior data about the channel or content. The final goal is to enable stream segmentation that identifies advertisement slices. The proposed method was assessed on available state-of-the-art datasets as well as on additional, more challenging stream captures. Results show that the proposed method surpasses the state of the art.
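One generic way to spot a static channel logo without prior data, in the spirit of this abstract but not necessarily the paper's method, is to exploit the fact that a broadcast overlay stays nearly constant while the programme behind it changes; a minimal numpy sketch with hypothetical names and threshold:

```python
import numpy as np

def static_overlay_mask(frames, var_threshold=25.0):
    """Boolean mask of pixels that stay nearly constant across frames.

    frames: (T, H, W) grayscale stack. A channel logo is a static
    overlay, so its pixels show low temporal variance, while the
    programme content behind it varies strongly.
    """
    stack = np.asarray(frames, dtype=np.float64)
    return stack.var(axis=0) < var_threshold
```

During advertisement breaks the logo typically disappears, so tracking when this mask's stable region vanishes gives a cue for segmenting advertising slices, without any channel-specific training data.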