<?xml version="1.0"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:dcterms="http://purl.org/dc/terms/"
     xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>IPOL</title>
<link>http://www.ipol.im/feed/</link>
<atom:link href="http://www.ipol.im/feed/articles.rss" rel="self" type="application/rss+xml"/>
<description>IPOL Articles — Latest articles published in IPOL.</description>
<item>
	<title>An Implementation of Two-Phase Image Segmentation Using the Split Bregman Method</title>
	
	<dc:creator>Olakunle Abawonse,
G&#xFC;nay Do&#x11F;an</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2026/578/</guid>
	
	<link>http://www.ipol.im/pub/art/2026/578/</link>
	<pubDate>Fri, 10 Apr 2026 00:00:00 +0200</pubDate>
	<dcterms:modified>2026-04-10T08:16:14Z</dcterms:modified>
	<description>In this paper, we describe an implementation of the two-phase image segmentation algorithm proposed by Goldstein, Bresson and Osher in [Geometric Applications of the Split Bregman Method:
Segmentation and Surface Reconstruction, Journal of Scientific Computing, 2010]. This algorithm partitions the domain of a given 2D image into foreground and background regions, and each pixel of the image is assigned membership to one of these two regions. The underlying assumption for the segmentation model is that the pixel values of the input image can be summarized by two distinct average values, and that the region boundaries are smooth. Accordingly, the model is formulated as an energy functional whose variable is a region membership function that assigns pixels to either region, as originally proposed by Chan and Vese in [Active Contours Without Edges, IEEE Transactions on Image Processing, 2001]. This energy is the sum of image data terms in the regions and a length penalty for region boundaries. Goldstein, Bresson and Osher modify the energy of Chan-Vese so that their new energy can be minimized efficiently using the split Bregman method to produce an equivalent two-phase segmentation. We provide a detailed implementation of this method, and document its performance with several images over a range of algorithm parameters.</description>
</item>
<item>
	<title>Image Segmentation using Backward Stochastic Differential Equations</title>
	
	<dc:creator>Dariusz Borkowski</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2026/636/</guid>
	
	<link>http://www.ipol.im/pub/art/2026/636/</link>
	<pubDate>Wed, 08 Apr 2026 00:00:00 +0200</pubDate>
	<dcterms:modified>2026-04-08T09:53:28Z</dcterms:modified>
	<description>We introduce a novel image segmentation algorithm based on the methodology of approximating solutions to backward stochastic differential equations (BSDEs). The segmentation method repeats the BSDE reconstruction process, with the parameters of these equations changing in subsequent steps. We are interested in a sequence of images driven by BSDE solutions. As the segmentation result, we define the limit of these images. By their nature, stochastic tools, particularly the Monte Carlo method, have high computational complexity. There are concerns about the running time of the proposed method, especially if we are considering a sequence of stochastic solutions. Experimental segmentation results show that it is possible to obtain results quickly and that the algorithm yields excellent results for images with intense noise.</description>
</item>
<item>
	<title>Voronoi Diagrams for Page Segmentation</title>
	
	<dc:creator>Marina Gardella,
Ignacio Ramirez</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2026/591/</guid>
	
	<link>http://www.ipol.im/pub/art/2026/591/</link>
	<pubDate>Wed, 18 Mar 2026 00:00:00 +0100</pubDate>
	<dcterms:modified>2026-03-18T07:59:32Z</dcterms:modified>
	<description>Page segmentation is a key task in document processing, enabling effective extraction of structured information from diverse document types. This paper presents an in-depth analysis of the method proposed by Kise et al., a bottom-up approach using area Voronoi diagrams to identify spatial relationships between document parts. Our work provides a detailed description of the method, emphasizing clarity, reproducibility, and transparency, particularly regarding aspects not fully specified in the original paper. We highlight the impact of the parameter settings and preprocessing steps on the method&amp;#x27;s performance. Through extensive testing, we demonstrate that the method can handle a wide range of layouts but exhibits notable sensitivity to specific document characteristics, especially in handling complex elements like handwritten text, lists, drop-caps, and tables.</description>
</item>
<item>
	<title>Thin-plate Splines on the Sphere for Interpolation, Computing Spherical Averages, and Solving Inverse Problems</title>
	
	<dc:creator>Max Dunitz</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2026/451/</guid>
	
	<link>http://www.ipol.im/pub/art/2026/451/</link>
	<pubDate>Fri, 13 Feb 2026 00:00:00 +0100</pubDate>
	<dcterms:modified>2026-02-13T19:47:01Z</dcterms:modified>
	<description>In many applications, 
planar spline interpolations of scattered data on the sphere are unsatisfactory; spherical splines are desired. Wahba (1981) defined the thin-plate splines on the sphere by analogy with the polynomial splines on the circle and the thin-plate splines in R^d. The thin-plate spline fit to a scattered data set on the sphere is the solution to an empirical risk minimization problem that penalizes the infidelity of the fit to the data as well as its &amp;#x27;wiggliness&amp;#x27;. This latter term is the square of a seminorm penalty based on the Laplace-Beltrami operator. The minimization problem is posed in a reproducing kernel Hilbert space (RKHS) of functions of finite wiggliness, whose reproducing kernel is isotropic and, due to a result by Schoenberg (1942), given by a Legendre series. A closed-form expression (in terms of the polylogarithm) for the kernel was found by Wendelberger (1982) and re-discovered by Beatson and zu Castell (2018). 
These closed-form expressions make not just spline interpolation but also downstream signal-processing tasks, such as cubature or resolution of inverse problems, more tractable in fields where scattered data and spherical models are common, such as remote sensing, geostatistics, motion planning, graphics, and medical imaging. In this paper, we present a tutorial on spline methods in RKHSs and show how they can be used to interpolate, smooth, and numerically integrate scattered data on the sphere and solve related inverse problems. The accompanying demo compares thin-plate spline interpolation over the sphere with thin-plate splines on an equirectangular projection and natural cubic splines on a one-dimensional latitudinal projection used in greenhouse gas monitoring. Global mean values of the interpolation surfaces are presented as well, to illustrate how this isotropic spherical kernel - which penalizes interpolant wiggliness without concern for application-specific factors like atmospheric winds - affects the computation of global averages.</description>
</item>
<item>
	<title>A Brief Analysis of the Change Detector by Kervrann et al.</title>
	
	<dc:creator>Tristan Dagobert,
Jean-Michel Morel,
Gabriele Facciolo</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/602/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/602/</link>
	<pubDate>Sun, 28 Dec 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-12-28T11:05:21Z</dcterms:modified>
	<description>This work describes the symmetric method by Kervrann et al. for change detection.
The algorithm processes a pair of images using a hypothesis testing technique with an a contrario approach.
We perform a brief analysis of the results produced by the method and evaluate its quality and limitations
on the Sentinel-2 OSCD dataset.</description>
</item>
<item>
	<title>CS-TRD: a Cross-Section Tree Ring Detection Method</title>
	
	<dc:creator>Henry Marichal,
Diego Passarella,
Gregory Randall</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/485/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/485/</link>
	<pubDate>Sat, 27 Dec 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-12-27T15:19:23Z</dcterms:modified>
	<description>This work describes a Tree Ring Detection method for complete Cross-Sections of Trees (CS-TRD) that detects, processes, and connects edges corresponding to the tree&amp;#x27;s growth rings. 
The method relies on edge detection, and its parameters are set to default values and can be adjusted as needed. The only required input is the location of the
biological center of the tree, the pith, which can be marked manually or using an automatic detection algorithm.
CS-TRD achieves an F-Score of 91% in the UruDendro dataset (of Pinus taeda) and 97% in the Kennel dataset (of Abies alba) without specialized hardware requirements. </description>
</item>
<item>
	<title>L1-Norm Redundant Delaunay Phase Unwrapping and Gradient Correction</title>
	
	<dc:creator>Alexandre Achard-de Lustrac,
Roland Akiki,
Axel Davy,
Jean-Michel Morel</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/583/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/583/</link>
	<pubDate>Sat, 27 Dec 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-12-27T15:19:23Z</dcterms:modified>
	<description>This article deals with arrays of real numbers which have been reduced modulo 2h into the interval [-h,h] where h&amp;#x3E;0 is a positive real number. Such an array is said to be wrapped modulo 2h. Often, the elements of these arrays correspond to values observed at points in an image-like 2D space which are connected by a graph structure. The process of retrieving the original array from which the wrapped image originates is called unwrapping. Of course, the wrapping process is not one-to-one, and the quality of the recovered unwrapped version depends on the smoothness of the original array. The goal of unwrapping is to define a most plausible left inverse (as will be defined in a precise way) to the non-injective modulation operator mod 2h using heuristic arguments and regularity assumptions on the original signal. Following the guidelines described in [M. Costantini, A Novel Phase Unwrapping Method Based on Network Programming, IEEE Transactions on Geoscience and Remote Sensing, 1998] and [M. Costantini et al., A general formulation for redundant integration of finite differences and phase unwrapping on a sparse multidimensional domain, IEEE Transactions on Geoscience and Remote Sensing, 2012], this is made possible by correcting an approximate gradient into a global gradient using either linear programming or, in some cases, minimum-cost flow techniques to solve an L1-norm optimization problem. Such a gradient-correcting technique can also be used in general for finding a most plausible gradient and reconstructing a signal. The online demo associated with this paper implements the aforementioned methods.</description>
</item>
<item>
	<title>A Brief Review and Analysis of Two Methods for Automatic Sign Language Segmentation</title>
	
	<dc:creator>Ariel E. Stassi,
J. Mat&#xED;as Di Martino,
Gregory Randall</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/560/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/560/</link>
	<pubDate>Fri, 22 Aug 2025 00:00:00 +0200</pubDate>
	<dcterms:modified>2025-08-22T08:10:27Z</dcterms:modified>
	<description>Sign language segmentation is a fundamental task in sign language processing to implement automatic translation systems. In this work, we study and compare the performance of two state-of-the-art methods for automatic sign language segmentation:
&amp;#x27;Automatic Segmentation of Sign Language into Subtitle-Units&amp;#x27; [Bull et al., European Conference on Computer Vision Workshops, 2020] and &amp;#x27;Linguistically Motivated Sign Language Segmentation&amp;#x27; [Moryossef et al., Findings of the Association for Computational Linguistics, 2023]. Each method has an online demo that can be used to run the approaches presented here on example videos, varying parameters such as the pose models considered and the probability thresholds. Both methods use pauses and movements of the derived skeletons to detect the limits of a phrase. We consider two datasets, one of American Sign Language (the test set of How2Sign) and one of Uruguayan Sign Language (LSU-DS). For the evaluation, we consider two metrics used in the paper of Moryossef et al. In the case of LSU-DS, as we have triplets of simultaneous videos taken from different points of view, we propose to use the IoU dispersion among points of view to estimate the coherence of the temporal segmentation of a single signer simultaneously observed by different cameras. The performances of the different variants of each method are evaluated, showing the limits of the methods, the datasets, and the metrics to capture the quality of the automatic solutions.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source codes are available [[here (LMSLS)|https://github.com/sign-language-processing/segmentation]] and [[here (ASSLiSU)|https://github.com/hannahbull/sign_language_segmentation]] (last checked 2025/07/26).**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>Gaussian Splatting: An Introduction</title>
	
	<dc:creator>Akash Malhotra,
Nac&#xE9;ra Seghouani</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/566/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/566/</link>
	<pubDate>Tue, 17 Jun 2025 00:00:00 +0200</pubDate>
	<dcterms:modified>2025-06-17T18:47:21Z</dcterms:modified>
	<description>Gaussian Splatting has emerged as a powerful technique for signal representation, especially in 3D. This paper introduces Gaussian Splatting and demonstrates its application across 1D, 2D, and 3D cases. We also discuss Gaussian Splatting in relation to Neural Radiance Fields (NeRF), highlighting the computational trade-offs and performance benefits. Through this work, we aim to bridge the gap between foundational concepts in view synthesis and advanced research, making Gaussian Splatting a more approachable and widely understood technique in the field of signal processing and computer vision. We provide code examples and detailed explanations to make the topic accessible to a broader audience, enabling readers to dive into more advanced technical papers with ease.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source codes are available here: [[2D Gaussian Splatting|https://github.com/OutofAi/2D-Gaussian-Splatting]]
and [[Gaussian Splatting|https://github.com/nerfstudio-project/gsplat]] (last checked 2025/06/17).**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>Specularity in NeRFs: A Comparative Study of Ref-NeRF and NRFF</title>
	
	<dc:creator>Albert Barreiro,
Roger Mar&#xED;,
Rafael Redondo,
Gloria Haro,
Carles Bosch,
David Berga</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/562/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/562/</link>
	<pubDate>Wed, 12 Mar 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-03-12T14:15:47Z</dcterms:modified>
	<description>Neural Radiance Fields (NeRF) have emerged as a leading technology for 3D digitization, especially for their high accuracy and intricate detailing. Despite their advancements, early NeRF models struggle to handle reflections on specular surfaces effectively. To address this, alternative approaches such as Ref-NeRF and NRFF were proposed to improve fidelity in representing this physical phenomenon. This study compares these two models, providing an analysis of their effectiveness and limitations in dealing with complex specularities. We demonstrate that
both methods struggle with inter-reflections and tend to model anisotropic specularities by altering the predicted surface normals.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source codes are available here: [[Ref-NeRF|https://github.com/kakaobrain/NeRF-Factory]]
and [[NRFF|https://github.com/imkanghan/nrff]] (last checked 2025/02/06).**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>Latent Diffusion Approaches for Conditional Generation of Aerial Imagery: A Study</title>
	
	<dc:creator>Roger Mar&#xED;,
Rafael Redondo</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/580/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/580/</link>
	<pubDate>Tue, 11 Mar 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-03-11T08:34:52Z</dcterms:modified>
	<description>Generative artificial intelligence is increasingly being applied in diverse areas such as architecture design, music composition, or character animation. Among the generative methods, diffusion models are today the state of the art in the synthesis of high quality images with inherent diversity and realism.
This paper aims to evaluate the fidelity and realism of the synthesis achieved by different architectural variations of a latent diffusion model, which is used to generate aerial images conditioned on semantic maps. As shown in the results, the diffusion model tends to correctly capture the overall semantic structure and generates realistic textures, though often lacking fine-grained detail. Among the conditioning variations, cross-attention layers were crucial to outline the semantic segments more accurately and exploit conditional data more effectively.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>Semiogram: a Visual Tool for Gait Quantification in Routine Neurological Follow-Up</title>
	
	<dc:creator>Cyril Voisard,
Nicolas de l&#x27;Escalopier,
Damien Ricard,
Laurent Oudre</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2025/535/</guid>
	
	<link>http://www.ipol.im/pub/art/2025/535/</link>
	<pubDate>Sat, 01 Feb 2025 00:00:00 +0100</pubDate>
	<dcterms:modified>2025-02-01T16:53:54Z</dcterms:modified>
	<description>In this work, we present the semiogram, an innovative multidimensional tool based on inertial sensors, developed for gait evaluation and monitoring of patients with neurological disorders in routine clinical practice. It was previously published and validated by Voisard et al. [C. Voisard, N. de l&amp;#x27;Escalopier, A. Vienne-Jumeau, A. Moreau, F. Quijoux, F. Bompaire, M. Sallansonnet, M-L. Brechemier, I. Taifas, C. Tafani, E. Drouard, N. Vayatis, D. Ricard and L. Oudre, Innovative Multidimensional Gait Evaluation using IMU in Multiple Sclerosis: introducing the Semiogram, Frontiers in Neurology, 2023]. This tool offers a quantitative semiological analysis based on average speed and 16 other gait parameters, grouped into 7 criteria recognized in the literature: sturdiness, springiness, steadiness, stability, smoothness, synchronization, and symmetry. The provided visualization aims to facilitate easy interpretation by the clinician.</description>
</item>
<item>
	<title>A Review of t-SNE</title>
	
	<dc:creator>Sangwon Jung,
Tristan Dagobert,
Jean-Michel Morel,
Gabriele Facciolo</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/528/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/528/</link>
	<pubDate>Thu, 31 Oct 2024 00:00:00 +0100</pubDate>
	<dcterms:modified>2024-10-31T14:35:08Z</dcterms:modified>
	<description>High-dimensional data is difficult to visualize. t-Distributed Stochastic Neighbor Embedding (t-SNE) is a popular dimensionality reduction technique that enables a planar visualization of a dataset while preserving its metric structure as much as possible. This paper explores the theoretical background of t-SNE and its accelerated version. We also compare the performance of t-SNE on various datasets of different dimensions.</description>
</item>
<item>
	<title>Non-local Matching of Superpixel-based Deep Features for Color Transfer and Colorization</title>
	
	<dc:creator>Roxane Leduc,
Hernan Carrillo,
Nicolas Papadakis</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/522/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/522/</link>
	<pubDate>Tue, 29 Oct 2024 00:00:00 +0100</pubDate>
	<dcterms:modified>2024-10-29T18:30:48Z</dcterms:modified>
	<description>In this article, we give a thorough description of the algorithm proposed in [H. Carrillo, M. Cl&amp;#xE9;ment and A. Bugeau, Non-local matching of superpixel-based deep features for color transfer, VISAPP, 2022] for color transfer by relying on a robust non-local correspondence between low-level features at high resolution. An adaptation of this method for colorization is also described. We highlight the overall relevant results obtained with this technique for both applications and also show its limitations.</description>
</item>
<item>
	<title>Accelerating NeRF with the Visual Hull</title>
	
	<dc:creator>Roger Mar&#xED;</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/553/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/553/</link>
	<pubDate>Fri, 26 Jul 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-07-26T07:33:30Z</dcterms:modified>
	<description>Neural rendering methods for learning the appearance and geometry of 3D scenes have gained tremendous popularity since 2020. In this field, NeRF or Neural Radiance Fields is the best-known methodology. Given a collection of multi-view images and their camera models, NeRF optimizes a neural network to learn the color and scene geometry that render the input images according to classical volumetric rendering techniques. NeRF operates in a self-supervised manner and provides a remarkable level of detail, but the time-consuming optimization process remains a major limitation. This paper reviews the Voxel-Accelerated NeRF (VaxNeRF), a simple acceleration strategy for NeRF proposed in 2021. VaxNeRF reduces the number of point queries required in training and inference time by considering only the region of space corresponding to the visual hull, i.e., the maximum volume compatible with the object silhouettes given by the multi-view collection. VaxNeRF requires only coarse foreground-background segmentation masks and minimal changes to the original NeRF code to improve speed by a factor of 2-8, without any performance degradation.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source code is [[available here|https://github.com/naruya/VaxNeRF]] (last checked 2024/07/12).**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>A Brief Evaluation of InSAR Phase Denoising and Coherence Estimation with Phi-Net</title>
	
	<dc:creator>Roland Akiki,
J&#xE9;r&#xE9;my Anger,
Carlo de Franchis,
Gabriele Facciolo,
Rapha&#xEB;l Grandin,
Jean-Michel Morel</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/549/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/549/</link>
	<pubDate>Fri, 26 Jul 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-07-26T07:07:22Z</dcterms:modified>
	<description>In this article, we examine the joint InSAR phase denoising and coherence estimation performance of the network known as Phi-Net [Sica et al., IEEE Transactions on Geoscience and Remote Sensing, 2021]. We briefly examine the method, network architecture, training data and strategy. Then, in the experimental section, we compare the network&amp;#x27;s performance against the simple boxcar uniform filter. We verify the observations made by the authors, in particular concerning the superior denoising performance and preservation of fine details in the coherence estimation. Our experiments also indicate that an end-to-end deep learning method might bring a small improvement to the patch-based approach adopted in Phi-Net. 

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source code is [[available here|https://github.com/DLRRadarScienceGroup/Phi-Net]] (last checked 2024/07/10).**&amp;#x3C;br&amp;#x3E;
 </description>
</item>
<item>
	<title>Survival Forest for Left-Truncated Right-Censored Data</title>
	
	<dc:creator>Vincent Laurent,
Olivier Vo Van</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/466/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/466/</link>
	<pubDate>Tue, 16 Jul 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-07-16T14:52:37Z</dcterms:modified>
	<description>The estimation of the lifetime of a piece of industrial equipment or of a patient is often based on censored data, because the event of interest is observed only for a subsample of observations. Applying the Random Forest algorithm to industrial data is relevant because the algorithm performs robustly in many applications. Coupled with survival approaches, it can produce time trajectories for each subset of the feature space and thus differentiate observed objects with respect to their lifetimes. Our work aims to generalize the existing tree-based approach CART applied to left-truncated right-censored data to obtain a Random Forest algorithm. We provide a simple API to use this algorithm as well as tools to validate a temporal score against censored data.</description>
</item>
<item>
	<title>Dehazing with Dark Channel Prior: Analysis and Implementation</title>
	
	<dc:creator>Jose-Luis Lisani,
Charles Hessel</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/530/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/530/</link>
	<pubDate>Mon, 24 Jun 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-06-23T11:50:12Z</dcterms:modified>
	<description>In outdoor scenes, atmospheric absorption and scattering attenuate the radiance received by the
camera and may produce haze. In 2009, He et al. proposed a simple but effective dehazing
algorithm based on a hypothesis called the &amp;#x27;dark channel prior&amp;#x27; (DCP).
Several other dehazing methods based on this prior have been published in recent years.
In this paper, we review the original algorithm by He et al., together with some
subsequent improvements proposed by the same and other authors. We also analyze the effect of the
parameters on the results, and we study a variant of the method proposed by Drews et al.
for the analysis of haze in underwater images.</description>
</item>
<item>
	<title>A Brief Analysis of SLAVC method for Sound Source Localization</title>
	
	<dc:creator>Xavier Juanola,
Gloria Haro</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/525/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/525/</link>
	<pubDate>Wed, 29 May 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-05-29T17:10:19Z</dcterms:modified>
	<description>Mo and Morgado introduced in 2022 a novel self-supervised learning approach for Visual Sound Source Localization, denoted as SLAVC [Mo, S. and Morgado, P., A Closer Look at Weakly-Supervised Audio-Visual Source Localization, Advances in Neural Information Processing Systems, 2022]. The proposed method is based on multiple-instance contrastive learning. In addition to improving on the results of previous methods, it also addresses two critical problems that former methods faced: 1) excessive overfitting despite training on extensive datasets, and 2) a tendency to hallucinate sound sources even without visual evidence in the video to support them. In this paper, we briefly present the method, offer an online executable version that allows users to test it on their own image-audio pairs, and propose as future work some improvements that could benefit the model.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source code is [[available here|https://github.com/stoneMo/SLAVC]] (last checked 2024/05/26).**&amp;#x3C;br&amp;#x3E;</description>
</item>
<item>
	<title>A Short Analysis of BigColor for Image Colorization</title>
	
	<dc:creator>Rosana Garc&#xED;a,
Gregory Randall,
Lara Raad</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/542/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/542/</link>
	<pubDate>Fri, 24 May 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-05-24T15:56:17Z</dcterms:modified>
	<description>This work analyzes the BigColor method, a fully automatic colorization approach that aims to meet the challenge of providing realistic and vivid colorization for complex and diverse images in real-world scenarios. The method is a BigGAN-inspired encoder-generator network, using a spatial feature map, enabling single forward-pass colorization, supporting arbitrary input resolutions, and producing multimodal colorization results. We provide a short analysis of the method&amp;#x27;s results and highlight some limitations alongside its achievements.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source code is [[available here|https://github.com/KIMGEONUNG/BigColor]] (last checked 2024/05/23).**&amp;#x3C;br&amp;#x3E;</description>
</item>
<item>
	<title>A Brief Analysis of iColoriT for Interactive Image Colorization</title>
	
	<dc:creator>Rosana Garc&#xED;a,
Gregory Randall,
Lara Raad</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/539/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/539/</link>
	<pubDate>Wed, 22 May 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-05-22T08:55:02Z</dcterms:modified>
	<description>This paper briefly describes and analyzes iColoriT, a hybrid colorization method based on a Vision Transformer that propagates user hints to relevant regions of a grayscale image while using color priors learned from a large image dataset. This approach gives users more control over color inference and shows a quick way to achieve results.

**This is an MLBriefs article, the source code has not been reviewed!**&amp;#x3C;br&amp;#x3E;
**The original source code is [[available here|https://github.com/pmh9960/iColoriT]] (last checked 2024/05/16).**&amp;#x3C;br&amp;#x3E;</description>
</item>
<item>
	<title>A Brief Analysis of the Generic Framework for the Structured Abstraction of Images</title>
	
	<dc:creator>Noura Faraj,
Luc&#xED;a Bouza,
Julie Delon</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/495/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/495/</link>
	<pubDate>Tue, 14 May 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-05-14T09:44:03Z</dcterms:modified>
	<description>In this study, we present a simplified implementation of a versatile framework for structured image abstraction, as outlined in [Faraj et al., A Generic Framework for the Structured Abstraction of Images, International Symposium on Non-Photorealistic Animation and Rendering, 2017]. The framework relies on the topographic map, a hierarchical and geometric representation composed of all shapes of an image, organized in a tree structure. Within this framework, abstract renderings of digital photographs can be generated by iteratively applying simple local operations like replacement, removal, or rotation of shapes. These operations give rise to a diverse spectrum of renderings, spanning from geometrical abstraction and painting-like effects to style transfer. We perform a brief analysis of the results produced by the method, highlighting its quality and limitations.</description>
</item>
<item>
	<title>Image Forgery Detection Based on Noise Inspection: Analysis and Refinement of the Noisesniffer Method</title>
	
	<dc:creator>Marina Gardella,
Pablo Mus&#xE9;,
Miguel Colom,
Jean-Michel Morel</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/462/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/462/</link>
	<pubDate>Thu, 04 Apr 2024 00:00:00 +0200</pubDate>
	<dcterms:modified>2024-04-04T08:27:59Z</dcterms:modified>
	<description>Images undergo a complex processing chain from the moment light reaches the camera&amp;#x27;s sensor until the final digital image is delivered. Each of its operations leaves traces on the noise model which enable forgery detection through noise analysis. In this article, we describe the Noisesniffer method [Gardella et al., Noisesniffer: a Fully Automatic Image Forgery Detector Based on Noise Analysis, IEEE International Workshop on Biometrics and Forensics, 2021]. This method estimates for each image a background stochastic model which makes it possible to detect local noise anomalies characterized by their number of false alarms. We improve on the original formulation of the method by introducing a region-growing algorithm to detect local deviations from the background model. Results show that the proposed method outperforms the previous version as well as the state of the art.</description>
</item>
<item>
	<title>Localization and Image Reconstruction in a STORM Based Super-resolution Microscope</title>
	
	<dc:creator>Pranjal Choudhury,
Bosanta Ranjan Boruah</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/496/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/496/</link>
	<pubDate>Wed, 28 Feb 2024 00:00:00 +0100</pubDate>
	<dcterms:modified>2024-02-28T11:26:14Z</dcterms:modified>
	<description>In this paper, we present a comprehensive Python program for localizing the point spread functions (PSFs) present in a stack of images and thereby rendering a super-resolved image in Stochastic Optical Reconstruction Microscopy (STORM). A microscope that provides super-resolved images is known as a super-resolution microscope. Optical super-resolution microscopy is playing a pivotal role in advancing the field of optical imaging and has found applications in a number of areas such as cellular biology, biotechnology, medical research, and nanotechnology. The proposed Python program utilizes image processing techniques to accurately identify the PSFs present in highly noisy images with densely packed fluorescent objects. Our program not only provides all the necessary tools for image reconstruction in a STORM microscope under an open source license but also offers certain advantages over existing reconstruction software packages. These advantages include the option to run the reconstruction process and the visualization of the rendered super-resolved image in parallel with image acquisition, and the disposal of images immediately after acquisition to minimize disk usage. Parallel visualization of the reconstructed image makes it possible to abort the image acquisition if the images are not suitable for super-resolution, thereby saving valuable time. Our Python program is demonstrated using a number of different image stacks. The proposed software can be applied not only to STORM but also to any other super-resolution technique using single-molecule localization.</description>
</item>
<item>
	<title>Line Segment Detection: a Review of the 2022 State of the Art</title>
	
	<dc:creator>Thibaud Ehret,
Jean-Michel Morel</dc:creator>
	
	
	  <guid>http://www.ipol.im/pub/art/2024/481/</guid>
	
	<link>http://www.ipol.im/pub/art/2024/481/</link>
	<pubDate>Wed, 28 Feb 2024 00:00:00 +0100</pubDate>
	<dcterms:modified>2024-02-28T11:02:56Z</dcterms:modified>
	<description>We compare nine line segment detectors. The two oldest are based on classical edge growing followed by a statistically founded validation. The next six are very recent and based on supervised deep learning.
These six deep learning methods train and validate their neural networks on two datasets ('YorkUrban', 'Wireframe'); most of them compare their results with the now classic LSD (Line Segment Detector) and EDlines, and outperform both on these datasets. The ninth paper combines deep learning and classical edge growing to achieve a purely unsupervised method.
The seven machine learning based detectors and EDlines are described here. LSD and EDlines are parameter-free, fixed to allow for one false alarm on average. Our experiments show that the six purely ML-based line segment detectors show significant variability with respect to their end parameters, leading to apparently missed or irrelevant detections. We also compared all nine detectors on two images: one clearly "in domain" for the 'Wireframe' dataset, and the other slightly out of domain. A quantitative comparison would be fallacious. Indeed, while differing in their search strategy, the statistical detectors share a very similar definition and decision threshold for line segments. The purely ML-based detectors have learned from human annotators who were directed to reconstruct architectures as wireframes. Hence, these algorithms aim at a different goal, the architectural interpretation of the scene. Yet, several of them have more complete goals than just line segment detection. Indeed, several of them also associate a descriptor with each segment, and aim at making the segment+descriptor pair fit for image matching. Readers are invited to judge for themselves the advantages and drawbacks of all methods by submitting their own images to the online demos associated with the present paper.

**This is an MLBriefs article, the source code has not been reviewed!**&lt;br&gt;
The original implementations of the methods are available at the following 
links:
[[LETR|https://github.com/mlpc-ucsd/LETR]],
[[TP-LSD|https://github.com/Siyuada7/TP-LSD]],
[[M-LSD|https://github.com/navervision/mlsd]],
[[SOLD2|https://github.com/cvg/SOLD2]],
[[ULSD|https://github.com/lh9171338/ULSD-ISPRS]],
[[AFM|https://github.com/cherubicXN/afm_cvpr2019]],
[[EDlines|https://github.com/CihanTopal/ED_Lib]],
[[DeepLSD|https://github.com/cvg/DeepLSD]].</description>
</item>

</channel>
</rss>
