IPOL

IPOL IPOL http://www.ipol.im/feed/ IPOL Articles — Latest articles published in IPOL. ikiwiki 2025-03-12T14:15:47Z Specularity in NeRFs: A Comparative Study of Ref-NeRF and NRFF http://www.ipol.im/pub/art/2025/562/ Albert Barreiro, Roger Marí, Rafael Redondo, Gloria Haro, Carles Bosch, David Berga 2025-03-12T14:15:47Z 2025-03-11T23:00:00Z Neural Radiance Fields (NeRF) have emerged as a leading technology for 3D digitization, especially for their high accuracy and intricate detailing. Despite their advancements, early NeRF models struggle to handle reflections on specular surfaces effectively. To address this, alternative approaches such as Ref-NeRF and NRFF were proposed to improve fidelity in representing this physical phenomenon. This study compares these two models, providing an analysis of their effectiveness and limitations in dealing with complex specularities. We demonstrate that both methods struggle with inter-reflections and tend to model anisotropic specularities by altering the predicted surface normals. **This is an MLBriefs article, the source code has not been reviewed!** **The original source codes are available here: [[Ref-NeRF|https://github.com/kakaobrain/NeRF-Factory]] and [[NRFF|https://github.com/imkanghan/nrff]] (last checked 2025/02/06).** Latent Diffusion Approaches for Conditional Generation of Aerial Imagery: A Study http://www.ipol.im/pub/art/2025/580/ Roger Marí, Rafael Redondo 2025-03-11T08:34:52Z 2025-03-10T23:00:00Z Generative artificial intelligence is increasingly being applied in diverse areas such as architecture design, music composition, or character animation. Among the generative methods, diffusion models are today the state of the art in the synthesis of high quality images with inherent diversity and realism. This paper aims to evaluate the fidelity and realism of the synthesis achieved by different architectural variations of a latent diffusion model, which is used to generate aerial images conditioned to semantic maps. 
As shown in the results, the diffusion model tends to correctly capture the overall semantic structure and generates realistic textures, often with a lack of fine-grained detail. Among the conditioning variations, cross-attention layers were crucial to outline the semantic segments more accurately and exploit conditional data more effectively. **This is an MLBriefs article, the source code has not been reviewed!** Semiogram: a Visual Tool for Gait Quantification in Routine Neurological Follow-Up http://www.ipol.im/pub/art/2025/535/ Cyril Voisard, Nicolas de l'Escalopier, Damien Ricard, Laurent Oudre 2025-02-01T16:53:54Z 2025-01-31T23:00:00Z In this work, we present an innovative multidimensional tool developed for gait evaluation and monitoring in patients with neurological disorders in routine clinical practice using Inertial Sensors, named semiogram. It has previously been published and validated by Voisard et al. [C. Voisard, N. de l'Escalopier, A. Vienne-Jumeau, A. Moreau, F. Quijoux, F. Bompaire, M. Sallansonnet, M-L. Brechemier, I. Taifas, C. Tafani, E. Drouard, N. Vayatis, D. Ricard and L. Oudre, Innovative Multidimensional Gait Evaluation using IMU in Multiple Sclerosis: introducing the Semiogram, Frontiers in Neurology, 2023]. This tool offers a quantitative semiological analysis based on average speed and 16 other gait parameters, grouped into 7 criteria recognized in the literature: sturdiness, springiness, steadiness, stability, smoothness, synchronization, and symmetry. The provided visualization aims to facilitate easy interpretation by the clinician. A Review of t-SNE http://www.ipol.im/pub/art/2024/528/ Sangwon Jung, Tristan Dagobert, Jean-Michel Morel, Gabriele Facciolo 2024-10-31T14:35:08Z 2024-10-30T23:00:00Z High dimensional data is difficult to visualize. T-Distributed Stochastic Neighbor Embedding (t-SNE) is a popular technique for dimensionality reduction enabling a planar visualization of a dataset preserving as much as possible its metric. 
This paper explores the theoretical background of t-SNE and its accelerated version. A comparison of the performance of t-SNE on various datasets with different dimensions is also performed. Non-local Matching of Superpixel-based Deep Features for Color Transfer and Colorization http://www.ipol.im/pub/art/2024/522/ Roxane Leduc, Hernan Carrillo, Nicolas Papadakis 2024-10-29T18:30:48Z 2024-10-28T23:00:00Z In this article, we give a thorough description of the algorithm proposed in [H. Carrillo, M. Clément and A. Bugeau, Non-local matching of superpixel-based deep features for color transfer, VISAPP, 2022] for color transfer by relying on a robust non-local correspondence between low-level features at high resolution. An adaptation of this method for colorization is also described. We highlight the overall relevant results obtained with this technique for both applications and also show its limitations. Accelerating NeRF with the Visual Hull http://www.ipol.im/pub/art/2024/553/ Roger Marí 2024-07-26T07:33:30Z 2024-07-25T22:00:00Z Neural rendering methods for learning the appearance and geometry of 3D scenes have gained tremendous popularity since 2020. In this field, NeRF or Neural Radiance Fields is the best-known methodology. Given a collection of multi-view images and their camera models, NeRF optimizes a neural network to learn the color and scene geometry that render the input images according to classical volumetric rendering techniques. NeRF operates in a self-supervised manner and provides a remarkable level of detail, but the time-consuming optimization process remains a major limitation. This paper reviews the Voxel-Accelerated NeRF (VaxNeRF), a simple acceleration strategy for NeRF proposed in 2021. VaxNeRF reduces the number of point queries required in training and inference time by considering only the region of space corresponding to the visual hull, i.e., the maximum volume compatible with the object silhouettes given by the multi-view collection. 
VaxNeRF requires only coarse foreground-background segmentation masks and minimal changes to the original NeRF code to improve speed by a factor of 2-8, without any performance degradation. **This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/naruya/VaxNeRF]] (last checked 2024/07/12).** A Brief Evaluation of InSAR Phase Denoising and Coherence Estimation with Phi-Net http://www.ipol.im/pub/art/2024/549/ Roland Akiki, Jérémy Anger, Carlo de Franchis, Gabriele Facciolo, Raphaël Grandin, Jean-Michel Morel 2024-07-26T07:07:22Z 2024-07-25T22:00:00Z In this article, we examine the joint InSAR phase denoising and coherence estimation performance of the network known as Phi-Net [Sica et al., IEEE Transactions on Geoscience and Remote Sensing, 2021]. We briefly examine the method, network architecture, training data and strategy. Then, in the experimental section, we compare the network's performance against the simple boxcar uniform filter. We verify the observations made by the authors, in particular concerning the superior denoising performance and preservation of fine details in the coherence estimation. Our experiments also indicate that an end-to-end deep learning method might bring a small improvement to the patch-based approach adopted in Phi-Net. **This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/DLRRadarScienceGroup/Phi-Net]] (last checked 2024/07/10).** Survival Forest for Left-Truncated Right-Censored Data http://www.ipol.im/pub/art/2024/466/ Vincent Laurent, Olivier Vo Van 2024-07-16T14:52:37Z 2024-07-15T22:00:00Z The estimation of the lifetime of a piece of industrial equipment or of a patient is often based on censored data, because the event of interest is observed only for a subsample of observations. 
The use of the Random Forest algorithm applied to industrial data is relevant because the algorithm performs robustly in many applications. Coupled with survival approaches, it can produce time trajectories for each subset of the feature space and thus differentiate observed objects with respect to their lifetimes. Our work aims to generalize the existing tree-based approach CART applied to left-truncated right-censored data to obtain a Random Forest algorithm. We provide a simple API to use such an algorithm as well as tools to validate a temporal score against censored data. Dehazing with Dark Channel Prior: Analysis and Implementation http://www.ipol.im/pub/art/2024/530/ Jose-Luis Lisani, Charles Hessel 2024-06-23T11:50:12Z 2024-06-23T22:00:00Z In outdoor scenes, atmospheric absorption and scattering attenuate the radiance received by the camera and may produce haze. In 2009 He et al. proposed a simple but effective dehazing algorithm based on a hypothesis called the 'dark channel prior' (DCP). Based on this prior several other dehazing methods have been published in recent years. In this paper we review the original algorithm by He et al., together with some later improvements proposed by the same and other authors. We also analyze the effect of the parameters on the results and we study a variant of the method proposed by Drews et al. for the analysis of haze in underwater images. A Brief Analysis of SLAVC method for Sound Source Localization http://www.ipol.im/pub/art/2024/525/ Xavier Juanola, Gloria Haro 2024-05-29T17:10:19Z 2024-05-28T22:00:00Z Mo and Morgado introduced in 2022 a novel self-supervised learning approach for Visual Sound Source Localization, denoted as SLAVC [Mo, S. and Morgado, P., A Closer Look at Weakly-Supervised Audio-Visual Source Localization, Advances in Neural Information Processing Systems, 2022]. The proposed method is based on multiple-instance contrastive learning. 
In addition to improving the results of previous methods, it also solves two critical problems that former methods faced: 1) excessive overfitting despite training on extensive datasets, 2) tendency to hallucinate sound sources even without visual evidence to support it in the video. In this paper, we briefly present the method, offer an online executable version allowing the users to test it on their own image-audio pairs and propose some improvements that could benefit the model as future work. **This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/stoneMo/SLAVC]] (last checked 2024/05/26).** A Short Analysis of BigColor for Image Colorization http://www.ipol.im/pub/art/2024/542/ Rosana García, Gregory Randall, Lara Raad 2024-05-24T15:56:17Z 2024-05-23T22:00:00Z This work analyzes the BigColor method, a fully automatic colorization approach that aims to meet the challenge of providing realistic and vivid colorization for complex and diverse images in real-world scenarios. The method is a BigGAN-inspired encoder-generator network, using a spatial feature map, enabling single forward-pass colorization, supporting arbitrary input resolutions, and producing multimodal colorization results. We provide a short analysis of the method's results and highlight some limitations alongside its achievements. 
**This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/KIMGEONUNG/BigColor]] (last checked 2024/05/23).** A Brief Analysis of iColoriT for Interactive Image Colorization http://www.ipol.im/pub/art/2024/539/ Rosana García, Gregory Randall, Lara Raad 2024-05-22T08:55:02Z 2024-05-21T22:00:00Z This paper briefly describes and analyzes iColoriT, a hybrid colorization method based on a Vision Transformer that propagates user hints to relevant regions of a grayscale image while using color priors learned from a large image dataset. This approach gives users more control over color inference and shows a quick way to achieve results. **This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/pmh9960/iColoriT]] (last checked 2024/05/16).** A Brief Analysis of the Generic Framework for the Structured Abstraction of Images http://www.ipol.im/pub/art/2024/495/ Noura Faraj, Lucía Bouza, Julie Delon 2024-05-14T09:44:03Z 2024-05-13T22:00:00Z In this study, we present a simplified implementation of a versatile framework for structured image abstraction, as outlined in [Faraj et al., A Generic Framework for the Structured Abstraction of Images, International Symposium on Non-Photorealistic Animation and Rendering, 2017]. The framework relies on the topographic map, a hierarchical and geometric representation composed of all shapes of an image, organized in a tree structure. Within this framework, abstract renderings of digital photographs can be generated by iteratively applying simple local operations like replacement, removal, or rotation of shapes. These operations give rise to a diverse spectrum of renderings, spanning from geometrical abstraction and painting-like effects to style transfer. We perform a brief analysis of the results produced by the method, highlighting its quality and limitations. 
Image Forgery Detection Based on Noise Inspection: Analysis and Refinement of the Noisesniffer Method http://www.ipol.im/pub/art/2024/462/ Marina Gardella, Pablo Musé, Miguel Colom, Jean-Michel Morel 2024-04-04T08:27:59Z 2024-04-03T22:00:00Z Images undergo a complex processing chain from the moment light reaches the camera's sensor until the final digital image is delivered. Each of its operations leaves traces on the noise model which enable forgery detection through noise analysis. In this article, we describe the Noisesniffer method [Gardella et al., Noisesniffer: a Fully Automatic Image Forgery Detector Based on Noise Analysis, IEEE International Workshop on Biometrics and Forensics, 2021]. This method estimates for each image a background stochastic model which makes it possible to detect local noise anomalies characterized by their number of false alarms. We improve on the original formulation of the method by introducing a region-growing algorithm to detect local deviations from the background model. Results show that the proposed method outperforms the previous version as well as the state of the art. Localization and Image Reconstruction in a STORM Based Super-resolution Microscope http://www.ipol.im/pub/art/2024/496/ Pranjal Choudhury, Bosanta Ranjan Boruah 2024-02-28T11:26:14Z 2024-02-27T23:00:00Z In this paper, we present a comprehensive Python program for localizing the point spread functions (PSFs) present in a stack of images and thereby rendering a super-resolved image in a Stochastic Optical Reconstruction Microscopy (STORM). A microscope that provides super-resolved images is known as a super-resolution microscope. Optical super-resolution microscopy is playing a pivotal role in advancing the field of optical imaging and has found applications in a number of areas such as cellular biology, biotechnology, medical research, and nanotechnology. 
The proposed Python program utilizes image processing techniques to accurately identify the PSFs present in highly noisy images with densely packed fluorescent objects. Our program not only provides all the necessary tools for image reconstruction in a STORM microscope under open source license but also offers certain advantages over the existing reconstruction software packages. These advantages include an option to start the reconstruction process and the visualization of the rendered super-resolved image in parallel with image acquisition, and the disposal of the images immediately after acquisition for minimum use of disk space. Parallel visualization of the reconstructed image allows aborting the image acquisition when the images are not suitable for super-resolution, thereby saving valuable time. Our Python program is demonstrated using a number of different image stacks. The proposed software code can be applied not only to STORM but also to any other super-resolution technique using single-molecule localization. Line Segment Detection: a Review of the 2022 State of the Art http://www.ipol.im/pub/art/2024/481/ Thibaud Ehret, Jean-Michel Morel 2024-02-28T11:02:56Z 2024-02-27T23:00:00Z We compare nine line segment detectors. The two oldest ones are based on classical edge growing followed by a statistically founded validation. The next six are very recent and based on supervised deep learning. These six deep learning methods train and validate their neural network on two datasets ('YorkUrban', 'Wireframe'); most of them compared their results with the now classic LSD (Line Segment Detector) and EDlines, and outperform them on these datasets. The ninth paper combines deep learning and classical edge growing to achieve a purely unsupervised method. The seven machine learning based detectors and EDlines are described here. LSD and EDlines are parameter-free, fixed to allow for one false alarm on average. 
Our experiments show that the six purely ML based line segment detectors show significant variability with respect to their end-parameters, leading to apparent missed or irrelevant detections. We also compared all nine detectors on two images: one clearly "in domain" for the 'Wireframe' dataset, and the other one slightly out of domain. A quantitative comparison would be fallacious. Indeed, while differing in their search strategy, the statistical detectors share a very similar definition and decision threshold for line segments. The purely ML-based detectors have learned from human annotators that were directed at reconstructing architectures as wireframes. Hence, these algorithms aim at a different goal, the architectural interpretation of the scene. Yet, several of them have more complete goals than just line segment detection. Indeed, several of them also associate a descriptor with each segment, and aim at making the pair segment+descriptor fit for image matching. The readers are invited to judge for themselves the advantages and drawbacks of all methods by submitting their own images to the online demos associated with the present paper. **This is an MLBriefs article, the source code has not been reviewed!** The original implementations of the methods are available at the following links: [[LETR|https://github.com/mlpc-ucsd/LETR]], [[TP-LSD|https://github.com/Siyuada7/TP-LSD]], [[M-LSD|https://github.com/navervision/mlsd]], [[SOLD2|https://github.com/cvg/SOLD2]], [[ULSD|https://github.com/lh9171338/ULSD-ISPRS]], [[AFM|https://github.com/cherubicXN/afm_cvpr2019]], [[EDlines|https://github.com/CihanTopal/ED_Lib]], [[DeepLSD|https://github.com/cvg/DeepLSD]]. 
Comparing Interactive Image Segmentation Models under Different Clicking Procedures http://www.ipol.im/pub/art/2024/498/ Franco Marchesoni-Acland 2024-01-19T08:46:51Z 2024-01-18T23:00:00Z Interactive image segmentation (IIS) methods are usually evaluated in terms of segmentation performance vs. number of clicks (NoC). However, the automatic evaluation depends on a clicking procedure and its relation to the procedure used for training. In this work we compare qualitatively and quantitatively two state-of-the-art IIS methods that report the best performances but have not been compared against each other. We show i) which method is better, ii) that the performance is sensitive to clicking procedures, iii) which method is more robust to clicking procedures, and iv) that training with a specific clicking procedure does not guarantee the best performance using it. **This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/SamsungLabs/ritm_interactive_segmentation]] (last checked 2023/09/12).** On the Domain Generalization Capabilities of Interactive Segmentation Methods http://www.ipol.im/pub/art/2024/499/ Franco Marchesoni-Acland, Tanguy Magne, Fayçal Rekbi, Gabriele Facciolo 2024-01-19T09:04:35Z 2024-01-18T23:00:00Z Interactive image segmentation (IIS) methods are usually trained over segmentation datasets containing natural images. They are also usually evaluated over natural images. However, the most common use case is the annotation of new images from a different domain. Yet, the performance of IIS methods on a different domain is seldom reported. In this work, we evaluate a state-of-the-art IIS method trained with natural images over an aerial image dataset. Its performance is compared to the performances the method achieves when being trained/finetuned with aerial images. The comparison reveals that there is a large domain generalization gap. 
**This is an MLBriefs article, the source code has not been reviewed!** **The original source code is [[available here|https://github.com/SamsungLabs/ritm_interactive_segmentation]] (last checked 2023/09/12).** Arm-CODA: A Data Set of Upper-limb Human Movement During Routine Examination http://www.ipol.im/pub/art/2024/494/ Sylvain W. Combettes, Paul Boniol, Antoine Mazarguil, Danping Wang, Diego Vaquero-Ramos, Marion Chauveau, Laurent Oudre, Nicolas Vayatis, Pierre-Paul Vidal, Alexandra Roren, Marie-Martine Lefèvre-Colau 2024-01-18T19:16:34Z 2024-01-17T23:00:00Z This article thoroughly describes a data set of 240 multivariate time series collected using 34 Cartesian Optoelectronic Dynamic Anthropometer (CODA) markers placed on the upper limb of 16 healthy subjects each undergoing 15 predefined movements such as raising their arms or combing their hair. Each sensor records its position in the 3D space. In total, 2.5 hours of time series are collected. A remarkable aspect of this data set is the extensive availability of metadata: subjects' characteristics (age, height, etc.) as well as movements' annotations. Indeed, for each subject and each movement, the start and end time stamps of at least two iterations of the same movement are provided. In addition to the study of human motion, this data set can be used to evaluate generic time series analytical tasks such as multivariate time series segmentation, clustering or classification. Implementation of Image Denoising based on Backward Stochastic Differential Equations http://www.ipol.im/pub/art/2023/467/ Dariusz Borkowski 2023-12-09T19:22:41Z 2023-12-08T23:00:00Z In this paper, we give the implementation of an image denoising algorithm based on backward stochastic differential equations. In our algorithm, we consider two stochastic processes. 
One of them has values in the image domain and determines pixels that will be involved in the reconstruction, the second one has values in the image codomain and gives weights to values of pixels. The reconstructed image is characterized by smoothing noisy pixels and at the same time enhancing edges. Our experiments show that the new approach gives very good results and can be successfully used to reconstruct images. A Reference Data Set for the Study of Healthy Subject Gait with Inertial Measurements Units http://www.ipol.im/pub/art/2023/497/ Cyril Voisard, Nicolas de l’Escalopier, Albane Moreau, Alienor Vienne-Jumeau, Damien Ricard, Laurent Oudre 2023-12-08T19:23:42Z 2023-12-07T23:00:00Z This article provides a comprehensive description of a dataset consisting of 110 multivariate gait signals collected using three inertial measurement units. The data was obtained from a sample of 19 healthy subjects who followed a predefined protocol: standing still, walking 10 meters, turning around, walking back, and stopping. One notable aspect of this dataset is the inclusion of extensive signal metadata, including the start and end timestamps of each footstep, along with contextual information for each trial. Part of this dataset was previously used to develop and assess a gait event detection algorithm [Voisard et al., Automatic Gait Events Detection with Inertial Measurement Units: Healthy Subjects and Moderate to Severe Impaired Patients], and as a reference for a multidimensional tool in gait quantification [Voisard et al., Innovative Multidimensional Gait Evaluation using IMU in Multiple Sclerosis: introducing the Semiogram]. 
A Signal-dependent Video Noise Estimator Via Inter-frame Signal Suppression http://www.ipol.im/pub/art/2023/420/ Yanhao Li, Marina Gardella, Quentin Bammey, Tina Nikoukhah, Rafael Grompone von Gioi, Miguel Colom, Jean-Michel Morel 2023-11-09T15:57:59Z 2023-11-08T23:00:00Z We propose a block-based signal-dependent noise estimation method on videos, that leverages inter-frame redundancy to separate noise from signal. Block matching is applied to find block pairs between two consecutive frames with similar signal. Then the Ponomarenko et al. method is extended to video by sorting pairs by their low-frequency energy and estimating noise in the high frequencies. Experiments on a real dataset of drone videos show its performance for different parameter settings and different noise levels. Two extensions of the proposed method using subpixel matching and for multiscale noise estimation are respectively analyzed. OpenCCO: An Implementation of Constrained Constructive Optimization for Generating 2D and 3D Vascular Trees http://www.ipol.im/pub/art/2023/477/ Bertrand Kerautret, Phuc Ngo, Nicolas Passat, Hugues Talbot, Clara Jaquet 2023-11-01T11:34:36Z 2023-10-31T23:00:00Z In this article, we focus on the algorithm called CCO (Constrained Constructive Optimization), initially proposed by Schreiner and Buxbaum [Computer-Optimization of Vascular Trees, IEEE Transactions on Biomedical Engineering, 40, 1993] and further extended by Karch et al. [A Three-Dimensional Model for Arterial Tree Representation, Generated by Constrained Constructive Optimization, Computers in Biology and Medicine, 29, 1999]. This algorithm can be considered as one of the gold standards for vascular tree structure generation. Modeling and/or simulating the morphology of vascular networks is a challenging but crucial task that can have a strong impact on different applications such as fluid simulation or learning processes related to image segmentation. 
Various implementations of CCO were proposed over the last years. However, to the best of our knowledge, there does not exist any open-source version that faithfully follows the native CCO algorithm. Our purpose is to propose such an implementation both in 2D and 3D. Implementing Handheld Burst Super-Resolution http://www.ipol.im/pub/art/2023/460/ Jamy Lafenetre, Gabriele Facciolo, Thomas Eboli 2024-05-28T09:07:17Z 2023-07-15T22:00:00Z Nowadays, smartphone cameras capture bursts of raw photographs whenever the trigger is pressed. These photos are then fused to produce a single picture with higher quality. This paper details the implementation of the method 'Handheld Multi-Frame Super-Resolution algorithm' by Wronski et al. (used in the Google Pixel 3 camera), which performs simultaneously multi-image super-resolution demosaicking and denoising from a burst of images. Hand tremors during exposure cause subpixel motions, which combined with the Bayer color filter array of the sensor results in a collection of aliased and shifted raw photographs of the same scene. The algorithm efficiently aligns and fuses these signals into a single high-resolution one by leveraging the aliasing to reconstruct the high-frequencies of the signal up to the Nyquist rate of the sensor. This approach yields digitally zoomed images up to a factor of 2, which is the limit naturally set by the sensor pixel integration. We present an in-depth description of this algorithm, along with numerous implementation details we have found to reproduce the results of the original paper, whose code is not publicly available. An Overview of GANet - Guided Aggregation Net for End-to-end Stereo Matching http://www.ipol.im/pub/art/2023/441/ Alvaro Gómez 2023-07-16T09:52:35Z 2023-07-15T22:00:00Z Guided Aggregation Net for End-to-end Stereo Matching (GANet) is a stereo matching method that uses Deep Neural Networks (DNN) to compute a disparity map from a pair of images of a scene. 
Like other classic and DNN stereo methods, it follows the traditional stereo steps: dense features are extracted from both images, the cost of matching the features at different disparities is organized in a Cost Volume (CV), which is regularized by aggregation and local filtering, and finally a map with minimal cost is derived from the CV. In GANet, the aggregation of the CV is done by a Semi-Global Guided Aggregation layer (SGA) which implements a differentiable approximation of the well-known Semi-Global Matching (SGM) algorithm. SGA is followed by a Local Guided Aggregation layer (LGA) that performs a local filtering. SGA and LGA weights are generated by an auxiliary guidance subnet fed with the original reference image and its extracted features. This article presents an overview of GANet. An online demo, running on CPU, is made available. **This is an MLBriefs article, the source code has not been reviewed!** The original source code is available [[here|https://github.com/feihuzhang/GANet]] (last checked 2023/07/16).
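
The traditional stereo steps listed in the GANet abstract (matching cost per disparity, cost volume, aggregation, minimal-cost map) can be illustrated with a purely classical toy sketch. This is not GANet: the raw pixel intensities stand in for learned features, a plain box filter stands in for the SGA and LGA layers, and the function names `box_filter` and `wta_disparity` are hypothetical names chosen for this illustration.

```python
import numpy as np


def box_filter(img, radius):
    """Mean over a (2*radius+1)^2 window, replicating edge values at borders."""
    h, w = img.shape
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def wta_disparity(left, right, max_disp, radius=2):
    """Classical pipeline: cost volume -> box aggregation -> argmin per pixel."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    # Cost assigned where left pixel x has no counterpart x - d in the right image.
    invalid_cost = np.abs(left).max() + np.abs(right).max() + 1.0
    cost_volume = np.empty((max_disp + 1, h, w), dtype=np.float64)
    for d in range(max_disp + 1):
        cost = np.full((h, w), invalid_cost)
        # Absolute difference between left pixel x and right pixel x - d.
        cost[:, d:] = np.abs(left[:, d:] - right[:, :w - d])
        # Aggregation step (assumption: simple mean filter, not SGM or SGA).
        cost_volume[d] = box_filter(cost, radius)
    # Winner-takes-all: disparity with minimal aggregated cost.
    return np.argmin(cost_volume, axis=0)


# Synthetic pair: the left view is the right view shifted by 3 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((20, 40))
left_img = np.roll(right_img, 3, axis=1)
disp = wta_disparity(left_img, right_img, max_disp=8, radius=2)
```

On this synthetic pair the recovered disparity should equal the shift away from the left border, where the aggregation window overlaps columns that have no match. A learned method replaces each stage with a trainable, differentiable counterpart.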