LSD: a Line Segment Detector
edit This workshop may eventually be submitted and published.

DRAFT Version - 2 sept 2010

Authors

Summary

LSD is a linear-time Line Segment Detector that gives accurate results, a controlled number of false detections, and requires no parameter tuning. The method is based in Burns, Hanson, and Riseman method, and uses an a contrario validation approach according to Desolneux, Moisan, and Morel theory.

References

  1. Rafael Grompone von Gioi, Jérémie Jakubowicz, Jean-Michel Morel, Gregory Randall, LSD: A Fast Line Segment Detector with a False Detection Control, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 4, pp. 722-732, April 2010. preprint

  2. J. Brian Burns, Allen R. Hanson, Edward M. Riseman, Extracting Straight Lines, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 4, pp. 425-455, 1986.

  3. Agnès Desolneux, Lionel Moisan, Jean-Michel Morel, From Gestalt Theory to Image Analysis, a Probabilistic Approach, Springer 2008.

  4. Rafael Grompone von Gioi, Jérémie Jakubowicz, Jean-Michel Morel, Gregory Randall, On Straight Line Segment Detection, Journal of Mathematical Imaging and Vision, vol. 32, no. 3, pp. 313-347, November 2008. preprint

Source Code

Ansi C implementation of LSD version 1.4 here.

An older version in Megawave2 framework is available here (this version corresponds better to the algorithm described in [1], but does not include the further improvements described here and included in the current version).

Overview

The algorithm is aimed at detecting locally straight edges on images, and that is what we call line segments. Edges are fast transitions from dark to light on an image. On that parts of the image, the gradient of the image is roughly orthogonal to the edge and the level-lines roughly follows the direction of edges, as is shown in the following figure:

The algorithm starts by computing the level-line angle at each pixel to produce a Level-Line Field. Then, this field is segmented into connected regions of pixels that share the same level-line angle, up to a certain tolerance τ. These connected regions are called Line Support Regions:

Each Line Support Region (a set of pixels) is a candidate for a Line Segment. But the corresponding geometrical object (a rectangle in this case) must be associated with it. The principal inertial axis of the Line Support Region is used as main rectangle direction; the size of the rectangle is chosen to cover the full region:

Each rectangle is subject to a validation procedure. The pixels in the rectangle whose level-line angle corresponds to the angle of the rectangle up to a tolerance τ are called aligned points. The total number of pixels in the rectangle, n, and its number of aligned points, k, are counted and used to validate or not the rectangle as a detected Line Segment.

The validation step is based on the a contrario approach and the Helmholtz principle proposed by Morel, Moisan, and Desolneux [3]. The so-called Helmholtz principle states that no perception should be produced on noise images. Accordingly, the a contrario approach propose to define a noise or non-structured model where the desired structure is not present. Then, an event is validated if the expected number of events as good as the observed one is small on the non-structured model.

In the case of Line Segments, we are interested in the number of aligned points and we consider the events that has at least the same number of aligned points as the observed one. Thus, if we observe a rectangle with k aligned points among n points, the expected number of events as good as the observed one is

N_{test} \cdot P_{H_0}[K\geq k]

where N_{test} is the total number of possible rectangles considered, P_{H_0} is the probability on the non-structured model H_0, and K is a random variable on H_0.

The non-structured model for line segment detection is a level-line field where:

On this setting, the probability that a pixel on the non-structured model is an aligned point is

p=\frac{\tau}{\pi}

and, as a consequence of the independence of the random variables, the probability term P_{H_0}[K\geq k] is given by the binomial tail b(n,k,p).

The number of tests N_{test} corresponds to the number of possible rectangles considered, and that is all the rectangles starting and ending in pixels of the image, and the width is an integer number of pixels. In an N\times M image size, the number of tests is

(NM)^{5/2}.

Finally, we define the Number of False Alarms (NFA) associated with a rectangle r as

\textrm{NFA}(r) = (NM)^{5/2}\cdot b(n,k,p).

When the NFA value associated with a rectangle is large, that means that an event as good as the observed one is expected on the non-structured model, i.e., a common event and thus a non relevant one. On the other hand, when the NFA value is small, the event corresponds to a rare event and probably a meaningful one. A threshold ε is selected and rectangles with NFA(r)<ε are called ε-meaningful rectangles and are the detections of the algorithm.

Following Desolneux, Moisan, and Morel [3], we set ε=1 once for all. This corresponds to accepting on average one false detection per image in the non-structured model, which is reasonable. Also, the detection result is not sensible to the value of ε so setting it to any reasonable value would produce very similar results.

Algorithm

The LSD algorithm takes a gray-level image as input and returns a list of detected line segments. The algorithm can be described by the following 11 steps that will be described in detail below:

  1. Scale image to scale S using Gaussian sub-sampling (σ=Σ/S).
  2. Compute the gradient magnitude and level-line orientation at each pixel.
  3. Remove pixels whose gradient magnitude is less than \rho.
  4. Build a list of the remaining pixels, pseudo-ordered according to image gradient magnitude.
  5. For each pixel in the list, starting with the ones with the higher gradient magnitude:
    1. Grow a region of connected pixels belonging to the list, that share the same level-line angle up to tolerance τ.
    2. Compute the rectangular approximation for the connected region of pixels found.
    3. Compute the NFA value for the rectangle found.
    4. If the density of aligned points in the rectangle is less than D, cut the region, until the density restriction is satisfied.
    5. Try to modify the rectangle to improve the NFA value.
    6. If the NFA value < ε, add the rectangle to the output list, and remove its aligned points from the list of pixels.

The algorithm depends on six parameters: S, Σ, \rho, τ, D, and ε. As we will see, all of these parameters have reasonable default values that produce good results in most images. Thus, there is no need for parameter tuning and the algorithm can be used parameterless.

Image scaling

The first step of LSD is to scale the input image. This step can be useful to analyze structures present at different scales of an image. This step can also be skipped (by selecting S=1) to process the actual image. The scale S is, a priori, a choice of the user and in a sense a true parameter or tool to control the analysis performed by LSD. However, the default value sets a scaling to 80% of the image size (S=0.8); the reason is to avoid aliasing and quantization artifacts present in many images while producing almost the same result as a full scale analysis.

The scaling is performed by a Gaussian sub-sampling of the image with σ=Σ/S. The values of Σ is set to 0.6 that gives a good balance between aliasing and bluring of the image.

Gradient computation

The image's gradient is computed at each pixel using a 2x2 mask. Given

\begin{array}{c|c|c|c}
                  \ddots & \vdots    & \vdots      & \cdots \\
                  \hline
                  \cdots & i_{x,y}   & i_{x+1,y}   & \cdots \\
                  \hline
                  \cdots & i_{x,y+1} & i_{x+1,y+1} & \cdots \\
                  \hline
                  \cdots & \vdots    & \vdots      & \ddots \\
                \end{array}

the image gradient is compute as

g_x(x,y)=\frac{i_{x+1,y}+i_{x+1,y+1}-i_{x,y}-i_{x,y+1}}{2},\quad
                g_y(x,y)=\frac{i_{x,y+1}+i_{x+1,y+1}-i_{x,y}-i_{x+1,y}}{2}.

The level-line angle is computed as

\arctan{\left(\frac{g_x(x,y)}{-g_y(x,y)}\right)}

and the gradient norm as

\sqrt{ g_x^2(x,y) + g_y^2(x,y) }.

The reason for using this simple schema is to use the smallest possible mask size in its computation, thus reducing to the minimum possible the dependence of the computed gradient values.

In the non-structured model, the level-line field is composed of independent random variables at each pixel. But the computed level-line field is never strictly independent because adjacent pixel values are used to compute it. This does not prevent to use an a contrario approach, but requires a more involved analysis to compute the detection threshold. Actually, numerical simulations have shown that the same threshold deduced for the case of independent level-line field also controls the number of false detections when computed by the 2x2 mask, see [4].

Gradient threshold

Pixels with small gradient norm corresponds to flat zones or slow gradients. Also, they naturally present a higher error in the gradient computation due to the quantization of its values. In LSD the pixels with gradient magnitude smaller then ρ are rejected and are not used in the constructions of line-support regions or rectangles. The threshold used is defined by

\rho=\frac{q}{\sin\tau}

where q is a bound to the possible error in the gradient value due to quantization effects [1]. In the usual case, the pixel values are quantized to integer values in \{0,1,\dots,255\}. Thus, the maximum possible error in the gradient is 2. The default value is then q=2.

Gradient Pseudo-ordering

Pixels with higher gradient magnitude corresponds better to edges, usually corresponds to more contrasted edges and, in them, to pixels near the center of the edge. Thus, it makes sense to start looking for line segments at pixels with higher gradient magnitude. Ordering the pixels in strict gradient magnitude order is, however, expensive in computational time. It is not possible to be done in linear time.

But a pseudo-ordering of the pixels is possible in linear-time, and that is what is done in LSD. Pixels are classified into bins according the their gradient value, and then a list is done beginning with the pixels in the bin with highest gradient values, then the pixels on the second bin and so on, ending the list with the pixels in the bin with smallest gradient values.

The default values imply the use of 1024 bins, with gradient norms between zero and 255, which cover the full range of possible gradient values when the image gray level values are in the range [0,255].

Region Growing

Starting from each pixel in the list of unused pixels, a region growing algorithm is applied trying to form a line-support region. Recursively, the unused neighbors pixels of the pixels in the region are tested, and the ones that share the level-line angle with the so-called region angle up to a tolerance τ are added to the region. The region angle \theta_{region} starts being the level-line angle of the first point and it is updated each time a pixel is added to the following value

\arctan\left(\frac{\sum_i \sin(\textrm{level-line-angle}_i)}
                       {\sum_i \cos(\textrm{level-line-angle}_i)}\right)

where the index i runs over the pixels in the region. This is a pseudo-mean of the level-line angles of all pixels in the region. The process is repeated until no more pixels can be added to the region. The following pseudo-code gives a precise definition:

  1. The initial point P is added to the Region
  2. \theta_{region} is set to the level-line angle of pixel P
  3. S_x\leftarrow\cos(\theta_{region})
  4. S_y\leftarrow\sin(\theta_{region})
  5. For each pixel P in the Region do:
    1. For each \bar{P} neighbor of P and Status(\bar{P})\neq\textrm{Used} do:
      1. If \textrm{AngleDifference}\left(\theta_{region},
           \textrm{level-line-angle}(\bar{P})\right)<\tau do:
        1. Add \bar{P} to the Region
        2. \textrm{Status}(\bar{P})\leftarrow\textrm{Used}
        3. S_x \leftarrow S_x+\cos(\textrm{LevelLineAngle}
            (\bar{P}))
        4. S_y \leftarrow S_y+\sin(\textrm{LevelLineAngle}
            (\bar{P}))
        5. \theta_{region} \leftarrow \arctan(S_y/S_x)

The neighborhood used is 8-connected, so the neighbors of pixel I(x,y) are I(x-1,y-1), I(x,y-1), I(x+1,y-1), I(x-1,y), I(x+1,y), I(x-1,y+1), I(x,y+1), and I(x+1,y+1).

Rectangular Approximation

A line segment corresponds to a geometrical event, a rectangle. Before evaluating a line-support region, the rectangle associated with it must be found. The region of pixels is interpreted as a solid object and the gradient magnitude of each pixel is used as the mass of that point. Then, the center of mass of the region is used as the center for the rectangle and the main direction of the rectangle is set to the first inertia axis of the region. Finally, the width and length are chosen as the smallest ones that cover the full line-support region.

The center of the rectangle (C_x,C_y) is set to:

C_x = \frac{\sum_{i\in \textrm{Region}} G(i) \cdot x(i)}
                           {\sum_{i\in \textrm{Region}} G(i)}

C_y = \frac{\sum_{i\in \textrm{Region}} G(i) \cdot y(i)}
                           {\sum_{i\in \textrm{Region}} G(i)}

where G(i) is the gradient magnitude of pixel i. The main rectangle's angle is set to the eigenvector's angle of the smaller eigenvalue of the matrix

\left(\begin{array}{cc}
                        I_{xx} & I_{xy} \\
                        I_{xy} & I_{yy} \\
                      \end{array}\right)

with

I_{xx} = \frac{\sum_{i\in\textrm{Region}}G(i)\cdot(x(i)-C_x)^2}
                              {\sum_{i\in\textrm{Region}}G(i)}

I_{yy} = \frac{\sum_{i\in\textrm{Region}}G(i)\cdot(y(i)-C_y)^2}
                              {\sum_{i\in\textrm{Region}}G(i)}

I_{xy} = \frac{\sum_{i\in\textrm{Region}}G(i)\cdot
                (x(i)-C_x)(y(i)-C_y)}{\sum_{i\in\textrm{Region}}G(i)}

NFA Computation

A key concept in the validation of a rectangle is that of p-aligned points (or pixels), that are pixels in the rectangle whose level-line angle is equal to the rectangle's main orientation, up to a tolerance of p\pi. The precision p is initially set to the value \tau/\pi but other values are also tested as we will see later. The total number of pixels in the rectangle is denoted by n and the number of p-aligned points is denoted by k. Then, the Number of False Alarms (NFA) value associated with the rectangle r is

\textrm{NFA}(r) = (NM)^{5/2}\cdot b(n,k,p)

where N and M are the size of the image, and b(n,k,p) is the binomial tail defined by

b(n,k,p)=\sum_{i=k}^{n}\binom{n}{i}p^i(1-p)^{n-i}.

All in all, for each rectangle being evaluated and given a precision p, the number k of p-aligned pixels is counted along with the total number n of pixels in it. Then the NFA value is computed by

\textrm{NFA}(r) = (NM)^{5/2} \cdot
                                  \sum_{i=k}^{n}\binom{n}{i}p^i(1-p)^{n-i}.

The rectangles with \textrm{NFA}(r) < \varepsilon are validated as detections.

Aligned Points Density

In some cases, the angle tolerance τ method produces the wrong interpretation. This problem arises when two straight edges are present in the image forming an angle between them smaller than the tolerance τ. The following image shows an example of a line-support region found (in gray) and the rectangle corresponding to it.

This line-support region would be better interpreted as two thinner rectangles, one longer than the other, forming an obtuse angle.

The only solution proposed in LSD to this problem is to detect it and cut the line-support region into two smaller regions, then the same algorithm is applied to each of the two new regions. The detection is based on the density of aligned points in the rectangle. In a well detected straight edge, the rectangle is well adapted to the line-support region and the density of aligned points is high. When this "angle problem" is present, as can be seen in the previous figure, the density of aligned points is low. Also, when a slightly curved edge is being locally approximated by straight edges, the degree of the approximation (how many line segments are used to cover part of curve) is related to the density of aligned points; this parameter, then, controls how curves are approximated by line segments too.

The density of aligned points of a rectangle is computed as the ratio of the number of aligned points (k in the previous notation) to the area of the rectangle:

d = \frac{k}{\textrm{length}(r) \cdot \textrm{width}(r)}.

A threshold D is defined and rectangles should have a density d larger or equal to D to be accepted. The default value of this parameter is 0.7.

When the density criterion is not satisfied, two methods of cutting the region are tried: reduce angle tolerance and reduce region radius. In both methods, part of the pixels in the region are kept, while the others are marked again as NOT USED, so they can be used again in future line-support regions.

Reduce angle tolerance

The first method is used only once and tries to guess from the region what should be the tolerance τ' that adapts to the region. When two or more straight regions are present, this method is expected to get the tolerance that would get only one of these regions, the one containing the seed pixel. Then, the same region growing algorithm is used again, starting from the same seed pixel but using the newly estimated tolerance value.

If all the pixels in the region were used in the estimation of the tolerance, the new value would be such that all the pixels would still accepted. Instead, only the pixels near the seed pixel are used. Actually, only the pixels whose distance to the seed point is less than the width of the computed rectangle. In that way, the size neighborhood used in the estimation of τ' is related to the size of the computed rectangle.

All the pixels in that neighborhood of the seed point are evaluated, and the new tolerance τ' is set to twice the standard deviation of the level-line angles of these pixels.

Reduce region radius

If the previous method was not successful in producing a line-support region that satisfy the density criterion, a second method is repetitively tried: removing the pixels that are farther from the seed pixel. This method work best when the line-support region corresponds to a curve and the region needs to be reduced until the density criterion is satisfied, usually meaning that a certain degree of approximation to the curve is obtained.

The pixel in the region that is farther from the seed point is used as radius of the region, and at each iteration the radius is reduced to 75 % of its value.

This process is repeated until the density criterion is satisfied or there are not enough pixels in the region.

Rectangle Improvement

Before rejecting a line-support region for being not meaningful (NFA > ε) LSD tries some variations to the initially found rectangle's configuration in search for a valid configuration.

The relevant factors tested are the precision p used (that determines the tolerance that defines p-aligned points) and the width of the rectangle.

The initial precision used, corresponding to the region growing tolerance τ is large enough and only testing smaller values make sense. The same is true for width of the rectangle because the initial width was chosen to cover all the line-support region. But in some cases, for example, reducing in one pixel the width may reduce the number of aligned points only by one, while reducing the total number of pixels by a number equal to the length of the rectangle; the balance could improve the NFA value greatly.

The Improvement routine of LSD consists of the following steps:

  1. try finer precisions
  2. try to reduce width
  3. try to reduce one side of the rectangle
  4. try to reduce the other side of the rectangle
  5. try even finer precisions

If a meaningful rectangle is found (NFA > ε) the Improvement routine will stop after the step that found it.

Step 1 tries five consecutive times to halve the precision value; the value that produces the best NFA value is kept. So this step tests the precision values p/2, p/4, p/8, p/16, and p/32, where p is the initial precision value.

Step 2 tries up to five times to reduce the rectangle width in 0.5 pixels. That means that the width values tested are W, W-0.5, W-1, W-1.5, W-2, and W-2.5, where W is the initial width value.

Step 3 tries five times to reduce only one side of the rectangle in 0.5 pixel. This implies reducing the width of the rectangle in 0.5 pixels but also moving the center of the rectangle in 0.25 pixels in order that the other side of the rectangle remains on its place. So the side displacements tested are 0.5, 1, 1.5, 2, and 2.5 pixels.

Step 4 do the same thing as step 3 but to the other side.

Step 5 tries again to reduce the precision. Again it tests five consecutive times to halve the precision value. This step, then test the precision values \hat{p}/2, \hat{p}/4, \hat{p}/8, \hat{p}/16, and \hat{p}/32, where \hat{p} is the precision at the beginning of this step.

Examples

The following set of examples try to give an idea of the kind of results obtained with LSD, both good and and bad results:

office: A good result. The line segments detected corresponds to straight structures in the image. The detection corresponds roughly with the expected result.

marilyn: Another good result. Most detected line segments corresponds to structures whose projection in the image is locally straight, even if in reality they are not straight or flat objects.

lsd: Note that LSD detect locally straight edges, so each black strokes produce two detections, one for each white to black transition.

pipi: Note that when curves are present, LSD produce short line segments corresponding to curve sections that are locally straight. The result is a polygonal approximation for curves.

noise: LSD was designed to provide a good false detection control. Its false detection control is based in automatically providing detections thresholds that prevent detections that could happen by chance in a non-structured image.

square: The square is masked by noise and is not detected by LSD. But the square can be detected if the image is analyzed at a different scale by Gaussian sub-sampling.

square-subsampling: When a Gaussian sub-sampling is applied to the previous image, the noise is partially removed and the expected line segments are detected.

wall: Some regions are partially anisotropic and partially straight. This regions can produce unexpected detections.

map: In some images we get detections that seems strange at first sight, even if they are correct.

gibbs2: Image compression Gibbs effect is responsible of many unexpected detections.

Video

The video here (mp4 file, 56 MB) shows the result of applying LSD, frame per frame, to the original video here.

π