Project #4: Intelligent Scissors

Overview

In this project, you will implement the on-the-fly, single goal variant of Dijkstra’s algorithm for finding the best path between each pair of given selection points along the boundary of an image.

Objectives

Apply a variant of Dijkstra’s algorithm to a real-world problem.
Solve a non-trivial image processing problem.
Mature as problem-solvers.
Understand the importance of choosing data structures wisely in order to achieve good performance.
Compare a naive greedy “Simple Scissors” algorithm with the Dijkstra’s-based “Intelligent Scissors” algorithm.

Code Distribution

http://faculty.cs.byu.edu/~ringger/cs312/projects/Project4-IntelligentScissors-VS2008.zip

Background

As originally defined by Mortensen and Barrett of BYU (1), “Intelligent Scissors” is an interactive image segmentation technique that uses the on-the-fly, single-goal variant Dijkstra’s algorithm to find minimal cost segmentation (or boundary) paths in an image. An interactive implementation of the intelligent scissors algorithm would work as follows:

The user clicks on a “selection point” (also referred to as “segmentation point”) somewhere along the desired image boundary.
On-the-fly, single-goal Dijkstra’s algorithm is used to find a minimal cost path from the selection point to every other pixel in the image. The paths are cached so that:
As the user moves the cursor to a location further along the boundary, the minimum cost path back to the selection point is displayed.
The user clicks on the image boundary again to place another selection point, and the minimum cost path back to the selection point is added to the segmentation boundary.
Steps 2 through 4 are repeated using the new selection point until the path reaches the original selection point, making a closed boundary.

Intelligent Scissors works well in practice and is now the basis of a number of segmentation algorithms in commercial image editing software.

For this assignment, you will be programming both a “Simple Scissors” algorithm and an “Intelligent Scissors” using the on-the-fly, single-goal variant of Dijkstra’s algorithm.

“Simple Scissors”

The Simple Scissors algorithm does not use Dijkstra’s algorithm. Instead, at each vertex (pixel, point, node) along the path, this algorithm simply chooses the edge to the neighboring pixel in a greedy fashion until the goal is reached or the path cannot continue. The next pixel is chosen by selecting the immediate neighbor with the smallest weight that has not been previously visited. When two or more directions have equal weight, ties should be broken as follows: Prefer pixels in clockwise order, starting with the pixel to the north (upwards in the image; lower y-coordinate). If there is no unvisited neighbor (i.e., all four neighbors have been previously visited), then the path necessarily ends in a dead end. For example, the Simple Scissors algorithm in Figure 1 has become stuck in a dead end on the cheek-bone. When the path ends in this manner, the algorithm should draw up to the dead end, reset its state, and continue searching at the next segmentation point. Simple Scissors is a very greedy algorithm, and it's not very intelligent! As you can see, Simple Scissors will not always work properly due to its unintelligent “greedy” nature.

400px-intelligentscissors1.png

“Intelligent Scissors” Using On-the-Fly, Single-Goal Dijkstra’s Algorithm

The Intelligent Scissors algorithm uses the on-the-fly, single-goal version of Dijkstra’s algorithm to find the optimal (least-cost) paths between selection points in a given image. After completing a path between a pair of adjacent selection points, the algorithm should reset the dist() values on the image graph and continue searching from the next segmentation point.

You will not be expected to implement the interactive version of Intelligent Scissors sketched at the beginning of this instructions document (you could implement that for fun). Rather, given a set of selection points, your algorithm will find the segmentation path in a non-interactive “batch” mode.

Pseudocode for the on-the-fly, single goal version of Dijkstra's algorithm can be found in Lecture #20. Do not use the version of Dijkstra's algorithm found in the textbook, since it is not designed to work on the fly or with a single goal.

The definition of the cost of edges between adjacent pixels is given below in the “Images as Graphs” section.

Figure 2 was segmented by repeated application of the on-the-fly, single-goal Dijkstra’s algorithm with just three selection points. Contrast this with Figure 1 produced by a Simple Scissors algorithm. The Simple Scissors algorithm adheres to the edge in this case, but passes right by the point it is supposed to meet up with (near the eye socket of the skeleton) and then gets stuck on the cheek bone of the skeleton.

400px-intelligentscissors2.png

Simple scissors algorithm will often fail, and it may fail differently than the image shown in this instructions document. The primary purpose of the simple scissors part of the project is to serve as a point of contrast with intelligent scissors. It is far more important that your intelligent scissors have replicable results.

Gradient Image as a Directed Weighted Graph

The provided codebase pre-computes the gradient of a given image (explained below), so that each position in the original image (except for the border pixels) has a corresponding value called the gradient magnitude. Furthermore, the codebase provides methods to access the pixel-to-pixel edge weights needed for your intelligent scissors algorithm. The following description is provided as background and to help you understand what the provided code is actually doing.

Intuitively, a high gradient magnitude occurs wherever there is a boundary between regions in the image, and a low gradient magnitude occurs throughout uninteresting / uniform regions in the image. Plotting the gradient magnitude of each pixel defines an image that we call the “gradient image”, as shown in Figure 5 (below in the next section). As illustrated in the figure, we have provided code for pre-computing the “gradient image”. Thus, you don't need to implement the gradient magnitude calculation.

Our primary focus here is to explain conceptually how you should interpret the gradient image as a directed weighted graph. The gradient image can be viewed as (or converted on the fly to) a directed weighted graph as follows:

Each pixel $p$ in the gradient image is a vertex $v$ in the graph.
Each directed edge in the graph connects adjacent pixels. We will assume that each pixel $p$ is adjacent only to its vertical neighbors $p_N$ and $p_S$ (to the north and south) and its horizontal neighbors $p_E$ and $p_W$ (to the east and west).
The weight of an edge from one pixel $p_{xy}$ to its neighbor $p_{x^'y^'}$ is the maximum gradient minus the gradient magnitude for the destination pixel as follows:
- $Weight(p_{xy},p_{x^'y^'}) = MG - g_{x^'y^'}$
- where $MG$ = the maximum gradient magnitude across all positions in the gradient image
- and $g_{x^'y^'}$ = the gradient magnitude for the destination pixel $p_{x^'y^'}$.

Note that a higher gradient yields lower edge weight.

This correspondence between gradient image and its directed graph is shown in Figure 3 here:

500px-gradientimagetodirectedgraph.png

The following mathematical explanation for computing gradient magnitudes is intended to give you additional insight into how the approach works. To pre-compute the gradient magnitude of pixels in the image, we use an operator called the “Sobel filter” that calculates a discrete approximation of the $x$ and $y$ partial derivatives of the image. Using the Sobel filter, the gradient magnitude $g_{xy}$ for the pixel $p_{xy}$, is defined as follows:

$g_{xy} = \sqrt{gx^2_{xy} + gy^2_{xy}}$

where: $gx_{xy} = {{\left(p_{x+1,y+1}-p_{x-1,y+1}\right) + 2\left(p_{x+1,y}-p_{x-1,y}\right) + \left(p_{x+1,y-1}-p_{x-1,y-1}\right)}\over{4}}$ $gy_{xy} = {{\left(p_{x+1,y+1}-p_{x+1,y-1}\right) + 2\left(p_{x,y+1}-p_{x,y-1}\right) + \left(p_{x-1,y+1}-p_{x-1,y-1}\right)}\over{4}}$

Provided Framework

We provide a VisualStudio project that implements a graphical user interface that allows you to load and display images (with or without pre-selected points) and to create some points of your own by clicking on the image. It gives you hooks into three different classes of Scissors that use the selected points to segment the picture.

The project includes a class for reading and writing ASCII pgm format images (we have also included some pgm test images). PGM files allow comments. You will be given images that include the selection points in a comment in the second line of the image file. The segmentation points will be given in the order in which they were selected by a user in a semi-colon separated list of tuples. The tuples consist of parentheses around an ordered pair of integers separated by commas. For example, your image file might look something like this:

	P2
	# (30,40); (50,60); (10,20); (50,80)
	# Created by Ifranview
	384 256 
	255 
	...

The project distribution is already quite functional. As shown in Figure 4 below, the framework will load and display images.

400px-intelligentscissors3.png

As shown in Figure 5, it will also display the pre-computed gradient image, computed automatically at image load time, in a separate tab.

400px-intelligentscissors4.png

The project distribution includes everything except implementations of the Dijkstra’s-based Intelligent Scissors and the Simple Scissors algorithms. DijkstraScissors.cs and SimpleScissors.cs are the files that contain the classes you want to implement. Note that we have provided a very rudimentary scissors algorithm example in the file StraightScissors.cs as an example. It simply connects the points you have selected with straight lines. The screenshot in Figure 6 shows this fast but imperfect algorithm in action. You are encouraged to use Straight Scissors as a template and starting point for your work.

400px-intelligentscissors5.png

You are provided with a method called Scissors.GetPixelWeight(Point) to calculate the weight of a point using the method explained above. Scissors.GetPixelWeight() is accessible via inheritance to your scissors algorithms. It consults the gradient image (Image.Gradient) and subtracts a given pixel's gradient from the maximum gradient as described above; it does everything you need to provide the inbound edge weight for the vertex corresponding to a given pixel. Remember that a low number is better (less cost).

Required: Whiteboard Experience (TM)

You are developing as a problem solver. You are also learning to solve a problem before you start writing code. In this project, we require you to step up to a whiteboard after reading these instructions before writing any code. Use the whiteboard to walk through the solution with a friend. Your friend could be another CS major, a CS 312 classmate, or another technical person with an eye for detail.

Make sure you sketch out and understand what it means for the gradient image to be represented as a graph. Simulate the on-the-fly, single goal version of Dijkstra's with your pen on that graph. Think out loud and with sketches about the contents of your priority queue for simple examples. Settle on which methods you will actually need on your priority queue.

In short, make sure you understand what you are going to do before you write a stitch of code. As you will note below, including a photo of your “whiteboard experience” is a required element of your project report.

Have a “whiteboard experience” and document it with a photograph of you next to the whiteboard sketches, etc.

Prepare a whiteboard experience report as a .pdf file. It should consist of one page with the following elements:

your name
your section number
[5 points] your photo
[5 points] a caption (2-3 sentences) explaining why you are ready to proceed to write code
the names of the people in your audience

Submit the whiteboard experience report online by the whiteboard experience deadline for this project.

To Do

Have a “whiteboard experience” as directed above, and submit it online by the deadline for the whiteboard experience.
Implement the Simple Scissors image segmentation algorithm in the following method: SimpleScissors.FindSegmentation(IList points, Pen pen)
- The FindSegmentation() method in each class is the entry point for the class that will be invoked when the corresponding button is clicked (when the user selects your scissors implementation on the toolbar). The parameters are explained in the files. (The project contains several other files that you will not need to modify.)
Prove that in the kind of graph we are working with in this project (in which for each vertex, all inbound edge weights are the same), on-the-fly, single-goal Dijkstra's will never try to update a vertex's dist() key or call the decreasekey() method on the priority queue.
- In other words, show that once a vertex has been visited the first time, you will never find a better way to get there. (That's a different claim than the claim that Dijkstra's algorithm is correct. That claim was “once a vertex has been settled, you will never find a better way to get there”. Please note the important distinction.)
- In other words, show that the condition in on-the-fly, single-goal Dijkstra's pseudo-code $dist(v) > dist(u) + l(u, v)$ for some new vertex $u$ will never be true.
- Hint: try a proof by contradiction!
Implement the on-the-fly, single-goal Dijkstra's algorithm as the Intelligent Scissors image segmentation algorithm in the following method: DijkstraScissors.FindSegmentation(IList points, Pen pen)
- Each algorithm computes its notion of the shortest path between adjacent pairs of selection points – this is called the “segmentation path”. Remember that the segmentation path will form a cycle that ends at the starting selection point.
- Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.
Use the DrawEllipse() method to place a small (preferably) yellow circle on each selection point in the Overlay. See the StraightScissors class for example.
Set the color of each pixel on the segmentation path to white or red or some other very visible (and printable) color in the Overlay. (Again, see the StraightScissors class for an example.)
Solve three different segmentation problems:
- Choose three images.
- At least two must come from the “with-segmentation-points” sub-directory of the “TestImages” directory in the project distribution.
- The other one can come from anywhere you like, including the “with-segmentation-points” subdirectory.
- For all three images / problems you select, your algorithm must complete the segmentation using on-the-fly, single-goal Dijkstra’s algorithm in 5 seconds or less. Note that the code uses the StopWatch class to measure how long it takes to perform steps 1-3 and places that time in the status bar alongside the progress bar at the bottom of the form (see the figures).
- Run both Simple Scissors and Intelligent Scissors on all three images, and capture the results in screenshots.

Implementation Notes

One element for success in this project is to think carefully about how the on-the-fly, single-goal version of Dijkstra’s algorithm is restricted to computing the shortest path to just the destination segmentation point (and the other points settled along the way) and not to every point in the entire graph.

There is no special sense of direction in Simple Scissors, but that is also true of Intelligent Scissors. They both explore in the hope of finding a cheap path to the next point – Intelligent Scissors just explores paths in a way that is guaranteed to eventually find the right answer. Since the right answer (cheapest path) is pointed in the right direction, it appears to know which direction to travel after the fact. Simple Scissors does not have that guarantee, so you never know which direction you'll wander off in.

To handle the out-of-bounds case, in your scissors classes, you have access to an Overlay object. You can use it to find out what the image boundaries are using the following methods: Overlay.Height and Overlay.Width. Edge pixels in the original image have coordinates $x \in [0, $ Overlay.Width$-1]$ and $y \in [0,$ Overlay.Height$-1]$. If you query the weight of an out-of-bounds Point, you will get an Exception.

Furthermore, based on the definition of the gradient, pixels along all edges of the image have no corresponding gradient value. If you request the gradient of one of these edge pixels, an IndexOutOfBounds exception is raised. Consequently, you must restrict the coordinates of pixels in the gradient image such that $x \in [1,$ Overlay.Width$-2]$ and $y \in [1,$ Overlay.Height$-2]$. Be sure to make your scissors algorithms robust to these border conditions and disallow them from wandering off the edges and triggering exceptions.

Overlay.SetPixel() is the method you’ll want to use to set the color of a pixel. Then you'll be able to clear the overlay between runs.
After you set a pixel, you need to tell the image to update. You can use Program.MainForm.RefreshImage() to do that. You can change as many pixels as you want between updates and they’ll all get updated with a single RefreshImage().
Feel free to use any built-in (.NET) data-structure you like. To make sure you're using the latest version of the .NET assembly, do the following:
- Right click on the project and select properties.
- Verify (or select) .NET version 4 or later.
You will need a priority queue for on-the-fly, single-goal Dijkstra’s algorithm. You are welcome to implement your own priority queue. Alternatively, feel free to download and reuse an existing priority queue class, if you wish. Be sure to understand which type of priority queue you are using. Be sure to comply with the course third-party code policy, as found in the syllabus.
Remember that the on-the-fly, single-goal variant of Dijkstra’s algorithm avoids re-enqueueing points already “settled” (removed from the priority queue).
Furthermore, on our graphs if you ever try to update the key on a graph vertex, even if it is only visited and still on the agenda, then you have a bug (see your proof to that effect).
In order to make your timings more accurate on recent versions of Windows (Vista, 7), be sure to turn off Aero first.

Report

Write a type-written report containing the following distinct sections as a single PDF document, and submit it by following the submission directions in the course syllabus.

Place your name, net ID, and section number at the top of the first page.
Also include an estimate of the amount of time you spent on this project at the top of the first page.
[24 points] Include screen shots of the images, including the running times, produced by Intelligent Scissors on all three of your selected images.
- You can capture the image of the window (using the ctrl-alt- shift-PrtSc facility for capturing an image of the window in focus). Looking at the resulting segmentation gives one a good idea of how good the segmentation is. We want to be able to see what you end up with and compare the results.
[11 points] Include screen shots of the images, including the running times, produced by Simple Scissors on all three of your selected images.
- Thus, you should have six window shots.
[10 points] Say whether or not you met the 5-second performance requirement for your intelligent scissors algorithm.
[5 points] In just a few sentences, compare and contrast the Simple Scissors and the Intelligent Scissors algorithms to explore the trade-offs between running time and precision that you observe in your experiments.
[10 points] Explain your choice of priority queue, and explain the efficiency of each method implemented by your priority queue class.
[10 points] Include your proof that no vertex's key is ever updated and therefore decreasekey() is never called on the priority queue.
[30 points] Include a copy of the source code for your classes that implement simple scissors and intelligent scissors (including the on-the-fly, single-goal version of Dijkstra's algorithm). We will verify that you used a correct implementation of the on-the-fly, single-goal Dijkstra’s algorithm.
- Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.

Further Exploration

Here are a few suggestions for possible further exploration:

Make the intelligent scissors interactive. That is, allow the user to click on selection points one at a time and incrementally find the segmentation path between those points.

Support color images. You’ll need to pick a new image format (and probably want to use Microsoft’s built-in tools for loading images) and need to come up with a gradient equation that accounts for color.

Improve the running time of your Intelligent Scissors algorithm by improving the implementation (would a different priority queue help?). This might include using an algorithm other than Dijkstra’s algorithm.
Add a button to zoom in on the image so you can get a better idea of whether or not your algorithm is following the gradient like you think it does.
Come up with an automatic way to evaluate the quality of a segmentation path.
Render the segmented view of the image as a height-field so that it looks like a 3-dimensional terrain. Then, draw your segmented path on the terrain and see if the algorithm is following the gradient like you think it does. DirectX9 would probably be useful for this.
Reduce the amount of memory that your Intelligent Scissors algorithm consumes.

References

[1] E. N. Mortensen and W. A. Barrett, "Intelligent Scissors for Image Composition," in Computer Graphics (SIGGRAPH `95), pp. 191-198, Los Angeles, CA, Aug. 1995.

cs-312/project-4.txt · Last modified: 2015/02/24 14:59 by ryancha

Back to top

Table of Contents