In this project, you will implement the on-the-fly, single goal variant of Dijkstra’s algorithm for finding the best path between each pair of given selection points along the boundary of an image.
As originally defined by Mortensen and Barrett of BYU (1), “Intelligent Scissors” is an interactive image segmentation technique that uses the on-the-fly, single-goal variant Dijkstra’s algorithm to find minimal cost segmentation (or boundary) paths in an image. An interactive implementation of the intelligent scissors algorithm would work as follows:
Intelligent Scissors works well in practice and is now the basis of a number of segmentation algorithms in commercial image editing software.
For this assignment, you will be programming both a “Simple Scissors” algorithm and an “Intelligent Scissors” using the on-the-fly, single-goal variant of Dijkstra’s algorithm.
The Simple Scissors algorithm does not use Dijkstra’s algorithm. Instead, at each vertex (pixel, point, node) along the path, this algorithm simply chooses the edge to the neighboring pixel in a greedy fashion until the goal is reached or the path cannot continue. The next pixel is chosen by selecting the immediate neighbor with the smallest weight that has not been previously visited. When two or more directions have equal weight, ties should be broken as follows: Prefer pixels in clockwise order, starting with the pixel to the north (upwards in the image; lower y-coordinate). If there is no unvisited neighbor (i.e., all four neighbors have been previously visited), then the path necessarily ends in a dead end. For example, the Simple Scissors algorithm in Figure 1 has become stuck in a dead end on the cheek-bone. When the path ends in this manner, the algorithm should draw up to the dead end, reset its state, and continue searching at the next segmentation point. Simple Scissors is a very greedy algorithm, and it's not very intelligent! As you can see, Simple Scissors will not always work properly due to its unintelligent “greedy” nature.
The Intelligent Scissors algorithm uses the on-the-fly, single-goal version of Dijkstra’s algorithm to find the optimal (least-cost) paths between selection points in a given image. After completing a path between a pair of adjacent selection points, the algorithm should reset the dist() values on the image graph and continue searching from the next segmentation point.
You will not be expected to implement the interactive version of Intelligent Scissors sketched at the beginning of this instructions document (you could implement that for fun). Rather, given a set of selection points, your algorithm will find the segmentation path in a non-interactive “batch” mode.
Pseudocode for the on-the-fly, single goal version of Dijkstra's algorithm can be found in Lecture #20. Do not use the version of Dijkstra's algorithm found in the textbook, since it is not designed to work on the fly or with a single goal.
The definition of the cost of edges between adjacent pixels is given below in the “Images as Graphs” section.
Figure 2 was segmented by repeated application of the on-the-fly, single-goal Dijkstra’s algorithm with just three selection points. Contrast this with Figure 1 produced by a Simple Scissors algorithm. The Simple Scissors algorithm adheres to the edge in this case, but passes right by the point it is supposed to meet up with (near the eye socket of the skeleton) and then gets stuck on the cheek bone of the skeleton.
400px-intelligentscissors2.png
Simple scissors algorithm will often fail, and it may fail differently than the image shown in this instructions document. The primary purpose of the simple scissors part of the project is to serve as a point of contrast with intelligent scissors. It is far more important that your intelligent scissors have replicable results.
The provided codebase pre-computes the gradient of a given image (explained below), so that each position in the original image (except for the border pixels) has a corresponding value called the gradient magnitude. Furthermore, the codebase provides methods to access the pixel-to-pixel edge weights needed for your intelligent scissors algorithm. The following description is provided as background and to help you understand what the provided code is actually doing.
Intuitively, a high gradient magnitude occurs wherever there is a boundary between regions in the image, and a low gradient magnitude occurs throughout uninteresting / uniform regions in the image. Plotting the gradient magnitude of each pixel defines an image that we call the “gradient image”, as shown in Figure 5 (below in the next section). As illustrated in the figure, we have provided code for pre-computing the “gradient image”. Thus, you don't need to implement the gradient magnitude calculation.
Our primary focus here is to explain conceptually how you should interpret the gradient image as a directed weighted graph. The gradient image can be viewed as (or converted on the fly to) a directed weighted graph as follows:
Note that a higher gradient yields lower edge weight.
This correspondence between gradient image and its directed graph is shown in Figure 3 here:
500px-gradientimagetodirectedgraph.png
The following mathematical explanation for computing gradient magnitudes is intended to give you additional insight into how the approach works. To pre-compute the gradient magnitude of pixels in the image, we use an operator called the “Sobel filter” that calculates a discrete approximation of the $x$ and $y$ partial derivatives of the image. Using the Sobel filter, the gradient magnitude $g_{xy}$ for the pixel $p_{xy}$, is defined as follows:
$g_{xy} = \sqrt{gx^2_{xy} + gy^2_{xy}}$
where: $gx_{xy} = {{\left(p_{x+1,y+1}-p_{x-1,y+1}\right) + 2\left(p_{x+1,y}-p_{x-1,y}\right) + \left(p_{x+1,y-1}-p_{x-1,y-1}\right)}\over{4}}$ $gy_{xy} = {{\left(p_{x+1,y+1}-p_{x+1,y-1}\right) + 2\left(p_{x,y+1}-p_{x,y-1}\right) + \left(p_{x-1,y+1}-p_{x-1,y-1}\right)}\over{4}}$
We provide a VisualStudio project that implements a graphical user interface that allows you to load and display images (with or without pre-selected points) and to create some points of your own by clicking on the image. It gives you hooks into three different classes of Scissors that use the selected points to segment the picture.
The project includes a class for reading and writing ASCII pgm format images (we have also included some pgm test images). PGM files allow comments. You will be given images that include the selection points in a comment in the second line of the image file. The segmentation points will be given in the order in which they were selected by a user in a semi-colon separated list of tuples. The tuples consist of parentheses around an ordered pair of integers separated by commas. For example, your image file might look something like this:
P2 # (30,40); (50,60); (10,20); (50,80) # Created by Ifranview 384 256 255 ...
The project distribution is already quite functional. As shown in Figure 4 below, the framework will load and display images.
400px-intelligentscissors3.png
As shown in Figure 5, it will also display the pre-computed gradient image, computed automatically at image load time, in a separate tab.
400px-intelligentscissors4.png
The project distribution includes everything except implementations of the Dijkstra’s-based Intelligent Scissors and the Simple Scissors algorithms. DijkstraScissors.cs and SimpleScissors.cs are the files that contain the classes you want to implement. Note that we have provided a very rudimentary scissors algorithm example in the file StraightScissors.cs as an example. It simply connects the points you have selected with straight lines. The screenshot in Figure 6 shows this fast but imperfect algorithm in action. You are encouraged to use Straight Scissors as a template and starting point for your work.
400px-intelligentscissors5.png
You are provided with a method called Scissors.GetPixelWeight(Point) to calculate the weight of a point using the method explained above. Scissors.GetPixelWeight() is accessible via inheritance to your scissors algorithms. It consults the gradient image (Image.Gradient) and subtracts a given pixel's gradient from the maximum gradient as described above; it does everything you need to provide the inbound edge weight for the vertex corresponding to a given pixel. Remember that a low number is better (less cost).
You are developing as a problem solver. You are also learning to solve a problem before you start writing code. In this project, we require you to step up to a whiteboard after reading these instructions before writing any code. Use the whiteboard to walk through the solution with a friend. Your friend could be another CS major, a CS 312 classmate, or another technical person with an eye for detail.
Make sure you sketch out and understand what it means for the gradient image to be represented as a graph. Simulate the on-the-fly, single goal version of Dijkstra's with your pen on that graph. Think out loud and with sketches about the contents of your priority queue for simple examples. Settle on which methods you will actually need on your priority queue.
In short, make sure you understand what you are going to do before you write a stitch of code. As you will note below, including a photo of your “whiteboard experience” is a required element of your project report.
Have a “whiteboard experience” and document it with a photograph of you next to the whiteboard sketches, etc.
Prepare a whiteboard experience report as a .pdf file. It should consist of one page with the following elements:
Submit the whiteboard experience report online by the whiteboard experience deadline for this project.
One element for success in this project is to think carefully about how the on-the-fly, single-goal version of Dijkstra’s algorithm is restricted to computing the shortest path to just the destination segmentation point (and the other points settled along the way) and not to every point in the entire graph.
There is no special sense of direction in Simple Scissors, but that is also true of Intelligent Scissors. They both explore in the hope of finding a cheap path to the next point – Intelligent Scissors just explores paths in a way that is guaranteed to eventually find the right answer. Since the right answer (cheapest path) is pointed in the right direction, it appears to know which direction to travel after the fact. Simple Scissors does not have that guarantee, so you never know which direction you'll wander off in.
To handle the out-of-bounds case, in your scissors classes, you have access to an Overlay object. You can use it to find out what the image boundaries are using the following methods: Overlay.Height and Overlay.Width. Edge pixels in the original image have coordinates $x \in [0, $ Overlay.Width$-1]$ and $y \in [0,$ Overlay.Height$-1]$. If you query the weight of an out-of-bounds Point, you will get an Exception.
Furthermore, based on the definition of the gradient, pixels along all edges of the image have no corresponding gradient value. If you request the gradient of one of these edge pixels, an IndexOutOfBounds exception is raised. Consequently, you must restrict the coordinates of pixels in the gradient image such that $x \in [1,$ Overlay.Width$-2]$ and $y \in [1,$ Overlay.Height$-2]$. Be sure to make your scissors algorithms robust to these border conditions and disallow them from wandering off the edges and triggering exceptions.
Write a type-written report containing the following distinct sections as a single PDF document, and submit it by following the submission directions in the course syllabus.
Here are a few suggestions for possible further exploration:
Support color images. You’ll need to pick a new image format (and probably want to use Microsoft’s built-in tools for loading images) and need to come up with a gradient equation that accounts for color.