GSoC 2014 – Signing off

This years GSoC coding period has nearly come to an end. This post aims to briefly summarize everything that happened during the last three months. My task was to implement Region Adjacency Graph based segmentation algorithms for scikit-image. This post provides a good explanation about them. Below I will list out my major contributions.

Contributions

Region Adjacency Graphs

Fixing the API for RAGs was very important, since it was directly going to affect everything else that followed. After a long discussion and some benchmarks we finally decided to have NetworkX as a dependency. This helped a lot, since I had a lot of graph algorithms already implemented for me. The file rag.py implements the RAG class and the RAG construction methods. I also implemented threshold_cut, a function which segments images by simply thresholding edge weights. To know more, you can visit, RAG Introduction.

Normalized Cut

The function cut_normazlied, implements the Normalized Cut algorithm for RAGs. You can visit Normalized Cut on RAGs to know more. See the videos at the end to get a quick idea of how NCut works. Also see, A closer look at NCut, where I have benchmarked the function and indicated bottlenecks.

Drawing Regions Adjacency Graphs

In my posts, I had been using a small piece of code I had written to display RAGs. This Pull Request implements the same functionality for scikit-image. This would be immensely useful for anyone who is experimenting with RAGs. For a more detailed explanation, check out Drawing RAGs.

Hierarchical Merging of Region Adjacency Graphs

This Pull Request implements a simple form of Hierarchical Merging. For more details, see Hierarchical Merging of Region Adjacency Graphs. This post also contains videos at the end, do check them out. This can also be easily extended to a boundary map based approach, which I plan to do post-GSoC

 

Final Comments

The most important thing for me is that I am a better Python programmer as compared to what I was before GSoC began this year. I was able to see how some graph based segmentation methods work at their most basic level. Although GSoC has come to an end, I don’t think my contributions to scikit-image have. Contributing to it has been a tremendous learning experience and plan to continue doing so. I have been been fascinated with Image Processing since me and my friends wrote an unholy piece of Matlab code about 3 years ago to achieve this. And as far as I can see its a fascination I will have for the rest of my life.

Finally, I would like to thank my mentors Juan, Johannes Schönberger and Guillaume Gay. I would also like to thank Stefan for reviewing my Pull Requests.

 

 

Advertisements

Hierarchical Merging of Region Adjacency Graphs

Region Adjacency Graphs model regions in an image as nodes of a graph with edges between adjacent regions. Superpixel methods tend to over segment images, ie, divide into more regions than necessary. Performing a Normalized Cut and Thresholding Edge Weights are two ways of extracting a better segmentation out of this. What if we could combine two small regions into a bigger one ? If we keep combining small similar regions into bigger ones, we will end up with bigger regions which are significantly different from its adjacent ones. Hierarchical Merging explores this possibility. The current working code can be found at this Pull Request

Code Example

The merge_hierarchical function performs hierarchical merging on a RAG. It picks up the smallest weighing edge and combines the regions connected by it. The new region is adjacent to all previous neighbors of the two combined regions. The weights are updated accordingly. It continues doing so till the minimum edge weight in the graph in more than the supplied thresh value. The function takes a RAG as input where smaller edge weight imply similar regions. Therefore, we use the rag_mean_color function with the default "distance" mode for RAG construction. Here is a minimal code snippet.

from skimage import graph, data, io, segmentation, color


img = data.coffee()
labels = segmentation.slic(img, compactness=30, n_segments=400)
g = graph.rag_mean_color(img, labels)
labels2 = graph.merge_hierarchical(labels, g, 40)
g2 = graph.rag_mean_color(img, labels2)

out = color.label2rgb(labels2, img, kind='avg')
out = segmentation.mark_boundaries(out, labels2, (0, 0, 0))
io.imsave('out.png',out)

I arrived at the threshold 40 after some trial and error. Here is the output.

out

The drawback here is that the thresh argument can vary significantly depending on image to image.

Comparison with Normalized Cut

Loosely speaking the normalized cut follows a top-down approach where as the hierarchical merging follow a bottom-up approach. Normalized Cut starts with the graph as a whole and breaks it down into smaller parts. On the other hand hierarchical merging, starts with individual regions and merges them into bigger ones till a criteria is reached. The Normalized Cut however, is much more robust and requires little tuning of its parameters as images change. Hierarchical merging is a lot faster, even though most of its computation logic is written in Python.

Effect of change in threshold

Setting a very low threshold, will not merge any regions and will give us back the original image. A very large threshold on the other hand would merge all regions and give return the image as just one big blob. The effect is illustrated below.

threshold=10

10

threshold=20

20

threshold=40

40

threshold=70

70

threshold=100

70

Hierarchical Merging in Action

With this modification the following code can output the effect of all the intermediate segmentation during each iteration.

from skimage import graph, data, io, segmentation, color
import time
from matplotlib import pyplot as plt


img = data.coffee()
labels = segmentation.slic(img, compactness=30, n_segments=400)
g = graph.rag_mean_color(img, labels)
labels2 = graph.merge_hierarchical(labels, g, 60)

c = 0

out = color.label2rgb(graph.graph_merge.seg_list[-10], img, kind='avg')
for label in graph.graph_merge.seg_list:
    out = color.label2rgb(label, img, kind='avg')
    out = segmentation.mark_boundaries(out, label, (0, 0, 0))
    io.imsave('/home/vighnesh/Desktop/agg/' + str(c) + '.png', out)
    c += 1

I then used avconv -f image2 -r 3 -i %d.png -r 20 car.mp4 to output a video. Below are a few examples.

In each of these videos, at every frame, a boundary dissapears. This means that the two regions separated by that boundary are merged. The frame rate is 5 FPS, so more than one region might be merged at a time.

Coffee Image

coffee

Car Image

car

Baseball Image

baseball