**Region Adjacency Graphs**

Fixing the API for RAGs was very important, since it would directly affect everything that followed. After a long discussion and some benchmarks, we finally decided to have NetworkX as a dependency. This helped a lot, since it gave me a lot of graph algorithms already implemented. The file rag.py implements the RAG class and the RAG construction methods. I also implemented threshold_cut, a function which segments images by simply thresholding edge weights. To know more, you can visit RAG Introduction.

**Normalized Cut**

The function cut_normalized implements the Normalized Cut algorithm for RAGs. You can visit Normalized Cut on RAGs to know more. See the videos at the end to get a quick idea of how NCut works. Also see A Closer Look at NCut, where I have benchmarked the function and indicated bottlenecks.

**Drawing Region Adjacency Graphs**

In my posts, I had been using a small piece of code I had written to display RAGs. This Pull Request implements the same functionality for scikit-image. This would be immensely useful for anyone who is experimenting with RAGs. For a more detailed explanation, check out Drawing RAGs.

**Hierarchical Merging of Region Adjacency Graphs**

This Pull Request implements a simple form of hierarchical merging. For more details, see Hierarchical Merging of Region Adjacency Graphs. That post also contains videos at the end; do check them out. This can also be easily extended to a boundary-map-based approach, which I plan to do post-GSoC.

The most important thing for me is that I am a better Python programmer than I was before GSoC began this year. I was able to see how some graph-based segmentation methods work at their most basic level. Although GSoC has come to an end, I don’t think my contributions to **scikit-image** have. Contributing to it has been a tremendous learning experience and I plan to continue doing so. I have been fascinated with Image Processing since my friends and I wrote an unholy piece of Matlab code about 3 years ago to achieve this. And as far as I can see, it’s a fascination I will have for the rest of my life.

Finally, I would like to thank my mentors Juan, Johannes Schönberger and Guillaume Gay. I would also like to thank Stefan for reviewing my Pull Requests.


The `merge_hierarchical` function performs hierarchical merging on a RAG. It picks the smallest-weight edge and combines the regions connected by it. The new region is adjacent to all previous neighbors of the two combined regions, and the weights are updated accordingly. It continues doing so until the minimum edge weight in the graph is more than the supplied `thresh` value. The function takes as input a RAG in which smaller edge weights imply similar regions. Therefore, we use the `rag_mean_color` function with the default `"distance"` mode for RAG construction. Here is a minimal code snippet.

```python
from skimage import graph, data, io, segmentation, color

img = data.coffee()
labels = segmentation.slic(img, compactness=30, n_segments=400)
g = graph.rag_mean_color(img, labels)
labels2 = graph.merge_hierarchical(labels, g, 40)
g2 = graph.rag_mean_color(img, labels2)
out = color.label2rgb(labels2, img, kind='avg')
out = segmentation.mark_boundaries(out, labels2, (0, 0, 0))
io.imsave('out.png', out)
```
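The merging loop itself is easy to picture. Below is a rough, self-contained sketch of the idea on a plain NetworkX graph; it is not scikit-image's actual implementation, and the `merge_hierarchical_sketch` name and its keep-the-smaller-weight update rule are my own illustrative choices:

```python
import networkx as nx


def merge_hierarchical_sketch(g, thresh):
    """Repeatedly merge the two regions joined by the smallest-weight
    edge until the smallest weight reaches ``thresh``."""
    g = g.copy()
    while g.number_of_edges() > 0:
        u, v, w = min(g.edges(data='weight'), key=lambda e: e[2])
        if w >= thresh:
            break
        # Merge v into u: v's neighbors become u's neighbors.
        for n in list(g.neighbors(v)):
            if n == u:
                continue
            w2 = g[v][n]['weight']
            if g.has_edge(u, n):
                # Both regions already touched n; keeping the smaller
                # weight is just one possible update policy.
                g[u][n]['weight'] = min(g[u][n]['weight'], w2)
            else:
                g.add_edge(u, n, weight=w2)
        g.remove_node(v)
    return g
```

The real function recomputes edge weights from the merged region's mean color instead of reusing old weights, but the control flow is the same.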

I arrived at the threshold `40` after some trial and error. Here is the output.

The drawback here is that the `thresh` argument can vary significantly from image to image.

Loosely speaking, the normalized cut follows a top-down approach, whereas hierarchical merging follows a bottom-up approach. Normalized Cut starts with the graph as a whole and breaks it down into smaller parts. Hierarchical merging, on the other hand, starts with individual regions and merges them into bigger ones until a criterion is reached. The Normalized Cut, however, is much more robust and requires little tuning of its parameters as images change. Hierarchical merging is a lot faster, even though most of its computation logic is written in Python.

Setting a very low threshold will not merge any regions and will give us back the original image. A very large threshold, on the other hand, would merge all regions and return the image as just one big blob. The effect is illustrated below.

With this modification, the following code can output all the intermediate segmentations produced during each iteration.

```python
from skimage import graph, data, io, segmentation, color

img = data.coffee()
labels = segmentation.slic(img, compactness=30, n_segments=400)
g = graph.rag_mean_color(img, labels)
labels2 = graph.merge_hierarchical(labels, g, 60)

c = 0
for label in graph.graph_merge.seg_list:
    out = color.label2rgb(label, img, kind='avg')
    out = segmentation.mark_boundaries(out, label, (0, 0, 0))
    io.imsave('/home/vighnesh/Desktop/agg/' + str(c) + '.png', out)
    c += 1
```

I then used `avconv -f image2 -r 3 -i %d.png -r 20 car.mp4` to output a video. Below are a few examples.

In each of these videos, at every frame, a boundary disappears. This means that the two regions separated by that boundary are merged. The frame rate is 5 FPS, so more than one region might be merged at a time.

In an earlier post I explained how the Normalized Cut works and demonstrated some examples of it. This post aims to take a closer look at the code. I ran the following code to monitor the time taken by NCut with respect to the initial number of regions.

```python
from __future__ import print_function
import time

from skimage import graph, data, io, segmentation, color

image = data.coffee()
segment_list = range(50, 801, 50)
for nseg in segment_list:
    labels = segmentation.slic(image, compactness=30, n_segments=nseg)
    rag = graph.rag_mean_color(image, labels, mode='similarity')
    T = time.time()
    new_labels = graph.ncut(labels, rag)
    time_taken = time.time() - T
    out = color.label2rgb(new_labels, image, kind='avg')
    io.imsave('/home/vighnesh/Desktop/ncut/' + str(nseg) + '.png', out)
    print(nseg, time_taken)
```

Here is the output sequentially.

By a little guess-work, I figured that the curve varies approximately as `x^2.2`. For 800 nodes, the time taken is around 35 seconds.
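For anyone who wants to reproduce the curve-fitting step: the exponent of a power law `t = c * n^p` can be estimated with a least-squares line in log-log space. The timings below are synthetic, constructed to follow the trend reported above, since the raw measurements aren't in the post:

```python
import numpy as np

# Synthetic (n_segments, seconds) pairs standing in for the measured
# timings; real data would come from the benchmark loop above.
n = np.array([50, 100, 200, 400, 800], dtype=float)
t = 1e-4 * n ** 2.2

# If t = c * n**p, then log(t) = p * log(n) + log(c): fit a line.
p, log_c = np.polyfit(np.log(n), np.log(t), 1)
print(round(p, 2))  # exponent of the power law
```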

I used line profiler to examine the time taken by each line of code in `threshold_normalized`. Here are the results.

```
Line #   Hits        Time   Per Hit  % Time  Line Contents
==========================================================
   218                                       @profile
   219                                       def _ncut_relabel(rag, thresh, num_cuts):
   220                                           """Perform Normalized Graph cut on the Region Adjacency Graph.
   221
   222                                           Recursively partition the graph into 2, until further subdivision
   223                                           yields a cut greater than `thresh` or such a cut cannot be computed.
   224                                           For such a subgraph, indices to labels of all its nodes map to a single
   225                                           unique value.
   226
   227                                           Parameters
   228                                           ----------
   229                                           labels : ndarray
   230                                               The array of labels.
   231                                           rag : RAG
   232                                               The region adjacency graph.
   233                                           thresh : float
   234                                               The threshold. A subgraph won't be further subdivided if the
   235                                               value of the N-cut exceeds `thresh`.
   236                                           num_cuts : int
   237                                               The number of N-cuts to perform before determining the optimal one.
   238                                           map_array : array
   239                                               The array which maps old labels to new ones. This is modified inside
   240                                               the function.
   241                                           """
   242      59      218937    3710.8     3.2      d, w = _ncut.DW_matrices(rag)
   243      59         151       2.6     0.0      m = w.shape[0]
   244
   245      59          61       1.0     0.0      if m > 2:
   246      44        3905      88.8     0.1          d2 = d.copy()
   247                                               # Since d is diagonal, we can directly operate on its data
   248                                               # the inverse of the square root
   249      44         471      10.7     0.0          d2.data = np.reciprocal(np.sqrt(d2.data, out=d2.data), out=d2.data)
   250
   251                                               # Refer Shi & Malik 2001, Equation 7, Page 891
   252      44       26997     613.6     0.4          vals, vectors = linalg.eigsh(d2 * (d - w) * d2, which='SM',
   253      44     6577542  149489.6    94.9                                       k=min(100, m - 2))
   254
   255                                               # Pick second smallest eigenvector.
   256                                               # Refer Shi & Malik 2001, Section 3.2.3, Page 893
   257      44         618      14.0     0.0          vals, vectors = np.real(vals), np.real(vectors)
   258      44         833      18.9     0.0          index2 = _ncut_cy.argmin2(vals)
   259      44        2408      54.7     0.0          ev = _ncut.normalize(vectors[:, index2])
   260
   261      44       22737     516.8     0.3          cut_mask, mcut = get_min_ncut(ev, d, w, num_cuts)
   262      44          78       1.8     0.0          if (mcut < thresh):
   263                                                   # Sub divide and perform N-cut again
   264                                                   # Refer Shi & Malik 2001, Section 3.2.5, Page 893
   265      29       78228    2697.5     1.1              sub1, sub2 = partition_by_cut(cut_mask, rag)
   266
   267      29         175       6.0     0.0              _ncut_relabel(sub1, thresh, num_cuts)
   268      29          92       3.2     0.0              _ncut_relabel(sub2, thresh, num_cuts)
   269      29          32       1.1     0.0              return
   270
   271                                           # The N-cut wasn't small enough, or could not be computed.
   272                                           # The remaining graph is a region.
   273                                           # Assign `ncut label` by picking any label from the existing nodes, since
   274                                           # `labels` are unique, `new_label` is also unique.
   275      30         685      22.8     0.0      _label_all(rag, 'ncut label')
```

As you can see above, 95% of the time is taken by the call to `eigsh`.

To take a closer look at it, I plotted the time while ensuring only one iteration. This commit here takes care of it. Also, I changed the `eigsh` call to look for the largest eigenvectors instead of the smallest ones, with this commit here. Here are the results.

A single eigenvalue computation is bounded by `O(n^1.5)`, as mentioned in the original paper. The recursive NCuts push the time required towards more than `O(n^2)`. `eigsh` solves the eigenvalue problem for a real symmetric or complex Hermitian matrix. It in turn relies on a library called ARPACK. As documented here, ARPACK isn’t very good at finding the smallest eigenvectors. If the value supplied as the argument `k` is too small, we get the `ArpackNoConvergence` exception. As seen from the above plot, finding the largest eigenvectors is much more efficient using the `eigsh` function.
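One standard workaround, which the PR doesn't use but is worth noting, is ARPACK's shift-invert mode: passing `sigma=0` to `eigsh` turns the smallest eigenvalues of a matrix into the largest eigenvalues of its inverse, which ARPACK converges to quickly. A sketch on a stand-in symmetric positive definite matrix (not the actual NCut Laplacian):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh

rng = np.random.RandomState(0)
A = rng.rand(200, 200)
M = A.dot(A.T) + 200 * np.eye(200)  # symmetric positive definite stand-in

# Shift-invert mode: eigenvalues of M nearest sigma=0 (i.e. the smallest
# ones) become the largest eigenvalues of (M - sigma*I)^-1, which ARPACK
# handles well. Note which='LM' here refers to the transformed problem.
vals, vecs = eigsh(csr_matrix(M), k=4, sigma=0, which='LM')
```

Whether this pays off for the NCut matrices would need benchmarking, since shift-invert requires a sparse factorization of the matrix.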

Since the problem is specific to ARPACK, using other libraries might lead to faster computation. slepc4py is one such BSD-licensed library. The possibility of optionally importing `slepc4py` should certainly be explored in the near future.

Also, we could optionally accept a user-supplied function to solve the eigenvalue problem, so that users can plug in a matrix library of their choice if they so desire.

Although the current Normalized Cut implementation takes more than quadratic time, the preceding over-segmentation method does most of the heavy lifting. With something like SLIC, we can be sure of the number of nodes irrespective of the input image size. Still, a better technique for finding the smallest eigenvectors would immensely improve its performance.

You will need to pull this Pull Request to be able to execute the code below. I’ll start by defining a custom `show_image` function to aid displaying in IPython notebooks.

```python
from skimage import graph, data, io, segmentation, color
from skimage.measure import regionprops
from matplotlib import pyplot as plt
from matplotlib import colors
import numpy as np


def show_image(img):
    width = img.shape[1] / 50.0
    height = img.shape[0] * width / img.shape[1]
    f = plt.figure(figsize=(width, height))
    plt.imshow(img)
```

We will start by loading a demo image containing just 3 bold colors to help us see how the `draw_rag` function works.

```python
image = io.imread('/home/vighnesh/Desktop/images/colors.png')
show_image(image)
```

We will now use the SLIC algorithm to give us an over-segmentation, on which we will build our RAG.

```python
labels = segmentation.slic(image, compactness=30, n_segments=400)
```

Here’s what the over-segmentation looks like.

```python
border_image = segmentation.mark_boundaries(image, labels, (0, 0, 0))
show_image(border_image)
```

We can now form our RAG and see how it looks.

```python
rag = graph.rag_mean_color(image, labels)
out = graph.draw_rag(labels, rag, border_image)
show_image(out)
```

In the above image, nodes are shown in yellow whereas edges are shown in green. Each region is represented by its centroid. As Juan pointed out, many edges will be difficult to see because of low contrast between them and the image, as seen above. To counter this we support the `desaturate` option. When set to `True` the image is converted to grayscale before displaying. Hence all the image pixels are a shade of gray, while the edges and nodes stand out.

```python
out = graph.draw_rag(labels, rag, border_image, desaturate=True)
show_image(out)
```

Although the above image does very well to show us individual regions and their adjacency relationships, it does nothing to show us the magnitude of the edge weights. To convey that information, we have the `colormap` option. It colors edges between the first and the second color depending on their weight.

```python
blue_red = colors.ListedColormap(['blue', 'red'])
out = graph.draw_rag(labels, rag, border_image, desaturate=True,
                     colormap=blue_red)
show_image(out)
```

As you can see, the edges between similar regions are blue, whereas edges between dissimilar regions are red. `draw_rag` also accepts a `thresh` option; all edges above `thresh` are not considered for drawing.

```python
out = graph.draw_rag(labels, rag, border_image, desaturate=True,
                     colormap=blue_red, thresh=10)
show_image(out)
```

Another clever trick is to supply a blank image; this way, we can see the RAG unobstructed.

```python
cyan_red = colors.ListedColormap(['cyan', 'red'])
out = graph.draw_rag(labels, rag, np.zeros_like(image), desaturate=True,
                     colormap=cyan_red)
show_image(out)
```

**Ahhh, magnificent.**

Here is a small piece of code which produces a typical desaturated color-distance RAG.

```python
image = data.coffee()
labels = segmentation.slic(image, compactness=30, n_segments=400)
rag = graph.rag_mean_color(image, labels)
cmap = colors.ListedColormap(['blue', 'red'])
out = graph.draw_rag(labels, rag, image, border_color=(0, 0, 0),
                     desaturate=True, colormap=cmap)
show_image(out)
```

If you look closely at the above image, you will find some edges crossing over each other. This is because some regions are not convex: their centroid lies outside their boundary, and edges emanating from it can cross other edges.
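One possible fix, sketched below as my own illustration (it is not part of scikit-image), is to anchor each node at the pixel of its region farthest from the region's boundary; unlike the centroid, such a point always lies inside the region:

```python
import numpy as np
from scipy import ndimage


def interior_point(mask):
    """Return the (row, col) of the pixel of ``mask`` farthest from the
    region boundary; unlike the centroid, it always lies inside."""
    dist = ndimage.distance_transform_edt(mask)
    return np.unravel_index(np.argmax(dist), mask.shape)


# A hollow square region: its centroid falls in the empty middle.
mask = np.zeros((5, 5), dtype=bool)
mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = True
```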

I will go over some examples of RAG drawings; since most of the code is similar, I won’t repeat it here. The NCut technique, wherever used, was run with its default parameters.

Notice how the centroid of the white rim of the cup is placed at its centre. It is adjacent to the centroid of the gray region of the upper part of the spoon, connected to it via a blue edge. Notice how this edge crosses others.

- A point that was brought up in the PR as well is that thick lines would immensely enhance the visual appeal of the output. As and when they are implemented, `rag_draw` should be modified to support drawing thick edges.
- As centroids don’t always lie within an object’s boundary, we could represent regions by a point other than their centroid, something which always lies within the boundary. This would allow for better visualization of the actual RAG from its drawing.

The first thing I can think of that does something useful in the above-mentioned situation is the Minimum Cut algorithm. It divides a graph into two parts, `A` and `B`, such that the weight of the edges going from nodes in set `A` to nodes in set `B` is minimum.

For the **Minimum Cut** algorithm to work, we need to define the weights of our Region Adjacency Graph (RAG) in such a way that similar regions have higher weight. This way, removing the lowest-weight edges would leave the similar regions connected.
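A common way to get this behavior is to pass the color distance through a decaying exponential, so that identical colors map to weight 1 and distant colors to weights near 0. The sketch below illustrates the idea; the exact `sigma` normalization used by scikit-image's `'similarity'` mode may differ, so check the `rag_mean_color` docstring:

```python
import numpy as np


def similarity_weight(color1, color2, sigma=255.0):
    """Map color distance to a similarity: identical colors give 1.0,
    very different colors give a weight near 0."""
    d2 = np.sum((np.asarray(color1, float) - np.asarray(color2, float)) ** 2)
    return np.exp(-d2 / sigma)
```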

For all the examples below to work, you will need to pull from this Pull Request. The tests fail due to outdated NumPy and SciPy versions on Travis; I have also submitted a Pull Request to fix that. Just like the last post, I have a `show_img` function.

```python
from skimage import graph, data, io, segmentation, color
from skimage.measure import regionprops
from skimage import draw
from matplotlib import pyplot as plt
import numpy as np


def show_img(img):
    width = img.shape[1] / 75.0
    height = img.shape[0] * width / img.shape[1]
    f = plt.figure(figsize=(width, height))
    plt.imshow(img)
```

I have modified the `display_edges` function for this demo. It draws nodes in yellow. Edges with low weights are greener and edges with high weights are redder.

```python
def display_edges(image, g):
    """Draw the edges of a RAG on its image.

    Returns a modified image with the edges drawn. Edges with a high
    weight are drawn in red and edges with a low weight are drawn in
    green. Nodes are drawn in yellow.

    Parameters
    ----------
    image : ndarray
        The image to be drawn on.
    g : RAG
        The Region Adjacency Graph.

    Returns
    -------
    out : ndarray
        Image with the edges drawn.
    """
    image = image.copy()
    max_weight = max([d['weight'] for x, y, d in g.edges_iter(data=True)])
    min_weight = min([d['weight'] for x, y, d in g.edges_iter(data=True)])

    for edge in g.edges_iter():
        n1, n2 = edge

        r1, c1 = map(int, rag.node[n1]['centroid'])
        r2, c2 = map(int, rag.node[n2]['centroid'])

        # Use arrays so that scaling by a float works element-wise.
        green = np.array([0, 1, 0], dtype=float)
        red = np.array([1, 0, 0], dtype=float)

        line = draw.line(r1, c1, r2, c2)
        circle = draw.circle(r1, c1, 2)
        norm_weight = ((g[n1][n2]['weight'] - min_weight) /
                       (max_weight - min_weight))

        image[line] = norm_weight * red + (1 - norm_weight) * green
        image[circle] = 1, 1, 0

    return image
```

To demonstrate the `display_edges` function, I will load an image which just has two regions of black and white.

```python
demo_image = io.imread('bw.png')
show_img(demo_image)
```

Let’s compute the pre-segmentation using the SLIC method. In addition, we will also use `regionprops` to give us the centroid of each region to aid the `display_edges` function.

```python
labels = segmentation.slic(demo_image, compactness=30, n_segments=100)
labels = labels + 1  # So that no labelled region is 0 and ignored by regionprops
regions = regionprops(labels)
```

We will use `label2rgb` to replace each region with its average color. Since the image is so monotonous, the difference is hardly noticeable.

```python
label_rgb = color.label2rgb(labels, demo_image, kind='avg')
show_img(label_rgb)
```

We can use `mark_boundaries` to display region boundaries.

```python
label_rgb = segmentation.mark_boundaries(label_rgb, labels, (0, 1, 1))
show_img(label_rgb)
```

As mentioned earlier, we need to construct a graph in which similar regions have higher edge weights between them. For this we supply the `"similarity"` option to `rag_mean_color`.

```python
rag = graph.rag_mean_color(demo_image, labels, mode="similarity")

for region in regions:
    rag.node[region['label']]['centroid'] = region['centroid']

label_rgb = display_edges(label_rgb, rag)
show_img(label_rgb)
```

If you look closely above, the edges between the black regions are red, and so are the edges between the white regions, i.e. those regions are very similar. However, the edges between the black and white regions are green, indicating they are less similar.

Consider the following graph.

The minimum cut approach would partition the graph as `{A, B, C, D}` and `{E}`. It has a tendency to separate out small, isolated parts of the graph. This is undesirable for image segmentation, as it would separate out small, relatively disconnected regions of the image. A more reasonable partition would be `{A, C}` and `{B, D, E}`. To counter this aspect of the minimum cut, we use the **Normalized Cut**.

It is defined as follows. Let `V` be the set of all nodes and `w(u, v)`, for `u, v ∈ V`, be the edge weight between `u` and `v`. For a partition of `V` into disjoint sets `A` and `B`:

    NCut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)

where

    cut(A, B)   = Σ w(u, v)  over u ∈ A, v ∈ B
    assoc(A, V) = Σ w(u, t)  over u ∈ A, t ∈ V

With the above equation, **NCut** won’t be low if either **A** or **B** is not well-connected with the rest of the graph. Consider the same graph as the last one.

We can see that minimizing the **NCut** gives us the expected partition, that is, `{A, C}` and `{B, D, E}`.
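To make the definition concrete, here is a tiny worked example on a hypothetical weighted path graph (the weights are my own, not taken from the figure above): two tightly coupled pairs joined by one weak edge. The balanced cut through the weak edge scores far lower than cutting off a single node:

```python
# Path graph: a --4-- b --1-- c --4-- d
edges = {('a', 'b'): 4.0, ('b', 'c'): 1.0, ('c', 'd'): 4.0}


def cut(A, B):
    """Total weight of edges crossing between A and B."""
    return sum(w for (u, v), w in edges.items()
               if (u in A and v in B) or (u in B and v in A))


def assoc(A):
    """Total connection from nodes in A to the whole graph. Each edge
    contributes once per endpoint inside A, so internal edges count
    twice, matching the double sum in the definition."""
    return sum(w * ((u in A) + (v in A)) for (u, v), w in edges.items())


def ncut(A, B):
    return cut(A, B) / assoc(A) + cut(A, B) / assoc(B)


balanced = ncut({'a', 'b'}, {'c', 'd'})  # cut the weak middle edge
lopsided = ncut({'a'}, {'b', 'c', 'd'})  # cut off a single node
```

Here `balanced` evaluates to 1/9 + 1/9 ≈ 0.22, while `lopsided` evaluates to 4/4 + 4/14 ≈ 1.29, so minimizing NCut prefers the balanced split even though its plain cut weight (1) is smaller than it needs to be for a minimum cut to care.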

The idea of using Normalized Cut for segmenting images was first suggested by Jianbo Shi and Jitendra Malik in their paper Normalized Cuts and Image Segmentation. Instead of pixels, we are considering RAGs as nodes.

The problem of finding NCut is NP-Complete. Appendix A of the paper has a proof for it. It is made tractable by an approximation explained in Section 2.1 of the paper. The function _ncut_relabel is responsible for actually carrying out the NCut. It divides the graph into two parts, such that the NCut is minimized. Then for each of the two parts, it recursively carries out the same procedure until the NCut is unstable, i.e. it evaluates to a value greater than the specified threshold. Here is a small snippet to illustrate.

```python
img = data.coffee()

labels1 = segmentation.slic(img, compactness=30, n_segments=400)
out1 = color.label2rgb(labels1, img, kind='avg')

g = graph.rag_mean_color(img, labels1, mode='similarity')
labels2 = graph.cut_normalized(labels1, g)
out2 = color.label2rgb(labels2, img, kind='avg')
show_img(out2)
```

To observe how the NCut works, I wrote a small hack. It shows us the regions as divided by the method at every stage of recursion. The code relies on a modification in the original code, which can be seen here.

```python
import os

from skimage import graph, data, io, segmentation, color

os.system('rm *.png')

img = data.coffee()
labels1 = segmentation.slic(img, compactness=30, n_segments=400)
out1 = color.label2rgb(labels1, img, kind='avg')

g = graph.rag_mean_color(img, labels1, mode='similarity')
labels2 = graph.cut_normalized(labels1, g)

offset = 1000
count = 1
tmp_labels = labels1.copy()
for g1, g2 in graph.graph_cut.sub_graph_list:
    for n, d in g1.nodes_iter(data=True):
        for l in d['labels']:
            tmp_labels[labels1 == l] = offset
    offset += 1
    for n, d in g2.nodes_iter(data=True):
        for l in d['labels']:
            tmp_labels[labels1 == l] = offset
    offset += 1
    tmp_img = color.label2rgb(tmp_labels, img, kind='avg')
    io.imsave(str(count) + '.png', tmp_img)
    count += 1
```

The two components at each stage are stored in the form of tuples in `sub_graph_list`. Let’s say the graph was divided into `A` and `B` initially, and later `A` was divided into `A1` and `A2`. The first iteration of the loop will label `A` and `B`. The second iteration will label `A1`, `A2` and `B`, and so on. I took the saved PNGs and converted them into a video with `avconv` using the command `avconv -f image2 -r 1 -i %d.png -r 20 demo.webm`. GIFs would result in a loss of color, so I made webm videos. Below are a few images and their respective successive NCuts. Use **Full Screen** for better viewing.

Note that although there is a user supplied **threshold**, it does not have to vary significantly. For all the demos below, the default value is used.

During each iteration, one region (area of the image with the same color) is split into two. A region is represented by its average color. Here’s what happens in the video:

- The image is divided into red and the rest of the regions (gray at this point).
- The gray is divided into a dark pink region (pink, maroon and yellow) and a dark green region (cyan, green and blue).
- The dark green region is divided into light blue (cyan and blue) and the green region.
- The light blue region is divided into cyan and blue.
- The dark pink region is divided into yellow and a darker pink (pink and maroon) region.
- The darker pink region is divided into pink and maroon regions.

The current segmentation functions in scikit-image are too fine-grained and fall closer to superpixel methods, providing a starting point for segmentation. Region Adjacency Graphs (RAGs) are a common data structure for many segmentation algorithms. As part of GSoC this year I am implementing RAGs for scikit-image. The current HEAD of scikit-image’s master branch contains my RAG implementation based on NetworkX from my recent Pull Request. In the example below, we will see how Region Adjacency Graphs (RAGs) attempt to solve the segmentation problem. Please note that you need the latest master branch of scikit-image to run the following code.

We define the function `show_img` in preference to the standard call to `imshow` to set nice default size parameters. We start with `coffee`, a nice fresh image of a coffee cup.

```python
from skimage import graph, data, io, segmentation, color
from skimage.measure import regionprops
from skimage import draw
from matplotlib import pyplot as plt
import numpy as np


def show_img(img):
    width = 10.0
    height = img.shape[0] * width / img.shape[1]
    f = plt.figure(figsize=(width, height))
    plt.imshow(img)


img = data.coffee()
show_img(img)
```

We segment the image using the SLIC algorithm, which will assign a unique label to each **region**: a localized cluster of pixels sharing some similar property, in this case their color. The label of each pixel is stored in the `labels` array.

`regionprops` helps us compute various features of these regions. We will be using the centroid, purely for visualization.

```python
labels = segmentation.slic(img, compactness=30, n_segments=400)
labels = labels + 1  # So that no labelled region is 0 and ignored by regionprops
regions = regionprops(labels)
```

The `label2rgb` function assigns a specific color to all pixels belonging to one region (having the same label). In this case, in `label_rgb` each pixel is replaced with the average `RGB` color of its region.

```python
label_rgb = color.label2rgb(labels, img, kind='avg')
show_img(label_rgb)
```

Just for clarity, we use `mark_boundaries` to highlight the region boundaries. You will notice that the image is divided into more regions than required. This phenomenon is called **over-segmentation**.

```python
label_rgb = segmentation.mark_boundaries(label_rgb, labels, (0, 0, 0))
show_img(label_rgb)
```

Region Adjacency Graphs, as the name suggests, represent adjacency of regions with a graph. Each region in the image is a node in the graph. There is an edge between every pair of adjacent regions (regions whose pixels are adjacent). The weight between every two nodes can be defined in a variety of ways. For this example, we will use the difference of average color between two regions as their edge weight. The more similar the regions, the lesser the weight between them. Because we are using the difference in mean color to compute the edge weight, the method has been named `rag_mean_color`.

```python
rag = graph.rag_mean_color(img, labels)
```

For our visualization, we also add an additional property to each node: the coordinates of its centroid.

```python
for region in regions:
    rag.node[region['label']]['centroid'] = region['centroid']
```

`display_edges` is a function to draw the edges of a RAG on its corresponding image. It draws edges as green lines and centroids as yellow dots. It also accepts an argument `thresh`; we only draw edges with weight below this threshold.

```python
def display_edges(image, g, threshold):
    """Draw the edges of a RAG on its image.

    Returns a modified image with the edges drawn. Edges are drawn in
    green and nodes are drawn in yellow.

    Parameters
    ----------
    image : ndarray
        The image to be drawn on.
    g : RAG
        The Region Adjacency Graph.
    threshold : float
        Only edges in `g` below `threshold` are drawn.

    Returns
    -------
    out : ndarray
        Image with the edges drawn.
    """
    image = image.copy()
    for edge in g.edges_iter():
        n1, n2 = edge

        r1, c1 = map(int, rag.node[n1]['centroid'])
        r2, c2 = map(int, rag.node[n2]['centroid'])

        line = draw.line(r1, c1, r2, c2)
        circle = draw.circle(r1, c1, 2)

        if g[n1][n2]['weight'] < threshold:
            image[line] = 0, 1, 0
        image[circle] = 1, 1, 0

    return image
```

We call the function with `thresh = infinity` so that all edges are drawn. I myself was surprised by the beauty of the following output.

```python
edges_drawn_all = display_edges(label_rgb, rag, np.inf)
show_img(edges_drawn_all)
```

Let’s see what happens by setting `thresh` to `29`, a value I arrived at with some trial and error.

```python
edges_drawn_29 = display_edges(label_rgb, rag, 29)
show_img(edges_drawn_29)
```

As you can see above, the RAG is now divided into disconnected regions. If you look closely, the table above and to the right of the dish is one big connected component.

The function `cut_threshold` removes edges below a specified threshold and then labels each connected component as one region. Once the RAG is constructed, many similar and more sophisticated strategies can improve the initial segmentation.

```python
final_labels = graph.cut_threshold(labels, rag, 29)
final_label_rgb = color.label2rgb(final_labels, img, kind='avg')
show_img(final_label_rgb)
```

Not perfect, but not that bad, I’d say. My next steps will be to implement better algorithms to process the RAG after the initial segmentation. These include the merging predicates mentioned here and N-cut.

For the story so far, see the Introduction, the 2nd week update, and the Graph Data Structures Comparison.

After much debate on the mailing list with many scikit-image developers, we finally decided to use the NetworkX Graph class for our Region Adjacency Graph (RAG). It comes with a lot of well-tested functionality and will speed up the GSoC progress. It is also pure Python, and shares a lot of its dependencies with scikit-image.

To construct a RAG, we need to iterate over the entire labeled image, looking for adjacent pixels with distinct labels. Initially I wrote a special-case Cython loop for 2D and 3D, much like this one. But the scikit-image developers suggested, and rightly so, a more general n-dimensional approach. I looked for a function which would iterate over an entire array and call a given function; as it turns out, `generic_filter` does exactly that. I have used it here with the callable defined here.

The footprint is created such that only the elements which are 2nd and 3rd along each axis are set, according to the connectivity; all the rest are forced to zero. In the 2D case with connectivity 2, it is the bottom-right elements (`a[1, 1]`, `a[1, 2]`, `a[2, 1]`, `a[2, 2]`) which are set. This ensures that a pair of adjacent pixels is not processed again when the filter window moves ahead.

The footprint ensures that the first element in the array passed to the callable is the central element in the footprint. All other elements are adjacent to the central element and an edge is created between them and the central element.
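Here is a runnable sketch of this idea. It is my own toy reimplementation, not the PR's code: the footprint covers the central pixel and its "forward" neighbours, and a closure collects an edge whenever a neighbouring label differs from the central one:

```python
import numpy as np
from scipy import ndimage


def rag_edges(labels):
    """Collect the edge set of a RAG from a 2D label image using
    scipy.ndimage.generic_filter with connectivity 2."""
    edges = set()

    def collect(values):
        # The footprint positions are visited in row-major order, so
        # values[0] is always the central pixel of the window.
        center = int(values[0])
        for v in values[1:]:
            if int(v) != center:
                edges.add((min(center, int(v)), max(center, int(v))))
        return 0.0  # generic_filter needs a scalar; the output is ignored

    # Central element plus the bottom-right neighbours only, so each
    # adjacent pair is visited exactly once as the window slides.
    fp = np.zeros((3, 3), dtype=bool)
    fp[1, 1] = fp[1, 2] = fp[2, 1] = fp[2, 2] = True

    ndimage.generic_filter(labels.astype(float), collect,
                           footprint=fp, mode='nearest')
    return edges
```

The `mode='nearest'` border handling replicates edge pixels, so no spurious edges appear at the image boundary.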

I implemented a skeletal RAG algorithm following Juan’s suggestion. It takes the labeled image as input. This is typically an over-segmented image obtained by algorithms like SLIC or watershed. Each region in the labeled image is represented by a node in the graph. Nodes of adjacent regions are joined by an edge. The weight of this edge is the magnitude of difference in mean color. Once the graph is constructed, nodes joined by edges with weights lesser than a given threshold are combined into one region.

You can view the Pull Request here.

Below are the test results of executing the example code on two scikit-image sample images. The threshold value for the two results is different and was found by trial and error. Typically, a higher threshold value gives fewer regions in the final image.

The mechanism for RAGs is in place now. The segmentation is pretty satisfactory, considering the simple logic. The major drawback however is that it’s not fully automatic. Over the next few weeks I will implement more sophisticated merging predicates, including N-cut.

Although I am yet to handle them, Juan has pointed out that typical images he has worked on have had 8000 x 8000 x 8000 pixel resolution with close to 100,000 nodes. If we use a typical adjacency matrix with 8-bit unsigned entries, that would consume 9.5 GB. However, the number of edges in a RAG is low compared to the maximum number of possible edges. Thus, we need a data structure whose storage requirements are a linear function of the number of edges. My initial intuition was to design a class by wrapping a thin layer of code around one of SciPy’s sparse matrix classes. But collapsing edges (joining two nodes) is not easily possible in any of these formats. Hence we decided on implementing our own data structure from scratch.
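The arithmetic behind that decision is worth spelling out. Assuming, say, an average of 6 neighbours per region (a made-up but plausible figure for an over-segmentation), an edge list is several orders of magnitude smaller than a dense adjacency matrix:

```python
n_nodes = 100_000

# Dense adjacency matrix with 8-bit unsigned entries: one byte per pair.
dense_bytes = n_nodes * n_nodes
print(dense_bytes / 2 ** 30)   # roughly 9.3 GiB

# Sparse edge list, assuming ~6 neighbours per node: each undirected
# edge stored as two 4-byte node ids plus an 8-byte float weight.
n_edges = 6 * n_nodes // 2
sparse_bytes = n_edges * (2 * 4 + 8)
print(sparse_bytes / 2 ** 20)  # well under 5 MiB
```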

To test the 4 possible approaches that we came up with, I am constructing the RAG for the image and randomly merging nodes till only 10 are left. The source code and the benchmarks can be found here.

See graph_lil.py and rag_lil.pyx. This is the typical adjacency-list representation, except the lists aren’t Python lists; they are NumPy arrays. The list of neighbors is kept sorted in each case to optimize lookups (see the add_edge function). This also makes the code a little more complex than the other approaches. Of all the approaches, this is the most memory efficient, but also the slowest, since adding an edge and merging nodes require a lot of memory movement.
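The searchsorted/insert pattern at the heart of that approach looks roughly like this. It is a simplified stand-in for the real `add_edge`, and the overwrite-on-duplicate policy is a choice I made for illustration:

```python
import numpy as np


def add_neighbor(neighbors, weights, node, weight):
    """Insert ``node`` into a sorted neighbor array, keeping the
    parallel ``weights`` array aligned. Returns the new arrays."""
    i = np.searchsorted(neighbors, node)
    if i < len(neighbors) and neighbors[i] == node:
        weights[i] = weight  # edge already exists: overwrite its weight
        return neighbors, weights
    # np.insert copies the whole array; this memory movement is why the
    # approach is slow despite being the most compact.
    return np.insert(neighbors, i, node), np.insert(weights, i, weight)


nbrs = np.array([], dtype=np.intp)
wts = np.array([], dtype=float)
for node, w in [(3, 0.3), (1, 0.1), (2, 0.2)]:
    nbrs, wts = add_neighbor(nbrs, wts, node, w)
```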

Apart from the image, the only thing consuming significant memory is the construction of the RAG.

```
Line #    Mem usage    Increment   Line Contents
================================================
    12   18.223 MiB    0.000 MiB   @profile
    13                             def test():
    14  496.133 MiB  477.910 MiB       arr = np.load("../data/watershed.npy")
    15  496.133 MiB    0.000 MiB       t = time.time()
    16  504.375 MiB    8.242 MiB       g = graph.construct_rag(arr)
    17
    18
    19  504.383 MiB    0.008 MiB       print "RAG construction took %f secs " % (time.time() - t)
    20
    21  504.383 MiB    0.000 MiB       t = time.time()
    22  504.910 MiB    0.527 MiB       g.random_merge(10)
    23  504.965 MiB    0.055 MiB       g.display()
    24  504.965 MiB    0.000 MiB       print "Merging took %f secs " % (time.time() - t)
```

Since **line_profiler** doesn’t work with Cython, the output below comes from Cython’s built-in profiling support. From it, it’s clear that the **insert** and **searchsorted** functions consume the most time.

```
Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 72104155  275.539    0.000  640.501    0.000 rag_lil.pyx:18(add_edge)
144208310  274.654    0.000  274.654    0.000 {method 'searchsorted' of 'numpy.ndarray' objects}
        1  164.755  164.755  999.104  999.104 {rag_lil.construct_rag_3d_lil}
144208312   90.060    0.000   90.060    0.000 stringsource:317(__cinit__)
144208310   84.936    0.000  359.589    0.000 fromnumeric.py:952(searchsorted)
144208311   66.758    0.000  156.817    0.000 stringsource:613(memoryview_cwrapper)
144208312   21.694    0.000   21.694    0.000 stringsource:339(__dealloc__)
144208311   15.142    0.000   15.142    0.000 stringsource:619(memoryview_check)
   158956    3.893    0.000    5.373    0.000 function_base.py:3305(insert)
   476868    0.726    0.000    0.726    0.000 {numpy.core.multiarray.array}
   199734    0.272    0.000    0.272    0.000 {numpy.core.multiarray.empty}
   158956    0.179    0.000    0.424    0.000 numeric.py:392(asarray)
   158956    0.163    0.000    0.163    0.000 numeric.py:1299(rollaxis)
        1    0.137    0.137    0.137    0.137 {method 'reduce' of 'numpy.ufunc' objects}
   158956    0.127    0.000    0.127    0.000 {isinstance}
   158956    0.051    0.000    0.051    0.000 {method 'item' of 'numpy.ndarray' objects}
        1    0.019    0.019    0.057    0.057 graph_lil.py:8(__init__)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.137    0.137 fromnumeric.py:2048(amax)
        1    0.000    0.000    0.137    0.137 _methods.py:15(_amax)
        1    0.000    0.000  999.104  999.104 graph_lil.py:59(construct_rag)
        1    0.000    0.000    0.000    0.000 stringsource:957(memoryview_fromslice)
        1    0.000    0.000  999.104  999.104 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 stringsource:933(__dealloc__)
        1    0.000    0.000    0.000    0.000 stringsource:508(__get__)
        1    0.000    0.000    0.000    0.000 stringsource:468(__getbuffer__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

**search**, **insert**, and **delete** consume the most time in this case.

```
Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    20378  117.102    0.006  252.734    0.012 rag_lil.pyx:47(merge_node)
 56129898   98.659    0.000   98.659    0.000 {method 'searchsorted' of 'numpy.ndarray' objects}
 56129898   32.130    0.000  130.789    0.000 fromnumeric.py:952(searchsorted)
   107196    1.997    0.000    2.734    0.000 function_base.py:3112(delete)
    68494    1.339    0.000    1.837    0.000 function_base.py:3305(insert)
        1    0.605    0.605  254.349  254.349 graph_lil.py:48(random_merge)
   419874    0.573    0.000    0.573    0.000 {numpy.core.multiarray.array}
   161011    0.427    0.000    0.453    0.000 random.py:173(randrange)
   161011    0.263    0.000    0.716    0.000 random.py:236(randint)
   282886    0.214    0.000    0.600    0.000 numeric.py:392(asarray)
   175690    0.205    0.000    0.205    0.000 {numpy.core.multiarray.empty}
   163024    0.162    0.000    0.162    0.000 stringsource:317(__cinit__)
   282886    0.134    0.000    0.134    0.000 {isinstance}
    20378    0.101    0.000  252.942    0.012 {rag_lil.merge_node_py}
    40756    0.085    0.000    0.108    0.000 stringsource:957(memoryview_fromslice)
   122268    0.068    0.000    0.213    0.000 stringsource:613(memoryview_cwrapper)
    68494    0.065    0.000    0.065    0.000 numeric.py:1299(rollaxis)
    20378    0.045    0.000    0.052    0.000 random.py:271(choice)
   175690    0.044    0.000    0.044    0.000 {method 'item' of 'numpy.ndarray' objects}
    20378    0.033    0.000  252.975    0.012 graph_lil.py:31(merge)
   163024    0.033    0.000    0.033    0.000 stringsource:339(__dealloc__)
   181389    0.029    0.000    0.029    0.000 {method 'random' of '_random.Random' objects}
        1    0.023    0.023  254.371  254.371 <string>:1(<module>)
   122268    0.015    0.000    0.015    0.000 stringsource:619(memoryview_check)
    40756    0.006    0.000    0.006    0.000 stringsource:508(__get__)
    40756    0.006    0.000    0.006    0.000 stringsource:468(__getbuffer__)
    40756    0.005    0.000    0.005    0.000 stringsource:933(__dealloc__)
    20378    0.005    0.000    0.005    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

See graph_nx.py and rag_nx.pyx. This is a subclass of networkx.Graph. The only additional feature I added was the merge function. NetworkX maintains a dictionary per node of its adjacent nodes, and each edge has its own dictionary for maintaining its properties. This leads to increased memory usage, but because dictionary lookup is O(1), edge addition and contraction are very fast. As Juan pointed out to me, because of the small load factor of Python’s dictionaries, most buckets remain empty, and dictionaries also have to store the key as well as the value. Therefore, this required the 2nd highest memory amongst the tested approaches. Juan used a similar approach for his project here and has reported ~100 GB of RAM usage.

Graph construction consumes memory, while merging frees up space, because we delete one of the merged nodes.

```
Line #    Mem usage    Increment   Line Contents
================================================
    12   23.023 MiB    0.000 MiB   @profile
    13                             def test():
    14  500.918 MiB  477.895 MiB       arr = np.load("../data/watershed.npy")
    15  500.918 MiB    0.000 MiB       t = time.time()
    16  530.703 MiB   29.785 MiB       g = graph.construct_rag(arr)
    17
    18
    19  530.719 MiB    0.016 MiB       print "RAG construction took %f secs " % (time.time() - t)
    20
    21  530.719 MiB    0.000 MiB       t = time.time()
    22  517.906 MiB  -12.812 MiB       g.random_merge(10)
    23                                 #g.display()
    24  517.906 MiB    0.000 MiB       print "Merging took %f secs " % (time.time() - t)
```

Since the function to add edges comes directly from the networkx class, I haven’t profiled it.

Here we have a line-by-line time profile obtained using the **line_profiler** module. The most significant contribution is from looping through the neighbors to get edge weights.

```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    26                                           @profile
    27                                           def merge(self, i, j):
    28
    29     20378        45880      2.3      4.2      if not self.has_edge(i, j):
    30                                                   raise ValueError('Cant merge non adjacent nodes')
    31
    32                                               # print "before ",self.order()
    33     94147       115543      1.2     10.5      for x in self.neighbors(i):
    34     73769        58640      0.8      5.3          if x == j:
    35     20378        14190      0.7      1.3              continue
    36     53391       139198      2.6     12.6          w1 = self.get_edge_data(x, i)['weight']
    37     53391        39027      0.7      3.5          w2 = -1
    38     53391       103894      1.9      9.4          if self.has_edge(x, j):
    39     19352        42703      2.2      3.9              w2 = self.get_edge_data(x, j)['weight']
    40
    41     53391        57866      1.1      5.2          w = max(w1, w2)
    42
    43     53391       313574      5.9     28.4          self.add_edge(x, j, weight=w)
    44
    45     20378       172383      8.5     15.6      self.remove_node(i)
```

See graph_custom.py and rag_custom.pyx. This is something Stefan and I came up with during a chat. It is similar to NetworkX’s approach, but the nodes are stored in a list instead of a dictionary. Also, instead of maintaining a dictionary per edge for its properties (like weight), we maintain one dictionary per node, where the keys are the adjacent nodes and the values are the weights. Both the **Custom** and the **NetworkX** representations require more memory than **LIL**, because dictionaries have to store the key and the value, and because of the small load factor of Python’s dictionaries. Although not the most memory efficient, this is the fastest approach for both graph construction and node merging.
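A simplified sketch of this representation (the names are illustrative; the actual code is in graph_custom.py and rag_custom.pyx):

```python
class CustomGraph:
    """One dict per node mapping neighbour -> weight, stored in a list
    indexed by node id, as described above."""
    def __init__(self, n):
        self.rows = [{} for _ in range(n)]

    def make_edge(self, i, j, wt):
        self.rows[i][j] = wt
        self.rows[j][i] = wt

    def merge(self, i, j):
        """Contract edge (i, j): fold i's neighbours into j, keeping the
        larger weight when both nodes touch the same neighbour."""
        if j not in self.rows[i]:
            raise ValueError("Can't merge non-adjacent nodes")
        for x, w1 in self.rows[i].items():
            if x == j:
                continue
            w = max(w1, self.rows[x].get(j, -1))
            self.make_edge(x, j, w)
        # disconnect i; its (now empty) dict lingers in the list
        for x in list(self.rows[i]):
            del self.rows[x][i]
        self.rows[i] = {}
```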

Again, we see that the RAG itself adds to the memory used beyond the image. Merging nodes does not result in a significant decrease in memory because one of the merged nodes still holds an empty dictionary.

```
Line #    Mem usage    Increment   Line Contents
================================================
    13   18.184 MiB    0.000 MiB   @profile
    14                             def test():
    15  496.137 MiB  477.953 MiB       arr = np.load("../data/watershed.npy")
    16  496.137 MiB    0.000 MiB       t = time.time()
    17  507.371 MiB   11.234 MiB       g = graph.construct_rag(arr)
    18
    19  507.383 MiB    0.012 MiB       print "RAG construction took %f secs " % (time.time() - t)
    20
    21                                 # print g.max_size
    22  507.383 MiB    0.000 MiB       t = time.time()
    23  506.543 MiB   -0.840 MiB       g.random_merge(10)
    24  506.543 MiB    0.000 MiB       g.display()
    25                                 # print g.max_size
    26  506.543 MiB    0.000 MiB       print "Merging took %f secs " % (time.time() - t)
```

The dictionary update and lookup take the most time. Maintaining the edge count was done only for test purposes.

```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    24                                           @profile
    25                                           def make_edge(self, i, j, wt):
    26  72157546     40339563      0.6     19.9      try:
    27  72157546     56802957      0.8     28.0          self.rows[i][j]
    28     73778        92866      1.3      0.0      except KeyError:
    29     73778        72716      1.0      0.0          self.edge_count += 1
    30
    31  72157546     52556299      0.7     25.9      self.rows[i][j] = wt
    32  72157546     53126454      0.7     26.2      self.rows[j][i] = wt
```

Making new edges takes the most time.

```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    41                                           @profile
    42                                           def merge(self, i, j):
    43
    44     20378        48653      2.4      3.5      if not self.has_edge(i, j):
    45                                                   raise ValueError('Cant merge non adjacent nodes')
    46
    47
    48                                               # print "before ",self.order()
    49     94147       100716      1.1      7.3      for x in self.neighbors(i):
    50     73769        56567      0.8      4.1          if x == j:
    51     20378        14158      0.7      1.0              continue
    52     53391        94329      1.8      6.8          w1 = self.get_weight(x, i)
    53     53391        38553      0.7      2.8          w2 = -1
    54     53391       181510      3.4     13.1          if self.has_edge(x, j):
    55     19352        29334      1.5      2.1              w2 = self.get_weight(x,j)
    56
    57     53391        59662      1.1      4.3          w = max(w1, w2)
    58
    59     53391       648168     12.1     46.8          self.make_edge(x, j, w)
    60
    61     20378       113651      5.6      8.2      self.remove_node(i)
```

See graph_csr.py and rag_csr.pyx. This is inspired by SciPy’s csr_matrix. To account for merging of nodes, we append the data of the merged node to the end of the indptr, indices, and data arrays (the merge function). We invalidate merged nodes by setting a valid flag to False. These arrays double in size whenever required (the double function), which gives an O(1) amortized memory-movement cost. The merging code is also almost pure Cython with very few Python calls, which makes it faster than the LIL approach. As we later discovered, the doubling happens 8 times for our test case (see the profile below), leading to a 256-fold increase in array size. This results in the highest memory usage among all the approaches.
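The doubling scheme can be illustrated with a toy growable array (a hypothetical class; the real double function resizes the indptr, indices, and data arrays together):

```python
import numpy as np

class GrowableArray:
    """Appending is amortised O(1): the backing array doubles whenever
    it runs out of room, so each element is copied O(1) times on average."""
    def __init__(self):
        self.data = np.empty(1, dtype=float)
        self.size = 0

    def append(self, value):
        if self.size == len(self.data):
            bigger = np.empty(2 * len(self.data), dtype=float)  # double capacity
            bigger[:self.size] = self.data                      # one bulk copy
            self.data = bigger
        self.data[self.size] = value
        self.size += 1
```

The trade-off described above falls out directly: the final capacity can be up to twice the number of live elements, and repeated doublings can leave the arrays far larger than the data they hold.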

Apart from the graph, merging increases memory usage. This is because the arrays double their size, which can happen any number of times. For this particular case, it occurred 9 times.

```
Line #    Mem usage    Increment   Line Contents
================================================
    13   20.031 MiB    0.000 MiB   @profile
    14                             def test():
    15  498.074 MiB  478.043 MiB       arr = np.load("../data/watershed.npy")
    16  498.074 MiB    0.000 MiB       t = time.time()
    17  500.281 MiB    2.207 MiB       g = graph.construct_rag(arr)
    18
    19  500.293 MiB    0.012 MiB       print "RAG construction took %f secs " % (time.time() - t)
    20
    21                                 #print g.max_size
    22  500.293 MiB    0.000 MiB       t = time.time()
    23  825.211 MiB  324.918 MiB       g.random_merge(10)
    24                                 #print g.max_size
    25  825.211 MiB    0.000 MiB       print "Merging took %f secs " % (time.time() - t)
```

For test purposes, this was done using SciPy’s dok_matrix.

We have to profile the Cython code again. As expected, doubling the arrays and copying elements take the most time.

```
Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    20378    9.164    0.000   10.663    0.001 rag_csr.pyx:17(merge)
    20378    0.515    0.000   11.658    0.001 graph_csr.py:69(merge)
        8    0.451    0.056    0.452    0.057 graph_csr.py:51(double)
   288383    0.387    0.000    0.415    0.000 random.py:173(randrange)
    20378    0.353    0.000    0.353    0.000 {method 'sort' of 'numpy.ndarray' objects}
    20378    0.319    0.000    0.823    0.000 arraysetops.py:93(unique)
        1    0.318    0.318   12.573   12.573 graph_csr.py:79(random_merge)
    40756    0.265    0.000    0.270    0.000 {numpy.core.multiarray.concatenate}
   288383    0.141    0.000    0.556    0.000 random.py:236(randint)
   142646    0.134    0.000    0.134    0.000 stringsource:317(__cinit__)
    20378    0.069    0.000    0.069    0.000 {numpy.core.multiarray.copyto}
   101890    0.067    0.000    0.193    0.000 stringsource:613(memoryview_cwrapper)
    20378    0.055    0.000    0.055    0.000 {numpy.core.multiarray.empty_like}
    20378    0.051    0.000    0.175    0.000 numeric.py:78(zeros_like)
    20378    0.049    0.000    1.028    0.000 arraysetops.py:379(union1d)
    40756    0.041    0.000    0.054    0.000 stringsource:957(memoryview_fromslice)
    20378    0.037    0.000    0.037    0.000 {method 'flatten' of 'numpy.ndarray' objects}
    20378    0.035    0.000    0.041    0.000 random.py:271(choice)
        1    0.030    0.030   12.603   12.603 <string>:1(<module>)
   308761    0.030    0.000    0.030    0.000 {method 'random' of '_random.Random' objects}
   142646    0.029    0.000    0.029    0.000 stringsource:339(__dealloc__)
    20378    0.028    0.000   10.691    0.001 {rag_csr.merge}
   101890    0.014    0.000    0.014    0.000 stringsource:619(memoryview_check)
    40756    0.005    0.000    0.005    0.000 stringsource:933(__dealloc__)
    40756    0.005    0.000    0.005    0.000 stringsource:468(__getbuffer__)
    40756    0.004    0.000    0.004    0.000 stringsource:508(__get__)
    20378    0.004    0.000    0.004    0.000 {len}
       16    0.001    0.000    0.001    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

The time taken by each approach for all 3 volumes used.

This plot shows the maximum memory usage for each approach. Notice that for the CSR approach in case 1, the memory usage during merging is significantly higher. This is because the doubling of the arrays can happen any number of times.


The graph data structure should be able to hold close to 10^5 nodes, which is not uncommon for 3D images. This rules out the adjacency matrix representation, which would require 10^10 slots. A plausible solution is to use SciPy’s sparse matrices. The degree of each node is the number of nodes adjacent to it; for 2D images it will be close to 4, whereas for 3D it will be close to 8. So we can safely assume that the number of edges is comparable to the number of nodes. I am testing 4 different approaches to storing a graph, 2 of which are inspired by SciPy’s sparse matrix classes. Another thing I had to keep in mind was to allow quick contraction of edges, which is useful for approaches like hierarchical clustering. In the testing code, I construct a RAG for the image and randomly collapse edges until a fixed number of nodes are left, while monitoring memory usage and speed.

The source code and benchmarking results can be found here. The file being used (watershed.npy) is a 500 x 500 x 500 volume labeled using 3D watershed. The 4 approaches I am testing are:

This is similar to SciPy’s LIL matrix. In a graph of **N** nodes, each node **i** is assigned two lists: one holds all the nodes adjacent to **i**, and the other holds the corresponding weights. Instead of using Python lists, we chose to implement them as NumPy arrays, and used Cython to speed up some of the graph operations.

This approach uses the least storage, but it is also the slowest among what I have tested so far, because construction involves moving around a lot of memory. (See graph_lil.py and rag_lil.pyx.)

In this case, for a graph of **N** nodes, each node **i** maintains a dictionary mapping its neighbors to the corresponding weights. This has higher memory usage than the LIL or CSR approaches, because dictionaries store keys as well as values, and because of the small load factor of Python’s dictionaries. However, graph construction is fast, since it does not involve moving around a lot of memory. This approach is the fastest overall. (See graph_custom.py and rag_custom.pyx.)

This class inherits from networkx.Graph. The only extra code I had to write in this case was to contract an edge. Although the memory usage is the highest, the time taken for randomly merging nodes is about 20 times faster. (See graph_nx.py and rag_nx.py.)

This is currently work in progress. It will hold its data in the same manner as SciPy’s csr_matrix. However, to handle edge contraction, a new node will be created and its information appended towards the end. To accommodate this, all the internal arrays will be resized dynamically, doubling in size when required.


Certain image segmentation algorithms have a tendency to over-segment an image: they divide a region, as perceived by humans, into two or more regions. This is because they tend to favor small regions of similar color. But in the real world, one object might have different shades of the same color, or different colors altogether. Here is an example using SLIC. In broad terms, SLIC is k-means done in (x, y, color) space.
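To illustrate the "k-means on position plus color" intuition, here is a deliberately simplified sketch. Real SLIC adds a compactness trade-off and restricts the search to a local window around each center, so this is only a stand-in, not the actual algorithm:

```python
import numpy as np

def slic_like_labels(image, k, iters=5):
    """Toy k-means on (x, y, intensity) features for a 2D grayscale
    image. Like SLIC, centers start on an evenly spaced grid."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([xs.ravel(), ys.ravel(),
                             image.ravel()]).astype(float)
    # evenly spaced initial centers, echoing SLIC's grid seeding
    idx = np.linspace(0, h * w - 1, k).astype(int)
    centers = feats[idx].copy()
    for _ in range(iters):
        # assign every pixel to its nearest center in feature space
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # move each center to the mean of its assigned pixels
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(0)
    return labels.reshape(h, w)
```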

We consider each of these regions as a vertex in a graph, with each region connected to all the regions that touch it. Similar regions are joined by an edge of low weight, and dissimilar regions by an edge of high weight. One measure of dissimilarity might be the difference in mean color. See the example below.

If we remove the edges with higher weights in an appropriate manner, the regions remaining connected should belong to the same object. Thus, in this case, the face, the hat, and the hair might finally form one connected subgraph of regions. Over the next two weeks I will try to take an over-segmented image and build its RAG. As a proof of concept of the underlying data structures and algorithms, I will apply a threshold and remove the edges with weights higher than it. Later on I will move on to more complicated selection procedures, including N-cut and, if my MST experiments yield good results, an MST-based procedure.
