A closert look at Normalized Cut

Variation with number of regions

In this post I explained how the Normalized Cut works and demonstrated some examples of it. This post aims to take a closer look at the code. I ran the following code to monitor the time taken by NCut with respect to initial number of regions.

from __future__ import print_function
from skimage import graph, data, io, segmentation, color
import time
from matplotlib import pyplot as plt

image = data.coffee()
segment_list = range(50, 801, 50)

for nseg in segment_list:
    labels = segmentation.slic(image, compactness=30, n_segments=nseg)
    rag = graph.rag_mean_color(image, labels, mode='similarity')
    T = time.time()
    new_labels = graph.ncut(labels, rag)
    time_taken = time.time() - T
    out = color.label2rgb(new_labels, image, kind='avg')
    io.imsave('/home/vighnesh/Desktop/ncut/' + str(nseg) + '.png', out)
    print(nseg, time_taken)


Here is the output sequentially.

By a little guess-work, I figured that the curve approximately varies as x^2.2. For 800 nodes, the time taken is around 35 seconds.

Line Profile

I used line profiler to examine the time taken by each line of code in threshold_normalized. Here are the results.

   218                                           @profile
   219                                           def _ncut_relabel(rag, thresh, num_cuts):
   220                                               """Perform Normalized Graph cut on the Region Adjacency Graph.
   222                                               Recursively partition the graph into 2, until further subdivision
   223                                               yields a cut greather than `thresh` or such a cut cannot be computed.
   224                                               For such a subgraph, indices to labels of all its nodes map to a single
   225                                               unique value.
   227                                               Parameters
   228                                               ----------
   229                                               labels : ndarray
   230                                                   The array of labels.
   231                                               rag : RAG
   232                                                   The region adjacency graph.
   233                                               thresh : float
   234                                                   The threshold. A subgraph won't be further subdivided if the
   235                                                   value of the N-cut exceeds `thresh`.
   236                                               num_cuts : int
   237                                                   The number or N-cuts to perform before determining the optimal one.
   238                                               map_array : array
   239                                                   The array which maps old labels to new ones. This is modified inside
   240                                                   the function.
   241                                               """
   242        59       218937   3710.8      3.2      d, w = _ncut.DW_matrices(rag)
   243        59          151      2.6      0.0      m = w.shape[0]
   245        59           61      1.0      0.0      if m > 2:
   246        44         3905     88.8      0.1          d2 = d.copy()
   247                                                   # Since d is diagonal, we can directly operate on its data
   248                                                   # the inverse of the square root
   249        44          471     10.7      0.0          d2.data = np.reciprocal(np.sqrt(d2.data, out=d2.data), out=d2.data)
   251                                                   # Refer Shi & Malik 2001, Equation 7, Page 891
   252        44        26997    613.6      0.4          vals, vectors = linalg.eigsh(d2 * (d - w) * d2, which='SM',
   253        44      6577542 149489.6     94.9                                       k=min(100, m - 2))
   255                                                   # Pick second smallest eigenvector.
   256                                                   # Refer Shi & Malik 2001, Section 3.2.3, Page 893
   257        44          618     14.0      0.0          vals, vectors = np.real(vals), np.real(vectors)
   258        44          833     18.9      0.0          index2 = _ncut_cy.argmin2(vals)
   259        44         2408     54.7      0.0          ev = _ncut.normalize(vectors[:, index2])
   261        44        22737    516.8      0.3          cut_mask, mcut = get_min_ncut(ev, d, w, num_cuts)
   262        44           78      1.8      0.0          if (mcut < thresh):
   263                                                       # Sub divide and perform N-cut again
   264                                                       # Refer Shi & Malik 2001, Section 3.2.5, Page 893
   265        29        78228   2697.5      1.1              sub1, sub2 = partition_by_cut(cut_mask, rag)
   267        29          175      6.0      0.0              _ncut_relabel(sub1, thresh, num_cuts)
   268        29           92      3.2      0.0              _ncut_relabel(sub2, thresh, num_cuts)
   269        29           32      1.1      0.0              return
   271                                               # The N-cut wasn't small enough, or could not be computed.
   272                                               # The remaining graph is a region.
   273                                               # Assign `ncut label` by picking any label from the existing nodes, since
   274                                               # `labels` are unique, `new_label` is also unique.
   275        30          685     22.8      0.0      _label_all(rag, 'ncut label')

As you can see above 95% of the time is taken by the call to eigsh.

To take a closer look at it, I plotted time while ensuring only one iteration. This commit here takes care of it. Also, I changed the eigsh call to look for the largest eigenvectors instead of the smallest ones, with this commit here. Here are the results.


A single eignenvalue computation is bounded by O(n^1.5) as mentioned in the original paper. The recursive NCuts are pushing the time required towards more than O(n^2).eigsh solves the eigenvalue problem for a symmetric hermitian matrix. It in turn relies on a library called ARPack. As documented here ARPack isn’t very good at finding the smallest eigenvectors. If the value supplied as the argument k is too small, we get the ArpackNoConvergence Exception. As seen from the above plot, finding the largest eigenvectors is much more efficient using the eigsh function.

Since the problem is specific to ARPack, using other libraries might lead to faster computation. slepc4py is one such BSD licensed library. The possibility of optionally importing slec4py should be certainly explored in the near future.

Also, we can optionally ask the user for a function to solve the eigenvalue problem, so that he can use a matrix library of his choice if he/she so desires.

Final Thoughts

Although the current Normalized Cut implementation takes more than quadratic time, the preceding over segmentation method does most of the heavy lifting. With something like SLIC, we can be sure of the number of nodes irrespective of the input image size. Although, a better eigenvalue finding technique for smallest eigenvectors would immensely improve its performance.

GSoC – 2 weeks in

So for the past two weeks I have been experimenting with different graph data structures. Me and my mentors are trying to zero down on a structure which is efficient in terms of time as well as space. Although this phase is taking a little longer than I anticipated, the correct design choices here will save us a lot of memory and CPU time in the future. Below I will  list out some of our requirements and a look at the plausible solutions



The graph data structure should be able to hold close to 10^5 nodes which is not uncommon for 3D images. This rules out the adjacency matrix representation which would require 10^10 slots. A plausible solution is to use Scipy’s sparse matrices. The degree of each node is the number of nodes adjacent to it, for 2D images it will be close to 4, where as for 3D it will be close to 8. So we can safely assume that the number of edges is comparable to the number of nodes. I am testing 4 different approaches to store a graph, 2 of which are inspired from scipy’s sparse matrix classes. Another thing that had to keep in mind was to allow quick contraction of edges. This is useful for approaches like Hierarchical Clustering.In the testing code I am constructing a RAG for the image and randomly collapsing edges till a fixed number of nodes are left, while monitoring memory usage and speed.

The source code and benchmarking results can be found here.  The file being used ( watershed.npy ) is a labeled 3d image using 3d watershed for a 500 x 500 x 500 volume . The 4 approaches I am testing are


LIL (List of Lists)

This is similar to Scipy’s LIL matrix . In a graph of N nodes each node i is assigned two lists. One list holds all the nodes adjacent to i and the other holds the corresponding weights. Instead of using Python list we chose to implement it using Numpy arrays, and used Cython to speed up some of the graph operations.

This approach uses the least storage, but it is also the slowest among what I have tested so far, because construction involves moving around a lot of memory.(See graph_lil.py and rag_lil.pyx)


Custom Graph Class

In this case for a graph of N, for each node i , a dictionary is maintained which maps its neighbors to the corresponding weights. This has higher memory usage that LIL for CSR approach , because dictionaries store keys as well as values and due to the small load factor od Python dictionaries. However, graph construction is fast, since it does not involve moving around a lot of memory.The overall time taken is fastest in this case (See graph_custom.py and rag_custom.pyx )


Networkx Graph Class

This class inherits from networkx.Graph. The only extra code that I had to write in this case was to contract an edge. Although the memory usage is the highest, the time taken for randomly merging nodes is about 20 times faster. ( See graph_nx.py  and rag_nx.py )


CSR Graph Class

This is currently work in progress. It will hold it’s data in the same manner as scipy’s csr_matrix. However to handle edge contraction a new node will be created, and it’s information will be appended towards the end. To accommodate for this all the internal arrays will be dynamically resized, doubling their size when required.