Virenk's Blog

From hours to less than a minute (thats what a good algorithm can do)

Posted on: September 17, 2010

Recently i came across a scenario where a pice of code was running extremely slow (taking approximately 7-8 hours), but with a minor change in algorithm approach it ran in less than a minute!.

Consider a scenario where you have to break a graph into smaller partitions.
s.t each partition must have size <= max_partition_size and cost <= max_partiton_cost.
size = number of nodes in the graph
cost = sum of cost of each node in graph. (assuming each node has a cost associated with it)

Initially let graph (G) has N nodes.

First approach :
1. Cut out a partition P1 from graph (G) s.t P1.size <= max_partition_size, P1.cost <= max_partition_cost.
2. set G to G – P1.  (P1 is cut out and now G has remaining nodes).
3. Repeat above procedure on G.

The complexity of above appraoch is O(n^2). Since max_partiton_size and max_partition_cost are constants, with each iteration, we are processing data of size N, N-K, N-2K, … respectively (K is constant), for which complexity order is O(n^2).

This approach works well for small graph sizes, but for larger ones algorithm will run extremely slow. For example for a graph having 70K nodes, the algorithm approach would take 70K*70K = 4900M steps! Thats huge.

Better appraoch:
For G having large no. of nodes we change the strategy a bit. Instead of diretly forming partitions of required size, we form macro partitions first, and then further split up the macro partitions into partitions of required size recursively.

If G has more than 2000 nodes, we vary
macro_max_partition_cost = 20 * max_partition_cost
macro_max_partiton_size = 20 * max_partition_size

So the approach is :

1. Cut out a partition P1 from graph (G) s.t
if(G.size < 2000) {
P1 is cut out s.t P1.size < max_partition_size and P1.cost < max_partiton_cost
} else {
P1 is cut out s.t P1.size < macro_max_partiton_size and P1.cost < macro_max_partition_cost
2. if (P1.size <= max_partition_size && P1.cost <= max_partition_cost) {
//do nothing , as u already have required partition
} else {
Set G = P1 and repeat above procedure.
3. Set G = G – P1 (remaining portion of G). Repeat above procedure.

By forming larger cuts, we have reduced the total amount of calculations on data.

Using above appraoch i was able to get a piece of code run in less than a minute which was earlier taking 6-7 hours.

Ohter appraoches :
Binary partitioning , i.e split the graph into two pieces, and keep splitting each piece further until requited partition sizes are obtained.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

  • None


%d bloggers like this: