##
Archive for the ‘**Algorithms**’ Category

Recently i came across a scenario where a pice of code was running extremely slow (taking approximately 7-8 hours), but with a minor change in algorithm approach it ran in less than a minute!.

Consider a scenario where you have to break a graph into smaller partitions.

s.t each partition must have size <= max_partition_size and cost <= max_partiton_cost.

where:

size = number of nodes in the graph

cost = sum of cost of each node in graph. (assuming each node has a cost associated with it)

Initially let graph (G) has N nodes.

**First approach :**

1. Cut out a partition P1 from graph (G) s.t P1.size <= max_partition_size, P1.cost <= max_partition_cost.

2. set G to G – P1. (P1 is cut out and now G has remaining nodes).

3. Repeat above procedure on G.

The complexity of above appraoch is O(n^2). Since max_partiton_size and max_partition_cost are constants, with each iteration, we are processing data of size N, N-K, N-2K, … respectively (K is constant), for which complexity order is O(n^2).

This approach works well for small graph sizes, but for larger ones algorithm will run extremely slow. For example for a graph having 70K nodes, the algorithm approach would take 70K*70K = 4900M steps! Thats huge.

**Better appraoch**:

For G having large no. of nodes we change the strategy a bit. Instead of diretly forming partitions of required size, we form macro partitions first, and then further split up the macro partitions into partitions of required size recursively.

If G has more than 2000 nodes, we vary

macro_max_partition_cost = 20 * max_partition_cost

macro_max_partiton_size = 20 * max_partition_size

So the approach is :

1. Cut out a partition P1 from graph (G) s.t

if(G.size < 2000) {

P1 is cut out s.t P1.size < max_partition_size and P1.cost < max_partiton_cost

} else {

P1 is cut out s.t P1.size < macro_max_partiton_size and P1.cost < macro_max_partition_cost

}

2. if (P1.size <= max_partition_size && P1.cost <= max_partition_cost) {

//do nothing , as u already have required partition

} else {

Set G = P1 and repeat above procedure.

}

3. Set G = G – P1 (remaining portion of G). Repeat above procedure.

By forming larger cuts, we have reduced the total amount of calculations on data.

Using above appraoch i was able to get a piece of code run in less than a minute which was earlier taking 6-7 hours.

**Ohter appraoches **:

Binary partitioning , i.e split the graph into two pieces, and keep splitting each piece further until requited partition sizes are obtained.