Lately I got intrigued by the beautiful hidden depths (maybe some people prefer “atrocious inner guts”) of Perl and I decided to take up the Perl Weekly Challenge for fun (I “joined the club” with challenge 082 last week).

This week, the second task is equivalent to the following optimization problem:

Problem Definition

*Given a set of natural numbers , find a partition (i.e., disjoint subsets such that ) such that the absolute value of the difference between the sums of numbers in each subset, denoted , is minimal across all possible partitions.*

Complexity

[*Note*: complexity in this document is always w.r.t. the number of elements of the input set .]

Unfortunately, this optimization problem (let’s denote it as ) is NP-hard, which means (loosely speaking) that it is at least as hard as every problem in NP or, more formally, that every problem in NP can be reduced in polynomial time to it.

To prove this statement, it suffices to show there is an NP-complete problem that can be reduced in polynomial time to , as an NP-complete problem can be considered (loosely speaking) as a representative member of the set NP (in the sense that every other problem in NP can be transformed in polynomial time to it).

An example NP-complete problem of this kind is its “*easier*” brother the “two-way partitioning problem”, which tries to decide if there is a partition such that (and if there is one, return it!). It can indeed be shown that can be reduced to in polynomial time, as yields the solution to which we need only polynomial time processing (linear even) to lead to a solution of (i.e. if , then is a solution of and if not, no solution of exists; note that summing is linear).

Now, on to the question why is NP-complete. It is clear that it lies in NP because a solution, if it exists, can be verified to be correct in polynomial time (linear actually, all you need is summing) on a deterministic Turing machine. Moreover, it can be reduced in polynomial time to the so-called subset sum problem, as argued on https://en.wikipedia.org/wiki/Subset_sum_problem, by noting that it is equivalent to the subset sum problem over , with . This subset sum problem is known to be NP-complete.

This concludes the discussion that is NP-complete and, hence, is NP-hard.

Implementation

The above discussion unfortunately implies that no algorithm in polynomial time (on a *deterministic* Turing machine of course) to solve is currently known. If we would be able to find such an algorithm, we would have resolved a long standing question by proving that P = NP.

My naive implementation has exponential time complexity and can be found here.

Note that heuristic algorithms exist, see e.g. https://www.ijcai.org/Proceedings/09/Papers/096.pdf, to reduce the average running time considerably. However, all of them have worst case exponential time complexity.