I studied applied mathematics, in which I did both a bachelor's and a master's degree.
What are the problem classes P and NP?
P and NP are classes of decision problems. A decision problem is a problem for which the answer is either yes or no. To solve these problems an algorithm can be used.
If this algorithm runs in polynomial time of the input size the problem belongs to the set P. Polynomial time means that the number of steps the algorithm has to perform to solve the problem is in the order of a polynomial in the input size. An example is multiplication of two numbers consisting of n digits. To do this we have to multiply every digit of the first number with every digit of the second number. Therefore we need to perform n^2 steps, which is a polynomial.
If a proof of a yes instance can be verified in polynomial time of the input size a problem is in NP. Clearly, any problem that is in P is also in NP, because we can verify the yes instance just by solving it. However, it might be that a problem does not have a polynomial time algorithm but it can be verified in polynomial time. An example is the traveling salesman problem. Is it possible for a postmen to visit all cities while traveling at most k distance. If we have a route the postmen will take, we can easily check in polynomial time if the distance is smaller than k. However, solving the problem cannot be done in polynomial time as far as we know.
P = NP?
Most mathematicians believe that P is not equal to NP. However, nobody has been able to prove or disprove it so far. If you can either prove or disprove it you can claim one million dollars because it is listed as one of the seven millennium prize problems.
When is a Problem NP-Hard or NP-Complete
A problem is NP-hard if it is at least as difficult as any problem in NP. Since there are also easy problems in NP, namely those that are in P, not every problem in NP is NP-hard. When a problem is both in NP and NP-hard it is considered to be NP-Complete.
With "at least as hard as any problem in NP" it is meant that for any problem in NP there exists a polynomial time reduction algorithm that reduces every instance of the NP-hard problem to a specific instance of the problem in NP.
This sounds a bit vague but let's try to get it more clear what the consequences of this are. Assume we have problem A and B. Now a reduction from A to B is a polynomial time algorithm that picks an arbitrary instance of problem A and turns it into a specific instance of problem B. Now, if A is NP-hard, it means that B must also be NP-hard. Here is why:
If B was not NP-hard, we could solve an instance of A just by transforming it into B and then solve B in polynomial time. Since the transformation, called reduction, was in polynomial time, we could solve A in polynomial time.
If A is not NP-hard, such a reduction will still exist, but we cannot get any information out of it, so it is useless.
To perform these kind of reduction we first need a list of known NP-hard problems. In 1971 Stephen Cook and Leonid Levin proved that 3-SAT is NP-complete. The 3-SAT is as follows:
Given a logical boolean expression with clauses of at most three literals is it possible to get the answer true by setting some variables to true.
From 3-SAT, a lot of problems have been reduced. By now there are a lot of known NP-Complete problems. Therefore, if we have a new problem of which we want to show that it is NP-Hard, we have to pick one of these known NP-hard problems and determine a polynomial time reduction that reduces the known NP-hard problem to our problem. If we then can also show that our problem is in NP we have shown our problem is NP-complete.
Although it might seem logical that any NP-Hard problem is in NP, this is not the case. There are examples of problems that are NP-hard, but not in NP.
Example of a Reduction
Let's say our problem is the Clique problem. The question is: Given a Graph G, with vertex set V and edge set E is there a clique of size k?
A clique is a subset of the vertices that are connected to all other vertices in the clique.
We reduce from the Independent set problem. Here the problem is: Given a Graph G, with vertex set V and edge set E is there an independent of size k?
An independent set is a set of vertices that are all not connected to each other. Let's say we know independent set is an NP-Hard problem (which it is). Now we reduce an instance of independent set to clique the following way:
We are given a graph G for the independent set problem and turn it into G' on whch we will solve the clique problem. The vertex set of G' will be the same as the vertex set of G. The edge set will be different. In G' there is an edge between two vertices if and only if there is no edge between these vertices in G.
Now if G' has a clique of size k, G must have an independent set of size k, namely the same set of vertices.
The reduction runs in polynomial time of the input size. Our input consists of vertices and edges. The reduction removes all edges once and adds at most |V|*|E|/2 edges. This is polynomial in the input size.
Therefore, if we would be able to solve clique in polynomial time, we would now also be able to solve independent set in polynomial time. We know it is impossible to solve independent set in polynomial time so clique must also be NP-Hard, or P = NP. Since we assume P is not equal to NP we must have the first.