This chapter provides routines to solve certain integer programming, transportation and shortest path problems. Additionally, ‘best subset’ routines are included.
2 Background to the Problems
General linear programming (LP) problems (see Dantzig (1963)) are of the form:
find $x={({x}_{1},{x}_{2},\dots ,{x}_{n})}^{\mathrm{T}}$ to maximize $F\left(x\right)={\displaystyle \sum _{j=1}^{n}}{c}_{j}{x}_{j}$
subject to linear constraints which may have the forms:
This chapter deals with integer programming (IP) problems in which some or all the elements of the solution vector $x$ are further constrained to be integers. For general LP problems where the elements of $x$ may take any real values (i.e., are not restricted to integers), refer to Chapter E04.
IP problems may or may not have a solution, which may or may not be unique.
The hatched area in Figure 1 is the feasible region, the region where all the constraints are satisfied, and the points within it which have integer coordinates are circled. The lines of hatching are in fact contours of decreasing values of the objective function $3{x}_{1}+2{x}_{2}$, and it is clear from Figure 1 that the optimum IP solution is at the point $(1,1)$. For this problem the solution is unique.
However, there are other possible situations.
(a) There may be more than one solution; e.g., if the objective function in the above problem were changed to ${x}_{1}+{x}_{2}$, both $(1,1)$ and $(2,0)$ would be IP solutions.
(b) The feasible region may contain no points with integer coordinates, e.g., if an additional constraint
$$3{x}_{1}\le 2$$
were added to the above problem.
(c) There may be no feasible region, e.g., if an additional constraint
$${x}_{1}+{x}_{2}\le 1$$
were added to the above problem.
(d) The objective function may have no finite minimum within the feasible region; this means that the feasible region is unbounded in the direction of decreasing values of the objective function, e.g., if the constraints
Algorithms for IP problems are usually based on algorithms for general LP problems, together with some procedure for constructing additional constraints which exclude noninteger solutions (see Beale (1977)).
The Branch and Bound (B&B) method is a well-known and widely used technique for solving IP problems (see Beale (1977) or Mitra (1973)). It starts from the optimum solution to the original LP problem, ignoring the integer constraints, and subdivides the problem into two mutually exclusive sub-problems by branching on an integer variable that currently has a fractional optimal value. Each sub-problem can now be solved as an LP problem, using the objective function of the original problem. The process of branching continues until a solution for one of the sub-problems is feasible with respect to the integer problem. In order to prove the optimality of this solution, the rest of the sub-problems in the B&B tree must also be solved. Naturally, if a better integer feasible solution is found for any sub-problem, it should replace the one at hand.
A common method for specifying IP and LP problems in general is the use of the MPSX file format (see IBM (1971)). A full description of this file format is provided in the routine document for h02buf.
Computational efficiency is improved by discarding inferior sub-problems. These are sub-problems in the B&B search tree whose LP solutions are no better than the best integer solution at hand (that is, lower in the case of maximization).
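The branching and discarding steps described above can be sketched on a small 0-1 knapsack problem (a maximization problem), whose LP relaxation can be solved greedily so that no LP solver is needed. This is an illustrative sketch only and is not the algorithm used by the routines in this chapter; the function names `relaxation` and `branch_and_bound` and the data in the usage note are assumptions made for the example.

```python
from fractions import Fraction


def relaxation(values, weights, capacity, fixed):
    """Solve the LP relaxation of a 0-1 knapsack with some variables fixed.

    Returns (bound, x, frac_index), or None if the fixed variables already
    exceed the capacity. frac_index is the variable with a fractional value,
    or None if the relaxed solution is integer. Weights must be positive.
    """
    x = [Fraction(0)] * len(values)
    room = Fraction(capacity)
    for j, v in fixed.items():
        x[j] = Fraction(v)
        room -= v * weights[j]
    if room < 0:
        return None
    free = [j for j in range(len(values)) if j not in fixed]
    # Filling by decreasing value/weight ratio is optimal for the relaxation.
    free.sort(key=lambda j: Fraction(values[j], weights[j]), reverse=True)
    frac_index = None
    for j in free:
        if weights[j] <= room:
            x[j] = Fraction(1)
            room -= weights[j]
        else:
            if room > 0:
                x[j] = room / weights[j]
                frac_index = j
            break
    bound = sum(values[j] * x[j] for j in range(len(values)))
    return bound, x, frac_index


def branch_and_bound(values, weights, capacity):
    """Maximize total value subject to the capacity, with each x_j in {0, 1}."""
    best_value, best_x = None, None
    stack = [{}]  # each node records the variables fixed to 0 or 1 so far
    while stack:
        fixed = stack.pop()
        result = relaxation(values, weights, capacity, fixed)
        if result is None:
            continue  # infeasible sub-problem
        bound, x, frac = result
        if best_value is not None and bound <= best_value:
            continue  # inferior sub-problem: cannot beat the incumbent
        if frac is None:
            # Integer feasible and better than the incumbent: replace it.
            best_value, best_x = bound, [int(v) for v in x]
            continue
        # Branch on the fractional variable: two mutually exclusive sub-problems.
        stack.append({**fixed, frac: 0})
        stack.append({**fixed, frac: 1})
    return int(best_value), best_x
```

For example, `branch_and_bound([60, 100, 120], [10, 20, 30], 50)` explores five sub-problems, two of which are discarded because their LP bounds (180 and 160) do not exceed the incumbent value of 220.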
The B&B method may also be applied to convex Quadratic Programming (QP) problems and Nonlinear Programming (NLP) problems using sequential convex QP approximations.
Routines have been introduced into this chapter to formally apply the technique to dense general QP problems and to sparse LP, QP or NLP problems. Section 2.6 in the E04 Chapter Introduction describes the virtues of having a well-scaled problem. The requirement that a variable be integer makes this more difficult, and some practical common sense might be required to make the problem tractable. If a variable is expected to have a large value at the minimum, say $100000$, then in practical terms it might be better to drop the integer constraint and simply round the final answer. To do otherwise forces a high level of computational accuracy on the underlying optimiser that might be impossible to achieve.
A special type of linear programming problem is the transportation problem in which there are $p\times q$ variables ${y}_{kl}$ which represent quantities of goods to be transported from each of $p$ sources to each of $q$ destinations.
and if all the ${A}_{k}$ and ${B}_{l}$ are integers, then so are the optimal ${y}_{kl}$.
The shortest path problem is that of finding a path of minimum length between two distinct vertices ${n}_{s}$ and ${n}_{e}$ through a network. Suppose the vertices in the network are labelled by the integers $1,2,\dots ,n$. Let $(i,j)$ denote an ordered pair of vertices in the network (where $i$ is the origin vertex and $j$ the destination vertex of the arc), ${x}_{ij}$ the amount of flow in arc $(i,j)$ and ${d}_{ij}$ the length of the arc $(i,j)$. The LP formulation of the problem is thus given as
$${a}_{ij}=\left\{\begin{array}{ll}+1& \text{if arc }j\text{ is directed away from vertex }i\text{,}\\ -1& \text{if arc }j\text{ is directed towards vertex }i\text{,}\\ 0& \text{otherwise}\end{array}\right.$$
The above formulation only yields a meaningful solution if ${x}_{ij}=0$ or $1$; that is, $\mathrm{arc}(i,j)$ forms part of the shortest route only if ${x}_{ij}=1$. In fact since the optimal LP solution will (in theory) always yield ${x}_{ij}=0$ or $1$, (1) can also be solved as an IP problem. Note that the problem may also be solved directly (and more efficiently) using a variant of Dijkstra's algorithm (see Ahuja et al. (1993)).
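The direct approach mentioned above can be sketched with a textbook form of Dijkstra's algorithm using a binary heap. This is a minimal sketch assuming non-negative arc lengths ${d}_{ij}$ and vertices labelled $1,2,\dots ,n$; the function name `shortest_path` and the argument layout are assumptions for illustration and do not reflect the interface of any Library routine.

```python
import heapq


def shortest_path(n, arcs, n_s, n_e):
    """Return (length, path) from n_s to n_e, or (None, []) if unreachable.

    arcs is an iterable of (i, j, d_ij) triples describing directed arcs
    with d_ij >= 0; vertices are labelled 1..n.
    """
    adjacent = {v: [] for v in range(1, n + 1)}
    for i, j, d in arcs:
        adjacent[i].append((j, d))

    dist = {n_s: 0}          # best known distance from n_s
    pred = {}                # predecessor on the best known path
    done = set()             # vertices whose distance is final
    heap = [(0, n_s)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in done:
            continue         # stale heap entry
        done.add(u)
        if u == n_e:
            break
        for v, d in adjacent[u]:
            new_dist = d_u + d
            if v not in dist or new_dist < dist[v]:
                dist[v] = new_dist
                pred[v] = u
                heapq.heappush(heap, (new_dist, v))

    if n_e not in done:
        return None, []
    path = [n_e]
    while path[-1] != n_s:
        path.append(pred[path[-1]])
    path.reverse()
    return dist[n_e], path
```

The arcs on the returned path are exactly those for which ${x}_{ij}=1$ in the LP formulation; all other arcs have ${x}_{ij}=0$.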
The travelling salesman problem is that of finding a minimum distance route round a given set of cities. In the classical travelling salesman problem the salesperson must visit each city only once before returning to his or her city of origin. It can be formulated as an IP problem in a number of ways; one such formulation is described in Williams (1993). Such IP problems could be solved directly by a mixed integer nonlinear programming solver, but there are currently no routines in the Library that do so. An acceptable solution to symmetric distance problems may nevertheless be sought using the probabilistic optimization method known as simulated annealing, for which a routine is available. Asymmetric problems can be tackled by the introduction of shadow cities with zero distance between an original city and its shadow. Incomplete problems, where bidirectional travel between each pair of cities is not possible, can be tackled by attributing very large distances to unavailable journeys. The requirement to visit each city only once may also be relaxed: for example, a salesperson might not mind backtracking through a previously visited city if this produced the shortest route. This relaxed problem is known as the practical travelling salesman problem.
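To illustrate how simulated annealing can be applied to a symmetric classical travelling salesman problem, the following minimal sketch repeatedly reverses a random segment of the tour and accepts worse tours with a probability that falls as the temperature is lowered. It is not the algorithm of the Library routine; the function names, the geometric cooling schedule and the default parameter values are all assumptions chosen for the example.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour, returning to the starting city."""
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


def anneal_tsp(dist, steps=20000, t_start=10.0, t_end=1e-3, seed=0):
    """Seek a short tour for a symmetric distance matrix (n >= 2, steps >= 2).

    Returns the best tour found and its length; optimality is not guaranteed.
    """
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    length = tour_length(tour, dist)
    best_tour, best_length = tour[:], length
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        i, j = sorted(rng.sample(range(n), 2))
        # Candidate move: reverse the segment between positions i and j.
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        cand_length = tour_length(candidate, dist)
        delta = cand_length - length
        # Always accept improvements; accept a worse tour with probability
        # exp(-delta / temp), which shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            tour, length = candidate, cand_length
            if length < best_length:
                best_tour, best_length = tour[:], length
    return best_tour, best_length
```

An incomplete problem could be passed to such a sketch by placing a very large value in `dist` for each unavailable journey, as described above.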
The best $\mathit{n}$ subsets problem assumes a scoring mechanism and a set of $m$ features. The problem is one of choosing the best $n$ subsets of size $p$. It is addressed by two routines in this chapter. The first of these uses reverse communication; the second direct communication (see Section 7 in How to Use the NAG Library for a description of the difference between these two conventions).
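To make the problem statement concrete, the sketch below finds the best $n$ subsets of size $p$ by exhaustive enumeration, which is only practical for small $m$; it is deliberately not the branch and bound method used by the routines. It assumes that a lower score is better and that features are labelled $0,\dots ,m-1$; the function name `best_subsets` and the `score` callback are hypothetical.

```python
import heapq
from itertools import combinations


def best_subsets(m, p, n, score):
    """Return the n lowest-scoring subsets of size p from features 0..m-1.

    score is a user-supplied function mapping a tuple of feature indices to
    a number; lower is assumed to be better. The result is a list of
    (score, subset) pairs in increasing order of score.
    """
    scored = ((score(subset), subset) for subset in combinations(range(m), p))
    return heapq.nsmallest(n, scored)
```

For instance, with per-feature costs `[5, 1, 4, 2]` and a score equal to the sum of the costs in the subset, `best_subsets(4, 2, 2, score)` returns the subsets `(1, 3)` and `(1, 2)` with scores 3 and 5.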
3 Recommendations on Choice and Use of Available Routines
h02bbf solves dense integer programming problems using a branch and bound method.
h02bff solves dense integer or linear programming problems defined by an MPSX data file.
h02buf converts an MPSX data file defining an integer or a linear programming problem to the form required by e04mff/e04mfa or h02bbf.
h02bvf prints the solution to an integer or a linear programming problem using specified names for rows and columns.
h02bzf supplies further information on the optimum solution obtained by h02bbf.
h02cbf solves dense integer general quadratic programming problems.
h02ccf reads optional parameter values for h02cbf from an external file.
h02cdf supplies optional parameter values to h02cbf.
h02cef solves sparse integer linear programming or quadratic programming problems.
h02cff reads optional parameter values for h02cef from an external file.
h02cgf supplies optional parameter values to h02cef.
h03abf solves transportation problems (see Section 3.1).
h03adf solves shortest path problems using Dijkstra's algorithm.
h03bbf seeks an approximate solution to a (symmetric) classical travelling salesman problem using simulated annealing.
h02bbf, h02bff and h03abf treat all matrices as dense and hence are not intended for large sparse problems. For solving large sparse LP problems, use e04nqf or e04ugf/e04uga.
3.1 Transportation Problem
h03abf solves transportation problems. It uses integer arithmetic throughout and so produces exact results. On a few machines, however, there is a risk of integer overflow without warning, so the integer values in the data should be kept as small as possible by dividing out any common factors from the coefficients of the constraint or objective functions.
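Dividing out common factors before calling the routine can be sketched as follows. This is a minimal illustration assuming non-negative integer data; the function name `divide_out_common_factor` is hypothetical and the example figures are invented.

```python
from functools import reduce
from math import gcd


def divide_out_common_factor(values):
    """Divide a list of non-negative integers by their greatest common divisor.

    Returns (scaled_values, factor). If all values are zero the list is
    returned unchanged with a factor of 1.
    """
    factor = reduce(gcd, values, 0)
    if factor <= 1:
        return list(values), 1
    return [v // factor for v in values], factor
```

The quantities at the sources and destinations should be scaled together by one common factor so that they remain consistent with each other, while the cost coefficients can be scaled by a separate factor. For example, quantities `[300, 600, 450, 450]` reduce to `[2, 4, 3, 3]` with factor 150, and costs `[40, 60, 80, 20]` reduce to `[2, 3, 4, 1]` with factor 20. The optimal cost of the original problem is then the scaled optimal cost multiplied by both factors.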
3.2 Feature Selection – Best Subset Problem
h05aaf selects the best $n$ subsets of size $p$ using a reverse communication branch and bound algorithm.
h05abf selects the best $n$ subsets of size $p$ using a direct communication branch and bound algorithm.