In this report, parallelism is used as a way to solve {\em discrete optimization problems}. We search for an {\em optimal solution} $x^{*}\in S^{*}$, where $S^{*}$ is the set of all best solutions in a {\em domain} $S$, defined as the discrete set of all vectors $x$ in the solution space that satisfy a set of constraints. Improving the search efficiency is of considerable importance, since exhaustive search is often impracticable. The method called {\em branch-and-bound} (denoted B\&B) is a tree search algorithm often used as an intelligent search in this context. Its principle lies in successive decompositions of the original problem into smaller disjoint subproblems until an optimal solution is found. The algorithm consists of a heuristic iterative search that avoids visiting those subproblems which are known not to contain an optimal solution.
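As a concrete illustration (not taken from the report), the bounding-and-pruning principle can be sketched as a minimal best-first B\&B for the 0/1 knapsack problem; the function name and the use of the fractional relaxation as the bound are our assumptions for this example.

```python
import heapq

def branch_and_bound_knapsack(values, weights, capacity):
    """Best-first branch-and-bound for the 0/1 knapsack problem.

    A node fixes a decision for the first `level` items; its bound is the
    fractional (LP) relaxation of the remaining items, which never
    underestimates the best completion, so pruning against the incumbent
    is safe.
    """
    n = len(values)
    # Sorting by value density makes the fractional bound tighter.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    def bound(level, value, room):
        b = value
        for i in range(level, n):
            if w[i] <= room:
                room -= w[i]
                b += v[i]
            else:
                b += v[i] * room / w[i]  # fractional fill of the next item
                break
        return b

    best = 0
    # heapq is a min-heap, so bounds are negated to pop the best node first.
    heap = [(-bound(0, 0, capacity), 0, 0, capacity)]
    while heap:
        neg_b, level, value, room = heapq.heappop(heap)
        if -neg_b <= best or level == n:
            continue  # pruned: this subproblem cannot improve the incumbent
        for take in (True, False):  # branch: include item `level` or not
            if take and w[level] > room:
                continue
            nval = value + (v[level] if take else 0)
            nroom = room - (w[level] if take else 0)
            if take:
                best = max(best, nval)
            nb = bound(level + 1, nval, nroom)
            if nb > best:
                heapq.heappush(heap, (-nb, level + 1, nval, nroom))
    return best
```

On the classic instance with values (60, 100, 120), weights (10, 20, 30) and capacity 50, the search finds the optimum 220 while pruning the subtree whose bound (180) cannot beat the incumbent.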

Parallel processing has been widely studied as an additional source of improvement in search efficiency in discrete optimization. We review in this report the literature pertinent to {\em parallel branch-and-bound algorithms}. The focus is on {\em distributed memory} parallel systems, which are composed of a set of processors connected by a physical network, each one with its own local memory. Communications between two different processors are implemented through the exchange of messages over the network links connecting them. An attractive feature of B\&B is that disjoint subproblems can be decomposed simultaneously and independently; the challenge is how to use the set of processors to improve the search efficiency of B\&B algorithms by concurrently decomposing several subproblems at each iteration. In general terms, the potential to be explored consists of a reduction in the number of iterations that is linear in the number of processors. However, the actual reduction can deviate considerably from linear due to possible {\em speedup anomalies}.
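The iteration-count reduction can be illustrated (again, as our own sketch, not the report's) by simulating a synchronous parallel B\&B in which up to $p$ best open subproblems are decomposed in lock step at each iteration; the instance and function name are assumptions of this example.

```python
import heapq

def parallel_bb_knapsack(values, weights, capacity, p):
    """Simulated synchronous parallel B&B for the 0/1 knapsack problem:
    each iteration pops up to `p` best open subproblems and decomposes
    them "concurrently", modeling p processors working in lock step.
    Returns (best value found, number of iterations).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    def bound(level, value, room):
        # Fractional (LP) relaxation of the remaining items: an upper bound.
        b = value
        for i in range(level, n):
            if w[i] <= room:
                room -= w[i]
                b += v[i]
            else:
                b += v[i] * room / w[i]
                break
        return b

    best, iterations = 0, 0
    heap = [(-bound(0, 0, capacity), 0, 0, capacity)]
    while heap:
        iterations += 1
        # One synchronous round: each of the p "processors" takes one node.
        batch = [heapq.heappop(heap) for _ in range(min(p, len(heap)))]
        children = []
        for neg_b, level, value, room in batch:
            if -neg_b <= best or level == n:
                continue  # pruned or a leaf: this processor idles this round
            for take in (True, False):
                if take and w[level] > room:
                    continue
                nval = value + (v[level] if take else 0)
                nroom = room - (w[level] if take else 0)
                if take:
                    best = max(best, nval)
                children.append((level + 1, nval, nroom))
        # Generated subproblems are exchanged and reinserted before the
        # next round; only promising ones survive the bound test.
        for level, nval, nroom in children:
            nb = bound(level, nval, nroom)
            if nb > best:
                heapq.heappush(heap, (-nb, level, nval, nroom))
    return best, iterations
```

On a small knapsack instance, both $p=1$ and $p=4$ find the same optimum, but the parallel run needs fewer synchronous iterations; on other instances the concurrent pruning of the batch can make the reduction better or worse than linear, which is the source of the speedup anomalies mentioned above.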

Parallel B\&B is traditionally considered an irregular parallel algorithm, because the structure of the search tree is not known beforehand. Unnecessary work may result if a subproblem that does not contain an optimal solution is chosen and assigned to a processor for decomposition. Moreover, the use of a distributed memory parallel system incurs a number of overheads, including communication overheads and idle time due to workload imbalance and contention on common data structures. Several special techniques have been developed to address these problems, essentially related to the amount of ``necessary'' work assigned to each processor. This report reviews these techniques, attempting to tie the area of parallel B\&B together under a common, uniform terminology. It should be profitable both to the parallel processing and to the operations research communities.

Relation to Dimacs:

Part of this work was done when the second author was visiting Dimacs
in May 1994, partially supported by Dimacs NSF grant STC--91--199999 and
the NJ Commission.

Paper available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-03.ps.gz
