DIMACS TR: 2000-34

An Online Scheduling Problem with Job Set-ups

Authors: Srikrishna Divakaran and Michael Saks


Machine scheduling and paging are two fundamental problems that have been studied extensively from the view of online algorithms. In both problems, one processes a sequence $J_1,\ldots,J_n$ of {\em jobs} (or {\em page requests}) that arrive over time. In the paging problem, requests must be answered in the order of arrival because the requests arise in the context of the execution of a sequential program. The need for decision making comes from the presence of a cache which can store a limited number of pages, where serving a cached page is considerably cheaper than an uncached one. Thus algorithms for paging are focused around the choice of caching policy. In scheduling, there is no cache, but the scheduler has (at least limited) freedom to reorder the jobs, and choosing the order is the major issue in the design of online scheduling algorithms. In computer systems, however, there are important situations where both caching and job reordering are available. One very natural setting where this happens is handling web page requests at a high volume web site. As in the usual paging problem, the server must manage a cache. But here the requests are (essentially) independent, so the server is free to reorder them.

In this paper, we begin the study of online algorithms for scheduling systems where both caching and reordering are available. We consider a simple, but already interesting and nontrivial, case: that of a single machine with a cache of unit size, and uniform page size. This particular special case is also natural in the usual machine scheduling context where jobs are classified into one of $\{1,2,...,F\}$ job types, and before processing a job, the machine must be configured (or set-up) according to the type of the job. This configuring (or set-up) is required unless the previously served job is of the same type as the current one. The current configuration of the machine in this scheduling problem, corresponds exactly to the contents of the unit cache in the page server problem.

In this context, we consider the online problem of scheduling a sequence $J=(J_{1},J_{2},...,J_{N})$ of jobs with release times, processing times and sequence independent set-up times, on a single machine, to minimize the maximum flow time. A partitioning of the jobs into $F$ job types is given. A set-up is required at the start of each batch, where a batch is the largest set of contiguously scheduled jobs of the same job type. This problem is a special case of the maximum lateness problem, a problem known to be $NP$-hard even for very restrictive special cases and for which there are no known polynomial time $O(1)$ approximation algorithms. In this paper, we present a polynomial time $O(1)$-competitive online algorithm for the maximum flow time problem and also prove the $NP$-Hardness of the offline version of this problem.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-34.ps.gz

DIMACS Home Page