Cynthia Phillips' Abstracts

Cynthia Phillips
Massively Parallel Computing Research Laboratory
Sandia National Laboratories

Warning:Abstracts contain inline TeX

Scheduling Jobs that Arrive Over Time

Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Clifford Stein
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH 03755-3510
Joel Wein
Department of Computer Science
Polytechnic University
Brooklyn, NY 11201
To appear in Mathematical Programming B

Abstract: A natural and basic problem in scheduling theory is to provide good average quality of service to a stream of jobs that arrive over time. In this paper we consider the problem of scheduling $n$ jobs that are released over time in order to minimize the average completion time of the set of jobs. In contrast to the problem of minimizing average completion time when all jobs are available at time $0$, all the problems that we consider are ${\cal NP}$-hard, and essentially nothing was known about constructing good approximations in polynomial time.

We give the first constant-factor approximation algorithms for several variants of the single and parallel machine model. Many of the algorithms are based on interesting algorithmic and structural relationships between preemptive and nonpreemptive schedules and linear programming relaxations of both. Many of the algorithms generalize to the minimization of average {\em weighted} completion time as well.

(gzip'd) Paper

Improved Bounds on Relaxations of a Parallel Machine Scheduling Problem

Soumen Chakrabarti
IBM Almaden
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Andreas Schulz
Fachbereich Mathematik
Technische Univeritat Berlin
Berlin, Germany
David B. Shmoys
School of Operations Research and Industrial Engineering
Cornell University
Ithaca, NY 14853
Clifford Stein
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH 03755-3510
Joel Wein
Department of Computer Science
Polytechnic University
Brooklyn, NY 11201
To appear in Journal of Combinatorial Optimization.

Abstract: We consider the problem of scheduling $n$ jobs with release dates on $m$ identical parallel machines to minimize the average completion time of the jobs. We prove that the ratio of the average completion time of the optimal nonpreemptive schedule to that of the optimal preemptive schedule is at most $\frac{7}{3}$, improving a bound of $(3-\frac{1}{m})$ due to Phillips, Stein and Wein. We then use our technique to give an improved bound on the quality of a linear programming relaxation of the problem considered by Hall, Schulz, Shmoys and Wein.

(gzip'd) Paper

Constructing Evolutionary Trees in the Presence of Polymorphic Characters

Maria Bonet
Universitat Politecnica De Catalunya
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Tandy Warnow
Tandy Warnow
Department of Computer and Information Science
University of Pennsylvania,br> Philadelphia, PA 19104
Shibu Yooseph
DIMACS Center
Piscataway, NJ 08855-4580
to appear in SIAM Journal on Computing

Abstract: Most phylogenetics literature and construction methods based upon characters presume monomorphism (one state per character per species), and yet polymorphism (multiple states per character per species) is well documented in both Biology and Historical Linguistics. In this paper we consider the problem of inferring evolutionary trees for polymorphic characters. We show efficient algorithms for the construction of perfect phylogenies from polymorphic data. These methods have been used to help construct the evolutionary tree proposed by Warnow, Ringe, and Taylor for the Indo-European family of languages, which was presented by invitation at the National Academy of Sciences in November 1995.

(gzip'd) Paper

Constructing Computer Virus Phylogenies

Leslie A. Goldberg
Department of Computer Science
University of Warwick
Coventry CV4 7AL, UK
Paul Goldberg
Dept of Computer Science and Applied Math
Aston University
Birmingham B4 7ET
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Gregory B. Sorkin
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598
to appear in Journal of Algorithms

Abstract: There has been much recent algorithmic work on the problem of reconstructing the evolutionary history of biological species. Computer virus specialists are interested in finding the evolutionary history of computer viruses --- a virus is often written using code fragments from one or more other viruses, which are its immediate ancestors. A phylogeny for a collection of computer viruses is a directed acyclic graph whose nodes are the viruses and whose edges map ancestors to descendants and satisfy the property that each code fragment is ``invented'' only once. To provide a simple explanation for the data, we consider the problem of constructing such a phylogeny with a minimum number of edges. This optimization problem is NP-hard, and we present positive and negative results for associated approximation problems. When tree solutions exist, they can be constructed and randomly sampled in polynomial time.

(gzip'd) Paper

Task Scheduling in Networks

Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Clifford Stein
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH 03755-3510
Joel Wein
Department of Computer Science
Polytechnic University
Brooklyn, NY 11201
SIAM Journal on Discrete Mathematics Vol. 10, No. 4, 11/97, pp. 573-598.

Abstract: Scheduling a set of tasks on a set of machines so as to yield an efficient schedule is a basic problem in computer science and operations research. Most of the research on this problem incorporates the potentially unrealistic assumption that communication between the different machines is instantaneous. In this paper we remove this assumption and study the problem of {\em network scheduling}, where each job originates at some node of a network, and in order to be processed at another node must take the time to travel through the network to that node.

Our main contribution is to give approximation algorithms and hardness proofs for fully general forms of the fundamental problems in network scheduling. We consider two basic scheduling objectives: minimizing the makespan, and minimizing the average completion time. For the makespan we prove small constant factor hardness-to-approximate and approximation results. For the average completion time, we give a log-squared approximation algorithm for the most general form of the problem. The techniques used in this approximation are fairly general and have several other applications. For example, we give the first non-trivial approximation algorithm to minimize the average weighted completion time of a set of jobs on related or unrelated machines, with or without a network.

Minimizing phylogenetic number to find good evolutionary trees

Leslie A. Goldberg
Department of Computer Science
University of Warwick
Coventry CV4 7AL, UK
Paul Goldberg
Dept of Computer Science and Applied Math
Aston University
Birmingham B4 7ET
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Elizabeth Sweedyk
DIMACS Center
Piscataway, NJ 08855-4580
Tandy Warnow
Department of Computer and Information Science
University of Pennsylvania,br> Philadelphia, PA 19104
Discrete Applied Mathematics Vol. 71 Nos. 1-3, 5 December 1996, pp. 111-136.

Abstract: Inferring phylogenetic trees is a fundamental problem in computational biology. We present a new objective criterion, the {\it phylogenetic number}, for evaluating evolutionary trees for species defined by biomolecular sequences or other qualitative characters. The phylogenetic number of a tree $T$ is the maximum number of times that any given character state arises in $T$. By contrast, the classical {\it parsimony} criterion measures the total number of times that different character states arise in $T$. We consider the following related problems: finding the tree with minimum phylogenetic number, and computing the phylogenetic number of a given topology in which only the leaves are labeled by species. When the number of states is bounded (as is the case for biomolecular sequence characters), we can solve the second problem in polynomial time. Given the topology for an evolutionary tree, we can also compute a phylogeny with phylogenetic number 2 (when one exists) for an arbitrary number of states. This algorithm can be used to further distinguish trees that are equal under parsimony. We also consider a number of other related problems.

The Asymmetric Median Tree -- A New Model for Building Consensus Trees

Cynthia A. Phillips
Massively Parallel Computing Research Laboratory
Sandia National Laboratories
Albuquerque, NM 87185-1110
Tandy Warnow
Department of Computer and Information Science
University of Pennsylvania,br> Philadelphia, PA 19104
Discrete Applied Mathematics Vol. 71 Nos. 1-3, 5 December 1996, pp. 311-336.

Abstract: Inferring the consensus of a set of different evolutionary trees for a given species set is a well-studied problem, for which several different models have been proposed. In this paper, we propose a new optimization problem for consensus tree construction, which we call the {\em asymmetric median tree}, or {\em AMT}. Our main theoretical result is the equivalence between the asymmetric median tree problem on $k$ trees and the maximum independent set (MIS) problem on $k$-colored graphs. Although the problem is {\em NP-hard} for three or more trees, we have polynomial time algorithms to construct the AMT for two trees and an approximation algorithm for three or more trees. We define a measure of phylogenetic resolution and show that our algorithms (both exact and approximate) produce consensus trees that on every input are at least as resolved as the standard models in use (strict consensus, majority tree, Nelson Tree). Finally, we show that the AMT combines desirable features of many of the standard consensus tree models in use.

Optimal Time-Critical Scheduling via Resource Augmentation

Cynthia A. Phillips
Massively Parallel Computing Research Laboratory
Sandia National Laboratories
Albuquerque, NM 87185-1110
Clifford Stein
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH 03755-3510
Eric Torng
Department of Computer Science
Michigan State University
East Lansing, MI 48824-1226
Joel Wein
Department of Computer Science
Polytechnic University
Brooklyn, NY 11201
Proceedings of the 29th Annual ACM Symposium on Theory of Computing, El Paso, TX, May 4-6, 1997, pp. 140-149.

Abstract: We consider two fundamental problems in dynamic scheduling: scheduling to meet deadlines in a preemptive multiprocessor setting, and scheduling to provide good response time in a number of scheduling environments. When viewed from the perspective of traditional worst-case analysis, no good on-line algorithms exist for these problems, and for some variants no good off-line algorithms exist unless ${\cal P}= {\cal NP}$.

We study these problems using a relaxed notion of competitive analysis, introduced by Kalyanasundaram and Pruhs, in which the on-line algorithm is allowed more resources than the optimal off-line algorithm to which it is compared. Using this approach, we establish that several well-known on-line algorithms, that have poor performance from an absolute worst-case perspective, are optimal for the problems in question when allowed moderately more resources. For the optimization of average flow time, these are the first results of any sort, for any ${\cal NP}$-hard version of the problem, that indicate that it might be possible to design good approximation algorithms.

(gzip'd) STOC Paper

Approximation algorithms for the fixed-topology phylogenetic number problem

Mary Cryan
Department of Computer Science
University of Warwick
Coventry CV4 7AL, UK
Leslie A. Goldberg
Department of Computer Science
University of Warwick
Coventry CV4 7AL, UK
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Proceedings of the 8th Annual Symposium on Combinatorial Pattern Matching, June 1997.

Abstract: In the $\ell$-phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than $\ell$ connected components. We consider the fixed-topology version of this problem for fixed-topologies of arbitrary degree. This version of the problem is known to be ${\cal NP}$-complete for $\ell \ge 3$ even for degree-$3$ trees in which no state labels more than $\ell+1$ leaves (and therefore there is a trivial $\ell + 1$ phylogeny). We give a $2$-approximation algorithm for all $\ell \ge 3$ for arbitrary input topologies and we give an optimal approximation algorithm that constructs a $4$-phylogeny when a $3$-phylogeny exists. Dynamic programming techniques, which are typically used in fixed-toplogy problems, cannot be applied to $\ell$-phylogeny problems. Our $2$-approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are polymorphic.

(gzip'd) Paper

Beyond Islands: Runs in Clone-Probe Matrices

David B. Wilson
Computer Science Division
University of California, Berkeley
David S. Greenberg
Sandia National Laboratories
Albuquerque, NM 87185-1110
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Proceedings of the First International Conference on Molecular Biology (RECOMB), Santa Fe, NM, Jan. 20-23, 1997, pp. 320-329.

Abstract: Physical mapping is a fundamental component of the human genome project. A physical map consists of a set of probes which mark unique positions on a long fragment of DNA, together with the relative order of the probes on the DNA. This order is inferred from clone-probe hybridization experiments, which determine the probes contained within various fragments of the genome. In practice, the order of the probes is not completely determined by the hybridization experiments. To better design these experiments, researchers have analyzed the expected distribution of ``islands'' --- groups of probes which are known to be near one another --- that would result from hybridization experiments with different numbers of clones and probes. In this paper we analyze the distribution of ``runs'' --- groups of probes whose relative order is completely determined by the hybridization experiment. We include analytic, numerical, Monte Carlo, and simulation results on runs, which can further assist in the design of these experiments.

(gzip'd) RECOMB Paper

Resource Management in a Parallel Mixed Integer Programming Package

Jonathan Eckstein
Rutgers University
New Brunswick, NJ 08903
William E. Hart, Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Proceedings Intel Supercomputer Users Group 13th Annual Conference. Albuquerque, NM. June 11-13, 1997.

Abstract: This paper describes resource management in a massively parallel, distributed-memory implementation of a branch-and-bound method for mixed integer programming (MIP) problems. Our parallel branch-and-bound algorithm is decomposed into a variety of interacting but largely asynchronous tasks. These tasks perform the algorithm's basic computations and maintain various kinds of global information or attempt to balance workloads. The optimal amount of parallelism and data centralization varies from task to task. The organization of our application code is intended to accomodate this variability, while making flexible and full use of the CPU. In particular, each processor uses an apparently novel scheduling mechanism to maintain a balance between the fraction of time devoted to compute-intensive but interruptable tasks --- computing bounds to subproblems and heuristically searching for better feasible solutions. While described in the specific context of mixed integer programming, we expect our techniques to apply to massively parallel implementations of other complicated, practical branch-and-bound procedures.

Paper: (*.ps.gz)

Enabling Department-Scale Simulation

David S. Greenberg, William E. Hart, Cynthia A. Phillips
Massively Parallel Computing Research Laboratory
Sandia National Laboratories
Albuquerque, NM 87185-1110

To appear in Algorithms for Parallel Processing. IMA Volumes in Mathematics and Its Applications. 1997.

Abstract: The Department of Energy (DOE) national laboratories have one of the longest and most consistent histories of supercomputer use. We summarize the architecture of DOE's new supercomputers that are being built for the Accelerated Strategic Computing Initiative (ASCI). We then argue that in the near future scaled-down versions of these supercomputers with petaflop-per-weekend capabilities could become widely available to hundreds of research and engineering departments. The availability of such computational resources will allow simulation of physical phenomena to become a full-fledged third branch of scientific exploration, along with theory and experimentation. We describe the ASCI and other supercomputer applications at Sandia National Laboratories, and discuss which lessons learned from Sandia's long history of supercomputing can be applied in this new setting. (gzip'd) Paper

Improved Scheduling Algorithms for Minsum Criteria

Soumen Chakrabarti
IBM Almaden
Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110
Andreas Schulz
Fachbereich Mathematik
Technische Univeritat Berlin
Berlin, Germany
David B. Shmoys
School of Operations Research and Industrial Engineering
Cornell University
Ithaca, NY 14853
Clifford Stein
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH 03755-3510
Joel Wein
Department of Computer Science
Polytechnic University
Brooklyn, NY 11201
Proceedings of the 23rd ICALP, July 1996.

Abstract: We consider the problem of finding near-optimal solutions for a variety of $\cal NP$-hard scheduling problems for which the objective is to minimize the total weighted completion time. Recent work has led to the development of several techniques that yield constant worst-case bounds in a number of settings. We continue this line of research by providing improved performance guarantees for several of the most basic scheduling models, and by giving the first constant performance guarantee for a number of more realistically constrained scheduling problems. For example, we give an improved performance guarantee for minimizing the total weighted completion time subject to release dates on a single machine, and subject to release dates and/or precedence constraints on identical parallel machines. We also give improved bounds on the power of preemption in scheduling jobs with release dates on parallel machines.

We give improved on-line algorithms for many more realistic scheduling models, including environments with parallelizable jobs, jobs contending for shared resources, tree precedence-constrained jobs, as well as shop scheduling models. In several of these cases, we give the first constant performance guarantee achieved on-line. Finally, one of the consequences of our work is the surprising structural property that there are schedules that simultaneously approximate the optimal makespan and the optimal weighted completion time to within small constants. Not only do such schedules exist, but we can find approximations to them with an on-line algorithm.

(gzip'd) Paper

The Network Inhibition Problem

Cynthia A. Phillips
Sandia National Laboratories
Albuquerque, NM 87185-1110

Proceedings of the 25th Annual ACM Symposium on the Theory of Computing, May 16-18, 1993, pp.776-785.

Abstract: In the network inhibition problem, we wish to find the most cost-effective way to reduce the ability of a network to transport a commodity. Potential applications include disabling polluting pipe systems, military supply lines, and illicit drug networks. We wish to compute a fixed-budget attack strategy which will maximally inhibit a network's capability.

In this paper we introduce the network inhibition problem, prove several NP-completeness results, and give algorithms for planar and outerplanar graphs. We prove the problem is strongly NP-complete for general graphs, even of degree at most $3$, and weakly NP-complete for series-parallel graphs, grid graphs, bandwidth-$k$ graphs with $k \ge 3$, and Halin graphs (and thus for $k$-outerplanar graphs with $k \ge 2$). We describe three pseudopolynomial-time algorithms for planar graphs and show how to convert them into fully polynomial-time approximation schemes (FPTAS). Techniques used in these algorithms also give a FPTAS for the restricted shortest path problem in general graphs that is a factor of $\abs{E}$ faster than the best previous scheme whose running time does not depend on the input numbers and the fastest overall for moderately dense graphs and reasonable error parameters. Finally, we describe a polynomial-time algorithm for inhibition of outerplanar graphs. Thus, network inhibition separates outerplanar graphs from series-parallel graphs, separates $1$-outerplanar from $2$-outerplanar graphs, and separates bandwidth-$2$ graphs from bandwidth-$3$ graphs.

An appendix containing more details and all proofs is available in hardcopy only (for now). Contact caphill@cs.sandia.gov

Finding Minimum-Quotient Cuts in Planar Graphs

James K. Park
Bremer Associates
Cambridge, MA 02139
Cynthia A. Phillips
Massively Parallel Computing Research Laboratory
Sandia National Laboratories
Albuquerque, NM 87185-1110
Proceedings of the 25th Annual ACM Symposium on the Theory of Computing, May 16-18, 1993, pp.766-775.

Abstract: Given a graph $G = (V,E)$ where each vertex $v \in V$ is assigned a weight $w(v)$ and each edge $e \in E$ is assigned a cost $c(e)$, the {\em quotient} of a cut partitioning the vertices of $V$ into sets $S$ and $\bar{S}$ is $c(S,\bar{S}) / \min \{ w(S), w(\bar{S}) \}$, where $c(S,\bar{S})$ is the sum of the costs of the edges crossing the cut and $w(S)$ and $w(\bar{S})$ are the sum of the weights of the vertices in $S$ and $\bar{S}$, respectively. The problem of finding a cut whose quotient is minimum for a graph has attracted considerable attention in recent years, due in large part to the work of Rao \cite{Rao87,Rao92} and Leighton and Rao \cite{LeightonR88}. They have shown that an algorithm (exact or approximation) for the minimum-quotient-cut problem can be used to obtain an approximation algorithm for the more famous minimum-$b$-balanced-cut problem, which requires finding a cut $(S,\bar{S})$ minimizing $c(S,\bar{S})$ subject to the constraint $bW \leq w(S) \leq (1-b)W$, where $W$ is the total vertex weight and $b$ is some fixed balance in the range $0 < b \leq 1/2$. Unfortunately, the minimum-quotient-cut problem is strongly NP-hard for general graphs, and the best polynomial-time approximation algorithm known for the general problem (due to Leighton and Rao \cite{LeightonR88}) may produce a cut whose quotient is $\Theta (\lg n)$ times optimal, where $n$ is the size of the graph. However, for planar graphs, the minimum-quotient-cut problem appears more tractable. In particular, Rao \cite{Rao87,Rao92} has developed several efficient approximation algorithms for the planar version of the problem, each of which always produces a cut whose quotient is at most some constant times optimal.

In this paper, we improve Rao's results, both in terms of their accuracy and their speed. We begin with two pseudopolynomial-time {\em exact} algorithms for the planar minimum-quotient-cut problem. As Rao's most accurate approximation algorithm for the problem --- also a pseudopolynomial-time algorithm --- guarantees only a $1.5$-times-optimal cut, our algorithms represent a significant advance. They are also the first polynomial-time algorithms known for the unweighted version of the planar minimum-quotient-cut problem (assigning weight $1$ to all vertices). We also show how to speed up Rao's other two approximation algorithms, which run in polynomial time and have somewhat weaker accuracy bounds. Finally, we prove that the minimum-quotient-cut problem is weakly NP-hard for $2$-outerplanar and series-parallel graphs, even if all edge costs are $1$, and that the minimum-$b$-balanced-cut problem is weakly NP-hard for trees of bandwidth $2$. This last result is the first intractability result known for the planar minimum-$b$-balanced-cut problem, and thus it partially resolves the longstanding open question of this problem's true complexity.


Cynthia Phillips' Home Page
Sandia National Laboratories

This page maintained by caphill@cs.sandia.gov
Last modified: October 27, 1997