Sandia National Laboratories
Daniel M. Dunlavy


Contact
Daniel M. Dunlavy
Senior Member of Technical Staff
dmdunla@sandia.gov
(505) 284-6092



Publications


Journal Articles


  • "QCS: A System for Querying, Clustering and Summarizing Documents." Daniel M. Dunlavy, Dianne P. O'Leary, John M. Conroy, and Judith D. Schlesinger. Information Processing & Management, in press (accepted October 2006).  [Preprint PDF]

    Abstract: Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present the Query, Cluster, Summarize (QCS) system, a novel hybrid information retrieval system that is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of methods in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best-known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend them to the evaluation of each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.

    Key words: information retrieval, latent semantic indexing, clustering, summarization, text processing, sentence trimming
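
To make the clustering step in the abstract above more concrete, here is a minimal sketch of plain spherical k-means, which groups unit-length document vectors by cosine similarity. It is an illustration only: the paper uses a generalized variant, and the function names, the toy term-count vectors, and the choice of initial centroids below are all assumptions, not taken from QCS.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(vectors, init_centroids, iters=10):
    """Plain spherical k-means: assign by cosine similarity, re-normalize centroids."""
    docs = [normalize(v) for v in vectors]
    centroids = [normalize(c) for c in init_centroids]
    labels = [0] * len(docs)
    for _ in range(iters):
        # Assign each document to the centroid with the highest cosine similarity.
        labels = [max(range(len(centroids)), key=lambda j: dot(d, centroids[j]))
                  for d in docs]
        # Recompute each centroid as the normalized sum of its members;
        # an empty cluster keeps its previous centroid.
        for j in range(len(centroids)):
            members = [d for d, lab in zip(docs, labels) if lab == j]
            if members:
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids

# Hypothetical term-count vectors for four documents over three terms.
toy_docs = [[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 0]]
labels, centroids = spherical_kmeans(toy_docs, init_centroids=[toy_docs[0], toy_docs[2]])
```

In this toy run the first two documents share one cluster and the last two share the other. A real system would cluster the documents returned by the retrieval step rather than raw term counts.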


  • "HOPE: A Homotopy Optimization Method for Protein Structure Prediction." Daniel M. Dunlavy, Dianne P. O'Leary, Dmitri Klimov and D. Thirumalai. Journal of Computational Biology, 12(10):1275-1288, December 2005.  [PDF]

    Abstract: We use a homotopy optimization method, HOPE, to minimize the potential energy associated with a protein model. The method uses the minimum energy conformation of one protein as a template to predict the lowest energy structure of a query sequence. This objective is achieved by following a path of conformations determined by a homotopy between the potential energy functions for the two proteins. Ensembles of solutions are produced by perturbing conformations along the path, increasing the likelihood of predicting correct structures. Successful results are presented for pairs of homologous proteins, where HOPE is compared to a variant of Newton's method and to simulated annealing.

    Key words: protein structure prediction, energy minimization, global optimization, homotopy method, simulated annealing.
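
The path-following idea in the abstract above can be sketched on a one-dimensional toy problem. The sketch assumes a convex homotopy H(x, t) = (1 - t) f0(x) + t f1(x) between a "template" energy f0 and a "target" energy f1, with gradient descent as the local minimizer. The quadratic energies, step sizes, and function names are hypothetical, and the ensemble-generating perturbations described in the abstract are omitted.

```python
def homotopy_minimize(grad_f0, grad_f1, x0, steps=20, lr=0.1, inner_iters=200):
    """Track minimizers of H(x, t) = (1 - t) * f0(x) + t * f1(x) as t goes from 0 to 1."""
    x = x0
    path = [x]
    for k in range(1, steps + 1):
        t = k / steps
        # Local minimization of H(., t) by gradient descent,
        # warm-started from the minimizer found at the previous t.
        for _ in range(inner_iters):
            g = (1 - t) * grad_f0(x) + t * grad_f1(x)
            x -= lr * g
        path.append(x)
    return x, path

# Toy energies (assumed for illustration): template minimum at x = 1, target minimum at x = 3.
grad_template = lambda x: 2.0 * (x - 1.0)
grad_target = lambda x: 2.0 * (x - 3.0)

# Start at the known minimizer of the template energy.
x_final, path = homotopy_minimize(grad_template, grad_target, x0=1.0)
```

Starting from the template minimizer, each stage only needs a small local correction, which is the appeal of a homotopy over minimizing the target energy from an arbitrary starting point.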


  • "Structure Preserving Algorithms for Perplectic Eigenproblems." D. Steven Mackey, Niloufer Mackey, and Daniel M. Dunlavy. Electronic Journal of Linear Algebra, 13:10-39, February 2005.  [PDF]  [Supplemental Material]

    Abstract: Structured real canonical forms for matrices in R^{n x n} that are symmetric or skew-symmetric about the anti-diagonal as well as the main diagonal are presented, and Jacobi algorithms for solving the complete eigenproblem for three of these four classes of matrices are developed. Based on the direct solution of 4 x 4 subproblems constructed via quaternions, the algorithms calculate structured orthogonal bases for the invariant subspaces of the associated matrix. In addition to preserving structure, these methods are inherently parallelizable, numerically stable, and show asymptotic quadratic convergence.

    Key words: Canonical form, Eigenvalues, Eigenvectors, Jacobi method, Double structure preserving, Symmetric, Persymmetric, Skew-symmetric, Perskew-symmetric, Centrosymmetric, Perplectic, Quaternion, Tensor product, Lie algebra, Jordan algebra, Bilinear form.
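
For readers unfamiliar with Jacobi methods, the sketch below shows the classical cyclic Jacobi iteration for a real symmetric matrix, which zeroes one off-diagonal pair at a time with a plane rotation. This is not the structure-preserving algorithm of the paper above, which solves 4 x 4 subproblems built from quaternions. It only illustrates the general sweep pattern, and the function name and test matrix are assumptions.

```python
import math

def jacobi_eigenvalues(a, sweeps=10):
    """Classical cyclic Jacobi iteration for a real symmetric matrix (list of lists)."""
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle chosen so that the (p, q) entry of J^T A J is zero.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                # A <- A J (updates columns p and q).
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # A <- J^T A (updates rows p and q).
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    # After convergence the diagonal holds the eigenvalues.
    return sorted(a[i][i] for i in range(n))

eigs = jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]])
```

Because each rotation touches only two rows and two columns, rotations on disjoint index pairs can be applied independently, which is the source of the parallelism mentioned in the abstract.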



Conference Proceedings


Technical Reports