Title: Retrofitting Legacy Code for Authorization Policy Enforcement
For over three decades, we have been taught the Principle of Design for Security: to create a secure system, design it to be secure from the ground up. To date, however, only a small fraction of software developed has followed this principle. Economic pressures and diverse security requirements force developers to focus on functionality and performance. Security mechanisms are typically added only long after deployment, by retrofitting legacy software. Unfortunately, existing techniques to retrofit legacy software for security are manual, time-consuming, and error-prone.
In my talk, I will focus on the problem of retrofitting legacy software with mechanisms for authorization policy enforcement. A developer faced with the task of adding authorization checks must answer two key questions: (a) what are the security-sensitive operations that must be mediated with authorization checks? and (b) where in the software are these operations performed? I will present program analysis and transformation techniques that reduce the manual effort needed to answer these questions and add authorization checks. The cornerstone of these techniques is a formalism called fingerprints that helps characterize security-sensitive operations. I will present a static program analysis technique to mine fingerprints from legacy software, and show how fingerprints aid in adding authorization checks. Experiments with several real-world software systems show that this technique can drastically reduce the effort needed to retrofit legacy software with security mechanisms.
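To make concrete what it means to mediate a security-sensitive operation with an authorization check, here is a minimal sketch in Python. It is purely illustrative, not the talk's system: the `policy` set, the `check` helper, and the operation names are all hypothetical.

```python
# Hedged illustration (not the talk's actual tooling): a retrofitted
# authorization check mediating a security-sensitive operation.
# The policy representation and names here are hypothetical.

class AuthorizationError(Exception):
    pass

def check(policy, subject, operation, resource):
    """The mediation step a retrofitting tool must insert: consult the
    policy before the security-sensitive operation is allowed to run."""
    if (subject, operation, resource) not in policy:
        raise AuthorizationError(f"{subject} may not {operation} {resource}")

def read_document(policy, subject, path, store):
    check(policy, subject, "read", path)   # retrofitted authorization check
    return store[path]                     # the security-sensitive operation

policy = {("alice", "read", "/doc1")}      # toy allow-list policy
store = {"/doc1": "contents"}
```

The hard part the talk addresses is finding, in a large legacy codebase, every place where an operation like the `store[path]` access above occurs, so that a check like `check(...)` can be inserted in front of it.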
Title: On Limits of Automatic Pattern Learning
Decades of research in automatic pattern recognition have produced many learning algorithms. Yet for a new learning task it is still difficult to know which method will work best. Often it is unclear whether the limit in classification accuracy is due to a deficiency in the methods or is intrinsic to the task with the given data. We describe measures that characterize the intrinsic complexity of a classification problem and relate it to classifier performance. The measures reveal that a collection of real-world problems spans an interesting continuum, from problems that are easily learnable to those for which no learning is possible. We discuss our results on identifying the domains of dominant competence of several popular classifiers in this measurement space. We also describe an exploratory data visualization tool that helps in understanding the data geometry, and some real-world applications in science and engineering.
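As a flavor of what a data-complexity measure looks like, here is one widely used example: Fisher's discriminant ratio for a single feature. This is a sketch of the general idea only; the specific measures discussed in the talk may differ.

```python
import statistics

def fisher_ratio(class_a, class_b):
    """Fisher's discriminant ratio for one feature:
    (mu_a - mu_b)^2 / (var_a + var_b).
    Larger values indicate classes that are easier to separate
    along this feature; values near zero indicate heavy overlap."""
    mu_a = statistics.fmean(class_a)
    mu_b = statistics.fmean(class_b)
    var_a = statistics.pvariance(class_a)
    var_b = statistics.pvariance(class_b)
    return (mu_a - mu_b) ** 2 / (var_a + var_b)

# Well-separated classes yield a large ratio; overlapping ones a small one.
easy = fisher_ratio([0.0, 0.1, 0.2], [5.0, 5.1, 5.2])
hard = fisher_ratio([0.0, 1.0, 2.0], [0.5, 1.5, 2.5])
```

Such measures depend only on the data, not on any classifier, which is what lets them locate a problem on the continuum from easily learnable to unlearnable.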
Title: Painting Rectilinear Pictures Without Picasso
I will talk about the following fun artistic question: Given a rectilinear blue-and-white target pattern, how many steps does it take to "paint" the pattern, starting from an all-white canvas, if in each step one chooses a rectangle and paints it blue or white, overwriting any previous colors? In the world of art, this problem has application to cubism (cf. Picasso, Braque, Cezanne). In computer science it has application to minimizing *access control lists*, which are used by routers to decide whether to forward or drop an incoming packet.
I will show how, for a (useful) special case, one can compute the exact optimum in polynomial time, and I will discuss an approximation algorithm for the NP-complete general case.
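The power of overwriting is easy to see on a small example. The sketch below (my illustration, not from the talk) paints a 4x4 blue frame in two steps, where painting only blue strips would take four.

```python
def paint(canvas, r1, c1, r2, c2, color):
    """Overwrite the axis-aligned rectangle [r1..r2] x [c1..c2] with color,
    covering whatever was painted there before."""
    return [
        [color if r1 <= r <= r2 and c1 <= c <= c2 else cell
         for c, cell in enumerate(row)]
        for r, row in enumerate(canvas)
    ]

W, B = "white", "blue"

# Target: a blue frame (border) on a 4x4 canvas. With overwriting we need
# only 2 steps: fill everything blue, then punch a white hole in the middle.
canvas = [[W] * 4 for _ in range(4)]
canvas = paint(canvas, 0, 0, 3, 3, B)   # step 1: all blue
canvas = paint(canvas, 1, 1, 2, 2, W)   # step 2: white interior

frame = [[B, B, B, B],
         [B, W, W, B],
         [B, W, W, B],
         [B, B, B, B]]
```

The same "later rules override earlier ones" semantics is exactly how access control lists are evaluated, which is where the minimization problem comes from.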
This is joint work with David A. Applegate, Gruia Calinescu, David S. Johnson, Katrina Ligett, and Jia Wang.
Title: Reverse engineering of biological systems using discrete mathematics and evolutionary computation tools
One of the central problems in systems biology is to model gene regulatory networks from experimental data. In this talk, a method that infers multi-state, discrete dynamic networks from one or more time series will be introduced. The method utilizes techniques from computational algebra and the theory of evolutionary algorithms. Examples of applications to gene regulatory networks will be shown.
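For readers unfamiliar with multi-state discrete dynamic networks, the sketch below simulates a toy polynomial dynamical system over the finite field F_3, the kind of model such algebraic methods work with. The update rules here are invented for illustration, not inferred from any real data.

```python
# Toy multi-state discrete dynamic network: three "genes", each taking
# states in {0, 1, 2}, updated by polynomial rules over the field F_3.
# These particular rules are made up for illustration.
P = 3  # field size

def step(state):
    x1, x2, x3 = state
    return (
        x2 % P,                 # f1 = x2
        (x1 + x3) % P,          # f2 = x1 + x3
        (2 * x1 * x2 + 1) % P,  # f3 = 2*x1*x2 + 1
    )

# Iterate from an initial state. Because the state space is finite
# (3^3 = 27 states), every trajectory must eventually enter a cycle,
# i.e. an attractor of the network.
trajectory = [(1, 0, 2)]
for _ in range(10):
    trajectory.append(step(trajectory[-1]))
```

The inference problem the talk addresses is the reverse direction: given observed trajectories like `trajectory`, recover update functions such as `f1`, `f2`, `f3`.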
Title: Privacy Preserving Combinatorial Optimization by Metaheuristics
We study distributed combinatorial optimization problems in which the cost function is defined by information distributed among two or more parties, while that information must be kept private from the other parties. Many combinatorial optimization problems are NP-hard, and conventional Privacy Preserving Optimization (PPO) methods are not designed to handle them. Although metaheuristics are not guaranteed to terminate in polynomial time, they are believed to find relatively good solutions to NP-hard problems with high probability in practical computation time. Motivated by this, we study a way to exploit metaheuristics for PPO. We demonstrate the scalability of our protocol by solving distributed Traveling Salesman Problems that include distributed private information. The results show that our method solves a 512-city problem with about 10% error within a few minutes, or with less than 0.1% error within nine hours.
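As background on the kind of metaheuristic being exploited, here is a plain (non-private) 2-opt local search for the TSP. This is a generic illustration of a metaheuristic building block, not the talk's protocol: the privacy-preserving version would have to evaluate tour costs without revealing each party's private distance information.

```python
import math
import random

def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Plain 2-opt local search: repeatedly reverse a segment whenever
    doing so shortens the tour, until no improving move remains."""
    improved = True
    while improved:
        improved = False
        n = len(tour)
        for i in range(n - 1):
            for j in range(i + 2, n):
                candidate = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, candidate) < tour_length(points, tour):
                    tour, improved = candidate, True
    return tour

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(30)]
start = list(range(30))      # arbitrary initial tour
best = two_opt(pts, start)   # locally optimal tour
```

A heuristic like this only needs cost comparisons between candidate tours, which is precisely the kind of operation a secure protocol can compute jointly over distributed private data.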
Title: Computing Shift Equivalence
Deciding whether two integer matrices are shift equivalent is an important problem in dynamics, usually arising in the context of symbolic dynamics. I will define shift equivalence and discuss some of its invariants. My goal is to find a normal form for a matrix that reflects its shift equivalence class. This normal form will be used in a database of dynamical systems, being developed by Konstantin Mischaikow (a DIMACS member) and his collaborators.
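For reference, the standard textbook definition from symbolic dynamics (as in Lind and Marcus; the talk's exact conventions may differ): square integer matrices $A$ and $B$ are shift equivalent with lag $\ell \geq 1$ if there exist integer matrices $R$ and $S$ satisfying

```latex
A R = R B, \qquad S A = B S, \qquad A^{\ell} = R S, \qquad B^{\ell} = S R.
```

The intertwining relations are consistent with the lag equations: for instance, $A^{\ell+1} = A R S = R B S = R S A = A^{\ell} A$. Deciding shift equivalence is subtle because the matrices $R$ and $S$ are not unique, which is what makes a normal form for the shift equivalence class valuable.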