DIMACS Workshop on Management and Processing of Data Streams
(In conjunction with ACM SIGMOD/PODS and FCRC 2003)

Sunday, June 8, 2003
San Diego, California, USA

Presented under the auspices of the Special Focus on Next Generation Networks Technologies and Applications and the Special Focus on Data Analysis and Mining.


John J. Bates, Remote Sensing Applications Division, National Climatic Data Center, NESDIS, NOAA

Operational data from environmental satellites form the basis for a truly global climate observing system. Similarly, weather radars provide the very high spatial resolution and rapid time sampling of precipitation required to resolve the physical processes involved in extreme rainfall events. In the past, these data were used primarily to assess the current state of the atmosphere, to help initialize weather forecast models, and to monitor the short-term evolution of weather systems (called nowcasting).

The use of these data for climate analysis and monitoring is increasing rapidly. So, too, are the planning and implementation of the next generation of environmental satellite and weather radar programs. These observing systems challenge our ability to extract meaningful information on climate variability and trends. In this presentation, I will attempt only to provide a brief glimpse of the applications and analysis techniques used to extract information on climate variability. First, I will describe the philosophical basis for the use of remote sensing data for climate monitoring, which involves the application of the forward and inverse forms of the radiative transfer equation. Then I will present three examples of the application of statistical analysis techniques to climate monitoring: 1) the detection of long-term climate trends, 2) the time-space analysis of very large environmental satellite and weather radar data sets, and 3) extreme event detection. Finally, a few conclusions will be given.
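The abstract does not specify which trend-detection method is used; as a minimal illustration of the first technique (long-term trend detection), the sketch below fits an ordinary least-squares trend line to a time series. The data are synthetic and the function name is invented for the example.

```python
import numpy as np

def linear_trend(years, values):
    """Least-squares trend of a climate time series.

    Returns (slope, intercept), where slope is in data units per year.
    """
    slope, intercept = np.polyfit(years, values, 1)
    return slope, intercept

# Synthetic series: a quantity rising by 0.02 units/year from a 14.0 baseline.
years = np.arange(1980, 2001)
values = 0.02 * (years - 1980) + 14.0

slope, intercept = linear_trend(years, values)
```

In practice, trend estimates from satellite records also require significance testing against natural variability and careful handling of inter-satellite calibration, which this toy example omits.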

Amy Braverman, Earth and Space Sciences Division, Jet Propulsion Laboratory

This talk discusses the role of statistics and data mining in the production and analysis of the very large remote-sensing data sets produced by NASA's Earth Observing System (EOS). EOS is a long-term program to collect and study data for Earth system science. The goal is to better understand interactions among the oceans, atmosphere, land surface, solid Earth, and biosphere, and ultimately to assess the impact and consequences of human activity on this planet. We participate by building instruments, designing and implementing algorithms for producing data products that are made available to the public, and actively participating in research using those data products. Data analysis is vital to all three activities, and statisticians and computer scientists have an important role to play.

In thinking about the role of statistics and data mining, it's important to distinguish between data production and data analysis. Production requires a certain kind of objectivity because others will have to live with the decisions we make. Analysis, on the other hand, is more subjective, and we are free to make and defend whatever assumptions we think are appropriate. In this talk we describe key statistical and data-mining challenges from our experience as members of instrument teams, and as participants in scientific investigations using remote-sensing Earth science data.

Oliver Spatscheck, AT&T Labs-Research

Title: How to monitor network traffic 5 Gbit/sec at a time.

In order to guarantee the security and reliability of today's networks, network monitoring is becoming an increasingly important part of network operations. However, ever-increasing network data rates make it more and more difficult to monitor traffic properly. This is particularly true if the monitoring requires application-layer information and/or has to be performed on an ad hoc basis.

In this talk I will discuss the issues we faced in addressing this monitoring task. In particular, I will focus on how our Gigascope probe, with its SQL-like real-time query language, allows us to provide an affordable, reliable, flexible, and scalable solution to the problem. Gigascopes deployed today process in excess of 1.2 million records per second on an inexpensive Pentium system in a real network.
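Gigascope's actual query language and internals are not described in this abstract; purely as a hypothetical illustration of the kind of windowed aggregation such a streaming query system performs (e.g., bytes per source address per time window), here is a small Python sketch. The record format and all names are invented for the example.

```python
from collections import defaultdict

def tumbling_window_counts(records, window_secs=60):
    """Aggregate (timestamp, src_ip, byte_count) records into
    per-window, per-source byte totals: the streaming analogue of
    SELECT window, src, SUM(bytes) ... GROUP BY window, src."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, src, nbytes in records:
        windows[ts // window_secs][src] += nbytes
    return windows

# A tiny synthetic packet stream: two records in window 0, one in window 1.
stream = [(0, "10.0.0.1", 500), (30, "10.0.0.2", 200), (70, "10.0.0.1", 300)]
result = tumbling_window_counts(stream)
```

A real high-rate probe would evaluate such queries incrementally as packets arrive and evict closed windows, rather than buffering records in memory as this sketch does.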

Document last modified on May 19, 2003.