
Data Management: Past, Present, and Future



Jim Gray
Microsoft Research
June 1996
Technical Report MSR-TR-96-18
Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052

This paper appeared in IEEE Computer 29(10): 38-46 (1996)

IEEE: (c) 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Jim Gray, Microsoft Research, 301 Howard St. San Francisco, CA 94105, 415-778-8222 [email protected]

Abstract: Soon most information will be available at your fingertips, anytime, anywhere. Rapid advances in storage, communications, and processing allow us to move all information into Cyberspace. Software to define, search, and visualize online information is also a key to creating and accessing online information. This article traces the evolution of data management systems and outlines current trends.

Data management systems began by automating traditional tasks: recording transactions in business, science, and commerce. This data consisted primarily of numbers and character strings. Today these systems provide the infrastructure for much of our society, allowing fast, reliable, secure, and automatic access to data distributed throughout the world. Increasingly these systems automatically design and manage access to the data. The next steps are to automate access to richer forms of data: images, sound, video, maps, and other media. A second major challenge is automatically summarizing and abstracting data in anticipation of user requests. These multi-media databases and tools to access them will be a cornerstone of our move to Cyberspace.

1. Introduction And Overview

Figure 1: The six generations of data management, evolving from manual methods, through several stages of automated data management.

Computers can now store all forms of information: records, documents, images, sound recordings, videos, scientific data, and many new data formats.

...


Rather than implicitly storing the relationship between flights and trips, a relational system explicitly stores each flight-trip pair as a record in the database. This is the "Segment" table in Figure 2.d. To find all segments reserved for customer Jones going to San Francisco, one would write the SQL query:

    Select Flight#
    From   City, Flight, Segment, Trip, Customer
    Where  Flight.to = "SF"
      AND  Flight.flight# = Segment.flight#
      AND  Segment.trip# = Trip.trip#
      AND  Trip.customer# = Customer.customer#
      AND  Customer.name = "Jones"

The English equivalent of this SQL query is: "Find the flight numbers for flights to San Francisco which are a segment of a trip booked by any customer named 'Jones.' Combine the City, Flight, Segment, Trip, and Customer tables to find this flight."

This program may seem complex, but it is vastly simpler than the corresponding navigational program. Given this non-procedural query, the relational database system automatically finds the best way to match up records in the City, Flight, Segment, Trip, and Customer tables. The query does not depend on which relationships are defined. It will continue to work even after the database is logically reorganized. Consequently, it has much better data independence than a navigational query based on the network data model. In addition to improving data independence, relational programs are often five or ten times simpler than the corresponding navigational program.

Inspired by Codd's ideas, researchers in academe and industry experimented throughout the 1970's with this new approach to structuring and accessing databases, which promised dramatically easier data modeling and application programming. The many relational prototypes developed during this period converged on a common model and language. Work at IBM Research led by Ted Codd, Raymond Boyce, and Don Chamberlin and work at UC Berkeley led by Michael Stonebraker gave rise to a language called SQL. This language was first standardized in 1985. There have been two major additions to the standard since then [5], [6].

...
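The flight-trip query described above can be tried out with a small sketch. This is an illustration under stated assumptions, not the schema of Figure 2.d: the table contents are invented sample rows, the column names (flight_no, trip_no, customer_no, dest) are hypothetical stand-ins for flight#, trip#, customer#, and to, and the City table is left out because the query's conditions never refer to it. Python's built-in sqlite3 module plays the role of the relational system.

```python
import sqlite3

# In-memory database with a hypothetical version of the four tables.
# Column names are assumptions: "#" and "to" are awkward as SQL identifiers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Trip     (trip_no INTEGER PRIMARY KEY, customer_no INTEGER);
    CREATE TABLE Flight   (flight_no INTEGER PRIMARY KEY, origin TEXT, dest TEXT);
    -- Each flight-trip pair is stored explicitly as one Segment record.
    CREATE TABLE Segment  (trip_no INTEGER, flight_no INTEGER);

    -- Invented sample data.
    INSERT INTO Customer VALUES (1, 'Jones'), (2, 'Smith');
    INSERT INTO Trip     VALUES (10, 1), (20, 2);
    INSERT INTO Flight   VALUES (100, 'NY', 'SF'), (200, 'SF', 'LA'), (300, 'LA', 'SF');
    INSERT INTO Segment  VALUES (10, 100), (10, 200), (20, 300);
""")

# The non-procedural query: it states which records match, not how to find them.
QUERY = """
    SELECT Flight.flight_no
    FROM   Flight, Segment, Trip, Customer
    WHERE  Flight.dest = 'SF'
      AND  Flight.flight_no = Segment.flight_no
      AND  Segment.trip_no = Trip.trip_no
      AND  Trip.customer_no = Customer.customer_no
      AND  Customer.name = 'Jones'
    ORDER  BY Flight.flight_no
"""


def flights_to_sf_for_jones(connection):
    """Return the flight numbers selected by QUERY as a list of integers."""
    return [row[0] for row in connection.execute(QUERY)]


before = flights_to_sf_for_jones(conn)

# Reorganize the stored data by adding an index on Segment.
# The query text is not touched; the system decides whether to use the index.
conn.execute("CREATE INDEX segment_by_flight ON Segment (flight_no)")
after = flights_to_sf_for_jones(conn)

print(before, after)
```

Adding an index is only one simple kind of reorganization, and a physical one, but it shows the point of the passage in miniature: the program names the data it wants, and the access path is left to the database system.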


What are the rights and responsibilities of people operating in Cyberspace? Our grandchildren will probably still be wrestling with these societal issues 50 years hence.

The technical challenges are more tractable. There is broad consensus within the database community on the main challenges and a research agenda to attack those problems. Every five years, the database community does a self-assessment that outlines this agenda. The most recent self-assessment, called the Lagunita II report [8], emphasizes the following challenges:

* Defining the data models for new types (e.g., spatial, temporal, image, ...) and integrating them with the traditional database systems.
* Scaling databases in size (to petabytes), space (distributed), and diversity (heterogeneous).
* Automatically discovering data trends, patterns, and anomalies (data mining, data analysis).
* Integrating (combining) data from multiple sources.
* Scripting and managing the flow of work (process) and data in human organizations.
* Automating database design and administration.

These are challenging problems. Solving them will open up new applications for computers both for organizations and for individuals. These systems will allow us to access and analyze all information from anywhere at any time. This easy access to information will transform the way we do science, the way we manage our businesses, the way we learn, and the way we play. It will both enrich and empower us and future generations.

Perhaps the most challenging problem is understanding the data. There is little question that most data will be online, both because it is inexpensive to store the data in computers and because it is convenient to store it in computers. Organizing these huge data archives so that people can easily find the information they need is the real challenge we face. Finding patterns, trends, anomalies, and relevant information from a large database is one of the most exciting new areas of data management [7]. Indeed, my hope is that computers will be able to condense and summarize information for us so that we will be spared the drudgery of searching through irrelevant data for the nuggets we seek. The solution to this will require contributions from many disciplines.

...
