
Data Management: Past, Present, and Future



Jim Gray
Microsoft Research
June 1996
Technical Report MSR-TR-96-18
Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052

This paper appeared in IEEE Computer 29(10): 38-46 (1996).

IEEE: (c) 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Jim Gray, Microsoft Research, 301 Howard St., San Francisco, CA 94105, 415-778-8222, Gray@Microsoft.com

Abstract: Soon most information will be available at your fingertips, anytime, anywhere. Rapid advances in storage, communications, and processing allow us to move all information into Cyberspace. Software to define, search, and visualize online information is also a key to creating and accessing online information. This article traces the evolution of data management systems and outlines current trends. Data management systems began by automating traditional tasks: recording transactions in business, science, and commerce. This data consisted primarily of numbers and character strings. Today these systems provide the infrastructure for much of our society, allowing fast, reliable, secure, and automatic access to data distributed throughout the world. Increasingly these systems automatically design and manage access to the data. The next steps are to automate access to richer forms of data: images, sound, video, maps, and other media. A second major challenge is automatically summarizing and abstracting data in anticipation of user requests. These multi-media databases and tools to access them will be a cornerstone of our move to Cyberspace.

1. Introduction And Overview

Figure 1: The six generations of data management, evolving from manual methods, through several stages of automated data management.

Computers can now store all forms of information: records, documents, images, sound recordings, videos, scientific data, and many new data formats.

...


Rather than implicitly storing the relationship between flights and trips, a relational system explicitly stores each flight-trip pair as a record in the database. This is the "Segment" table in Figure 2.d. To find all segments reserved for customer Jones going to San Francisco, one would write the SQL query:

    Select Flight#
    From   City, Flight, Segment, Trip, Customer
    Where  Flight.to = "SF" AND
           Flight.flight# = Segment.flight# AND
           Segment.trip# = trip.trip# AND
           trip.customer# = customer.customer# AND
           customer.name = "Jones"

The English equivalent of this SQL query is: "Find the flight numbers for flights to San Francisco which are a segment of a trip booked by any customer named 'Jones'. Combine the City, Flight, Segment, Trip, and Customer tables to find this flight."

This program may seem complex, but it is vastly simpler than the corresponding navigational program. Given this non-procedural query, the relational database system automatically finds the best way to match up records in the City, Flight, Segment, Trip, and Customer tables. The query does not depend on which relationships are defined. It will continue to work even after the database is logically reorganized. Consequently, it has much better data independence than a navigational query based on the network data model. In addition to improving data independence, relational programs are often five or ten times simpler than the corresponding navigational program.

Inspired by Codd's ideas, researchers in academe and industry experimented throughout the 1970's with this new approach to structuring and accessing databases, which promised dramatically easier data modeling and application programming. The many relational prototypes developed during this period converged on a common model and language. Work at IBM Research led by Ted Codd, Raymond Boyce, and Don Chamberlin, and work at UC Berkeley led by Michael Stonebraker, gave rise to a language called SQL. This language was first standardized in 1985. There have been two major additions to the standard since then [5], [6].

...
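As a rough illustration of the flight query discussed above, the following sketch runs an equivalent join using Python's built-in sqlite3 module. The table layout, the column names (flight_no, origin, dest, trip_no, customer_no), the index name, and the sample rows are all assumptions made for this example; they are not the schema of Figure 2.d. The City table is left out here because none of the query's conditions refer to it.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer (customer_no INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Trip     (trip_no INTEGER PRIMARY KEY, customer_no INTEGER);
        CREATE TABLE Flight   (flight_no INTEGER PRIMARY KEY, origin TEXT, dest TEXT);
        CREATE TABLE Segment  (trip_no INTEGER, flight_no INTEGER);
    """)
    conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                     [(1, "Jones"), (2, "Smith")])
    conn.executemany("INSERT INTO Trip VALUES (?, ?)",
                     [(10, 1), (11, 2)])
    conn.executemany("INSERT INTO Flight VALUES (?, ?, ?)",
                     [(100, "NY", "SF"), (101, "SF", "LA"), (102, "LA", "SF")])
    # Each Segment row is one explicit flight-trip pair.
    conn.executemany("INSERT INTO Segment VALUES (?, ?)",
                     [(10, 100), (10, 101), (11, 102)])
    return conn


# Non-procedural query: it states which rows are wanted, not how to find them.
QUERY = """
    SELECT Flight.flight_no
    FROM Flight, Segment, Trip, Customer
    WHERE Flight.dest = 'SF'
      AND Flight.flight_no = Segment.flight_no
      AND Segment.trip_no = Trip.trip_no
      AND Trip.customer_no = Customer.customer_no
      AND Customer.name = 'Jones'
    ORDER BY Flight.flight_no
"""


def flights_for_jones(conn):
    """Return the flight numbers of Jones's segments that arrive in SF."""
    return [row[0] for row in conn.execute(QUERY)]


conn = build_demo_db()
before = flights_for_jones(conn)

# Change the storage layout by adding an index; the query text is untouched.
conn.execute("CREATE INDEX segment_by_trip ON Segment (trip_no)")
after = flights_for_jones(conn)

print(before, after)
```

In this sketch the same query string is run before and after the index is added, and the application code does not change. Adding an index is only one small kind of reorganization, so this is a hint at the data independence the text describes rather than a full demonstration of it.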


What are the rights and responsibilities of people operating in Cyberspace? Our grandchildren will probably still be wrestling with these societal issues 50 years hence.

The technical challenges are more tractable. There is broad consensus within the database community on the main challenges and a research agenda to attack those problems. Every five years, the database community does a self-assessment that outlines this agenda. The most recent self-assessment, called the Lagunita II report [8], emphasizes the following challenges:

* Defining the data models for new types (e.g., spatial, temporal, image, ...) and integrating them with the traditional database systems.
* Scaling databases in size (to petabytes), space (distributed), and diversity (heterogeneous).
* Automatically discovering data trends, patterns, and anomalies (data mining, data analysis).
* Integrating (combining) data from multiple sources.
* Scripting and managing the flow of work (process) and data in human organizations.
* Automating database design and administration.

These are challenging problems. Solving them will open up new applications for computers both for organizations and for individuals. These systems will allow us to access and analyze all information from anywhere at any time. This easy access to information will transform the way we do science, the way we manage our businesses, the way we learn, and the way we play. It will both enrich and empower us and future generations.

Perhaps the most challenging problem is understanding the data. There is little question that most data will be online, both because it is inexpensive to store the data in computers and because it is convenient to store it in computers. Organizing these huge data archives so that people can easily find the information they need is the real challenge we face. Finding patterns, trends, anomalies, and relevant information from a large database is one of the most exciting new areas of data management [7]. Indeed, my hope is that computers will be able to condense and summarize information for us so that we will be spared the drudgery of searching through irrelevant data for the nuggets we seek. The solution to this will require contributions from many disciplines.

...
