THE DATA MANAGER:
- OS/400, the Operating System:
OS/400, the software operating system for the AS/400 machine, together with the hardware, does all data management. This comprises creating, storing, and accessing data for applications, and means data for or from:
- Database
- Diskette
- Tape
- Another computer system’s database
- Another computer system’s devices
- Spool queues when needed
- Printers
In addition to the menus and prompts you use to create the objects of your system, there are many levels of help at every step to explain the prompts themselves or any and all parameters of the commands. You do not need manuals or documentation to define the commands.
Although the Data Manager is built in, you must tell it how you plan to use the system. You define the database in enough detail to get started. Using CL, you “create” the device “files” as well as the database “files” according to your own specifications. You do this once for the system rather than once per program as in other systems. In other words, the Data Manager does all the navigation through the system that used to be within a user program, in a way that shares resources and thus allows more than one data user to access a pathway to and from a data type.
All users are controlled by the system rather than the other way around. This is a significant change from other systems you may know. Not only does it open up the database to concurrent applications without the risk of bottlenecks, but it also opens the system up over time to changes in environment and use. There is never the need to redesign the database in order to add a new application, because the database is not designed for or by an application but by the system itself, and program access is optimised as a function of system load, not the programmer’s ingenuity. The word is “shared”, and it applies over space and over time.
- Field, Record, and File Definitions:
A record is the unit of transfer from file to system to program. With added field definition the system will, with your request for data:
- Rearrange fields as specified in your data request.
- Edit fields differently for different data requests for output devices.
- Check input data field by field, according to prespecified field definition parameters.
- Sequence input to programs or output to screens by field selected at run time.
- Hide certain fields from a program’s or a certain user’s view.
- Map data fields into your program or onto your screen differently from the actual record format.
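The field-level services listed above can be pictured with a minimal Python sketch. It is not AS/400 code: the record contents, the field names, and the `logical_view` helper are all hypothetical, chosen only to show fields being rearranged, hidden, and sequenced on request.

```python
# Hypothetical physical record format: every field stored for a customer.
PHYSICAL_RECORDS = [
    {"cust_no": 102, "name": "Lee", "balance": 250.0, "credit_limit": 1000.0},
    {"cust_no": 101, "name": "Ahmad", "balance": 75.5, "credit_limit": 500.0},
]


def logical_view(records, fields, order_by):
    """Return only `fields`, in the order given, sequenced by `order_by`."""
    selected = sorted(records, key=lambda r: r[order_by])
    return [{f: r[f] for f in fields} for r in selected]


# A clerk's view: name first, then number; balance and credit limit are hidden.
clerk_view = logical_view(PHYSICAL_RECORDS, ["name", "cust_no"], order_by="cust_no")
print(clerk_view)
```

The stored records never change; only the view handed to the requester differs.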
Application Programs, 4GLs, Packages, SQL, Query
On earlier systems a program had to “open” a file before requesting data. The OPEN statement in a COBOL or other program was the place of connection: execution of OPEN in the program caused the program to lock to a data file, “file” meaning both hardware medium and data type. To err here in the program was to cause a bug, meaning the program wouldn’t work. This is called early binding.
The AS/400 system binds later. The system does not have to know the exact length of fields or records until it actually does a read for you. This means you can supply such information outside your program, using FILE objects that reside outside your program, which you select for your program’s use some time before an actual record is needed. OPENs are still used in COBOL programs, but they serve a different purpose now.
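The idea of late binding can be sketched in Python as follows. This is an illustration only, not how OS/400 implements it: `CUSTOMER_FORMAT` is a hypothetical stand-in for a FILE object kept outside the program, and `read_record` is an assumed helper that learns field lengths only when a read happens.

```python
# Hypothetical external record description: (field name, length in characters).
# It lives outside the reading routine, like a FILE object outside a program.
CUSTOMER_FORMAT = [("cust_no", 5), ("name", 10), ("city", 8)]


def read_record(raw, layout):
    """Split a fixed-width record using a layout supplied at read time."""
    record, pos = {}, 0
    for field, length in layout:
        record[field] = raw[pos:pos + length].strip()
        pos += length
    return record


# Build one fixed-width record that matches the layout above.
raw_line = "00101" + "Ahmad".ljust(10) + "Penang".ljust(8)
customer = read_record(raw_line, CUSTOMER_FORMAT)
print(customer)
```

If the name field later grows to 20 characters, only the external layout changes; `read_record` is untouched.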
The following figure shows how the system gets data to the display screen when your application asks it to.
Getting data to the display screen.
Spooling:
Spooling equals queuing for input or output. Because CPU instruction execution is orders of magnitude faster than mechanical I/O device speeds, spooling is required to balance system flow, allowing the data manager to continue processing data even when a printer putting out the results of a process is falling behind or not available. The same is true for input.
The AS/400 spools all input and output (most systems have one output spool queue, designed for printers and not good for much else).
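A minimal Python sketch of output spooling, assuming a simulated slow printer: the program puts lines on a queue and carries on, while a separate writer drains the queue at device speed. The names `spool`, `printed`, and `printer_writer` are hypothetical, and the sleep merely stands in for a mechanical delay.

```python
import queue
import threading
import time

spool = queue.Queue()  # the spool queue between the program and the printer
printed = []           # stands in for paper coming out of a slow printer


def printer_writer():
    """Drain the spool at device speed; None marks the end of output."""
    while True:
        line = spool.get()
        if line is None:
            break
        time.sleep(0.01)  # simulated mechanical delay
        printed.append(line)


writer = threading.Thread(target=printer_writer)
writer.start()

# The program queues its output without waiting for the printer.
for n in range(5):
    spool.put(f"report line {n}")
spool.put(None)

writer.join()  # wait here only so the sketch can show the final result
print(printed)
```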
Using the Database Manager
- Reading and writing data records:
The database management system’s “GETs” and “PUTs” completely manage all your data as well as the system’s own objects. This means that a program that needs to update a customer master record with a new transaction no longer has to read both files of data (the old master record and the new transaction update file), match them, complete all the housekeeping, and then go on to the next transaction to repeat the process. Such a program in the AS/400 DMS merely makes a statement to the effect that this particular transaction is to be used to update that master file under certain conditions. The DMS will do all the selection, matching, updating, merging, and housekeeping. In other words, a program thinks logically about the data problem.
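To make the contrast concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a database manager (it is not the AS/400 DMS). The table names `master` and `trans` and the sample figures are assumptions. The program states once which transactions update which master records and leaves the matching to the engine.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE master (cust_no INTEGER PRIMARY KEY, balance REAL)")
db.execute("CREATE TABLE trans (cust_no INTEGER, amount REAL)")
db.executemany("INSERT INTO master VALUES (?, ?)", [(101, 100.0), (102, 50.0)])
db.executemany("INSERT INTO trans VALUES (?, ?)", [(101, 25.0), (101, 10.0)])

# One statement says which transactions update which master records;
# selection, matching, and housekeeping are left to the database engine.
db.execute("""
    UPDATE master
    SET balance = balance + (SELECT SUM(amount) FROM trans
                             WHERE trans.cust_no = master.cust_no)
    WHERE cust_no IN (SELECT cust_no FROM trans)
""")

balances = dict(db.execute("SELECT cust_no, balance FROM master"))
print(balances)
```

There is no read loop and no match-merge logic in the program itself; customer 102, which has no transactions, is simply left alone.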
The AS/400 manages your data on three levels:
- The physical layout of data on disk.
- The “Logical View” of data, which is the shape and sequence of that portion of the data to be delivered to a program on a GET.
- Delivery, or presentation, that is, the way you would actually see the data in your program or on a screen or a printer.
The management comprises:
- Placement and retrieval of actual physical data to and from its disk home.
- Protection of data from improper purposes or illegal usage.
- Sharing of data among concurrent users with concomitant time and usage control.
- Criteria selection of discrete data records for presentation to a program or output device.
- Resequencing of records for presentation to the user or program.
- Editing of the selected and sequenced data records and final presentation to the requester (user).
The benefits you derive from the separation of user program requests from actual system data manipulation are these:
- Programs that are one-tenth the size of those on prior systems.
- Removal of housekeeping complexities from user programs or fourth generation language applications.
- Correlated or merged and matched data that is delivered to the user.
All this means in this era of data overload is that the automation process can keep up with the data buildup and that the data analyzers can predefine to the system the particular selection of data they are interested in and, quite literally, can turn on the system and have their selected data delivered to them in the form and quantity they can handle.
The setup to make this all happen involves data design at the three levels mentioned earlier. First, using DDS, a data person defines the physical data layout you expect your system to handle. One benefit of a full DMS is that you do not have to prepare for all time.
Next, also using DDS, you define the “Logical Views” that will index and tailor your real data and sift and sort it out for each type of user need you now foresee. Later you will add other Logical Views for new data uses.
If the user wants different rules to apply, he must come in through a different Logical View or create a new one for his own use.
- The Data Management System (DMS) and your data:
The DMS will organize, prioritize, separate, share, correlate, and control all users of the data, all the time. Even when no user program is running, the DMS is running. There is some flexibility in how you use the system. You can get and put data directly at the physical, real data level. Programmers used to older systems may think they can achieve better performance this way, but they will be incorrect, because the DMS decides about the real disk access, that is, where data should be on disk.
- Database Structure: Inside, Looking Out:
The smallest item the DMS can access and manage is the field. To define a field of data using the DDS language, you name it and then describe it: how long it is and what kind of data should be in it. You specify these data types with one of 15 data type symbols on your RPG-like DDS screen. You can also specify authority checks to be made for specific users to see this data.
When you create a Logical View you create an Access Path, which is an index to all the records your selection/omission criteria cause to be gathered in this set. The AS/400 will reuse an already existing Access Path whenever possible. This is important because the system continuously updates Access Paths as new data enters the system, unless you specify otherwise in your FILE definition.
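The relationship between physical data, a Logical View, and an Access Path can be approximated with SQLite through Python's `sqlite3` module. This is an analogy under stated assumptions, not DDS: the table plays the physical file, the SQL view plays the Logical View with its selection criteria, and the index plays the Access Path. All table, view, and index names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Physical level: the full record layout as stored.
db.execute(
    "CREATE TABLE customer (cust_no INTEGER, name TEXT, city TEXT, credit_limit REAL)"
)
db.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (103, "Tan", "Ipoh", 800.0),
        (101, "Ahmad", "Penang", 500.0),
        (102, "Lee", "Penang", 1000.0),
    ],
)

# Rough counterpart of an Access Path: an index the engine keeps up to date
# as records are added, and may reuse for any query it fits.
db.execute("CREATE INDEX customer_by_city ON customer (city, cust_no)")

# Rough counterpart of a Logical View: selection criteria plus a tailored
# field list that hides credit_limit from this class of user.
db.execute(
    "CREATE VIEW penang_customers AS "
    "SELECT name, cust_no FROM customer WHERE city = 'Penang'"
)

rows = db.execute("SELECT * FROM penang_customers ORDER BY cust_no").fetchall()
print(rows)
```

A new data use later means adding another view, not redesigning the table.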
Data Architecture:
The data architecture can be looked at this way: as layers building up from the most real, the actual physical data, to the most logical, the way an end user wants to see the data for some particular purpose. Before you can run a particular program of your own to handle data, you must go through the three layers of definitions: the physical data layout (format), the logical data definitions (logical files or views containing record descriptions subsetted to the fields), and the device I/O logical definitions known as device definitions (Device Files in AS/400 terminology).
Think of it as it is depicted in Figure 6.1.
Figure 6.1 Layers of data definition from the most real to the most usefully edited.
Screen 1: DDS statement
Screen 2: Textual Field definition.
So far we have dealt in great detail with the types of databases in use. All these databases no doubt have their place in modern civilization. But for any database to be functional there needs to be a “tool” the user can use to insert and extract information from the database. This is exactly what Database Management Systems, or DBMSs, do.
The need for DBMS:
In a large organization, information regarding individuals is normally scattered across a number of databases. The Human Resource department, for instance, might have one database containing information related to the individual’s address and contact numbers. The Finance department, on the other hand, would have another database containing information regarding wages and income tax, while the in-house clinic would have yet another containing the individual’s medical or health records. When a new employee joins the organization, his or her record would need to be created in all three of these databases.
Similarly, when an employee leaves the organization, his or her record will need to be updated to reflect that the person is no longer on active payroll. To help with these tasks, database administrators turn to a database management system, or DBMS.
Since most of today’s databases are stored in digital format on computer disks, the DBMSs most commonly used today are in the form of computer programs or software. Besides simply inserting and updating records, DBMSs perform a huge array of different functions that make database management easier and a lot less painful. Every time you work with a database, you are most likely using some sort of DBMS.
What DBMSs do
The primary function of any DBMS is to provide access to the huge amount of data found in separate (and often geographically distant) databases. It is the job of the DBMS to make all these separate and distinct databases appear to the user as though they were one. Ever wondered how you can so easily access your bank account from any ATM, even though you are hundreds of kilometers away from your main branch? Well, that’s the power of a DBMS working for you.
DBMSs allow users to search for, add, update and even delete data. Besides these basic functions, DBMSs are also responsible for ensuring data security and integrity by blocking unauthorized access to sections of databases and logging all activities conducted by users on the databases. Database administrators also depend on DBMSs to perform a host of other advanced functions such as supporting batch and on-line programs, facilitating backups and recoveries, optimizing performance, maintaining the catalog and directory of database objects and acting as an interface to other systems programs.
Advantages of DBMS
Database management systems provide a host of advantages over other methods of database management. For one, DBMSs improve the availability of data by making the same information easily available to a large number of different users. In the corporate organization example above, all three departments (HR, finance and clinic) would share some common data regarding their members, such as names and identity card numbers.
DBMSs also help minimize data redundancy because, generally, each piece of information appears just once. This can therefore significantly reduce the cost of storing information.
DBMSs foster data integrity by allowing updates and changes to the data to be done from one place. The chances of making a mistake are higher if you are required to change the same data in several different places than if you only have to make the change in one place. In our corporate example, when a new employee joins the organization, the HR department might key in his or her particulars in their system, and they automatically appear in the finance and clinic databases as well. This reduces repetitive work and increases an organization’s efficiency and productivity.
We live in an increasingly paranoid world. Some types of information in databases are just not meant for any Ali, Ah Kong or Muthu. As such, this information is classified as “sensitive” and should be protected or secured and only viewed by select individuals.
Through the use of passwords, DBMSs can restrict data access or show only relevant data to those who should see it. Every time you access an ATM facility and punch in your PIN, only information regarding your account is made visible to you, although the ATM might be accessing the same database that has the prime minister’s account information in it. This helps improve security and reduces the likelihood of confidential information being viewed by an unauthorized party.
But perhaps the most important advantage of DBMSs is the simplest: they make using databases a lot easier. Many of today’s DBMSs come with graphical user interfaces (GUIs) that make using them no different from any other Windows program. This frees the end users to concentrate on the task of using the database without having to worry about all the complicated processes required to store, update, and retrieve data, because the DBMS hides such details. Notice how easy it is to store your friend’s contact number in your hand phone? That’s due to the simplicity of the device’s DBMS interface. A DBMS also protects data from damage when many users try to retrieve and update the same data at the same time, or when the computer shuts down. Thus with DBMSs, many of the day-to-day functions involving databases can be carried out by semi-skilled staff. This helps organizations reduce their reliance on computer specialists to meet their data processing needs.
Types of databases:
We can differentiate the types of databases in use today in various ways. One popular (albeit older) method is to separate databases according to their function. The other (newer) method is to categorize databases according to their data model. In this section, we will look at databases from a functional perspective. Database models and database management software will be dealt with in the next section.
Database according to function
Understanding databases according to function is perhaps the easiest way to get acquainted with the types of databases in use. According to the functional view, databases can be categorized into two groups. The first group is small databases and the second group is large databases. Let's see what each of these groups of databases are and how they are used.
Small databases
Small databases are databases small enough to be used on personal computers. More recently, small databases can also be found in all types of hand-held devices like palmtop computers, personal digital assistants and even cell phones. Small databases are usually used by a single user or, at most, a small group of users. This type of database is usually found on a single, less powerful machine, although it is possible to network computers running small databases. Small databases are used to perform specific tasks such as keeping track of customers or managing data for projects and appointments. Generally, data in small databases is entered by the user, as these databases are less able to handle a large and continuous inflow of data from other sources.
Large databases
Large databases are stored on servers. Servers are powerful, dedicated computers that are capable of storing huge amounts of data. When users need to access large databases, they most likely need to go online or "log-on" to the server. In order for users to be able to log-on, most large databases have a network of computers that span a large area or in the case of the Internet, worldwide.
Large databases are able to handle a continuous flow of information being added to them while simultaneously handling millions of requests for information from around the globe. As such, large databases usually have a specialized team of engineers who oversee the proper functioning of the database. Large databases are very expensive to set up and sustain.
Types of databases (continued)
The data model provides another (and more popular) method of differentiating databases. A data model refers to the “structure” of a database and explains how the various data in it are linked together.
Flat-file databases
Before the advent of data models, organizations stored data in file systems of varying sizes. Databases of this type were often called flat-file databases because they were not linked in any meaningful way to data in other files. This resulted in many files being created. Some of these files had data that was repeated in other files. Thus, a huge amount of time was being spent creating and processing repeated data. This wasteful redundancy led to the search for a better way to link data in various files and to reduce the need for repeated data. The result was the creation of various data models. Data models provide a picture of the evolution of the database from its humble flat-file beginnings. Let’s track through this evolution.
Hierarchical Databases
The first successful attempt at taming the database problem was the hierarchical model. Databases in the hierarchical model are arranged in a pyramid fashion. Beginning with only one database at the top (which is sometimes called the parent), this database is linked to a number of databases beneath it (which are sometimes called child databases). Each of these child databases is then linked to further databases beneath it (and no, these are not called grand-child databases, but merely child databases). The hierarchical data model brought some semblance of order to the database world. It reduced data redundancy considerably and gave a viable database structure to work with. However, it did not prove to be a cure-all.
Network Databases
Network databases were primarily designed to overcome the limitations of the Hierarchical model without creating a radical departure from the way the databases were structured. Thus, in more than one way, Network databases still behave like Hierarchical databases. This is not to say these databases are without merit. For one, Network databases allowed links between child databases. This led to two major improvements. First, the database searching capability was enhanced tremendously and second, data redundancy was reduced even further. Nevertheless, the old problem of record creation remained, and the model ultimately gave way to a more popular one: the Relational database.
Relational Databases
In the Relational Database, a different approach was taken in database design. No longer were separate databases hierarchical; all databases were “created equal”, so to speak. In the Relational model, data is stored in separate databases. These separate databases can then be linked together by referring to a common field (or key field) found in all of them. This way any number of databases can be connected together. Thus in an organization, the various departments might all have their respective databases storing different types of information regarding its members or employees. But all these databases will have a common element, which can be the members’ social-security number or any similar piece of non-changing information.
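Linking through a common key field can be sketched with SQLite through Python's `sqlite3` module. The two tables below stand for the separate departmental databases described above; the table names, the `emp_id` key, and the sample rows are assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Two separate "databases" (tables here), each owned by a department.
db.execute("CREATE TABLE hr (emp_id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE finance (emp_id INTEGER PRIMARY KEY, salary REAL)")
db.executemany("INSERT INTO hr VALUES (?, ?)", [(1, "Aminah"), (2, "Raj")])
db.executemany("INSERT INTO finance VALUES (?, ?)", [(1, 3200.0), (2, 2800.0)])

# The common key field emp_id is all that is needed to connect them.
combined = db.execute(
    "SELECT hr.name, finance.salary "
    "FROM hr JOIN finance ON hr.emp_id = finance.emp_id "
    "ORDER BY hr.emp_id"
).fetchall()
print(combined)
```

Neither table is the parent of the other; the link exists only through the shared key value.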
Object-oriented Databases
Till now all databases were only able to handle small pieces of alphanumeric data inserted into their fields. With the advent of more powerful computers and software, the 80s saw an explosion in the type and variety of what we defined as “data”. No longer was data limited to only plain boring text and numbers. Now sound, photos, videos, graphics and a whole lot of multimedia content were being categorized as data. The old databases were not meant to handle these types of data, and so a new breed of databases had to be created that could store and process them. The Object-oriented Database answered that call.
Linking separate database files
As we separate and store a database into smaller files, one problem becomes apparent: data redundancy. The separated database files might have some common data, for instance, the employee names and employee numbers. How do we make sure that the total number of employees in the finance database matches the number of employees in the human resource database? This is where the synchronization of the separated database files becomes important. To achieve this, the separated database files are often linked or cross-referenced together. This ensures that the data in the separated files always match.
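One common way to keep cross-referenced files in step is a foreign-key constraint. The sketch below uses SQLite through Python's `sqlite3` module, with foreign-key enforcement switched on explicitly because SQLite leaves it off by default. The tables and values are hypothetical, and this is only one possible linking mechanism, not necessarily the one the text has in mind.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce links by default

db.execute("CREATE TABLE hr (emp_id INTEGER PRIMARY KEY, name TEXT)")
# Every finance row must cross-reference an existing hr row.
db.execute(
    "CREATE TABLE finance (emp_id INTEGER REFERENCES hr(emp_id), salary REAL)"
)

db.execute("INSERT INTO hr VALUES (1, 'Aminah')")
db.execute("INSERT INTO finance VALUES (1, 3200.0)")

try:
    # Employee 99 does not exist in hr, so the link cannot be made.
    db.execute("INSERT INTO finance VALUES (99, 1000.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

finance_count = db.execute("SELECT COUNT(*) FROM finance").fetchone()[0]
print(rejected, finance_count)
```

Because the unmatched row is refused, the finance file cannot drift away from the human resource file.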
Advanced DBMSs: Data Marts and Data Warehouses
The database world is a fast changing one, with innovations and new products coming out yearly. In recent years, data marts and data warehouse applications have been gaining popularity as tools to boost an organization’s competitiveness. In the coming K-economy these applications will be commonly used by the user every day. Let’s see what they are.
Data Warehouses
As an organization’s data processing increases, the need arises to manage all the different databases found in an enterprise. Old data is continuously being replaced with new data. Not long ago, this historical data was normally destroyed. Fortunately, nowadays many organizations have realized the value of being able to analyze historical data in order to discover patterns of behavior and predict future trends. For example, analyzing historical data can tell a retailer what items were ordered, in what quantities, and by which customers.
Hence a method was developed that allowed organizations to pull all relevant data (whether current or historical) out of the various databases in the enterprise. This method is called data warehousing. A data warehouse is therefore a collection of current and historical data from multiple databases that the organization can use for analysis and decision-making. The purpose of the data warehouse of course, is to bring the “cream” of data used by the organization into one place.
Typically, a data warehouse is housed on an enterprise mainframe server. Data from various processing applications and other sources is selectively extracted and organized on the data warehouse database for use by analytical applications.
Data Marts
Bringing together so much data into a data warehouse can make analysis very difficult. To address this problem, organizations use what are called data marts. Data marts are smaller related sets of data that are grouped together and separated out from the main body of data in the data warehouse. Data marts are designed to be made available to specific sets of users only. For example, data about manufacturing can be put into a data mart and be made available to the production department. Human resource data can be put into another data mart and be provided to the human resources employees. This approach makes it easier for each group in the organization to access the data they need.
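As a rough picture of the relationship, a data mart can be thought of as a subject-specific slice of the warehouse. The Python sketch below is illustrative only: the `WAREHOUSE` rows, the `subject` field, and the `build_data_mart` helper are assumptions, and the figures are made up for the example.

```python
# Hypothetical warehouse rows: current and historical data from many sources.
WAREHOUSE = [
    {"subject": "manufacturing", "year": 1999, "item": "units built", "value": 12000},
    {"subject": "hr", "year": 1999, "item": "headcount", "value": 340},
    {"subject": "manufacturing", "year": 2000, "item": "units built", "value": 15000},
    {"subject": "hr", "year": 2000, "item": "headcount", "value": 365},
]


def build_data_mart(warehouse, subject):
    """Separate out the rows that one group of users needs."""
    return [row for row in warehouse if row["subject"] == subject]


production_mart = build_data_mart(WAREHOUSE, "manufacturing")
hr_mart = build_data_mart(WAREHOUSE, "hr")
print(len(production_mart), len(hr_mart))
```

Each department then analyzes its own, much smaller, set of rows.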
Both data marts and data warehouses are advanced forms of data management tools gaining increasing adoption among organizations worldwide. But managing data itself is not enough. What good is the best data managing tool if it does not help the user make better decisions? Here we arrive at the core of what it means to have databases.
Data mining: From data to knowledge
Data is only useful insofar as it guides us to make decisions and predict future trends. For that reason, databases have traditionally included query and reporting tools that help users extract relevant pieces of information from the database and later analyze them to see their correlation. These tools have been used for a long time by database administrators as a way to analyze databases and derive “meaning” from all that data.
In recent years, however, these tools have been found to be lacking. One reason is that the traditional query and reporting methods do not allow users to discover the underlying relationships between the different pieces of data. Now a new breed of database applications has emerged to fill the need for more powerful data analyzing tools. These applications are called data mining tools.
Just what is data mining? According to one definition, “data mining, also known as "knowledge discovery," refers to computer assisted tools and techniques for sifting through and analyzing vast data stores in order to find trends, patterns, and correlations that can guide decision making and increase understanding”. But really, what this means is making some sense of the mountains of data. However, the real power of data mining is that it can actually discover previously unknown relationships among the data.
Data mining can enable businesses to predict future trends, which they can use to better position their products and services for financial gain. Lest you think that data mining is only for capitalists, nonprofit organizations have also achieved significant benefits from data mining, such as in the area of scientific progress.
Data mining has been used by a large variety of organizations. As an example, by recording the activity of online shoppers over time in its Internet store, and analyzing it using data mining tools, a retailer such as Amazon.com can use knowledge of the patterns discovered to improve the placement of items in the layout of its mail-order catalog or web page. Telephone companies mine customer billing data to identify customers who spend considerably more than the average on their monthly phone bill. The company can then target these customers to sell additional services. Hospitals can use data mining to identify groups of people whose healthcare costs are likely to increase in the near future so that preventative steps can be taken. Scientists use data mining to analyze billions of bytes of data collected from observations and experiments in the hope of finding new planets or galaxies or even the elusive cure for cancer.
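The telephone-company example can be reduced to a very small Python sketch. It is a plain threshold filter standing in for a real data mining tool, which would use far richer techniques. The customer IDs, the bill amounts, and the choice of 1.5 times the average as "considerably more" are all assumptions.

```python
from statistics import mean

# Hypothetical monthly phone bills per customer.
MONTHLY_BILLS = {"C001": 40.0, "C002": 55.0, "C003": 210.0, "C004": 45.0}


def heavy_spenders(bills, factor=1.5):
    """Return customers whose bill exceeds `factor` times the average bill."""
    threshold = factor * mean(bills.values())
    return sorted(cust for cust, amount in bills.items() if amount > threshold)


targets = heavy_spenders(MONTHLY_BILLS)
print(targets)
```

The resulting list is what the company would hand to its sales team for the additional-services offer.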
Databases and You
Databases are everywhere. The purpose of this article has been to increase the awareness of the average user as to their prevalence and potential benefit that can be gained by mastering even the most fundamental concepts of database technology.
As the world rushes into globalization and the Internet and mobile devices make it an even smaller planet, data and database technology will be a central part of every human being’s life. From buying bread at the sundry shop to ordering pay-per-view movies over the super-fast Internet of the future, databases will be relied upon to handle and process an ever increasing amount and type of information generated by the human race. In this world of high tech gadgets and lightning speed communication, the knowledge to access and work with databases will not only be important, it will be crucial to our day-to-day living. The days when you can turn to a clerk to fill up a form are fast disappearing. In the future, it’s only going to be you and the computer database.
Dangers of using Databases
With great tools come great risks. As powerful as DBMSs are, they have downsides to their use as well. As a knowledge worker, it is important that you understand the strengths and weaknesses of DBMSs. This will allow you to make better decisions on how to organize and use data and DBMSs. Let’s look at some of the risk factors associated with the use of DBMSs.
Data Security
Data security is a real concern in this day and age. Perhaps no event in recent memory has highlighted this more than the recent theft of more than 1,400 credit card numbers of prominent world and business leaders from the databases of the World Economic Forum in Davos, Switzerland in February this year. Among the credit card numbers stolen were those that belong to the likes of the then US President, Bill Clinton, and Microsoft founder, Bill Gates. This event has not only reminded the world of the dangers posed by hackers but also highlighted the vulnerability of databases.
Today, there is more information stored in computer databases around the world than ever before in the history of mankind. But this itself would not be much cause for worry were it not for another development: the Internet. The Internet has increased the connectivity of databases to such an extent that the physical location of a database is not much of a hindrance anymore to those bent on stealing information from it. As a result, incidents of credit card and other personal information theft, conducted by hackers via the net from hundreds or thousands of kilometers away, have become increasingly common.
Data Security and You
You might think that you are safe from this security threat since you never pass on your credit card number on the Internet. But consider another proposition: have you received junk mail or junk e-mail before? Ever wondered where they got your name and address from? If you have (and I know very few people who haven’t) then you could be a victim of another risk posed by databases: breach of privacy. Let’s see how this can happen.
Thousands of web pages these days require you to enter some sort of personal information before you can access their information or services. Among the information most popularly requested is your name and e-mail address. All these names and e-mail addresses are then stored in the web page owner’s databases, which the owner uses to periodically alert members about new features and services. Although all of the owners of these web pages make solemn promises never to divulge or sell your private information to anyone, more than one dot com has admitted to or been caught trying to “share” their members’ confidential information with third parties. So as you can see, the issues of privacy and security are real, and the legal fraternities of the world are still struggling to address them while, at the same time, technology is progressing by leaps and bounds ahead of them.
Who should have access to what?
The dollar issue
Besides privacy and security, there is another aspect of using databases that is of immediate concern to the user, and that is the cost. Implementing a database, especially in a large environment, involves a lot of planning and investment. The cost of training the appropriate personnel can itself be a huge burden to the organization. Often entire network systems need to be set up just for the database. It is no surprise that many smaller companies are reluctant to implement databases on a large scale, preferring to stick to their old “cost-friendly” approach. In time to come, these companies will find it increasingly difficult to compete with others that use database technology to deliver products and services to customers.