Database Systems
1. Fundamental Concepts of Database
Database and database technology are having a major impact on the growing use of computers. It is fair to say that database will play a critical role in almost all areas where computers are used, including business, engineering, medicine, law, education, and library science, to name a few. The word "database" is in such common use that we must begin by defining what a database is. Our initial definition is quit general.
A database is a collection of related data. By data, we mean known facts that can be recorded and that have implicit meaning. For example, consider the names, telephone numbers, and addresses of all the people you know. You may have recorded this data in an indexed address book, or you may have stored it on a diskette using a personal computer and software such as DBASE III or Lotus 1-2-3. This is a collection of related data with an implicit meaning and hence is a database.
The above definition of database is quite general; for example, we may consider the collection of words that make up this
数据库管理员英文
page of text to be related data and hence a database. However, the common use of the term database is usually more restricted.
A database has the following implicit properties:
.A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot be
referred to as a database.
.A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some
preconceived applications in which these users are interested.
.A database represents some aspect of the real world, sometimes called the mini world. Changes to the mini world are reflected in the database.
In other words, a database has some source from which data are derived, some degree of interaction with events in the real world, and an audience that is actively interested in the contents of the database.
A database can be of any size and of varying complexity. For example, the list of names and addresses referred to earlier may have only a couple of hundred records in it, each with a
simple structure. On the other hand, the card catalog of a large library may contain half a million cards stored under different categories-by primary author’s last name, by subject, by book title, and the like-with each category organized in alphabetic order. A database of even greater size and complexity may be that maintained by the Internal Revenue Service to keep track of the tax forms filed by taxpayers of the United States. If we assume that there are 100million taxpayers and each taxpayer files an average of five forms with approximately 200 characters of information per form, we would get a database of 100*(106)*200*5 characters(bytes) of information. Assuming the IRS keeps the past three returns for each taxpayer in addition to the current return, we would get a database of 4*(1011) bytes. This huge amount of information must somehow be organized and managed so that users can search for, retrieve, and update the data as needed.
A database may be generated and maintained manually or by machine. Of course, in this we are mainly interested in computerized database. The library card catalog is an example of a database that may be manually created and maintained. A computerized database may be created and maintained either by a group of application programs written specifically for that task or by a database manageme
nt system.
A data base management system (DBMS) is a collection of programs that enables users to create and maintain a database. The DBMS is hence a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. Defining a database involves specifying the types of data to be stored in the database, along with a detailed description of each type of data. Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS. Manipulating a database includes such functions as querying the database to retrieve specific data, updating the database to reflect changes in the mini world, and generating reports from the data.
Note that it is not necessary to use general-purpose DBMS software for implementing a computerized database. We could write our own set of programs to create and maintain the database, in effect creating our own special-purpose DBMS software. In either case-whether we use a general-purpose DBMS or not-we usually have a considerable amount of software to manipulate the database in addition to the database itself. The database and software are together called a database system.
2. Data Models
One of the fundamental characteristics of the database approach is that it provides some level of data abstraction by hiding details of data storage that are not needed by most database users. A data model is the main tool for providing this abstraction. A data is a set of concepts that can be
used to describe the structure of a database. By structure of a database, we mean the data types, relationships, and constraints that should hold on the data. Most data models also include a set of operations for specifying retrievals and updates on the database.
Categories of Data Models
Many data models have been proposed. We can categorize data models based on the types of concepts they provide to describe the database structure. High-level or conceptual data models provide concepts that are close to the way many users perceive data, whereas low-level or physical data models provide concepts that describe the details of how data is stored in the computer. Concepts provided by low-level data models are generally meant for computer specialists, not for typical end users. Between these two extremes is a class of implementation data models, which provide concepts that may be understood by end users but that are not too far removed from the way data is organized within the computer. Implementation data models hide some details of data storage but can be implemented on a computer system in a direct way.
High-level data models use concepts such as entities, attributes, and relationships. An entity is an object that is represented in the database. An attribute is a property that describes some aspect of an object. Relationships among objects are easily represented in high-level data models, which are sometimes called object-based models because they mainly describe objects and their interrelationships.
Implementation data models are the ones used most frequently in current commercial DBMSs and include the three most widely used data models-relational, network, and hierarchical. They represent data using record structures and hence are sometimes called record-based data modes.
Physical data models describe how data is stored in the computer by representing information such as record formats, record orderings, and access paths. An access path is a structure that makes the search for particular database records much faster.
3. Classification of Database Management Systems
The main criterion used to classify DBMSs is the data model on which the DBMS is based. The data models used most often in current commercial DBMSs are the relational, network, and hierarchical models. Some recent DBMSs are based on conceptual or object-oriented models. We will categorize
DBMSs as relational, hierarchical, and others.
Another criterion used to classify DBMSs is the number of users supported by the DBMS. Single-user systems support only one user at a time and are mostly used with personal computer. Multiuser systems include the majority of DBMSs and support many users concurrently.
A third criterion is the number of sites over which the database is distributed. Most DBMSs are centralized, meaning that their data is stored at a single computer site. A centralized DBMS can support multiple users, but the DBMS and database themselves reside totally at a single computer site. A distributed DBMS (DDBMS) can have the actual database and DBMS software distributed over many sites connected by a computer network. Homogeneous DDBMSs use the same DBMS software at multiple sites. A recent trend is to develop software to access several autonomous preexisting database stored under heterogeneous DBMSs. This leads to a federated DBMS (or multidatabase system),, where the participating DBMSs are loosely coupled and have a degree of local autonomy.
We can also classify a DBMS on the basis of the types of access paty options available for storing files. One well-known family of DBMSs is based on inverted file structures. Finally, a DBMS can be general purpose of special purpose. When performance is a prime consideration, a special-purpose D
BMS can be designed and built for a specific application and cannot be used for other applications, Many airline reservations and telephone directory systems are special-purpose DBMSs.
Let us briefly discuss the main criterion for classifying DBMSs: the data mode. The relational data model represents a database as a collection of tables, which look like files. Most relational databases have high-level query languages and support a limited form of user views.
The network model represents data as record types and also represents a limited type of 1:N relationship, called a set type. The network model, also known as the CODASYL DBTG model, has an associated record-at-a-time language that must be embedded in a host programming language.
The hierarchical model represents data as hierarchical tree structures. Each hierarchy represents a number of related records. There is no standard language for the hierarchical model, although most hierarchical DBMSs have record-at-a-time languages.
4. Client-Server Architecture
Many varieties of modern software use a client-server architecture, in which requests by one process (the client) are sent to another process (the server) for execution. Database systems are no exception.
In the simplest client/server architecture, the entire DBMS is a server, except for the query interfaces that interact with the user and send queries or other commands across to the server. For example, relational systems generally use the SQL language for representing requests from the client to the server. The database server then sends the answer, in the form of a table or relation, back to the client. The relationship between client and server can get more work in the
client, since the server will e a bottleneck if there are many simultaneous database users.
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论