Database Design

From Computer Science Basics
Jump to: navigation, search

Exam examples

5 For each of the following events state whether it would typically occur at the client end, the server end, or at both. (3 marks)

    • (i) Updating a database
    • (ii) Validation of user fields
    • (iii) Checking username and password


8 The two tables Product and Supplier form a relational database used by a shoe shop. (Two tables are given in the paper.)

  • (a) State the primary key of the Supplier table and justify your choice. (2 marks)
  • (b) List the results of executing the following SQL query on the database above. (4 marks)
    • SELECT Product.Name, Product.Quantity, Supplier.PostCode
    • FROM Product, Supplier
    • WHERE Product.Quantity > 40 AND Supplier.Name = ‘Trainers4Us’
    • AND Product.SupplierCode = Supplier.SupplierCode
  • (c) Write an SQL statement to add the following data to the Product table. (In paper gives a row to add.) (3 marks)
ProductCode Name Cost SupplierCode Quantity
444AA Slippers 6.99 32 100


10 A programmer is developing a computer game. She has to decide between using a text file and a relational database to store some data that is needed for the game. State two possible situations when a text file would be a better choice than a relational database. (2 marks)

AQA Exam specification

Students should know the following at GCSE (KS4):

  • Database concepts
    • understand the basic concepts of a relational database as a data store
    • be able to explain the terms record, field, table, query, primary key, relationship, index and search criteria.
  • Query methods (SQL)
    • be able to create simple SQL statements to extract, add and edit data stored in databases
    • have experience of using these SQL statements from within their own coded systems.
  • Connecting to databases from applications and web based apps
    • be able to use databases from within their own web based applications

National curriculum

There is no direct reference to databases in the National Curriculum (NC) there are indirect mentions of using databases and creative projects.

The exam boards do specifically ask questions about databases.

Databsase definition

A database is an organized collection of data. The data is typically organized to model aspects of reality in a way that supports processes requiring information, such as modelling the availability of rooms in hotels in a way that supports finding a hotel with vacancies.

Database management systems (DBMS) are computer software applications that interact with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases. Well-known DBMSs include MySQL, PostgreSQL, Microsoft SQL Server, Oracle, Sybase and IBM DB2. A database is not generally portable across different DBMSs, but different DBMS can interoperate by using standards such as SQL and ODBC or JDBC to allow a single application to work with more than one DBMS. Database management systems are often classified according to the database model that they support; the most popular database systems since the 1980s have all supported the relational model as represented by the SQL language. Sometimes a DBMS is loosely referred to as a 'database'.

Types of Database

There are several types of databases.

At GCSE we only consider relational databases.

Flat file

A flat file database is a database which, when not being used, is stored on its host computer system as an ordinary, non-indexed "flat" file. To access the structure of the data and manipulate it, the file must be read in its entirety into the computer's memory. Upon completion of the database operations, the file is again written out in its entirety to the host's file system. In this stored mode, the database has no structure which can be seen external to the file and thus is referred to as a "flat file."

The term has generally implied a small, simple database. As computer memory has become cheaper, larger and more sophisticated databases can now be held in memory in their entirety for faster access, but these would not generally be referred to as flat-file databases.

A flat file can be a plain text file or a binary file. There are usually no structural relationships between the records.

Hierarchical databases

A hierarchical database model is a data model in which the data is organized into a tree-like structure. The data is stored as records which are connected to one another through links. A record is a collection of fields, with each field containing only one value. The entity type of a record defines which fields the record contains.

A record in the hierarchical database model corresponds to a row in the relational database model. The hierarchical database model mandates that each child record has only one parent, whereas each parent record can have one or more child records. In order to retrieve data from a hierarchical database the whole tree needs to be traversed starting from the root node.

Relational databases

A relation is a table structure definition (a set of column definitions) along with the data appearing in that structure. The structure definition is the heading and the data appearing in it is the body, a set of rows.

Object-orientated databases

An object database (also object-oriented database management system) is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented. Object-relational databases are a hybrid of both approaches.


Advantages of a relational database over other types of database

The advantages of a relational database over flat file.

  • Avoids data duplication
  • Avoids inconsistent records
  • Easier to change data
  • Easier to change data format
  • Data can be added and removed easily
  • Easier to maintain security


  • Data is split into several related tables
    • Data is only stored once.
    • No multiple record changes needed
    • More efficient storage
    • Simple to delete or modify details.
    • All records in other tables having a link to that entry will show the change.
    • data is in one place
  • Complex queries can be carried out using a language called SQL. SQL enable programmers to 'Insert', 'Update', 'Delete', 'Create', 'Drop' table records. These actions are further refined by a 'Where' clause.
  • Better security.
    • By splitting data into tables, certain tables can be made confidential. The system can limit access only to those tables a user they are authorised to view.
  • Cater for future requirements.
    • By having data held in separate tables, it is simple to add records that are not yet needed but may be in the future. For example all the countries of the world could be added from a dictionary file.

Advantages of a relational database

Relational database

A relational database is a digital database whose organization is based on the relational model of data, as proposed by E.F. Codd in 1970. This model organizes data into one or more tables (or "relations") of rows and columns, with a unique key for each row. Generally, each entity type described in a database has its own table, the rows representing instances of that type of entity and the columns representing values attributed to that instance. Because each row in a table has its own unique key, rows in a table can be linked to rows in other tables by storing the unique key of the row to which it should be linked (where such unique key is known as a "foreign key"). Codd showed that data relationships of arbitrary complexity can be represented using this simple set of concepts.

Prior to the advent of this model, databases were usually hierarchical, and each tended to be organized with a unique mix of indexes, chains, and pointers. The simplicity of the relational model led to it soon becoming the predominant type of database.

The various software systems used to maintain relational databases are known as Relational Database Management Systems (RDBMS).

Virtually all relational database systems use SQL (Structured Query Language) as the language for querying and maintaining the database.

Components of databases

Schema

A schema is a description of the entire database structure that is used by the database software to maintain the database.

Tables, Rows, Columns and Fields

A table is a collection of related data held in a structured format within a database. It consists of fields (columns), and rows.

In relational databases and flat file databases, a table is a set of data elements (values) using a model of vertical columns (which are identified by their name) and horizontal rows, the cell being the unit where a row and column intersect. A table has a specified number of columns (or fields), but can have any number of rows (records). Each record (row) is identified by one or more values appearing in a particular column subset. The columns subset which uniquely identifies a row is called the primary key.

Record (Row)

Relational databases arrange data as sets of database records, also called rows. Relational databases arrange data as sets of database records, also called rows. Each record consists of several fields; the fields of all records form the columns.

Fields (Cells)

A record is divided into fields. Data that has several parts, known as a record, can be divided into fields. Each record consists of several fields; the fields of all records form the columns.

Sometimes instead of using the term "a field" data in a database can be referred to as a value in a table.

When using rows and columns, a field is known as a cell.

Primary key

A primary key uniquely specifies a field within a table. In order for an attribute to be a good primary key it must not repeat. While natural attributes (attributes used to describe the data being entered) are sometimes good primary keys, surrogate keys are often used instead. A surrogate key is an artificial attribute assigned to an object which uniquely identifies it (for instance, in a table of information about students at a school they might all be assigned a student ID in order to differentiate them). The surrogate key has no intrinsic (inherent) meaning, but rather is useful through its ability to uniquely identify a string.

Composite key

A composite key is a key made up of two or more attributes within a table that (together) uniquely identify a record. (For example, in a database relating students, teachers, and classes. Classes could be uniquely identified by a composite key of their room number and time slot, since no other class could have exactly the same combination of attributes. In fact, use of a composite key such as this can be a form of data verification, albeit a weak one.

Forms

A form is an interface that displays the fields that enable a user to maniupulate data in the database. The form can call fields from rows over several tables. It can be used to view data, edit or enter data, delete data.

The form is often based on a query that finds relevant information that requires updating. The most basic form is the "new record" form that enables the user to enter data.

Queries

A search for data matching several pre-defined criteria is called a query. A query can be written in a the query language SQL, this can specify what data is to be searched for as well as in which tables and combined with boolean AND, OR and ANDNOT.

Structured Query Language (SQL)

Structured Query Language is a special-purpose programming language designed for managing data held in a relational database.

SQL consists of a data definition language and a data manipulation language. The scope of SQL includes data insert, query, update and delete, schema creation and modification, and data access control.

SQL was one of the first commercial languages for Edgar F. Codd's relational model, as described in his influential 1970 paper, "A Relational Model of Data for Large Shared Data Banks."[10] Despite not entirely adhering to the relational model as described by Codd, it became the most widely used database language.[11][12] SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987.[13] Since then, the standard has been revised to include a larger set of features. Despite the existence of such standards, though, most SQL code is not completely portable among different database systems without adjustments.

Reports

A report is simply the output to a query of the database.

The report can be sent to the printer or to the screen. It can be a simple tabulated list or it can be a well formated form made to imitate a paper card or a printed page.

Data Verification and Data Validation

Data validation

Data validation is the process of ensuring that a program receives clean, correct and useful data. It uses routines, often called "validation rules" "validation constraints" or "check routines", that check for correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by the inclusion of explicit application program validation logic.

For example a field - ProductIDNumber - requires numeric data only in the form of a 6 digit number; if the following is entered "00012A", this data is invalid and the record will not be saved and returned to the users with an appropriate error message.

Data Verification

Data Verification is a process where different types of data are checked for accuracy and inconsistencies after data migration is done. It helps to determine whether data was accurately translated when data is transferred from one source to another, is complete, and supports processes in the new system. During verification, there may be a need for a parallel run of both systems to identify areas of disparity and forestall erroneous data loss.

Types of Verification

There are two different levels of verification. The most secure level of verification is "full verification", where each and every file is verified, after backing up, by reading back from the backup medium. A less exhaustive type of verification is "sampling verification." Only a randomly selected sample of what was backed up is read back and verified. The ultimate in backup verification is doing a test restore, but this takes both time and some spare hardware.

Another type of Data Verification is double entry and proofreading data. Proofreading data involves someone checking the data entered against the original document.

Resources

http://www.teach-ict.com/as_as_computing/ocr/H447/F453/3_3_9/database_design/miniweb/index.htm

http://revisionworld.com/gcse-revision/ict/software/databases