Software Development Resources
Create account | Log in | Log in with OpenID | Help

Relational database

From DocForge

A relational database, or relational database management system (RDBMS) is a database system in which information is stored in multiple linear tables representative of the structure of the data. The relationship among sets of data is defined by the relationship between fields. A relational database can be contrasted with an object-oriented database which stores objects and identifies the relationship between objects based on their properties, and also with a NoSQL database which stores individually structured documents.

Contents

[edit] Advantages

The benefits of a relational database over other methods of long term storage make it very popular for many situations.

  • Most types of data can be organized in sets with multiple predefined fields. An address, for example, will contain a street, city, state, and zip code. RDBMSs are designed specifically to store and relate this type of data.
  • Each record / row in the database can easily be given a unique identifier (id) to link to other records. An address record, for example, may be stored with a customer id to represent the relationship between a person and their addresses. Databases can further enforce these relationships to maintain data integrity.
  • Records with simple data types (integers, strings, floats, etc.) can be stored on disk or in memory with a fixed width, making retrieval and parsing fast and efficient.
  • A standard query language, SQL, is supported by most relational database servers. This simplifies development and increases code portability.

[edit] Disadvantages

  • Not all forms of a program's data are best represented in multiple related tables of scalar values. Data expressed in a tree or hierarchy, for example, can become complex. While a structured document form such as XML might be convenient for representation and transmission to other applications, storage of the fundamental data in a relational database may be cumbersome.
  • Spreading data across multiple servers can be difficult and inefficient. In some situations a distributed database implementation can't be seamless to the application using it. Most robust database servers have features for such scalability, and they are often complex enough to require training and expertise for proper administration.
    • One common scalable solution, database replication, requires duplicate copies of data on multiple computers. There can be a significant delay in replication.
    • Storing different tables on different servers requires multiple database connections and can make server-side joining of data difficult.
    • Individual tables distributed across multiple computers make table scanning inefficient.

[edit] Objects

Relational database systems vary in the sets of objects they support.

Tables
As the fundamental storage object, all relational databases support tables. Tables are queried to update and retrieve data. They are the logical representation of data to external systems. Internally they can be stored with any variety of methods, such as in memory, in one file on disk, or spread across multiple systems, but this is virtually seamless to the applications connected to the database server.
Indexes or Keys
Indexes are data structures built for very fast lookups, giving databases the ability to find table rows more efficiently.
Views
Views are stored queries whose results can be used similar to tables.
Stored procedures
Stored procedures are subroutines which reside on the database server. With them an application can run multiple SQL or procedural statements, optionally with parameters, with one simple procedural call.

[edit] Languages

To alter and retrieve the objects in its databases, each relational database system supports some special form of programming language. Most support a subset or superset of the standard Structured Query Language, or SQL. Due to its limitations, most larger databases support their own extension of the language:

[edit] Database Design

Designing relational database models is a fundamental component of software development for many types of applications. It's critical that a database model be designed well from the start because changes to it later can have far reaching implications in any applications that use it.

[edit] Best Practices

As always, the optimal database design is dependent on the situation in which it's going to be used. There are general best practices, or guidelines, which apply to almost all relational databases.

  • Plan with as much foresight as possible. Database design typically should be scheduled relatively early in a project. Altering a data model after an application has been implemented is often a cumbersome task.
  • Learn the data or business domain. It's much easier to lay out data when the domain is completely understood.
  • Organize the data first based on the data itself, and second on how it's going to be used. A primary focus when first laying out tables should be how data elements naturally relate to each other. Proper organization will help build better queries, make applications easier to program, and help retain data integrity.
  • Normalize tables. Ideally each record in a table will represent one distinct item. This will reduce redundancy and help enforce referential integrity. There can be a performance tradeoff, however, when data is broken down into many distinct tables, as they may need to be joined often in queries.
  • Write documentation. Explaining the logic behind design choices will help develop the applications that use it, as well as make changes easier later.
  • Define naming conventions, similar to general coding standards. For example, name all fields in lower case with underscores for spaces, and end all fields which are identifiers with _id. The choices made are less important than the fact they need to be consistent.
  • Create unique keys based on actual unique data, as opposed to arbitrary identifiers, when possible. For example, a product number in a product table should probably be unique. Using a separate guid as the only unique identifier would allow multiple products to have the same product number within the table.
  • Use primary keys, foreign keys, and stored procedures to protect data integrity.
  • Use stored procedures to also aid performance, security, and encapsulation.
  • Follow SQL standards when possible. While each database system has additions and modifications, following the standard will increase code portability. As a database grows, for example, it may be best to move from one vendor to another, and this is much more easily accomplished if standards are supported and adhered to.

[edit] Software

There's a wide variety of relational database servers, desktop applications, and programming libraries available. Some of the more notable include:

[edit] See Also

Discuss