Software Development; Coding Techniques; Computer Science
Database structuring conventions are used to ensure that a database design provides access to the required data quickly and accurately while making efficient use of system resources such as network bandwidth and database storage. Standard conventions are used in defining entities, attributes, keys, views, and triggers.
Databases are built on entities. An entity is simply a single item of interest, such as a person, thing, event, or place, about which data is stored. For example, entities in a database might represent employees (people), products (things), concerts (events), or towns (places). Each entity has an associated set of attributes, which constitute the information stored in the database. For example, attributes of employees might include position, salary, and years of employment. Databases can be modeled as tables, with each entity represented by a row in the table and the various attributes represented by columns. Entities of different types are listed in separate tables. If a database contained records of both a company's employees and the different departments within that company, for example, one table would list the employees and their attributes, and the other would list the departments and their attributes.
Relationships can be constructed between entities of different types. These relationships take one of the following main forms: one-to-one, one-to-many, or many-to-many. For example, multiple employees (many entities) might each have a relationship with the same department (one entity). To construct relationships, each entity must first be assigned an attribute called a primary key. This is a unique identifier whose value must not be repeated within the same table. An employee number might be used to uniquely identify each employee, and a unique department identification (ID) number could be created for each department. Next, the relationship between entities is defined by making the primary key of an entity in one table an attribute of an entity in the other table. This new attribute is called a foreign key. For example, the foreign key column of the employee table would hold the department ID of each employee's department, establishing a relationship between the employee and their department.
When data is updated in one table, it is often necessary to update data in a related table. For example, if the department ID changes, that information must also be changed in the employee table for all employees of that department. Triggers and constraints can be used to ensure that the correct updates occur.
Establishing a consistent naming convention is an important part of database design. When naming objects in a database, keep names as short, clear, and descriptive as possible. If abbreviations are needed, use standard ones. Avoid using terms that might change over time, as database objects can be difficult to rename. An organization's naming conventions, such as those used for file naming, should inform database naming. Various database systems also impose their own limits on object naming. In general, underscores, not spaces, are used to separate words. A prefix or suffix may identify views.
Once the necessary entities and attributes are determined, the database structure can be optimized through normalization. Normalization maximizes the use of keys to relate tables to reduce duplication. For example, if the database of a US-based company had one table for employees and another for customers, both tables would likely have an attribute for each entity's state of residence. As a result, data would be duplicated between the two tables, taking up unnecessary space. In addition, if the state names were abbreviated differently in the different tables—for example, if Texas were abbreviated as “Tex.” in one table and “TX” in the other—this could cause errors. During normalization, a state table would be created to store the name and abbreviation of each state. This table would then be linked to the employee and customer tables using keys, thus preventing data duplication.
Normalization also ensures that the correct attributes are stored in the correct tables. For instance, it is more logical for an employee's work hours to be stored in the employee table than in the department table. This makes searches, or queries, of the data easier to perform.
Normalization reduces the size of tables, which offers performance advantages. Updates and insertions are faster because there is no need to update or add data in multiple places. Deletions do not risk the loss of other, needed data. In general, normalized databases perform well when data is written to the database or changed more often than it is read.
In contrast, denormalized databases offer performance advantages when data is read from the database more often than it is written. This is because the data required can be retrieved from a single table, which is faster than accessing multiple tables, as is needed in a normalized database. Index usage is also more efficient if a database is denormalized. Normalization and denormalization must therefore balance the read and write performance for a particular database.
In order for a database to offer the maximum possible performance while using system resources efficiently, it must be well designed. The foundation of a successful design is a sound database structure. Established database structuring conventions help developers create the most efficient database based on the individual requirements of the end user. Following these conventions has proved to result in improved database design.
—Maura Valentino, MSLIS
Churcher, Clare. Beginning Database Design: From Novice to Professional. 2nd ed., Apress, 2012.
Harrington, Jan L. Relational Database Design and Implementation. 4th ed., Elsevier, 2016.
Hernandez, Michael J. Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design. 3rd ed., Addison-Wesley, 2013.
MacLennan, Bruce J. Principles of Programming Languages: Design, Evaluation, and Implementation. 3rd ed., Oxford UP, 1999.
Mullins, Craig S. Database Administration: The Complete Guide to DBA Practices and Procedures. 2nd ed., Addison-Wesley, 2013.
Scott, Michael L. Programming Language Pragmatics. 4th ed., Elsevier, 2016.
Van Roy, Peter, and Seif Haridi. Concepts, Techniques, and Models of Computer Programming. MIT P, 2004.