Most Deadly Sins of Database Design

Posted: January 8, 2022 by Virendra Yaduvanshi in Database Administrator

If you don’t get the data right, nothing else matters. However, the business focus on applications often overshadows the priority of a well-organized database design.
Several factors can lead to a poor database design: lack of experience, a shortage of the necessary skills, tight timelines, and insufficient resources can all contribute. Addressing some simple data modeling and design fundamentals leads to the right path. Here, I am trying to explain a few common database design “sins” that can be easily avoided, and ways to correct them in future projects.

1) Poor or missing documentation for databases in production: Documentation for databases usually falls into three categories: incomplete, inaccurate, or none at all. This leaves developers, DBAs, architects, and business analysts scrambling to get on the same page, left to their own imagination to interpret the meaning and usage of the data. The best approach is to place the data models into a central repository and generate automated reports from it, so that everyone benefits with minimal effort. Producing a central store of models is only half the battle, though. Once that is done, executing validation and quality metrics will enhance the quality of the models over time, help with data management, and extend what metadata is captured in the models.

2) Inadequate or no normalization:
Sometimes it is necessary to de-normalize a database structure to achieve optimal performance, but sacrificing flexibility will paint you into a corner. Despite the long-held belief of many developers, one table to store everything is not always optimal. Another common mistake is repeating values stored in a table. This can greatly decrease flexibility and increase the difficulty of updating the data. Understanding even the basics of normalization adds flexibility to a design while reducing redundant data.
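As a minimal sketch of the repeated-values problem and its normalized fix, using SQLite via Python's sqlite3 module (table and column names are illustrative, not from any particular system):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized: the department name is repeated on every employee row,
# so renaming a department means updating many rows (and risking misses).
cur.execute("CREATE TABLE emp_flat (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT)")

# Normalized: the department name lives in one place; employees reference it.
cur.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE emp (
    id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES dept(id))""")

cur.execute("INSERT INTO dept (id, name) VALUES (1, 'Sales')")
cur.executemany("INSERT INTO emp (name, dept_id) VALUES (?, ?)",
                [("Alice", 1), ("Bob", 1)])

# Renaming the department is now a single-row update; every employee
# picks up the change through the join.
cur.execute("UPDATE dept SET name = 'Global Sales' WHERE id = 1")
rows = cur.execute("""SELECT e.name, d.name FROM emp e
                      JOIN dept d ON d.id = e.dept_id""").fetchall()
print(rows)
```

With the flat table, the same rename would require touching every employee row that happened to store the old string.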

3) Inappropriate data modeling:
There are numerous examples of customers performing the modeling upfront, but once the design is in production, all modeling ceases. To maintain flexibility and ensure consistency when the database changes, those modifications need to find their way back to the model.

4) Improper storage of reference data:
There are two main problems with reference data: it is either stored in many places or, even worse, embedded in the application code. Reference values provide valuable documentation, which should be communicated in an appropriate location, and the model is often the best place for it. The key is to have it defined in one place and used everywhere else.
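A hedged sketch of the "define once, use everywhere" idea: status codes kept in a single lookup table instead of being hard-coded in the application (the table and code values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One authoritative home for the reference values.
cur.execute("CREATE TABLE order_status (code TEXT PRIMARY KEY, description TEXT)")
cur.executemany("INSERT INTO order_status VALUES (?, ?)",
                [("N", "New"), ("S", "Shipped"), ("C", "Cancelled")])

# Other tables store only the code, never the duplicated description.
cur.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    status_code TEXT REFERENCES order_status(code))""")
cur.execute("INSERT INTO orders (status_code) VALUES ('N')")

# Applications look the description up rather than embedding it.
desc = cur.execute("""SELECT s.description FROM orders o
                      JOIN order_status s ON s.code = o.status_code""").fetchone()[0]
print(desc)
```

If a description ever changes, it changes in exactly one row, and every consumer sees the update.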

5) Ignoring foreign key and check constraints: When reverse engineering databases, end-users complain all the time about the lack of referential integrity (RI) and validation checks defined in the database. For older database systems, it was thought that foreign keys and check constraints slowed performance, so RI and checks were pushed into the application. If it is possible to validate the data in the database, it should be done there. Error handling will be drastically simplified, and data quality will increase as a result.
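As one concrete example, a minimal SQLite sketch of letting the database itself reject bad data (note that SQLite only enforces foreign keys when the pragma is switched on; the schema here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE invoice (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    amount REAL CHECK (amount > 0))""")

conn.execute("INSERT INTO customer (id) VALUES (1)")
conn.execute("INSERT INTO invoice (customer_id, amount) VALUES (1, 9.99)")  # valid row

# Both bad rows below are rejected by the database, not by application code.
fk_rejected = check_rejected = False
try:
    conn.execute("INSERT INTO invoice (customer_id, amount) VALUES (99, 5.0)")
except sqlite3.IntegrityError:
    fk_rejected = True  # no customer with id 99
try:
    conn.execute("INSERT INTO invoice (customer_id, amount) VALUES (1, -5.0)")
except sqlite3.IntegrityError:
    check_rejected = True  # violates CHECK (amount > 0)

print(fk_rejected, check_rejected)
```

Every application that touches this schema gets the same validation for free, instead of each one re-implementing (and occasionally forgetting) the checks.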

6) Avoiding domains and naming standards:
Domains and naming standards are probably two of the most important things that can be incorporated into modeling practices. Domains allow the creation of reusable attributes, so that the same attributes are not created in different places with different properties. Naming standards allow those attributes to be identified clearly and consistently.

7) Inappropriate primary key: The simplest principle to remember when picking a primary key is SUM: Static, Unique, Minimal. It is not necessary to delve into the whole natural vs. surrogate key debate; however, it is important to know that although surrogate keys may uniquely identify the record, they do not always uniquely identify the data. There is a time and a place for both, and you can always create an alternate key on the natural key if a surrogate is used as the primary key.
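A sketch of that last point in SQLite: a surrogate primary key for joins, plus a UNIQUE constraint on the natural key so the data itself stays unique (the product/SKU schema is an illustrative assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("""CREATE TABLE product (
    id INTEGER PRIMARY KEY,          -- surrogate key: static, unique, minimal
    sku TEXT NOT NULL UNIQUE,        -- alternate key on the natural identifier
    name TEXT)""")

cur.execute("INSERT INTO product (sku, name) VALUES ('ABC-1', 'Widget')")

# Without the UNIQUE alternate key, this duplicate would slip in silently:
# two rows, two surrogate ids, but the same real-world product.
dup_blocked = False
try:
    cur.execute("INSERT INTO product (sku, name) VALUES ('ABC-1', 'Widget again')")
except sqlite3.IntegrityError:
    dup_blocked = True
print(dup_blocked)
```

The surrogate `id` keeps foreign keys small and stable, while the UNIQUE `sku` keeps the data honest.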

8) Using a composite key: A composite primary key is a primary key that contains two or more columns. The primary key serves the single purpose of uniquely identifying the row within the system, and as a result it is used in other tables as a foreign key. Using a composite primary key means these other tables must each carry two or three columns to link back to this table, which is neither as easy nor as efficient as using a single column.
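To make the cost concrete, here is an illustrative SQLite sketch: once the parent key is composite, every referencing table and every join must repeat all of its columns (the order/shipment schema is an assumption for demonstration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite primary key: children must carry BOTH columns.
conn.execute("""CREATE TABLE order_line (
    order_id INTEGER,
    line_no  INTEGER,
    item     TEXT,
    PRIMARY KEY (order_id, line_no))""")

# A referencing table needs a two-column foreign key...
conn.execute("""CREATE TABLE shipment_line (
    id INTEGER PRIMARY KEY,
    order_id INTEGER,
    line_no  INTEGER,
    FOREIGN KEY (order_id, line_no) REFERENCES order_line(order_id, line_no))""")

conn.execute("INSERT INTO order_line VALUES (1, 1, 'Widget')")
conn.execute("INSERT INTO shipment_line (order_id, line_no) VALUES (1, 1)")

# ...and every join must repeat both columns as well.
item = conn.execute("""SELECT ol.item FROM shipment_line sl
                       JOIN order_line ol
                         ON ol.order_id = sl.order_id
                        AND ol.line_no  = sl.line_no""").fetchone()[0]
print(item)
```

With a single-column surrogate key on `order_line`, each child table would carry one column and each join would have one predicate.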

9) Poor indexing: Indexes are database objects that allow certain queries to run more efficiently. Indexes are not a silver bullet for performance; they don’t solve every issue. Three mistakes are commonly made when it comes to indexes:

a) No indexes at all: Without indexes, queries will likely get slower and slower as the table grows with more data.

b) Too many indexes: Having too many indexes can also be an issue for databases. Indexes help with reading data from a table, but slow down DML operations (inserts, updates, and deletes), since every index must be maintained on each change.

c) Indexes on every column: It might be tempting to add an index to every column of the table, but doing so can slow down the database.
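A small illustration of point (a), using SQLite's EXPLAIN QUERY PLAN to show how one targeted index turns a full-table scan into an index search (table and index names are made up for the example; the exact plan text varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO customer (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])

query = "SELECT id FROM customer WHERE email = 'user500@example.com'"

# Without an index on email, SQLite scans the whole table.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]
print(plan_before)  # e.g. "SCAN customer"

# One targeted index turns the lookup into an index search.
conn.execute("CREATE INDEX idx_customer_email ON customer(email)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]
print(plan_after)   # e.g. "SEARCH customer USING ... idx_customer_email ..."
```

The flip side of points (b) and (c): each extra index like `idx_customer_email` must be updated on every insert, update, and delete, which is why indexing every column hurts write-heavy tables.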

10) Incorrect data types: Often due to incomplete knowledge of the business flows, data ends up stored in the wrong format, which causes issues both when storing the data and when retrieving it later.
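A minimal sketch of the retrieval problem in SQLite: numeric amounts stored as text sort lexicographically rather than numerically (the table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Wrong type: amounts stored as TEXT compare and sort as strings.
conn.execute("CREATE TABLE t_bad (amount TEXT)")
conn.executemany("INSERT INTO t_bad VALUES (?)", [("9",), ("10",), ("2",)])
bad_order = [r[0] for r in conn.execute("SELECT amount FROM t_bad ORDER BY amount")]
print(bad_order)   # ['10', '2', '9'] -- lexicographic, not numeric

# Correct type: a numeric column sorts and compares as numbers.
conn.execute("CREATE TABLE t_good (amount INTEGER)")
conn.executemany("INSERT INTO t_good VALUES (?)", [(9,), (10,), (2,)])
good_order = [r[0] for r in conn.execute("SELECT amount FROM t_good ORDER BY amount")]
print(good_order)  # [2, 9, 10]
```

The same trap applies to dates stored as free-form strings: range queries and sorting silently return wrong results long after the data was first written.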
