Data modeling (sometimes spelled data modelling) is the analysis of data objects and their relationships to other data objects. It is often the first step in database design and object-oriented programming, because designers begin by creating a conceptual model of how data items relate to each other.
How is data modeling used?
Data modeling involves a progression from conceptual model to logical model to physical schema. The resulting models act as a blueprint for building an optimized database: data architects take the conceptual model of how data items relate to each other and use it to build out the physical blueprint.
In summary, data modeling bridges the gap between collecting data and turning it into an actionable database.
Portions of this definition originally appeared on Datamation.com and are excerpted here with permission.
What are the types of data models?
The three primary types of data models (conceptual, logical, and physical) progress from an abstract layout, to a detailed mapping of the database setup, to the database’s final form.
Conceptual data model
Conceptual data models are the simplest and most abstract, with little annotation or detail. Frequently used in the discovery stage of a project, they aim primarily to determine the overall layout and rules of data relationships, often in light of regulations and data categories.
Logical data model
Logical data models expand on the basic framework of the conceptual model. They are particularly useful in data warehousing plans, where the goal is to consider more relational factors and to annotate overall properties or data attributes.
Physical data model
Physical data models are typically the final and most detailed step before database creation, focusing on properties and rules specific to the database management system. They also describe data points and their relationships in enough detail to serve as a blueprint with all the instructions needed for the database build.
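As a rough illustration of what a physical data model pins down, the sketch below uses Python's built-in sqlite3 module to declare two tables with concrete column types, keys, constraints, and an index. The customer and customer_order entities, their columns, and the build_schema helper are hypothetical, chosen only for this example; a real physical model would follow the rules of whichever database management system is in use.

```python
import sqlite3

# Hypothetical physical schema: table names, columns, and constraints
# are illustrative assumptions, not taken from the article.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    placed_on   TEXT NOT NULL,
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
CREATE INDEX idx_order_customer ON customer_order(customer_id);
"""


def build_schema() -> sqlite3.Connection:
    """Create an in-memory database from the DDL above."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


conn = build_schema()
# List the tables that the physical model produced.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

Details such as the CHECK constraint and the index are the kind of DBMS-level decisions that conceptual and logical models usually leave open.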
Data model infrastructure
Beyond the three main types of data models, organizations can choose from several design and infrastructure methods for visualizing their data.
These options include:
- Hierarchical Data Model: Similar to a family tree, data entities look like “parents” or “children” and branch off from other data that shares a relationship with them.
- Relational Data Model: Instead of the parent-child relationships of the hierarchical model, this model maps out the connections among various tables of data.
- Entity-Relationship (ER) Data Model: The ER data model creates a diagram that showcases data entities and their relationships. It’s often used with the relational model to understand how data should connect in a database.
- Object-Oriented Data Model: This model, often used in early development stages of multimedia technologies, groups complex real-world data entities into easy-to-read class hierarchies.
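To make the contrast between the hierarchical and relational models concrete, here is a minimal sketch that stores the same data both ways. The organization tree and the flatten helper are hypothetical names invented for this example; the point is only that nesting expresses the parent-child link in the hierarchical form, while an explicit parent_id column expresses it in the table-like form.

```python
# Hierarchical form: each child is nested under exactly one parent.
# The departments below are made-up sample data.
hierarchical = {
    "name": "Company",
    "children": [
        {
            "name": "Engineering",
            "children": [{"name": "Platform", "children": []}],
        },
        {"name": "Sales", "children": []},
    ],
}


def flatten(node, parent_id=None, rows=None):
    """Convert the tree into table-style rows of (id, name, parent_id)."""
    if rows is None:
        rows = []
    node_id = len(rows) + 1  # assign the next sequential key
    rows.append((node_id, node["name"], parent_id))
    for child in node["children"]:
        flatten(child, node_id, rows)
    return rows


# Relational-style form: one flat table, relationships held in a key column.
rows = flatten(hierarchical)
for row in rows:
    print(row)
```

In the flat form, a row refers to its parent by key instead of being nested inside it, which is what lets the relational model connect tables in ways a strict tree cannot.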
Data modeling features
The following are some of the key features of any approach to data modeling:
Data entities and their attributes
Data entities are abstractions of real pieces of data, and attributes are the properties that characterize those entities. Looking at entities and attributes together makes it easier to find relationships, that is, the similarities and connections across entities.
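One way to picture entities and attributes is as small record types. The sketch below assumes two hypothetical entities, Author and Book, written as Python dataclasses; the shared_attributes helper is likewise an illustrative invention that finds a candidate relationship by looking for attribute names the two entities have in common.

```python
from dataclasses import dataclass, fields


# Hypothetical entities: each class is an entity, each field an attribute.
@dataclass(frozen=True)
class Author:
    author_id: int
    name: str


@dataclass(frozen=True)
class Book:
    book_id: int
    title: str
    author_id: int  # attribute that links a Book to an Author


def shared_attributes(entity_a, entity_b):
    """Return the attribute names that appear on both entities."""
    names_a = {f.name for f in fields(entity_a)}
    names_b = {f.name for f in fields(entity_b)}
    return names_a & names_b


print(shared_attributes(Author, Book))
```

A shared attribute name is only a hint of a relationship; in practice a modeler would confirm that both attributes refer to the same thing before linking the entities.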
Unified Modeling Language (UML)
UML can be considered a set of building blocks and best practices for data modeling. It is a standard modeling language that helps data professionals visualize and construct appropriate model structures for their data needs.
Normalization through unique keys
Normalization is the technique that eliminates the repetition that occurs when building out relationships within a large set of data. It does this by assigning unique keys, often numerical values, to groups of data entities. Each new relationship can then refer to an entity by its key alone, rather than repeating the full data entry.
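The idea can be sketched in a few lines of Python. The order records, the field names, and the normalize function below are all hypothetical; the example assumes that a customer's email address identifies the customer, and it replaces the repeated customer details in each order with a generated numeric key.

```python
# Denormalized sample data: customer details repeat on every order.
orders = [
    {"order_id": 1, "customer_name": "Ada", "customer_email": "ada@example.com", "item": "Keyboard"},
    {"order_id": 2, "customer_name": "Ada", "customer_email": "ada@example.com", "item": "Mouse"},
    {"order_id": 3, "customer_name": "Lin", "customer_email": "lin@example.com", "item": "Monitor"},
]


def normalize(rows):
    """Split rows into a customers table and orders that hold only a key."""
    customers = {}     # customer_id -> customer attributes
    key_by_email = {}  # email -> customer_id (assumes email is unique)
    normalized_orders = []
    for row in rows:
        email = row["customer_email"]
        if email not in key_by_email:
            customer_id = len(key_by_email) + 1  # next unused numeric key
            key_by_email[email] = customer_id
            customers[customer_id] = {"name": row["customer_name"], "email": email}
        normalized_orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": key_by_email[email],  # key replaces repeated data
                "item": row["item"],
            }
        )
    return customers, normalized_orders


customers, normalized_orders = normalize(orders)
print(customers)
print(normalized_orders)
```

After normalization each customer's name and email are stored once, so a change to either needs to be made in only one place.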