The disciplines and subdisciplines of data are complex and often overlap. Data analysis, engineering, and science are foundational concepts, while data modeling refers to the process of mapping data at conceptual, logical, and physical levels.
Data modeling overlaps with data science, engineering, and analysis, but it leans more toward the engineering side. Namely, data modeling creates visual maps and references that allow data practitioners to visualize a data system.
Data analysis involves interpretation, critical thinking, and other analytical techniques to derive meaning. Modeling is the process of connecting data systems together, e.g., connecting a point-of-sale device to a CRM, or a sales database to a stock system.
What is Data Modeling?
Data modeling is the process of creating maps, graphs, or diagrams that visualize the relationships between data. In a data project, models are developed early, and the project’s goals and architecture form the basis for their design. Data models are required for virtually any data project in which different systems need to talk to each other in a structured manner.
The content and format of the diagrams themselves will vary with the project needs and architecture. Most database models are either:
- Relational models, which provide well-structured, logical connections between different tables. These are simple to work with but have fixed schema.
- NoSQL models, which are essentially schema-less. These are more intensive to model but allow for the creation of joins while allowing some data to remain nested.
- Graph models, which are excellent for mapping one-to-one and one-to-many relationships in networks. The resulting data structure is heavily nested.
Data is typically modeled in a relational database using Structured Query Language (SQL), with traditional table formats used to store information. NoSQL modeling, however, uses collections of documents and is generally much more flexible. Graph databases are another possibility; they are heavily nested and suit interconnected, network-based data, e.g., traffic networks, social media, or other digital networks.
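To make the contrast concrete, here is a minimal sketch of the same information held in the three shapes described above. The customer, order, and product values are hypothetical, and plain Python structures stand in for a real database engine; the point is only the shape of the data, not how any particular product stores it.

```python
# Hypothetical example data: one customer who bought two products.

# Relational shape: flat rows in separate tables, linked by customer_id.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 10, "customer_id": 1, "product": "kettle"},
    {"order_id": 11, "customer_id": 1, "product": "toaster"},
]

# Document (NoSQL) shape: the orders are nested inside the customer document.
customer_doc = {
    "customer_id": 1,
    "name": "Ada",
    "orders": [
        {"order_id": 10, "product": "kettle"},
        {"order_id": 11, "product": "toaster"},
    ],
}

# Graph shape: labeled edges of the form (source node, relationship, target node).
edges = [
    ("customer:1", "BOUGHT", "product:kettle"),
    ("customer:1", "BOUGHT", "product:toaster"),
]


def products_relational(customer_id):
    """Look up products by filtering the orders table on the customer key."""
    return [o["product"] for o in orders if o["customer_id"] == customer_id]


def products_document(doc):
    """Read products straight from the nested orders list."""
    return [o["product"] for o in doc["orders"]]


def products_graph(node):
    """Follow BOUGHT edges that start at the given customer node."""
    return [t.split(":", 1)[1] for s, rel, t in edges if s == node and rel == "BOUGHT"]
```

All three functions answer the same question ("what did this customer buy?"), but the relational version needs a key lookup across tables, the document version reads nested data directly, and the graph version walks relationships.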
There are other types of data modeling, such as traditional hierarchical modeling; object-oriented models, which use class hierarchies and associated features; and dimensional data models, which are frequently used for business intelligence (BI).
3 Stages
Data modeling has these three stages:
- Conceptual
- Logical
- Physical/technical
Conceptual data models are broad and abstract. Here, data engineers, analysts, and other data practitioners work together to outline the problem. For example, a brick-and-mortar high street store might want to integrate its point-of-sale data with its online store and its logistics and distribution systems. Connecting these systems will allow the shop to recognize online customers when they shop in-store and to refer in-store customers to the online store if something is out of stock, and vice versa.
At the conceptual level, the three main components (POS, online store, and distribution system) should be drafted along with the main entities (customers and products).
Logical data models add primary keys, attributes, and relationships. For example, customers will be broken down by attributes such as customer IDs, names, addresses, emails, etc. Products will contain their product IDs, location, category, etc. Nullability and optionality are also assigned at this stage.
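As a rough illustration, a logical model for the store example could be sketched with Python dataclasses. The class and field names below are assumptions chosen to match the attributes mentioned above, the `Purchase` relationship is hypothetical, and which fields are optional is a made-up choice for the example rather than anything the store scenario dictates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    customer_id: int                # primary key
    name: str                       # required attribute
    email: str                      # required attribute
    address: Optional[str] = None   # nullable: may be unknown for in-store customers


@dataclass(frozen=True)
class Product:
    product_id: int                 # primary key
    category: str
    location: str


@dataclass(frozen=True)
class Purchase:
    """Hypothetical relationship: a customer buys a product through a channel."""
    customer_id: int                # refers to Customer.customer_id
    product_id: int                 # refers to Product.product_id
    channel: str                    # assumed values: "in_store" or "online"


# Usage: the address is optional, so it can be left out.
ada = Customer(customer_id=1, name="Ada", email="ada@example.com")
```

Nothing here is tied to a particular database yet; the sketch only records entities, keys, relationships, and which attributes may be missing.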
Physical models then transfer these models onto the specific architecture and add foreign keys, data types, metadata, and everything else required to make the systems functional and communicable.
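A physical model for the customer and product entities might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory SQLite database purely for illustration. The table names, column names, and the `purchases` table are assumptions, and a real project would target whatever database engine its architecture specifies.

```python
import sqlite3

# In-memory database so the sketch has no side effects.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE,
    address     TEXT                      -- nullable
);

CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    category    TEXT NOT NULL,
    location    TEXT NOT NULL
);

-- Hypothetical linking table: foreign keys tie each purchase
-- to an existing customer and an existing product.
CREATE TABLE purchases (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    product_id  INTEGER NOT NULL REFERENCES products(product_id),
    channel     TEXT NOT NULL CHECK (channel IN ('in_store', 'online'))
);
""")
```

With the foreign keys in place, the database itself rejects a purchase that points at a customer or product that does not exist, which is the kind of guarantee the physical stage is meant to add.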
What Is Data Analysis?
Data analysis has a much wider, more general remit than data modeling. In fact, you could argue that most people conduct some level of data analysis in their daily lives – our brains are analyzing data constantly. Without data analysis, data is just a static entity. It needs to be processed and understood to mean anything.
In a business context, data analysis involves everything from analyzing sales trends and data to tracking customers and analyzing audience demographics or financial metrics. As a result, enterprise-level companies will employ a wide range of different data analysts.
At a fundamental level, there are probably six core types of data analysis:
- Causal Analysis
- Descriptive Analysis
- Exploratory Analysis
- Inferential Analysis
- Mechanistic Analysis
- Predictive Analysis
It’s often necessary to transform and clean data before loading it into dashboards and suites for analysis. Data engineers might handle the cleaning and transformation of data. Data analysts are perhaps more skilled in statistics and mathematics than programming or database management.
The job role of a data analyst is more client-facing – they work closely with the business or organization to analyze data to solve specific business problems.
Data analysis involves everything from visualization, clustering, and exploration to classification, regression, and simulation modeling. The results of data analysis are used to form conclusions and build solutions.
Data Analysts and Data Modeling
The concepts of data analysis and data modeling do not always exist in isolation from each other.
However, analysis is not really required when models are created to solve a simple, practical task (e.g., connecting brick-and-mortar POS databases to online store databases). Data doesn’t have to be analyzed to be modeled in a database, though it should obviously be appropriately clean and correctly validated.
The store requires analysis if it wants to query that database and compare in-store customers to online customers. This might involve querying the databases and retrieving appropriate data for insight.
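A query of that kind could look something like the sketch below. It assumes a single hypothetical `purchases` table with a `channel` column and a handful of made-up rows, held in an in-memory SQLite database via Python's `sqlite3` module; a real store's schema would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, channel TEXT)")
# Made-up sample rows: (customer_id, channel).
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1, "in_store"), (1, "online"), (2, "online"), (3, "in_store"), (3, "in_store")],
)

# Distinct customers and total purchases per channel.
per_channel = conn.execute("""
    SELECT channel,
           COUNT(DISTINCT customer_id) AS customers,
           COUNT(*)                    AS purchases
    FROM purchases
    GROUP BY channel
    ORDER BY channel
""").fetchall()

# Customers who have shopped through both channels.
both_channels = conn.execute("""
    SELECT customer_id FROM purchases WHERE channel = 'in_store'
    INTERSECT
    SELECT customer_id FROM purchases WHERE channel = 'online'
""").fetchall()

print(per_channel)    # [('in_store', 2, 3), ('online', 2, 2)]
print(both_channels)  # [(1,)]
```

The model determines which tables and keys exist; the analysis is the step of asking questions of them, here how the two customer groups compare and overlap.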
Data analysts may be heavily involved in the data modeling process, but this depends on the specific project in question. At a minimum, data analysts need to understand the database that the business is using so they can run the required queries.
Summary: The Difference Between Data Analysis and Data Modeling
Data is hugely diverse and intersects with practically every digital system on the planet. In business, organizational, or other commercial contexts, data modeling is frequently used to connect different architectures or to build new architecture from scratch to solve problems.
On the other hand, data analysis involves everything from querying databases to analyzing machine learning models. While data modelers lean towards the engineering side of things, data analysts lean towards mathematics and statistics. For example, a data analyst may have very little knowledge of database architectures. Conversely, those involved in data modeling will likely require an in-depth understanding of databases.