Data Model

Data modeling is the process of representing data in a form that business users can easily understand and use to answer their questions. It requires both a centralized approach, to ensure consistent enterprise metrics, and a self-service approach that lets business users blend data to support their own investigations.

Data modeling

Data modeling is essential both to provide data in a form that is ready to answer most anticipated business questions and to ensure a consistent view of all enterprise numbers. Data models also hide the complexity of how data is physically stored and instead present business users with views of their data that make sense to them. For example, finance users do not have to understand query languages such as SQL or MDX; they can query a relational database management system (RDBMS) or an Essbase cube using recognizable finance terms from their own lexicon.

The data model is a single place to define enterprise business calculations. Regardless of how or where those calculations are used, the value will be consistent and trusted. For example, the metric cost-to-hire would have the applicable source systems correctly mapped and the calculation defined centrally. Then any visualization or reporting process calling that metric would always report the same number.
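The idea of defining a calculation once and reusing it everywhere can be sketched in a few lines of Python. This is only an illustration, not how Oracle Analytics stores metrics: the registry, the function names, and the cost-to-hire formula (total recruiting cost divided by number of hires) are all assumptions made for the example.

```python
from typing import Callable, Dict, Mapping

# Hypothetical central registry: metric name -> calculation.
# In a real semantic model this lives in the logical layer, not in Python.
METRICS: Dict[str, Callable[[Mapping[str, float]], float]] = {}


def define_metric(name: str, formula: Callable[[Mapping[str, float]], float]) -> None:
    """Register a metric exactly once so there is a single definition."""
    if name in METRICS:
        raise ValueError(f"metric {name!r} is already defined")
    METRICS[name] = formula


def evaluate(name: str, inputs: Mapping[str, float]) -> float:
    """Every consumer (dashboard, report, export) calls the same definition."""
    return METRICS[name](inputs)


# Assumed formula for illustration: recruiting cost divided by hires.
define_metric("cost_to_hire", lambda row: row["recruiting_cost"] / row["hires"])

source_values = {"recruiting_cost": 120000.0, "hires": 40}
dashboard_value = evaluate("cost_to_hire", source_values)
report_value = evaluate("cost_to_hire", source_values)
print(dashboard_value, report_value)  # 3000.0 3000.0
```

Because both consumers go through the same registered definition, they cannot drift apart, and an attempt to redefine the metric fails loudly instead of silently creating a second version.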

Governed semantic model

Develop and deliver trusted and governed semantic models to ensure a consistent view of business-critical data. Map complex data into familiar and consistent business terms. Design an optimized, fine-tuned query for execution.

The semantic model comprises three layers: the physical layer feeds into the logical layer, which in turn feeds into the presentation layer. The physical layer maps the organization’s physical data source systems and is usually configured and managed by IT. The logical layer is used to build business calculations and hierarchies and to map several data sources into logical reporting areas. For example, the ERP system and data warehouse can be mapped together for financial reporting areas. The presentation layer determines how the available attributes and metrics are presented to users so that they can create their analytics stories. While all data is consistently calculated, a user’s particular view of that data is filtered based on their security access and authorization.

Figure 1: Reviewing data lineage in the semantic modeling tool

The semantic model is also visible to third-party visualization tools (e.g., Tableau, Power BI, or custom apps) as a JDBC source. This ensures that if some business groups choose different visualization tools, enterprise metrics only need to be defined once and remain consistent across all reporting platforms in the company.

Learn more about the semantic layer

Self-service data modeling

Through self-service, users can join two or more tables directly and control the type of each join (for example, inner or outer). Self-service data models can be shared easily with colleagues.
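To make the difference between join types concrete, the following minimal Python sketch joins two small tables on a shared key. It is not the product's implementation; the `join` function, the table contents, and the column names are hypothetical, and only inner and left outer joins are covered.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def join(left: List[Row], right: List[Row], key: str, how: str = "inner") -> List[Row]:
    """Join two lists of rows on `key`; `how` is 'inner' or 'left' (left outer)."""
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")

    # Index the right-hand table by key for fast lookup.
    index: Dict[Any, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    # Right-hand columns, used to pad unmatched rows in a left outer join.
    right_columns = sorted({col for row in right for col in row if col != key})

    result: List[Row] = []
    for left_row in left:
        matches = index.get(left_row[key], [])
        if matches:
            for right_row in matches:
                result.append({**left_row, **right_row})
        elif how == "left":
            padding = {col: None for col in right_columns}
            result.append({**left_row, **padding})
    return result


orders = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 2, "amount": 75},
    {"customer_id": 3, "amount": 20},  # no matching customer
]
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]

inner_rows = join(orders, customers, "customer_id", how="inner")
left_rows = join(orders, customers, "customer_id", how="left")
print(len(inner_rows), len(left_rows))  # 2 3
```

The inner join drops the order with no matching customer, while the left outer join keeps it and fills the missing customer columns with `None`. That choice changes totals in any downstream visualization, which is why the join type is exposed to the user.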

Watch a multi-table dataset demo (2:57)

Learn more about self-service data modeling

Data augmentation and recommendations

Data sets can be augmented with additional data, attributes, or transformations. The built-in reference knowledge includes:

Global positioning system enrichments:
Reference latitude and longitude for cities or zip codes.
Reference-based enrichments:
Infer a person’s gender from their first name.
Column concatenations:
Combine a person’s first and last name in one column.
Part extractions:
Separate the house number from the street name in an address.
Semantic extractions:
Extract information from a recognized semantic type, such as domain from an email address.
Date part extractions:
Pull out the day of week from a date that uses a month/day/year (or day/month/year) format to make the data more useful in visualizations.
Full and partial obfuscation:
Mask detected sensitive fields, such as credit card or social security numbers.
Common recommendations:
Delete columns containing detected sensitive fields.
Custom knowledge enrichments:
Leverage custom inclusions that the administrator has added to Oracle Analytics.
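Several of the enrichments listed above are simple column transformations. The sketch below shows one plausible version of five of them (concatenation, part extraction, semantic extraction, date part extraction, and partial obfuscation) in plain Python. The function names, the address pattern, and the month/day/year input format are assumptions for illustration; the reference-based and geographic enrichments need lookup data and are not shown.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def full_name(first: str, last: str) -> str:
    """Column concatenation: first and last name in one column."""
    return f"{first} {last}".strip()


def split_address(address: str) -> Tuple[Optional[str], str]:
    """Part extraction: separate a leading house number from the street name."""
    match = re.match(r"^\s*(\d+[A-Za-z]?)\s+(.+)$", address)
    if match is None:
        return None, address.strip()
    return match.group(1), match.group(2).strip()


def email_domain(email: str) -> str:
    """Semantic extraction: the domain part of an email address."""
    return email.rsplit("@", 1)[1].lower()


def day_of_week(date_text: str, fmt: str = "%m/%d/%Y") -> str:
    """Date part extraction; pass fmt='%d/%m/%Y' for day/month/year input."""
    return WEEKDAYS[datetime.strptime(date_text, fmt).weekday()]


def mask_all_but_last4(value: str) -> str:
    """Partial obfuscation: hide every digit except the last four."""
    total_digits = sum(ch.isdigit() for ch in value)
    seen = 0
    masked = []
    for ch in value:
        if ch.isdigit():
            seen += 1
            masked.append(ch if seen > total_digits - 4 else "*")
        else:
            masked.append(ch)
    return "".join(masked)


print(full_name("Ada", "Lovelace"))               # Ada Lovelace
print(split_address("221B Baker Street"))         # ('221B', 'Baker Street')
print(email_domain("ada@Example.com"))            # example.com
print(day_of_week("03/15/2024"))                  # Friday
print(mask_all_but_last4("4111 1111 1111 1111"))  # **** **** **** 1111
```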

Learn more about data enrichment with custom knowledge

Figure 3: Configuring custom reference knowledge

Semantic Modeler Markup Language (SMML)

Data model developers can build, edit, and tune semantic models using the web-based graphical tool; however, another approach is to modify models programmatically using the Semantic Modeler Markup Language (SMML). SMML is a JSON-based markup language that describes the design-time semantic model's objects. Each SMML file represents an object in the semantic model. You can use SMML files for metadata migration, programmatic metadata generation and manipulation, metadata patching, and other functions. Developers can therefore edit the semantic model code directly, or apply changes through other programmatic processes, by making textual changes to the SMML definition.
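Because each object is a JSON document, a metadata patch can be as small as loading the text, changing a value, and writing it back. The Python sketch below illustrates that workflow only. The JSON structure and key names (`logicalTable`, `columns`, `description`) are hypothetical placeholders and are not the actual SMML schema, which should be taken from the SMML reference.

```python
import json
from typing import Any, Dict

# Hypothetical object definition. The keys are placeholders for illustration
# and do NOT reflect the real SMML schema.
object_text = """
{
  "logicalTable": {
    "name": "Fact Hiring",
    "columns": [
      {"name": "Cost to Hire", "description": ""}
    ]
  }
}
"""


def set_column_description(doc: Dict[str, Any], column_name: str,
                           description: str) -> Dict[str, Any]:
    """Return a patched copy of the document; the input is left unchanged."""
    patched = json.loads(json.dumps(doc))  # simple deep copy of JSON data
    for column in patched["logicalTable"]["columns"]:
        if column["name"] == column_name:
            column["description"] = description
            return patched
    raise KeyError(f"column {column_name!r} not found")


model = json.loads(object_text)
patched_model = set_column_description(
    model, "Cost to Hire", "Total recruiting cost divided by hires"
)

# Serialize with stable formatting so the change shows up as a small text diff.
print(json.dumps(patched_model, indent=2))
```

Writing the result with consistent indentation keeps each change reviewable as a line-level difference, which matters when the files are kept under source control.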

Learn more about SMML

Figure 4: JSON-based Semantic Modeler Markup Language (SMML)

Multiuser model development and Git integration

The semantic modeler integrates with any Git-compatible repository, such as GitHub, GitLab, or Git on Oracle Visual Builder, to provide a collaborative, multi-user development environment with source control. Git integration for multi-user development supports branching, merging, and pull and push operations, and gives full visibility into the development lifecycle of the semantic model.

Learn more about the semantic modeler

Try the semantic modeler tutorial

Figure 5: Multi-user development with Git integration
