D.2.12. Knowledge Graph Federation¶
Model-driven federated interoperable and distributed data platforms.
The capability Knowledge Graph Federation (D.2.12) is part of the capability area Technology Execution in the Technology Pillar.
Model-driven federated interoperable and distributed data platforms.
The Knowledge Graph Federation capability is vital for integrating distributed data sources in a federated knowledge graph.
Two key aspects are crucial for its implementation:
-
Authoritative Identifier Resolution: Establishing an authoritative approach for identifier resolution is essential. It involves defining a consistent strategy to resolve identifiers across the federated data sources. This ensures data integrity, eliminates conflicts, and enables accurate cross-referencing and integration of data.
-
Meaning Understanding and Policy Integration: Understanding the meaning and policies associated with identified objects is critical. This involves capturing metadata, semantics, and contextual information to gain a comprehensive understanding of object attributes and relationships. By integrating policy management and entitlements, organizations can enforce access controls, data governance, and compliance across the federated knowledge graph.
By implementing an authoritative identifier resolution strategy and integrating meaning understanding and policy integration, organizations can enable effective knowledge graph federation. This allows for seamless data integration, collaboration, and decision-making across distributed sources while ensuring data integrity, security, and compliance.
Storage side¶
Key aspects of the Knowledge Graph Federation capability, on the storage side (i.e. on the side of the data source, the data publisher, or the data product), include:
-
Local Storage Management: Managing data storage at a local level within individual components or nodes of the distributed EKG. This involves implementing scalable storage solutions that can handle the volume, velocity, and variety of data specific to each component.
-
Data Modeling and Storage Structure: Designing and structuring the knowledge graph data model at a local level to accurately represent entities, relationships, and their properties. While storage structure is of lesser importance in a semantic technology context, it is still essential to ensure efficient organization and representation of data within the local storage of each component.
-
Local Data Ingestion and Integration: Establishing mechanisms to ingest and integrate data at a local level within each component of the EKG. This includes local data extraction, transformation, and loading (ETL) processes, as well as integration interfaces specific to each component to seamlessly incorporate diverse data into the local storage.
-
Indexing and Local Storage Optimization: Implementing indexing techniques and optimizing the storage structure at a local level to enable efficient data retrieval within each component. This includes creating indexes on relevant attributes and properties specific to each local storage to enhance query performance and optimize storage efficiency.
Retrieval side¶
And on the retrieval side (i.e. on the side of the users and their use cases):
-
Querying and Retrieval Optimization: Enabling fast and accurate retrieval of information within each component of the distributed EKG. This involves implementing optimized query mechanisms, query languages, or APIs specific to each component's storage technology to provide efficient retrieval of relevant information.
-
Inter-Component Data Access and Retrieval: Establishing mechanisms for inter-component data access and retrieval within the distributed EKG. This includes ensuring seamless communication and data exchange between different components to support cross-component querying and retrieval of information.
-
SPARQL Query Support: Ensuring the capability to query each node of the EKG architecture using SPARQL. SPARQL is a query language specifically designed for semantic knowledge graphs, allowing expressive querying with consideration for ontologies. Supporting SPARQL queries within the distributed EKG (not as a SPARQL endpoint that's exposed to users or other systems) enables complex use cases that leverage the full expressivity of ontologies and semantic reasoning.
By managing data storage and retrieval at a local level within the distributed architecture of the EKG, organizations can ensure efficient utilization of storage resources and optimize retrieval mechanisms for each component. This distributed approach allows for scalability, fault tolerance, and performance optimization in large-scale EKG deployments, while SPARQL support enables powerful and ontology-aware querying within the EKG.
Holistic view, unified view, "Data Sphere"¶
A major capability within the technology pillar of an Enterprise Knowledge Graph (EKG) is the ability to decouple the user interface or user experience from the backend complexities while providing a single, unified view of entities. This capability enables users to have a consistent and comprehensive understanding of any given concept, such as customers, contracts, products, employees, and more.
Key points about this capability include:
- Unified Entity Representation: The user interface of the EKG presents a unified representation of entities, ensuring that users have a single view of any given concept. Instead of encountering multiple representations of an entity scattered across different systems or data sources, users can access a consolidated and holistic view, enhancing their understanding and decision-making processes.
- Centralized Entity Information: The EKG consolidates entity information from diverse data sources, systems, and applications into a centralized repository. The user interface leverages this centralized repository to retrieve and display comprehensive entity details, providing users with a complete and up-to-date view of relevant information.
- Entity Context and Relationships: The user interface showcases the context and relationships associated with entities. Users can explore the connections and dependencies between entities, enabling them to gain insights, understand the impact of relationships, and navigate through related information more efficiently.
- Customizable Entity Views: The user interface allows for customization of entity views based on user preferences and requirements. Users can personalize the displayed attributes, data fields, visualizations, and additional contextual information to align with their specific needs and enhance their understanding of entities.
- Simplified Entity Navigation: The user interface streamlines entity navigation by providing intuitive tools and features. Users can easily access related entities, drill down into specific attributes or data subsets, and traverse through entity hierarchies or networks to explore additional details, contributing to a seamless and efficient user experience.
- Entity-Level Collaboration: The user interface facilitates collaboration and knowledge sharing at the entity level. Users can annotate, comment, and share insights or findings related to specific entities, fostering collaboration and information exchange among users within the organization.
- Consistency across Use Cases: Regardless of the use case or application scenario, the user interface maintains a consistent and unified entity view. Users can rely on a common representation and understanding of entities across various contexts, promoting coherence, accuracy, and efficiency in their interactions with the EKG.
By providing a single view of entities within the decoupled user interface, the EKG enables users to gain a comprehensive understanding of concepts, entities, and their interrelationships. This unified perspective enhances decision-making, collaboration, and knowledge discovery while reducing the complexity and cognitive load associated with disparate representations of entities.
Warn
Work in progress
Warn
Work in progress. Describe the five levels of maturity for this Capability.
Warn
Work in progress
Warn
Work in progress, describe how this capability is possibly being delivered today in a non-EKG context and optionally what the issues are that EKG could or should improve
Warn
Work in progress, describe how this capability would be delivered or supported using an EKG approach, making the link to the "how" i.e. the EKG/Method.
Warn
Work in progress, list examples of use cases that contribute to this capability, making the link to use cases in the catalog at https://catalog.ekgf.org/use-case/..