

The Data Reference Model (DRM) is one of the Reference Models of the IndEA Framework (#IndEA) (#IndEADRM) and the Federal Enterprise Architecture Framework (#FEAF) (#FEAFDRM).

DRM provides the structure and description of the department’s data (metadata), the logical data model (depicting the relationships between data elements), the taxonomy, and the security and sharing rules associated with each data element. It provides the framework for designing the three components of Data Architecture: Data Description, Data Context and Data Sharing. These three areas deal with the discovery, creation, management and exchange of enterprise data. Database Schema, Data Steward and Exchange Package are the key concepts in the three areas respectively. Defining metadata and data standards is a key activity in the design of Enterprise Data Architecture.


The DRM principles

  • Principle DRM 1: Data Asset. Data is an asset that has a specific and measurable value to the Government and is managed accordingly. Archive and preserve all information (both in raw and aggregated form) exchanged, especially outside the government ecosystem, for future reference and if needed, for resolution of disputes. The Archival and preservation must be in accordance with the applicable regulatory requirements.
  • Principle DRM 2: Data-sharing. Data is shared across the Government, subject to rights and privileges, so as to prevent creation and maintenance of duplicative sets of data by different agencies. Data Sharing shall be subject to conformance with the principles of Security & Privacy.
  • Principle DRM 3: Data Trustee. Each dataset has a trustee accountable for data quality and security.
  • Principle DRM 4: Data Security. Data is protected from loss, unauthorized use and corruption, through adoption of international standards and best practices, duly protecting the privacy of personal data and confidentiality of sensitive data.
  • Principle DRM 5: Common Vocabulary and Data Definitions. Data is defined consistently throughout all levels of Government, and the definitions are understandable and available to all users.

The DRM framework focuses on 3 areas related to data architecture. These are: Data Description, Data Context and Data Sharing.

drm.png

Each area of the DRM is expressed in terms of certain concepts and their relationships. These concepts and their inter-relations constitute the DRM Abstract Model.

drmam.png

The Data Description area focuses on providing an unambiguous understanding of the data in terms of structure (syntax) and meaning (semantics). Correct and uniform description of data enables the following capabilities in government:

  • Data Discovery – It enables a department to quickly and accurately identify the data required to fulfill its governance objectives (through functions and services). The data may be owned by the department itself or by another department in any level of Government. Data discovery is further strengthened by the categorization, search and query capabilities provided by other areas.
  • Data Sharing and Reuse – The ability to discover data (who is generating/managing what data) and a clear understanding of its meaning ensures that the data can be easily shared and reused in many activities both within and outside the department.
  • Data Harmonization – A uniform way of describing the data through a well-defined model enables different departments to compare the data assets and helps in harmonizing the syntax and semantics of the data assets; a useful outcome of this would be the creation of common entities which can be used across departments.

amdd.png

The abstract model of Data Description depicts the concepts used to describe the Data Description area and their relationships. Two aspects of data description need to be captured:

  • The metadata (data about data) and
  • The mechanism for storing the metadata.

Sections below detail the Metadata and Data Standards.

Any data asset can be classified as structured, semi-structured or unstructured. Semi-structured or unstructured data includes textual material, multimedia files and similar content. The metadata should accordingly be captured in two forms:

  • As logical data models for describing structured data and
  • As Digital Data Resource metadata for describing semi-structured and unstructured data (using standards such as the Dublin Core metadata standard)

The structured data would invariably be represented in the Data Architecture as an Entity-Relationship diagram. The Digital Data Resource would be captured as metadata records.
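The two forms of metadata described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity`, `Attribute` and `make_resource_record` names and the sample values are hypothetical and do not come from the IndEA document. The only external fact relied on is the list of the 15 elements of the Dublin Core Metadata Element Set.

```python
from dataclasses import dataclass, field

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


@dataclass
class Attribute:
    """One attribute of an entity in a logical data model (hypothetical shape)."""
    name: str
    data_type: str


@dataclass
class Entity:
    """Structured data: an entity with attributes and named relationships."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)
    # Maps a relationship role to the name of the target entity.
    relationships: dict[str, str] = field(default_factory=dict)


def make_resource_record(**elements: str) -> dict[str, str]:
    """Semi-structured or unstructured data: a Dublin Core style record.

    Rejects any key that is not one of the 15 Dublin Core elements.
    """
    unknown = set(elements) - DUBLIN_CORE_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return dict(elements)


# Hypothetical examples, for illustration only.
citizen = Entity(
    name="Citizen",
    attributes=[Attribute("citizen_id", "string"), Attribute("date_of_birth", "date")],
    relationships={"resides_at": "Address"},
)
circular = make_resource_record(
    title="Circular on land records",
    creator="Revenue Department",
    format="application/pdf",
    language="en",
)
```

The entity corresponds to one box in an Entity-Relationship diagram, while the record describes a document whose internal structure is not modelled at all.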


Data context is any information that gives additional meaning to the data: the nature of the data (its category), the organization responsible for creating or managing it, the business process that created it, and so on. Together with the data description, the data context lets a potential consumer discover the data (if a data discovery service is available) and understand the context in which it was created, so that the consumer can decide whether the data is relevant for the intended purpose.

The Data Context provides important information to potential consumers of data so that they can take an informed decision on whether the data is appropriate to the specific context in which they wish to use it. It may be noted that the data context of a data asset should always be defined from the perspective of the owner (steward) department.
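As a rough sketch of how such context might support discovery, the snippet below records a category, a steward and a creating business process for each data asset and filters a catalog by category. The `DataContext` class, the `discover` function and all catalog entries are hypothetical; the IndEA document does not prescribe this structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContext:
    """Context recorded by the steward department for one data asset."""
    asset: str
    category: str          # nature of the data (a taxonomy topic)
    steward: str           # department responsible for creating/managing it
    business_process: str  # process that created the data


def discover(catalog, category):
    """Return the context records a consumer would review for one category."""
    return [ctx for ctx in catalog if ctx.category == category]


# Hypothetical catalog entries, for illustration only.
catalog = [
    DataContext("land_parcels", "Land Records", "Revenue Department", "Mutation of title"),
    DataContext("birth_register", "Civil Registration", "Health Department", "Birth registration"),
    DataContext("crop_survey", "Land Records", "Agriculture Department", "Seasonal crop survey"),
]

for ctx in discover(catalog, "Land Records"):
    print(f"{ctx.asset}: steward={ctx.steward}, created by '{ctx.business_process}'")
```

Each record is written from the steward's perspective, so a consumer reading it learns who to approach and how the data came into being before deciding whether to use it.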

amdc.png

Data sharing is the use of information by one or more consumers that is produced by a source other than the consumer. Data sharing has two stakeholders: the data supplier/producer and the data consumer. There is always exactly one data supplier/producer, but there may be one or more data consumers. The consumers of data produced by a Government department may be other divisions within the same department, other departments within Government, and external stakeholders such as citizens, businesses and NGOs.
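The one-supplier, one-or-more-consumers rule can be made concrete with a small validation sketch. The `ExchangePackage` class below borrows its name from the key concept mentioned for the Data Sharing area, but its fields, checks and sample values are assumptions made for illustration, not the structure defined by the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExchangePackage:
    """One exchange: a single supplier, its consumers and the entities sent."""
    name: str
    supplier: str
    consumers: tuple[str, ...]
    payload_entities: tuple[str, ...]

    def __post_init__(self):
        # Exactly one supplier is enforced by the field being a single string.
        if not self.supplier:
            raise ValueError("an exchange needs a supplier")
        if not self.consumers:
            raise ValueError("an exchange needs at least one consumer")
        # Shared data is produced by a source other than the consumer.
        if self.supplier in self.consumers:
            raise ValueError("a consumer must differ from the supplier")


# Hypothetical exchange, for illustration only.
package = ExchangePackage(
    name="LandOwnershipExtract",
    supplier="Revenue Department",
    consumers=("Agriculture Department", "Registered banks"),
    payload_entities=("LandParcel", "Owner"),
)
print(f"{package.supplier} -> {', '.join(package.consumers)}")
```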

amds.png
drmorm.png

The nature of an attribute of a data entity determines the security practices to be followed at each stage in which the data is handled: data capture, transport over the network, storage, sharing, and display in the public or private domain.
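One way to picture this is a lookup from the nature of an attribute and the handling stage to a security practice. The sketch below is purely illustrative: the three sensitivity levels and every control named in the table are assumptions chosen for the example, not requirements stated in the IndEA document.

```python
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical classification of an attribute's nature."""
    PUBLIC = "public"
    PERSONAL = "personal"
    SENSITIVE = "sensitive"


# Handling stages named in the text above.
STAGES = ("capture", "transport", "storage", "sharing", "display")

# Hypothetical policy table: one practice per sensitivity level and stage.
CONTROLS = {
    Sensitivity.PUBLIC: {
        "capture": "validate input", "transport": "encrypt in transit",
        "storage": "integrity checks", "sharing": "open", "display": "show in full",
    },
    Sensitivity.PERSONAL: {
        "capture": "record consent", "transport": "encrypt in transit",
        "storage": "encrypt at rest", "sharing": "authorized consumers only",
        "display": "mask",
    },
    Sensitivity.SENSITIVE: {
        "capture": "restricted operators", "transport": "encrypt in transit",
        "storage": "encrypt at rest", "sharing": "case-by-case approval",
        "display": "hide",
    },
}


def controls_for(sensitivity, stage):
    """Look up the practice for an attribute's sensitivity at one stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown handling stage: {stage!r}")
    return CONTROLS[sensitivity][stage]


# Hypothetical attribute: a date of birth treated as personal data.
for stage in STAGES:
    print(f"date_of_birth @ {stage}: {controls_for(Sensitivity.PERSONAL, stage)}")
```

Keeping the policy in a table rather than in scattered conditionals means the same attribute classification drives every stage consistently.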

drmsrm.png

Source: The IndEA Framework V1.0 document (IndEA Framework).