Introduction
The Ministry of Electronics and Information Technology, Government of India (MeitY) published the draft National Data Governance Framework Policy (NDGFP) in May 2022.[1] On 14 June 2022, the Observer Research Foundation participated in a stakeholder interaction with Minister of State for Electronics and Information Technology Rajeev Chandrasekhar.
The NDGFP has proposed the constitution of a new body, to be named the India Data Management Office (IDMO), to manage the India Datasets Programme and develop rules, standards, and guidelines in line with the NDGFP. Building on the objectives and clarifications provided during the stakeholder interaction, this document outlines ORF’s recommendations on the NDGFP for MeitY to consider.
- Composition and accountability mechanisms of the IDMO
- Clear standards for data anonymisation
- Clarification regarding discretionary powers of the IDMO
1. Composition and Accountability Mechanisms of the IDMO
1.1. Composition of the IDMO
The NDGFP provides that the IDMO shall be established under the Digital India Corporation (DIC) under MeitY (Sec. 5.1). The IDMO has been entrusted with a wide range of responsibilities, including framing, managing, and periodically reviewing and revising the policy (Sec. 5.1); developing rules, standards, and guidelines, and designing and managing the India datasets platform (Sec. 5.3); capacity and skill-building (Sec. 6.12); developing a redressal mechanism (Sec. 6.14); implementing and enforcing the NDGFP (Sec. 6.15); and awareness building (Sec. 6.16).
Despite this wide range of responsibilities, the NDGFP is silent on the composition and functioning of the IDMO. The IDMO is not a statutory authority; rather, it will be a department or division under the DIC, a not-for-profit company set up by MeitY. To ensure accountability and transparency, the composition and structure of the IDMO should be specified in the policy itself.
For instance, the India Semiconductor Mission (ISM)[a] has an established management structure[2] that details its key personnel and advisory committee. The structure also allows for the hiring of domain and functional experts and other eminent experts. A similar management structure could be included in the NDGFP to ensure that appropriate technical expertise, as well as stakeholders from industry, academia, and civil society, are included in the work of the IDMO itself, rather than being consulted externally after the fact.
It will also be important to establish how the NDGFP and IDMO will interact with the provisions of the Data Protection Bill, 2021 and the proposed Data Protection Authority (DPA).
1.2. Increasing accountability: Widening the scope of consultations
The NDGFP entrusts the IDMO with the responsibility of developing rules, standards, and guidelines in relation to the following:
- All data/datasets/metadata (Sec. 5.2)
- Data storage and retention framework for ministries/departments (Sec. 6.1)
- Inter-governmental data access (Sec. 6.2)
- Identification of datasets (Sec. 6.4)
- Data anonymisation (Sec. 6.5)
- Data quality and metadata standards (Sec. 6.6)
- Datasets access and availability (Sec. 6.8)
- Disclosure norms (Sec. 6.11)
- Implementation manual (Sec. 6.17)
The NDGFP also provides for consultation in the formulation of rules, standards, and guidelines for data, datasets, and metadata (Sec. 5.2 and 6.6). The policy mandates that the IDMO organise at least two semi-annual consultations and report carding exercises for this purpose. However, these consultations are limited to ministries, state governments, and industry.
The scope of consultation should be widened and other stakeholders such as civil society organisations and educational institutions should be involved. The NDGFP covers all data, personal and non-personal, collected and being managed by any government entity (Sec. 3.1), and therefore inclusive consultations are necessary to minimise inadvertent harm to specific stakeholders.
Consultative mechanisms must be incorporated in the NDGFP itself. For instance, the Pre-legislative Consultation Policy[3] provides a detailed procedure for pre-legislative consultations for any bill to be introduced in Parliament. A similar framework for consultations under the NDGFP should be incorporated to ensure transparency and accountability.
2. Clear Standards for Anonymisation of Data
The NDGFP has proposed creating a repository of datasets and ensuring the safety of these data by anonymising the non-personal data submitted (Sec. 2.2). The rules and regulations to oversee anonymisation are to be rolled out by the IDMO (Sec. 2.3).
2.1. Internal validity of defining anonymisation
To ensure data security, the IDMO needs to define standards for anonymisation and avoid conflating it with other methods such as pseudonymisation. Anonymisation refers to the process of removing all personal identifiers, both direct and indirect, that can assist in the attribution of data to an individual. Pseudonymisation[4] is a process by which data “can no longer be attributed to a specific data subject without the use of additional information.” In other words, pseudonymisation replaces personal identifiers with artificial ones, and the data can be re-attributed by anyone who holds that additional information.
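The distinction can be illustrated with a minimal sketch. The Python below is illustrative only: the record fields, the choice of direct identifiers, the keyed-hash token, and the generalisation rules (ten-year age bands, three-digit pincode prefixes) are all assumptions made for this example and are not drawn from the NDGFP or any IDMO standard.

```python
import hashlib
import hmac

# Hypothetical record, invented for illustration only.
RECORD = {
    "name": "Asha Rao",
    "phone": "9000000001",
    "age": 34,
    "pincode": "110001",
    "diagnosis": "asthma",
}

# Assumption: these two fields are the direct identifiers in this layout.
DIRECT_IDENTIFIERS = ("name", "phone")


def pseudonymise(record, secret_key):
    """Replace direct identifiers with a keyed token; all other fields are kept.

    Anyone holding secret_key (the "additional information") can recompute
    the token from a known name and phone number and re-link the record.
    """
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    raw = "|".join(str(record[k]) for k in DIRECT_IDENTIFIERS).encode("utf-8")
    out["subject_token"] = hmac.new(secret_key, raw, hashlib.sha256).hexdigest()[:16]
    return out


def anonymise(record):
    """Drop direct identifiers and coarsen indirect ones; no token is kept."""
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "pincode_prefix": record["pincode"][:3],
        "diagnosis": record["diagnosis"],
    }


pseudonymised = pseudonymise(RECORD, b"key-held-separately")
anonymised = anonymise(RECORD)
```

In this sketch the pseudonymised record still carries the exact age and pincode alongside a re-linkable token, whereas the anonymised record carries neither a token nor precise indirect identifiers. Coarsening of this kind does not by itself guarantee anonymity; whether the result is adequately anonymised depends on the dataset and on the standard the IDMO eventually adopts.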
Standards for anonymisation need to be periodically updated by the IDMO to move in tandem with the changing requirements of dynamic data and data use cases. In this vein, the IDMO should not rely purely on the more static National Privacy Principles or Personal Data Privacy policies that may be released.
Further, the NDGFP suggests that all datasets will be collated into a large, easily navigable portal to which private sector organisations are also encouraged to contribute (Sec. 5.4). It is necessary, therefore, to peg the definition of ‘anonymised data’ to global standards. Anonymisation requirements must reduce the risk of data being de-anonymised, and provide for countermeasures in cases where de-anonymisation occurs.
2.2. Inspection systems
The IDMO must create mechanisms to hold suppliers of data accountable for appropriate levels of anonymisation of said data before the data is transferred to the IDMO and any sub-authorities.
The NDGFP could take inspiration from the UK Information Commissioner’s Office’s standards for anonymisation,[5] and institute a “motivated intruder” test as a regular check on the robustness and security of anonymised data. As the ICO’s draft guidance puts it: “A useful test to include as part of assessing identifiability risk is whether an intruder would be able to achieve identification if they were motivated to attempt it. This is known as the motivated intruder test.”
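The motivated intruder test is a qualitative assessment, and no single metric reproduces it. As a rough, automatable first screen, an inspection system could count how many records are unique on a chosen set of indirect identifiers, which is a k-anonymity-style check rather than the ICO's test itself. The sketch below assumes a hypothetical dataset and a hypothetical choice of indirect identifiers (`age_band`, `pincode_prefix`); neither comes from the NDGFP.

```python
from collections import Counter

# Hypothetical anonymised records, invented for illustration only.
RECORDS = [
    {"age_band": "30-39", "pincode_prefix": "110", "diagnosis": "asthma"},
    {"age_band": "30-39", "pincode_prefix": "110", "diagnosis": "diabetes"},
    {"age_band": "60-69", "pincode_prefix": "560", "diagnosis": "asthma"},
]

# Assumption: these are the fields an intruder could match to outside data.
QUASI_IDENTIFIERS = ("age_band", "pincode_prefix")


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers):
    """Return k: the size of the smallest group (0 for an empty dataset)."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def unique_records(records, quasi_identifiers):
    """Return records that are the only member of their group (most exposed)."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    ]


k = smallest_group_size(RECORDS, QUASI_IDENTIFIERS)
exposed = unique_records(RECORDS, QUASI_IDENTIFIERS)
```

Here the third record is the only one in its age band and pincode prefix, so it would be flagged for further generalisation or suppression before the dataset is shared. A screen of this kind could complement, but not replace, a human assessment of what a motivated intruder could realistically achieve.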
3. Clarification Regarding the Discretionary Powers of IDMO
The stated aims of the draft NDGFP are to “provide access to the non-personal and/or anonymised datasets to Indian researchers and start-ups” (Sec. 5.3); “promote transparency […] in non-personal data and datasets access” (Sec. 2e); and ensure “greater citizen awareness, participation and engagement” (Sec. 2.i).
However, the proposed IDMO will have certain discretionary powers, the exercise of which could detract from these stated aims.
3.1. Limits to data requests
The IDMO will reserve the right to “decide whether requesting entities may be allowed access to full databases/datasets or combinations thereof, for their use cases” (Sec. 6.9).
Even if the IDMO were to define a set of acceptable use cases in advance, or to “define the principles for ethical and fair use of data shared beyond the government ecosystem” (Sec. 6.13), its interpretation of the appropriateness of a requesting entity’s use case, or its denial of access on any other basis, could present a challenge. The IDMO’s decision-making process should, therefore, be transparent, and the criteria for denying requests should be clearly set out. These checks would help enhance access to, and citizen engagement with, the Indian datasets platform.
3.2. Usage rights
The Intellectual Property Rights (IPR) ownership of datasets must be recognised and factored into decisions governing their fair and ethical use. The NDGFP should elaborate on this principle.
In its December 2020 report on the Non-Personal Data Governance Framework, the Kris Gopalakrishnan Committee observed that for non-personal data, the individual ownership approach cannot be followed as there are no identifiable data principals.[6] As such, the most discernible expression of rights and ownership over non-personal data is by those who have the IPR for it.[b] However, the committee report also recommended “mandatory data sharing” in certain cases to promote competition, enable startups, or further the public interest.[7]
Keeping these approaches in mind, it should be settled whether the copyright licence of a dataset, the expected purpose of its use, or both should determine the terms of its usage. The NDGFP’s clause on “usage rights” states: “The IDMO may ensure that data usage rights along with permissioned purposes are to be with the data principal” (Sec. 6.10). Greater clarity is required on whether the data principal referred to here is the IPR-holder of the dataset(s) or the entity sharing the dataset(s). The roles and powers of the IDMO and other parties involved in determining usage rights also need to be determined consultatively and clearly defined.
3.3. User charges
The NDGFP states that the IDMO “may decide to charge user charges/fees for its maintenance services.” Monetising access to the datasets access platform or the India datasets repository would be contrary to its intended use as a public platform to catalyse India’s “research and start-up ecosystem” (Sec. 2.2) and promote greater citizen participation and engagement.
Shravistha Ajaykumar is Associate Fellow at ORF’s Centre for Security, Strategy and Technology.
Basu Chandola is Associate Fellow at ORF.
Trisha Ray is Associate Fellow at ORF’s Centre for Security, Strategy and Technology.
Anirban Sarma is Senior Fellow at ORF’s Centre for New Economic Diplomacy.
[a] The India Semiconductor Mission (ISM) is a specialised and independent Business Division under the DIC.
[b] The legal basis for this is Section 2(o) of the Indian Copyright Act (1957), which protects computer databases as ‘literary works’.
[1] Draft National Data Governance Framework Policy as released by the MeitY.
[2] India Semiconductor Mission, “About Us”.
[3] Government of India, Decisions taken in the meeting of the Committee of Secretaries (CoS) held on 10th January, 2014 under the Chairmanship of Cabinet Secretary on the Pre-legislative Consultation Policy (PLCP) (January 2014).
[4] EU General Data Protection Regulation (Regulation (EU) 2016/679), Article 4(5).
[5] UK Information Commissioner’s Office, “Chapter 2: How do we ensure anonymisation is effective?” in Draft anonymisation, pseudonymisation and privacy enhancing technologies guidance (October 2021).
[6] Ministry of Electronics and IT, Report by the Committee of Experts on Non-Personal Data Governance Framework, December 2020.
[7] Ministry of Electronics and IT, Report by the Committee of Experts on Non-Personal Data Governance Framework, December 2020.
The views expressed above belong to the author(s).