5.0 Health data management
Primary domain 5, health data management, is shown in this section.
Data and its management form the foundation of more advanced technologies and methods. Digital healthcare systems and AI models depend on the data that is available to them, so the management and processing of healthcare data is fundamental to how they function. Data drives and underpins the accuracy of the algorithms, analysis methods and tools used to make data-derived or data-assisted decisions. This includes where and how data is stored, as well as data flows and how they are applied to various pathways (for example, patient journeys). We all share a responsibility to ensure data quality throughout these processes, from initial collection and storage through to application in systems and services.
5.1 Data management and processing
An understanding of data stewardship is important to ensure that good quality data is used as input for these systems to generate an appropriate output. This includes awareness of data sharing, governance and regulatory issues as well as the more technical aspects of data storage, security and processing.
5.1.1 Data collection and context
How data is collected and the context of that data are vital for interpreting the output of data analysis methods and deriving useful and actionable insight from data.
Capability statement - I am aware of different sources of health and social care data (for example, Office for National Statistics (ONS) data, census data, patient registries, hospital episode statistics) and how to access them, including ethical and information governance requirements
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am proficient in the use of standardised, validated tools for data collection and extraction, related to my area of specialism
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am proficient at entering data into digital systems and devices accurately, completely and correctly and recognise the importance of this for data-driven digital healthcare
Archetypes:
- User
Capability statement - I recognise the importance of understanding the health and care data context (how and why it was collected) and the effect on subsequent data interpretation
Archetypes:
- Driver
- Creator
- Embedder
Capability statement - I am aware of sources of data bias and how biases may affect data interpretation or predictive models, and that this can reinforce existing societal inequalities
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I can explain the key differences between structured and unstructured data with relevant examples related to my area of expertise
Archetypes:
- Driver
- Creator
- Embedder
Capability statement - I am aware of opportunities and challenges of "big data" (for example, volume, variety, velocity and veracity)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
Capability statement - I am able to develop (and use) metadata standards, taxonomies, ontologies and classifications for the storage and retrieval of data, to ensure interoperability and standardisation
Archetypes:
- Creator
- Embedder
Capability statement - I am proficient in assessing the reliability of data and the appropriateness of tools used to produce or collect data (for example, databases vs Excel)
Archetypes:
- Creator
- Embedder
Capability statement - I actively support users with compliance with research requirements relating to data collection. I support them in building competence in these areas
Archetypes:
- Driver
- Embedder
- User
5.1.2 Data storage
How data is stored securely and how long it is retained are important for the maintenance and accessibility of data as well as the speed and ease of data retrieval and sharing.
Capability statement - I am aware that data can be stored locally (local storage on servers) or remotely (for example, cloud storage requiring an internet connection)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
Capability statement - I am aware of different systems used to store health and clinical data (for example, Electronic Health/Medical Records, registries, patient generated data sources, genomics databases, secondary data sources) and the impact of this on subsequent data analysis and access (for example, retrieving data from different database systems)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am aware of the data storage requirements for my organisation and adhere to these requirements (for example, where to store data, how to deidentify data and which formats to use)
Archetypes:
- Driver
- Creator
- Embedder
- User
Capability statement - I am proactive in the maintenance of data to ensure its integrity, including the backing up/archiving of data and the options available in my workplace
Archetypes:
- Embedder
- User
Capability statement - I am aware that data can be stored in different types of database (for example, relational databases, document-based databases, graph databases)
Archetypes:
- Creator
- Embedder
Capability statement - I can design systems and infrastructure for data storage with a focus on accessibility, privacy and security
Archetypes:
- Creator
- Embedder
5.1.3 Data visualisation and reporting
Tailoring the presentation and communication of data to multiple stakeholders with different levels of data literacy is important for decision making, research and resource allocation.
Capability statement - I am proficient at interpreting information presented in a variety of commonly used visualisations (for example, bar charts, histograms, pie charts, scatter plots)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I can use interactive data dashboards to view summaries of data in my domain of expertise
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am proficient at interpreting and summarising data from dashboards and other tools relating to my area of expertise
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am capable of identifying suitable methods of visualisation for different data types
Archetypes:
- Driver
- Creator
- Embedder
Capability statement - I can evaluate the quality of data to ensure it is fit for purpose for reporting and communicating findings, and only report on data that meets this standard
Archetypes:
- Driver
- Creator
- Embedder
Capability statement - I am proficient at interpreting information presented in a variety of specialised visualisations relevant to my field of practice (for example, Kaplan-Meier curves, OncoPrint, Circos)
Archetypes:
- Creator
- Embedder
- User
Capability statement - I can create data dashboards to summarise, visualise and present data to a variety of stakeholders
Archetypes:
- Creator
- Embedder
Capability statement - I can create a range of visualisations for a variety of audiences (for example, technical and lay audiences) to present data visually for data exploration and reporting using appropriate visualisation design theories
Archetypes:
- Creator
- Embedder
5.1.4 Data processing and analytics
To extract actionable insights from data, it must first be processed and analysed. This involves transforming data into a format to which analysis methods can be applied, and choosing which method to apply based on the characteristics of the data and the intended goals.
Capability statement - I am aware of the importance of data provenance, data transparency and audit
Archetypes:
- Creator
- Embedder
Capability statement - I have an awareness of information governance processes and procedures when dealing with organisations external to the NHS and adhere to local guidance when dealing with external entities
Archetypes:
- Creator
- Embedder
Capability statement - I am aware of health data sets available to me in my area and the types of clinical questions that could be answered with these data. I am aware that public datasets may have been processed and that this may make them unsuitable for certain applications.
Archetypes:
- Creator
- Embedder
Capability statement - I am able to critically analyse a health dataset in terms of the clinical questions it could answer and develop a data analysis strategy (for example, a plan of how the data could be analysed to answer these questions)
Archetypes:
- Creator
- Embedder
Capability statement - I am aware of and know how to access secure and trusted IT resources for tasks like high performance computing and cloud-based data storage
Archetypes:
- Shaper
- Creator
- Embedder
Capability statement - I recognise and promote the use of common data standards where appropriate to store and share data
Archetypes:
- Creator
- Embedder
Capability statement - I can query healthcare databases applying analytical tools to analyse large datasets for audit and research purposes
Archetypes:
- Driver
- Creator
- Embedder
Capability statement - I promote the use of data provenance for data transparency and audit, and awareness of how this can impact on subsequent decisions made using data
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - As someone working with data for research, monitoring or process improvement purposes, I know how to effectively deidentify/pseudonymise data
Archetypes:
- Creator
- Embedder
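As an illustration of the pseudonymisation capability above, the following is a minimal Python sketch that replaces a direct identifier with a keyed hash (HMAC-SHA256). It is one possible approach rather than a recommended method. The field names, the example records and the key are all hypothetical, and any real use would need to follow local information governance requirements. Because whoever holds the key could re-link the records, this is pseudonymisation and not anonymisation.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256) of an identifier as a hex string."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only. A real key would be generated
# securely and held separately from the data it protects.
EXAMPLE_KEY = b"example-key-not-for-real-use"

# Made-up records; "patient_id" and "age_band" are assumed field names.
records = [
    {"patient_id": "P001", "age_band": "40-49"},
    {"patient_id": "P002", "age_band": "70-79"},
    {"patient_id": "P001", "age_band": "40-49"},
]

# Drop the direct identifier and keep only the pseudonym.
pseudonymised = [
    {"pseudo_id": pseudonymise(r["patient_id"], EXAMPLE_KEY), "age_band": r["age_band"]}
    for r in records
]
```

The same identifier always maps to the same pseudonym under a given key, so records for one person can still be grouped without exposing the original identifier.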
Capability statement - I am able to filter data to derive a subset of interest for further processing/analysis using statistical/programming tools (for example, SPSS, Excel, Python, R)
Archetypes:
- Creator
- Embedder
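As an illustration of filtering data to derive a subset of interest, the following is a minimal Python sketch using only the standard library. The column names, values and filter criteria are made up for the example and do not come from any real dataset.

```python
import csv
import io

# Made-up example data, standing in for an extract from a real system.
RAW = """record_id,age,site,hba1c_mmol_mol
1,34,North,41
2,67,South,58
3,72,North,63
4,59,North,
"""


def filter_rows(rows, site, min_age):
    """Keep rows from one site, at or above a minimum age, with a recorded HbA1c value."""
    return [
        r
        for r in rows
        if r["site"] == site
        and int(r["age"]) >= min_age
        and r["hba1c_mmol_mol"] != ""
    ]


rows = list(csv.DictReader(io.StringIO(RAW)))
subset = filter_rows(rows, site="North", min_age=50)
```

Here only record 3 is kept: record 1 is below the age threshold, record 2 is from a different site, and record 4 has no recorded value. Noting which records were excluded and why supports the provenance and audit capabilities described in this section.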
Capability statement - I am confident at applying data linkage and integration to create composite data sets and at recording the steps taken (for example, metadata, audit trail)
Archetypes:
- Creator
- Embedder
Capability statement - I know where to look to find relevant anonymised datasets and trusted analytics services
Archetypes:
- Creator
- Embedder
Capability statement - I understand fundamental statistical principles and can use statistical and software tools (for example, Python, R, SPSS, Tableau, Power BI) to analyse data
Archetypes:
- Creator
- Embedder
Capability statement - I understand the pros and cons of different statistical methods for different analytical tasks
Archetypes:
- Creator
- Embedder
Capability statement - I can take account of data provenance issues when reporting the results of analysis (for example, which populations were included/excluded, the source of the data and its context, and how it has been transformed/filtered prior to analysis)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am confident in the use of Learning Health Systems in daily practice for continual improvement of care, and can factor in generated knowledge to adapt my practice and improve processes accordingly, making decisions based on data
Archetypes:
- User
5.2 Data/cyber security
To ensure public trust and to meet legal requirements, health and clinical data should be protected from loss (or leakage), theft and attack while being processed, stored or shared.
Capability statement - I abide by the organisational regulations and guidelines aimed to prevent data loss and theft, including when data is being stored and transferred
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I understand and comply with cybersecurity standards in my organisation by keeping my training record up to date
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am able to access and act on requirements communicated to me about new security threats (for example, emails from IT personnel asking me to carry out a certain action or be aware of a particular threat)
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I understand the need for data encryption to protect data and use this where appropriate to store and transfer data
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I regularly update systems (for example, installing routine software updates when prompted) to protect systems/equipment from cyber-attacks
Archetypes:
- Embedder
Capability statement - As someone who procures technology (for example, mobile technology, electronic record systems), I ensure that such technology is secure for the purposes of storing and transmitting data
Archetypes:
- Driver
- Embedder
Capability statement - I am proactive in learning about and implementing security protocols and applying recommended data/cyber security standards
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am capable of championing the education of other staff/users in the security issues related to health technology and promoting best practices in maintaining the security of health systems
Archetypes:
- Shaper
- Driver
- Creator
5.2.1 Data privacy and confidentiality
Health and social data contains sensitive information about individuals. The principles of privacy and confidentiality are important for maintaining and improving public trust in organisations collecting and processing data.
Capability statement - I am aware of the need to maintain confidentiality and privacy of health and social care data at all times, respecting the data subject's right to privacy
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I always ensure that the data subject is informed of why their data is collected and stored, who will have access to it and for what purposes
Archetypes:
- Creator
- Embedder
Capability statement - When sharing data, I always ensure that only those with a legitimate reason to view or access the data are included, to prevent sensitive information leaking to those without permission or need to access/view it
Archetypes:
- Driver
- Creator
- Embedder
- User
Capability statement - I am aware of the risks of re-identification of pseudonymised data (for example, using data matching) and can assess and document whether further steps are required to de-identify data
Archetypes:
- Creator
- Embedder
- User
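One simple way to start assessing re-identification risk is to count how many records share each combination of indirectly identifying fields (quasi-identifiers), in the spirit of a k-anonymity check. The Python sketch below is illustrative only: the field names, the example records and the threshold k are assumptions, and a small group size is a prompt for further review rather than a complete risk assessment.

```python
from collections import Counter


def risky_combinations(records, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [combo for combo, n in counts.items() if n < k]


# Made-up pseudonymised records; all field names are hypothetical.
records = [
    {"age_band": "40-49", "postcode_district": "LS1", "sex": "F"},
    {"age_band": "40-49", "postcode_district": "LS1", "sex": "F"},
    {"age_band": "70-79", "postcode_district": "LS2", "sex": "M"},
]

# Combinations that appear only once (k = 2) could single out an individual.
flagged = risky_combinations(records, ["age_band", "postcode_district", "sex"], k=2)
```

In this example one combination is flagged because it appears in a single record. Possible further steps, to be assessed and documented, include using broader bands or suppressing the affected records.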
Capability statement - I am aware of and can apply de-identification (anonymisation) to data and recognise its importance for maintaining the confidentiality of data subjects and sources
Archetypes:
- Shaper
- Driver
- Creator
- Embedder
- User
Capability statement - I am familiar with the architecture and reasoning behind creating Trusted Research Environments (TREs) and apply/use them wherever appropriate
Archetypes:
- Creator
- Embedder
Capability statement - I practise the application of analysis methods to reduce the risk of re-identification of pseudonymised data
Archetypes:
- Shaper
- Driver
- Embedder
- User
Capability statement - I maintain a closed loop consent process with data providers, including citizens, and model openness, transparency and accountability for the use of their data
Archetypes:
- Driver
- Creator
Page last reviewed: 14 February 2023
Next review due: 20 February 2024