
Mastering Privacy Data Inventory in the Age of IoT: A Guide for Businesses

Updated: Jul 20, 2024



What is a data inventory?


Generally speaking, an inventory is a collection of things that are important to a business. For example, retail businesses maintain inventories of the products on their store shelves or in their stock rooms, and they may track inbound product orders from their distributors. Manufacturing companies track just-in-time raw material orders required for the manufacturing process, or parts needed for final assembly. Finished goods waiting for delivery represent income and are tracked carefully.

Traditionally, when questioned about a data inventory, Information Technology teams will rattle off a list of systems they support. Oftentimes this information is a simple list of application names stored in an Excel spreadsheet, and it answers the question “What systems do you support?” The systems list may also reflect which support team or department is responsible for each system, and it may be used for technical security checks, patching, or routine monitoring performed on those systems.

The definition of a system inventory can vary with the perspective of each team member. A system architect may think of a system as the hardware platform and operating system, while a software engineer may consider the source code or application that resides on the hardware. Database administrators care less about the system hardware and user interface software and are more concerned with the data that is at the heart of the system.

The notion of a system inventory as a collection of things important to a business has changed in recent years. No longer can companies simply describe their inventories as a list of systems. New disciplines and compliance obligations require an understanding of the data in use as well as the metadata associated with the data they process. Metadata, or “attributes,” is characteristic information associated with the data in the system. Metadata may include where the data was sourced from, the processes the data is being used for, where the data is being stored or transferred to, and a large number of other descriptors or attributes relevant for meeting many regulatory requirements.
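As a rough illustration of what tracking metadata alongside the data can look like, the sketch below models one inventory entry in Python. The class name, field names, and sample values are all hypothetical and chosen only to mirror the attributes listed above; a real inventory would carry many more descriptors.

```python
from dataclasses import dataclass, field


@dataclass
class DataElementRecord:
    """One inventory entry: a data element plus a few metadata attributes.

    All field names here are illustrative assumptions, not a standard schema.
    """
    name: str                                               # e.g. "email_address"
    source: str                                             # where the data was sourced from
    purposes: list[str] = field(default_factory=list)       # processes that use the data
    stored_in: list[str] = field(default_factory=list)      # where the data is stored
    transferred_to: list[str] = field(default_factory=list) # where it is transferred to


def missing_metadata(record: DataElementRecord) -> list[str]:
    """Return the names of required metadata fields that are still empty."""
    gaps = []
    if not record.source:
        gaps.append("source")
    if not record.purposes:
        gaps.append("purposes")
    if not record.stored_in:
        gaps.append("stored_in")
    return gaps


# Hypothetical entry whose storage location has not been documented yet.
email = DataElementRecord(
    name="email_address",
    source="web signup form",
    purposes=["account login"],
)
print(missing_metadata(email))
```

A simple completeness check like this is one way to see which entries still need attention before the inventory is relied on for compliance reporting.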

Regulatory Drivers  


Since the term “The Internet of Things” (IoT) was coined in 1999, connected devices have given rise to a vast array of new kinds of data that never existed before. Before the advent of IoT and connected devices, companies processed data on internally hosted systems, and that data was primarily transactional in nature. Data was used to perform the primary functions a business needed to operate. It was concerned with account numbers, part numbers, order numbers, details of transactions, and other data core to the company’s business. It ran on large mainframe or midrange systems and could only be shared via dedicated FTP or EDI transactions. As the World Wide Web grew and its capabilities became more data-centric, everything changed. Data could easily move between web pages that served increasing amounts of data to a growing audience of business and consumer users. The digital wave of IoT gave birth to new businesses, new data types, new data attributes, and new regulatory and compliance risks.

In 1970, the Fair Credit Reporting Act and the Unsolicited Credit Card Act required organizations to recognize credit card data as a unique data type and apply the controls required to protect it. In 1996, the Health Insurance Portability and Accountability Act (HIPAA) clearly defined health information with similar control requirements for providers, payers, and clearinghouses in the healthcare industry. The “P” in HIPAA stands for Portability. Regulators saw the future as one where more and more health data would be shared among healthcare professionals. These regulations established a need for companies to better understand the types, sensitivities, and handling and sharing requirements of the data they processed. From these roots, regulators continue to expand the protections around personal information, creating an ever-growing regulatory environment filled with requirements that privacy teams must monitor. As a result, privacy teams need new views of the data in their systems, its related attributes, and how it is being processed.


Today, a data inventory, or Privacy Inventory, that considers not just the system housing the data but the data throughout its entire lifecycle is a requirement.


To monitor the new digital lifecycle successfully, certain characteristics of the data being processed need to be understood and tracked. In recent years attribute tracking has become a more popular feature of privacy platforms. By monitoring and tracking a variety of data elements used in the collection, processing, storage, sharing, and disposal of data, more mature levels of data protection have begun to emerge. Attribute tracking also allows privacy professionals to fully document their organization's processing of personal information as required by the EU General Data Protection Regulation (GDPR) in Article 30. 

Assigning and tracking attributes creates an organized understanding of the data that a company maintains. Building a Privacy Inventory on an organizational model of these components makes it easier to understand all of the inventory's moving parts.


The Privacy Inventory Model 


This Privacy Inventory model is made up of five levels that seek to document a deeper understanding of a company's processing of personal information. By asking foundational questions we can gather the relevant data we need.  

  • Where is data stored? 

  • What data elements do I process? 

  • How sensitive is the data I have?

  • Why am I collecting this data?

  • Whose data do I process? 

The answers to these questions form the privacy inventory, which is organized in the following five levels.


Level One - Where is my data stored? 


The initial level describes the systems and repositories where data is stored. There is a fair amount of diversity in Level One, as it accounts for both structured and unstructured repositories. Systems, applications, and databases may be included, as well as network share drives, cloud storage, and even third-party vendor platforms. Shared enterprise repositories and proprietary systems are often known only by their common internal names, so identifying all Level One assets can be a complex task.





Level Two - What data elements do the systems contain? 


The goal of Level Two is to understand the data elements in each of the systems, applications, or datasets listed in Level One. Additional components of Level Two may include the connections or mappings to other systems and the data elements contained and shared in each system. This would include sharing with 3rd parties outside of the organization, as well as connections among the internal systems.  
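One way to picture Level Two is as two small lookup structures: which elements each Level One system holds, and which connections exist between systems. The Python sketch below is illustrative only; every system name, element name, and flow in it is made up, and it assumes a flow can expose any element held by its source system.

```python
# Hypothetical Level One systems mapped to the data elements they hold.
elements_by_system = {
    "crm": {"name", "email_address", "phone_number"},
    "billing": {"name", "account_number"},
    "hr_share_drive": {"name", "national_id"},
}

# Hypothetical connections: (source system, destination, is_third_party).
flows = [
    ("crm", "billing", False),
    ("crm", "email_vendor", True),
    ("hr_share_drive", "payroll_vendor", True),
]


def systems_holding(element: str) -> list[str]:
    """List every system that contains the given data element."""
    return sorted(s for s, elems in elements_by_system.items() if element in elems)


def third_party_exposure() -> dict[str, set[str]]:
    """Map each third-party destination to the elements its source systems hold.

    This is an upper bound: a real flow may share only a subset of them.
    """
    exposure: dict[str, set[str]] = {}
    for source, destination, is_third_party in flows:
        if is_third_party:
            exposure.setdefault(destination, set()).update(elements_by_system[source])
    return exposure


print(systems_holding("name"))
print(third_party_exposure())
```

Keeping internal connections and third-party sharing in the same structure makes it straightforward to answer both "where does this element live?" and "who outside the organization might receive it?".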


Level Three - Which pieces of data are important? 


Although all data is important, some elements carry greater sensitivity and require special handling controls. In Level Three we begin to define the data characteristics, or attributes, used to track the data type, data categories, classification levels, and special handling and sharing requirements for a particular data element. In addition to standard classification levels like Public, Internal, Confidential, and Sensitive, further attributes can be assigned to indicate data types, such as PHI for health information, IP for Intellectual Property, and ACP for Attorney-Client Privileged, along with company-specific attributes. Assigning and tracking the base data attributes allows for clearer application of technical and administrative controls.
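The classification levels and data-type tags described above can be sketched as a small lookup in Python. This is a minimal illustration, not a prescribed scheme: the element names are invented, and ranking Sensitive above Confidential is an assumption that an organization's own policy may order differently.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Classification levels from the text; the numeric ordering is an assumption."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    SENSITIVE = 4


# Hypothetical Level Three attributes: (classification, data-type tags).
attributes = {
    "product_name": (Classification.PUBLIC, set()),
    "diagnosis_code": (Classification.SENSITIVE, {"PHI"}),
    "source_code": (Classification.CONFIDENTIAL, {"IP"}),
    "legal_memo": (Classification.INTERNAL, {"ACP"}),
}


def needs_special_handling(element: str,
                           threshold: Classification = Classification.CONFIDENTIAL) -> bool:
    """True when the element meets the threshold or carries any data-type tag."""
    level, tags = attributes[element]
    return level >= threshold or bool(tags)


for element in attributes:
    print(element, needs_special_handling(element))
```

Treating any tagged element as requiring special handling, regardless of its classification level, is a deliberately conservative design choice in this sketch.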


Level Four - Why do I have this data? 


In Level Four we continue defining the data being processed and document the purpose of processing personal data. This level originated from the GDPR Article 30 requirement to document all processing of personal information. Level Four information includes details on data ownership, reasons for processing, data retention lengths, and other organizations with which the data is shared and by which it is processed.
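The Level Four details named above (ownership, reason for processing, retention length, and sharing) fit naturally into a small record type. The Python sketch below is a simplified assumption of what such a record might hold; it is not a complete Article 30 record, and the field names and sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ProcessingRecord:
    """Illustrative Level Four entry for one data element."""
    element: str
    owner: str                        # business owner of the data
    purpose: str                      # reason for processing
    retention_days: int               # how long the data may be kept
    shared_with: tuple[str, ...] = () # other organizations receiving the data


def past_retention(record: ProcessingRecord, collected_on: date, today: date) -> bool:
    """True when data collected on `collected_on` has exceeded its retention period."""
    return today > collected_on + timedelta(days=record.retention_days)


# Hypothetical record: newsletter emails kept for one year.
newsletter = ProcessingRecord(
    element="email_address",
    owner="Marketing",
    purpose="newsletter delivery",
    retention_days=365,
    shared_with=("email_vendor",),
)
print(past_retention(newsletter, date(2023, 1, 1), date(2024, 7, 20)))
```

Recording the retention length next to the purpose lets the same inventory drive disposal reviews as well as documentation.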





Level Five - Whose data do we have?

Level Five of the data privacy inventory focuses on the data subject and the personally identifiable data we have on them. Can the organization look across all the systems and repositories where data is stored and accurately identify the data of one individual? Based on the organization’s relationship with the data subject, can we positively confirm the identity of a person making an individual rights request? And can we then tie that verified identity to all the data being processed about that person? Accurate Level Five data directly supports the ability to fulfill data subject rights requests.
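To make the Level Five question concrete, the Python sketch below gathers everything held about one individual across several repositories. It rests on a strong simplifying assumption: that every system stores the same `subject_id` for a person. In practice that shared key often does not exist and identities must be matched across systems first. All system names and records are invented.

```python
# Hypothetical per-system records keyed by a shared subject identifier.
records_by_system = {
    "crm": [{"subject_id": "S-100", "email": "ana@example.com"}],
    "billing": [
        {"subject_id": "S-100", "account_number": "A-1"},
        {"subject_id": "S-200", "account_number": "A-2"},
    ],
    "support_tickets": [{"subject_id": "S-200", "ticket": "T-9"}],
}


def find_subject_data(subject_id: str) -> dict[str, list[dict]]:
    """Collect every record held about one verified data subject, grouped by system."""
    found: dict[str, list[dict]] = {}
    for system, records in records_by_system.items():
        matches = [r for r in records if r["subject_id"] == subject_id]
        if matches:
            found[system] = matches
    return found


print(find_subject_data("S-100"))
```

The result groups records by system, which is the shape a rights request response typically needs: what is held, and where.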


Data Privacy Inventory Pyramid



In addition to assisting in meeting compliance requirements, documenting, monitoring, and updating key data attributes throughout the data lifecycle in a sustainable manner leads to a greater awareness of a company's processing activities and drives higher levels of data quality.    


Ongoing Maintenance and Use 


Sustaining an accurate Privacy Inventory requires ongoing processes that are focused on collecting data about the processing of personal information in an organization. By continually documenting changes to the data environment and the use of PII, accurate views of processing can be maintained. Two common ways many organizations maintain inventories are by the use of Privacy Impact Assessments, or through tollgate processes that review and authorize new deployments or significant changes to existing applications. 

Privacy Impact Assessments (PIAs) are used to gather information from business and IT owners about the systems and processes they are responsible for that utilize personal information. Privacy teams will require the business and IT groups to complete PIAs for the applications they support. PIA requirements can call for an initial assessment, as well as additional assessments when a significant change is made to the application. Some companies require annual PIA updates. The PIA will provide system, attribute, and privacy risk information.
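The three PIA triggers described above (an initial assessment, a significant change, and an annual update) can be expressed as a short rule. The Python sketch below is one possible reading of those triggers; the application names, dates, and the 365-day default are assumptions for illustration.

```python
from datetime import date

# Hypothetical log: application name -> date its last PIA was completed.
last_pia = {
    "crm": date(2024, 3, 1),
    "billing": date(2022, 11, 15),
}


def pia_due(app: str, today: date, significant_change: bool = False,
            max_age_days: int = 365) -> bool:
    """Return True when an application needs a new or refreshed PIA."""
    completed = last_pia.get(app)
    if completed is None:
        return True  # no initial assessment on file
    if significant_change:
        return True  # a significant change triggers a reassessment
    return (today - completed).days > max_age_days  # annual update rule


today = date(2024, 7, 20)
for app in ("crm", "billing", "new_portal"):
    print(app, pia_due(app, today))
```

Organizations that do not require annual updates could pass a larger `max_age_days` or drop that rule entirely.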

The Software Development Life Cycle (SDLC) run in many traditional IT departments can also be leveraged to gather and track information about the processing of personal information. In many SDLC models, the tollgates in the Planning and Analysis phases provide opportunities to collect information on the use of PII and the associated risks. By reviewing system and application changes before a new design is completed, privacy teams can gain a better understanding of how the processing of PII will change and the potential risk the change may produce.


Challenges with manual collection


Manually executed inventory surveys, PIAs, and tollgate reviews have been the primary way organizations collect and monitor information about the systems and data they process. All of these methods add value, but they share a common shortfall: they all require humans to answer questions. Sending out surveys to answer complicated questions is not a formula for success in many organizations.

Survey forms often come back with incomplete or incorrect answers, or are not returned at all. If there is not a strong culture for surveys within the organization, a survey is likely never to be completed and returned, leaving privacy teams few options for understanding PII use in their company. If the designated recipient starts the survey but does not know an answer and looks for help, some platforms allow the form to be shuttled to another person who may know the answer, further drawing out an already complicated process. Often the privacy officer will have to follow up to clarify or complete answers, creating additional work in gathering and maintaining privacy inventory data.


Automated scanning: The new dawn


Until recently most privacy platforms contained electronic versions of manual forms combined with some basic workflow capabilities but still relied on humans to understand and define the data they processed. Data Loss Prevention (DLP) technologies provided some point-in-time pattern-matching capabilities but ultimately fell short of providing the kind of monitoring needed to supplant humans as data collectors in understanding PII processing. 
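For a sense of what point-in-time pattern matching means, the Python sketch below scans a piece of text with two regular expressions. It is a deliberately naive illustration, not how any particular DLP product works; the pattern set is an assumption, and both patterns are simplified enough to miss valid values and to flag invalid ones.

```python
import re

# Simplified, assumed patterns; real detectors use far more robust rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text: str) -> dict[str, list[str]]:
    """Return every pattern match found in the text, grouped by pattern label."""
    hits: dict[str, list[str]] = {}
    for label, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits


print(scan_text("Contact ana@example.com, SSN 123-45-6789."))
```

A scan like this only finds what its patterns describe, and only at the moment it runs, which is the limitation the paragraph above points to.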



PIA with LightBeam

However, in recent years new tools with advanced scanning and AI-driven policies and rule sets have emerged that increase the accuracy and completeness of collecting data while lowering the human effort needed to maintain it. 

AI-driven scanning tools provide continuous PII and compliance scanning. Intent-based crawlers can identify PII in documents, images, and databases and document where they exist, creating an automated privacy inventory that is aware of the unique characteristics of an organization's PII. Tools that continually monitor the organization learn about the documents in the environment and apply rules over the storage and use of those documents while alerting privacy teams when new copies are made.

By applying the power of AI and automated environment scanning, privacy teams can focus on how PII is used in their organization and no longer need to spend time trying to get the information they need and wondering whether what they gathered is accurate. Documentation of processing activities required by some regulations can be produced automatically, and it stays available and up to date because it is part of the data inventory.


There is no doubt that the need to understand the PII that companies process is growing. New uses of data are created every day. New data types challenge old rules and must be evaluated for appropriate use and risk. With better AI-driven tools, privacy teams can focus on the new dynamics of data use and not struggle merely to gather the data needed to complete the risk analysis.

With newer technologies, today's Privacy Inventories can be always on, always accurate, and provide new control over the use of PII without the heavy lift of manual interviews or surveys.

To discover how LightBeam can assist in automating your Privacy workflows with AI, please visit LightBeam.ai to schedule a call with us.
