Combining Consumer and Public Realm Data

Use Case Developed in the Workshop

The average person checks their phone every 12 minutes, which adds up to roughly 80 times per day. The data generated from these and other digital interactions dwarfs the amount of data currently collected by cities. It provides insights into individuals' lives at a level of detail and breadth never before possible. Individuals must provide consent for all the services used and data collected. However, no one can realistically keep track of all the data collected about them and how it is used. As data from those services is combined, the ability to create super-profiles of individuals' lives and movements increases, and so do the risks of re-identification and commercial misuse.

This body of data, together with its associated commercial uses and re-uses, is in itself a use case for a data trust that can assist consumers with consent and ensure their data is actually used in their best interests. Further, what if consumer behaviour data of such depth could be combined with additional data collected from the public realm?

This case study is not one that we had prepared for the participatory workshop. It was one that emerged from participants who wanted to study "the elephant in the room" - all of the data that is already being collected about us.

As with all of the other use cases outlined, there is the potential for heightened consumer profiling, both for targeted advertising and for physical and digital service customization, based on consumers' health data and their use of energy, buildings or mobility services. While some consumers may welcome that, there is also a risk of marginalizing those unable to participate - or of exploiting those with no choice but to participate. This use case is set within the wider context of a move towards broader societal understanding of, and involvement in, data, and of corporate responsibilities and accountability in the digital age.

How It Works

This wider scope for a trust is defined in terms of the individual and the rights to data about them, rather than being defined by geography. It raises important considerations for how to define the trust and whether a trust can cover such breadth:

  • All third parties collecting any data about those within the geo-fenced area of the trust would be subject to the rules and obligations of the trust. This would include data about those who live in, work in or visit the area.

  • Should that trust have a regulatory role on behalf of the citizens of that geo-area for managing the data those citizens produce? With this wider scope, the trust would be an initial test case for a broader governance mechanism on behalf of consumers.

  • The consumer could be presented with an additional consent option of accept, reject or let the trust manage and assure the selected use of data.

  • Would the trust then be better able to address the competitive disadvantages that new service providers face due to the massive existing consumer profile data held by incumbents?

What We Heard

This level of existing data aggregation and use by private industry is seen as the elephant in the room. As a result, in this use case the trust would have fiduciary responsibility not just for the data collected from new devices in the public realm, but also for data from existing ones (ATMs, phones, cars, Fitbits, etc.). Individuals would be able to sign up for the service the trust offers (opt in), and the trust would have the authority and resources to handle the data on their behalf.

For a service provider to get access to the data gathered from their own devices, they would have to agree to a license agreement or to make their data open. Contractually, that starts to open up an individual's data sets to other service providers, under that individual's control and managed on their behalf through the trust, and to level the commercial playing field.

This model would allow an experiment in how to resolve what is today an impossible situation: each of us, as individuals, having to manage our own data plus the data from new sensors in the environment.