Data from the people to the people: Combining Ubicomp with HCI for data empowerment

Prologue

People go about their everyday lives performing activities that are recorded either by themselves or by digital devices and services which collect data about their behaviour. Lupton [1] identifies five modes of self-tracking: private, pushed, communal, imposed and exploited. Private self-tracking is usually orchestrated by an individual to achieve a goal or engage in self-improvement, while communal self-tracking is done by individuals to contest a solution or draw attention to a specific issue in the community. In these two cases the users have initiated the self-tracking themselves, control the collection process and gain some value from it. This may not be true in the other cases, where the tracked individual is ‘nudged’ into self-tracking or is not even aware of it. A lot of services state in their Terms of Service that they use the data for improving ‘user experience’, but let’s face it, who reads them? This means that people have lost control of and access to their data and, even worse, ownership of it.

Ubiquitous technology, the penetration of smartphones and context-aware applications make data collection effortless for people. On the one hand this system-driven data enquiry [2] gives people more freedom by reducing the demands on users, but on the other it raises a lot of concerns regarding control, privacy and vulnerability to exploitation. Also, with ubiquitous computing, “presentless” devices/sensors and machine learning, the notion of computing as something that happens between an individual and a device, the second wave of human-computer interaction (HCI), is already starting to become irrelevant. This can be a bit frightening, because to machines people are merely parameters or static numbers in an algorithm, which makes way for ‘algorithmic identities’ [3]. As data processing methods improve, concern is rising in the HCI community about this machine-like view of people and of the interactions between humans and computers. Blackwell [4] discusses the challenges that modern data processing technologies pose for HCI and interaction design. Issues of context and embodiment have always surrounded interaction design [5], and with machine learning (ML) the question is how to apply it while still guaranteeing humane computer interaction.

Self-tracking is by design a ‘selfish’ activity: its aim is usually to achieve personal goals, reach self-awareness, or optimise or improve one’s life. But people do not live their lives alone. Their actions and decisions are influenced by other individuals in their immediate circle or community. In order to make sense of individual lives and quantify them, we also need to know how the people around us are behaving. Because of the selfish nature of self-tracking, most of the available applications limit collaboration to sharing [6, 7] results or learnings with others, to get extra encouragement or to show progress. Rooksby et al. [8] also touched on the social aspect of personal tracking: their participants were sharing their data and even tracking together with their families, friends or coworkers. One participant in their study said that although he and his wife take walks together and track them, he would like to have the same tracker in order to compare the information. In the paper the authors suggest looking at personal tracking as social tracking.

Me and Data

Before joining Open Lab and the Digital Civics program, most of my work revolved around large location information databases and the study of mobile positioning data, transforming mobile operator billing information into meaningful location information. During that period I was involved with different consortiums and projects, where we leveraged modern data processing architectures to extract knowledge about people’s mobility and activities from passive mobile positioning data.

It was there that I developed a fascination with numbers and with how you can unravel and understand life through them. I have been self-tracking for about 5 years now. For example, here are my movements in Estonia from 2010 to 2015, recorded using passive mobile positioning. I have tried multiple applications to record and understand my life, but they all have limitations, and I always end up collecting raw sensor data and then using my own programming skills to extract some knowledge.

The thing that always irritates me is the kind of insight that some applications offer you (e.g. deep sleep, light sleep). What is the meaning of it? What influences it? How can we learn from it and improve our lives? What do we need to build to have the eureka moment? These questions trouble me and keep me awake! So why do we use these self-monitoring apps and gadgets if we don’t get anything useful out of them? Is it because it is a cool thing to do? I hope to find some answers to these questions (or at least more information) by the end of my PhD, by experimenting on myself and unravelling other people’s relationships with their data.

Contextual Data Discovery

As the long introduction revealed, my interest lies in data and in how we can use ubiquitous technology and self-tracking for personal data empowerment and to draw true meaning from numbers. I’m not just talking about presenting people with a calorie count or the number of hours of deep sleep they got, but about putting these numbers in context. Also, instead of starting from a particular problem such as weight loss or sleep deprivation that could be tackled by collecting and processing data, I’m letting the data decide: doing data-driven research. That means thinking about different types of data, how they can be obtained and how they can be connected with other datasets, and then letting people reflect on the collected information and see their lives in relation to others.

Drawing on Rooksby’s [8] social tracking, I would like to take this concept of sharing your numbers to the next level: sharing in context with others, in order to discover insights and obtain knowledge about the community as a whole by looking at how individuals live their lives in it and in relation to others. I believe this contextual data discovery opens up new methods of interpretation and sensemaking which can be more helpful than just giving people a numeric value or measure. True understanding comes when we can make connections between real-life events, interactions with others and these self-tracked datasets. As Richard Wesley Hamming said: “The purpose of computing is insight, not numbers.”
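
As a rough illustration of what sharing numbers in context could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the household, the names, the step counts and the helper `in_context` are hypothetical and do not come from any existing tool or study. It simply places one person's average next to the averages of the people they share data with.

```python
from statistics import mean


def in_context(person, shared):
    """Place one person's average next to the averages of the others in a group.

    `shared` maps a (hypothetical) person name to their self-tracked values.
    """
    own = mean(shared[person])
    # Averages of everyone else who has shared data.
    others = {name: mean(values) for name, values in shared.items() if name != person}
    group = mean(list(others.values()))
    return {
        "own_average": own,
        "group_average": group,
        "difference": own - group,
        # Who in the group has a higher average than this person.
        "above": sorted(name for name, value in others.items() if value > own),
    }


# Made-up household sharing daily step counts for the same three days.
household = {
    "anna": [4000, 6000, 5000],
    "ben": [9000, 7000, 8000],
    "carl": [3000, 3000, 3000],
}
summary = in_context("anna", household)
print(summary)
```

A comparison like this is still only a number; the point of the sketch is that it becomes a starting place for a conversation (for example about who walked together on which day), which is where the qualitative context has to come from.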

Challenges

As always when designing new solutions, we encounter a set of challenges, whether technical, physiological or behavioural. Most of my previous work has revolved around technology: data science, the Internet of Things (IoT) and sensors, application development and much more. Working with large datasets and doing mostly quantitative research, the HCI I encountered before was mostly between me and computers; now it is between groups of people (the users) and computers. The digital civics angle puts things in a whole new perspective. So before going off, developing something and just putting it out there, I need to understand the issues surrounding this area. There is a bigger picture to be looked at, to avoid falling into the trap of using technology to solve today’s problems by creating tomorrow’s!

Challenges to face:

What data do we need to collect?
Privacy and control – ubiquitous technology and algorithmic data generation.
Technical/visualising challenges – how to take advantage of ubiquitous technologies to collect and display these datasets.
Putting things in context – getting people to share their personal data and start collaborating with each other.
Continuity of data – how to get people to collect data and engage in long term usage.

When there is talk of tracking, people immediately start thinking about surveillance, “Big Brother” and somebody spying on them. Privacy is definitely an issue when it comes to recording sensitive information about people’s lives, but I think it is more about people’s fear of having no control over what is recorded and where it is used. As discussed previously, in a lot of cases people have given up control of what is tracked and for what purpose, whether deliberately, unknowingly or simply in order to use some service or device.

To achieve true data empowerment we need to put individuals in control of their own personal data. For this, we first need to notify people of their data trails, then understand the factors that prevent people from collecting data over the long term, and then build tools (together with the public) that let them track themselves and give them the data in a form that they own, control and understand. From this understanding people can reflect on their lives and make sense of them. This will hopefully also elicit behavioural change and ultimately improve their lives and the lives of others around them.

The challenge of doing data-driven research is that you can fall for the data lake fallacy, where you collect all sorts of data without knowing exactly what you are looking for or how to find it. One could argue that this is the point of data lakes: to let in all the data without restrictions or governance, and then use already existing data to draw some meaning out of the new data. But it often ends up being overwhelming, leaving you with just a massive set of numbers and figures without any context. That’s why data should be kept in context. This is also emphasised by Taylor et al. [9], who argue that data should be kept within the context in which it was collected and that it should be constantly questioned: “data […] doesn’t by itself assert things in the world; rather, it helps to surface, assemble, cement and (at times) unravel forms of knowing, ideas, controversies, and so on”. I strongly agree with this notion, but in addition to the challenge of putting things in context, we need to get people to share their data and start collaborating with each other. In order to create this framework of contextual data discovery, people need to establish trust in the system and in each other, which can only be achieved when everything is open and transparent.

In addition to the challenges revolving around trust and privacy, there are issues around the technology (for collecting and visualising) that could help us understand this data, and around the motivation to collect it in the first place. These issues trouble technologists as more and more tools for personal informatics emerge, and at the Ubicomp 2015 conference there was a workshop which explored the issues with self-tracking and tried to come up with new ways of engaging people in self-tracking practices [10].

Ubiquitous, Pervasive and Everywhere But Not for Everyone

With Mark Weiser’s vision of third-generation computing (stated in his influential 1991 article “The computer for the 21st century” [11]) comes the age of ‘calm computing’, where computing and sensors integrate into everything and may appear everywhere, often without a physical presence. With the coming of smartphones and their always-on sensors, we have “seamful” devices which are always tracking and always with us. Coming from a technical background and having built ubiquitous technology [12], it seemed to me that Weiser’s vision is already here and that we can take advantage of all the sensors and ubiquitous technology that surround us to track ourselves and the environment we live in.

Abowd et al. also emphasise the “hacking in the real world” and “do-it-yourself” mentality, which means that research is moving out of the lab setting into the real world and into the hands of people who are not ‘tech-savvy’. This transition had happened in personal computing by the mid-1990s, after HyperCard was introduced, but ubicomp is still searching for its own ‘killer app’.

Personal tracking as we know it today started with communities like Quantified Self and with more tech-savvy people. Now, with more widely available technologies like smartphones, smartwatches and activity trackers (e.g. Fitbit, Jawbone Up, Nike Fuelband), it is moving into the mainstream. Although all of these gadgets are available, people still struggle with personal tracking: they constantly switch between different devices and applications and find it hard to link different datasets and aspects of their lives. For some people these devices serve only as a nice accessory or a pretty watch [8].

Li et al. [2] presented a stage-based model of personal informatics systems, which divides the process into 5 stages: preparation, collection, integration, reflection and action. I like the breakdown of self-monitoring into stages, because in order to make real sense of the data we have to be aware of what we are doing at every step. Robert L. Mercer said: “There is no data like more data!”. While this might be true when building robust machine learning models, I would argue that it is not true when trying to unravel your life using numbers. When data is just collected without proper analysis, it becomes noise. In the end we want insights, not numbers.

Being a true technologist I would say: OK, let’s just put all the data into a machine learning algorithm, or let artificial intelligence (AI) decide what is relevant in the data and give suggestions on how to fix everything and make everybody’s lives better. But while machine learning can be used to recognise activities or draw our attention to something concurrent in the data, there is still a need for human intervention and interaction in order to get true meaning out of it. People are needed to add qualitative input to the data, to define the context in which the activity is performed [5].

I’m a strong believer in technology as a helpful tool for tackling life’s problems and in the systematic, algorithmic processing of data, but I have to agree with Mark Klein, Principal Research Scientist at the Center for Collective Intelligence, that artificial intelligence is not mature enough to do anything more than specialise in little contributions, while the messy problems that are kicking our butt as a species are wicked problems [13] that AI is simply not within light years of being able to help solve. That’s why we also need to harness the biggest resource we have on earth: the collective intelligence of people, or the “World Brain”.

Collective Intelligence and Sensemaking

As mentioned before, self-tracking is a highly selfish and personal activity with the goal of improving one’s life or achieving individual goals. Sites like Quantified Self are good for connecting people who are self-tracking; they can share their experiences with each other and learn from others. Although this is good for improving the quality of tracking and data exploration, or for sharing information about different solutions that can be used for personal data practices, it doesn’t contribute an awful lot to the sensemaking process. The questions still remain: “What do these numbers mean?”, “What influences them?”, “How can we learn from them and improve our lives?”.

Moving away from the individual nature of self-tracking, there is another form of self-logging, communal tracking [14], where people collaborate and collect personal data for a collective cause. These citizen sensing or quantified communities use personal tracking to draw attention to issues in the community or to challenge governmental policies on planning and development. However, I see this process as a bit flawed; it seems a bit like voting: “I’ll cast my vote, give my input, and then I’m done”. Coming back to the concept of data lakes and Taylor’s ‘data in context’, the data has no value if it is stripped of its context. Instead of just collecting this data and processing it in some warehouse in order to present it to government officials who then come up with the solutions, it should be presented back to the people within its context, so that they can reflect, start discussions and begin drawing meaning from it. There are already initiatives like Future city Glasgow which make use of data shared by organizations and people to improve the city and its services. Although it is a good example of opening up data, I believe this collaboration could be pushed even further than sharing; people should also take part in collective sensemaking. This way it would truly serve the cause of digital civics and civic engagement. By including more people and getting more opinions we could ultimately come to better understandings and solutions. Drawing a parallel with the open-source world, there is a saying by Eric Raymond: “Given enough eyeballs, all bugs are shallow”.

Personal Data Empowerment, “for who”?

As Rooksby [8] suggested, we should (re)think personal tracking as social tracking. As I have stressed many times in this blog post, the key to true sensemaking is contextualising. This is all good, but how and where can we apply this concept?

From Li’s [2] perspective lifelogging is always done with a purpose and followed by immediate action; however, studies in the field of personal informatics reveal that in a lot of cases these are not people’s motives [8, 15]. The relationships between people and their digital possessions have been studied before in HCI; for example, Chris Elsden [15] explored how users experience data collection and how they interact with their ‘Quantified Past’. People feel strong ownership of their data and feel that it is important to have control over it. Referring back to Rooksby [8] and co-present tracking, where spouses tracked activities done together and then reflected on them together, having only one tracker on one person raised the issue of not having your own data to compare.

Although I argue that this concept of user-controlled data sharing and contextual data discovery could be used in various scenarios and at different scales, in my future research I’ll be looking at families, multigenerational households and workplaces. These places represent groups of people who have their own lives and routines, which could be recorded by personal tracking devices, but who also have close interactions within their environment. Just as contextual data discovery can give more meaning to experiences and perhaps help individuals achieve a greater level of mindfulness, it could also help them in collective sensemaking, which leads to better decisions.

Conclusion

In this blog post I explored the concept of self-tracking as social tracking, looking at how ubiquitous technologies combined with HCI research can contribute to true sensemaking and data empowerment. The trail of data that people leave behind is enormous, but often it is just collected and never used, or used for the wrong reasons. By utilising modern data processing tools and machine learning methods, control of this data can be given back to people. Emphasising the importance of context, I presented the concept of contextual data discovery, which gives individuals the ability to look at their personal data in context with other people’s data.

The question that drives me is: how can I present people’s data to them in a way that is meaningful to them and helps them improve their lives? In my further research I would like to learn more about people’s engagement with data and how to involve the public in designing data tools and applications. Moreover, I would like to see whether these tools prompt people to start contributing to the digital public space which provides the technical infrastructure for digital civics.

References

[1] Lupton, D. (2014). Self-tracking Modes: Reflexive Self-Monitoring and Data Practices.
[2] Li, I., Dey, A., & Forlizzi, J. (2010). A stage-based model of personal informatics systems. Proceedings of the 28th International Conference on Human Factors in Computing Systems CHI 10, 557. http://doi.org/10.1145/1753326.1753409
[3] Cheney-Lippold, J. (2011). A new algorithmic identity: soft biopolitics and the modulation of control. Theory, Culture & Society, 28(6), 164-181.
[4] Blackwell, A. F. (2015). Interacting with an Inferred World: The Challenge of Machine Learning for Humane Computer Interaction.
[5] Dourish, P. (2004). What We Talk About When We Talk About Context. Personal and Ubiquitous Computing, 8(1), 19-30. http://doi.org/10.1007/s00779-003-0253-8
[6] Quantified Self Tools. Available from: <http://quantifiedself.com/guide/tools>. [Accessed: 28 December 2015].
[7] Personal Informatics Tools. Available from: <http://www.personalinformatics.org/tools/>. [Accessed: 28 December 2015].
[8] Rooksby, J., Rost, M., Morrison, A., & Chalmers, M. C. (2014). Personal tracking as lived informatics. Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems – CHI ’14, 1163–1172. http://doi.org/10.1145/2556288.2557039
[9] Taylor, A. S., Lindley, S., Regan, T., & Sweeney, D. (2015). Data-in-Place: Thinking through the Relations Between Data and Community. CHI 2015, Crossings, 2863–2872. http://doi.org/10.1145/2702123.2702558
[10] Frontiers of QS. Available from: <https://frontiersqs.wordpress.com/>. [Accessed: 10 December 2015].
[11] Weiser, M. (1991). The computer for the 21st century. Scientific American.
[12] Puussaar, A. (2014). Indoor Positioning Using WLAN Fingerprinting with Post-Processing Scheme, Tartu Ülikool.
[13] Rittel, H. W. J., & Webber, M. M. (1973). Dilemmas in a General Theory of Planning. Policy Sciences, 4, 155-169.
[14] Gabrys, J. (2014). Programming environments: Environmentality and citizen sensing in the smart city. Environment and Planning D: Society and Space, 32(1), 30–48. http://doi.org/10.1068/d16812
[15] Elsden, C., & Kirk, D. S. (2014). A quantified past: remembering with personal informatics. In Proceedings of the 2014 companion publication on Designing interactive systems (DIS Companion ’14). ACM, New York, NY, USA, 45-48. http://dx.doi.org/10.1145/2598784.2602778