Context of Human Activity
Much has been written about context in terms of time, location, and even social setting. In this publication, we focus on human activity and how it relates to the dimensions of time and location. We break the problem down into three tasks:
- (1) obtaining a large set of open-domain human activities,
- (2) extracting the spatiotemporal context of these activities at the moment they were performed,
- (3) predicting activity given user context and vice versa.
Understanding Context for Tasks and Activities
J. R. Benetka, J. Krumm (Microsoft), and P. N. Bennett (Microsoft).
In: ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR ’19), March 2019.
Many prior works address the first task with manually curated lists of activities. The shortcoming of this approach is that it can hardly cover the wide spectrum of a person’s potential activities. A more scalable option is to apply natural language processing (NLP) methods to a suitable text corpus. Verbs and verb phrases are, by definition, the sentence constituents that introduce an action (e.g., feed, go to), while nouns and noun phrases typically serve as the verbs’ arguments (e.g., ducks, popular café). We can therefore write a simple grammar that isolates verb+noun pairs describing an activity (e.g., feed ducks, go to popular café). This is one of those situations where a straightforward solution takes us a long way, in this case toward capturing thousands of diverse human activities.
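To make the verb+noun idea concrete, here is a minimal sketch of such a grammar. It is not the grammar used in the paper: the function name `extract_activities`, the choice of Penn Treebank-style tags, and the hand-tagged example sentences are all assumptions made for illustration. In practice the tags would come from a part-of-speech tagger.

```python
from typing import List, Tuple

# A token paired with a Penn Treebank-style POS tag (tags are hand-assigned here;
# a real pipeline would obtain them from a POS tagger).
Tagged = Tuple[str, str]


def extract_activities(tagged: List[Tagged]) -> List[str]:
    """Return verb + noun-phrase pairs found in one POS-tagged sentence.

    Illustrative pattern (an assumption, not the paper's grammar):
        VERB [particle/preposition] [determiner] ADJECTIVE* NOUN+
    """
    activities = []
    i, n = 0, len(tagged)
    while i < n:
        word, tag = tagged[i]
        if not tag.startswith("VB"):
            i += 1
            continue
        phrase = [word]
        j = i + 1
        # Optional particle or preposition, e.g. "go to".
        if j < n and tagged[j][1] in {"RP", "IN", "TO"}:
            phrase.append(tagged[j][0])
            j += 1
        # An optional determiner is skipped, not kept in the descriptor.
        if j < n and tagged[j][1] == "DT":
            j += 1
        # Zero or more adjectives followed by one or more nouns.
        modifiers = []
        while j < n and tagged[j][1].startswith("JJ"):
            modifiers.append(tagged[j][0])
            j += 1
        nouns = []
        while j < n and tagged[j][1].startswith("NN"):
            nouns.append(tagged[j][0])
            j += 1
        if nouns:
            activities.append(" ".join(phrase + modifiers + nouns).lower())
            i = j
        else:
            i += 1
    return activities


print(extract_activities([("I", "PRP"), ("feed", "VBP"), ("ducks", "NNS")]))
# ['feed ducks']
print(extract_activities([
    ("We", "PRP"), ("go", "VBP"), ("to", "TO"),
    ("a", "DT"), ("popular", "JJ"), ("café", "NN"),
]))
# ['go to popular café']
```

The pattern is deliberately small: it trades recall on complex sentences for simplicity, which is the trade-off the paragraph above describes.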
Twitter as Self-reporting Platform
The next step is to choose a suitable data source, ideally one that carries evidence of people’s activities along with information about their location and time. Social networks make a strong candidate: they can be seen as large crowdsourcing platforms with the potential to reveal the global picture of human activity behavior. Twitter stands out for several reasons: 1) people use it to report what they are doing (Naaman et al., 2010), 2) most posts are ‘now’-oriented, and 3) tweets come with a timestamp and geolocation (raw or via Foursquare).
A seemingly prohibitive disadvantage of using Twitter as a source of evidence is the inherent bias of its content (Marwick and Boyd, 2010). Fortunately, as we demonstrate in the paper (see Figure below), self-censorship affects only which activities people decide to share, not the contextual details of the ones they do post about. This finding allows us to create trustworthy spatiotemporal profiles of the activities that we capture.
Shortage of data is the least of our problems when working with Twitter: after analyzing over a year of Foursquare-linked tweets from the US region, we extracted more than 100,000 distinct activity descriptors at many levels of granularity (from general notions such as ‘thinking’ to very specific endeavours like ‘practicing egg drop soup delivery skill’). Moreover, by piggybacking on the Foursquare categorization of places, we could model location in terms of types (e.g., gym, airport) rather than coordinates in space.
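The following sketch shows one way a spatiotemporal profile could be assembled once each activity mention carries a timestamp and a venue category. The record layout, the example values, and the helper `build_profile` are hypothetical; they do not reflect the actual Twitter or Foursquare data formats or the pipeline used in the paper.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records: (activity descriptor, local ISO timestamp, venue category).
# The values are made up for illustration only.
records = [
    ("drop off kid", "2016-03-01T08:05:00", "school"),
    ("drop off kid", "2016-03-02T08:20:00", "school"),
    ("drop off kid", "2016-03-02T17:40:00", "gym"),
    ("drink coffee", "2016-03-01T08:30:00", "coffee shop"),
]


def build_profile(rows):
    """Count, per activity, how often it occurs in each (category, hour) bucket."""
    profile = {}
    for activity, timestamp, category in rows:
        hour = datetime.fromisoformat(timestamp).hour
        profile.setdefault(activity, Counter())[(category, hour)] += 1
    return profile


profile = build_profile(records)
print(profile["drop off kid"].most_common(1))
# [(('school', 8), 2)]
```

Keying on a category and an hour of day keeps the context space small and discrete, whereas raw coordinates and timestamps would give an effectively unbounded set of distinct contexts.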
Most frequent activities
The notorious bias of Twitter is evident in the distribution of activities rather than in their mere presence. Most commonly, people mention activities related to travel/transportation, eating & drinking, or entertainment. The diversity, however, seems to be endless. Check out the ranked list below:
Probabilistic activity models
The extracted activity instances and their spatiotemporal patterns provide fertile ground for models that treat human activity as a contextual feature. In the paper, we go on to build models that predict a person’s activity given her context and vice versa.
The plot below illustrates the probability distribution of locations where people “drop off a kid” throughout the day.
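As a rough illustration of both prediction directions, the sketch below estimates a distribution over locations given an activity and an hour, and a distribution over activities given a location and an hour. It uses plain relative frequencies with add-alpha smoothing, which is a simplification chosen for this example and not necessarily the model from the paper. The counts, function names, and the `alpha` parameter are assumptions.

```python
from collections import Counter

# Hypothetical co-occurrence counts:
# (activity, venue category, hour of day) -> number of observations.
counts = Counter({
    ("drop off kid", "school", 8): 6,
    ("drink coffee", "school", 8): 1,
    ("drink coffee", "coffee shop", 8): 8,
    ("drop off kid", "gym", 17): 3,
    ("lift weights", "gym", 17): 9,
})


def location_given_activity(activity, hour):
    """Estimate P(category | activity, hour) by relative frequency."""
    selected = {
        c: n for (a, c, h), n in counts.items() if a == activity and h == hour
    }
    total = sum(selected.values())
    return {c: n / total for c, n in selected.items()} if total else {}


def activity_given_context(category, hour, alpha=1.0):
    """Estimate P(activity | category, hour) with add-alpha smoothing
    over all activities seen in the counts."""
    activities = sorted({a for a, _, _ in counts})
    raw = {a: counts[(a, category, hour)] + alpha for a in activities}
    total = sum(raw.values())
    return {a: value / total for a, value in raw.items()}


print(location_given_activity("drop off kid", 8))
# {'school': 1.0}
dist = activity_given_context("school", 8)
print(max(dist, key=dist.get))
# drop off kid
```

Smoothing gives every known activity a small non-zero probability in contexts where it was never observed, which matters when the counts are sparse.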
Insights & Takeaways
- Social networks can be used as a source of thousands of open-domain activities.
- By focusing on ongoing activities extracted from Twitter, we can reliably profile their spatiotemporal patterns.
- Combining multiple data sources (i.e., Twitter and Foursquare) allowed us to model locations as categories rather than coordinates, which in turn drastically reduced dimensionality.
- People DO TWEET about their activities FROM OUTER SPACE!
@WilliamShatner Yes, Standard Orbit, Captain. And we're detecting signs of life on the surface.
— Chris Hadfield (@Cmdr_Hadfield) January 3, 2013
Is it really about me?: message content in social awareness streams. M. Naaman et al., Proceedings of the CSCW, ACM, 2010.
I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. A. E. Marwick and D. Boyd, New Media & Society, 2010.