EPJ Data Science

http://link.springer.com/journal/13688

List of Papers (Total 125)

The effect of Pokémon Go on the pulse of the city: a natural experiment

Pokémon Go, a location-based game that uses augmented reality techniques, received unprecedented media coverage due to claims that it allowed for greater access to public spaces, increasing the number of people out on the streets, and generally improving health, social, and security indices. However, the true impact of Pokémon Go on people’s mobility patterns in a city is still ...

Data-driven modeling of collaboration networks: a cross-domain analysis

We analyze large-scale data sets about collaborations from two different domains: economics, specifically 22,000 R&D alliances between 14,500 firms, and science, specifically 300,000 co-authorship relations between 95,000 scientists. Considering the different domains of the data sets, we address two questions: (a) to what extent do the collaboration networks reconstructed from the ...

Rapid rise and decay in petition signing

Contemporary collective action, much of which involves social media and other Internet-based platforms, leaves a digital imprint which may be harvested to better understand the dynamics of mobilization. Petition signing is an example of collective action which has gained in popularity with rising use of social media and provides such data for the whole population of petition ...

The shape of collaborations

The structure of scientific collaborations has been the object of intense study both for its importance for innovation and scientific advancement, and as a model system for social group coordination and formation thanks to the availability of authorship data. Over the last years, complex network approaches to this problem have yielded important insights and shaped our understanding ...

Comparison of traffic reliability index with real traffic data

Existing studies have developed different indices based on various approaches, including network connectivity, delay time and flow capacity, estimating traffic reliability states from different angles. However, these indices mainly estimate traffic reliability from a single view and rarely consider the combined effect of city traffic dynamics and underlying network structure. ...

Classification of Westminster Parliamentary constituencies using e-petition data

In a representative democracy it is important that politicians have knowledge of the desires, aspirations and concerns of their constituents. Opportunities to gauge these opinions are however limited and, in the era of novel data, thoughts turn to what alternative, secondary, data sources may be available to keep politicians informed about local concerns. One such source of data ...

A roadmap for the computation of persistent homology

Persistent homology (PH) is a method used in topological data analysis (TDA) to study qualitative features of data that persist across multiple scales. It is robust to perturbations of input data, independent of dimensions and coordinates, and provides a compact representation of the qualitative features of the input. The computation of PH is an open area with numerous important ...

Instagram photos reveal predictive markers of depression

Using Instagram data from 166 individuals, we applied machine learning tools to successfully identify markers of depression. Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection. Resulting models outperformed general practitioners’ average unassisted diagnostic success ...

Prediction of employment and unemployment rates from Twitter daily rhythms in the US

By modeling macroeconomic indicators using digital traces of human activities on mobile or social networks, we can provide important insights into processes previously assessed via paper-based surveys or polls only. We collected aggregated workday activity timelines of US counties from the normalized number of messages sent in each hour on the online social network Twitter. In ...

Early detection of promoted campaigns on social media

Social media expose millions of users every day to information campaigns - some emerging organically from grassroots activity, others sustained by advertising or other coordinated efforts. These campaigns contribute to the shaping of collective opinions. While most information campaigns are benign, some may be deployed for nefarious purposes, including terrorist propaganda, ...

An alternative approach to the limits of predictability in human mobility

Next place prediction algorithms are invaluable tools, capable of increasing the efficiency of a wide variety of tasks, ranging from reducing the spreading of diseases to better resource management in areas such as urban planning. In this work we estimate upper and lower limits on the predictability of human mobility to help assess the performance of competing algorithms. We do ...

Inferring social influence in transport mode choice using mobile phone data

A longitudinal mobile phone data set that includes both location and communication logs is analyzed to infer social influence, in terms of ego-network effects, on commute mode choice. The results show that a person’s strong ties are more important in determining whether driving is the person’s transport mode choice, whereas weak ties are more important in determining whether public transit is the ...

Individual position diversity in dependence socioeconomic networks increases economic output

The availability of big data recorded from massively multiplayer online role-playing games (MMORPGs) allows us to gain a deeper understanding of the potential connection between individuals’ network positions and their economic outputs. We use a statistical filtering method to construct dependence networks from weighted friendship networks of individuals. We investigate the 30 ...

Uncovering the relationships between military community health and affects expressed in social media

Military populations present a small, unique community whose mental and physical health impacts the security of the nation. Recent literature has explored social media’s ability to enhance disease surveillance and characterize distinct communities with encouraging results. We present a novel analysis of the relationships between influenza-like illnesses (ILI) clinical data and ...

Topological analysis of data

Propelled by a fast evolving landscape of techniques and datasets, data science is growing rapidly. Against this background, topological data analysis (TDA) has carved itself a niche for the analysis of datasets that present complex interactions and rich structures. Its distinctive feature, topology, allows TDA to detect, quantify and compare the mesoscopic structures of data, ...

Contact activity and dynamics of the social core

Humans interact through numerous communication channels to build and maintain social connections: they meet face-to-face, make phone calls or send text messages, and interact via social media. Although it is known that the network of physical contacts, for example, is distinct from the network arising from communication events via phone calls and instant messages, the extent to ...

Gender matters! Analyzing global cultural gender preferences for venues using social sensing

Gender differences are a phenomenon actively researched by social scientists around the world. Traditionally, the data used to support such studies are obtained manually, often through surveys with volunteers. However, because the manual steps make them inherently costly, such traditional methods do not scale quickly to large studies. We here investigate a particular aspect ...

The happiness paradox: your friends are happier than you

Most individuals in social networks experience a so-called Friendship Paradox: they are less popular than their friends on average. This effect may explain recent findings that widespread social network media use leads to reduced happiness. However, the relation between popularity and happiness is poorly understood. A Friendship Paradox does not necessarily imply a Happiness Paradox ...
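
The Friendship Paradox described in the abstract above can be illustrated on a small graph. The following is a minimal sketch, not the paper's method: the edge list and the function names `build_neighbors` and `friendship_paradox_fraction` are hypothetical and chosen only for illustration. It counts the share of individuals whose number of friends (degree) is strictly below the average degree of their friends.

```python
from statistics import mean

# Hypothetical toy friendship network (undirected); not data from the paper.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]


def build_neighbors(edge_list):
    """Map each node to the set of its friends."""
    neighbors = {}
    for u, v in edge_list:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def friendship_paradox_fraction(edge_list):
    """Fraction of nodes with fewer friends than their friends have on average."""
    neighbors = build_neighbors(edge_list)
    degree = {node: len(friends) for node, friends in neighbors.items()}
    count = sum(
        1
        for node, friends in neighbors.items()
        if degree[node] < mean(degree[f] for f in friends)
    )
    return count / len(neighbors)


if __name__ == "__main__":
    print(friendship_paradox_fraction(edges))
```

In this toy network, three of the five nodes (b, c and e) have fewer friends than their friends do on average, so the function returns 0.6. Replacing degree with any other per-node attribute, such as a happiness score, gives the analogous comparison for that attribute.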

Improving official statistics in emerging markets using machine learning and mobile phone data

Mobile phones are one of the fastest growing technologies in the developing world, with global penetration rates reaching 90%. Mobile phone data, also called CDR, are generated every time phones are used and are recorded by carriers at scale. CDR have generated groundbreaking insights in public health, official statistics, and logistics. However, the fact that most phones in developing ...

BiFold visualization of bipartite datasets

The emerging domain of data-enabled science necessitates development of algorithms and tools for knowledge discovery. Human interaction with data through well-constructed graphical representation can take special advantage of our visual ability to identify patterns. We develop a data visualization framework, called BiFold, for exploratory analysis of bipartite datasets that ...

Absence makes the heart grow fonder: social compensation when failure to interact risks weakening a relationship

Social networks require active relationship maintenance if they are to be kept at a constant level of emotional closeness. For primates, including humans, failure to interact leads inexorably to a decline in relationship quality, and a consequent loss of the benefits that derive from individual relationships. As a result, many social species compensate for weakened relationships by ...

Applying Hidden Markov Models to Voting Advice Applications

In recent times, low voter turnout has become a phenomenon that threatens the representative democracy of many developed countries. Voting Advice Applications (VAAs) are used to inform citizens about the political stances of the parties involved in upcoming elections, in an effort to facilitate their decision-making process and increase their participation in this democratic ...

Generic temporal features of performance rankings in sports and games

Many complex phenomena, from trait selection in biological systems to hierarchy formation in social and economic entities, show signs of competition and heterogeneous performance in the temporal evolution of their components, which may eventually lead to stratified structures such as the worldwide wealth distribution. However, it is still unclear whether the road to hierarchical ...