Abstract
We present a study of anonymized data capturing high-level communication activities within the Microsoft Instant Messenger network. We analyze properties of the communication network defined by user interactions and demographics, as reported and as derived from one month of data collected in June 2006. The compressed dataset occupies 4.5 terabytes, composed from 1 billion conversations per day (150 gigabytes) over one month of logging. The dataset contains more than 30 billion conversations among 240 million people. We focus on analyses of high-level characteristics and patterns that emerge from the collective dynamics of 240 million people, rather than the actions and characteristics of individuals. Analyses center on numbers and durations of conversations; the content of communications was neither available nor pursued. From the data we construct a communication graph with 190 million nodes and 1.3 billion undirected edges. We find that the graph is well connected, with an effective diameter of 7.8, and is highly clustered, with a clustering coefficient decaying slowly with exponent −0.4. We also find strong influences of homophily in activities, where people with
Publication Info
- Year: 2007
- Type: article
- Pages: 28
- Citations: 45
- Access: Closed