introduction

Orange “Data for Development” (D4D) is an open data challenge that encourages research teams around the world to use four datasets of anonymous call patterns from Orange's Ivory Coast subsidiary to help address societal development questions in novel ways. The datasets are based on anonymized Call Detail Records extracted from Orange’s customer base, covering the months of December 2011 to April 2012.

Research teams wishing to take on the challenge and participate in the development of Ivory Coast society will have access to the data, which they can analyse and cross-compare with other types of data to find useful insights. The best research results will be selected by an independent D4D committee and will be presented at the 2013 NetMob conference and later at an event in Ivory Coast.

objectives and description

The goal of the D4D challenge, in line with our Group’s Orange for Development initiative, is to contribute to the socio-economic development and well-being of populations. Knowledge of typical behaviours of mobile telephone users can be very useful, for example to identify early signs of epidemics, to be reactive in times of crisis, to measure the threat and resultant impact of droughts, to optimize the usage of certain infrastructures, etc. The research subject can be chosen freely as long as it relates to an objective of development and improved quality of life for all.

Orange encourages the participants to cross-compare D4D data with other types of data which they have found through their own research. By way of example and to stimulate ideas, a list of data sources from NGOs or international organizations is available on this website, although Orange cannot of course guarantee their quality or their relevance for all projects.

This website is available to researchers, public institutions or NGOs involved or interested in the development of sub-Saharan Africa and Ivory Coast in particular. A suggestion box and a newsletter are provided to encourage contributions, such as proposals for useful subjects or links to resources and contacts. In particular, all those who have databases that could be usefully employed in conjunction with mobile phone communication data in the framework of the D4D challenge are cordially invited to share their data. The suggestion space also provides a forum for researchers or data representation specialists who would like to exchange ideas or simply to initiate new contacts.

who can participate?

Participants wishing to utilize the Orange database and participate in the challenge must be affiliated with a public or private research institution.

selection and access to the data

pre-selection:
Access to the data requires the submission of a research project (which will remain confidential) and agreement to the Terms and Conditions (T&C’s). Orange and the Committee Chairman will make a selection of the research proposals to ensure that only projects consistent with the development objectives of the Challenge will receive an access key to the data sets.

access key:
Teams whose project objectives conform to the D4D Challenge will be notified by mail and will then need to send back the signed T&C's. Each project team will then receive a use-once access key with which to download the datasets.

datasets

The data collection took place in Cote d’Ivoire over a five-month period, from December 2011 to April 2012. The original dataset contains 2.5 billion records, calls and text messages exchanged between 5 million users. The customer identifier was anonymized by Orange Cote d’Ivoire. All subsequent data processing was completed by Orange Labs in Paris.

We will release four datasets in order to offer a large spectrum of possible analyses:

  • Aggregate communication between cell towers;
  • Mobility traces: fine resolution dataset;
  • Mobility traces: coarse resolution dataset;
  • Communication sub-graphs.

Information about users’ whereabouts was recorded whenever a user sent or received a call or a text message (SMS). The cell tower that routed the call is recorded. We provide slightly blurred geographic locations for about 1,200 towers. Random user identifiers have been generated separately for each dataset.

1. Aggregate communication between cell towers.
For every one-hour period we provide the number of calls and the total communication time between every pair of cell towers. We also provide the information about which cell tower initiated the call.  Calls starting in a particular one-hour period are associated with this time period, irrespective of their termination time.
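
As a rough illustration of the aggregation described above, the sketch below counts calls and sums communication time per hour and per ordered pair of towers. The record layout (start time, initiating tower, receiving tower, duration in seconds) and the function name are assumptions made for this example only; they do not describe the format of the released files.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_hourly(calls):
    """Group call records by (start hour, initiating tower, receiving tower).

    Each record is a hypothetical tuple:
    (start_time, origin_tower, destination_tower, duration_in_seconds).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [number of calls, total seconds]
    for start, origin, dest, duration_s in calls:
        # A call belongs to the hour in which it starts, whatever its end time.
        hour = start.replace(minute=0, second=0, microsecond=0)
        entry = totals[(hour, origin, dest)]
        entry[0] += 1
        entry[1] += duration_s
    return {key: tuple(value) for key, value in totals.items()}


# Made-up sample records, not taken from the D4D data.
sample_calls = [
    (datetime(2011, 12, 5, 9, 58, 30), 12, 47, 300),  # ends after 10:00, counted at 09:00
    (datetime(2011, 12, 5, 9, 10, 0), 12, 47, 60),
    (datetime(2011, 12, 5, 10, 1, 0), 47, 12, 45),    # reverse direction is a separate pair
]
hourly = aggregate_hourly(sample_calls)
```

Keeping the pair ordered (initiating tower first) preserves the information about which tower initiated the call.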

2. Mobility traces: fine resolution dataset.
In every two-week period, we choose a random sample of active users. We provide timestamps and the cell towers those users made calls and texts from during the two-week period. This process is repeated for successive two-week periods with different random samples.
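
One way a participant might work with such traces is to turn the timestamped tower observations into one trajectory per user. The sketch below is only an example under assumptions: the tuple layout (user identifier, timestamp, tower identifier) and the function name are hypothetical and are not a description of the released file format.

```python
from collections import defaultdict
from datetime import datetime


def build_trajectories(records):
    """Return, per user, the time-ordered list of (timestamp, tower) changes.

    Each record is a hypothetical tuple: (user_id, timestamp, tower_id).
    Consecutive observations at the same tower are collapsed into the first one.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, tower_id in records:
        by_user[user_id].append((timestamp, tower_id))

    trajectories = {}
    for user_id, events in by_user.items():
        events.sort()  # order by timestamp
        path = []
        for timestamp, tower_id in events:
            if not path or path[-1][1] != tower_id:
                path.append((timestamp, tower_id))
        trajectories[user_id] = path
    return trajectories


# Made-up sample records, deliberately out of order.
sample_records = [
    (1, datetime(2011, 12, 5, 12, 0), 9),
    (1, datetime(2011, 12, 5, 8, 0), 5),
    (2, datetime(2011, 12, 5, 9, 0), 3),
    (1, datetime(2011, 12, 5, 8, 30), 5),
    (1, datetime(2011, 12, 5, 18, 0), 5),
]
trajectories = build_trajectories(sample_records)
```

Because locations are only observed when a call or text is made, such a trajectory is a sparse sample of a user's movements rather than a continuous track.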

3. Mobility traces: coarse resolution dataset.
We choose a random sample of active users. We provide timestamps and location information for calls and SMS made by those users during the entire 5-month period. The location information is not given by the cell tower but by the “sous-préfecture”, an administrative unit in Cote d’Ivoire. The country has 255 subprefectures and we provide a table with their geographical location.
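
To illustrate what a coarser spatial resolution means, the sketch below replaces tower identifiers with the identifier of the subprefecture containing each tower. The tower-to-subprefecture lookup table, the record layout and the function name are all hypothetical; how the released coarse dataset was actually produced is not described here.

```python
def coarsen(records, tower_to_subpref):
    """Map (user_id, timestamp, tower_id) records to subprefecture level.

    tower_to_subpref is a hypothetical dict from tower id to subprefecture id.
    Records whose tower is missing from the table are dropped in this sketch.
    """
    coarse = []
    for user_id, timestamp, tower_id in records:
        subpref_id = tower_to_subpref.get(tower_id)
        if subpref_id is None:
            continue
        coarse.append((user_id, timestamp, subpref_id))
    return coarse


# Made-up values: towers 5 and 9 fall in subprefecture 60, tower 3 in 61.
lookup = {5: 60, 9: 60, 3: 61}
fine = [(1, "2011-12-05 08:00", 5), (1, "2011-12-05 12:00", 9), (2, "2011-12-05 09:00", 7)]
coarse = coarsen(fine, lookup)
```

In this example the move from tower 5 to tower 9 becomes invisible, since both towers lie in the same subprefecture.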

4. Communication sub-graphs.
We have chosen several thousand random users (egos) for whom we constructed communication graphs obtained by considering all communications between the ego and his/her contacts at up to two degrees of separation from the ego. The communications inside each sub-graph are aggregated by two-week time window over five months. Random identifiers are attributed separately in each graph and remain unchanged over the entire observation period.
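
The notion of a sub-graph at up to two degrees of separation can be sketched as follows. This is an illustration under assumptions, not the procedure used to build the released graphs: edges are treated as undirected, and every edge whose two endpoints lie within two hops of the ego is kept. The function name and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def ego_subgraph(edges, ego, max_hops=2):
    """Return (members, kept_edges) for the neighbourhood of `ego`.

    edges is an iterable of (a, b) pairs meaning a and b communicated.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    members = {ego}
    frontier = {ego}
    for _ in range(max_hops):
        # Expand one hop, ignoring nodes that were already reached.
        frontier = {n for node in frontier for n in neighbours[node]} - members
        members |= frontier

    # Keep each undirected edge once, if both endpoints are in the neighbourhood.
    kept = sorted({tuple(sorted((a, b))) for a, b in edges
                   if a in members and b in members})
    return members, kept


# Made-up chain ego - a - b - c - d, with one edge listed in both directions.
sample_edges = [("ego", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("a", "ego")]
members, kept = ego_subgraph(sample_edges, "ego")
```

Here `c` and `d` are three and four hops from the ego, so they and their edges fall outside the sub-graph.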

evaluation

The results of the research work will be assessed by the D4D Committee on three criteria: the quality and originality of the scientific insights, the potential to contribute to a relevant development issue, and the quality of the data visualisation in making the data speak in simple and engaging ways.

The social development criteria broadly cover sustainable development goals:

  • Social development: helps reduce poverty and inequality; promote mutual aid and solidarity; develop healthcare and education, etc.
  • Economic development: helps create new economic activity and employment, in particular in agriculture; improve infrastructures and public services, etc.
  • Environmental development: helps protect natural resources and their diversity; facilitate access to and good management of vital resources; identify signs of approaching crises such as drought; improve living conditions (cleanliness, access to drinking water, etc.) and protect them against natural catastrophes.

The deliberations to select the most worthy projects will be led by the Committee Chairman in order to reach a majority consensus amongst the committee members.

the D4D evaluation committee

Professor Vincent Blondel, University of Louvain (UCL), Louvain-La-Neuve, Belgium - Chairman
Professor Francis Akindes, Université de Bouaké, Bouaké, Ivory Coast
Mr William Hoffman, Head of Telecom industry, World Economic Forum, New York, USA
Mrs Mari-Noëlle Jégo-Laveissière, Head of Orange Labs, Paris, France
Mr Robert Kirkpatrick, Head of Global Pulse, United Nations, New York, USA
Mr Chris Locke, Managing director GSMA Development Fund, GSMA, London, UK
Professor Alex (Sandy) Pentland, Medialab, MIT, Cambridge, USA

awards

Four prizes will be awarded:

  • Best Overall: prize for the best project on Scientific and Development aspects.
  • Best Scientific: prize for a project proposing an innovative methodology, a new question addressed and relevant original findings.
  • Best Development Insight: prize for the most "practical" project, addressing a significant question with a real potential for application in the field.
  • Best Visualisation: prize for the clearest and most appealing visual representation. This prize will be awarded only if a project's presentation has visibly demonstrated considerable creative effort.

The “Best Overall” winner will be awarded US$ 3,000; the other prize winners will each receive US$ 1,000. One person from each winning team will be invited, expenses paid, to participate in the results presentation at the NetMob Conference to be held in Spring 2013.

results publication and exploitation

The organizing committee will invite the winning researchers to present their projects at the NetMob International Conference on "Analysis of Mobile Phone Datasets and Networks", scheduled for Spring 2013, at which the results will be presented to the scientific community.

A second event will be organized by Orange in Ivory Coast in mid 2013. The most relevant projects will be presented, with the consent of their authors, to public authorities and organizations who might wish to utilize the results. We will endeavour to ensure that the development insights are used and, where appropriate, we will set up contact between the research teams and the organizations wishing to collaborate further.

Furthermore, the winning projects will be promoted, for example through a specialist publication or articles in journals. The D4D Challenge prize label is also likely to bring media coverage via our Public Relations department for the winning projects.

calendar

June 2012: Launch of the invitation to candidates. Start of registrations.
October 31, 2012: Deadline for registration and submission of applications. Orange and the Chairman of the Committee evaluate the candidate applications on a regular basis. Orange's anonymous datasets are made accessible to the selected teams.
January 31, 2013: Deadline for submission of projects.
February/March 2013: Evaluation of the projects by the D4D committee.
May 2-3, 2013: Presentation of the winning projects at the NetMob Conference.
Spring 2013: Presentation of the winning projects in Ivory Coast, exploration of their application by appropriate local organizations and possible follow-up with the research teams.

contacts and communication

If you are interested in this challenge, we encourage you to register to receive the D4D newsletter: d4d.newsletter[at]orange.com
For Twitter, use the hashtag #data4d.
For further questions about the Challenge, please contact us at: d4d.contact[at]orange.com

France Telecom-Orange is one of the world's leading telecommunications operators with 172,000 employees and a turnover of 45.3 Bn Euro. Our Group served more than 220 million customers in 2011 and operates in 35 countries in Europe, Africa and the Middle East. In Europe Orange is a leading provider of mobile telephony and ADSL Internet access; worldwide it is a leader in ICT services for multinational corporations, sold under the Orange Business Services brand name.