Too Big to Mail: On the Way to Publish Large-scale Mobile Analytics Data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

The Carat project, started in 2012, has collected over 1.5 TB of data from over 850,000 mobile users all over the world. The project uses Apache Thrift to transmit data and Apache Spark to run data analysis tasks, and the gist of the Carat analysis method has been published. While the Carat application code is open source, the data is much harder to share because of its size and privacy concerns. This paper outlines the challenges in sharing such a large-scale dataset with detailed information about smart devices, applications, and their users, and presents some solutions to these challenges.
Original language: English
Title of host publication: Proceedings of 2016 IEEE International Conference on Big Data: Open Science in Big Data Workshop
Number of pages: 4
Publisher: IEEE
Publication date: 2016
Pages: 2374-2377
ISBN (Electronic): 978-1-4673-9005-7
DOIs: https://doi.org/10.1109/BigData.2016.7840871
Publication status: Published - 2016
MoE publication type: A4 Article in conference proceedings
Event: IEEE International Conference on Big Data - Washington D.C., United States
Duration: 5 Dec 2016 → 8 Dec 2016
Conference number: 4

Fields of Science

  • 113 Computer and information sciences

Cite this

Peltonen, E., Lagerspetz, E., Nurmi, P., & Tarkoma, S. (2016). Too Big to Mail: On the Way to Publish Large-scale Mobile Analytics Data. In Proceedings of 2016 IEEE International Conference on Big Data: Open Science in Big Data Workshop (pp. 2374-2377). IEEE. https://doi.org/10.1109/BigData.2016.7840871