Data Structures for Statistical Computing in Python

2010 | Proceedings of the Python in Science Conferences | 10,212 citations

Abstract

In this paper we are concerned with the practical issues of working with data sets common to finance, statistics, and other related fields. pandas is a new library which aims to facilitate working with these data sets and to provide a set of fundamental building blocks for implementing statistical models. We will discuss specific design issues encountered in the course of developing pandas with relevant examples and some comparisons with the R language. We conclude by discussing possible future directions for statistical computing and data analysis using Python.

Keywords

Python (programming language), Computer science, Data science, Data exploration, Statistical analysis, Data structure, Computational statistics, Theoretical computer science, Data mining, Software engineering, Programming language, Machine learning, Statistics, Visualization, Mathematics

Publication Info

Year: 2010
Type: article
Pages: 56-61
Citations: 10,212
Access: Closed

Citation Metrics

10,212 (OpenAlex)

Cite This

Wes McKinney (2010). Data Structures for Statistical Computing in Python. Proceedings of the Python in Science Conferences, 56-61. https://doi.org/10.25080/majora-92bf1922-00a

Identifiers

DOI: 10.25080/majora-92bf1922-00a