Abstract

Time series feature engineering is a time-consuming process because scientists and engineers have to consider the many algorithms of signal processing and time series analysis when identifying and extracting meaningful features from time series. The Python package tsfresh (Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests) accelerates this process by combining 63 time series characterization methods, which by default compute a total of 794 time series features, with feature selection on the basis of automatically configured hypothesis tests. By identifying statistically significant time series characteristics at an early stage of the data science process, tsfresh closes feedback loops with domain experts and fosters the development of domain-specific features early on. The package implements standard APIs of time series and machine learning libraries (e.g. pandas and scikit-learn) and is designed both for exploratory analyses and for straightforward integration into operational data science applications.
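
The two stages described in the abstract, feature extraction followed by selection through hypothesis tests, can be illustrated with a minimal standard-library sketch. It is not tsfresh's implementation or API: the four features, the permutation test, the Benjamini-Hochberg correction, the synthetic data, and all function names (`summarize_series`, `permutation_p_value`, `benjamini_hochberg`) are assumptions chosen for illustration.

```python
import random
import statistics


def summarize_series(series):
    """Map one time series (list of floats) to a few summary features (hypothetical set)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "maximum": max(series),
        "mean_abs_change": statistics.fmean([abs(d) for d in diffs]),
    }


def permutation_p_value(values, labels, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in class means of one feature."""
    rng = random.Random(seed)

    def mean_gap(lbls):
        pos = [v for v, l in zip(values, lbls) if l == 1]
        neg = [v for v, l in zip(values, lbls) if l == 0]
        return abs(statistics.fmean(pos) - statistics.fmean(neg))

    observed = mean_gap(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling keeps the class sizes fixed
        if mean_gap(shuffled) >= observed:
            hits += 1
    # Add-one smoothing so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the feature names kept when controlling the false discovery rate at alpha."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= alpha * rank / m:
            cutoff = rank  # largest rank that satisfies the step-up criterion
    return [name for name, _ in ranked[:cutoff]]


# Synthetic data (assumption): both classes have zero mean, class 1 is noisier.
rng = random.Random(42)
labels = [0] * 20 + [1] * 20
series_list = [
    [rng.gauss(0.0, 1.0 if y == 0 else 3.0) for _ in range(50)] for y in labels
]

# Stage 1: one feature row per time series.
feature_rows = [summarize_series(s) for s in series_list]

# Stage 2: one hypothesis test per feature, then multiple-testing control.
p_values = {
    name: permutation_p_value([row[name] for row in feature_rows], labels)
    for name in feature_rows[0]
}
selected = benjamini_hochberg(p_values)
print(selected)
```

In this synthetic setup the two classes differ only in noise amplitude, so spread-based features such as `std` are the ones expected to pass the selection step, while `mean` carries no class information by construction.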

Keywords

Python (programming language), Computer science, Series (stratigraphy), Scalability, Time series, Artificial intelligence, Data mining, R package, Time domain, Unit testing, Feature extraction, Machine learning, Pattern recognition (psychology), Software, Programming language, Database

Publication Info

Year: 2018
Type: article
Volume: 307
Pages: 72-77
Citations: 1100
Access: Closed

Citation Metrics

1100 citations (OpenAlex)

Cite This

Maximilian Christ, N. Braun, Julius Neuffer et al. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing, 307, 72-77. https://doi.org/10.1016/j.neucom.2018.03.067

Identifiers

DOI
10.1016/j.neucom.2018.03.067