Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package)

Abstract

Time series feature engineering is a time-consuming process because scientists and engineers have to consider the multifarious algorithms of signal processing and time series analysis for identifying and extracting meaningful features from time series. The Python package tsfresh (Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests) accelerates this process by combining 63 time series characterization methods, which by default compute a total of 794 time series features, with feature selection on the basis of automatically configured hypothesis tests. By identifying statistically significant time series characteristics at an early stage of the data science process, tsfresh closes feedback loops with domain experts and fosters the development of domain-specific features early on. The package implements standard APIs of time series and machine learning libraries (e.g. pandas and scikit-learn) and is designed for both exploratory analyses and straightforward integration into operational data science applications.
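
The two-stage idea in the abstract (compute many features per series, then keep those that pass a hypothesis test against the target) can be illustrated with a minimal standard-library sketch. This is not tsfresh's code or API: the names `toy_features`, `permutation_p_value`, and `select_by_permutation_test` are hypothetical, only four features are computed rather than 794, a simple permutation test stands in for the package's automatically configured tests, and no multiple-testing correction is applied.

```python
import random
import statistics


def toy_features(series):
    """Map one time series (a list of floats) to a few summary features."""
    changes = [abs(b - a) for a, b in zip(series, series[1:])]
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "maximum": max(series),
        "mean_abs_change": statistics.fmean(changes),
    }


def permutation_p_value(values, labels, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in class means."""
    rng = random.Random(seed)

    def mean_gap(current_labels):
        pos = [v for v, y in zip(values, current_labels) if y == 1]
        neg = [v for v, y in zip(values, current_labels) if y == 0]
        return abs(statistics.fmean(pos) - statistics.fmean(neg))

    observed = mean_gap(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between feature and label
        if mean_gap(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def select_by_permutation_test(feature_rows, labels, alpha=0.05):
    """Test every feature column against the labels; keep those with p <= alpha."""
    p_values = {}
    for name in feature_rows[0]:
        column = [row[name] for row in feature_rows]
        p_values[name] = permutation_p_value(column, labels)
    selected = [name for name, p in p_values.items() if p <= alpha]
    return selected, p_values


# Toy data (an assumption for illustration): both classes are zero-mean noise,
# but class 1 has three times the noise scale of class 0.
data_rng = random.Random(42)
labels = [0] * 10 + [1] * 10
series_list = [
    [data_rng.gauss(0.0, 3.0 if y == 1 else 1.0) for _ in range(50)]
    for y in labels
]

feature_rows = [toy_features(s) for s in series_list]
selected, p_values = select_by_permutation_test(feature_rows, labels)
print(selected)
```

On this toy data, the spread-related features separate the two classes clearly, while the mean carries no systematic signal by construction. When many features are tested at once, as in the package's default configuration, a per-feature threshold like the one above would need to be replaced by a procedure that controls for multiple testing.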

Keywords

Python (programming language), Computer science, Series (stratigraphy), Scalability, Time series, Artificial intelligence, Data mining, R package, Time domain, Unit testing, Feature extraction, Machine learning, Pattern recognition (psychology), Software, Programming language, Database

Publication Info

Year: 2018
Type: article
Volume: 307
Pages: 72-77
Citations: 1100
Access: Closed

Cite This

APA Style

Maximilian Christ, N. Braun, Julius Neuffer et al. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing, 307, 72-77. https://doi.org/10.1016/j.neucom.2018.03.067

Identifiers

DOI: 10.1016/j.neucom.2018.03.067