Abstract
Designing, conducting, analyzing, reporting, and interpreting the findings of a research study require an understanding of the types and characteristics of data and variables. Descriptive statistics are typically used simply to calculate, describe, and summarize the collected research data in a logical, meaningful, and efficient way. Inferential statistics allow researchers to make a valid estimate of the association between an intervention and the treatment effect in a specific population, based upon their randomly collected, representative sample data. Categorical data can be either dichotomous or polytomous. Dichotomous data have only 2 categories, and thus are considered binary. Polytomous data have more than 2 categories. Unlike dichotomous and polytomous data, ordinal data are rank ordered, typically based on a numerical scale that is composed of a small set of discrete classes or integers. Continuous data are measured on a continuum and can have any numeric value over this continuous range. Continuous data can be meaningfully divided into smaller and smaller or finer and finer increments, depending upon the precision of the measurement instrument. Interval data are a form of continuous data in which equal intervals represent equal differences in the property being measured. Ratio data are another form of continuous data, which have the same properties as interval data, plus a true definition of an absolute zero point, and the ratios of the values on the measurement scale make sense. The normal (Gaussian) distribution (“bell-shaped curve”) is one of the most common statistical distributions. Many applied inferential statistical tests are predicated on the assumption that the analyzed data follow a normal distribution. The histogram and the Q–Q plot are 2 graphical methods to assess whether a set of data has a normal distribution (displays “normality”).
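As an illustration of the Q–Q plot mentioned above, the following is a minimal sketch, not taken from the article, that computes the coordinates of a normal Q–Q plot using only the Python standard library. The function names `qq_points` and `qq_correlation`, the plotting positions `(i - 0.5) / n`, and both example data sets are assumptions chosen for illustration. The correlation of the Q–Q pairs is used here only as a rough numerical summary of how straight the plot is.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    observed = sorted(sample)
    # Reference normal distribution with the sample's own mean and SD.
    fitted = NormalDist(mean(sample), stdev(sample))
    # (i - 0.5) / n is one common plotting-position convention (an assumption).
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


def qq_correlation(sample):
    """Pearson correlation of the Q-Q pairs; values near 1 mean a straighter plot."""
    pairs = qq_points(sample)
    theo = [t for t, _ in pairs]
    obs = [o for _, o in pairs]
    mean_t, mean_o = mean(theo), mean(obs)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in pairs)
    var_t = sum((t - mean_t) ** 2 for t in theo)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / (var_t * var_o) ** 0.5


# Hypothetical data: one roughly symmetric sample, one strongly right-skewed.
symmetric = [4.1, 4.8, 5.0, 5.2, 5.5, 5.7, 6.0, 6.2, 6.9]
skewed = [1, 1, 2, 2, 3, 4, 6, 11, 40]

r_symmetric = qq_correlation(symmetric)
r_skewed = qq_correlation(skewed)
print(f"Q-Q correlation, symmetric sample: {r_symmetric:.3f}")
print(f"Q-Q correlation, skewed sample:    {r_skewed:.3f}")
```

Plotting the pairs returned by `qq_points` against the identity line gives the usual visual check: points that bend away from the line, as they would for the skewed sample, suggest a departure from normality.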
The Shapiro-Wilk test and the Kolmogorov-Smirnov test are 2 well-known and historically widely applied quantitative methods to assess data normality. Parametric statistical tests make certain assumptions about the characteristics and/or parameters of the underlying population distribution upon which the test is based, whereas nonparametric tests make fewer or less rigorous assumptions. If the normality test concludes that the study data deviate significantly from a Gaussian distribution, rather than applying a less robust nonparametric test, the problem can potentially be remedied by judiciously and openly: (1) performing a data transformation of all the data values; or (2) eliminating any obvious data outlier(s).
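The effect of a data transformation on normality can be sketched as follows. This example is an assumption-laden illustration rather than the article's procedure: it computes only the Kolmogorov-Smirnov distance (the largest gap between the empirical distribution and a normal distribution fitted to the sample), using the standard library, and does not compute a P value. Because the mean and SD are estimated from the same sample, the usual Kolmogorov-Smirnov critical values would not apply directly. The helper name `ks_distance_to_normal` and the synthetic right-skewed data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_distance_to_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    distance = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance


# Hypothetical right-skewed data: exponentiated standard normal quantiles,
# so the log transformation recovers an approximately normal sample by design.
n = 50
raw = [math.exp(NormalDist().inv_cdf((i - 0.5) / n)) for i in range(1, n + 1)]
logged = [math.log(x) for x in raw]

d_raw = ks_distance_to_normal(raw)
d_log = ks_distance_to_normal(logged)
print(f"KS distance before log transformation: {d_raw:.3f}")
print(f"KS distance after log transformation:  {d_log:.3f}")
```

A smaller distance after transformation indicates that the transformed values sit closer to a normal distribution. In practice a formal test such as Shapiro-Wilk from a statistics package would typically be used alongside the graphical checks, and any transformation or outlier removal would be reported openly.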
Publication Info
- Year: 2017
- Type: review
- Volume: 125
- Issue: 4
- Pages: 1375-1380
- Citations: 155
- Access: Closed
Identifiers
- DOI: 10.1213/ane.0000000000002370
- PMID: 28787341