


It can be useful to have an upper and lower limit on data. These bounds can be used to help identify anomalies and to set expectations.

A bound on observations from a population is called a tolerance interval. A tolerance interval comes from the field of estimation statistics.

A tolerance interval is different from a prediction interval, which quantifies the uncertainty for a single predicted value. It is also different from a confidence interval, which quantifies the uncertainty of a population parameter such as a mean. Instead, a tolerance interval covers a proportion of the population distribution.

In this tutorial, you will discover statistical tolerance intervals and how to calculate a tolerance interval for Gaussian data.

After completing this tutorial, you will know:

- That statistical tolerance intervals provide bounds on observations from a population.
- That a tolerance interval requires that both a coverage proportion and a confidence be specified.
- That the tolerance interval for a data sample with a Gaussian distribution can be easily calculated.

If you have a sample of data from a domain, knowing the upper and lower bounds for normal values can be helpful for identifying anomalies or outliers in the data.

For a process or model that is making predictions, it can be helpful to know the expected range that sensible predictions may take.

Knowing the common range of values can help with setting expectations and detecting anomalies. The range of common values for data is called a tolerance interval.

What Are Statistical Tolerance Intervals?

The tolerance interval is a bound on an estimate of the proportion of data in a population.

"A statistical tolerance interval [contains] a specified proportion of the units from the sampled population or process."

Statistical Intervals: A Guide for Practitioners and Researchers, 2017.

The interval is limited by the sampling error and by the variance of the population distribution.
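
As a rough illustration of how a coverage proportion and a confidence level both enter the calculation for Gaussian data, the sketch below uses Howe's approximation for the two-sided tolerance factor. It is a minimal sketch under stated assumptions, not necessarily the method this tutorial uses: the function names `chi2_quantile`, `tolerance_factor`, and `tolerance_interval` are hypothetical, the data is synthetic, and the chi-squared quantile is itself approximated with the Wilson-Hilferty formula so that only the Python standard library is needed.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def chi2_quantile(q, dof):
    """Approximate chi-squared quantile (Wilson-Hilferty formula).

    Assumption: an approximation used here to avoid a SciPy dependency;
    it is reasonable for moderate-to-large degrees of freedom.
    """
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def tolerance_factor(n, coverage, confidence):
    """Two-sided Gaussian tolerance factor k (Howe's approximation)."""
    dof = n - 1
    # Normal critical value for the requested coverage proportion.
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    # Lower-tail chi-squared value for the requested confidence.
    chi2_low = chi2_quantile(1.0 - confidence, dof)
    return math.sqrt(dof * (1.0 + 1.0 / n) * z ** 2 / chi2_low)


def tolerance_interval(sample, coverage=0.95, confidence=0.99):
    """Return (lower, upper) as sample mean +/- k * sample std deviation."""
    k = tolerance_factor(len(sample), coverage, confidence)
    centre = mean(sample)
    half_width = k * stdev(sample)
    return centre - half_width, centre + half_width


# Synthetic Gaussian sample (hypothetical data: mean 50, std deviation 5).
random.seed(1)
data = [random.gauss(50.0, 5.0) for _ in range(100)]

lower, upper = tolerance_interval(data, coverage=0.95, confidence=0.99)
print(f"Covers 95% of the population with 99% confidence: "
      f"{lower:.2f} to {upper:.2f}")
```

The factor `k` is larger than the usual Gaussian critical value of about 1.96 because the mean and standard deviation are estimated from the sample. As the sample size grows, `k` shrinks toward that critical value, which reflects the interval being limited by sampling error as well as by the population variance.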
