This chapter explains the statistical functions and their use in Microsoft Excel. The functions are described in Table 12-1. You might, however, need some statistical knowledge to interpret and use these functions.
Statistics addresses the following questions:
How is data presented?
What conclusions can be drawn from this data?
Statistical evaluations address extensive and complex data analyses with regard to:
Summarizing data with key measures.
Defining models to explain data patterns (such as correlation, regression, and multivariate analyses).
Defining underlying distributions to explain deviations and errors in data (such as binomial and Poisson distributions for random events and normal distribution—the Gaussian distribution curve—for measurement errors).
Evaluating observations to test whether they are random.
Defining models to allow for errors.
Table 12-1. Overview of the Statistical Functions
Function (Excel 2010, Excel 2007, and Excel 2003) | Description |
---|---|
AVEDEV() | Returns the average absolute deviation of data points from their mean |
AVERAGE() | Returns the average |
AVERAGEA() | Returns the average, including numbers, text, and logical values |
AVERAGEIF() | Average (arithmetic average) of all cells in a range that meet a specified criterion |
AVERAGEIFS() | Returns the average (arithmetic average) of all cells in a range that meet several criteria |
BETA.DIST() (2010)/BETADIST() | Returns the values of the cumulative beta distribution function |
BETA.INV() (2010)/BETAINV() | Returns the inverse of the cumulative beta distribution |
BINOM.DIST() (2010)/BINOMDIST() | Returns the probabilities of a binomial distributed random variable |
BINOM.INV() (2010)/CRITBINOM() | Returns the smallest value from the cumulative binomial distribution |
CHI.DIST() (2010) | Returns the right-tailed, chi-squared distribution value |
CHISQ.DIST.RT() (2010)/CHIDIST() | Returns the chi-squared distribution value |
CHISQ.INV() (2010) | Returns the inverse of the chi-squared distribution |
CHISQ.INV.RT() (2010)/CHIINV() | Returns the inverse of a right-tailed, chi-squared distribution |
CHISQ.TEST() (2010)/CHITEST() | Returns the independent test statistic of a chi-squared distribution |
CONFIDENCE.NORM() (2010)/CONFIDENCE() | Returns the confidence interval for a normal distribution |
CONFIDENCE.T() (2010) | Returns the confidence interval for a t-distribution |
CORREL() | Returns the correlation coefficient between two data sets |
COUNT() | Counts the numbers in an argument list |
COUNTA() | Counts the values in an argument list |
COUNTBLANK() | Returns the number of empty cells in a range |
COUNTIF() | Counts the cells in a range that meet a certain criterion |
COUNTIFS() | Counts the cells of a range that meet several criteria |
COVARIANCE.P()/COVAR() | Returns the covariance of a population |
COVARIANCE.S() (2010) | Returns the covariance of a sample |
DEVSQ() | Returns the sum of the squared deviations |
EXPON.DIST() (2010)/EXPONDIST() | Returns the probabilities of an exponential distribution |
F.DIST() (2010) | Returns the values of the F-distribution |
F.DIST.RT() (2010)/FDIST() | Returns the right-tailed values of the F-distribution |
F.INV() (2010) | Returns the inverse of the F-distribution |
F.INV.RT() (2010)/FINV() | Returns the right-tailed inverse of the F-distribution |
F.TEST() (2010)/FTEST() | Returns the test statistics of an f-test |
FISHER() | Returns the Fisher transformation |
FISHERINV() | Returns the inverse of the Fisher transformation |
FORECAST() | Returns the estimated value for a linear trend |
FREQUENCY() | Returns a frequency distribution as a single-column matrix |
GAMMA.DIST() (2010)/GAMMDIST() | Returns the probabilities of a gamma-distributed random variable |
GAMMA.INV() (2010)/GAMMAINV() | Returns the inverse of the gamma distribution |
GAMMALN() | Returns the natural logarithm of the gamma function |
GAMMALN.PRECISE() (2010) | Returns the natural logarithm of the gamma function with 15 places |
GEOMEAN() | Returns the geometric mean |
GROWTH() | Returns values based on an exponential trend |
HARMEAN() | Returns the harmonic mean |
HYPGEOM.DIST() (2010)/HYPGEOMDIST() | Returns the probabilities of a hypergeometric distribution |
INTERCEPT() | Returns the intersection of a linear regression line |
KURT() | Returns the kurtosis of a data set |
LARGE() | Returns the k-th highest value of a data group |
LINEST() | Returns the parameters of a linear trend |
LOGEST() | Returns the parameters of an exponential trend |
LOGNORM.DIST() (2010)/LOGNORMDIST() | Returns the lognormal distribution |
LOGNORM.INV() (2010)/LOGNORMINV() | Returns the inverse of the lognormal distribution |
MAX() | Returns the largest value from an argument list |
MAXA() | Returns the largest value from an argument list, including numbers, text, and logical values |
MEDIAN() | Returns the median |
MIN() | Returns the smallest value from an argument list |
MINA() | Returns the smallest value from an argument list, including numbers, text, and logical values |
MODE.SNGL() (2010)/MODE() | Returns the most frequent value occurring in a data set |
MODE.MULT() (2010) | Returns a vertical array of the most frequent or repeated values in an array or a data set |
NEGBINOMDIST() | Returns the probabilities of a negative binomial distributed random variable |
NORM.DIST() (2010)/NORMDIST() | Returns the probabilities of a normal distributed random variable |
NORM.INV() (2010)/NORMINV() | Returns the inverse of the normal distribution |
NORM.S.DIST() (2010)/NORMSDIST() | Returns the probabilities of a standard normal distributed random variable |
NORM.S.INV() (2010)/NORMSINV() | Returns the inverse of the standard normal distribution |
PEARSON() | Returns the Pearson correlation coefficient |
PERCENTILE() | Returns a percentile of a data set |
PERCENTILE.EXC() | Returns the k-quantile of a data group where k is between 0 and 1, excluding 0 and 1 |
PERCENTILE.INC() | Returns the k-quantile of a data group where k can be between 0 and 1, including 0 and 1 |
PERCENTRANK() | Returns the rank of a percentile as a percentage |
PERCENTRANK.EXC() (2010) | Returns the rank of a percentile as a percentage, excluding 0.1 |
PERCENTRANK.INC() (2010) | Returns the percentage rank of a value |
PERMUT() | Returns the number of permutations for the specified number of objects |
POISSON.DIST() (2010)/POISSON() | Returns the probabilities of a Poisson distributed random variable |
PROB() | Returns the probability that values in a range lie between two limits |
QUARTILE() | Returns the quartile of a data set |
QUARTILE.EXC() (2010) | Returns the quartile of a data set based on percentiles, excluding 0.1 |
QUARTILE.INC() (2010) | Returns the quartile of a data set based on percentiles, including 0.1 |
RANK() | Returns the rank of a number within a list of numbers |
RANK.AVG (2010) | Returns the rank of a number within a list of numbers: the relative value to other values within the list is important; if several values have the same rank, the average rank is returned |
RANK.EQ() (2010) | Returns the rank of a number within a list of numbers: the value is relative to other values in the list; if several values have the same rank, the top rank of this group of values is returned |
RSQ() | Returns the square of the Pearson correlation coefficient |
SKEW() | Returns the skewness of a distribution |
SLOPE() | Returns the slope of a linear regression line |
SMALL() | Returns the k-th lowest value of a data group |
STANDARDIZE() | Returns a normalized value |
STDEV.P() (2010)/STDEVP() | Calculates the standard deviation based on the population |
STDEV.S (2010)/STDEV() | Estimates the standard deviation based on a sample |
STDEVA() | Estimates the standard deviation based on a sample, including numbers, text, and logical values |
STDEVPA() | Calculates the standard deviation based on the population, including numbers, text, and logical values |
STEYX() | Returns the standard error of the predicted y-values for each x-value in a regression |
T.DIST() (2010) | Returns the left-tailed Student’s t-distribution |
T.DIST.RT() (2010) | Returns the right-tailed Student’s t-distribution |
T.DIST.2T() (2010)/TDIST() | Returns the two-tailed Student’s t-distribution |
T.INV() (2010) | Returns the left-tailed inverse of a Student’s t-distribution |
T.INV.2T() (2010)/TINV() | Returns the two-tailed inverse of a Student’s t-distribution |
T.TEST() (2010)/TTEST() | Returns the test statistics of a Student’s t-test |
TREND() | Returns values based on a linear trend |
TRIMMEAN() | Returns the mean of a data group, excluding the values from the top and bottom of the data set |
VAR.P() (2010)/VARP() | |
VAR.S() (2010)/VAR() | Calculates the variance based on a sample |
VARA() | Estimates the variance based on a sample, including numbers, text, and logical values |
VARPA() | Calculates the variance based on the population, including numbers, text, and logical values |
WEIBULL.DIST() (2010)/WEIBULL() | Returns the probabilities of a Weibull-distributed random variable |
ZTEST() | Returns the one-tailed probability test value for a Gauss (normal) distribution |
New in Excel 2007. The following functions from Table 12-1 were introduced in Excel 2007:
AVERAGEIF()
AVERAGEIFS()
COUNTIFS()
Many statistical functions were revised for Excel 2010. The accuracy and performance of these functions was enhanced in response to user feedback. The number and type of parameters, as well as the usage, is in some cases different from that of the previous versions. The names of some functions have also been changed to keep them in line with current naming convention.
To maintain compatibility with previous versions of Excel, the original functions are still available (in the new Compatibility category). The names of the modified functions were changed to include a period in the function name. However, the names don’t indicate which category a function belongs to or whether the name changed or the algorithm or function is new. The following lists will help you determine the function category.
This book focuses on the functions in Excel 2010. The older versions of functions are not explained in detail but only referenced.
New in Excel 2010. The following functions are available only in Excel 2010:
CHISQ.DIST()
CHISQ.INV
CONFIDENCE.T()
COVARIANCE.P()
COVARIANCE.S()
F.DIST
F.INV()
GAMMALN.PRECISE()
MODE.MULT()
PERCENTILE.EXC()
PERCENTILE.INC()
PERCENTRANK.EXC()
PERCENTRANK.INC()
QUARTILE.EXC()
QUARTILE.INC()
RANK.AVG()
RANK.EQ()
T.DIST()
T.DIST.RT()
T.INV()
The accuracy and processing speed of the following functions were improved:
BETA.DIST()
BETA.INV()
BINOM.DIST()
BINOM.INV()
CHISQ.INV.RT()
CHISQ.TEST()
F.DIST.RT()
F.INV.RT()
GAMMA.DIST()
GAMMA.INV()
GAMMALN()
GEOMEAN()
HYPGEOM.DIST()
LOGNORM.INV()
LOGNORM.DIST()
NEGBINOM.DIST()
NORM.DIST()
NORM.INV()
NORM.S.DIST()
POISSON.DIST()
LINEST()
STDEV.S()
T.DIST.RT()
T.DIST.2T()
T.INV.2T()
VAR.S()
The functions with name changes only are listed in Table 12-1.
One example is used to illustrate most of the functions in this chapter and to show how the functions address the two questions about Statistics posed at the beginning of the chapter.
Scenario. A small software company markets its products over the Internet through its website as well as through direct sales. The company regularly monitors its product sales over the web, as well as the general traffic to its website. Because the webpages serve different purposes, the company can capture valuable information about the visitors and their interests. The following data is available:
Daily visits to the website
Visits to the webpages (for general information, information about products and events, and so on)
Total sales per day for each product
Number of clicks after a newsletter has been sent
Cost and time required to maintain the website
Sales calls of each field representative per day
Sales of each field representative per month