In statistics, signal processing, and time series analysis, a sinusoidal model to approximate a sequence Yi is:
$$Y_i = C + \alpha \sin(\omega T_i + \phi) + E_i$$
where C is a constant defining a mean level, α is an amplitude for the sine wave, ω is the frequency, Ti is a time variable, φ is the phase, and Ei is the error sequence in approximating the sequence Yi by the model. This sinusoidal model can be fit using nonlinear least squares; to obtain a good fit, nonlinear least squares routines may require good starting values for the constant, the amplitude, and the frequency.
Fitting a model with a single sinusoid is a special case of least-squares spectral analysis.
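As a minimal sketch of the nonlinear least squares fit, the snippet below uses SciPy's `curve_fit` on synthetic data. It assumes the model form Y = C + α sin(ωT + φ) with ω in radians per unit time; the function names `sinusoid` and `fit_sinusoid`, the generating values, and the starting values are all illustrative choices, not part of the original description.

```python
import numpy as np
from scipy.optimize import curve_fit


def sinusoid(t, c, alpha, omega, phi):
    """Mean level c plus one sine wave (omega assumed in radians per unit time)."""
    return c + alpha * np.sin(omega * t + phi)


def fit_sinusoid(t, y, c0, alpha0, omega0, phi0=0.0):
    """Nonlinear least squares fit, started from the supplied initial values."""
    popt, _ = curve_fit(sinusoid, t, y, p0=[c0, alpha0, omega0, phi0])
    return popt


# Synthetic data with made-up parameters (not taken from any real data set).
rng = np.random.default_rng(0)
t = np.arange(200.0)
y = sinusoid(t, 5.0, 2.0, 0.3, 0.5) + rng.normal(scale=0.1, size=t.size)

# Starting values deliberately close to the generating values.
c_hat, alpha_hat, omega_hat, phi_hat = fit_sinusoid(
    t, y, c0=y.mean(), alpha0=1.8, omega0=0.3, phi0=0.0
)
print(c_hat, alpha_hat, omega_hat, phi_hat)
```

The fit is started near the generating values on purpose: the sum of squares has many local minima in ω, so a poor starting frequency can leave the routine far from the best fit.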
Good starting value for C
A good starting value for C can be obtained by calculating the mean of the data. If the data show a trend, i.e., the assumption of constant location is violated, one can replace C with a linear or quadratic least squares fit. That is, the model becomes
$$Y_i = (B_0 + B_1 T_i) + \alpha \sin(\omega T_i + \phi) + E_i$$
or
$$Y_i = (B_0 + B_1 T_i + B_2 T_i^2) + \alpha \sin(\omega T_i + \phi) + E_i$$
where B0, B1, and B2 are the coefficients of the trend.
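A short sketch of both starting-value choices, using NumPy only. The helper names `start_constant` and `start_trend` and the example series are hypothetical; the trend coefficients are returned in the order B0, B1, (B2).

```python
import numpy as np


def start_constant(y):
    """Starting value for C: the arithmetic mean of the data."""
    return float(np.mean(y))


def start_trend(t, y, degree=1):
    """Least squares polynomial trend; coefficients ordered B0, B1, (B2)."""
    coeffs = np.polyfit(t, y, degree)  # np.polyfit returns the highest power first
    return coeffs[::-1]


# Made-up series with a linear trend plus a sinusoid.
t = np.arange(100.0)
y = 1.0 + 0.05 * t + 2.0 * np.sin(0.3 * t)

c0 = start_constant(y)
b0, b1 = start_trend(t, y, degree=1)
print(c0, b0, b1)
```

Because the sinusoid does not complete a whole number of cycles over the sample, the fitted trend coefficients are only approximate, which is acceptable for starting values.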
Good starting value for frequency
The starting value for the frequency can be obtained from the dominant frequency in a periodogram. A complex demodulation phase plot can be used to refine this initial estimate for the frequency.
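The two steps can be sketched with NumPy as follows. Several things here are assumptions made for illustration: the periodogram is the raw squared FFT magnitude, ω is treated as an angular frequency (so a periodogram frequency f in cycles per unit time gives ω = 2πf), the low-pass filter in the demodulation is a plain moving average with an odd window, and the function names are hypothetical. The refinement relies on the fact that, when the demodulation frequency is slightly off, the demodulated phase drifts linearly with a slope equal to the frequency error.

```python
import numpy as np


def periodogram_peak(y, dt=1.0):
    """Frequency (cycles per unit time) of the largest periodogram ordinate."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                               # keep the zero-frequency bin from dominating
    power = np.abs(np.fft.rfft(y)) ** 2 / y.size   # raw periodogram
    freqs = np.fft.rfftfreq(y.size, d=dt)
    peak = int(np.argmax(power[1:])) + 1           # skip the zero-frequency bin
    return float(freqs[peak])


def refine_with_phase_slope(t, y, omega0, window=21):
    """Correct omega0 by the slope of the complex demodulation phase (odd window assumed)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    shifted = (y - y.mean()) * np.exp(-1j * omega0 * t)   # move the component near omega0 to zero frequency
    smooth = np.convolve(shifted, np.ones(window) / window, mode="valid")
    half = (window - 1) // 2
    t_mid = t[half:half + smooth.size]                    # times at the window centres
    phase = np.unwrap(np.angle(smooth))
    slope = np.polyfit(t_mid, phase, 1)[0]                # phase drift = frequency error
    return omega0 + slope


# Made-up series: mean level 5, amplitude 2, angular frequency 0.3.
t = np.arange(256.0)
y = 5.0 + 2.0 * np.sin(0.3 * t + 0.5)

omega0 = 2.0 * np.pi * periodogram_peak(y)
omega_refined = refine_with_phase_slope(t, y, omega0)
print(omega0, omega_refined)
```

The periodogram estimate is limited to the spacing of the Fourier frequencies, whereas the phase slope is not tied to that grid, which is why it can sharpen the initial value.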
Good starting values for amplitude
A complex demodulation amplitude plot can be used to find a good starting value for the amplitude. In addition, this plot can indicate whether the amplitude is constant over the entire range of the data or varies. If the plot is essentially flat, i.e., has zero slope, then it is reasonable to assume a constant amplitude in the nonlinear model. However, if the slope varies over the range of the plot, one may need to adjust the model to be:
$$Y_i = C + (B_0 + B_1 T_i) \sin(\omega T_i + \phi) + E_i$$
That is, one may replace α with a function of time. A linear fit is specified in the model above, but this can be replaced with a more elaborate function if needed.
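A rough sketch of the quantity such a plot displays, again with NumPy. It assumes the frequency ω (radians per unit time) is already known approximately and uses a moving average with an odd window as the low-pass filter; the name `demodulated_amplitude` and the example series are hypothetical. A straight-line fit to the resulting amplitude series then suggests starting values for a time-varying amplitude.

```python
import numpy as np


def demodulated_amplitude(t, y, omega, window=21):
    """Amplitude series from complex demodulation at angular frequency omega."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    shifted = (y - y.mean()) * np.exp(-1j * omega * t)   # component at omega moves to zero frequency
    smooth = np.convolve(shifted, np.ones(window) / window, mode="valid")
    half = (window - 1) // 2
    t_mid = t[half:half + smooth.size]                   # times at the window centres
    return t_mid, 2.0 * np.abs(smooth)                   # factor 2 restores the sine amplitude


# Made-up series whose amplitude grows linearly as 1 + 0.01 * t.
t = np.arange(300.0)
y = 5.0 + (1.0 + 0.01 * t) * np.sin(0.3 * t + 0.5)

t_mid, amp = demodulated_amplitude(t, y, omega=0.3)
slope, intercept = np.polyfit(t_mid, amp, 1)   # slope near zero would suggest a constant amplitude
print(intercept, slope)
```

If the fitted slope is negligible, the average of `amp` serves as a starting value for a constant α; otherwise the intercept and slope can serve as starting values for a linearly varying amplitude.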
Model validation
As with any statistical model, the fit should be subjected to graphical and quantitative techniques of model validation. For example, a run sequence plot can be used to check for significant shifts in location or scale, start-up effects, and outliers. A lag plot can be used to verify that the residuals are independent; outliers also appear in the lag plot. A histogram and a normal probability plot can be used to check for skewness or other non-normality in the residuals.
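The graphical checks have simple numerical counterparts, sketched below. These are stand-ins chosen for illustration, not a replacement for the plots: the lag-1 autocorrelation summarizes what a lag plot shows, the sample skewness summarizes asymmetry in the histogram, and the correlation between sorted residuals and normal quantiles measures how straight a normal probability plot would be. The plotting positions (i - 0.5)/n are one common convention and an assumption here, as are the function names.

```python
import numpy as np
from statistics import NormalDist


def lag1_autocorrelation(res):
    """Correlation between consecutive residuals (near 0 if independent)."""
    res = np.asarray(res, dtype=float)
    return float(np.corrcoef(res[:-1], res[1:])[0, 1])


def sample_skewness(res):
    """Third standardized moment (near 0 for symmetric residuals)."""
    res = np.asarray(res, dtype=float)
    centered = res - res.mean()
    return float(np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5)


def normal_plot_correlation(res):
    """Correlation of sorted residuals with normal quantiles (near 1 if normal)."""
    res = np.sort(np.asarray(res, dtype=float))
    n = res.size
    probs = (np.arange(1, n + 1) - 0.5) / n   # assumed plotting positions
    quantiles = np.array([NormalDist().inv_cdf(p) for p in probs])
    return float(np.corrcoef(quantiles, res)[0, 1])


# Simulated residuals standing in for the residuals of a fitted model.
rng = np.random.default_rng(1)
res = rng.normal(size=500)
print(lag1_autocorrelation(res), sample_skewness(res), normal_plot_correlation(res))
```

A large lag-1 autocorrelation in the residuals often means that a periodic component is still present, for example because the fitted frequency is slightly off.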