#### Short-Term Wind Speed Forecasting Using the Data Processing Approach and the Support Vector Machine Model Optimized by the Improved Cuckoo Search Parameter Estimation Algorithm

Journal of
Chen Wang 2
Jie Wu 1
Jianzhou Wang 0
Zhongjin Hu 2
Vida Maliene
0 School of Statistics, Dongbei University of Finance and Economics , Dalian 116025 , China
1 School of Mathematics and Computer Science, Northwest University for Nationalities , Lanzhou 730030 , China
2 School of Mathematics & Statistics, Lanzhou University , Lanzhou 730000 , China
Power systems are at risk when a power-grid collapse accident occurs. As a clean and renewable resource, wind energy plays an increasingly vital role in reducing air pollution, and wind power generation has become an important way to produce electrical power. Accurate wind power and wind speed forecasting are therefore needed. In this research, a novel short-term wind speed forecasting portfolio is proposed using the following three procedures: (I) data preprocessing: apart from the regular normalization preprocessing, the data are preprocessed through empirical mode decomposition (EMD), which reduces the effect of noise on the wind speed data; (II) artificially intelligent parameter optimization introduction: the unknown parameters in the support vector machine (SVM) model are optimized by the cuckoo search (CS) algorithm; (III) parameter optimization approach modification: an improved parameter optimization approach, called the SDCS model, based on the CS algorithm and the steepest descent (SD) method is proposed. The comparison results show that the simple and effective portfolio EMD-SDCS-SVM produces promising predictions and performs better than the individual forecasting components, with very small root mean squared errors and mean absolute percentage errors.
1. Introduction
The demand for clean and renewable energy resources has
increased significantly since the acid emissions and air
pollution caused by burning fossil fuels have heavily polluted
the world environment. As a clean and renewable resource,
wind energy plays an increasingly vital role in energy supply
and wind power generation becomes an important way to
generate electrical power. However, the stochastic fluctuation
of wind makes it difficult to forecast [1–3]. Therefore,
efforts to improve the accuracy of wind speed forecasting
continue, so as to lower the likelihood of power-grid
collapse accidents.
Wind speed forecasting is an important foundation and
prerequisite for the prediction of wind power generation.
More accurate wind speed forecasts can reduce the cost of wind
rotating equipment and operation and raise the limit on
wind power penetration. At the same time, precise
prediction of wind speed helps the dispatching department
adjust the dispatch program in a timely manner, so as to reduce
the impact of wind power on the grid, effectively avoid the
adverse effect of wind farms on the power system, and enhance
the competitiveness of wind power in the electricity market.
In the literature, statistically based and neural
network-based methods are the two families of models most
widely used to forecast wind speed [4–7]. With the development
of artificial intelligence techniques, several such methods
have been presented, such as artificial neural networks,
fuzzy logic methods, and support vector machines. Guo et al.
[8] presented a wind speed forecasting strategy based on the
chaotic time series modeling technique and the Apriori algorithm.
Barbounis et al. [9] employed three different types of neural
network (NN) models to forecast the hourly wind speed (up
to 3 days) in a wind park located on the Greek island of Crete.
However, there are several unknown parameters in the NN
model. Thus, many researchers have indicated the need to
optimize the parameters in the NN model to improve wind
speed forecasting accuracy. Wang and Hu [10] improved the
performance of the back propagation (BP) NN model in the
wind speed forecasting field by optimizing the parameters in
the BP model. Both the statistical and the NN-based models
have been used by Azad et al. [11] to solve the
long-term wind speed forecasting problem for two stations in
Malaysia. However, wind speed forecasting results obtained
by the neural network models are not always superior to
those obtained by other models. Chen and Yu [12] developed
a new model by integrating the unscented Kalman filter
with the support vector regression-based state-space model.
Comparison results indicated that the new proposed model
outperforms the NN model. Apart from the NN models,
the parameter optimization strategy has also been applied to
other wind speed forecasting models. Gani et al. [13]
combined the firefly algorithm with the SVM algorithm
for short-term wind speed forecasting, where the firefly
algorithm is used to optimize the parameters of the SVM, and
obtained accurate forecasting results. Compared
with artificial intelligence models, statistical approaches are
less expensive and intrusive and, hence, more practical in
forecasting wind power generation. Statistical models are
widely used for short-term wind forecasting, predicting wind
conditions several hours in advance,
which is particularly useful for wind power generation [14].
For nonlinear wind speed time series, however, their results
are often unsatisfactory, especially in multistep prediction,
and the error increases significantly as the prediction
horizon is extended. The new paradigm of big data stream mobile
computing is quickly gaining momentum [15], while wind speed
forecasting results have been applied to many different areas [16].
It is found that the existing wind speed forecasting
models have the following disadvantages. (1) Some of the
existing models take no account of the randomness,
instability, and large fluctuation of the wind speed data,
which may lead to a high forecasting error. Therefore, in this
research, a model based on the ensemble empirical mode
decomposition (EEMD) technique is utilized to adaptively
decompose the original wind speed data into a finite number
of intrinsic mode functions with a similarity property for
modeling. (2) The existing traditional parameter estimation
methods, such as the moment estimation or the likelihood
estimation, are not dynamic and need to solve some equations
with a great deal of calculation. Therefore, the artificial
intelligent parameter estimation method named the cuckoo
search (CS) algorithm is used in this paper to estimate the
unknown parameters in the forecasting model. (3) Though
some researchers have applied artificial intelligent parameter
estimation approaches to parameter estimation, they
just adopted the original approach without considering its
deficiencies. Thus, in this paper, the steepest
descent (SD) method is used to optimize the CS algorithm
so as to enhance the convergence rate.

Based on the above motivations, this research proposes a new
short-term wind speed forecasting portfolio which not only
maintains the characteristics of the wind speed data but also
automatically estimates the unknown parameters in the
forecasting model with a considerable convergence rate. The
portfolio consists of the following three procedures: (I) data
preprocessing: apart from regular normalization preprocessing,
the data are preprocessed through the EMD model, which reduces
the effect of noise on the wind speed data; (II) artificially
intelligent parameter optimization introduction: the unknown
parameters in the support vector machine (SVM) model are
optimized by the cuckoo search (CS) algorithm; (III) parameter
optimization approach modification: although the original
CS algorithm is simple and efficient, it has disadvantages
such as insufficient search vigor and slow search speed during
the latter part of the search. Therefore, this paper proposes
an improved parameter optimization approach based on the
CS algorithm and the steepest descent (SD) method, which
is abbreviated as the SDCS model. The performance of the
developed EMD-SDCS-SVM model is compared with
that of the individual forecasting components
using the following two error evaluation criteria: the root
mean squared error and the mean absolute percentage error.
The paper is organized as follows: Section 2 introduces
related methodologies, Section 3 presents the simulation
examples and discussions, and the last section presents
concluding remarks.
2. Related Methodologies
2.1. Data Preprocessing Approach. Data preprocessing is a
common way to improve forecasting accuracy, especially for
data with high noise and different scales. This paper focuses
on handling these two problems by using the EMD model and
the normalization preprocessing approach, respectively.
2.1.1. Empirical Mode Decomposition Model. The EMD model
is an adaptive decomposition approach proposed by
Baccarelli et al. [15]. It is used in a wide range of applications,
especially in dealing with nonlinear time series. The EMD model
decomposes the original time series into several different
sequences with different scales (also called the intrinsic mode
function (IMF)) as well as a residual sequence. All IMFs must
satisfy two requirements:
(a) The number of extreme points (all maximum and
minimum points are included) must be equal to the
number of zero crossings or differ by no more than
one.
(b) In all cases, the average of the envelopes defined by
the local maxima and minima must be zero.
With the above two limitations, a signal sequence $x(t)$ can
be decomposed with the assistance of the EMD method [16]
through the following steps.
Step 1. Calculate all the local extrema (including all the
minimum and maximum values).
Step 2. Connect the local maxima by a cubic spline to
generate the upper envelope, and similarly produce the lower
envelope by connecting all the local minima with a cubic
spline interpolation; the two envelopes are denoted by
$x_{\mathrm{upper}}$ and $x_{\mathrm{lower}}$, respectively.
Step 3. Calculate the average value of the two envelopes, $m_1$,
by
$$m_1 = \frac{x_{\mathrm{upper}} + x_{\mathrm{lower}}}{2}. \tag{1}$$
Step 4. Calculate the difference $h_1$ between the data and $m_1$
by
$$h_1 = x(t) - m_1. \tag{2}$$
Step 5. Judge whether $h_1$ satisfies the two requirements of the
IMFs. If not, regard $h_1$ as the original signal sequence; then
$h_{11} = h_1 - m_{11}$. Repeat this process $k$ times until $h_{1k}$,
which is calculated by $h_{1k} = h_{1(k-1)} - m_{1k}$, is an IMF.
The first IMF sequence is obtained by
$$\mathrm{IMF}_1 = h_{1k}. \tag{3}$$
Step 6. Calculate the first residual sequence according to
$$r_1 = x(t) - \mathrm{IMF}_1. \tag{4}$$
Step 7. Regard $r_1$ as the raw data and return to Step 1 to repeat
this procedure until the final residue turns into either a
monotonic function or a function from which no more IMF
sequences can be extracted.
Finally, the original signal sequence is decomposed into
$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i + r_n. \tag{5}$$
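Steps 1 to 4 of the sifting procedure can be illustrated with a minimal sketch. It is not the authors' implementation: to stay dependency-free it joins the extrema with piecewise-linear envelopes instead of the cubic splines named in Step 2, and it pins both envelopes to the first and last data points. The function names `local_extrema`, `envelope`, and `sift_once` are hypothetical.

```python
import math


def local_extrema(x):
    """Step 1: indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Step 2 (simplified): piecewise-linear envelope through x at idx.

    Assumption: linear interpolation replaces the cubic spline, and the
    envelope is pinned to the first and last samples.
    """
    knots = [0] + idx + [len(x) - 1]
    env = []
    for a, b in zip(knots[:-1], knots[1:]):
        for t in range(a, b):
            w = (t - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    env.append(x[-1])
    return env


def sift_once(x):
    """Steps 3 and 4: envelope mean m and difference h = x - m."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    m = [(u + l) / 2 for u, l in zip(upper, lower)]
    h = [xi - mi for xi, mi in zip(x, m)]
    return h, m


# Toy signal: an oscillation riding on a slow trend.
signal = [math.sin(0.3 * t) + 0.01 * t for t in range(100)]
h1, m1 = sift_once(signal)
```

Repeating `sift_once` on `h1` until the IMF requirements hold would correspond to Step 5; a production implementation would normally use spline envelopes and a formal stopping criterion.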
2.1.2. Normalization Preprocessing. To improve the training
efficiency and the generalization ability of the SVM model,
normalization preprocessing is applied to the IMF
sequences obtained by the EMD model. Normalization
preprocessing is defined as follows:
$$x_{\mathrm{processed}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \tag{6}$$
where $x$ and $x_{\mathrm{processed}}$ represent the original data sequence
and the preprocessed data sequence, respectively, and $x_{\min}$
and $x_{\max}$ denote the minimum and the maximum data in the
original data sequence, respectively.
2.2. Support Vector Machine Model. The SVM model is the
core of statistical machine learning theories. It can surmount
difficulties that appear in the traditional machine learning
methods, such as the curse of dimensionality, easily falling
into local optima, and overlearning. In addition, it has great
generalization ability [17]. Therefore, the SVM model has
long been an attractive tool with powerful capabilities in
solving classification and regression problems. In this paper,
we mainly focus on the SVM model for regression.
Suppose that there are $N$ in-sample data points (or
training samples) $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, where
$x_i \in \mathbb{R}^n$ denotes the input vector and $y_i \in \mathbb{R}$ is the
targeted output corresponding to the input vector $x_i$. The main
purpose of the SVM for regression is to find a function $f(x)$
which satisfies the following two requirements: (a) the
deviation between $f(x_i)$ and $y_i$ is no greater than a given
positive real number $\varepsilon$, for all $i = 1, 2, \ldots, N$, and
(b) $f(x)$ is as flat as possible. In the SVM algorithm, $f$ is
defined by the formula
$$f(x) = w^{T} \varphi(x) + b, \tag{7}$$
where $\varphi$ is a nonlinear mapping of the input into a feature
space, $b$ is the threshold value, and the unknown coefficients
$w$ and $b$ can be estimated by solving the following
optimization problem:
$$\min_{w, b, \xi, \xi^{*}} \; \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} (\xi_i + \xi_i^{*}) \quad \text{s.t.} \quad \begin{cases} y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\ w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\ \xi_i, \xi_i^{*} \ge 0, \quad i = 1, 2, \ldots, N, \end{cases} \tag{8}$$
where $C$ denotes the penalty coefficient, $\xi_i$ and $\xi_i^{*}$ are two
slack variables, and $\varepsilon$ is the tube size. Problem (8) can be
solved by introducing two Lagrange multipliers $\alpha_i$ and $\alpha_i^{*}$
and maximizing the following Lagrange dual function [14]:
$$L = -\frac{1}{2} \sum_{i,j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) K(x_i, x_j) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^{*}), \tag{9}$$
subject to
$$\sum_{i=1}^{N} (\alpha_i - \alpha_i^{*}) = 0, \quad 0 \le \alpha_i, \alpha_i^{*} \le C,$$
where $K(\cdot, \cdot)$ is called the kernel function. The following four
types of kernel functions are commonly used [18, 19]: (a)
linear kernel function: $K(x, y) = x^{T} y$, (b) polynomial kernel
function: $K(x, y) = (x^{T} y + 1)^{d}$, (c) sigmoid kernel function:
$K(x, y) = \tanh(\gamma x^{T} y + \theta)$, and (d) Gaussian kernel function:
$K(x, y) = \exp(-\|x - y\|^{2} / 2\sigma^{2})$, where $d$, $\gamma$, $\theta$, and $\sigma$ are kernel
parameters.
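The four kernel functions listed above can be written down directly. The sketch below is only an illustration in plain Python; the parameter names (`d`, `gamma`, `theta`, `sigma`) and their default values are assumptions chosen for the example, and `gram_matrix` is a hypothetical helper that evaluates a kernel on every pair of training inputs.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, d=2):
    return (dot(x, y) + 1) ** d


def sigmoid_kernel(x, y, gamma=0.1, theta=0.0):
    return math.tanh(gamma * dot(x, y) + theta)


def gaussian_kernel(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 * sigma^2)); sigma is the kernel width.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def gram_matrix(points, kernel, **params):
    """Kernel value for every pair of points (hypothetical helper)."""
    return [[kernel(p, q, **params) for q in points] for p in points]


points = [[1.0, 2.0], [3.0, 4.0], [0.0, -1.0]]
K = gram_matrix(points, gaussian_kernel, sigma=2.0)
```

A smaller `sigma` makes the Gaussian kernel decay faster with distance, which is why the kernel width and the penalty coefficient are tuned together later in the paper.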
2.3.1. Original Cuckoo Search Algorithm. The CS algorithm
was first developed by Sun et al. [20] in 2007. It is derived
from the behavior of cuckoos laying their eggs in the nests of
other birds to let those birds hatch the eggs for them. However,
once the host birds discover the cuckoo eggs, these eggs will
be thrown away or the host birds will abandon their nests and
build a new nest elsewhere. The CS algorithm is constructed
based on three assumptions: (a) only one egg is laid by each
cuckoo in a randomly selected nest; (b) the best nests will be
carried over to the following generations; and (c) the number
of available host nests is a constant, and the probability
of an egg laid by a cuckoo being discovered by the host bird
is $p_a$, which lies in the range from 0 to 1.
In the CS algorithm, each nest represents a solution.
The pseudo code of the CS technique [21] presented in
Algorithm 1 can aid in understanding the CS process.
The Lévy flight mentioned in the pseudo code of
Algorithm 1 is generated according to
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \text{Lévy}(\lambda), \tag{13}$$
where $\alpha > 0$ is the step size, which should be related to
the scale of the problem of interest. The product $\oplus$ indicates
entry-wise multiplication. A Lévy flight is
considered when the step-lengths are distributed according to the
following probability distribution:
$$\text{Lévy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3, \tag{14}$$
which has an infinite variance. Here, the consecutive steps of
a cuckoo search essentially form a random walk process that
obeys a power-law step-length distribution with a heavy tail.
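The search loop described above can be sketched as follows. This is a simplified illustration, not the pseudo code of Algorithm 1: the heavy-tailed steps are drawn with Mantegna's algorithm (an assumed sampler with exponent `beta`), the step size `alpha` is scaled by the width of each variable's search interval, and the names `levy_step`, `clip`, and `cuckoo_search` are hypothetical. The defaults of 25 nests and a discovery probability of 0.25 follow the values used later in the paper.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One heavy-tailed step via Mantegna's algorithm (assumed sampler)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def clip(x, bounds):
    """Project a candidate back into the search box."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]


def cuckoo_search(f, bounds, n_nests=25, pa=0.25, alpha=0.01,
                  n_iter=200, seed=0):
    """Minimize f over the box `bounds`; returns (best_nest, best_value)."""
    rng = random.Random(seed)

    def new_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [new_nest() for _ in range(n_nests)]
    fitness = [f(x) for x in nests]
    for _ in range(n_iter):
        for i in range(n_nests):
            # New egg: Levy flight from nest i, scaled to the variable range.
            cand = clip([xi + alpha * (hi - lo) * levy_step(rng)
                         for xi, (lo, hi) in zip(nests[i], bounds)], bounds)
            fc = f(cand)
            j = rng.randrange(n_nests)  # lay the egg in a random nest
            if fc < fitness[j]:
                nests[j], fitness[j] = cand, fc
        # Host birds discover eggs with probability pa; the best nest is kept.
        best_idx = fitness.index(min(fitness))
        for i in range(n_nests):
            if i != best_idx and rng.random() < pa:
                nests[i] = new_nest()
                fitness[i] = f(nests[i])
    k = fitness.index(min(fitness))
    return nests[k], fitness[k]
```

In the forecasting setting, `f` would be a validation error of the SVM as a function of its two parameters, and `bounds` would be their search intervals.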
2.3.2. Modified Cuckoo Search Method. Similar to other
metaheuristic algorithms, the original CS algorithm is simple and
efficient; however, it has disadvantages such as insufficient
search vigor and slow search speed during the latter part of
the search. As one of the oldest optimization algorithms, the
steepest descent (SD) method [22] is simple and intuitive.
Currently, there are many effective optimization algorithms
established on the basis of this algorithm. In order to
overcome the CS algorithm's shortcoming of a slow convergence
rate, the SD method is used to modify the CS algorithm, and the
modified model is abbreviated as the SDCS model. In the SDCS
model, the following equation substitutes for (13):
$$x_{k+1} = x_k + \lambda_k d_k, \tag{15}$$
where $d_k$ is defined by
$$d_k = -\nabla f(x_k). \tag{16}$$
The SDCS process can be expressed by the following
procedures.
Step 1. Initialize the initial point $x_0$ and the end error
$\varepsilon > 0$, and set $k = 0$.
Step 2. Calculate $\nabla f(x_k)$. If
$\|\nabla f(x_k)\| = \|-\nabla g(x_i^k)\| \le \alpha \oplus \text{Lévy}(\lambda) \le \varepsilon$,
terminate the iteration and output the value of $x_k$.
Otherwise, go to Step 3.
Step 3. Set $d_k = -\nabla f(x_k)$.
Step 4. Conduct a one-dimensional search. Get the value of
$\lambda_k$ which satisfies
$f(x_k + \lambda_k d_k) = \min_{\lambda \ge 0} f(x_k + \lambda d_k)$;
then set $x_{k+1} = x_k + \lambda_k d_k$ and $k := k + 1$, and return
to Step 2.
The step size and step-length distribution function of the
CS algorithm can be improved by using steepest descent due
to its simplicity and flexibility. The final optimal solution
can be obtained by modifying the step size and step-length
distribution function constantly.
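The steepest descent component (Steps 1 to 4) can be sketched on its own. The sketch below makes several assumptions that the paper does not specify: the gradient is approximated by central differences, the one-dimensional search of Step 4 is a golden-section search restricted to the interval [0, `max_step`], and the stopping test uses only the gradient norm against the end error. All function names are hypothetical.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


def golden_section(phi, lo=0.0, hi=1.0, tol=1e-8):
    """Minimize a unimodal one-dimensional function phi on [lo, hi]."""
    r = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c, d = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c
            c = b - r * (b - a)
        else:
            a, c = c, d
            d = a + r * (b - a)
    return (a + b) / 2


def steepest_descent(f, x0, eps=1e-6, max_iter=1000, max_step=1.0):
    """Iterate x_{k+1} = x_k + lambda_k * d_k with d_k = -grad f(x_k)."""
    x = list(x0)                                           # Step 1
    for _ in range(max_iter):
        g = numerical_gradient(f, x)
        if sum(gi * gi for gi in g) ** 0.5 <= eps:         # Step 2
            break
        d = [-gi for gi in g]                              # Step 3
        lam = golden_section(                              # Step 4
            lambda t: f([xi + t * di for xi, di in zip(x, d)]),
            0.0, max_step)
        x = [xi + lam * di for xi, di in zip(x, d)]
    return x


def bowl(x):
    """Toy objective with its minimum at (1, -0.5)."""
    return (x[0] - 1) ** 2 + 2 * (x[1] + 0.5) ** 2


x_min = steepest_descent(bowl, [0.0, 0.0])
```

In a hybrid scheme such as SDCS, a local step of this kind would refine candidates that the Lévy flights have already placed near a good region; how the two are interleaved is a design choice that this sketch leaves open.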
2.4. Proposed Novel Model. Based on the above
methodologies, we propose a novel short-term wind speed forecasting
portfolio with three steps (Figure 1): (I) data preprocessing:
both the regular normalization preprocessing model and
the EMD approach are used for data preprocessing, which
reduces the effect of noise and different scales on the wind
speed data; (II) artificially intelligent parameter optimization
introduction: the unknown parameters in the SVM model are
optimized by the CS algorithm; (III) parameter optimization
approach modification: although the original CS algorithm is
simple and efficient, it has disadvantages such as insufficient
search vigor and slow search speed during the latter part
of the search. Therefore, this paper proposes an improved
parameter optimization approach based on the CS algorithm
and the steepest descent (SD) method, which we call the
SDCS model. The final forecasting model is called the
EMD-SDCS-SVM model.
[Figure: steepest descent modified cuckoo search. (1) Initialize: select the initial point $x_0$; (2) calculate the gradient; (3) iterate and output $x_k$, stopping when the gradient criterion is met; (4) constantly modify the step size and the distribution of the step-length Lévy($\lambda$) to get the optimum solution; (5) check the termination criteria or go to Step 1.]
The performance of the SVM depends on a good set of
parameters, including the penalty parameter $C$ and the
parameter $\sigma$ of the kernel function. Parameter adjustment and
selection for the support vector machine is still a difficult
issue in the research field. The generalization performance of
the support vector machine is closely related to the selection
of specific parameters in the model. The penalty coefficient
and the kernel parameters must be selected by the user.
In practical applications, however, controlling the forecasting
complexity is more difficult, because the parameters $C$ and
$\sigma$ must be adjusted simultaneously.
(1) The Penalty Coefficient $C$. The penalty coefficient
balances the model between complexity and training
error, so that the model has better generalization ability.
Furthermore, the parameter $C$ controls the robustness of the
forecasting model. Different training groups have different
optimal values of $C$. For forecasting problems, if $C$
is too small, the punishment for misestimated samples in the
sample data is small. As a result, the training error becomes
larger and the system's generalization ability is poorer. When
new data are forecasted by the model, the fitting error will
be very high, and the phenomenon of "underlearning" will
appear. On the contrary, if $C$ is too large, the
weight of $(1/2)\|w\|^{2}$ will be smaller. Although the fitting
error on the available data is very low, the fitting error on
new data is very high; this is the so-called "overlearning"
phenomenon, and the generalization ability of the model is still
very poor. Each sample data group has at least one suitable $C$,
which makes the SVM generalization performance the best.
Therefore, the correct choice of the parameter $C$ can improve the
prediction accuracy of the model.
(2) The Kernel Function Parameter $\sigma$. For the kernel function of
the SVM, the linear kernel function, polynomial kernel function,
radial basis kernel function, and sigmoid function are the
most commonly used. The width $\sigma$ of the radial basis function
is the same for all kernel functions and reflects the
corresponding width of the inner product kernel for the input.
If $\sigma$ is too small, it will lead to overfitting or
memorization of the training group. If $\sigma$ is
too large, it will make the SVM discriminant function too smooth.
The width of the kernel function and the penalty coefficient
affect the shape of the prediction curve of the support vector
machine from different angles. In practical applications, a
penalty coefficient $C$ or kernel width $\sigma$ that is too large
or too small will make the generalization performance of the
support vector machine worse.
[Figure: modified cuckoo search method: get the fitness function; optimize the step size and the probability distribution of the step size Lévy($\lambda$).]
Based on the analysis of influences of each parameter
on the performance of SVM, we put forward the time
series forecasting model by using modified cuckoo search
(SDCS) algorithm to optimize SVM parameters. It not only
maintains the characteristics of time series, but also can select
the parameters of SVM automatically, which eliminates the
blindness and randomness caused by artificial selection. The
main procedures of this EMD-SDCS-SVM are as follows.
Procedure 1. Collect wind speed time series data. Use the
EMD to preprocess the wind speed data and reconstitute the
new wind speed time series, which will be treated as the
training sample of the SVM model.
Procedure 2. Determine the range of $C$ and $\sigma$, the maximum
step (stepmax), the minimum step (stepmin), and the
maximum number of iterations. Set the probability of an egg
laid by a cuckoo being discovered by the host bird as
$p_a = 0.25$, and initialize the number of host nests as $n = 25$.
Each nest corresponds to a two-dimensional vector $(C, \sigma)$.
Procedure 3. Search for the optimum value of the
two-dimensional vector $(C, \sigma)$ according to the SDCS algorithm;
the detailed steps that need to be implemented in this procedure
are shown in Figure 2.
Procedure 4. Use the optimum parameter values obtained in
Procedure 3 and the processed data obtained in Procedure 1
to construct the forecasting model and obtain the forecasting
results.
3. Simulation Examples and Discussions
3.1. Data Division and Parameter Initialization. Wind speed
data recorded by four wind turbines (numbered #1, #2, #3, and
#4) during the period from Jan 2, 2011, to Jan 6, 2011, with a
time resolution of 10 minutes are used to verify the
effectiveness of the new proposed hybrid model. The data from Jan 2
to Jan 5 are adopted as the in-sample data (i.e., training data),
while those on Jan 6 are used as the out-of-sample data (i.e.,
testing data).
Step 1. The original wind speed series are decomposed into a
high-frequency component and a low-frequency component,
which represent the noise signal and the main features of the
wind speed series, respectively (see Figures 3(a)–3(c)).
[Figure 3: (a) actual wind speed series {X}, Jan 1 to Jan 7, time axis in 10 min steps; (b) high-frequency noise reduction based on the soft-threshold denoising method; (c) wind speed series after noise reduction {XD}; (d) IMF1 to IMF6 and the SDCS-SVM model; (e) forecasting result for wind speed, actual data versus forecasting data.]
Step 2 (data splitting and normalization). The wind
speed series after noise reduction are split into a training set
and a test set. The training set consists of the input and
output sets used to train the parameters of the SVM, and the
test set consists of the inputs and outputs used to test the
model's forecasting effectiveness. To establish the model, the
training datasets and the input test sets are normalized with
the same setting (see Figure 3(d)).
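The splitting and normalization step can be sketched as follows, using the data layout stated in Section 3.1 (10-minute resolution, four training days, one test day). Interpreting "the same setting" as reusing the training minimum and maximum for the test inputs is an assumption, and the helper names and the synthetic series are hypothetical.

```python
POINTS_PER_DAY = 24 * 60 // 10  # 144 samples per day at 10-minute resolution


def split_by_days(series, train_days=4, test_days=1):
    """First train_days go to the training set, the next test_days to the test set."""
    n_train = train_days * POINTS_PER_DAY
    n_test = test_days * POINTS_PER_DAY
    return series[:n_train], series[n_train:n_train + n_test]


def min_max_fit(train):
    """Normalization setting: minimum and maximum of the training data."""
    return min(train), max(train)


def min_max_apply(values, lo, hi):
    """x_processed = (x - x_min) / (x_max - x_min)."""
    return [(v - lo) / (hi - lo) for v in values]


def min_max_invert(values, lo, hi):
    """Map normalized values (e.g. forecasts) back to the original scale."""
    return [v * (hi - lo) + lo for v in values]


# Synthetic stand-in for five days of denoised wind speed (m/s).
series = [5.0 + 0.1 * (t % 50) for t in range(5 * POINTS_PER_DAY)]
train, test = split_by_days(series)
lo, hi = min_max_fit(train)
train_scaled = min_max_apply(train, lo, hi)
test_scaled = min_max_apply(test, lo, hi)  # same setting as the training set
```

Fitting the minimum and maximum on the training data only keeps information from the test day out of the model; test values can then fall slightly outside [0, 1] if the test day contains a new extreme.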
Step 3 (initialization: an SVM with two parameters). The
penalty coefficient $C$ and the kernel function parameter $\sigma$
are shown in Figure 3(e). The number of SVM parameters
is the size of each nest in the SDCS algorithm, namely, the
dimension of the optimized parameters.
Step 4 (optimization). The objective function of the SDCS
algorithm is given as follows:
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \text{Lévy}(\lambda). \tag{17}$$
Step 5 (SVM construction). The best solution obtained by the
SDCS algorithm is set to be the final connection weights of
SVM training and construction. The terminal condition of
network training is set to be the reach of maximum iterations
or no further improvement (see Figures 3(d)–3(e)).
Step 6 (EMD-SDCS-SVM construction for the test dataset).
The forecasting data of the output test sets are generated by
importing the input test sets based on the established optimal
SVM (see Figure 3(e)).
Step 7 (evaluation). The quality of the EMD-SDCS-SVM is
assessed by the indices SDCS and SVM, which presents the
validity and informativeness of EMD, respectively. With the
aim of comprehensive evaluation, MAPE is calculated as well.
To employ the methodologies introduced in Section 2
of this paper, the parameters contained in the models are
initialized as follows: in the CS algorithm, the number of
host nests is initialized as $n = 25$, and the probability of an
egg laid by a cuckoo being discovered by the host bird is given
as $p_a = 0.25$. The Gaussian kernel function is chosen for the
SVM method. In the GA, the maximum number
of iterations is initialized as 50, and the population size is 100.
The crossover probability is 0.3 and the mutation probability
is 0.1. When the CS algorithm and the GA are adapted for SVM
optimization, the search interval of the penalty coefficient $C$
is set to [0.1, 100], while the search interval of the kernel
function parameter $\sigma$ is set to [0.01, 1000].
3.2. Data Preprocessing Results. Wind speed data are first
preprocessed by the EMD method. Figure 4 shows the IMFs
and residue results obtained by the EMD method for the
four wind turbines. As indicated in Figure 4, for the #2 and
#3 wind turbines, 7 IMF sequences are extracted from the
original wind speed training dataset, while 6 IMF sequences
are extracted for the other two wind turbines. According to
the principle of denoising, eliminating the high-frequency
sequence from the IMF sequences can assist in obtaining
cleaner data sequence, that is, data sequence with lower noise.
In this paper, the first IMF sequence obtained by the EMD
method is eliminated from the original data sequence to
improve the accuracy of wind speed forecasting. The
visualization of the denoising preprocessing by the EMD method
for the four wind turbines is shown in Figure 5. The final
results after denoising with the EMD method and the
normalization operation are also presented in Figure 5.
3.3. Forecasting Results. To validate the effectiveness of the
EMD-SDCS-SVM model in wind speed forecasting, the
model is used to forecast wind speed with four horizons:
1-step-ahead, 2-step-ahead, 4-step-ahead, and 6-step-ahead.
The forecasting results obtained by this model are compared
with those obtained by the nonparameterization method
EMD-SVM, the unmodified parameterization method
EMD-CS-SVM, and another parameterization method,
EMD-GA-SVM, where GA is the abbreviation for the Genetic
Algorithm [23].
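The forecasting horizons can be made concrete by showing how training pairs for an h-step-ahead model might be built from a series. The paper does not state how inputs are lagged, so the sketch below assumes a direct strategy in which a window of `n_lags` consecutive values predicts the value `horizon` steps after the window; `make_samples` and `n_lags` are hypothetical names.

```python
def make_samples(series, n_lags, horizon):
    """Build (inputs, targets) for direct h-step-ahead forecasting.

    Each input holds n_lags consecutive values ending at time t;
    the target is the value at time t + horizon.
    """
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        inputs.append(series[t - n_lags + 1:t + 1])
        targets.append(series[t + horizon])
    return inputs, targets


# With 10-minute data, horizons 1, 2, 4, and 6 correspond to
# 10, 20, 40, and 60 minutes ahead.
toy_series = list(range(10))
X2, y2 = make_samples(toy_series, n_lags=3, horizon=2)
# X2[0] is [0, 1, 2] and its target y2[0] is 4.
```

Under this assumption one model is trained per horizon, each on its own set of pairs.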
Figure 6 presents the forecasting results of the four
EMD-based models. In this figure, the wind speed at the center
of the circular rings, with the value of 0, is the smallest, and
the bigger the radius, the larger the wind speed value. The
difference in radius between two adjacent circular rings
is 5. As shown in Figure 6, the forecasting results obtained by
these EMD-based models fit the actual wind speed data best
when the forecasting horizon is 1-step-ahead, while the fit is
the worst in the 6-step-ahead situation; that is, the deviation
between the wind speed data forecast by the models and
the actual wind speed data becomes larger as the forecasting
horizon increases. In addition, the EMD-SVM and the
EMD-GA-SVM methods deviate much more significantly from the
actual data when compared to the other models.
In addition, the forecast results obtained by these models
are analyzed according to the Quantile-Quantile (Q-Q) plot.
The $f$ quantile corresponding to a datum $q(f)$ means that
approximately a decimal fraction $f$ of the data can be
found below the datum. The quantile is calculated in the
following manner: sort the data in a sequence $\{x_i\}_{i=1,2,\ldots,n}$ in
ascending order. The sorted data $\{x_{\langle i \rangle}\}_{i=1,2,\ldots,n}$ have rank
$i = 1, 2, \ldots, n$. Then, the quantile value $f_i$ for the datum
$x_{\langle i \rangle}$ is computed by
$$f_i = \frac{i - 0.5}{n}. \tag{18}$$
The 0.25, 0.5, and 0.75 quantiles are called the lower
quantile, the median, and the upper quantile, respectively.
The Q-Q plot is used to compare the quantiles of two samples.
If the two samples come from the same type of distribution,
the plot will be a straight line. A straight reference line that
passes through the lower quantile and the upper quantile is
helpful for assessing the Q-Q plot. The greater the distance
from this reference line, the more likely it is that the two
samples come from populations with different distributions.
The vertical and the horizontal axes of the Q-Q plot are the
estimated quantiles from the two samples, respectively. If the
sizes of these two samples are the same, the Q-Q plot is just
a plot of the sorted data in the first sample against the sorted
data in the second sample. As an example, Figure 7 provides
an empirical Q-Q plot of the quantiles of the actual wind
speed sequence versus the quantiles of the forecast data for
the #4 wind turbine, where $X$ represents the actual wind
speed data sequence, and $Y_1$, $Y_2$, $Y_3$, and $Y_4$ denote the wind
speed data sequences forecasted by the EMD-SVM model,
the EMD-CS-SVM model, the EMD-GA-SVM model, and
the EMD-SDCS-SVM model, respectively. The straight line
shown in each subplot is the extrapolated line which
joins the lower and the upper quantiles, and the vertical axis
and the horizontal axis in each subplot are the estimated
quantiles from the corresponding forecast data sequence and
the actual data sequence. As observed from Figure 7, the
forecast values are sometimes larger than the actual values
(corresponding to the plus symbols located above the straight
line) and sometimes smaller than the actual values
(corresponding to the plus symbols located below the straight
line). Figure 7 also reveals that the EMD-SDCS-SVM model
fits the actual wind speed data best when compared to the
other three models.
[Figure 7: empirical Q-Q plots of the X quantiles versus the Y quantiles for the 1-, 2-, 4-, and 6-step-ahead forecasts.]
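The quantile rule of equation (18) and the equal-size Q-Q construction described above can be illustrated with a short sketch. The function names are hypothetical, and the sample values are made up for the example.

```python
def quantile_positions(n):
    """Quantile value f_i = (i - 0.5) / n for the sorted ranks i = 1..n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def qq_pairs(actual, forecast):
    """Q-Q points for two samples of equal size: sorted against sorted."""
    if len(actual) != len(forecast):
        raise ValueError("this sketch only handles samples of equal size")
    return list(zip(sorted(actual), sorted(forecast)))


positions = quantile_positions(4)   # [0.125, 0.375, 0.625, 0.875]
pairs = qq_pairs([7.2, 5.1, 6.3], [6.9, 5.4, 6.0])
# pairs == [(5.1, 5.4), (6.3, 6.0), (7.2, 6.9)]
```

Points of `pairs` that lie close to a straight line suggest that the forecast values follow the same type of distribution as the actual values.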
3.4. Forecasting Error Comparison. Results presented in
Section 3.3 provide graphical visualization of the performance
of the different forecasting models. In this section, the
superior performance of the EMD-SDCS-SVM model is shown
quantitatively. To do this, two error evaluation criteria, the
root mean squared error (RMSE) and the mean absolute
percentage error (MAPE), are adopted and defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^{2}}, \qquad \mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%,$$
where $N$ is the number of data points in the out-of-sample data
and $y_i$ and $\hat{y}_i$ are the actual value and the forecasted value,
respectively.
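The two criteria can be computed with a few lines of code. The sketch below uses the standard definitions of RMSE and MAPE (the latter expressed in percent) and assumes that no actual value is zero, since MAPE divides by the actual value; the example numbers are made up.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


actual = [10.0, 20.0]
forecast = [12.0, 18.0]
print(rmse(actual, forecast))  # 2.0
print(mape(actual, forecast))  # about 15 (percent)
```

Because MAPE is relative to the actual value, very low wind speeds can inflate it, which is one reason to report it alongside RMSE.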
From Table 1 and Figure 8, it can be seen that, compared
to the ANN forecasting models, the SVM model achieves
favorable forecasting accuracy; in particular, in the four- and
six-step-ahead forecasting results, the SVM is superior to the
ANN models BPNN, Elman NN, and WNN.
The forecasting error results with different forecasting
horizons of these 4 models are given in Table 2 and Figure 9.
As observed from Table 2 and Figure 9, the forecasting error
values become larger as the forecasting horizon increases. For
the #1, #2, and #4 wind turbines, the EMD-SDCS-SVM model
always obtains more accurate wind speed forecasting results
than the other three models. In addition, for the #3 wind
turbine, the EMD-SDCS-SVM model is superior to both the
EMD-SVM and the EMD-CS-SVM models, which means
that the proposed novel model EMD-SDCS-SVM has made
promising predictions and has better performance than its
individual forecasting components.
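How an error value is obtained for each forecasting horizon can be sketched as follows. This is an illustration only: the persistence forecaster (repeat the last observation) is an assumed stand-in for the SVM-based models, and the names `persistence_forecast` and `rmse_by_horizon` are hypothetical.

```python
import math


def persistence_forecast(history, horizon):
    """Stand-in forecaster (assumption): repeat the last observed value."""
    return history[-1]


def rmse_by_horizon(series, horizons, forecaster=persistence_forecast):
    """Return {h: RMSE} for h-step-ahead forecasts over all usable origins."""
    result = {}
    for h in horizons:
        squared = []
        # history = series[:t]; the h-step-ahead target is series[t - 1 + h]
        for t in range(1, len(series) - h + 1):
            predicted = forecaster(series[:t], h)
            squared.append((series[t - 1 + h] - predicted) ** 2)
        if not squared:
            raise ValueError("series too short for horizon %d" % h)
        result[h] = math.sqrt(sum(squared) / len(squared))
    return result


# A steadily rising toy series: the persistence error equals the horizon.
toy_series = [float(i) for i in range(20)]
errors = rmse_by_horizon(toy_series, [1, 2, 4, 6])
```

On this toy series the error grows with the horizon by construction, which mirrors the qualitative pattern reported above. Any forecaster with the same signature could be passed in place of the stand-in.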
[Table 1: Forecasting errors (MSE and MAPE) of the ANN models and the SVM model for the #1 to #4 wind turbines.]
[Figure 8: Multi-ANN compared with SVM. The forecasting accuracy of each turbine (MSE), shown for the #1 and #3 wind turbines.]
[Table 2: Forecasting errors (MSE and MAPE) of the four models at different forecasting horizons for the #1 to #4 wind turbines.]
[Figure 9: MAPE and MSE of the EMD-SVM, EMD-GA-SVM, EMD-CS-SVM, and EMD-SDCS-SVM models for one-, two-, four-, and six-step forecasting (#1 and #3 wind turbines).]
4. Conclusions
Wind speed forecasting plays a significant part in the
economical and secure operation of wind farm systems, and
accurate forecasting results have a significant influence on
the economy. Recently, academia and industry have paid more
attention to wind speed forecasting. More accurate forecasting
could reduce costs and risks, improve the security of power
systems, and help administrators develop an optimal action
program, thereby enhancing the economic and social benefits of
power-grid management. Therefore, it is highly desirable to
develop techniques that improve the accuracy of wind speed
forecasting.
However, individual models do not always achieve a desirable
performance. A properly selected hybrid model
can reduce certain negative effects that are inherent to each
of these individual models; moreover, the hybrid forecasting
model can make full use of the advantages of each of the
individual models and is less sensitive, in certain cases, to
the factors that make the individual models perform
poorly.
In this paper, to enhance the forecasting capacity of the
proposed combined model, three procedures were integrated:
the data preprocessing procedure, the artificially intelligent
parameter optimization introduction procedure, and the
parameter optimization approach modification procedure.
The SVM model used in this paper can
handle data with nonlinear features, and the SD technique
is adopted to enhance the convergence speed of the CS
algorithm, which is utilized to optimize the parameters in the
SVM model. The effectiveness and robustness of the proposed
approach have been successfully tested on real wind speed
data sampled at four wind turbines. Based on the Q-Q plot
and the error comparison, results show that the developed
portfolio EMD-SDCS-SVM has made promising predictions
and has better performance than its individual forecasting
components, with very small MAPE and MSE values. For
instance, the average MAPE values of the combined model
were 0.7138%, 1.0281%, 4.8394%, 0.9239%, and 7.3367%,
which are lower than those of BPNN, WNN, and Elman
NN. Improving forecasting accuracy and stability could save
a large amount of money and energy in the wind farm.
The hybrid model can be applied to forecast the wind
speed used in wind power scheduling to produce
various benefits: saving on economic dispatching, reducing
production costs, and reducing the spinning reserve capacity
of the electrical power system. This model is also useful for
supporting wind farm decision making in practice. The
combined forecasting model, which has high precision, is a
promising model for use in the future. In addition, this hybrid
model can be utilized in other forecasting fields, such as
product sales forecasting, tourism demand forecasting, early
warning and flood forecasting, and traffic-flow forecasting.
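The idea of combining the CS algorithm with the SD method can be illustrated with a small sketch. It is not the authors' implementation: the exact SDCS update rule is not restated in this section, so the code assumes one plausible reading, in which each generation performs Lévy-flight moves and nest abandonment as in cuckoo search and then refines the current best nest with a few steepest-descent steps on a finite-difference gradient. The toy quadratic objective stands in for an SVM validation error, and all names and default values (`sdcs_minimize`, `pa`, `alpha`, `sd_rate`, and so on) are hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One Levy-distributed step length (Mantegna's algorithm)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / max(abs(v), 1e-12) ** (1 / beta)


def numeric_gradient(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def sdcs_minimize(f, bounds, n_nests=15, n_iter=50, pa=0.25,
                  alpha=0.01, sd_steps=3, sd_rate=0.1, seed=0):
    """Cuckoo search with an assumed steepest-descent refinement step."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    scores = [f(n) for n in nests]
    for _ in range(n_iter):
        best = list(nests[scores.index(min(scores))])
        # Levy-flight move relative to the best nest; keep it only if better.
        for i in range(n_nests):
            cand = clip([xi + alpha * levy_step(rng) * (xi - bi)
                         for xi, bi in zip(nests[i], best)])
            cand_score = f(cand)
            if cand_score < scores[i]:
                nests[i], scores[i] = cand, cand_score
        # Simplification: abandon the worst fraction pa of the nests.
        order = sorted(range(n_nests), key=scores.__getitem__)
        for i in order[n_nests - int(pa * n_nests):]:
            nests[i] = random_nest()
            scores[i] = f(nests[i])
        # Assumed SD step: descend from the best nest while it improves.
        b = scores.index(min(scores))
        for _ in range(sd_steps):
            grad = numeric_gradient(f, nests[b])
            trial = clip([xi - sd_rate * gi for xi, gi in zip(nests[b], grad)])
            trial_score = f(trial)
            if trial_score >= scores[b]:
                break
            nests[b], scores[b] = trial, trial_score
    b = scores.index(min(scores))
    return nests[b], scores[b]


def toy_objective(p):
    """Stand-in for a validation error; minimum 0 at (2, -1)."""
    return (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2


best_point, best_value = sdcs_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a forecasting setting, the objective would instead train an SVM with the candidate parameters and return its validation error. The random search then supplies global exploration, while the descent step accelerates convergence near a good candidate.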
Competing Interests
The authors declare that they have no competing interests.
[1] A. Tascikaraoglu , B. M. Sanandaji , K. Poolla , and P. Varaiya , “ Exploiting sparsity of interconnections in spatio-temporal wind speed forecasting using Wavelet Transform,” Applied Energy , vol. 165 , pp. 735 - 747 , 2016 .
[2] M. Lydia , S. Suresh Kumar , A. I. Selvakumar , and G. E. P. Kumar , “ Linear and non-linear autoregressive models for short-term wind speed forecasting,” Energy Conversion and Management , vol. 112 , pp. 115 - 124 , 2016 .
[3] Z. Men , E. Yee , F.-S. Lien , D. Wen , and Y. Chen , “ Short-term wind speed and power forecasting using an ensemble of mixture density neural networks , ” Renewable Energy , vol. 87 , pp. 203 - 211 , 2016 .
[4] H. Liu , H.-Q. Tian , Y.-F. Li , and L. Zhang , “ Comparison of four Adaboost algorithm based artificial neural networks in wind speed predictions,” Energy Conversion and Management , vol. 92 , pp. 67 - 81 , 2015 .
[5] H. Liu , H. Q. Tian , X. F. Liang , and Y. F. Li , “ New wind speed forecasting approaches using fast ensemble empirical model decomposition, genetic algorithm , Mind Evolutionary Algorithm and Artificial Neural Networks,” Renewable Energy , vol. 83 , pp. 1066 - 1075 , 2015 .
[6] Q. Hu , R. Zhang , and Y. Zhou , “ Transfer learning for short-term wind speed prediction with deep neural networks , ” Renewable Energy , vol. 85 , pp. 83 - 95 , 2016 .
[7] H. Liu , H.-Q. Tian , X.-F. Liang , and Y.-F. Li , “ Wind speed forecasting approach using secondary decomposition algorithm and Elman neural networks , ” Applied Energy , vol. 157 , pp. 183 - 194 , 2015 .
[8] Z. Guo , D. Chi , J. Wu , and W. Zhang, “ A new wind speed forecasting strategy based on the chaotic time series modelling technique and the Apriori algorithm , ” Energy Conversion and Management , vol. 84 , pp. 140 - 151 , 2014 .
[9] T. G. Barbounis , J. B. Theocharis , M. C. Alexiadis , and P. S. Dokopoulos, “Long-term wind speed and power forecasting using local recurrent neural network models , ” IEEE Transactions on Energy Conversion , vol. 21 , no. 1 , pp. 273 - 284 , 2006 .
[10] J. Wang and J. Hu , “ A robust combination approach for shortterm wind speed forecasting and analysis-combination of the ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM) forecasts using a GPR (Gaussian Process Regression) model ,” Energy, vol. 93 , pp. 41 - 56 , 2015 .
[11] H. B. Azad , S. Mekhilef , and V. G. Ganapathy, “Long-term wind speed forecasting and general pattern recognition using neural networks , ” IEEE Transactions on Sustainable Energy , vol. 5 , no. 2 , pp. 546 - 553 , 2014 .
[12] K. Chen and J. Yu , “ Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach ,” Applied Energy, vol. 113 , pp. 690 - 705 , 2014 .
[13] A. Gani , K. Mohammadi , S. Shamshirband , T. A. Altameem , D. Petkovic´, and S. Ch , “ A combined method to estimate wind speed distribution based on integrating the support vector machine with firefly algorithm,” Environmental Progress & Sustainable Energy , vol. 35 , no. 3 , pp. 867 - 875 , 2016 .
[14] Y. J. Lin , U. Kruger , J. Zhang , Q. Wang , L. Lamont , and L. El Chaar, “ Seasonal analysis and prediction of wind energy using random forests and ARX model structures , ” IEEE Transactions on Control Systems Technology , vol. 23 , no. 5 , pp. 1994 - 2002 , 2015 .
[15] E. Baccarelli , N. Cordeschi , A. Mei , M. Panella , M. Shojafar , and J. Stefa , “ Energy-efficient dynamic traffic offloading and reconfiguration of networked data centers for big data stream mobile computing: review, challenges, and a case study,” IEEE Network , vol. 30 , no. 2 , pp. 54 - 61 , 2016 .
[16] D. Liu , D. Niu , H. Wang , and L. Fan , “ Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm,” Renewable Energy , vol. 62 , pp. 592 - 597 , 2014 .
[17] S. J. Watson , L. Landberg , and J. A. Halliday , “ Application of wind speed forecasting to the integration of wind energy into a large scale power system , ” IEE Proceedings: Generation, Transmission and Distribution , vol. 141 , no. 4 , pp. 357 - 362 , 1994 .
[18] N. E. Huang , Z. Shen , S. R. Long et al., “ T he empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis , ” Proceedings of the Royal Society of London A , vol. 454 , no. 1971 , pp. 903 - 995 , 1998 .
[19] Y. Gan , L. Sui , J. Wu , B. Wang , Q. Zhang , and G. Xiao, “ An EMD threshold de-noising method for inertial sensors , ” Measurement , vol. 49 , no. 1 , pp. 34 - 41 , 2014 .
[20] H.-X. Sun , N.-N. Zhao , and X.-H. Xu , “ Text region localization using wavelet transform in combination with support vector machine , ” Journal of Northeastern University (Natural Science) , vol. 28 , no. 2 , pp. 165 - 168 , 2007 .
[21] M. Bouzerdoum , A. Mellit , and A. M. Pavan , “ A hybrid model (SARIMA-SVM) for short-term power forecasting of a smallscale grid-connected photovoltaic plant,” Solar Energy , vol. 98 , pp. 226 - 235 , 2013 .
[22] X. Wang , J. Wen , Y. Zhang , and Y. Wang , “ Real estate price forecasting based on SVM optimized by PSO,” Optik , vol. 125 , no. 3 , pp. 1439 - 1443 , 2014 .
[23] X.-S. Yang and S. Deb , “Cuckoo search via Le´vy flights,” in Proceedings of the World Congress on Nature & Biologically Inspired Computing (NaBIC '09) , pp. 210 - 214 , Coimbatore, India, December 2009 .