Dynamic Programming Based Operation of Reservoirs: Applicability and Limits / Edition 2 available in Hardcover
- Cambridge University Press
Dynamic programming is a method of solving multi-stage problems in which decisions at one stage become the conditions governing the succeeding stages. It can be applied to the management of water reservoirs, allowing them to be operated more efficiently. This is one of the few books dedicated solely to dynamic programming techniques used in reservoir management. It presents the applicability of these techniques and their limits on the operational analysis of reservoir systems. The dynamic programming models presented in this book have been applied to reservoir systems all over the world, helping the reader to appreciate the applicability and limits of these models. The book also includes a model for the operation of a reservoir during an emergency situation. This volume will be a valuable reference to researchers in hydrology, water resources and engineering, as well as professionals in reservoir management.
About the Author
K. D. W. Nandalal is Senior Lecturer in the Department of Civil Engineering at the University of Peradeniya, Sri Lanka. His research interests include water resources systems analysis and reservoir water quality modelling.
Janos J. Bogardi is Director of the United Nations University Institute for Environmental and Human Security. He was co-editor of Risk, Reliability, Uncertainty, and Robustness of Water Resource Systems (Cambridge University Press 2002).
Read an Excerpt
9780521874083 - Dynamic Programming Based Operation of Reservoirs - Applicability and Limits - by K. D. W. Nandalal and Janos J. Bogardi
1 Water resources management
The water resource has a major influence on human activities. It is a major input in almost all sectors of human endeavor. Water serves essential biological functions and no human can survive in its complete absence. Water’s contributions to human welfare include its role as a basic element of social and economic infrastructure. Also important are water’s natural attributes that contribute to human aesthetic enjoyment and general psychological welfare. But water also has negative impacts on human well-being. Floods, inundations, and water-borne diseases are also associated with water.
Water has played a major role in socio-economic development due to the magnitude and widespread occurrence of its positive and negative impacts. The quality of human life is directly dependent on how well these resources are managed. Water management activities are intended to enhance the positive contributions of water or control its negative impacts.
Ancient civilizations grew up in the river valleys of the Tigris and Euphrates, Nile, Indus, Yellow River, etc., where there was plenty of water. Water management activities, particularly irrigation, played a central role in the development of these civilizations. In those days the planning and management of the water resources were primarily for single uses. The continuing growth of the human population, especially since the nineteenth century, together with rapid industrial development and rising expectations of a better life necessitated more complex and consistent water resources management. These competing demands and uncontrolled use, along with the pollution of water, have made it a scarce resource.
Water resources problems are going to be more complex worldwide in the future (Simonovic, 2000). Population growth, climate variability, regulatory requirements, project planning horizons, temporal and spatial scales, social and environmental considerations, transboundary considerations, etc., all contribute to the complexity of water resources planning and management problems. Traditional engineering has gradually been overwhelmed by the multitude of claims, constraints, and opportunities. Since the Second World War, systems analysis has emerged as one of the tools for solving such complex water resources management problems (Dantzig, 1963; Hillier and Lieberman, 1990; Loucks et al., 1981).
Systems analysis can generally be defined as a group of methods developed for identifying, describing, and screening a system, its performance and behavior under different conditions and with different goals to be pursued. It provides a decision maker with a broad information base about the system and gives the opportunity of estimating the system behavior to compare several feasible alternatives. In its process, a variety of initial assumptions, objectives, constraints, and decision variables are specified and their influence on the system operation is evaluated. Hence, systems analysis techniques can be very valuable tools for solving planning and operating tasks in water resources management based on the systematic and efficient organization and analysis of relevant information.
There are a number of terms which are used synonymously with the systems approach; these include systems engineering, operations research, operations analysis, management science, cybernetics, and policy analysis. Hall and Dracup (1970) defined systems engineering as the art and science of selecting, from a large number of feasible alternatives involving substantial engineering content, that particular set of actions which will accomplish the overall objectives of the decision makers, within the constraints of law, morality, economics, resources, political and social pressures, and laws governing physical life and other natural sciences.
Together with the determination of physical elements of a system, the operation policy of the system is equally important in finding the best performance of the system to serve its purpose. The operation policy of a water resources system can be defined on a short-term or a long-term time base. This classification implies not only the time base (e.g., hourly or daily for short-term and monthly or seasonal for long-term operation) but also the uncertainty of the system and its components. For short-term operation, uncertainty may be neglected, and all the phenomena can be considered as deterministic. However, for long-term operation the stochasticity, inherent both in a system and in its environment, must not be neglected. The complexity of a system itself, together with the uncertainty of all the phenomena involved including the goals to be achieved, raises the need for effective methods for deriving such operation policies that would provide an expected “optimal” response of the system under a number of different conditions. A variety of methods in systems analysis or operations research have been developed for analyzing water resources systems. In general, systems analysis implies two basic strategies in operational assessment: simulation and optimization approaches.
Simulation is used to analyze the effects of proposed management plans: achievement regarding system performance is evaluated based on selected sets of decisions. By definition, the simulation method does not claim that a particular combination of decisions represents the optimal one. The difficulty inherent in this approach is the large number of feasible operation plans (combinations of decisions) to be checked. If simulation alone were used, the search for the “best” solution might not only be very tedious, but also could lead to alternatives far from the optimal one.
Optimization models are used to narrow down the search for promising combinations of decision variables. Optimization eliminates all the undesirable operation plans and proposes policies which are close to the global optimal solution. However, optimization usually relies on a very simple representation of a water resources system. Therefore, optimized alternatives may be further refined by applying simulation techniques. The most frequently used optimization techniques in water resources management can be classified into three major groups: (1) linear programming (LP), (2) dynamic programming (DP), and (3) nonlinear programming (NLP). This general classification, in addition to simulation models, represents the basic methods used in planning and management of water resources systems (Yeh, 1985). An extremely large number of simulation and optimization models providing a broad range of analysis capabilities for evaluating reservoir operations have been built over the past several decades. Wurbs (1993) sorted through these numerous models and reached a better understanding of which method might be the most useful in various types of decision support situations. Since most of the water resources systems display considerable nonlinearities and sequential nature, operational assessment – especially in the case of reservoirs – is usually based on DP. The more so, since DP lends itself to a relatively easy incorporation of stochasticity (Loucks et al., 1981).
1.2 ROLE OF RESERVOIRS
According to Takeuchi (2002), there are presently nearly 40 000 large reservoirs in the world impounding approximately 6000 km³ of water and inundating an aggregate area of 400 000 km². Recent surveys show that this number increases at a rate of approximately 250 new reservoirs each year. These figures clearly reflect the fact that reservoirs, irrespective of their interference in the aquatic ecosystem of the respective watercourse, have a firmly established position in our striving to harness and manage the available water resources.
The history of man-made reservoirs can be traced back to antiquity. Perhaps at the beginning the “water reservoir” was no more than a huge tank to store water during the wet season for use during the dry season. Today, with the development of civilization, reservoirs can be found all over the world. The reservoirs can serve single or multiple purposes including hydropower generation, water supply for irrigation, industrial and domestic use, flood control, improvement of water quality, recreation, wildlife conservation, and navigation. The effective use of reservoir systems has become increasingly important. Next to the exigence of the rational use of a limited resource, a better-managed reservoir may make the physical extension of the system – to add new reservoirs – unnecessary. The operation of a single reservoir for a single function does not present many analytical problems, but the same is not true when a reservoir fulfils a number of potentially conflicting objectives or where several reservoirs are operated conjunctively. Through a global review of performance of dams/reservoirs, the World Commission on Dams (2000) presented an integrated assessment of when, how, and why dams/reservoirs succeed or fail in meeting development objectives.
Reservoir construction was most intensive during the period 1950–70 in many well-developed regions where river runoff was finally almost fully regulated. Subsequently, the rates of reservoir construction have decreased considerably although they are still high in those countries with rich natural resources of river runoff. This is caused partly by the increasing role of hydropower engineering where there are liquid and solid fuel deficits. In addition, reservoirs provide the greater part of the water consumed by industry, power stations, and agriculture. They are the basis for large-scale water management systems regulating river runoff as well as protecting populated areas from floods and inundations.
1.3 OPTIMAL RESERVOIR OPERATION
Reservoirs have to be operated as well as possible to achieve maximum benefits from them. For many years the rule curves, which define ideal reservoir storage levels at each season or month, have been the essential operational tool. Reservoir operators are expected to maintain these pre-fixed water levels as closely as possible while generally trying to satisfy various water needs downstream. If the levels of reservoir storage are above the target or desired levels, the release rates are increased. Conversely, if the levels are below the targets, the release rates are decreased. Sometimes operation rules are defined to include not only storage target levels, but also various storage allocation zones, such as conservation, flood control, spill or surcharge, buffer, and inactive or dead storage zones. Those zones also may vary throughout the year and the advised release range for each zone is provided by the rules. The desired storage levels and allocation zones mentioned above are usually defined based on historical operating practice and experience. Having only these target levels for each reservoir, the reservoir operator has considerable responsibility in day-to-day operation with respect to the appropriate trade-off between storage levels and discharge deviations from ideal conditions. Hence, such an operation requires experienced operators with sound judgment. Needless to say, predetermined operation rules have proven to be quite inflexible when dealing with unexpected situations.
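The feedback logic of rule-curve operation described above can be sketched in a few lines of Python. This is only an illustration: the proportional `gain`, the `base_release`, and the release bounds are hypothetical parameters that do not come from the text, which states only that releases are increased above the target level and decreased below it.

```python
def rule_curve_release(storage, target, base_release, gain=0.5,
                       min_release=0.0, max_release=float("inf")):
    """Raise the release when storage is above the rule-curve target,
    lower it when storage is below (hypothetical proportional rule)."""
    release = base_release + gain * (storage - target)
    # Keep the release inside its physical limits.
    return max(min_release, min(max_release, release))


# Storage above the target: the release is increased.
print(rule_curve_release(storage=120.0, target=100.0, base_release=10.0))
# Storage below the target: the release is decreased.
print(rule_curve_release(storage=80.0, target=100.0, base_release=10.0))
```

A fixed rule of this kind has no way of anticipating unusual inflows, which is the inflexibility noted above.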
To counteract the inefficiency in operating a reservoir system only by the “rule curves,” additional policies for operation have now been incorporated into most reservoir operation rules. These operation guidelines define precisely when conditions are not ideal (e.g., when maintenance of the ideal storage levels becomes impractical), and the decisions to be made for various combinations of hydrological and reservoir storage conditions. For some reservoir systems, this type of operation policy has already taken over the rule curves and is acting as the principal rule for reservoir operation.
Over the past several decades, increasing attention has been given to systems analysis techniques for deriving operation rules for reservoir systems. As the references reveal, the 1980s and 1990s were the most productive period in this respect. As a result, a variety of methods are now available for analyzing the operation of reservoir systems. In general, these techniques lead to models which can be classified into two categories: optimization models and simulation models. Simulation models can effectively analyze the consequences of various proposed operation rules and indicate where marginal improvements in operation policy might be made. However, the simulation technique is not very appropriate in selecting the best rule from the set of possible alternatives.
Optimization models can eliminate the clearly undesirable alternatives. Yeh (1985) reviewed the state-of-the-art of the mathematical models developed for reservoir operations. The alternatives that are found to be most promising based on optimization methods can then be further analyzed and improved using simulation techniques.
Although both optimization and simulation can be, and at times are, used independently to analyze an operational problem, they are essentially two complementary methods. In fact, optimization and simulation are used conjunctively to derive and to assess alternative operating strategies of single and multiple reservoir systems (e.g., Jacoby and Loucks, 1972; Mawer and Thorn, 1974; Gal, 1979; Karamouz and Houck, 1982, 1987; Stedinger et al., 1984; Tejada-Guibert et al., 1993; Harboe et al., 1995; Liang et al., 1996).
Linear programming (LP) and dynamic programming (DP) have been the most popular among the optimization models in deriving optimum operation rules for reservoir systems. Linear programming is concerned with solving problems in which all relations among the variables are linear, both in the constraints and in the objective function to be optimized. The fact that most of the functions encountered in problems with reservoir operation are nonlinear has been the main obstacle to the successful and practically relevant use of LP in this area. Although linearization techniques can be employed, this might not be satisfactory. The degree of the approximation required in the linearization process can seriously affect the reliability associated with this technique. Nevertheless, LP has been used in optimal reservoir operation and the following are some applications: Gablinger and Loucks (1970), Roefs and Bodin (1970), Gilbert and Shane (1982), Shane and Gilbert (1982), Palmer and Holmes (1988), Randall et al. (1990), Reznicek and Simonovic (1990, 1992).
Dynamic programming, a method that breaks down a multidecision problem into a sequence of subproblems with few decisions, is ideally suited for time-sequential decision problems such as deriving operation policies for reservoirs. Due to this favorable coincidence the authors are convinced that dynamic programming and its derivative techniques have a superior applicability to serve as the basis for the operation of real-world reservoir systems. Hence this book is dedicated to exploring this potential. More than two decades after the large-scale introduction of DP-based reservoir operational studies, the time has come to review the development and to outline, supported with practical case studies, the vast field of applicability of DP-based rules in reservoir system operation.
1.4 CONVENTIONAL DYNAMIC PROGRAMMING
Dynamic programming is a technique used for optimizing a multistage process. It is a “solution-seeking” concept which replaces a problem of n decision variables by n subproblems having preferably one decision variable each. Such an approach allows analysts to make decisions stage by stage, until the final result is obtained. Hence the original problem needs to be decomposed into subproblems and each subproblem is referred to as a stage. This decomposition could be defined either in space or in time. Each stage is characterized by different system states expressed by the numerical value of selected state variable(s). Transition of the state from one stage to the next is expressed by a particular course of action (or the decision on what to do), which is represented by a decision variable. Changes of the system’s state influenced by the decision taken at the previous stage are described by the state transformation equation. This transition of the state is possible only if certain rules are followed: both system state and decision variable can take values within particular domains. These limits form a set of constraints which must be met at every stage during the optimization process.
The computational routine for deriving the optimal policy follows Bellman’s recursive equation (Eq. 1.1), which is described diagrammatically in Figure 1.1. This can be solved by either moving forward (forward DP) or moving backward (backward DP) stage by stage.
For every state $s_j$ at stage $j$ the optimal policy is given by (subscripts denote the backward computational procedure)
$$f_j^*(s_j) = \operatorname*{opt}_{x_j}\left[ C_{s_j x_j} + f_{j+1}^*(s_{j+1}) \right] \qquad (1.1)$$
where opt denotes minimization of costs or maximization of contributions, and
$C_{s_j x_j}$ = costs or contribution of the decision $x_j$ given state $s_j$ at the actual stage,
$f_{j+1}^*$ = accumulated suboptimal costs (or contribution) for following stages $j + 1, j + 2, \ldots, N$,
$N$ = total number of stages,
$s_j$ = system state at stage $j$,
$s_{j+1} = t(s_j, x_j)$ = state transformation equation,
$j$ = stage, and
$x_j$ = decision taken at stage $j$.
In other words the above equation reflects Bellman’s principle of optimality. Generally, the DP procedure starts by setting the objective function’s value (cost or benefit) at the initial stage to zero, or any other arbitrary value. Subsequently, the suboptimal policy derived at the last computational stage is actually the global optimum of the problem. The optimal policy can then be derived as a set of decisions, each of which is taken at a subsequent stage with respect to the corresponding suboptimal decisions derived at the preceding stage.
Figure 1.1 Basic structure of dynamic programming (image not available in HTML version)
It is essential to point out that DP models require problem-specific formulations. This is due to differences that appear among the variety of problems that can be solved using DP: objective functions can have different forms; some problems have one and some of them can have several state variables; state transformation equations are not the same in all cases; decision variables can vary among different problems, etc.
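As an illustration of such a problem-specific formulation, the backward recursion can be sketched for a single reservoir. The sketch rests on assumptions that are not taken from the text: storage is restricted to a small discrete grid, the decision is identified by the storage reached at the next stage, the stage cost is the squared deviation of the release from a demand, and the function names `backward_dp` and `trace` are hypothetical.

```python
def backward_dp(levels, inflows, demand, max_release):
    """Backward DP for one reservoir; returns (f_1, policy).

    State: storage at the start of a stage (one of `levels`).
    Decision: storage at the next stage, so that the release follows from
    the mass balance s_next = s + inflow - release.
    Stage cost (assumed): squared deviation of the release from the demand.
    """
    n = len(inflows)
    f_next = {s: 0.0 for s in levels}          # value beyond the last stage
    policy = [None] * n                        # policy[j][s] = best next storage
    for j in reversed(range(n)):
        f_now, policy[j] = {}, {}
        for s in levels:
            f_now[s], policy[j][s] = float("inf"), None
            for s_next in levels:
                release = s + inflows[j] - s_next
                if not 0.0 <= release <= max_release:
                    continue                   # violates the constraints
                cost = (release - demand[j]) ** 2 + f_next[s_next]
                if cost < f_now[s]:
                    f_now[s], policy[j][s] = cost, s_next
        f_next = f_now
    return f_next, policy


def trace(policy, inflows, s0):
    """Follow the optimal decisions forward from the initial storage s0."""
    s, releases = s0, []
    for j, best in enumerate(policy):
        releases.append(s + inflows[j] - best[s])
        s = best[s]
    return releases


f1, policy = backward_dp([0, 10, 20, 30], [20, 5, 0], [10, 10, 10], 20)
print(f1[10], trace(policy, [20, 5, 0], 10))
```

A different objective, additional state variables, or another state transformation equation would each require the loops to be rewritten, which is the point made above.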
1.5 INCREMENTAL DYNAMIC PROGRAMMING
Simultaneous derivation of operation policies for all the reservoirs in a multi-reservoir system is important, because the optimum conditions of the system cannot be investigated by considering reservoirs in isolation. In conventional DP, the state variables (reservoir storage) are normally discretized. Dense discretization is preferred over a coarse one to obtain an operation policy close to the global optimum. These two factors, simultaneous investigation of all the reservoirs of the system (state variables) and dense discretization of these state variables, exponentially increase the total number of discrete state combinations to be considered. This phenomenon is termed the “curse of dimensionality” of DP problems.
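The growth can be made concrete with a small calculation. The figures below are illustrative only and assume that every reservoir is discretized into the same number of storage levels and that every combination of levels is evaluated against every other combination at each stage.

```python
def state_combinations(n_levels, n_reservoirs):
    """Number of joint storage states when each reservoir has n_levels values."""
    return n_levels ** n_reservoirs


def transitions_per_stage(n_levels, n_reservoirs):
    """State-to-state transitions examined per stage by an exhaustive search."""
    return state_combinations(n_levels, n_reservoirs) ** 2


for n in (1, 2, 3, 4):
    print(n, state_combinations(20, n), transitions_per_stage(20, n))
```

With 20 levels per reservoir, four reservoirs already give 160 000 joint states per stage.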
Larson (1968) introduced incremental dynamic programming (IDP), a successive approximation method, to overcome high dimensionality problems. This chapter presents the IDP technique. Several applications of the IDP technique in reservoir management are presented in subsequent chapters.
Incremental dynamic programming is one of the techniques used in alleviating the problems of excessive time and computer storage requirements. The general scheme of IDP procedure is concisely presented in Figure 1.2. IDP uses the recursive equation of DP to search for an improved trajectory starting with an assumed feasible solution, which can be visualized as a trial trajectory. The improved trajectory is then sought within the pre-specified range, defined as the “corridor.”
The computation cycle is complete when the search process has converged to the optimal trajectory according to a pre-specified convergence criterion. New iteration steps are needed as long as the convergence criterion is not satisfied. In the next iteration the locally improved trajectory obtained from the previous iteration serves as the new initial trial trajectory.
Figure 1.2 Incremental dynamic programming optimization procedure (image not available in HTML version)
The IDP procedure begins with selection of a trial trajectory. A trajectory is the sequence of admissible transformations of the state vectors throughout the entire period of consideration. It also defines the initial value of the objective function. A trajectory is feasible if it satisfies all constraints. It is optimal if it is associated with the best possible achievement of the objective criterion of the system performance.
The basic idea behind the selection of an initial trajectory is to provide, for the search process for the optimal trajectory, both a starting point and a region called the “corridor” around the trial trajectory. The initial trial trajectory should therefore be feasible since it serves as the first approximation of the optimal trajectory.
The next step of the IDP procedure after determining an initial trial trajectory is construction of a corridor around it as shown in Figure 1.3.
The corridor specifies the values of state variables to be considered at each time step in the optimization process. For a given corridor, the difference between adjacent values of a state variable is the width of corridor. In general, a corridor for a single-reservoir system consisting of three values of the state variable is defined symmetrically around the trial trajectory of state variable $S_j$ as follows:
$$UBC = S_j + \Delta$$
$$LBC = S_j - \Delta$$
where
UBC = upper bound of corridor,
LBC = lower bound of corridor,
$\Delta$ = half corridor width, and
$S_j$ = state variable at the beginning of stage $j$ based on the initial policy (first trial).
Figure 1.3 Construction of the corridor for IDP (image not available in HTML version)
However, nonsymmetrical corridors may result if any of the boundaries of the corridor exceed the feasible state space of the system.
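A minimal sketch of the corridor construction follows. It assumes that a bound falling outside the feasible storage range is simply clipped to the nearest feasible limit, which is one plausible way for the nonsymmetrical corridors mentioned above to arise; the function name `build_corridor` and its arguments are hypothetical.

```python
def build_corridor(trajectory, delta, s_min, s_max):
    """Return, for every stage, the storage values S - delta, S, S + delta,
    clipped to the feasible range [s_min, s_max] (assumed treatment)."""
    corridor = []
    for s in trajectory:
        values = {max(s_min, s - delta), s, min(s_max, s + delta)}
        corridor.append(sorted(values))    # duplicates collapse at the limits
    return corridor


# The second and third stages touch the limits and lose their symmetry.
print(build_corridor([50, 95, 0], delta=10, s_min=0, s_max=100))
```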
After the construction of a corridor around the trial trajectory, an improved trajectory and the corresponding objective function value within the corridor are sought. This is done by using the recursive equation of the conventional DP algorithm restricting computations of the state transformations to pre-fixed values of state variables within the pre-specified corridor.
Convergence behavior of the IDP algorithm to reach the global optimum is an essential issue. Selection of the feasible initial trial trajectory is entirely an arbitrary process. But in standard practice the initial corridor width is a coarse one. This technique follows the principle of choosing the initial corridor width sufficiently large to cover a considerable range of potential storage volume for the first cycle of the IDP procedure. The corridor width is decreased progressively as the iteration proceeds (Turgeon, 1982).
In general, the larger the initial corridor width around the initial trajectory, the smaller the number of iterations required to reach the optimal solution. Use of a large corridor width in earlier iterations is to ensure that the improved trajectories for such iterations are really obtained. Moreover, since the initial trajectory for any later iteration is the improved trajectory compared to the preceding one and it is closer to optimality, smaller corridor widths can be used in later iterations to search for the optimal trajectory.
The iterative process is then continued until a convergence criterion, explained later, is fulfilled. The objective function value obtained after termination of the IDP is considered as the optimum value. To ensure that the final solution is a true optimum value, a few more sets of IDP computation runs with different initial corridor widths may be attempted and the results compared to check whether the solution obtained remains the same.
According to the IDP procedure, each iteration of search for the improved trajectory results in a trajectory which is associated with a better value of the objective function than that of the trajectory for the preceding iteration. The convergence of the IDP solution exhibits a monotonic nature. Thus, a point convergence cannot be attained unless the cycle of computation is allowed to continue indefinitely. Therefore, the convergence criteria should be defined to limit the computer time used.
The iterative process of IDP is repeated until there is no further significant improvement of objective function value. As a criterion to terminate the computation, the following expression can be applied. That is, whenever
|Display matter not available in HTML version|
then the computation cycle should be terminated.
Here, $OF_i$ is the objective function’s value with respect to the set of constraints for iteration $i$, $i = 1, 2, 3, \ldots$
Instead of searching for the optimal solution over the entire state-space domain as in the classical DP, only three states of storage volume are involved in the analysis at any iteration in the case of a single reservoir. Similarly, IDP can tackle multiunit reservoir systems by taking a limited state space for every individual reservoir in the system. Thus, this technique can overcome the problem of dimensionality. Computer storage and computer time requirements can be reduced considerably.
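The whole IDP cycle for a single reservoir can be sketched as follows. Several choices here are assumptions rather than the authors' procedure: the stage cost is the squared deviation of the release from a demand, the corridor is clipped to the feasible storage range, the half width `delta` is halved whenever an iteration brings no significant improvement, and the stopping test is a relative improvement below `tol` at the smallest width `min_delta`. The convergence expression of the text is not reproduced in this version, so the test used below is only a stand-in.

```python
def corridor_dp(corridor, inflows, demand, max_release):
    """Backward DP restricted to the corridor; returns (cost, trajectory)."""
    n = len(inflows)
    f_next = {s: 0.0 for s in corridor[n]}
    best = [None] * n
    for j in reversed(range(n)):
        f_now, best[j] = {}, {}
        for s in corridor[j]:
            f_now[s], best[j][s] = float("inf"), None
            for s_next in corridor[j + 1]:
                release = s + inflows[j] - s_next          # mass balance
                if 0.0 <= release <= max_release:
                    cost = (release - demand[j]) ** 2 + f_next[s_next]
                    if cost < f_now[s]:
                        f_now[s], best[j][s] = cost, s_next
        f_next = f_now
    s = corridor[0][0]                                     # fixed initial storage
    total, path = f_next[s], [s]
    for j in range(n):
        s = best[j][s]
        path.append(s)
    return total, path


def idp(trial, inflows, demand, max_release, s_min, s_max, delta,
        tol=1e-6, min_delta=1.0, max_iter=100):
    """Improve a feasible trial trajectory inside successively narrower corridors."""
    best_cost = float(sum((trial[j] + inflows[j] - trial[j + 1] - demand[j]) ** 2
                          for j in range(len(inflows))))
    for _ in range(max_iter):
        # Three values per stage around the trial trajectory, clipped to the limits.
        corridor = [[trial[0]]] + [
            sorted({max(s_min, s - delta), s, min(s_max, s + delta)})
            for s in trial[1:]
        ]
        cost, path = corridor_dp(corridor, inflows, demand, max_release)
        improvement = best_cost - cost     # never negative: the trial lies in the corridor
        trial, best_cost = path, cost
        if improvement <= tol * max(abs(best_cost), 1.0):
            if delta <= min_delta:
                break                      # converged at the finest corridor
            delta /= 2.0                   # narrow the corridor and continue
    return best_cost, trial


cost, path = idp([10, 10, 10, 10], [20, 5, 0], [10, 10, 10],
                 max_release=20, s_min=0, s_max=30, delta=8)
print(cost, path)
```

Each iteration evaluates at most three storage values per stage, whatever the resolution finally reached, which is where the savings in computer time and storage come from.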
1.6 STOCHASTIC DYNAMIC PROGRAMMING
Stochastic dynamic programming (SDP) is very common in reservoir operation. Since uncertainty is an inherent characteristic of water resources systems, it is often inadequate to opt for deterministic decision models, at both planning and operational stages.
The stochastic nature of inflows can be handled by two approaches: an implicit or an explicit approach. In the implicit approach, a time series model is used to generate a number of synthetic inflow sequences. The system is optimized for each streamflow sequence and operating rules are found by multiple regression. During the optimization the synthetic data series are considered as deterministic series. The implicit approach optimizes the system operation under a large number of streamflow sequences, at the expense of computer time.
The explicit approach considers the probability distribution of the inflows rather than specific flow sequences. This approach generates an operation policy comprising storage targets or release decisions for every possible reservoir storage and inflow state combination in each time step, rather than a mere single schedule of reservoir releases.
Future states or outcomes of any stochastic process such as rainfall and streamflow cannot be predicted with certainty. However, based on past performance, probability associated with any particular outcome may be estimated. Hydrological uncertainty of streamflows is explicitly taken into consideration in the explicit SDP models. These models incorporate discrete probability distributions in the optimization process. They describe the extent of uncertainty of future occurrences of streamflows and correlations of streamflows in time and space that may be present among streamflow time series to different reservoirs of the same water resources management system.
Assuming that the unconditional steady-state probability distributions for monthly streamflows are not changing from one year to the next, a Markov chain could be defined for streamflow. Since there are 12 months in a year there would be a lag-one Markov chain with 12 transition probability matrices. Their elements could be denoted as $P_{p,q}^{j}$, the probability of occurrence of a streamflow class $q$ in month $j + 1$ given a streamflow state $p$ in month $j$. In the model presented, first order (lag-one) Markov chains are used to estimate the discrete conditional (transition) probabilities that represent the stochasticity inherent in streamflows. Discrete transition probabilities are estimated for a number of representative inflow values for each month, using the available historical streamflow records.
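One simple way to estimate such matrices is to count the observed class-to-class transitions in the historical record, as sketched below. The input layout (a flat chronological list of 0-based class indices starting in the first month) and the uniform row assigned to a class that was never observed in a given month are assumptions made for this illustration.

```python
def transition_matrices(classes, n_classes, n_months=12):
    """Estimate P[j][p][q]: probability of class q in month j + 1
    given class p in month j, from a chronological list of class indices."""
    counts = [[[0] * n_classes for _ in range(n_classes)]
              for _ in range(n_months)]
    for t in range(len(classes) - 1):
        j = t % n_months                       # month of the "from" observation
        counts[j][classes[t]][classes[t + 1]] += 1
    matrices = []
    for j in range(n_months):
        month = []
        for row in counts[j]:
            total = sum(row)
            if total == 0:
                # Assumption: no observation of this class, use a uniform row.
                month.append([1.0 / n_classes] * n_classes)
            else:
                month.append([c / total for c in row])
        matrices.append(month)
    return matrices


# Tiny example with a two-period "year" and two inflow classes.
P = transition_matrices([0, 1, 0, 1, 1, 0, 0, 0], n_classes=2, n_months=2)
print(P[0])
print(P[1])
```

The last month of each year is paired with the first month of the following year, so the twelfth matrix closes the annual cycle.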
In a DP formulation of a reservoir operational problem, time periods are often considered as stages. The stored volumes of water in reservoirs at the beginning of the time periods represent the state of the system. The decisions to be taken at each stage are the quantities of water to be released. These can be implicitly identified by specifying the storage volumes at the next stage (identifying the storage volumes at the end of the time step considered). To incorporate the Markovian nature of the streamflow, it is also defined as a state variable in SDP formulations. Therefore, an SDP formulation of a reservoir operational problem will have a two-dimensional state space representing the storage volumes and inflows to the reservoirs.
Use of SDP requires discretization of state variables and representation of them by a finite number of characteristic values. Sets of characteristic (representative) storage volumes and streamflows are chosen to cover the entire range of possible storage volumes and streamflows.
The domain of inflows, which must be wide enough to represent the entire range of potential inflows, is divided into a certain number of intervals or classes. These intervals or classes could be equally spaced or of variable size. In general, averages of the inflows that fall into these intervals are chosen as discrete values to represent inflow classes. The values represent the entire interval in the subsequent computations.
Means and variances of inflows during each month can be used to check whether they are reproduced by the discretization. If they are found to be not reproducing these statistics satisfactorily, a trial-and-error selection of the class margins and representative values may be used. Frequency diagrams can be of help in the selection procedure.
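The discretization and the check on the statistics can be sketched with the standard library. The class boundaries in the example are arbitrary, and the helper names are hypothetical. Because each class is represented by the average of its members, the mean is reproduced exactly, while the variance of the discretized series can only be equal to or lower than that of the record, which is why the class margins may need adjusting.

```python
from bisect import bisect_right
from statistics import fmean, pvariance


def discretize_inflows(inflows, boundaries):
    """Assign each inflow to a class delimited by ascending boundaries and
    represent every class by the mean of the inflows falling into it."""
    members = [[] for _ in range(len(boundaries) + 1)]
    classes = []
    for q in inflows:
        k = bisect_right(boundaries, q)        # index of the class containing q
        classes.append(k)
        members[k].append(q)
    representatives = [fmean(m) if m else None for m in members]
    return classes, representatives


def discretized_stats(classes, representatives):
    """Mean and variance of the series after replacing inflows by class values."""
    values = [representatives[k] for k in classes]
    return fmean(values), pvariance(values)


inflows = [2, 4, 10, 12, 30]
classes, reps = discretize_inflows(inflows, boundaries=[5, 20])
print(classes, reps)
print(discretized_stats(classes, reps), (fmean(inflows), pvariance(inflows)))
```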
The interval $(S_{j,\min}, S_{j,\max})$ is divided into $NS - 1$ equally spaced storage intervals, where $S_{j,\min}$ and $S_{j,\max}$ are the minimum and maximum limits of live storage of the reservoir at the beginning of month $j$ and $NS$ is the number of reservoir storage classes. The boundary values of these equally spaced intervals are then used as discrete values of storage.
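In code, the storage grid amounts to the following short sketch, in which the function name and the example limits are hypothetical.

```python
def storage_grid(s_min, s_max, ns):
    """NS discrete storage values bounding NS - 1 equal intervals (ns >= 2)."""
    step = (s_max - s_min) / (ns - 1)
    return [s_min + k * step for k in range(ns)]


print(storage_grid(0.0, 100.0, 5))
```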
The backward stochastic dynamic programming algorithm (Loucks et al., 1981) is used for optimizing reservoir operation. The forward algorithm makes no sense in the case of SDP, as the expectation over the future states has to be considered. The SDP optimization process derives the optimum operating strategy of the reservoir from Bellman’s backward recursive relationship:
|Display matter not available in HTML version|
$B(S_j, S_{j+1}, I_j)$ = cost or contribution of the decision $X_j$ given state $S_j$ at the initial stage,
$F_{j+1}^{n-1}$ = accumulated suboptimal cost (or contribution) by optimal operation of the reservoir over the last $n - 1$ stages,
$I_j$ = inflow during period $j$,
$P_{p,q}^{j}$ = transition probabilities of inflows (defined previously),
$S_j$ = system state at stage $j$,
$S_{j+1} = t(S_j, X_j)$ = state transformation equation,
$j$ = stage, and
$X_j$ = decision taken at stage $j$.
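The recursion itself is not reproduced in this version. A form consistent with the terms listed above, following the standard backward SDP recursion, is sketched here as a reconstruction and not as the authors' exact display. It assumes that p denotes the inflow class in period j, that q denotes the inflow class in period j + 1, and that "opt" stands for minimization or maximization depending on the objective.

```latex
F_j^{n}(S_j, I_j) = \operatorname*{opt}_{X_j}
  \Big[\, B(S_j, S_{j+1}, I_j)
  + \sum_{q} P_{p,q}^{j}\, F_{j+1}^{n-1}(S_{j+1}, I_{j+1}) \Big],
\qquad S_{j+1} = t(S_j, X_j)
```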
The outline of the SDP procedure is displayed in Figure 1.4.
The SDP procedure starts by initializing the objective function to zero, or any other arbitrary value, for each combination of the discrete values of the two state variables at the last stage (a month at some time step in the future). Thereafter the process continues by traversing backwards along the temporal stages (i.e., months). The optimization consists of a number of iterations, each usually comprising 12 monthly stages that represent one annual cycle, although more refined temporal stages (decades, etc.) can also be envisaged. The aggregate of the objective function’s expectation grows by setting its value at the beginning of each iteration (i.e., a year) to the respective accumulated value of the objective function at the end of the last stage of the previous iteration. After a few iterations, the increase in value for any state over a period of one year becomes constant and independent of the state. This is the expected annual return from the operation of the system.
Figure 1.4 Flow diagram for the stochastic dynamic programming model (image not available in HTML version)
There are two criteria that determine the convergence.
(a) Stabilization of the expected annual increment of the optimum value obtained by Bellman’s recursive formula (Loucks et al., 1981).
During continued backward computation of the SDP algorithm, the optimum expected return for all possible initial states will be determined for each stage (month). When the expected return for a period of one year becomes constant for all state transformations in each stage (month), the convergence criterion of constant expected annual objective achievement is satisfied.
(b) Stabilization of the operation policy (Chow et al., 1975).
At each stage (month) of the SDP algorithm, an operation policy for that stage is determined. After continuing the backward computation for a few annual cycles, a stable operation policy can be obtained. This implies that, once stabilized, the operation policy for a specific month will not change from year to year. When this condition is reached, the convergence criterion of stabilization of the operation policy is achieved.
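The backward recursion over repeated annual cycles, together with the two convergence criteria, can be sketched as below. This is an illustrative sketch under stated assumptions, not the model of the book: the quadratic penalty on deviations from a monthly release target, the use of a single transition matrix for all months, the tie-breaking margin, and all numerical values are hypothetical.

```python
def sdp_policy(storages, inflows, trans, targets, tol=1e-6, max_cycles=500):
    """Backward SDP over repeated annual cycles until convergence.

    storages: discrete storage values; inflows: class representatives;
    trans[p][q]: probability of class q next month given class p now
    (assumed identical for all months); targets: release target per month.
    Returns (policy, annual_gain, cycles); policy[j][s][p] is the index of
    the storage to reach at the end of month j.
    """
    ns, ni, nm = len(storages), len(inflows), len(targets)
    f_next = [[0.0] * ni for _ in range(ns)]       # terminal values set to zero
    prev_policy = None
    for cycle in range(1, max_cycles + 1):
        f_year_end = f_next
        policy = [None] * nm
        for j in reversed(range(nm)):              # traverse months backwards
            # Expected future value of ending in storage s2, given class p.
            expect = [[sum(trans[p][q] * f_next[s2][q] for q in range(ni))
                       for p in range(ni)] for s2 in range(ns)]
            f_cur = [[0.0] * ni for _ in range(ns)]
            pol_j = [[0] * ni for _ in range(ns)]
            for s in range(ns):
                for p in range(ni):
                    best, arg = None, None
                    for s2 in range(ns):
                        release = storages[s] + inflows[p] - storages[s2]
                        if release < 0:
                            continue               # infeasible transition
                        value = -(release - targets[j]) ** 2 + expect[s2][p]
                        # Small margin so rounding noise cannot flip ties.
                        if best is None or value > best + 1e-9:
                            best, arg = value, s2
                    f_cur[s][p], pol_j[s][p] = best, arg
            policy[j], f_next = pol_j, f_cur
        # Criterion (a): the annual increment is the same for every state.
        inc = [f_next[s][p] - f_year_end[s][p]
               for s in range(ns) for p in range(ni)]
        gain_stable = max(inc) - min(inc) < tol
        # Criterion (b): the policy is identical to that of the previous cycle.
        policy_stable = policy == prev_policy
        if gain_stable and policy_stable:
            return policy, sum(inc) / len(inc), cycle
        prev_policy = policy
    raise RuntimeError("SDP did not converge within max_cycles")

# Hypothetical data: six storage classes, three inflow classes, 12 months.
storages = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
inflows = [10.0, 30.0, 55.0]
trans = [[0.6, 0.3, 0.1], [0.25, 0.5, 0.25], [0.1, 0.3, 0.6]]
targets = [27, 27, 33, 33, 37, 37, 33, 33, 27, 27, 23, 23]
policy, gain, cycles = sdp_policy(storages, inflows, trans, targets)
```

The sketch stops only when both criteria hold. In practice either criterion may be used alone, and the values could be rescaled at each cycle to keep them bounded over long runs.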
The operation policy derived by SDP is a set of rules specifying the storage level at the beginning of the next month for each combination of storage level at the beginning of the current month and inflow during the current month. Due to the discrete nature of the SDP algorithm, the number of state transformations in any stage grows exponentially with the number of state variables and polynomially with the number of discrete values per state variable. This is reflected in the excessive computer time and memory needed to run an SDP model with a comparatively fine discretization of state variables.
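The growth in the number of state transformations can be made concrete with a short count. The counting convention is an assumption made for this illustration: every reservoir has the same number of storage classes and inflow classes, and each decision selects one end-of-period storage class per reservoir.

```python
def transformations_per_stage(n_reservoirs, ns, ni):
    """Count state transformations evaluated in one stage.

    Assumes every reservoir has ns storage classes and ni inflow classes,
    and that a decision picks one end-of-period storage class per reservoir.
    """
    states = (ns * ni) ** n_reservoirs      # storage and inflow combinations
    decisions = ns ** n_reservoirs          # candidate end-of-period storages
    return states * decisions

for r in (1, 2, 3):
    print(r, transformations_per_stage(r, ns=20, ni=5))
```

With 20 storage classes and 5 inflow classes, the count is 2,000 for one reservoir, 4,000,000 for two, and 8,000,000,000 for three, which is the exponential growth in the number of state variables. Doubling the storage classes of a single reservoir from 20 to 40 raises the count from 2,000 to 8,000, which is polynomial growth.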
1.7 DYNAMIC PROGRAMMING IN RESERVOIR OPERATIONS
Dynamic programming is an approach developed to solve sequential, or multistage, decision problems; hence the name “dynamic” programming. But this approach is equally applicable for decision problems where the sequential property is induced solely for computational convenience.
The DP technique is efficient in making a sequence of interrelated decisions. It is based on Bellman’s principle of optimality (Bellman, 1957): “An optimal policy has the property that whatever the initial state and the initial decisions are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.” That implies a sequential decision process in which a problem involving several variables is broken down into a sequence of simpler problems, each having preferably a single variable. DP is very well suited to studying reservoir operational problems. Since its development, the number of applications of DP in studying reservoir operational problems has increased enormously. The DP technique is not restricted to any particular problem structure. It can handle nonlinear objective functions and nonlinear constraints. For most reservoir problems, if DP is applied to determine reservoir releases, the state variable is storage, the decision variable is release and the stage is represented by time period.
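A minimal deterministic sketch of this formulation is given below, with storage as the state, release as the decision (identified by the end-of-period storage), and the time period as the stage. The objective, a sum of squared deviations of release from a target, and all numerical values are assumptions chosen for illustration.

```python
def deterministic_dp(storages, inflows, target, s0_index):
    """Backward DP for one reservoir with a known inflow sequence.

    Stage = time period, state = storage at the start of the period,
    decision = release, identified here by the end-of-period storage.
    Minimizes the sum of squared deviations of release from target.
    Returns (total cost, list of releases, list of storage indices).
    """
    ns, nt = len(storages), len(inflows)
    cost_to_go = [0.0] * ns                 # no cost beyond the horizon
    choice = []                             # choice[t][s] = best next index
    for t in reversed(range(nt)):
        new_cost, best_next = [0.0] * ns, [0] * ns
        for s in range(ns):
            best = None
            for s2 in range(ns):
                release = storages[s] + inflows[t] - storages[s2]
                if release < 0:
                    continue                # infeasible transition
                c = (release - target) ** 2 + cost_to_go[s2]
                if best is None or c < best:
                    best, best_next[s] = c, s2
            new_cost[s] = best
        cost_to_go = new_cost
        choice.insert(0, best_next)
    # Forward pass: read the optimal trajectory off the stored choices.
    path, releases, s = [s0_index], [], s0_index
    for t in range(nt):
        s2 = choice[t][s]
        releases.append(storages[s] + inflows[t] - storages[s2])
        path.append(s2)
        s = s2
    return cost_to_go[s0_index], releases, path

# Hypothetical four-period example starting from a storage of 10 units.
cost, releases, path = deterministic_dp([0, 10, 20, 30], [30, 10, 0, 20], 15, 1)
```

In this example every feasible release is a multiple of 10, so each period costs at least 25 against a target of 15, and the optimal total cost is 100.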
Hall and Buras (1961) were the first to apply the DP technique in water resources systems analysis. They used DP to solve a problem of capacity allocation among several reservoir sites. Yakowitz (1982) presented state-of-the-art reviews with extensive lists of references on DP and its applications for several water resources problems and Yeh (1985) did the same for optimal reservoir operation. Models developed for solving reservoir operation problems can be classified by how they characterize the streamflow process. One group of models, called deterministic models, uses a specific sequence of streamflow – either historical or synthetically generated – in deriving operation rules. The other group of models, called stochastic models, uses a statistical description of the streamflow process instead of a specific streamflow sequence.
1.7.1 Deterministic dynamic programming based reservoir operation models
Meier and Beightler (1967) illustrated the applicability of DP in optimizing branched multistage systems in water resources planning. Hall and Shepard (1967) developed a DP-LP technique for optimizing a reservoir system in which the multiple-reservoir system is decomposed into a master-problem and subproblems. The master-problem could be seen as the task to be solved by a system coordinating agency and the subproblems by single-reservoir managers. In that work the subproblems were solved by DP. The schedule of releases and energy production were reported to the system coordinating agency which was modeled by LP.
Larson (1968) introduced the concept of incremental dynamic programming (IDP), putting DP into an iterative context. IDP uses the incremental concept for the state variables. Only a limited state space is considered for a given iteration run. It starts with a feasible initial solution, which can be visualized as a trajectory along the subsequent stages. Traditional DP is then applied in the neighborhood of this trajectory. At the end of each iteration step an improved trajectory is obtained, which is used as the trial trajectory for the next iteration step.
Considering only a limited state space vastly reduces computer time and memory requirements. However, the major drawback of this technique is the possibility of ending up at a local optimum (Turgeon, 1982). That can be avoided by starting with large increments to define the imaginary corridor around the actual trajectory and reducing them gradually as the iteration proceeds. Another way to avoid getting trapped at a local optimum is to repeat the iteration with different initial conditions. Finally, both approaches, i.e., varying increments and different starting solutions, can be coupled (Nandalal, 1986).
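The corridor idea with gradually reduced increments can be sketched as follows. This is a simplified illustration and not the published IDP algorithm: the three-point corridor, the rule of halving the increment when no improvement is found, the quadratic objective, and all numbers are assumptions.

```python
def idp(inflows, s_init, s_min, s_max, target, traj, delta, delta_min):
    """Incremental DP: DP restricted to a corridor around a trial trajectory.

    traj: feasible storage trajectory of length len(inflows) + 1 with
    traj[0] == s_init. The corridor at each stage is {s - delta, s, s + delta}
    clipped to [s_min, s_max]; delta is halved when no improvement is found.
    """
    nt = len(inflows)

    def cost(path):
        return sum((path[t] + inflows[t] - path[t + 1] - target) ** 2
                   for t in range(nt))

    best_cost = cost(traj)
    while delta >= delta_min:
        # Corridor states per stage; the initial storage is fixed.
        cands = [[s_init]] + [
            sorted({min(s_max, max(s_min, s + d)) for d in (-delta, 0.0, delta)})
            for s in traj[1:]]
        ctg = {b: 0.0 for b in cands[nt]}
        nxt = []
        for t in reversed(range(nt)):          # conventional DP in the corridor
            new_ctg, arg = {}, {}
            for a in cands[t]:
                new_ctg[a], arg[a] = float("inf"), None
                for b in cands[t + 1]:
                    release = a + inflows[t] - b
                    if release < 0:
                        continue               # infeasible transition
                    c = (release - target) ** 2 + ctg[b]
                    if c < new_ctg[a]:
                        new_ctg[a], arg[a] = c, b
            ctg = new_ctg
            nxt.insert(0, arg)
        new_traj = [s_init]
        for t in range(nt):
            new_traj.append(nxt[t][new_traj[-1]])
        new_cost = cost(new_traj)
        if new_cost < best_cost - 1e-12:
            traj, best_cost = new_traj, new_cost   # improved trial trajectory
        else:
            delta /= 2.0                           # narrow the corridor
    return traj, best_cost

# Hypothetical example: start from a constant-storage trajectory.
flows = [30.0, 10.0, 0.0, 20.0]
start = [10.0] * 5
final_traj, final_cost = idp(flows, 10.0, 0.0, 30.0, 15.0, start, 8.0, 0.5)
```

Repeating the run from several different initial trajectories and keeping the best result corresponds to the second remedy mentioned above.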
Heidari et al. (1971) systematized the use of IDP and referred to it as discrete differential dynamic programming (DDDP). Nopmongcol and Askew (1976) analyzed the difference between IDP and DDDP and concluded that DDDP is a generalization of IDP.
Trott and Yeh (1973) developed a method to determine the optimal planning of a reservoir system with cascade and parallel reservoir configurations. The policy was obtained by decomposing the original problem into a series of subproblems of one state variable each and by applying Bellman’s method of successive approximations in such a manner that the series of optimizations over the subproblems converge to a solution of the original problem. Each subproblem was analyzed using the DDDP technique.
Murray and Yakowitz (1979) developed a successive approximation dynamic programming technique using differential dynamic programming principles, constraining a sequential decision variable as applicable to multireservoir control problems in some cases. This approach is known as the constrained differential dynamic programming (CDDP) algorithm.
Karamouz and Houck (1987) formulated two dynamic programming models, one deterministic and one stochastic, to generate operating rules for a single reservoir. The deterministic model comprises a deterministic dynamic program, regression analysis, and simulation. The stochastic model is a stochastic dynamic program. It describes streamflow with a discrete lag-one Markov process. It was concluded that the deterministic model generated rules were effective in the operation of medium to very large reservoirs. The stochastic dynamic programming generated rules were more effective for the operation of small reservoirs.
Harboe (1987) applied DP to a system of reservoirs in which low-flow augmentation was the main purpose. The objective function used in the optimization is to maximize the minimum flow. A sequential optimization starting from upstream and considering one reservoir at a time is employed. The optimum results of one reservoir are used as the inputs to the downstream reservoir. The local optimum obtained was very close to the global optimum due to the high cross-correlation among monthly flows at different locations in the basin.
1.7.2 Stochastic dynamic programming based reservoir operation models
Under real-world conditions the time sequence of the streamflow time series or demands is not known in advance. Therefore, deterministic optimization models are often inadequate for effective water resources systems analysis due to the uncertainties inherent in the prediction of hydrological, economic, and other factors. The stochastic nature of the inflows can be handled by two approaches: implicit or explicit. In the implicit approach, a time series model is used to generate a number of synthetic inflow sequences. The system is optimized for each streamflow sequence and the operating rules are found by multiple regression. During the optimization the synthetic data series are considered as deterministic series.
Although the implicit approach can be easily adopted for single-reservoir optimization, numerous difficulties are encountered in applying it to multireservoir systems. The difficulty of obtaining a computationally manageable algorithm which derives the optimal results becomes much more severe when the streamflows into each reservoir are interdependent. In such a situation, complicated synthetic streamflow-generating models are used to obtain the cross-correlated streamflows into each of the reservoirs. The implicit approach optimizes the system operation under a large number of streamflow sequences, at the expense of computer time. It is therefore employed only for long-range planning purposes.
The explicit approach considers the probability distribution of the inflows rather than specific flow sequences. This approach generates an operation policy comprising storage targets or release decisions for every possible reservoir storage and inflow state in each time step, rather than a mere single schedule of reservoir releases.
Young (1967) proposed an implicit stochastic approach to optimize the operation of a single reservoir. He combined Monte Carlo simulation for synthetic streamflow generation, deterministic DP optimization, and regression analysis to derive the operating strategy which was expressed in terms of release as a function of initial storage volume in the reservoir and inflow during the time step.
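The three steps of such an implicit stochastic approach (synthetic generation, deterministic DP, regression) can be sketched in outline. The sketch is illustrative only and does not reproduce Young's model: the lag-one autoregressive generator truncated at zero, the quadratic objective, the linear form of the rule, and all parameter values are assumptions.

```python
import random

def synthetic_inflows(n, mean, sd, rho, rng):
    """Lag-one autoregressive inflows, truncated at zero (an assumption)."""
    q, out = mean, []
    for _ in range(n):
        q = mean + rho * (q - mean) + sd * (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        out.append(max(0.0, q))
    return out

def optimal_releases(storages, inflows, target, s0):
    """Deterministic DP; returns (storage, inflow, release) per period."""
    ns = len(storages)
    ctg, choice = [0.0] * ns, []
    for q in reversed(inflows):
        new, arg = [], []
        for s in range(ns):
            options = [((storages[s] + q - storages[s2] - target) ** 2 + ctg[s2], s2)
                       for s2 in range(ns) if storages[s] + q - storages[s2] >= 0]
            c, s2 = min(options)
            new.append(c)
            arg.append(s2)
        ctg = new
        choice.insert(0, arg)
    rows, s = [], s0
    for q, arg in zip(inflows, choice):
        s2 = arg[s]
        rows.append((storages[s], q, storages[s] + q - storages[s2]))
        s = s2
    return rows

def fit_rule(rows):
    """Least-squares fit of release = a + b * storage + c * inflow."""
    # Build the normal equations (X^T X) beta = X^T y.
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for s, q, r in rows:
        x = (1.0, s, q)
        for i in range(3):
            xty[i] += x[i] * r
            for k in range(3):
                xtx[i][k] += x[i] * x[k]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    m = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(3):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [u - f * v for u, v in zip(m[i], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

rng = random.Random(1)
storages = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
rows = []
for _ in range(20):                         # 20 synthetic 24-month sequences
    seq = synthetic_inflows(24, mean=30.0, sd=12.0, rho=0.5, rng=rng)
    rows += optimal_releases(storages, seq, target=30.0, s0=2)
a, b, c = fit_rule(rows)                    # release = a + b*storage + c*inflow
```

Each synthetic sequence is treated as deterministic during the optimization, and the regression then condenses the resulting optimal decisions into a single operating rule.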
Harboe et al. (1970) used deterministic DP to derive the optimal operation policy of a single reservoir serving multiple purposes: water supply, energy generation, flood and water quality control downstream of the reservoir. The last two purposes were considered as maximum storage and minimum downstream release constraints respectively, whereas the target water supply was incorporated as a parameter into the optimization procedure. By varying the level of the water supply target, successive DP optimizations were applied to obtain a family of the optimal operating trajectories with respect to the maximization of the firm energy production. The authors stressed the efficiency of the developed algorithm and suggested that it could easily be implemented as the optimization core of an implicit stochastic DP methodology.
Opricović and Djordjević (1976) presented an implicit SDP based algorithm for optimal long-term control of a single multipurpose reservoir with both direct and indirect users. The approach takes into account the fact that water already used for one purpose (direct user) can be utilized by another user located further downstream (indirect user). The developed optimization method maximizes the total benefit earned from the delivered water by applying DP at each of the three
© Cambridge University Press
Table of Contents
List of figures
List of tables
Water resources management
Role of reservoirs
Optimal reservoir operation
Conventional dynamic programming
Incremental dynamic programming
Stochastic dynamic programming
Dynamic programming in reservoir operations
Developments in dynamic programming
Incremental dynamic programming in optimal reservoir operation
IDP in optimal reservoir operation: single reservoir
IDP in optimal reservoir operation: multiple-reservoir system
Stochastic dynamic programming in optimal reservoir operation
SDP in optimal reservoir operation: single reservoir
SDP in optimal reservoir operation: multiple-reservoir system
Some algorithmic aspects of stochastic dynamic programming
Optimal reservoir operation for water quality
IDP based models in reservoir operation for quality
The Jarreh Reservoir in Iran
Application of the models to the Jarreh Reservoir
Large-scale reservoir system operation
Use of dynamic programming in multiple-reservoir operation
Decomposition method
Composite reservoir model formulation
Implicit stochastic dynamic programming analysis
Disaggregation/aggregation techniques based on dynamic programming
Optimal reservoir operation for flood control
Feitsui Reservoir Project in Taiwan
Operational mode switch system between long-term and short-term operation
Development of SDP model for long-term operation
Operational mode switch system
Application and sensitivity analysis
Some remarks on operational mode switch system