rightsasa.blogg.se - R latin hypercube sampling

The Monte Carlo method has two disadvantages. First, it usually needs a very high number of simulations. If the output quantity must be obtained by time-consuming numerical computations, the simulations can last a very long time, and the response surface method is not always usable. Second, the generated random values of the distribution function F (which serve for creating random numbers with nonstandard distributions) may not be spread evenly enough over the definition interval (0, 1). Sometimes more numbers are generated in one region than in others, so the generated quantity has a somewhat different distribution than demanded. This problem can appear especially if the output function depends on many input variables.

A method called Latin Hypercube Sampling (LHS) removes this drawback. The basic idea of LHS is similar to the generation of random numbers via the inverse probabilistic transformation (3) and Figure 2 shown in Chapter 15. The difference is that LHS does not create the values of F by generating random numbers scattered chaotically over the interval (0, 1), but assigns them certain fixed values. The interval (0, 1) is divided into several layers of the same width, and the x values are calculated via the inverse transformation F⁻¹ for the F values corresponding to the center of each layer. With a reasonably high number of layers (tens or hundreds), the created quantity x will have the proper probability distribution. This approach is called stratified sampling.

If the output variable y depends on several input quantities x1, x2, ..., xm, each quantity must be assigned values from all layers, and the layers of the individual variables must be randomly combined. This is done by randomly assigning the order numbers of layers to the individual input quantities. The definition interval of the distribution function F of each of the m variables is divided into N layers. N, the same for all variables, also corresponds to the number of trials (= simulation experiments). In each trial, the order numbers of layers are assigned randomly to the individual variables (X1, X2, ..., Xm). In this way, various layers of the individual variables are always randomly combined. In practice, this is achieved by means of random numbers and their rank-ordering. Then, each input variable is assigned the value corresponding to the center of the pertinent layer of its distribution function.

The application is illustrated on a case with four random quantities (X1, X2, X3, and X4) and the definition interval of F divided into five layers (see Fig.). Only five layers are used here for simplicity; usually, several tens of layers are used. In our case, Y will be calculated for five combinations of the four input quantities. Thus, 5 × 4 = 20 random numbers with uniform distribution in the interval (0, 1) are generated (see the table in the left part of Fig.). Then, the layer numbers for variable X1 (for example) are assigned to the individual trials according to the order of the random values for X1, ranked by size from the maximum to the minimum. The column for X1 contains the numbers 0.56, 0.25, 0.83, 0.17, and 0.30 for trials 1 to 5. The third trial (with the highest number, 0.83) therefore gets layer no. 1, the first trial layer no. 2, the fifth trial layer no. 3, the second trial layer no. 4, and the fourth trial layer no. 5. Similar operations are done for each variable. Thus, in the first trial, variables X1, X2, X3, and X4 are assigned the values corresponding to the second, fourth, second, and fifth layers of their distribution functions, respectively. The inverse probabilistic transformation F⁻¹ is then used to determine X1 from F1,1, etc. Now, the investigated quantity Y = Y(X1, X2, X3, X4) is calculated five times. The obtained values Y1, Y2, Y3, Y4, and Y5 can be used to determine statistical characteristics (mean, standard deviation, etc.).

I am intending to extract representative samples from populations (a, b, c, d, ...; see below) using the "clhs" package in R. The sampling process takes very long on my (multicore) computer, so I'd like to run the sampling procedures in parallel (using multiple CPU cores simultaneously).
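
The rank-ordering procedure described in this post (N layers, one uniform random number per variable and trial, layer centers mapped through F⁻¹) can be sketched in a few lines. This is only a minimal illustration in Python using the standard library, not the "clhs" package and not code from the quoted text. The function names `layers_from_random` and `latin_hypercube`, the choice of four standard-normal inputs, and the model Y = X1 + X2 + X3 + X4 are all assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, stdev


def layers_from_random(values):
    """Assign layer numbers 1..N by ranking values from maximum to minimum."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    layers = [0] * len(values)
    for rank, trial in enumerate(order, start=1):
        layers[trial] = rank
    return layers


def latin_hypercube(n_layers, inverse_cdfs, rng):
    """Return n_layers trials; each trial holds one value per input variable."""
    columns = []
    for inv_cdf in inverse_cdfs:
        # One uniform random number per trial decides the layer order.
        u = [rng.random() for _ in range(n_layers)]
        layers = layers_from_random(u)
        # The center of layer k (1-based) in (0, 1) is (k - 0.5) / N.
        columns.append([inv_cdf((k - 0.5) / n_layers) for k in layers])
    # Transpose: one column per variable -> one row per trial.
    return [list(trial) for trial in zip(*columns)]


# The X1 column quoted in the text: 0.56, 0.25, 0.83, 0.17, 0.30.
x1_layers = layers_from_random([0.56, 0.25, 0.83, 0.17, 0.30])
print(x1_layers)  # [2, 4, 1, 5, 3]

# Assumed inputs: four standard-normal variables, and Y as their sum.
rng = random.Random(0)
inverse_cdfs = [NormalDist(0.0, 1.0).inv_cdf] * 4
trials = latin_hypercube(5, inverse_cdfs, rng)
y = [sum(trial) for trial in trials]
print(mean(y), stdev(y))
```

Each variable's column contains every layer center exactly once, only in a different random order, which is what distinguishes this scheme from plain Monte Carlo sampling. With real distributions, the assumed `NormalDist(...).inv_cdf` entries would be replaced by the inverse distribution function of each input quantity.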