Moreover, most feature extraction methods are unsupervised, that is, the time series data are unlabeled. Where supervised methods are used, features are extracted based on their ability to predict some label, such as the future evolution of the time series. These features can subsequently be dealt with by the same methods used for steady state systems, such as principal component analysis (PCA), independent component analysis, kernel methods, etc. In dynamic PCA, first proposed by Ku et al., the data matrix is augmented with time-lagged copies of the process variables before PCA is applied. This approach implicitly estimates the autoregressive structure of the data.
As functions of the model, the T² and Q-statistics will also be functions of the lag parameters. Since the mean and covariance structures are assumed to be invariant, the same global model is used to evaluate observations at any future time point. Although dynamic PCA is designed to deal with autocorrelation in the data, the resultant score variables will still be autocorrelated or even cross-correlated when no autocorrelation is present [4, 6]. Several remedies have been proposed to alleviate this problem, for example, wavelet filtering [7], ARMA filtering [6], and the use of residuals from predictive models [8].
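As a rough illustration of how lag parameters enter a dynamic PCA model, the sketch below builds a time-lagged data matrix in plain Python. The function name `lagged_matrix`, the list-of-rows layout, and the ordering of the lagged blocks are assumptions made for this example and are not taken from Ku et al.; PCA would then be applied to the augmented matrix in the usual way.

```python
def lagged_matrix(X, lags):
    """Augment each observation x(t) with x(t-1), ..., x(t-lags).

    X is a list of observations (rows), each a list of m variable values.
    Returns N - lags rows, each of length m * (lags + 1).
    """
    if lags < 0 or lags >= len(X):
        raise ValueError("lags must satisfy 0 <= lags < number of rows")
    rows = []
    for t in range(lags, len(X)):
        row = []
        for k in range(lags + 1):
            # Block k holds the observation taken k steps before time t.
            row.extend(X[t - k])
        rows.append(row)
    return rows


# Toy data: four observations of two variables (values are illustrative only).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
X_lagged = lagged_matrix(X, lags=1)
```

With one lag, each row of `X_lagged` contains the current observation followed by the previous one, so the number of columns doubles and the first observation is lost.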
Nonlinear PCA models have been considered by several authors [9, 10, 11, 12, 13]. Stefatos and Hamza [14] and Hsu et al. Nonlinear variants of these approaches have been investigated by Cai et al. Slow feature analysis [20] is an unsupervised learning method, whereby functions g(x) are identified to extract slowly varying features y(t) from rapidly varying signals x(t). This is done virtually instantaneously, that is, one time slice of the output is based on very few time slices of the input.
Extensions of the method have been proposed by other authors [21, 22, 23]. Multiscale methods can be seen as a complementary approach preceding feature extraction from the time series.
In this case, each process variable is extended or replaced by different versions of the variable at different scales. For example, with multiscale PCA, wavelets are used to decompose the process variables under scrutiny into multiple scale representations before application of PCA to detect and identify faulty conditions in process operations. In this way, autocorrelation of variables is implicitly accounted for, resulting in a more sensitive method for detecting process anomalies. Multiscale PCA constitutes a promising extension of multivariate statistical process control methods, and several authors have reported successful applications thereof [24, 25, 26, 27].
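The decomposition step that precedes PCA in a multiscale scheme can be sketched with the simplest wavelet, the Haar wavelet. This is only an illustrative assumption: the text does not state which wavelet family the cited studies used, and the function names below are hypothetical. In a multiscale PCA setting, each process variable would be decomposed in this way and PCA applied to the coefficients at each scale.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_decompose(x, levels):
    """Return [detail_1, ..., detail_L, approx_L], finest scale first."""
    if levels < 1 or len(x) % (2 ** levels) != 0:
        raise ValueError("len(x) must be a multiple of 2**levels")
    coeffs = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose, working from the coarsest scale back up."""
    s = math.sqrt(2.0)
    approx = list(coeffs[-1])
    for detail in reversed(coeffs[:-1]):
        finer = []
        for a, d in zip(approx, detail):
            finer.extend([(a + d) / s, (a - d) / s])
        approx = finer
    return approx


# Toy signal (illustrative values only), decomposed into two scales.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_decompose(signal, levels=2)
restored = haar_reconstruct(coeffs)
```

Because the transform is orthonormal, the coefficients carry the same total energy as the signal and the original series can be recovered from them.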
Bakshi [28, 29] has proposed the use of a nonlinear multiscale principal component analysis methodology for process monitoring and fault detection based on multilevel wavelet decomposition and nonlinear component extraction by the use of input-training neural networks. In this case, wavelets are first used to decompose the data into different scales, after which PCA is applied to the reconstituted time series data. Choi et al. With singular spectrum analysis (SSA), the time series is first embedded into a p-dimensional space, giving the so-called trajectory matrix.
Singular value decomposition is then applied to decompose the trajectory matrix into a sum of elementary matrices [32, 33, 34], each of which is associated with a process mode. Subsequently, the elementary matrices that contribute to the norm of the original matrix are grouped, with each group giving an approximation of the original matrix.
Finally, the smoothed approximations or modes of the time series are recovered by diagonal averaging of the elementary matrices obtained from decomposing the trajectory matrix. Nonetheless, it has not been used widely in statistical process monitoring as yet, although some studies have provided promising results [2, 35, 36].
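The first and last steps of singular spectrum analysis, embedding and diagonal averaging, can be sketched in a few lines of plain Python. The singular value decomposition and grouping steps are deliberately left out here, and the function names and the choice to store lagged vectors as rows are assumptions made for illustration.

```python
def trajectory_matrix(x, p):
    """Embed a series into lagged vectors of length p (one vector per row)."""
    n = len(x)
    if not 1 <= p <= n:
        raise ValueError("p must satisfy 1 <= p <= len(x)")
    return [list(x[i:i + p]) for i in range(n - p + 1)]


def diagonal_average(M):
    """Average each anti-diagonal (i + j constant) to recover a series."""
    rows, cols = len(M), len(M[0])
    n = rows + cols - 1
    sums = [0.0] * n
    counts = [0] * n
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += M[i][j]
            counts[i + j] += 1
    return [total / count for total, count in zip(sums, counts)]


# Toy series (illustrative values only) embedded with p = 3.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
T = trajectory_matrix(series, p=3)
recovered = diagonal_average(T)
```

Applied directly to the trajectory matrix, diagonal averaging returns the original series. In a full analysis it would instead be applied to each grouped sum of elementary matrices, yielding one smoothed mode per group.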
Table 1 gives a summary of multiscale methods that have been considered in process monitoring schemes over the last two decades. In the latter case, the scores of the eigenvectors would represent an orbit or attractor with some geometrical structure, depending on the frequencies with which different regions of the phase space are visited.
A new multivariate statistical process monitoring method using principal component analysis
The topology of this attractor is a direct result of the underlying dynamics of the system being observed, and the changes in the topology are usually an indication of a change in the parameters or structure of the system dynamics. Therefore, descriptors of the attractor geometry can serve as sensitive diagnostic variables to monitor abnormal system behavior. For process monitoring purposes, the data captured in a moving window are embedded in a phase space, and descriptors such as correlation dimension [49, 50, 51], Lyapunov exponents, and information entropy [49] have been proposed to monitor deterministic or potentially chaotic systems.
These approaches have not yet found widespread adoption in industry, since the reliability of the descriptors may be compromised by high levels of signal noise. Process circuits or plants lend themselves naturally to representation by networks, and process monitoring schemes can exploit this.
For example, Cai et al. The edges of the network were determined by means of kernel canonical correlation analysis (a nonlinear approach to correlation relationships between sets of variables). Features were extracted from the variables based on the dynamic average degree of each vertex in the network. A standard PCA model, as described in Section 1. Case studies have indicated that this could yield considerable improvement in the reliability of the model to detect process disturbances. Any given sequence of numbers or time series can be characterized by a similarity matrix containing measures of similarity between pairs of points.
A recurrence matrix is generated by binary quantization of the similarity matrix, based on a user-specified threshold value.
This thresholded matrix can be portrayed graphically as a recurrence plot, amenable to qualitative interpretation. The recurrence matrix, consisting of zeros and ones, can also be used as a basis to extract features that are representative of the dynamic behavior of the time series. This approach is widely referred to as recurrence quantification analysis, and in process engineering, it has mainly been used in the description of electrochemical phenomena and corrosion [53, 54, 55, 56, 57, 58], but in principle has general applicability to any dynamic system.
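A minimal sketch of the thresholding step follows, assuming a scalar time series and the absolute difference as the distance measure (neither choice is specified in the text). The recurrence rate computed at the end is one standard recurrence quantification feature, included purely as an example; the function names are hypothetical.

```python
def distance_matrix(x):
    """Pairwise absolute differences between all points of a scalar series."""
    return [[abs(a - b) for b in x] for a in x]


def recurrence_matrix(x, threshold):
    """Binary matrix: 1 where two points lie within `threshold` of each other."""
    return [
        [1 if d <= threshold else 0 for d in row]
        for row in distance_matrix(x)
    ]


def recurrence_rate(R):
    """Fraction of entries in the recurrence matrix that are recurrent."""
    n = len(R)
    return sum(sum(row) for row in R) / (n * n)


# Toy series with two clusters of values (illustrative only).
x = [0.0, 0.1, 1.0, 1.1]
R = recurrence_matrix(x, threshold=0.2)
rate = recurrence_rate(R)
```

The result depends strongly on the threshold: a very small value leaves only the main diagonal, while a very large one fills the matrix with ones.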
More recent extensions of recurrence quantification analysis have been considered by using the unthresholded similarity matrix as a basis for feature extraction. This is also referred to as global, as opposed to local recurrence quantification described in Section 2. The resulting recurrence plot can consequently be treated as an artificial image amenable to analysis by a large variety of algorithms normally applied to textural images, as discussed in more detail in Section 4. Autocorrelated data can be addressed by fitting models to the data and analyzing the residuals, instead of the variables.
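The residual-based idea can be illustrated with the simplest possible time series model, a first-order autoregressive model fitted by ordinary least squares. This is an assumption chosen for brevity and is not the model used in any of the cited studies; the function names and the toy data are hypothetical.

```python
def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def ar1_residuals(x, c, phi):
    """One-step-ahead prediction errors of the fitted model."""
    return [x[t] - (c + phi * x[t - 1]) for t in range(1, len(x))]


# Toy autocorrelated series driven by a fixed disturbance sequence
# (illustrative values only).
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.1, -0.1, 0.05]
series = [2.0]
for e in noise:
    series.append(1.0 + 0.5 * series[-1] + e)

c_hat, phi_hat = fit_ar1(series)
residuals = ar1_residuals(series, c_hat, phi_hat)
```

If the model captures the dynamics adequately, the residuals are approximately uncorrelated and can be charted with conventional control limits in place of the raw variable.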
Apart from ARIMA models, other models, such as neural networks [60, 61, 62], decision trees [63], and just-in-time-learning with PCA [64], have also been proposed. If it is assumed that the data matrix X contains all the dynamic information of the system, then the use of predictive models can be viewed as an attempt to remove all the dynamic information from the system to yield Gaussian residuals that can be monitored in the normal way. State space models offer a principled approach for the identification of the subspaces containing the data.
This can be summarized as follows. State space models and their variants have been considered by several authors [65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75]. In principle, machine learning models are better able to deal with complex nonlinear systems than linear models, and some authors have considered the use of these approaches.
For example, Chen and Liao [62] have used a multilayer perceptron neural network to remove the nonlinear and dynamic characteristics of processes to generate residuals that could be used as input to a PCA model for the construction of simple monitoring charts. Guh and Shiue [63] have used a decision tree to detect shifts in the multivariate means of process data. Auret and Aldrich [48] have considered the use of random forests in the detection of change points in process systems.
In addition, Aldrich and Auret [2] have compared the use of random forests with autoassociative neural networks and singular spectrum analysis in a conventional process monitoring framework. The application of deep learning in process monitoring is an emerging area of research that shows particular promise.
This includes the use of stacked autoencoders [76], deep long short-term memory (LSTM) neural networks [77], and convolutional neural networks. Table 2 gives an overview of the feature extraction methods that have been investigated over the last few decades. Finally, as an example of the application of a process monitoring scheme incorporating feature extraction from time series data in a moving window, the following study can be considered.
It is based on the Tennessee Eastman benchmark process widely used in these types of studies. The feature extraction process considered here is an extension of recurrence quantification analysis discussed in Section 2. Instead of using thresholded recurrence plots, unthresholded or global recurrence plots are considered, as explained in more detail below.
The Tennessee Eastman (TE) process was proposed by Downs and Vogel [97] and has been used as a benchmark in numerous process control and monitoring studies [98]. It captures the dynamic behavior of an actual chemical process, the layout of which is shown in Figure 3. The plant consists of five units, namely a reactor, condenser, compressor, stripper, and separator, and involves eight components: four gaseous reactants (A, C, D, E), one inert reactant (B), and three liquid products (F, G, H) [97].
In this instance, the plant-wide control structure suggested by Lyman and Georgakis [99] was used to simulate the process and to generate data related to varying operating conditions. A total of four data sets were used, that is, one data set associated with normal operating conditions (NOC) and the remaining three associated with three different fault conditions. The TE process comprises 52 variables, of which 22 are continuous process measurements, 19 are composition measurements, and the remaining 11 are manipulated variables.
These variables are presented in Table 3. The NOC samples were used to construct an off-line process monitoring model based on a moving window of length b, sliding along the time series with a step size s. The three fault conditions are summarized in Table 4.
Fault conditions 3, 9, and 15 are the most difficult to detect, and many fault diagnostic approaches fail to do so reliably. In this case study, the approach previously proposed by Bardinas et al.
The methodology can be briefly summarized as shown in Figure 4.

Figure 4. Process monitoring methodology after Bardinas et al. A window of user-defined length b slides along the time series (A) with a user-defined step size s, yielding time series segments (B), each of which can be represented by a similarity matrix (C) that is subsequently treated as an image from which features can be extracted via algorithms normally used in multivariate image analysis (D).
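The windowing and similarity matrix steps of this methodology can be sketched as follows, assuming a single scalar variable and the absolute difference as the distance measure. Both are simplifications for illustration, and the function names are hypothetical rather than taken from Bardinas et al.

```python
def sliding_windows(series, b, s):
    """Segments of length b, taken every s samples along the series."""
    if b < 1 or s < 1:
        raise ValueError("window length b and step size s must be positive")
    return [
        series[start:start + b]
        for start in range(0, len(series) - b + 1, s)
    ]


def similarity_matrix(segment):
    """Unthresholded (global) distance matrix of one window."""
    return [[abs(a - c) for c in segment] for a in segment]


# Toy series of ten samples (illustrative only), window b = 4, step s = 3.
series = [float(v) for v in range(10)]
segments = sliding_windows(series, b=4, s=3)
images = [similarity_matrix(seg) for seg in segments]
```

Each entry of `images` is a b-by-b matrix that can be handled as a grayscale image, so the number of feature vectors produced depends on both b and s.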
Two sets of features were extracted from the similarity or distance matrices, namely features from the gray level co-occurrence matrices (GLCMs) of the images, as well as local binary pattern (LBP) features, as briefly discussed below. GLCMs assign distributions of gray level pairs of neighboring pixels in an image based on the spatial relationships between the pixels. From this matrix, various textural descriptors can be defined. Only four of these were used, as defined by Haralick et al. LBPs are nonparametric descriptors of the local structure of the image [ ]. The LBP operator is defined for a pixel in the image as a set of binary values obtained by comparing the center pixel intensity with its neighboring pixels.
If the intensity of a neighboring pixel exceeds that of the center pixel, the neighbor is assigned a value of 1, otherwise 0. Apart from the selection of a feature extraction method, one of the main choices that needs to be made in the process monitoring scheme is the length of the sliding window. If the window is too short, the essential dynamics of the time series will not be captured. On the other hand, if it is too long, there will be a considerable lag before any change in the process can be detected. There is also the possibility that transient changes may go undetected altogether.
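The two texture descriptors discussed above can be sketched for a small quantized image. Several details are assumptions made for illustration: the horizontal neighbor offset, the 3-by-3 neighborhood and bit ordering of the LBP code, and the use of contrast and energy as example GLCM descriptors (this excerpt does not state which four descriptors were used). All function names are hypothetical.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_contrast(P):
    """Sum of p(i, j) * (i - j)^2 over all gray level pairs."""
    return sum(p * (i - j) ** 2 for i, row in enumerate(P) for j, p in enumerate(row))


def glcm_energy(P):
    """Sum of squared co-occurrence probabilities."""
    return sum(p * p for row in P for p in row)


def lbp_code(image, r, c):
    """8-neighbor LBP code of an interior pixel (1 where neighbor > center)."""
    center = image[r][c]
    # Clockwise from the top-left neighbor; this ordering is an assumption.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] > center:
            code |= 1 << bit
    return code


# Toy image with four gray levels, 0 to 3 (illustrative values only).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [3, 1, 0],
]
P = glcm(image, levels=4)
contrast = glcm_contrast(P)
energy = glcm_energy(P)
center_code = lbp_code(image, 1, 1)
```

For a real similarity matrix, the continuous distances would first have to be quantized into a fixed number of gray levels, and a histogram of LBP codes over all interior pixels would typically serve as the feature vector.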
In the case of a moving window, the step size of the moves also needs to be considered. The selection of these two parameters can be done by means of a grid search, the results of which are shown in Figure 5.