Currently, extensive signal analysis is performed in order to evaluate structural health of turbomachinery blades. This approach is affected by constraints of time and the availability of qualified personnel. Thus, new approaches to blade dynamics identification that provide faster and more accurate results are sought after. Generally, modal analysis is employed in acquiring dynamic properties of a vibrating turbomachinery blade and is widely adopted in condition monitoring of blades. The analysis provides useful information on the different modes of vibration and natural frequencies by exploring different shapes that can be taken up during vibration since all mode shapes have their corresponding natural frequencies. Experimental modal testing and finite element analysis are the traditional methods used to evaluate mode shapes with limited application to real live scenario to facilitate a robust condition monitoring scheme. For a real time mode shape evaluation, rapid evaluation and low computational cost is required and traditional techniques are unsuitable. In this study, artificial neural network is developed to evaluate the mode shape of a lab scale rotating blade assembly by using result from finite element modal analysis as training data. The network performance evaluation shows that artificial neural network (ANN) is capable of mapping the correlation between natural frequencies and mode shapes. This is achieved without the need of extensive signal analysis. The approach offers advantage from the perspective that the network is able to classify mode shapes and can be employed in real time including simplicity in implementation and accuracy of the prediction. The work paves the way for further development of robust condition monitoring system that incorporates real time mode shape evaluation.
This paper discusses the idea of capturing an expert’s knowledge in the form of human understandable rules and then inserting these rules into a dynamic cell structure (DCS) neural network. The DCS is a form of self-organizing map that can be used for many purposes, including classification and prediction. This particular neural network is considered to be a topology preserving network that starts with no pre-structure, but assumes a structure once trained. The DCS has been used in mission and safety-critical applications, including adaptive flight control and health-monitoring in aerial vehicles. The approach is to insert expert knowledge into the DCS before training. Rules are translated into a pre-structure and then training data are presented. This idea has been demonstrated using the well-known Iris data set and it has been shown that inserting the pre-structure results in better accuracy with the same training.
Since the vision system application in industrial environment for autonomous purposes is required intensely, the image recognition technique becomes an important research topic. Here, deep learning algorithm is employed in image system to recognize the industrial object and integrate with a 7A6 Series Manipulator for object automatic gripping task. PC and Graphic Processing Unit (GPU) are chosen to construct the 3D Vision Recognition System. Depth Camera (Intel RealSense SR300) is employed to extract the image for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in Convolution neural network (CNN) structure for object classification and center point prediction. Additionally, image processing strategy is used to find the object contour for calculating the object orientation angle. Then, the specified object location and orientation information are sent to robotic controller. Finally, a six-axis manipulator can grasp the specific object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 has been successfully employed to detect the object location and category with confidence near 0.9 and 3D position error less than 0.4 mm. It is useful for future intelligent robotic application in industrial 4.0 environment.
The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.
With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.
In this paper we develop a model that couples Two Concurrent Convolution Neural Network with different filters (TC*CNN) for face recognition and compare its performance to an existing sequential CNN (base model). We also test and compare the quality and performance of the models on three datasets with various levels of complexity (easy, moderate, and difficult) and show that for the most complex datasets, edges will produce the most accurate and efficient results. We further show that in such cases while Support Vector Machine (SVM) models are fast, they do not produce accurate results.
Conventional reservoir prediction methods ar not sufficient to explore the implicit relation between seismic attributes, and thus data utilization is low. In order to improve the predictive classification accuracy of reservoir lithology, this paper proposes a deep learning lithology prediction method based on ResNet (Residual Neural Network) and SENet (Squeeze-and-Excitation Neural Network). The neural network model is built and trained by using seismic attribute data and lithology data of Shengli oilfield, and the nonlinear mapping relationship between seismic attribute and lithology marker is established. The experimental results show that this method can significantly improve the classification effect of reservoir lithology, and the classification accuracy is close to 70%. This study can effectively predict the lithology of undrilled area and provide support for exploration and development.
This paper describes the effects of photovoltaic voltage changes on Multi-level inverter (MLI) due to solar irradiation variations, and methods to overcome these changes. The irradiation variation affects the generated voltage, which in turn varies the switching angles required to turn-on the inverter power switches in order to obtain minimum harmonic content in the output voltage profile. Genetic Algorithm (GA) is used to solve harmonics elimination equations of eleven level inverters with equal and non-equal dc sources. After that artificial neural network (ANN) algorithm is proposed to generate appropriate set of switching angles for MLI at any level of input dc sources voltage causing minimization of the total harmonic distortion (THD) to an acceptable limit. MATLAB/Simulink platform is used as a simulation tool and Fast Fourier Transform (FFT) analyses are carried out for output voltage profile to verify the reliability and accuracy of the applied technique for controlling the MLI harmonic distortion. According to the simulation results, the obtained THD for equal dc source is 9.38%, while for variable or unequal dc sources it varies between 10.26% and 12.93% as the input dc voltage varies between 4.47V nd 11.43V respectively. The proposed ANN algorithm provides satisfied simulation results that match with results obtained by alternative algorithms.
The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is not bad and not a normal bean. The peaberry is born in an only single seed, relatively round seed from a coffee cherry instead of the usual flat-sided pair of beans. It has another value and flavor. To make the taste of the coffee better, it is necessary to separate the peaberry and normal bean before green coffee beans roasting. Otherwise, the taste of total beans will be mixed, and it will be bad. In roaster procedure time, all the beans shape, size, and weight must be unique; otherwise, the larger bean will take more time for roasting inside. The peaberry has a different size and different shape even though they have the same weight as normal beans. The peaberry roasts slower than other normal beans. Therefore, neither technique provides a good option to select the peaberries. Defect beans, e.g., sour, broken, black, and fade bean, are easy to check and pick up manually by hand. On the other hand, the peaberry pick up is very difficult even for trained specialists because the shape and color of the peaberry are similar to normal beans. In this study, we use image processing and machine learning techniques to discriminate the normal and peaberry bean as a part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate the peaberry and normal bean. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The trained artificial neural network with high performance CPU and GPU in this work will be simply installed into the inexpensive and low in calculation Raspberry Pi system. We assume that this system will be used in under developed countries. The study evaluates and compares the feasibility of the methods in terms of accuracy of classification and processing speed.
Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.
Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
In the last few years, Automatic Number Plate Recognition (ANPR) systems have become widely used in the safety, the security, and the commercial aspects. Forethought, several methods and techniques are computing to achieve the better levels in terms of accuracy and real time execution. This paper proposed a computer vision algorithm of Number Plate Localization (NPL) and Characters Segmentation (CS). In addition, it proposed an improved method in Optical Character Recognition (OCR) based on Deep Learning (DL) techniques. In order to identify the number of detected plate after NPL and CS steps, the Convolutional Neural Network (CNN) algorithm is proposed. A DL model is developed using four convolution layers, two layers of Maxpooling, and six layers of fully connected. The model was trained by number image database on the Jetson TX2 NVIDIA target. The accuracy result has achieved 95.84%.
Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.
The shift towards decision making (DM) based on artificial intelligence (AI) techniques will change the way in which consumer markets and our societies function. Through AI, predictive analytics is being used by businesses to identify these patterns and major trends with the objective to improve the DM and influence future business outcomes. This paper proposes an Artificial Neural Network (ANN) approach to predict the success of telemarketing calls for selling bank long-term deposits. To validate the proposed model, we uses the bank marketing data of 41188 phone calls. The ANN attains 98.93% of accuracy which outperforms other conventional classifiers and confirms that it is credible and valuable approach for telemarketing campaign managers.
This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.
Uncontrolled growth of abnormal cells in the lung in the form of tumor can be either benign (non-cancerous) or malignant (cancerous). Patients with Lung Cancer (LC) have an average of five years life span expectancy provided diagnosis, detection and prediction, which reduces many treatment options to risk of invasive surgery increasing survival rate. Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) for earlier detection of cancer are common. Gaussian filter along with median filter used for smoothing and noise removal, Histogram Equalization (HE) for image enhancement gives the best results without inviting further opinions. Lung cavities are extracted and the background portion other than two lung cavities is completely removed with right and left lungs segmented separately. Region properties measurements area, perimeter, diameter, centroid and eccentricity measured for the tumor segmented image, while texture is characterized by Gray-Level Co-occurrence Matrix (GLCM) functions, feature extraction provides Region of Interest (ROI) given as input to classifier. Two levels of classifications, K-Nearest Neighbor (KNN) is used for determining patient condition as normal or abnormal, while Artificial Neural Networks (ANN) is used for identifying the cancer stage is employed. Discrete Wavelet Transform (DWT) algorithm is used for the main feature extraction leading to best efficiency. The developed technology finds encouraging results for real time information and on line detection for future research.
The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.
Certain systems can function well only if they recognize the sound environment as humans do. In this research, we focus on sound classification by adopting a convolutional neural network and aim to develop a method that automatically classifies various environmental sounds. Although the neural network is a powerful technique, the performance depends on the type of input data. Therefore, we propose an approach via a slice bispectrogram, which is a third-order spectrogram and is a slice version of the amplitude for the short-time bispectrum. This paper explains the slice bispectrogram and discusses the effectiveness of the derived method by evaluating the experimental results using the ESC‑50 sound dataset. As a result, the proposed scheme gives high accuracy and stability. Furthermore, some relationship between the accuracy and non-Gaussianity of sound signals was confirmed.
Speech to text in Malay language is a system that converts Malay speech into text. The Malay language recognition system is still limited, thus, this paper aims to investigate the performance of ten Malay words obtained from the online Malay news. The methodology consists of three stages, which are preprocessing, feature extraction, and speech classification. In preprocessing stage, the speech samples are filtered using pre emphasis. After that, feature extraction method is applied to the samples using Mel Frequency Cepstrum Coefficient (MFCC). Lastly, speech classification is performed using Feedforward Neural Network (FFNN). The accuracy of the classification is further investigated based on the hidden layer size. From experimentation, the classifier with 40 hidden neurons shows the highest classification rate which is 94%.
The Bacterial Foraging Optimization (BFO) algorithm is inspired by the behavior of bacteria such as Escherichia coli or Myxococcus xanthus when searching for food, more precisely the chemotaxis behavior. Bacteria perceive chemical gradients in the environment, such as nutrients, and also other individual bacteria, and move toward or in the opposite direction to those signals. The application example considered as a case study consists in establishing the dependency between the reaction yield of hydrogels based on polyacrylamide and the working conditions such as time, temperature, monomer, initiator, crosslinking agent and inclusion polymer concentrations, as well as type of the polymer added. This process is modeled with a neural network which is included in an optimization procedure based on BFO. An experimental study of BFO parameters is performed. The results show that the algorithm is quite robust and can obtain good results for diverse combinations of parameter values.
Yi is an ethnic group mainly living in mainland China, with its own spoken and written language systems, after development of thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and spreading convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.
Prediction of wall shear stress in a rectangular channel, with non-homogeneous roughness distribution, was studied. Estimation of shear stress is an important subject in hydraulic engineering, since it affects the flow structure directly. In this study, the Genetic Algorithm Artificial (GAA) neural network is introduced as a hybrid methodology of the Artificial Neural Network (ANN) and modified Genetic Algorithm (GA) combination. This GAA method was employed to predict the wall shear stress. Various input combinations and transfer functions were considered to find the most appropriate GAA model. The results show that the proposed GAA method could predict the wall shear stress of open channels with high accuracy, by Root Mean Square Error (RMSE) of 0.064 in the test dataset. Thus, using GAA provides an accurate and practical simple-to-use equation.
This paper aims at bringing a scientific contribution to the cardiac arrhythmia biomedical diagnosis systems; more precisely to the study of the amelioration of cardiac arrhythmia classification performance using artificial neural network, adaptive neuro-fuzzy and fuzzy inference systems classifiers. The purpose of this amelioration is to enable cardiologists to make reliable diagnosis through automatic cardiac arrhythmia analyzes and classifications based on high confidence classifiers. In this study, six classes of the most commonly encountered arrhythmias are considered: the Right Bundle Branch Block, the Left Bundle Branch Block, the Ventricular Extrasystole, the Auricular Extrasystole, the Atrial Fibrillation and the Normal Cardiac rate beat. From the electrocardiogram (ECG) extracted parameters, we constructed a matrix (360x360) serving as an input data sample for the classifiers based on neural networks and a matrix (1x6) for the classifier based on fuzzy logic. By varying three parameters (the quality of the neural network learning, the data size and the quality of the input parameters) the automatic classification permitted us to obtain the following performances: in terms of correct classification rate, 83.6% was obtained using the fuzzy logic based classifier, 99.7% using the neural network based classifier and 99.8% for the adaptive neuro-fuzzy based classifier. These results are based on signals containing at least 360 cardiac cycles. Based on the comparative analysis of the aforementioned three arrhythmia classifiers, the classifiers based on neural networks exhibit a better performance.