International Science Index

196
10010776
Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients
Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy on the training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

88
downloads
195
10010599
River Stage-Discharge Forecasting Based on Multiple-Gauge Strategy Using EEMD-DWT-LSSVM Approach
Abstract:
This study presents a hybrid pre-processing approach along with a conceptual model to enhance the accuracy of river discharge prediction. To achieve this goal, the Ensemble Empirical Mode Decomposition (EEMD) algorithm, Discrete Wavelet Transform (DWT) and Mutual Information (MI) were employed as a hybrid pre-processing approach coupled with a Least Squares Support Vector Machine (LSSVM). A conceptual strategy, namely a multi-station model, was developed to forecast the Souris River discharge more accurately. The strategy used herein was capable of covering the uncertainties and complexities of river discharge modeling. DWT and EEMD were coupled, and feature selection was performed on the decomposed sub-series using MI so that they could be employed in the multi-station model. In the proposed feature selection method, uninformative sub-series were omitted to achieve better performance. The results confirmed the efficiency of the proposed DWT-EEMD-MI approach in improving the accuracy of multi-station modeling strategies.
119
downloads
194
10010374
Specific Emitter Identification Based on Refined Composite Multiscale Dispersion Entropy
Abstract:
Wireless communication networks are developing rapidly, so wireless security is becoming more and more important. Specific emitter identification (SEI) is a vital part of wireless communication security as a technique to identify unique transmitters. In this paper, an SEI method based on multiscale dispersion entropy (MDE) and refined composite multiscale dispersion entropy (RCMDE) is proposed. The MDE and RCMDE algorithms are used to extract features for the identification of five wireless devices, and a cross-validation support vector machine (CV-SVM) is used as the classifier. The experimental results show that the total identification accuracy is 99.3%, even at a low signal-to-noise ratio (SNR) of 5 dB, which indicates that MDE and RCMDE can describe the communication signal series well. In addition, compared with other methods, the proposed method is effective and provides better accuracy and stability for SEI.
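The feature extraction step named in the abstract above can be illustrated with a minimal sketch of normalized dispersion entropy and its multiscale variant. The abstract gives no parameters, so the number of classes, embedding dimension and delay below are assumptions, as are the function names. The sketch re-estimates the mean and standard deviation at each scale for brevity, and it does not implement the refined composite averaging of RCMDE.

```python
import math
from collections import Counter

def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def dispersion_entropy(signal, classes=3, dim=2, delay=1):
    """Normalized dispersion entropy in [0, 1] (parameter values are assumptions)."""
    mean = sum(signal) / len(signal)
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / len(signal))
    if std == 0:
        return 0.0  # a constant series produces a single pattern
    mapped = []
    for x in signal:
        # Normal CDF maps the sample to (0, 1); rounding maps it to a class 1..classes.
        y = 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2))))
        mapped.append(min(classes, max(1, round(classes * y + 0.5))))
    n_patterns = len(mapped) - (dim - 1) * delay
    counts = Counter(
        tuple(mapped[i + k * delay] for k in range(dim)) for i in range(n_patterns)
    )
    entropy = -sum((c / n_patterns) * math.log(c / n_patterns) for c in counts.values())
    return entropy / math.log(classes ** dim)

def multiscale_dispersion_entropy(signal, max_scale=3, **kwargs):
    """One entropy value per scale; the list can serve as a feature vector."""
    return [dispersion_entropy(coarse_grain(signal, s), **kwargs)
            for s in range(1, max_scale + 1)]
```

In a pipeline like the one described, the per-scale values would be passed to a classifier; the choice of `max_scale` is again hypothetical.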
136
downloads
193
10009991
Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values
Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements, which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease (AD) studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN), which is essential for developing effective and early treatment methods. To address these challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm to handle missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes AD, CN, EMCI, and LMCI. Using 10-fold cross-validation, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and a recall of 80.51%, supporting the more natural and promising multiclass classification.
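The 10-fold cross-validation mentioned in the abstract above can be sketched as an index splitter. This is a minimal illustration: the abstract does not say whether folds were stratified by class, so the plain shuffled split and the function name `k_fold_indices` are assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # fixed seed keeps splits reproducible
    folds = [indices[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

A model would be trained on each `train` list and scored on the matching `test` list, with the reported metric being the average over the ten folds.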

308
downloads
192
10010012
Analysis of Image Segmentation Techniques for Diagnosis of Dental Caries in X-ray Images
Abstract:

Early diagnosis of dental caries is essential for maintaining dental health. In this paper, a method for the diagnosis of dental caries is proposed using a Laplacian filter, adaptive thresholding, texture analysis and a Support Vector Machine (SVM) classifier. The proposed method is compared with Otsu thresholding, watershed segmentation and the active contour method. Adaptive thresholding has comparatively better performance, with 96.9% accuracy and 96.1% precision. The results are validated using a statistical method, two-way ANOVA, at a significance level of 5%, which shows that the interaction of the proposed method with the performance measures is significant. Hence, the proposed technique could be used for the detection of dental caries in an automated computer-assisted diagnosis system.
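Otsu thresholding, one of the baselines named in the abstract above, picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal sketch over a flat list of integer gray levels; the 256-level range and the function name are assumptions, and it is not the adaptive thresholding variant the paper proposes.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels at or below the candidate threshold
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

A binary mask then follows from comparing each pixel with the returned threshold.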

281
downloads
191
10009852
Early Recognition and Grading of Cataract Using a Combined Log Gabor/Discrete Wavelet Transform with ANN and SVM
Abstract:
The eyes are considered to be among the most sensitive and important organs of the human body. Thus, any eye disorder will affect the patient in all aspects of life. Cataract is one of those eye disorders that lead to blindness if not treated correctly and quickly. This paper demonstrates a model for automatic detection, classification, and grading of cataracts based on image processing techniques and artificial intelligence. The proposed system is developed to ease the cataract diagnosis process for both ophthalmologists and patients. The wavelet transform combined with the 2D Log Gabor wavelet transform was used for feature extraction on a dataset of 120 eye images, followed by a classification process that assigned the images to three classes: normal, early, and advanced stage. The two classifiers used, the support vector machine (SVM) and the artificial neural network (ANN), were compared on the same dataset of 120 eye images. It was concluded that SVM gave better results than ANN: SVM achieved 96.8% accuracy, whereas ANN achieved 92.3% accuracy.
326
downloads
190
10009891
Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine
Abstract:

Machine Learning and Data Mining are two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a widely used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries, especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study introduces a hybrid classification model based on information theory and Support Vector Machine (SVM), using the air quality data of four cities in China, namely Beijing, Guangzhou, Shanghai and Tianjin, from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified daily air quality into 6 levels, namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent, based on their respective Air Quality Index (AQI) values. Using information theory, information gain (IG) is calculated and feature selection is performed for both categorical features and continuous numeric features. The SVM algorithm is then applied to the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM alone, Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.
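The information gain criterion used for feature selection in the abstract above is the reduction in label entropy obtained by splitting on a feature. Below is a minimal sketch for categorical features only; continuous features would first need binning, which is an assumption the abstract does not detail. The function names are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

Features would then be ranked by this score and the top-ranked ones kept for the classifier; the cut-off is a design choice not given in the abstract.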

343
downloads
189
10009491
Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies
Abstract:

Classification of high-resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among researchers both within and outside the field of remote sensing. In this paper, a BOVW model with pixel-based low-level features has been implemented to classify a subset of a San Francisco Bay PolSAR image acquired by RADARSAT-2 in C-band. We used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. The overall classification accuracy of 90.95% shows that the proposed algorithm is comparable with state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method decreased the undesirable speckle effect of SAR images.
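The core of a Bag of Visual Words representation, as referenced in the abstract above, is to assign each low-level descriptor to its nearest codeword and describe a segment by the resulting histogram. The sketch below assumes a codebook is already available (it is commonly learned with a clustering method such as k-means, which the abstract does not confirm); the function name and the tiny two-dimensional descriptors are hypothetical.

```python
import math

def bovw_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments."""
    hist = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword closest to this descriptor (Euclidean distance).
        nearest = min(range(len(codebook)), key=lambda i: math.dist(d, codebook[i]))
        hist[nearest] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Each segment's histogram would then be the input to the classifier.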

275
downloads
188
10009167
Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine
Abstract:

Intrusion detection systems (IDS) are the main components of network security. These systems analyze network events to detect intrusions. An IDS is designed by training on normal or attack traffic data. Machine learning methods are the best way to design IDSs. In the method presented in this article, the pruning algorithm of the C5.0 decision tree is used to reduce the features of the traffic data, and the IDS is trained with the least squares support vector machine (LS-SVM) algorithm. The remaining features are then ranked according to the predictor importance criterion, and the least important features are eliminated in that order. The remaining features at this stage, which yield the highest accuracy in LS-SVM, are selected as the final features. Compared with similar studies that examined feature selection for the LS-SVM model, the features obtained give better accuracy, true positive rate, and false positive rate. The results are evaluated on the UNSW-NB15 dataset.

522
downloads
187
10008539
Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments
Abstract:
With the widespread adoption of Internet-connected devices and the prevalence of Internet of Things (IoT) applications, there is increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs) is one prominent example. The ability of a machine learning technique to find meaningful spatio-temporal relations in high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques for classifying ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques: AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Our results show significant performance differences between the evaluated techniques. Overall, neural network based techniques have shown superiority over the other tested techniques.
1077
downloads
186
10008834
Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach
Abstract:
With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is in transition from clinician-oriented to technology-oriented. Many people around the world die of cancer because the disease was not diagnosed at an early stage. Nowadays, computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper carries out a comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is based on the standard metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results show that ANN achieved the highest accuracy (99.4%) when tested with the breast cancer dataset. On the other hand, when these ML classifiers were tested with the cervical cancer dataset, the Ensemble (Bagged Tree) technique gave better accuracy (93.1%) than the other classifiers.
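The four evaluation metrics named in the abstract above follow directly from the counts of true and false positives and negatives. The sketch below computes them for a binary problem; the function name and the choice of `1` as the positive label are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1-score and accuracy for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```

Reporting precision and recall alongside accuracy matters for diagnostic data, where the positive class is often much rarer than the negative one.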
825
downloads
185
10008509
Trabecular Bone Radiograph Characterization Using Fractal, Multifractal Analysis and SVM Classifier
Abstract:
Osteoporosis is a common disease characterized by low bone mass and deterioration of micro-architectural bone tissue, which leads to an increased risk of fracture. This work addresses the texture characterization of trabecular bone radiographs. The aim was to analyze, following clinical research practice, a group of 174 subjects: 87 osteoporotic patients (OP) with various bone fracture types and 87 control cases (CC). To characterize osteoporosis, Fractal and MultiFractal (MF) methods were applied to the images for feature (attribute) extraction. In order to improve the results, a new method for computing the MF spectrum based on the q-structure function was proposed, and a combination of Fractal and MF attributes was used. A Support Vector Machine (SVM) was applied as a classifier to distinguish between OP patients and CC subjects. The fusion of fractal and MF features allowed good discrimination between the two groups, with an accuracy rate of 96.22%.
449
downloads
184
10008279
Investigation of New Gait Representations for Improving Gait Recognition
Abstract:

This study presents new gait representations for improving gait recognition accuracy on cross gait appearances, such as normal walking, wearing a coat and carrying a bag. Based on the Gait Energy Image (GEI), two ideas are implemented to generate new gait representations. One is to append lower knee regions to the original GEI, and the other is to apply convolutional operations to the GEI and its variants. A set of new gait representations are created and used for training multi-class Support Vector Machines (SVMs). Tests are conducted on the CASIA dataset B. Various combinations of the gait representations with different convolutional kernel size and different numbers of kernels used in the convolutional processes are examined. Both the entire images as features and reduced dimensional features by Principal Component Analysis (PCA) are tested in gait recognition. Interestingly, both new techniques, appending the lower knee regions to the original GEI and convolutional GEI, can significantly contribute to the performance improvement in the gait recognition. The experimental results have shown that the average recognition rate can be improved from 75.65% to 87.50%.
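As background for the abstract above, a Gait Energy Image is commonly defined as the pixel-wise average of aligned binary silhouettes over a gait cycle, and a convolutional variant can be obtained by sliding a small kernel over it. The sketch below illustrates both steps on nested lists; the kernel values, sizes and function names are assumptions, since the abstract does not list the kernels used.

```python
def gait_energy_image(silhouettes):
    """Pixel-wise mean of equally sized, aligned binary silhouette frames."""
    n = len(silhouettes)
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    return [[sum(frame[r][c] for frame in silhouettes) / n for c in range(cols)]
            for r in range(rows)]

def convolve_valid(image, kernel):
    """Slide `kernel` over `image` without padding (cross-correlation form)."""
    kr, kc = len(kernel), len(kernel[0])
    out_rows = len(image) - kr + 1
    out_cols = len(image[0]) - kc + 1
    return [
        [sum(image[r + i][c + j] * kernel[i][j] for i in range(kr) for j in range(kc))
         for c in range(out_cols)]
        for r in range(out_rows)
    ]
```

The flattened output, optionally reduced with PCA, would be the feature vector passed to the multi-class SVM.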

427
downloads
183
10008369
Integrated ACOR/IACOMV-R-SVM Algorithm
Abstract:
One direction for Ant Colony Optimization (ACO) is to optimize continuous and mixed (discrete and continuous) variables when solving problems with various types of data. Support Vector Machine (SVM), which originates from the statistical approach, is a present-day classification technique. The main problems of SVM are selecting the feature subset and tuning the parameters. Discretizing the continuous values of the parameters is the most common approach to tuning SVM parameters. This process results in a loss of information, which affects the classification accuracy. This paper presents two algorithms for tuning SVM. The first algorithm, ACOR-SVM, tunes the SVM parameters, while the second, IACOMV-R-SVM, simultaneously tunes the SVM parameters and selects the feature subset. Three benchmark UCI datasets were used in the experiments to validate the performance of the proposed algorithms. The results show that the proposed algorithms perform well compared to other approaches.
449
downloads
182
10008614
Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information
Abstract:
Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) that need a data-based, analytical approach to stock suitable movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were built to extract features from sample data, and their in-sample test errors were compared. The out-of-sample errors were also compared under different Vapnik–Chervonenkis (VC) dimensions of the machine learning algorithm to detect and prevent overfitting. The Gaussian kernel SVM prediction model correctly predicts movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use a machine learning approach to predict customers’ preferences with a small data set and to design prediction tools for these enterprises.
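The Gaussian (RBF) kernel mentioned in the abstract above measures similarity as exp(-gamma * ||x - y||^2). The sketch below shows the kernel and the Gram matrix an SVM would be trained on; the value of `gamma` and the function names are assumptions, as the abstract reports no hyperparameters.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """RBF similarity: 1.0 for identical points, decaying with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(points, gamma=0.5):
    """Pairwise kernel values; symmetric with ones on the diagonal."""
    return [[gaussian_kernel(p, q, gamma) for q in points] for p in points]
```

A larger `gamma` makes the kernel more local, which raises model capacity and the risk of overfitting, the trade-off the abstract discusses in terms of VC dimension.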
1001
downloads
181
10008142
Development of the Academic Model to Predict Student Success at VUT-FSASEC Using Decision Trees
Abstract:

The success or failure of students is a concern for every academic institution, college, university and government, and for students themselves. Several approaches have been researched to address this concern. In this paper, the view is held that when a student enters a university, college or other academic institution, he or she enters an academic environment. The academic environment is a unique concept used to develop a solution for making predictions effectively. This paper presents a model to determine the propensity of a student to succeed or fail in the French South African Schneider Electric Education Center (FSASEC) at the Vaal University of Technology (VUT). The Decision Tree algorithm is used to implement the model at FSASEC.

538
downloads
180
10008417
Investigation of Wave Atom Sub-Bands via Breast Cancer Classification
Abstract:

This paper investigates successful sub-bands of wave atom transform via classification of mammograms, when the coefficients of sub-bands are used as features. A computer-aided diagnosis system is constructed by using wave atom transform, support vector machine and k-nearest neighbor classifiers. Two-class classification is studied in detail using two data sets, separately. The successful sub-bands are determined according to the accuracy rates, coefficient numbers, and sensitivity rates.

478
downloads
179
10007588
Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images
Abstract:

Information on land use/land cover change plays an essential role in environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used to provide information in many change detection applications. Polarimetric Synthetic Aperture Radar (PolSAR) imagery, with its capability to discriminate between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, two PolSAR images are first segmented using a combination of the marker-controlled watershed algorithm and a coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/unchanged image objects. Compared with a pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification at the object level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. The proposed method can also correctly distinguish homogeneous image parcels.
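The two figures of merit reported in the abstract above, overall accuracy and the kappa coefficient, are both derived from a confusion matrix. A minimal sketch, with rows as reference classes and columns as predicted classes (the orientation and function name are assumptions; kappa is symmetric in this respect):

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n_classes)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

Kappa discounts the agreement expected by chance, which is why it can improve by a different margin than overall accuracy on the same data.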

488
downloads
178
10007186
Moving Object Detection Using Histogram of Uniformly Oriented Gradient
Abstract:

Moving object detection (MOD) is an important issue in advanced driver assistance systems (ADAS). Two important kinds of moving objects in ADAS are pedestrians and scooters. In real-world systems, MOD faces two important challenges: computational complexity and detection accuracy. Histogram of oriented gradient (HOG) features can easily detect the edges of objects without invariance to changes in illumination and shadowing. However, to reduce the execution time for real-time systems, the image should be downsampled, which increases the influence of outliers. For this reason, we propose histogram of uniformly-oriented gradient (HUG) features to obtain a more accurate description of the contour of the human body. In the testing phase, a support vector machine (SVM) with a linear kernel function is used. Experimental results show the correctness and effectiveness of the proposed method. With SVM classifiers, the real testing results show that the proposed HUG features achieve better classification performance than the HOG ones.

655
downloads
177
10007255
An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms
Abstract:

One of the significant and continual public health problems in the world is breast cancer. Early detection is very important to fight the disease, and mammography has been one of the most common and reliable methods to detect the disease in the early stages. However, it is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation for mass in mammograms. In this study, a multiresolution statistical method to classify mammograms as normal and abnormal in digitized mammograms is used to construct a CAD system. The mammogram images are represented by wave atom transform, and this representation is made by certain groups of coefficients, independently. The CAD system is designed by calculating some statistical features using each group of coefficients. The classification is performed by using support vector machine (SVM).

531
downloads
176
10007895
Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information
Abstract:
The demand for smart visual thing recognition in various devices has increased rapidly in recent years for everyday smart production, living and learning systems. This paper proposes a visual thing recognition system that combines binary scale-invariant feature transform (SIFT), the bag of words (BoW) model, and a support vector machine (SVM) using color information. Although traditional SIFT features and SVM classifiers use only gray information, color information is still an important feature for visual thing recognition. With color-based SIFT features and SVM, we can discard unreliable matching pairs and increase the robustness of matching tasks. The experimental results show that the proposed object recognition system with the color-assisted SIFT SVM classifier achieves a higher recognition rate than the traditional gray SIFT and SVM classification in various situations.
410
downloads
175
10007098
A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments
Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.
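The K-Nearest Neighbour classifier used in the abstract above labels a feature vector by majority vote among its closest training examples. A minimal sketch with Euclidean distance follows; the value of `k`, the distance measure and the toy two-dimensional features are assumptions, since the abstract specifies only that 30 selected features were used.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to `query`."""
    neighbours = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]
```

Because KNN relies on distances, features on different scales are usually standardized first; whether that was done in the study is not stated.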

470
downloads
174
10006260
Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method
Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study develops models that predict day-ahead hourly electricity consumption and demand using an artificial neural network (ANN) and a support vector machine (SVM). All prediction models are built in Python with the scikit-learn and PyBrain libraries. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day ahead, is predicted first, and this estimate is then used as an input to predict consumption and demand. The consumption and demand models are trained with both SVM and ANN, and separate models are used for cooling or heating and for weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It achieves a coefficient of variation of the root mean square error (CVRMSE) of 15.50% to 20.03% for consumption prediction and 22.89% to 32.42% for demand prediction. To conclude, the presented models have the potential to help building owners purchase electricity at the wholesale market, but they are not robust when used in demand response control.
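The CVRMSE metric reported in the abstract above is the root mean square error divided by the mean of the measured values, expressed as a percentage. The sketch below divides by the number of samples `n`; some guidelines instead subtract the number of model parameters in the denominator, and the abstract does not say which form was used, so this choice is an assumption.

```python
import math

def cvrmse(measured, predicted):
    """Coefficient of variation of the RMSE, in percent of the measured mean."""
    n = len(measured)
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)
    return 100.0 * rmse / (sum(measured) / n)
```

Normalizing by the mean makes errors comparable between buildings or periods with very different load levels.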

1444
downloads
173
10006407
An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing
Abstract:
Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalists. It requires, among other things, identifying which part of the natural resonators is being used when a sound propagates through the body. Thus, an application has been designed that provides sound recording, automatic vocal register recognition (VRR), and a graphical user interface with real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier, which yields a binary decision on the head or chest register assignment of the segment. The classification training and testing data were recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both the chest and head registers. The classification accuracy exceeded 93% in each of the various validation schemes. Apart from a hard two-class decision, the support vector classifier also returns information on the distance between a particular feature vector and the discrimination hyperplane in the feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.
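The certainty measure described in the abstract above, the distance of a feature vector from the discrimination hyperplane, can be sketched for the linear case as (w·x + b) / ||w||. The weights `w`, bias `b`, the assumption of a linear kernel, and the mapping of the positive side to the head register are all hypothetical; the abstract does not specify them.

```python
import math

def signed_distance(x, w, b):
    """Signed Euclidean distance of x from the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return score / math.sqrt(sum(wi * wi for wi in w))

def classify_register(x, w, b):
    """Return (label, certainty); certainty grows with distance from the boundary."""
    d = signed_distance(x, w, b)
    label = "head" if d >= 0 else "chest"   # sign convention is an assumption
    return label, abs(d)
```

Plotting the signed distance frame by frame gives the kind of continuous trend the application is said to visualize.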
845
downloads
172
10009011
CBIR Using Multi-Resolution Transform for Brain Tumour Detection and Stages Identification
Abstract:

Image retrieval is one of the most interesting techniques in use in today's digital world. Content Based Image Retrieval (CBIR) is an image processing technique that identifies relevant images and retrieves them based on patterns extracted from the digital images. In this paper, two research works using CBIR are presented. The first work provides an automated and interactive approach to the analysis of CBIR techniques. CBIR works on the principle of supervised machine learning, which involves feature selection followed by training and testing phases applied to a classifier in order to perform prediction. For feature extraction, image transforms such as Contourlet, Ridgelet and Shearlet can be used to retrieve texture features from the images. The extracted features are used to train and build a classifier using classification algorithms such as Naïve Bayes, K-Nearest Neighbour and Multi-class Support Vector Machine. The testing phase then uses the trained classifier to assign each new input image to one of four classes: 1- Normal brain, 2- Benign tumour, 3- Malignant tumour and 4- Severe tumour. The second research work develops a tool for tumour stage identification using the best feature extraction method and classifier identified in the first work. Finally, the tool will be used to predict the tumour stage and provide suggestions based on the stage identified by the system. Together, these two approaches contribute to the medical field by giving better retrieval performance and by supporting tumour stage identification.

282
downloads
171
10005474
Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling
Abstract:
Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously, a large number of kinetic parameters are involved, and the chemical and physical phenomena for mixtures involving polymers are sometimes poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of the machine learning field. They can learn the trends of a process without any knowledge of its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, number-average molecular weight and weight-average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which selects more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.
Paper Detail
795
downloads
170
10005686
Hybrid Approach for Country’s Performance Evaluation
Abstract:

This paper presents an integrated model that hybridizes data envelopment analysis (DEA) and support vector machine (SVM) to classify countries according to their efficiency and performance. The model takes into account multi-dimensional indicators, the decision-making hierarchy and the relativity of measurement. Starting from a set of performance indicators that is as exhaustive as possible, a process of successive aggregations has been developed to attain an overall evaluation of a country’s competitiveness.

Paper Detail
636
downloads
169
10005377
Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study
Abstract:
Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVM algorithm outperformed the other algorithms in terms of classification accuracy, precision and F1 evaluation measures on the groundwater area datasets collected from the Jordanian Ministry of Water and Irrigation.
Paper Detail
1487
downloads
168
10007377
Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Abstract:
Useful information has been extracted from road accident data in the United Kingdom (UK), using data analytics methods, for avoiding possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of the UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over time, this work primarily proposes a new framework model which can be trained, adapt itself to new data and make accurate predictions. This work also throws some light on the use of the SVM methodology for text classifiers built from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of the SVM methodology for this kind of research work.
Paper Detail
737
downloads
167
10004872
Automatic Detection and Classification of Diabetic Retinopathy Using Retinal Fundus Images
Abstract:
Diabetic Retinopathy (DR) is a severe retinal disease caused by diabetes mellitus. It leads to blindness when it progresses to the proliferative level. Early indications of DR are the appearance of microaneurysms, hemorrhages and hard exudates. In this paper, an automatic algorithm for the detection of DR is proposed. The algorithm is based on a combination of several image processing techniques, including Circular Hough Transform (CHT), Contrast Limited Adaptive Histogram Equalization (CLAHE), Gabor filter and thresholding. In addition, a Support Vector Machine (SVM) classifier is used to classify retinal images into normal or abnormal cases, the latter including non-proliferative or proliferative DR. The proposed method has been tested on images selected from the Structured Analysis of the Retina (STARE) database using MATLAB code. The method is perfectly able to detect DR. The sensitivity, specificity and accuracy of this approach are 90%, 87.5% and 91.4%, respectively.
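For readers unfamiliar with the reported measures, the following minimal sketch shows how sensitivity, specificity and accuracy are commonly computed from a binary confusion matrix. The counts below are hypothetical and are not taken from the paper; the function name `binary_metrics` is also an assumption made for illustration.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of abnormal cases detected
    specificity = tn / (tn + fp)                 # share of normal cases recognised
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 abnormal images (45 detected), 50 normal images (40 recognised).
sens, spec, acc = binary_metrics(tp=45, fn=5, tn=40, fp=10)
```

Here `tp` and `fn` count abnormal images that were detected or missed, while `tn` and `fp` count normal images that were recognised or wrongly flagged.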
Paper Detail
1111
downloads