International Science Index

99
10011898
Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process
Abstract:
The usage of machine learning models for prediction is growing rapidly and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing an steel plate fault prediction model.
Paper Detail
78
downloads
98
10011928
Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural Networks
Abstract:
In recent years, in the field of sports, decision making such as member in the game and strategy of the game based on then analysis of the accumulated sports data are widely attempted. In fact, in the NBA basketball league where the world's highest level players gather, to win the games, teams analyze the data using various statistical techniques. However, it is difficult to analyze the game data for each play such as the ball tracking or motion of the players in the game, because the situation of the game changes rapidly, and the structure of the data should be complicated. Therefore, it is considered that the analysis method for real time game play data is proposed. In this research, we propose an analytical model for "determining the optimal lineup composition" using the real time play data, which is considered to be difficult for all coaches. In this study, because replacing the entire lineup is too complicated, and the actual question for the replacement of players is "whether or not the lineup should be changed", and “whether or not Small Ball lineup is adopted”. Therefore, we propose an analytical model for the optimal player selection problem based on Small Ball lineups. In basketball, we can accumulate scoring data for each play, which indicates a player's contribution to the game, and the scoring data can be considered as a time series data. In order to compare the importance of players in different situations and lineups, we combine RNN (Recurrent Neural Network) model, which can analyze time series data, and NN (Neural Network) model, which can analyze the situation on the field, to build the prediction model of score. This model is capable to identify the current optimal lineup for different situations. In this research, we collected all the data of accumulated data of NBA from 2019-2020. Then we apply the method to the actual basketball play data to verify the reliability of the proposed model.
Paper Detail
31
downloads
97
10011857
Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance
Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Paper Detail
154
downloads
96
10011788
Feature Analysis of Predictive Maintenance Models
Authors:
Abstract:

Research in predictive maintenance modeling has improved in the recent years to predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little valuable insight towards the most important features contributing to the failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict the multi-class of failures. Second, predictive performance with features provided by different feature selection algorithms are further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, linear explainer SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight towards the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with type of machine failures. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing of more important features over less importance features.

Paper Detail
345
downloads
95
10011557
A Deep-Learning Based Prediction of Pancreatic Adenocarcinoma with Electronic Health Records from the State of Maine
Abstract:

Predicting the risk of Pancreatic Adenocarcinoma (PA) in advance can benefit the quality of care and potentially reduce population mortality and morbidity. The aim of this study was to develop and prospectively validate a risk prediction model to identify patients at risk of new incident PA as early as 3 months before the onset of PA in a statewide, general population in Maine. The PA prediction model was developed using Deep Neural Networks, a deep learning algorithm, with a 2-year electronic-health-record (EHR) cohort. Prospective results showed that our model identified 54.35% of all inpatient episodes of PA, and 91.20% of all PA that required subsequent chemoradiotherapy, with a lead-time of up to 3 months and a true alert of 67.62%. The risk assessment tool has attained an improved discriminative ability. It can be immediately deployed to the health system to provide automatic early warnings to adults at risk of PA. It has potential to identify personalized risk factors to facilitate customized PA interventions.

Paper Detail
183
downloads
94
10011582
Semi-Analytic Method in Fast Evaluation of Thermal Management Solution in Energy Storage System
Authors:
Abstract:
This article presents the application of the semi-analytic method (SAM) in the thermal management solution (TMS) of the energy storage system (ESS). The TMS studied in this work is fluid cooling. In fluid cooling, both effective heat conduction and heat convection are indispensable due to the heat transfer from solid to fluid. Correspondingly, an efficient TMS requires a design investigation of the following parameters: fluid inlet temperature, ESS initial temperature, fluid flow rate, working c rate, continuous working time, and materials properties. Their variation induces a change of thermal performance in the battery module, which is usually evaluated by numerical simulation. Compared to complicated computation resources and long computation time in simulation, the SAM is developed in this article to predict the thermal influence within a few seconds. In SAM, a fast prediction model is reckoned by combining numerical simulation with theoretical/empirical equations. The SAM can explore the thermal effect of boundary parameters in both steady-state and transient heat transfer scenarios within a short time. Therefore, the SAM developed in this work can simplify the design cycle of TMS and inspire more possibilities in TMS design.
Paper Detail
165
downloads
93
10011592
Development of Fuzzy Logic and Neuro-Fuzzy Surface Roughness Prediction Systems Coupled with Cutting Current in Milling Operation
Abstract:
Development of two real-time surface roughness (Ra) prediction systems for milling operations was attempted. The systems used not only cutting parameters, such as feed rate and spindle speed, but also the cutting current generated and corrected by a clamp type energy sensor. Two different approaches were developed. First, a fuzzy inference system (FIS), in which the fuzzy logic rules are generated by experts in the milling processes, was used to conduct prediction modeling using current cutting data. Second, a neuro-fuzzy system (ANFIS) was explored. Neuro-fuzzy systems are adaptive techniques in which data are collected on the network, processed, and rules are generated by the system. The inference system then uses these rules to predict Ra as the output. Experimental results showed that the parameters of spindle speed, feed rate, depth of cut, and input current variation could predict Ra. These two systems enable the prediction of Ra during the milling operation with an average of 91.83% and 94.48% accuracy by FIS and ANFIS systems, respectively. Statistically, the ANFIS system provided better prediction accuracy than that of the FIS system.
Paper Detail
115
downloads
92
10011593
Estimation of the Drought Index Based on the Climatic Projections of Precipitation of the Uruguay River Basin
Abstract:

The impact the climate change is not recent, the main variable in the hydrological cycle is the sequence and shortage of a drought, which has a significant impact on the socioeconomic, agricultural and environmental spheres. This study aims to characterize and quantify, based on precipitation climatic projections, the rainy and dry events in the region of the Uruguay River Basin, through the Standardized Precipitation Index (SPI). The database is the image that is part of the Intercomparison of Model Models, Phase 5 (CMIP5), which provides condition prediction models, organized according to the Representative Routes of Concentration (CPR). Compared to the normal set of climates in the Uruguay River Watershed through precipitation projections, seasonal precipitation increases for all proposed scenarios, with a low climate trend. From the data of this research, the idea is that this article can be used to support research and the responsible bodies can use it as a subsidy for mitigation measures in other hydrographic basins.

Paper Detail
129
downloads
91
10011609
Air Handling Units Power Consumption Using Generalized Additive Model for Anomaly Detection: A Case Study in a Singapore Campus
Abstract:

The emergence of digital twin technology, a digital replica of physical world, has improved the real-time access to data from sensors about the performance of buildings. This digital transformation has opened up many opportunities to improve the management of the building by using the data collected to help monitor consumption patterns and energy leakages. One example is the integration of predictive models for anomaly detection. In this paper, we use the GAM (Generalised Additive Model) for the anomaly detection of Air Handling Units (AHU) power consumption pattern. There is ample research work on the use of GAM for the prediction of power consumption at the office building and nation-wide level. However, there is limited illustration of its anomaly detection capabilities, prescriptive analytics case study, and its integration with the latest development of digital twin technology. In this paper, we applied the general GAM modelling framework on the historical data of the AHU power consumption and cooling load of the building between Jan 2018 to Aug 2019 from an education campus in Singapore to train prediction models that, in turn, yield predicted values and ranges. The historical data are seamlessly extracted from the digital twin for modelling purposes. We enhanced the utility of the GAM model by using it to power a real-time anomaly detection system based on the forward predicted ranges. The magnitude of deviation from the upper and lower bounds of the uncertainty intervals is used to inform and identify anomalous data points, all based on historical data, without explicit intervention from domain experts. Notwithstanding, the domain expert fits in through an optional feedback loop through which iterative data cleansing is performed. After an anomalously high or low level of power consumption detected, a set of rule-based conditions are evaluated in real-time to help determine the next course of action for the facilities manager. The performance of GAM is then compared with other approaches to evaluate its effectiveness. Lastly, we discuss the successfully deployment of this approach for the detection of anomalous power consumption pattern and illustrated with real-world use cases.

Paper Detail
118
downloads
90
10010619
A Spatial Information Network Traffic Prediction Method Based on Hybrid Model
Abstract:

Compared with terrestrial network, the traffic of spatial information network has both self-similarity and short correlation characteristics. By studying its traffic prediction method, the resource utilization of spatial information network can be improved, and the method can provide an important basis for traffic planning of a spatial information network. In this paper, considering the accuracy and complexity of the algorithm, the spatial information network traffic is decomposed into approximate component with long correlation and detail component with short correlation, and a time series hybrid prediction model based on wavelet decomposition is proposed to predict the spatial network traffic. Firstly, the original traffic data are decomposed to approximate components and detail components by using wavelet decomposition algorithm. According to the autocorrelation and partial correlation smearing and truncation characteristics of each component, the corresponding model (AR/MA/ARMA) of each detail component can be directly established, while the type of approximate component modeling can be established by ARIMA model after smoothing. Finally, the prediction results of the multiple models are fitted to obtain the prediction results of the original data. The method not only considers the self-similarity of a spatial information network, but also takes into account the short correlation caused by network burst information, which is verified by using the measured data of a certain back bone network released by the MAWI working group in 2018. Compared with the typical time series model, the predicted data of hybrid model is closer to the real traffic data and has a smaller relative root means square error, which is more suitable for a spatial information network.

Paper Detail
283
downloads
89
10010466
Rain Cell Ratio Technique in Path Attenuation for Terrestrial Radio Links
Abstract:

A rain cell ratio model is proposed that computes attenuation of the smallest rain cell which represents the maximum rain rate value i.e. the cell size when rainfall rate is exceeded 0.01% of the time, R0.01 and predicts attenuation for other cells as the ratio with this maximum. This model incorporates the dependence of the path factor r on the ellipsoidal path variation of the Fresnel zone at different frequencies. In addition, the inhomogeneity of rainfall is modeled by a rain drop packing density factor. In order to derive the model, two empirical methods that can be used to find rain cell size distribution Dc are presented. Subsequently, attenuation measurements from different climatic zones for terrestrial radio links with frequencies F in the range 7-38 GHz are used to test the proposed model. Prediction results show that the path factor computed from the rain cell ratio technique has improved reliability when compared with other path factor and effective rain rate models, including the current ITU-R 530-15 model of 2013.

Paper Detail
289
downloads
88
10010278
A Prediction Model Using the Price Cyclicality Function Optimized for Algorithmic Trading in Financial Market
Abstract:

After the widespread release of electronic trading, automated trading systems have become a significant part of the business intelligence system of any modern financial investment company. An important part of the trades is made completely automatically today by computers using mathematical algorithms. The trading decisions are taken almost instantly by logical models and the orders are sent by low-latency automatic systems. This paper will present a real-time price prediction methodology designed especially for algorithmic trading. Based on the price cyclicality function, the methodology revealed will generate price cyclicality bands to predict the optimal levels for the entries and exits. In order to automate the trading decisions, the cyclicality bands will generate automated trading signals. We have found that the model can be used with good results to predict the changes in market behavior. Using these predictions, the model can automatically adapt the trading signals in real-time to maximize the trading results. The paper will reveal the methodology to optimize and implement this model in automated trading systems. After tests, it is proved that this methodology can be applied with good efficiency in different timeframes. Real trading results will be also displayed and analyzed in order to qualify the methodology and to compare it with other models. As a conclusion, it was found that the price prediction model using the price cyclicality function is a reliable trading methodology for algorithmic trading in the financial market.

Paper Detail
684
downloads
87
10009991
Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values
Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.

Paper Detail
537
downloads
86
10008929
CFD Simulation for Flow Behavior in Boiling Water Reactor Vessel and Upper Pool under Decommissioning Condition
Abstract:

In order to respond the policy decision of non-nuclear homes, Tai Power Company (TPC) will provide the decommissioning project of Kuosheng Nuclear power plant (KSNPP) to meet the regulatory requirement in near future. In this study, the computational fluid dynamics (CFD) methodology has been employed to develop a flow prediction model for boiling water reactor (BWR) with upper pool under decommissioning stage. The model can be utilized to investigate the flow behavior as the vessel combined with upper pool and continuity cooling system. At normal operating condition, different parameters are obtained for the full fluid area, including velocity, mass flow, and mixing phenomenon in the reactor pressure vessel (RPV) and upper pool. Through the efforts of the study, an integrated simulation model will be developed for flow field analysis of decommissioning KSNPP under normal operating condition. It can be expected that a basis result for future analysis application of TPC can be provide from this study.

Paper Detail
458
downloads
85
10008599
Prediction on Housing Price Based on Deep Learning
Abstract:

In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.

Paper Detail
3189
downloads
84
10008615
Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network
Abstract:
This paper presents an automatic normal and abnormal heart sound classification model developed based on deep learning algorithm. MITHSDB heart sounds datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted to form a square intensity matrix, and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The result proves that the prediction model is able to provide reasonable and comparable classification accuracy despite simple implementation. This approach can be used for real-time classification of heart sounds in Internet of Medical Things (IoMT), e.g. remote monitoring applications of PCG signal.
Paper Detail
807
downloads
83
10008614
Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information
Abstract:
Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.
Paper Detail
1223
downloads
82
10008486
Integration of Big Data to Predict Transportation for Smart Cities
Abstract:

The Intelligent transportation system is essential to build smarter cities. Machine learning based transportation prediction could be highly promising approach by delivering invisible aspect visible. In this context, this research aims to make a prototype model that predicts transportation network by using big data and machine learning technology. In detail, among urban transportation systems this research chooses bus system.  The research problem that existing headway model cannot response dynamic transportation conditions. Thus, bus delay problem is often occurred. To overcome this problem, a prediction model is presented to fine patterns of bus delay by using a machine learning implementing the following data sets; traffics, weathers, and bus statues. This research presents a flexible headway model to predict bus delay and analyze the result. The prototyping model is composed by real-time data of buses. The data are gathered through public data portals and real time Application Program Interface (API) by the government. These data are fundamental resources to organize interval pattern models of bus operations as traffic environment factors (road speeds, station conditions, weathers, and bus information of operating in real-time). The prototyping model is designed by the machine learning tool (RapidMiner Studio) and conducted tests for bus delays prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing the urban big data. The big data analysis is important to predict the future and to find correlations by processing huge amount of data. Therefore, based on the analysis method, this research represents an effective use of the machine learning and urban big data to understand urban dynamics.

Paper Detail
1051
downloads
81
10008735
Assessment of Path Loss Prediction Models for Wireless Propagation Channels at L-Band Frequency over Different Micro-Cellular Environments of Ekiti State, Southwestern Nigeria
Abstract:

The design of accurate and reliable mobile communication systems depends majorly on the suitability of path loss prediction methods and the adaptability of the methods to various environments of interest. In this research, the results of the adaptability of radio channel behavior are presented based on practical measurements carried out in the 1800 MHz frequency band. The measurements are carried out in typical urban, suburban and rural environments in Ekiti State, Southwestern part of Nigeria. A total number of seven base stations of MTN GSM service located in the studied environments were monitored. Path loss and break point distances were deduced from the measured received signal strength (RSS) and a practical path loss model is proposed based on the deduced break point distances. The proposed two slope model, regression line and four existing path loss models were compared with the measured path loss values. The standard deviations of each model with respect to the measured path loss were estimated for each base station. The proposed model and regression line exhibited lowest standard deviations followed by the Cost231-Hata model when compared with the Erceg Ericsson and SUI models. Generally, the proposed two-slope model shows closest agreement with the measured values with a mean error values of 2 to 6 dB. These results show that, either the proposed two slope model or Cost 231-Hata model may be used to predict path loss values in mobile micro cell coverage in the well-considered environments. Information from this work will be useful for link design of microwave band wireless access systems in the region.

Paper Detail
507
downloads
80
10007834
Prediction of Time to Crack Reinforced Concrete by Chloride Induced Corrosion
Abstract:

In this paper, a review of different mathematical models which can be used as prediction tools to assess the time to crack reinforced concrete (RC) due to corrosion is investigated. This investigation leads to an experimental study to validate a selected prediction model. Most of these mathematical models depend upon the mechanical behaviors, chemical behaviors, electrochemical behaviors or geometric aspects of the RC members during a corrosion process. The experimental program is designed to verify the accuracy of a well-selected mathematical model from a rigorous literature study. Fundamentally, the experimental program exemplifies both one-dimensional chloride diffusion using RC squared slab elements of 500 mm by 500 mm and two-dimensional chloride diffusion using RC squared column elements of 225 mm by 225 mm by 500 mm. Each set consists of three water-to-cement ratios (w/c); 0.4, 0.5, 0.6 and two cover depths; 25 mm and 50 mm. 12 mm bars are used for column elements and 16 mm bars are used for slab elements. All the samples are subjected to accelerated chloride corrosion in a chloride bath of 5% (w/w) sodium chloride (NaCl) solution. Based on a pre-screening of different models, it is clear that the well-selected mathematical model had included mechanical properties, chemical and electrochemical properties, nature of corrosion whether it is accelerated or natural, and the amount of porous area that rust products can accommodate before exerting expansive pressure on the surrounding concrete. The experimental results have shown that the selected model for both one-dimensional and two-dimensional chloride diffusion had ±20% and ±10% respective accuracies compared to the experimental output. The half-cell potential readings are also used to see the corrosion probability, and experimental results have shown that the mass loss is proportional to the negative half-cell potential readings that are obtained. Additionally, a statistical analysis is carried out in order to determine the most influential factor that affects the time to corrode the reinforcement in the concrete due to chloride diffusion. The factors considered for this analysis are w/c, bar diameter, and cover depth. The analysis is accomplished by using Minitab statistical software, and it showed that cover depth is the significant effect on the time to crack the concrete from chloride induced corrosion than other factors considered. Thus, the time predictions can be illustrated through the selected mathematical model as it covers a wide range of factors affecting the corrosion process, and it can be used to predetermine the durability concern of RC structures that are vulnerable to chloride exposure. And eventually, it is further concluded that cover thickness plays a vital role in durability in terms of chloride diffusion.

Paper Detail
616
downloads
79
10007894
Methodology for Obtaining Static Alignment Model
Abstract:
In this paper, a methodology is presented to obtain the Static Alignment Model for any transtibial amputee person. The proposed methodology starts from experimental data collected on the Hospital Militar Central, Bogotá, Colombia. The effects of transtibial prosthesis malalignment on amputees were measured in terms of joint angles, center of pressure (COP) and weight distribution. Some statistical tools are used to obtain the model parameters. Mathematical predictive models of prosthetic alignment were created. The proposed models are validated in amputees and finding promising results for the prosthesis Static Alignment. Static alignment process is unique to each subject; nevertheless the proposed methodology can be used in each transtibial amputee.
Paper Detail
538
downloads
78
10007916
Nonlinear Estimation Model for Rail Track Deterioration
Abstract:

Rail transport authorities around the world have been facing a significant challenge when predicting rail infrastructure maintenance work for a long period of time. Generally, maintenance monitoring and prediction is conducted manually. With the restrictions in economy, the rail transport authorities are in pursuit of improved modern methods, which can provide precise prediction of rail maintenance time and location. The expectation from such a method is to develop models to minimize the human error that is strongly related to manual prediction. Such models will help them in understanding how the track degradation occurs overtime under the change in different conditions (e.g. rail load, rail type, rail profile). They need a well-structured technique to identify the precise time that rail tracks fail in order to minimize the maintenance cost/time and secure the vehicles. The rail track characteristics that have been collected over the years will be used in developing rail track degradation prediction models. Since these data have been collected in large volumes and the data collection is done both electronically and manually, it is possible to have some errors. Sometimes these errors make it impossible to use them in prediction model development. This is one of the major drawbacks in rail track degradation prediction. An accurate model can play a key role in the estimation of the long-term behavior of rail tracks. Accurate models increase the track safety and decrease the cost of maintenance in long term. In this research, a short review of rail track degradation prediction models has been discussed before estimating rail track degradation for the curve sections of Melbourne tram track system using Adaptive Network-based Fuzzy Inference System (ANFIS) model.

Paper Detail
664
downloads
77
10007634
An Improved Heat Transfer Prediction Model for Film Condensation inside a Tube with Interphacial Shear Effect
Abstract:

The analysis of heat transfer design methods in condensing inside plain tubes under existing influence of shear stress is presented in this paper. The existing discrepancy in more than 30-50% between rating heat transfer coefficients and experimental data has been noted. The analysis of existing theoretical and semi-empirical methods of heat transfer prediction is given. The influence of a precise definition concerning boundaries of phase flow (it is especially important in condensing inside horizontal tubes), shear stress (friction coefficient) and heat flux on design of heat transfer is shown. The substantiation of boundary conditions of the values of parameters, influencing accuracy of rated relationships, is given. More correct relationships for heat transfer prediction, which showed good convergence with experiments made by different authors, are substantiated in this work.

Paper Detail
707
downloads
76
10007552
Spatial Variation of WRF Model Rainfall Prediction over Uganda
Abstract:
Rainfall is a major climatic parameter affecting many sectors such as health, agriculture and water resources. Its quantitative prediction remains a challenge to weather forecasters although numerical weather prediction models are increasingly being used for rainfall prediction. The performance of six convective parameterization schemes, namely the Kain-Fritsch scheme, the Betts-Miller-Janjic scheme, the Grell-Deveny scheme, the Grell-3D scheme, the Grell-Fretas scheme, the New Tiedke scheme of the weather research and forecast (WRF) model regarding quantitative rainfall prediction over Uganda is investigated using the root mean square error for the March-May (MAM) 2013 season. The MAM 2013 seasonal rainfall amount ranged from 200 mm to 900 mm over Uganda with northern region receiving comparatively lower rainfall amount (200–500 mm); western Uganda (270–550 mm); eastern Uganda (400–900 mm) and the lake Victoria basin (400–650 mm). A spatial variation in simulated rainfall amount by different convective parameterization schemes was noted with the Kain-Fritsch scheme over estimating the rainfall amount over northern Uganda (300–750 mm) but also presented comparable rainfall amounts over the eastern Uganda (400–900 mm). The Betts-Miller-Janjic, the Grell-Deveny, and the Grell-3D underestimated the rainfall amount over most parts of the country especially the eastern region (300–600 mm). The Grell-Fretas captured rainfall amount over the northern region (250–450 mm) but also underestimated rainfall over the lake Victoria Basin (150–300 mm) while the New Tiedke generally underestimated rainfall amount over many areas of Uganda. For deterministic rainfall prediction, the Grell-Fretas is recommended for rainfall prediction over northern Uganda while the Kain-Fritsch scheme is recommended over eastern region.
Paper Detail
803
downloads
75
10008907
Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison
Abstract:
Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.
Paper Detail
630
downloads
74
10007032
Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data
Abstract:
Current real-estate value estimation, difficult for laymen, usually is performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates influences of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors effecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.
Paper Detail
861
downloads
73
10006260
Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method
Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using artificial neural network (ANN) and support vector machine (SVM). All prediction models are built in Python, with tool Scikit-learn and Pybrain. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day-ahead, is predicted at first, and then this estimation is used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN, and depend on cooling or heating, weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It can achieve 15.50% to 20.03% coefficient of variance of root mean square error (CVRMSE) for consumption prediction and 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have potential to help building owners to purchase electricity at the wholesale market, but they are not robust when used in demand response control.

Paper Detail
1632
downloads
72
10006389
Estimation of Relative Subsidence of Collapsible Soils Using Electromagnetic Measurements
Abstract:

Collapsible soils are weak soils that appear to be stable in their natural state, normally dry condition, but rapidly deform under saturation (wetting), thus generating large and unexpected settlements which often yield disastrous consequences for structures unwittingly built on such deposits. In this study, a prediction model for the relative subsidence of stressed collapsible soils based on dielectric permittivity measurement is presented. Unlike most existing methods for soil subsidence prediction, this model does not require moisture content as an input parameter, thus providing the opportunity to obtain accurate estimation of the relative subsidence of collapsible soils using dielectric measurement only. The prediction model is developed based on an existing relative subsidence prediction model (which is dependent on soil moisture condition) and an advanced theoretical frequency and temperature-dependent electromagnetic mixing equation (which effectively removes the moisture content dependence of the original relative subsidence prediction model). For large scale sub-surface soil exploration purposes, the spatial sub-surface soil dielectric data over wide areas and high depths of weak (collapsible) soil deposits can be obtained using non-destructive high frequency electromagnetic (HF-EM) measurement techniques such as ground penetrating radar (GPR). For laboratory or small scale in-situ measurements, techniques such as an open-ended coaxial line with widely applicable time domain reflectometry (TDR) or vector network analysers (VNAs) are usually employed to obtain the soil dielectric data. By using soil dielectric data obtained from small or large scale non-destructive HF-EM investigations, the new model can effectively predict the relative subsidence of weak soils without the need to extract samples for moisture content measurement. Some of the resulting benefits are the preservation of the undisturbed nature of the soil as well as a reduction in the investigation costs and analysis time in the identification of weak (problematic) soils. The accuracy of prediction of the presented model is assessed by conducting relative subsidence tests on a collapsible soil at various initial soil conditions and a good match between the model prediction and experimental results is obtained.

Paper Detail
790
downloads
71
10005251
Using Simulation Modeling Approach to Predict USMLE Steps 1 and 2 Performances
Abstract:
The prediction models for the United States Medical Licensure Examination (USMLE) Steps 1 and 2 performances were constructed by the Monte Carlo simulation modeling approach via linear regression. The purpose of this study was to build robust simulation models to accurately identify the most important predictors and yield the valid range estimations of the Steps 1 and 2 scores. The application of simulation modeling approach was deemed an effective way in predicting student performances on licensure examinations. Also, sensitivity analysis (a/k/a what-if analysis) in the simulation models was used to predict the magnitudes of Steps 1 and 2 affected by changes in the National Board of Medical Examiners (NBME) Basic Science Subject Board scores. In addition, the study results indicated that the Medical College Admission Test (MCAT) Verbal Reasoning score and Step 1 score were significant predictors of the Step 2 performance. Hence, institutions could screen qualified student applicants for interviews and document the effectiveness of basic science education program based on the simulation results.
Paper Detail
1073
downloads
70
10005342
Surface Roughness Analysis, Modelling and Prediction in Fused Deposition Modelling Additive Manufacturing Technology
Abstract:
Fused deposition modelling (FDM) is one of the most prominent rapid prototyping (RP) technologies which is being used to efficiently fabricate CAD 3D geometric models. However, the process is coupled with many drawbacks, of which the surface quality of the manufactured RP parts is among. Hence, studies relating to improving the surface roughness have been a key issue in the field of RP research. In this work, a technique of modelling the surface roughness in FDM is presented. Using experimentally measured surface roughness response of the FDM parts, an ANFIS prediction model was developed to obtain the surface roughness in the FDM parts using the main critical process parameters that affects the surface quality. The ANFIS model was validated and compared with experimental test results.
Paper Detail
1447
downloads