In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation” which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis on our experiment results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set NSL-KDD (the improved version of dataset KDDCup99 produced by University of New Brunswick) to make comparisons between the proposed and 8 common ensemble algorithms. Particularly, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is proved to be a more effective ensemble solution than the traditional consensus approach, which outperforms the 8 ensemble algorithms by 20% on almost all compared metrices which include accuracy, precision, recall, F1-score and area under receiver operating characteristic curve.
During their activity, all systems must be operational without failures and in this context, the dependability concept is essential avoiding disruption of their function. As computer networks are systems with the same requirements of dependability, this article deals with an analysis of failures for a computer network. The proposed approach integrates specific tools of the plat-form KB3, usually applied in dependability studies of industrial systems. The methodology is supported by a multi-agent system formed by six agents grouped in three meta agents, dealing with two levels. The first level concerns a modeling step through a conceptual agent and a generating agent. The conceptual agent is dedicated to the building of the knowledge base from the system specifications written in the FIGARO language. The generating agent allows producing automatically both the structural model and a dependability model of the system. The second level, the simulation, shows the effects of the failures of the system through a simulation agent. The approach validation is obtained by its application on a specific computer network, giving an analysis of failures through their effects for the considered network.
Arabic offline handwriting recognition systems are considered as one of the most challenging topics. Arabic Handwritten Numeral Strings are used to automate systems that deal with numbers such as postal code, banking account numbers and numbers on car plates. Segmentation of connected numerals is the main bottleneck in the handwritten numeral recognition system. This is in turn can increase the speed and efficiency of the recognition system. In this paper, we proposed algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on Watershed approach. The algorithms have been designed and implemented to achieve the main goal of segmenting and extracting the string of numeral digits written by hand especially in a courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm extracts the numeral string digits into separated individual digit. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.
Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.
Analysing the world banking sector, we realize that traditional risk measurement methodologies no longer reflect the actual scenario with uncertainty and leave out events that can change the dynamics of markets. Considering this, regulators and financial institutions began to search more realistic models. The aim is to include external influences and interdependencies between agents, to describe and measure the operationalization of these complex systems and their risks in a more coherent and credible way. Within this context, X-Events are more frequent than assumed and, with uncertainties and constant changes, the concept of antifragility starts to gain great prominence in comparison to others methodologies of risk management. It is very useful to analyse whether a system succumbs (fragile), resists (robust) or gets benefits (antifragile) from disorder and stress. Thus, this work proposes the creation of the Banking Antifragility Index (BAI), which is based on the calculation of a triangular fuzzy number – to "quantify" qualitative criteria linked to antifragility.
A lot has been said and discussed regarding the rationale and significance of the Bechdel Score. It became a digital sensation in 2013, when Swedish cinemas began to showcase the Bechdel test score of a film alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. The pundits believe that the score is too simplified and the underlying criteria of a film to pass the test must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, is egregious. In this research, we have considered a few more parameters which highlight how we represent females in film, like the number of female dialogues in a movie, dialogue genre, and part of speech tags in the dialogue. The parameters were missing in the existing criteria to calculate the Bechdel score. The research aims to analyze 342 movies scripts to test a hypothesis if these extra parameters, above with the current Bechdel criteria, are significant in calculating the female representation score. The result of the Principal Component Analysis method concludes that the female dialogue content is a key component and should be considered while measuring the representation of women in a work of fiction.
Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.
Cloud computing can be defined as one of the prominent technologies that lets a user change, configure and access the services online. it can be said that this is a prototype of computing that helps in saving cost and time of a user practically the use of cloud computing can be found in various fields like education, health, banking etc. Cloud computing is an internet dependent technology thus it is the major responsibility of Cloud Service Providers(CSPs) to care of data stored by user at data centers. Scheduling in cloud computing environment plays a vital role as to achieve maximum utilization and user satisfaction cloud providers need to schedule resources effectively. Job scheduling for cloud computing is analyzed in the following work. To complete, recreate the task calculation, and conveyed scheduling methods CloudSim3.0.3 is utilized. This research work discusses the job scheduling for circulated processing condition also by exploring on this issue we find it works with minimum time and less cost. In this work two load balancing techniques have been employed: ‘Throttled stack adjustment policy’ and ‘Active VM load balancing policy’ with two brokerage services ‘Advanced Response Time’ and ‘Reconfigure Dynamically’ to evaluate the VM_Cost, DC_Cost, Response Time, and Data Processing Time. The proposed techniques are compared with Round Robin scheduling policy.
Technology era drives people to use mobile phone to support their daily life activities. Technology development has a rapid phase which pushes the IT company to adjust any technology changes in order to fulfill customer’s satisfaction. As a result of that, many companies in the USA emerged from systematics software development approach to agile software development approach in developing systems and applications to develop many mobile phone applications in a short phase to fulfill user’s needs. As a systematic approach is considered as time consuming, costly, and too risky, agile software development has become a more popular approach to use for developing software including mobile applications. This paper reflects a short-term project to develop a diet tracker mobile application using agile software development that focused on applying scrum framework in the development process.
Water erosion is the major cause of the erosion that shapes the earth's surface. Modeling water erosion requires the use of software and GIS programs, commercial or closed source. The very high prices for commercial GIS licenses, motivates users and researchers to find open source software as relevant and applicable as the proprietary GIS. The objective of this study is the modeling of water erosion and the hydrogeological and morphophysical characterization of the Oued M'Goun watershed (southern flank of the Central High Atlas) developed by free programs of GIS. The very pertinent results are obtained by executing tasks and algorithms in a simple and easy way. Thus, the various geoscientific and geostatistical analyzes of a digital elevation model (SRTM 30 m resolution) and their combination with the treatments and interpretation of satellite imagery information allowed us to characterize the region studied and to map the area most vulnerable to water erosion.
Vehicular Ad-hoc Network (VANET) is an emerging and very promising technology that has great demand on the access capability of the existing wireless technology. VANETs help improve trafﬁc safety and efficiency. Each vehicle can exchange their information to inform the other vehicles about the current status of the traffic ﬂow or a dangerous situation such as an accident. To achieve these, a reliable and efficient Medium Access Control (MAC) protocol with minimal transmission collisions is required. High speed nodes, absence of infrastructure, variations in topology and their QoS requirements makes it difficult for designing a MAC protocol in vehicular networks. There are several MAC protocols proposed for VANETs to ensure that all the vehicles could send safety messages without collisions by reducing the end-to-end delay and packet loss ratio. This paper gives an overview of the several proposed MAC protocols for VANETs along with their benefits and limitations and presents an overall classification based on their characteristics.
Usability is one of the most important quality attributes for web-based information systems. Specifically, for e-commerce applications, usability becomes more prominent. In this study, we aimed to explore the features that experienced users seek in e-commerce applications. We used eye tracking method in evaluations. Eye movement data are obtained from the eye-tracking method and analyzed based on task completion time, number of fixations, as well as heat map and gaze plot measures. The results of the analysis show that the eye movements of participants' are too static in certain areas and their areas of interest are scattered in many different places. It has been determined that this causes users to fail to complete their transactions. According to the findings, we outlined the issues to improve the usability of e-commerce websites. Then we propose solutions to identify the issues. In this way, it is expected that e-commerce sites will be developed which will make experienced users more satisfied.
As a process of developing a service system, the term ‘service engineering’ evolves in scope and definition. To achieve an integrated understanding of the process, a general framework and an ontology are required. This paper extends a previously built service engineering framework by exploring metamodels for the framework artefacts based on a foundational ontology and a metamodel landscape. The first part of this paper presents a correlation map between the proposed framework with the ontology as a form of evaluation for the conceptual coverage of the framework. The mapping also serves to characterize the artefacts to be produced for each activity in the framework. The second part describes potential metamodels to be used, from the metamodel landscape, as alternative formats of the framework artefacts. The results suggest that the framework sufficiently covers the ontological concepts, both from general service context and software service context. The metamodel exploration enriches the suggested artefact format from the original eighteen formats to thirty metamodel alternatives.
Enterprise Applications (EAs) aid the organizations achieve operational excellence and competitive advantage. Over time, most Small and Medium Enterprises (SMEs), which are known to be the major drivers of most thriving global economies, use the costly on-premise versions of these applications thereby making business difficult to competitively thrive in the same market environment with their large enterprise counterparts. The advent of cloud computing presents the SMEs an affordable offer and great opportunities as such EAs can be cloud-hosted and rented on a pay-per-use basis which does not require huge initial capital. However, as there are numerous Cloud Service Providers (CSPs) offering EAs as Software-as-a-Service (SaaS), there is a challenge of choosing a suitable provider with Quality of Service (QoS) that meet the organizations’ customized requirements. The proposed model takes care of that and goes a step further to select the most affordable among a selected few of the CSPs. In the earlier stage, before developing the instrument and conducting the pilot test, the researchers conducted a structured interview with three experts to validate the proposed model. In conclusion, the validity and reliability of the instrument were tested through experts, typical respondents, and analyzed with SPSS 22. Results confirmed the validity of the proposed model and the validity and reliability of the instrument.
This paper is aimed at creating an Automatic Java X-Machine testing tool for software development. The nature of software development is changing; thus, the type of software testing tools required is also changing. Software is growing increasingly complex and, in part due to commercial impetus for faster software releases with new features and value, increasingly in danger of containing faults. These faults can incur huge cost for software development organisations and users; Cambridge Judge Business School’s research estimated the cost of software bugs to the global economy is $312 billion. Beyond the cost, faster software development methodologies and increasing expectations on developers to become testers is driving demand for faster, automated, and effective tools to prevent potential faults as early as possible in the software development lifecycle. Using X-Machine theory, this paper will explore a new tool to address software complexity, changing expectations on developers, faster development pressures and methodologies, with a view to reducing the huge cost of fixing software bugs.
Over recent years, web development has changed significantly. Driven largely by the rise of trends like mobiles, the world of development is rapidly evolving. The rise of the Internet makes web applications crucial nowadays. The web application has been an interface for a company and one of the ways they present their portfolio to the client. On the other hand, the web has become part of the file management system which takes over the role of paper. Due to high demand in web applications, developers are required to develop a web application that are cost-effective, secure and well coded. A framework has been proposed to develop an application rather than using library style development. The framework is helping the developer in creating the structure of a web automatically. This paper will compare the advantages and disadvantages of web development using framework against library-style development. This comparison is based on a previous research paper focusing on two main indicators, which are the impact to management and impact to the developer.
Internet of Things (IoT) is a powerful industry system, which end-devices are interconnected and automated, allowing the devices to analyze data and execute actions based on the analysis. The IoT technology leverages the technology of Radio-Frequency Identification (RFID) and Wireless Sensor Network (WSN), including mobile and sensor. These technologies contribute to the evolution of IoT. However, due to more devices are connected each other in the Internet, and data from various sources exchanged between things, confidentiality of the data becomes a major concern. This paper focuses on one of the major challenges in IoT; authentication, in order to preserve data integrity and confidentiality are in place. A few solutions are reviewed based on papers from the last few years. One of the proposed solutions is securing the communication between IoT devices and cloud servers with Elliptic Curve Cryptograhpy (ECC) based mutual authentication protocol. This solution focuses on Hyper Text Transfer Protocol (HTTP) cookies as security parameter. Next proposed solution is using keyed-hash scheme protocol to enable IoT devices to authenticate each other without the presence of a central control server. Another proposed solution uses Physical Unclonable Function (PUF) based mutual authentication protocol. It emphasizes on tamper resistant and resource-efficient technology, which equals a 3-way handshake security protocol.