Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.
This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.
This study investigated factors affecting the price of cement in Nigeria, and developed a mathematical model that can predict future cement prices. Cement is key in the Nigerian construction industry. The changes in price caused by certain factors could affect economic and infrastructural development; hence there is need for proper proactive planning. Secondary data were collected from published information on cement between 2014 and 2019. In addition, questionnaires were sent to some domestic cement retailers in Port Harcourt in Nigeria, to obtain the actual prices of cement between the same periods. The study revealed that the most critical factors affecting the price of cement in Nigeria are inflation rate, population growth rate, and Gross Domestic Product (GDP) growth rate. With the use of data from United Nations, International Monetary Fund, and Central Bank of Nigeria databases, amongst others, a Multiple Linear Regression model was formulated. The model was used to predict the price of cement for 2020-2025. The model was then tested with 95% confidence level, using a two-tailed t-test and an F-test, resulting in an R2 of 0.8428 and R2 (adj.) of 0.6069. The results of the tests and the correlation factors confirm the model to be fit and adequate. This study will equip researchers and stakeholders in the construction industry with information for planning, monitoring, and management of present and future construction projects that involve the use of cement.
In geophysical exploration surveys, the quality of acquired data holds significant importance before executing the data processing and interpretation phases. In this study, 2D seismic reflection survey data of Fort Abbas area, Cholistan Desert, Pakistan was taken as test case in order to assess its quality on statistical bases by using normalized root mean square error (NRMSE), Cronbach’s alpha test (α) and null hypothesis tests (t-test and F-test). The analysis challenged the quality of the acquired data and highlighted the significant errors in the acquired database. It is proven that the study area is plain, tectonically least affected and rich in oil and gas reserves. However, subsurface 3D modeling and contouring by using acquired database revealed high degrees of structural complexities and intense folding. The NRMSE had highest percentage of residuals between the estimated and predicted cases. The outcomes of hypothesis testing also proved the biasness and erraticness of the acquired database. Low estimated value of alpha (α) in Cronbach’s alpha test confirmed poor reliability of acquired database. A very low quality of acquired database needs excessive static correction or in some cases, reacquisition of data is also suggested which is most of the time not feasible on economic grounds. The outcomes of this study could be used to assess the quality of large databases and to further utilize as a guideline to establish database quality assessment models to make much more informed decisions in hydrocarbon exploration field.
In this paper, a method is provided for content-based image retrieval. Content-based image retrieval system searches query an image based on its visual content in an image database to retrieve similar images. In this paper, with the aim of simulating the human visual system sensitivity to image's edges and color features, the concept of color difference histogram (CDH) is used. CDH includes the perceptually color difference between two neighboring pixels with regard to colors and edge orientations. Since the HSV color space is close to the human visual system, the CDH is calculated in this color space. In addition, to improve the color features, the color histogram in HSV color space is also used as a feature. Among the extracted features, efficient features are selected using entropy and correlation criteria. The final features extract the content of images most efficiently. The proposed method has been evaluated on three standard databases Corel 5k, Corel 10k and UKBench. Experimental results show that the accuracy of the proposed image retrieval method is significantly improved compared to the recently developed methods.
Seizure is the main factor that affects the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of epileptogenic zone, is commonly made by using continuous Electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is made manually by epileptologists and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using knowledge discovery in database process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and by using a software application, called Training Builder that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, the dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performances, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.
Inhalation hazards are associated with potentially injurious exposure and increased risk for lung diseases, within the bauxite mining industry, especially for the smelter workers. Smoking is related to decreased lung function and leads to chronic lung diseases. This study had the objective to evaluate whether smoking is related to functional and radiographic respiratory changes in retired bauxite mining workers. Methods: This was a retrospective and cross-sectional study involving the analysis of database information of 140 retired bauxite mining workers from Poços de Caldas-MG evaluated at Worker’s Health Reference Center and at the Social Security Brazilian National Institute, from July 1st, 2015 until June 30th, 2016. The workers were divided into three groups: non-smokers (n = 47), ex-smokers (n = 46), and smokers (n = 47). The data included: age, gender, spirometry results, and the presence or not of pulmonary pleural and/or parenchymal changes in chest radiographs. Chi-Squared test was used (p < 0,05). Results: In the smokers’ group, 83% of spirometry tests and 64% of chest x-rays were altered. In the non-smokers’ group, 19% of spirometry tests and 13% of chest x-rays were altered. In the ex-smokers’ group, 35% of spirometry tests and 30% of chest x-rays were altered. Most of the results were statistically significant. Results demonstrated a significant difference between smokers’ and non-smokers’ groups in regard to spirometric and radiographic pulmonary alterations. Ex-smokers’ and non-smokers’ group demonstrated better results when compared to the smokers’ group in relation to altered spirometry and radiograph findings. These data may contribute to planning strategies to enhance smoking cessation programs within the bauxite mining industry.
In this work, a training algorithm for probabilistic neural networks (PNN) is presented. The algorithm addresses one of the major drawbacks of PNN, which is the size of the hidden layer in the network. By using a cross-validation training algorithm, the number of hidden neurons is shrunk to a smaller number consisting of the most representative samples of the training set. This is done without affecting the overall architecture of the network. Performance of the network is compared against performance of standard PNN for different databases from the UCI database repository. Results show an important gain in network size and performance.
With the availability of diverse printed, electronic literature and web sites on medical and health related information, it is impossible for the medical professional to get the information he seeks in the shortest possible time. For all these problems information literacy is the only solution. Thus, information literacy is recognized as an important aspect of medical education. In the present study, an attempt has been made to know the information literacy skills of the faculty and students at medical colleges of Haryana, Punjab and Chandigarh. The scope of the study was confined to the 12 selected medical colleges of three States (Haryana, Punjab, and Chandigarh). The findings of the study were based on the data collected through 1018 questionnaires filled by the respondents of the medical colleges. It was found that Online Medical Websites (such as WebMD, eMedicine and Mayo Clinic etc.) were frequently used by 63.43% of the respondents of Chandigarh which is slightly more than Haryana (61%) and Punjab (55.65%). As well, 30.86% of the respondents of Chandigarh, 27.41% of Haryana and 27.05% of Punjab were familiar with the controlled vocabulary tool; 25.14% respondents of Chandigarh, 23.80% of Punjab, 23.17% of Haryana were familiar with the Boolean operators; 33.05% of the respondents of Punjab, 28.19% of Haryana and 25.14% of Chandigarh were familiar with the use and importance of the keywords while searching an electronic database; and 51.43% of the respondents of Chandigarh, 44.52% of Punjab and 36.29% of Haryana were able to make effective use of the retrieved information. For accessing information in electronic format, 47.74% of the respondents rated their skills high, while the majority of respondents (76.13%) were unfamiliar with the basic search technique i.e. Boolean operator used for searching information in an online database. On the basis of the findings, it was suggested that a comprehensive training program based on medical professionals information needs should be organized frequently. Furthermore, it was also suggested that information literacy may be included as a subject in the health science curriculum so as to make the medical professionals information literate and independent lifelong learners.
A bibliometric analysis in the Web of Science database was used to identify overall scientific results of graphene inks to date (2008 to 2018). The objective of this study was to evaluate the evolutionary tendency of graphene inks research and to identify its aspects, aiming to provide data that can guide future work. The contributions of different researches, languages, thematic categories, periodicals, place of publication, institutes, funding agencies, articles cited and applications were analyzed. The results revealed a growing number of annual publications, of 258 papers found, 107 were included because they met the inclusion criteria. Three main applications were identified: synthesis and characterization, electronics and surfaces. The most relevant research on graphene inks has been summarized in this article, and graphene inks for electronic devices presented the most incident theme according to the research trends during the studied period. It is estimated that this theme will remain in evidence and will contribute to the direction of future research in this area.
Magnetic graphene has received widespread attention for their capability of water and wastewater treatment, which has been attracted many researchers in this field. A bibliometric analysis based on the Web of Science database was employed to analyze the global scientific outputs of magnetic graphene for water treatment until the present time (2012 to 2017), to improve the understanding of the research trends. The publication year, place of publication, institutes, funding agencies, journals, most cited articles, distribution outputs in thematic categories and applications were analyzed. Three major aspects analyzed including type of pollutant, treatment process and composite composition have further contributed to revealing the research trends. The most relevant research aspects of the main technologies using magnetic graphene for water treatment were summarized in this paper. The results showed that research on magnetic graphene for water treatment goes through a period of decline that might be related to a saturated field and a lack of bibliometric studies. Thus, the result of the present work will lead researchers to establish future directions in further studies using magnetic graphene for water treatment.
Later life loneliness is a social issue that is increasing alongside an upward global population trend. As a society, one way that we have responded to this social challenge is through developing non-pharmacological interventions such as befriending services, activity clubs, meet-ups, etc. Through a systematic literature review, this paper suggests that currently there is an underrepresentation of radical innovation, and underutilization of digital technologies in developing loneliness interventions for older adults. This paper examines intervention studies that were published in English language, within peer reviewed journals between January 2005 and December 2014 across 4 electronic databases. In addition to academic databases, interventions found in grey literature in the form of websites, blogs, and Twitter were also included in the overall review. This approach yielded 129 interventions that were included in the study. A systematic approach allowed the minimization of any bias dictating the selection of interventions to study. A coding strategy based on a pattern analysis approach was devised to be able to compare and contrast the loneliness interventions. Firstly, interventions were categorized on the basis of their objective to identify whether they were preventative, supportive, or remedial in nature. Secondly, depending on their scope, they were categorized as one-to-one, community-based, or group based. It was also ascertained whether interventions represented an improvement, an incremental innovation, a major advance or a radical departure, in comparison to the most basic form of a loneliness intervention. Finally, interventions were also assessed on the basis of the extent to which they utilized digital technologies. Individual visualizations representing the four levels of coding were created for each intervention, followed by an aggregated visual to facilitate analysis. To keep the inquiry within scope and to present a coherent view of the findings, the analysis was primarily concerned the level of innovation, and the use of digital technologies. This analysis highlights a weak but positive correlation between the level of innovation and the use of digital technologies in designing and deploying loneliness interventions, and also emphasizes how certain existing interventions could be tweaked to enable their migration from representing incremental innovation to radical innovation for example. This analysis also points out the value of including grey literature, especially from Twitter, in systematic literature reviews to get a contemporary view of latest work in the area under investigation.
It is commonly observed that aftershocks follow the mainshock. Aftershocks continue over a period of time with a decreasing frequency and typically there is not sufficient time for repair and retrofit between a mainshock–aftershock sequence. Usually, aftershocks are smaller in magnitude; however, aftershock ground motion characteristics such as the intensity and duration can be greater than the mainshock due to the changes in the earthquake mechanism and location with respect to the site. The seismic performance of slopes is typically evaluated based on the sliding displacement predicted to occur along a critical sliding surface. Various empirical models are available that predict sliding displacement as a function of seismic loading parameters, ground motion parameters, and site parameters but these models do not include the aftershocks. The seismic risks associated with the post-mainshock slopes ('damaged slopes') subjected to aftershocks is significant. This paper extends the empirical sliding displacement models for flexible slopes subjected to earthquake mainshock-aftershock sequences (a multi hazard approach). A dataset was developed using 144 pairs of as-recorded mainshock-aftershock sequences using the Pacific Earthquake Engineering Research Center (PEER) database. The results reveal that the combination of mainshock and aftershock increases the seismic demand on slopes relative to the mainshock alone; thus, seismic risks are underestimated if aftershocks are neglected.
The study examines the socioeconomic impact of development of an advanced industry in Israel. The research method is based on data collected from the Israel Central Bureau of Statistics and from the National Insurance Institute (NII) databases, which provided information that allows to examine the Economic and Social Changes during the 1990s. The study examined the socioeconomic effects of the development of advanced industry in Israel. The research findings indicate that as a result of globalization processes, the weight of traditional industry began to diminish as a result of factory closures and the laying off of workers. These circumstances led to growing unemployment among the weaker groups in Israeli society, detracting from their income and thus increasing inequality among different socioeconomic groups in Israel and enhancement of social disparities.
Installations of solar photovoltaic systems have increased considerably in the last decade. Therefore, it has been noticed that monitoring of meteorological data (solar irradiance, air temperature, wind velocity, etc.) is important to predict the potential of a given geographical area in solar energy production. In this sense, the present work compares two computational tools that are capable of estimating the energy generation of a photovoltaic system through correlation analyzes of solar radiation data: PVsyst software and an algorithm based on the PVlib package implemented in MATLAB. In order to achieve the objective, it was necessary to obtain solar radiation data (measured and from a solarimetric database), analyze the decomposition of global solar irradiance in direct normal and horizontal diffuse components, as well as analyze the modeling of the devices of a photovoltaic system (solar modules and inverters) for energy production calculations. Simulated results were compared with experimental data in order to evaluate the performance of the studied methods. Errors in estimation of energy production were less than 30% for the MATLAB algorithm and less than 20% for the PVsyst software.
The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).
Nowadays flight simulators offer tremendous possibilities for safe and cost-effective pilot training, by utilization of powerful, computational tools. Due to technology outpacing methodology, vast majority of training related work is done by human instructors. It makes assessment not efficient, and vulnerable to instructors’ subjectivity. The research presents an Objective Assessment Tool (gOAT) developed at the Warsaw University of Technology, and tested on SW-4 helicopter flight simulator. The tool uses database of the predefined manoeuvres, defined and integrated to the virtual environment. These were implemented, basing on Aeronautical Design Standard Performance Specification Handling Qualities Requirements for Military Rotorcraft (ADS-33), with predefined Mission-Task-Elements (MTEs). The core element of the gOAT enhanced algorithm that provides instructor a new set of information. In details, a set of objective flight parameters fused with report about psychophysical state of the pilot. While the pilot performs the task, the gOAT system automatically calculates performance using the embedded algorithms, data registered by the simulator software (position, orientation, velocity, etc.), as well as measurements of physiological changes of pilot’s psychophysiological state (temperature, sweating, heart rate). Complete set of measurements is presented on-line to instructor’s station and shown in dedicated graphical interface. The presented tool is based on open source solutions, and flexible for editing. Additional manoeuvres can be easily added using guide developed by authors, and MTEs can be changed by instructor even during an exercise. Algorithm and measurements used allow not only to implement basic stress level measurements, but also to reduce instructor’s workload significantly. Tool developed can be used for training purpose, as well as periodical checks of the aircrew. Flexibility and ease of modifications allow the further development to be wide ranged, and the tool to be customized. Depending on simulation purpose, gOAT can be adjusted to support simulator of aircraft, helicopter, or unmanned aerial vehicle (UAV).
Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.
With extensive application, the performance of unimodal biometrics systems has to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometric is fairly a new evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduced an algorithm to the combination of the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature. The proposed algorithm will be applied to the Institute of Technology of Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.
This paper presents the design application and reinforcement detailing of 15 storied reinforced concrete shear wall-frame structure based on linear static analysis. Databases are generated for section sizes based on automated structural optimization method utilizing Active-set Algorithm in MATLAB platform. The design constraints of allowable section sizes, capacity criteria and seismic provisions for static loads, combination of gravity and lateral loads are checked and determined based on ASCE 7-10 documents and ACI 318-14 design provision. The result of this study illustrates the efficiency of proposed method, and is expected to provide a useful reference in designing of RC shear wall-frame structures.
In China, the changes that men experience in the perinatal period are not well researched. Men are also at risk of maladaptation to parenthood. The aim of this research is to review current studies regarding changes that Chinese men experience during transitioning to parenthood. 5 databases were employed to search relevant papers. The search found 128 articles. Based on the inclusion and exclusion criteria, 35 articles were included in this integrative review. Results showed the changes that Chinese fathers experienced during the transition to parenthood can be divided into two aspects: family relationships and mental problems. During transition to parenthood, fathers usually experienced an increase in their disappointment with marital conflict resolution and decreased sexual intimacy with their partner. Mental health declined, with fathers often feeling depressed and/or anxious during this time. Some men were diagnosed with clinical depression. The predictors of these changes included three domains: personal background (age and income), family background (gender of infant, relationship status and unplanned child) and cultural background (‘doing the month’, Confucianism, policy, social support).
This paper presents a context-sensitive media similarity search algorithm. One of the central problems regarding media search is the semantic gap between the low-level features computed automatically from media data and the human interpretation of them. This is because the notion of similarity is usually based on high-level abstraction but the low-level features do not sometimes reflect the human perception. Many media search algorithms have used the Minkowski metric to measure similarity between image pairs. However those functions cannot adequately capture the aspects of the characteristics of the human visual system as well as the nonlinear relationships in contextual information given by images in a collection. Our search algorithm tackles this problem by employing a similarity measure and a ranking strategy that reflect the nonlinearity of human perception and contextual information in a dataset. Similarity search in an image database based on this contextual information shows encouraging experimental results.
Introduction: Sodium-glucose co-transporter-2 (SGLT2) inhibitors are a new class of oral anti-diabetic drugs with a unique mechanism of action. They are used to improve glycaemic control in adults with type 2 diabetes by enhancing urinary glucose excretion. In the UAE, there has been certainly an increased use of these medications. As with any new medication, there are safety considerations related to their use in patients with type two diabetes. A retrospective study was conducted at the three main centres of the Imperial College London Diabetes Centre. Methodology: All patients in electronic database (Diamond) from October 2014 to October 2017 were included with a minimum of six months usage of sodium glucose co-transporter inhibitors that comprise canagliflozin, dapagliflozin and empagliflozin. There were 15 paired sample biochemical and clinical correlations. The analysis was done at the start of the study, three months and six months apart. SPSS version 24 was used for this study. Conclusion: This study of sodium glucose co-transporter-2 inhibitors used showed significant reductions in weight, glycated haemoglobin A1C, systolic and diastolic blood pressures. As the case with systematic reviews, there were similar changes in liver enzymes, raised total cholesterol, low density lipopoptein and high density lipoprotein. There was slight improvement in estimated glomerular filtration rate too. Our analysis also showed that they increased in the incidence of urinary tract symptoms and incidence of urinary tract infections.
SQL injection is one of the most common types of attacks and has a very critical impact on web servers. In the worst case, an attacker can perform post-exploitation after a successful SQL injection attack. In the case of forensics web servers, web server analysis is closely related to log file analysis. But sometimes large file sizes and different log types make it difficult for investigators to look for traces of attackers on the server. The purpose of this paper is to help investigator take appropriate steps to investigate when the web server gets attacked. We use attack scenarios using SQL injection attacks including PHP backdoor injection as post-exploitation. We perform post-mortem analysis of web server logs based on Hypertext Transfer Protocol (HTTP) POST and HTTP GET method approaches that are characteristic of SQL injection attacks. In addition, we also propose structured analysis method between the web server application log file, database application, and other additional logs that exist on the webserver. This method makes the investigator more structured to analyze the log file so as to produce evidence of attack with acceptable time. There is also the possibility that other attack techniques can be detected with this method. On the other side, it can help web administrators to prepare their systems for the forensic readiness.