Although digitization is a buzzword in almost every election campaign, the political parties leave voters largely in the dark about their specific positions on digital issues. In the run-up to the 2019 elections in Switzerland, the ‘Digitization Monitor’ project (DMP) was launched in order to change this situation. Within the framework of the DMP, all 4,736 candidates were surveyed about their digital policy positions and values. The DMP is designed as a digital policy supplement to the existing ‘smartvote’ voting advice application. This enabled a direct comparison of the digital policy attitudes according to the DMP with the topics of the ‘smartvote’ questionnaire which are comprehensive in content but mainly related to conventional policy areas. This paper’s main research goal is to analyze and visualize possible differences between conventional and digital policy areas in terms of response patterns between and within political parties. The analysis is based on dimensionality reduction methods (multidimensional scaling and principal component analysis) for the visualization of inter-party differences, and on standard deviation as a measure of variation for the evaluation of intra-party unity. The results reveal that digital issues show a lower degree of inter-party polarization compared to conventional policy areas. Thus, the parties have more common ground in issues on digitization than in conventional policy areas. In contrast, the study reveals a mixed picture regarding intra-party unity. Homogeneous parties show a lower degree of unity in digitization issues whereas parties with heterogeneous positions in conventional areas have more united positions in digital areas. All things considered, the findings are encouraging as less polarized conditions apply to the debate on digital development compared to conventional politics. For the future, it would be desirable if in further countries similar projects to the DMP could emerge to broaden the basis for conclusions.
Cotton, as a strategic crop, plays an important role in providing human food and clothing need, because of its oil, protein, and fiber. Iran has been one of the largest cotton producers in the world in the past, but unfortunately, for economic reasons, its production is reduced now. One of the ways to reduce the cost of cotton production is to expand the mechanization of cotton harvesting. Iranian farmers do not accept the function of cotton harvesters. One reason for this lack of acceptance of cotton harvesting machines is the number of field losses on these machines. So, the majority of cotton fields are harvested by hand. Although the correct setting of the harvesting machine is very important in the cotton losses, the morphological properties of the cotton plant also affect the performance of cotton harvesters. In this study, the effect of some cotton morphological properties such as the height of the cotton plant, number, and length of sympodial and monopodial branches, boll dimensions, boll weight, number of carpels and bracts angle were evaluated on the performance of cotton picker. In this research, the efficiency of John Deere 9920 spindle Cotton picker is investigated on five different Iranian cotton cultivars. The results indicate that there was a significant difference between the five cultivars in terms of machine harvest efficiency. Golestan cultivar showed the best cotton harvester performance with an average of 87.6% of total harvestable seed cotton and Khorshid cultivar had the least cotton harvester performance. The principal component analysis showed that, at 50.76% probability, the cotton picker efficiency is affected by the bracts angle positively and by boll dimensions, the number of carpels and the height of cotton plants negatively. The seed cotton remains (in the plant and on the ground) after harvester in PCA scatter plot were in the same zone with boll dimensions and several carpels.
Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.
Blood pressure helps the physicians greatly to have a deep insight into the cardiovascular system. The determination of individual blood pressure is a standard clinical procedure considered for cardiovascular system problems. The conventional techniques to measure blood pressure (e.g. cuff method) allows a limited number of readings for a certain period (e.g. every 5-10 minutes). Additionally, these systems cause turbulence to blood flow; impeding continuous blood pressure monitoring, especially in emergency cases or critically ill persons. In this paper, the most important statistical features in the photoplethysmogram (PPG) signals were extracted to estimate the blood pressure noninvasively. PPG signals from more than 40 subjects were measured and analyzed and 12 features were extracted. The features were fed to principal component analysis (PCA) to find the most important independent features that have the highest correlation with blood pressure. The results show that the stiffness index means and standard deviation for the beat-to-beat heart rate were the most important features. A model representing both features for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) was obtained using a statistical regression technique. Surface fitting is used to best fit the series of data and the results show that the error value in estimating the SBP is 4.95% and in estimating the DBP is 3.99%.
A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.
A lot has been said and discussed regarding the rationale and significance of the Bechdel Score. It became a digital sensation in 2013, when Swedish cinemas began to showcase the Bechdel test score of a film alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. The pundits believe that the score is too simplified and the underlying criteria of a film to pass the test must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, is egregious. In this research, we have considered a few more parameters which highlight how we represent females in film, like the number of female dialogues in a movie, dialogue genre, and part of speech tags in the dialogue. The parameters were missing in the existing criteria to calculate the Bechdel score. The research aims to analyze 342 movies scripts to test a hypothesis if these extra parameters, above with the current Bechdel criteria, are significant in calculating the female representation score. The result of the Principal Component Analysis method concludes that the female dialogue content is a key component and should be considered while measuring the representation of women in a work of fiction.
Based on multivariate statistical analysis theory, this paper uses the principal component analysis method, Mahalanobis distance analysis method and fitting method to establish the photovoltaic health model to evaluate the health of photovoltaic panels. First of all, according to weather conditions, the photovoltaic panel variable data are classified into five categories: sunny, cloudy, rainy, foggy, overcast. The health of photovoltaic panels in these five types of weather is studied. Secondly, a scatterplot of the relationship between the amount of electricity produced by each kind of weather and other variables was plotted. It was found that the amount of electricity generated by photovoltaic panels has a significant nonlinear relationship with time. The fitting method was used to fit the relationship between the amount of weather generated and the time, and the nonlinear equation was obtained. Then, using the principal component analysis method to analyze the independent variables under five kinds of weather conditions, according to the Kaiser-Meyer-Olkin test, it was found that three types of weather such as overcast, foggy, and sunny meet the conditions for factor analysis, while cloudy and rainy weather do not satisfy the conditions for factor analysis. Therefore, through the principal component analysis method, the main components of overcast weather are temperature, AQI, and pm2.5. The main component of foggy weather is temperature, and the main components of sunny weather are temperature, AQI, and pm2.5. Cloudy and rainy weather require analysis of all of their variables, namely temperature, AQI, pm2.5, solar radiation intensity and time. Finally, taking the variable values in sunny weather as observed values, taking the main components of cloudy, foggy, overcast and rainy weather as sample data, the Mahalanobis distances between observed value and these sample values are obtained. A comparative analysis was carried out to compare the degree of deviation of the Mahalanobis distance to determine the health of the photovoltaic panels under different weather conditions. It was found that the weather conditions in which the Mahalanobis distance fluctuations ranged from small to large were: foggy, cloudy, overcast and rainy.
Substantial research has indicated that socioeconomic and demographic characteristics’ of neighborhoods are strong determinants of food security. The aim of this study was to develop a Food Insecurity Neighborhood Index (FINI) based on the associated socioeconomic and demographic variables to identify the areas at potential risk of food insecurity in rural British Columbia (BC). Principle Component Analysis (PCA) technique was used to calculate the FINI for each rural Dissemination Area (DA) using the food security determinant variables from Canadian Census data. Using ArcGIS, the neighborhoods with the top quartile FINI values were classified as food insecure. The results of this study indicated that the most food insecure neighborhood with the highest FINI value of 99.1 was in the Bulkley-Nechako (central BC) area whereas the lowest FINI with the value of 2.97 was for a rural neighborhood in the Cowichan Valley area. In total, 98.049 (19%) of the rural population of British Columbians reside in high food insecure areas. Moreover, the distribution of food insecure neighborhoods was found to be strongly dependent on the degree of rurality in BC. In conclusion, the cluster of food insecure neighbourhoods was more pronounced in Central Coast, Mount Wadington, Peace River, Kootenay Boundary, and the Alberni-Clayoqout Regional Districts.
This study presents new gait representations for improving gait recognition accuracy on cross gait appearances, such as normal walking, wearing a coat and carrying a bag. Based on the Gait Energy Image (GEI), two ideas are implemented to generate new gait representations. One is to append lower knee regions to the original GEI, and the other is to apply convolutional operations to the GEI and its variants. A set of new gait representations are created and used for training multi-class Support Vector Machines (SVMs). Tests are conducted on the CASIA dataset B. Various combinations of the gait representations with different convolutional kernel size and different numbers of kernels used in the convolutional processes are examined. Both the entire images as features and reduced dimensional features by Principal Component Analysis (PCA) are tested in gait recognition. Interestingly, both new techniques, appending the lower knee regions to the original GEI and convolutional GEI, can significantly contribute to the performance improvement in the gait recognition. The experimental results have shown that the average recognition rate can be improved from 75.65% to 87.50%.
The evolution of groundwater chemistry and its quality is largely controlled by hydrogeochemical processes and their understanding is therefore important for groundwater quality assessments and protection of the water resources. A study was conducted in Bloemfontein town of South Africa to assess and compare the groundwater chemistry and quality characteristics in an alluvial aquifer and single-plane fractured-rock aquifers. 9 groundwater samples were collected from monitoring boreholes drilled into the two aquifer systems during a once-off sampling exercise. Samples were collected through low-flow purging technique and analysed for major ions and trace elements. In order to describe the hydrochemical facies and identify dominant hydrogeochemical processes, the groundwater chemistry data are interpreted using stiff diagrams and principal component analysis (PCA), as complimentary tools. The fitness of the groundwater quality for domestic and irrigation uses is also assessed. Results show that the alluvial aquifer is characterised by a Na-HCO3 hydrochemical facie while fractured-rock aquifer has a Ca-HCO3 facie. The groundwater in both aquifers originally evolved from the dissolution of calcite rocks that are common on land surface environments. However the groundwater in the alluvial aquifer further goes through another evolution as driven by cation exchange process in which Na in the sediments exchanges with Ca2+ in the Ca-HCO3 hydrochemical type to result in the Na-HCO3 hydrochemical type. Despite the difference in the hydrogeochemical processes between the alluvial aquifer and single-plane fractured-rock aquifer, this did not influence the groundwater quality. The groundwater in the two aquifers is very hard as influenced by the elevated magnesium and calcium ions that evolve from dissolution of carbonate minerals which typically occurs in surface environments. Based on total dissolved levels (600-900 mg/L), groundwater quality of the two aquifer systems is classified to be of fair quality. The negative potential impacts of the groundwater quality for domestic uses are highlighted.
The objective of the paper is the study of geographic, economic and educational variables and their contribution to determine the position of each member-state among the EU-28 countries based on the values of seven variables as given by Eurostat. The Data Analysis methods of Multiple Factorial Correspondence Analysis (MFCA) Principal Component Analysis and Factor Analysis have been used. The cross tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison reasons with the same data the Factor procedure of Statistical package IBM SPSS 20 has been used. The numerical and graphical results presented with tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.
Under the circumstance of environment deterioration, people are increasingly concerned about the quality of the environment, especially air quality. As a result, it is of great value to give accurate and timely forecast of AQI (air quality index). In order to simplify influencing factors of air quality in a city, and forecast the city’s AQI tomorrow, this study used MATLAB software and adopted the method of constructing a mathematic model of PCA-GABP to provide a solution. To be specific, this study firstly made principal component analysis (PCA) of influencing factors of AQI tomorrow including aspects of weather, industry waste gas and IAQI data today. Then, we used the back propagation neural network model (BP), which is optimized by genetic algorithm (GA), to give forecast of AQI tomorrow. In order to verify validity and accuracy of PCA-GABP model’s forecast capability. The study uses two statistical indices to evaluate AQI forecast results (normalized mean square error and fractional bias). Eventually, this study reduces mean square error by optimizing individual gene structure in genetic algorithm and adjusting the parameters of back propagation model. To conclude, the performance of the model to forecast AQI is comparatively convincing and the model is expected to take positive effect in AQI forecast in the future.
The correct estimation of reference evapotranspiration (ETₒ) is required for effective irrigation water resources planning and management. However, there are some variables that must be considered while estimating and modeling ETₒ. This study therefore determines the multivariate analysis of correlated variables involved in the estimation and modeling of ETₒ at Vaalharts irrigation scheme (VIS) in South Africa using Principal Component Analysis (PCA) technique. Weather and meteorological data between 1994 and 2014 were obtained both from South African Weather Service (SAWS) and Agricultural Research Council (ARC) in South Africa for this study. Average monthly data of minimum and maximum temperature (°C), rainfall (mm), relative humidity (%), and wind speed (m/s) were the inputs to the PCA-based model, while ETₒ is the output. PCA technique was adopted to extract the most important information from the dataset and also to analyze the relationship between the five variables and ETₒ. This is to determine the most significant variables affecting ETₒ estimation at VIS. From the model performances, two principal components with a variance of 82.7% were retained after the eigenvector extraction. The results of the two principal components were compared and the model output shows that minimum temperature, maximum temperature and windspeed are the most important variables in ETₒ estimation and modeling at VIS. In order words, ETₒ increases with temperature and windspeed. Other variables such as rainfall and relative humidity are less important and cannot be used to provide enough information about ETₒ estimation at VIS. The outcome of this study has helped to reduce input variable dimensionality from five to the three most significant variables in ETₒ modelling at VIS, South Africa.
This paper examines the influence of knowledge management factors on organizational commitment for employees in the oil and gas drilling industry of Iran. We determine what knowledge factors have the greatest impact on the personnel loyalty and commitment to the organization using collected data from a survey of over 300 full-time personnel working in three large companies active in oil and gas drilling industry of Iran. To specify the effect of knowledge factors in the organizational commitment of the personnel in the studied organizations, the Principal Component Analysis (PCA) is used. Findings of our study show that the factors such as knowledge and expertise, in-service training, the knowledge value and the application of individuals’ knowledge in the organization as the factor “learning and perception of personnel from the value of knowledge within the organization” has the greatest impact on the organizational commitment. After this factor, “existence of knowledge and knowledge sharing environment in the organization”; “existence of potential knowledge exchanging in the organization”; and “organizational knowledge level” factors have the most impact on the organizational commitment of personnel, respectively.
The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the supportvectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.
Red blood cells (RBCs) are among the most commonly and intensively studied type of blood cells in cell biology. Anemia is a lack of RBCs is characterized by its level compared to the normal hemoglobin level. In this study, a system based image processing methodology was developed to localize and extract RBCs from microscopic images. Also, the machine learning approach is adopted to classify the localized anemic RBCs images. Several textural and geometrical features are calculated for each extracted RBCs. The training set of features was analyzed using principal component analysis (PCA). With the proposed method, RBCs were isolated in 4.3secondsfrom an image containing 18 to 27 cells. The reasons behind using PCA are its low computation complexity and suitability to find the most discriminating features which can lead to accurate classification decisions. Our classifier algorithm yielded accuracy rates of 100%, 99.99%, and 96.50% for K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and neural network RBFNN, respectively. Classification was evaluated in highly sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained within short time period, and the results became better when PCA was used.
Face recognition is a technique to automatically identify or verify individuals. It receives great attention in identification, authentication, security and many more applications. Diverse methods had been proposed for this purpose and also a lot of comparative studies were performed. However, researchers could not reach unified conclusion. In this paper, we are reporting an extensive quantitative accuracy analysis of four most widely used face recognition algorithms: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) using AT&T, Sheffield and Bangladeshi people face databases under diverse situations such as illumination, alignment and pose variations.
In this paper the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is to find the most appropriate kernel function on finger vein recognition as there are several kernel functions which can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms -particularly KPCA- is investigated. The aspect of dimension of feature vector in PCA-based algorithms is of importance especially when it comes to the real-world applications and usage of such algorithms. It means that a fixed dimension of feature vector has to be set to reduce the dimension of the input and output data and extract the features from them. Then a classifier is performed to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in details in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.
The aim of research was to define the relations between volatile compounds, some parameters (pH, titratable acidity (TA), total soluble solid (TSS), lactic acid bacteria count) and consumer preference of commercial fermented milks. These relations tend to be used for controlling and developing new fermented milk product. Three leading commercial brands of fermented milks in Thailand were evaluated by consumers (n=71) using hedonic scale for four attributes (sweetness, sourness, flavour, and overall liking), volatile compounds using headspace-solid phase microextraction (HS-SPME) GC-MS, pH, TA, TSS and LAB count. Then the relations were analyzed by principal component analysis (PCA). The PCA data showed that all of four attributes liking scores were related to each other. They were also related to TA, TSS and volatile compounds. The related volatile compounds were mainly on fermented produced compounds including acetic acid, furanmethanol, furfural, octanoic acid and the volatiles known as artificial fruit flavour (beta pinene, limonene, vanillin, and ethyl vanillin). These compounds were provided the information about flavour addition in commercial fermented milk in Thailand.
This paper presents a facial expression recognition system. It performs identification and classification of the seven basic expressions; happy, surprise, fear, disgust, sadness, anger, and neutral states. It consists of three main parts. The first one is the detection of a face and the corresponding facial features to extract the most expressive portion of the face, followed by a normalization of the region of interest. Then calculus of curvelet coefficients is performed with dimensionality reduction through principal component analysis. The resulting coefficients are combined with two ratios; mouth ratio and face edge ratio to constitute the whole feature vector. The third step is the classification of the emotional state using the SVM method in the feature space.
The aim of the present study was to develop a rapid method for electronic nose for online quality control of oat milk. Analysis by electronic nose and bacteriological measurements were performed to analyze spoilage kinetics of oat milk samples stored at room temperature and refrigerated conditions for up to 15 days. Principal component analysis (PCA), Discriminant Factorial Analysis (DFA) and Soft Independent Modelling by Class Analogy (SIMCA) classification techniques were used to differentiate the samples of oat milk at different days. The total plate count (bacteriological method) was selected as the reference method to consistently train the electronic nose system. The e-nose was able to differentiate between the oat milk samples of varying microbial load. The results obtained by the bacteria total viable countsshowed that the shelf-life of oat milk stored at room temperature and refrigerated conditions were 20hrs and 13 days, respectively. The models built classified oat milk samples based on the total microbial population into “unspoiled” and “spoiled”.
A duplicated image region may be subjected to a number of attacks such as noise addition, compression, reflection, rotation, and scaling with the intention of either merely mating it to its targeted neighborhood or preventing its detection. In this paper, we present an effective and robust method of detecting duplicated regions inclusive of those affected by the various attacks. In order to reduce the dimension of the image, the proposed algorithm firstly performs discrete wavelet transform, DWT, of a suspicious image. However, unlike most existing copy move image forgery (CMIF) detection algorithms operating in the DWT domain which extract only the low frequency subband of the DWT of the suspicious image thereby leaving valuable information in the other three subbands, the proposed algorithm simultaneously extracts features from all the four subbands. The extracted features are not only more accurate representation of image regions but also robust to additive noise, JPEG compression, and affine transformation. Furthermore, principal component analysis-eigenvalue decomposition, PCA-EVD, is applied to reduce the dimension of the features. The extracted features are then sorted using the more computationally efficient Radix Sort algorithm. Finally, same affine transformation selection, SATS, a duplication verification method, is applied to detect duplicated regions. The proposed algorithm is not only fast but also more robust to attacks compared to the related CMIF detection algorithms. The experimental results show high detection rates.