Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
Accurate prediction of NOx emission is a continuous challenge in the field of diesel engine-out emission modeling. Performing experiments for each conditions and scenario cost significant amount of money and man hours, therefore model-based development strategy has been implemented in order to solve that issue. NOx formation is highly dependent on the burn gas temperature and the O2 concentration inside the cylinder. The current empirical models are developed by calibrating the parameters representing the engine operating conditions with respect to the measured NOx. This makes the prediction of purely empirical models limited to the region where it has been calibrated. An alternative solution to that is presented in this paper, which focus on the utilization of in-cylinder combustion parameters to form a predictive semi-empirical NOx model. The result of this work is shown by developing a fast and predictive NOx model by using the physical parameters and empirical correlation. The model is developed based on the steady state data collected at entire operating region of the engine and the predictive combustion model, which is developed in Gamma Technology (GT)-Power by using Direct Injected (DI)-Pulse combustion object. In this approach, temperature in both burned and unburnt zone is considered during the combustion period i.e. from Intake Valve Closing (IVC) to Exhaust Valve Opening (EVO). Also, the oxygen concentration consumed in burnt zone and trapped fuel mass is also considered while developing the reported model. Several statistical methods are used to construct the model, including individual machine learning methods and ensemble machine learning methods. A detailed validation of the model on multiple diesel engines is reported in this work. Substantial numbers of cases are tested for different engine configurations over a large span of speed and load points. Different sweeps of operating conditions such as Exhaust Gas Recirculation (EGR), injection timing and Variable Valve Timing (VVT) are also considered for the validation. Model shows a very good predictability and robustness at both sea level and altitude condition with different ambient conditions. The various advantages such as high accuracy and robustness at different operating conditions, low computational time and lower number of data points requires for the calibration establishes the platform where the model-based approach can be used for the engine calibration and development process. Moreover, the focus of this work is towards establishing a framework for the future model development for other various targets such as soot, Combustion Noise Level (CNL), NO2/NOx ratio etc.
During the last 10 years, in spite of the economic crisis, the number of tourists arriving in Greece has increased, particularly during the tourist season from April to October. In this paper, the number of annual tourist arrivals is studied to explore their preferences with regard to the month of travel, the selected destinations, as well the amount of money spent. The collected data are processed with statistical methods, yielding numerical and graphical results. From the computation of statistical parameters and the forecasting with exponential smoothing, useful conclusions are arrived at that can be used by the Greek tourism authorities, as well as by tourist organizations, for planning purposes for the coming years. The results of this paper and the computed forecast can also be used for decision making by private tourist enterprises that are investing in Greece. With regard to the statistical methods, the method of Simple Exponential Smoothing of time series of data is employed. The search for a best forecast for 2017 and 2018 provides the value of the smoothing coefficient. For all statistical computations and graphics Microsoft Excel is used.
The modern world faces huge challenges. Globalization changed the socio-economic conditions of many countries. The current processes in the global environment have a different impact on countries with different cultures. However, an alleviation of poverty and improvement of living conditions is still the basic challenge for the majority of countries, because much of the population still lives under the official threshold of poverty. It is very important to stimulate youth employment. In order to prepare young people for the labour market, it is essential to provide them with the appropriate professional skills and knowledge. It is necessary to plan efficient activities for decreasing an unemployment rate and for developing the perfect mechanisms for regulation of a labour market. Such planning requires thorough study and analysis of existing reality, as well as development of corresponding mechanisms. Statistical analysis of unemployment is one of the main platforms for regulation of the labour market key mechanisms. The corresponding statistical methods should be used in the study process. Such methods are observation, gathering, grouping, and calculation of the generalized indicators. Unemployment is one of the most severe socioeconomic problems in Georgia. According to the past as well as the current statistics, unemployment rates always have been the most problematic issue to resolve for policy makers. Analytical works towards to the above-mentioned problem will be the basis for the next sustainable steps to solve the main problem. The results of the study showed that the choice of young people is not often due to their inclinations, their interests and the labour market demand. That is why the wrong professional orientation of young people in most cases leads to their unemployment. At the same time, it was shown that there are a number of professions in the labour market with a high demand because of the deficit the appropriate specialties. To achieve healthy competitiveness in youth employment, it is necessary to formulate regional employment programs with taking into account the regional infrastructure specifications.
Collaborative research has become more prevalent and important across disciplines because it stimulates innovation and interaction between scholars. Seeing as existing studies relatively disregarded the institutional conditions triggering collaborative research, this work aims to analyze the changing trend in collaborative work patterns among Korean social scientists. The focus of this research is the performance of social scientists who received research grants through the government’s Social Science Korea (SSK) program. Using quantitative statistical methods, collaborative research patterns in a total of 2,354 papers published under the umbrella of the SSK program in peer-reviewed scholarly journals from 2013 to 2016 were examined to identify changing trends and triggering factors in collaborative research. A notable finding is that the share of collaborative research is overwhelmingly higher than that of individual research. In particular, levels of collaborative research surpassed 70%, increasing much quicker compared to other research done in the social sciences. Additionally, the most common composition of collaborative research was for two or three researchers to conduct joint research as coauthors, and this proportion has also increased steadily. Finally, a strong association between international journals and co-authorship patterns was found for the papers published by SSK program researchers from 2013 to 2016. The SSK program can be seen as the driving force behind collaboration between social scientists. Its emphasis on competition through a merit-based financial support system along with a rigorous evaluation process seems to have influenced researchers to cooperate with those who have similar research interests.
This paper aimed to help researchers and international companies to the differences and similarities between IFRS (International financial reporting standards) and UK GAAP or UK accounting principles, and to the accounting changes between standard setting of the International Accounting Standards Board and the Accounting Standards Board in United Kingdom. We will use in this study statistical methods to calculate similarities and difference frequencies between the UK standards and IFRS standards, according to the PricewaterhouseCoopers report in 2005. We will use the one simple test to confirm or refuse our hypothesis. In conclusion, we found that the gap between UK GAAP and IFRS is small.
Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.
Often a Grade 12 certificate is perceived as a passport to tertiary education and the minimum requirement to enter the world of work. In spite of its importance, many students do not make this milestone in South Africa. It is important to find out why so many students still fail in spite of transformation in the education system in the post-apartheid era. Given the complexity of education and its context, this study adopted a case study design to examine one historically underperforming high school in Bushbuckridge, Mpumalanga Province, South Africa in 2013. The aim was to gain a understanding of the inner and outer school contextual factors associated with the high failure rate among Grade 12 students. Government documents and reports were consulted to identify factors in the district and the village surrounding the school and a student survey was conducted to identify school, home and student factors. The randomly-sampled half of the population of Grade 12 students (53) participated in the survey and quantitative data are analyzed using descriptive statistical methods. The findings showed that a host of factors is at play. The school is located in a village within a municipality which has been one of the poorest three municipalities in South Africa and the lowest Grade 12 pass rate in the Mpumalanga province. Moreover, over half of the families of the students are single parents, 43% are unemployed and the majority has a low level of education. In addition, most families (83%) do not have basic study materials such as a dictionary, books, tables, and chairs. A significant number of students (70%) are over-aged (+19 years old); close to half of them (49%) are grade repeaters. The school itself lacks essential resources, namely computers, science laboratories, library, and enough furniture and textbooks. Moreover, teaching and learning are negatively affected by the teachers’ occasional absenteeism, inadequate lesson preparation, and poor communication skills. Overall, the continuous low performance of students in this school mirrors the vicious circle of multiple negative conditions present within and outside of the school. The complexity of factors associated with the underperformance of Grade 12 students in this school calls for a multi-dimensional intervention from government and stakeholders. One important intervention should be the placement of over-aged students and grade-repeaters in suitable educational institutions for the benefit of other students.
This work investigates a mathematical study for traffic flow and traffic density in Kigali city roads and the data collected from the national police of Rwanda in 2012. While working on this topic, some mathematical models were used in order to analyze and compare traffic variables. This work has been carried out on Kigali roads specifically at roundabouts from Kigali Business Center (KBC) to Prince House as our study sites. In this project, we used some mathematical tools to analyze the data collected and to understand the relationship between traffic variables. We applied the Poisson distribution method to analyze and to know the number of accidents occurred in this section of the road which is from KBC to Prince House. The results show that the accidents that occurred in 2012 were at very high rates due to the fact that this section has a very narrow single lane on each side which leads to high congestion of vehicles, and consequently, accidents occur very frequently. Using the data of speeds and densities collected from this section of road, we found that the increment of the density results in a decrement of the speed of the vehicle. At the point where the density is equal to the jam density the speed becomes zero. The approach is promising in capturing sudden changes on flow patterns and is open to be utilized in a series of intelligent management strategies and especially in noncurrent congestion effect detection and control.
Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and the Philippines plates. Between the Sangihe arc, west of the collision zone, and to the east of Halmahera arc is active collision and convex toward the Molluca Sea. This research will analyze the behavior of earthquake occurrence in Molluca Collision Zone related to the distributions of an earthquake in each partition regions, determining the type of distribution of a occurrence earthquake of partition regions, and the mean occurence of earthquakes each partition regions, and the correlation between the partitions region. We calculate number of earthquakes using partition method and its behavioral using conventional statistical methods. In this research, we used data of shallow earthquakes type and its magnitudes ≥4 SR (period 1964-2013). From the results, we can classify partitioned regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning system in disaster management.
Curcuma longa L. (Zingiberaceae), commonly known as turmeric, has a long history of traditional uses for culinary purposes as a spice and a food colorant. The present study aimed to document the ethnobotanical knowledge about Curcuma longa, and to assess the variation in the herbalists’ experience in Northeastern Algeria. Data were collected using semi-structured questionnaires and direct interviews with 30 herbalists. Ethnobotanical indices, including the fidelity level (FL%), the relative frequency citation (RFC), and use value (UV) were determined by quantitative methods. Diversity in the level of knowledge was analyzed using univariate, non-parametric, and multivariate statistical methods. Three main categories of uses were recorded for C. longa: for food, for medicine, and for cosmetic purposes. As a medicine, turmeric was used for the treatment of gastrointestinal, dermatological, and hepatic diseases. Medicinal and food uses were correlated with both forms of preparation (rhizome and powder). The age group did not influence the use. Multivariate analyses showed a significant variation in traditional knowledge, associated with the use value, origin, quality, and efficacy of the drug. The findings suggested that the geographical origin of C. longa affected the use in Algeria.
In this paper, we study the rainfall using a time series for weather stations in Nakhon Ratchasima province in Thailand by various statistical methods to enable us to analyse the behaviour of rainfall in the study areas. Time-series analysis is an important tool in modelling and forecasting rainfall. The ARIMA and Holt-Winter models were built on the basis of exponential smoothing. All the models proved to be adequate. Therefore it is possible to give information that can help decision makers establish strategies for the proper planning of agriculture, drainage systems and other water resource applications in Nakhon Ratchasima province. We obtained the best performance from forecasting with the ARIMA Model(1,0,1)(1,0,1)12.
In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.
The purpose of this study is to follow – up the graduated students of Bachelor of Science in Applied Statistics from Suan Sunandha Rajabhat University (SSRU) during the 1999 – 2012 academic years and to provide the fundamental guideline for developing the current curriculum according to Thai Qualifications Framework for Higher Education (TQF: HEd). The sample was collected from 75 graduates by interview and online questionnaire. The content covered 5 subjects were Ethics and Moral, Knowledge, Cognitive Skills, Interpersonal Skill and Responsibility, Numerical Analysis as well as Communication and Information Technology Skills. Data were analyzed by using statistical methods as percentiles, means, standard deviation, t- tests, and F- tests. The findings showed that samples were mostly female had less than 26 years old. The majority of graduates had income in the range of 10,001-20,000 Baht and experience range were 2-5 years. In addition, overall opinions from receiving knowledge to apply to work were at agree; mean score was 3.97 and standard deviation was 0.40. In terms of, the hypothesis testing’s result indicate gender only had different opinion at a significance level of 0.05.
Ice cover County has a significant impact on rivers as it affects with the ice melting capacity which results in flooding, restrict navigation, modify the ecosystem and microclimate. River ices are made up of different ice types with varying ice thickness, so surveillance of river ice plays an important role. River ice types are captured using infrared imaging camera which captures the images even during the night times. In this paper the river ice infrared texture images are analysed using first-order statistical methods and secondorder statistical methods. The second order statistical methods considered are spatial gray level dependence method, gray level run length method and gray level difference method. The performance of the feature extraction methods are evaluated by using Probabilistic Neural Network classifier and it is found that the first-order statistical method and second-order statistical method yields low accuracy. So the features extracted from the first-order statistical method and second-order statistical method are combined and it is observed that the result of these combined features (First order statistical method + gray level run length method) provides higher accuracy when compared with the features from the first-order statistical method and second-order statistical method alone.
This paper presents a comparative study of statistical methods for the multi-response surface optimization of a cryogenic freezing process. Taguchi design and analysis and steepest ascent methods based on the desirability function were conducted to ascertain the influential factors of a cryogenic freezing process and their optimal levels. The more preferable levels of the set point, exhaust fan speed, retention time and flow direction are set at -90oC, 20 Hz, 18 minutes and Counter Current, respectively. The overall desirability level is 0.7044.
The paper deals with quality labels used in the food products market, especially with labels of quality, labels of origin, and labels of organic farming. The aim of the paper is to identify perception of these labels by consumers in the Czech Republic. The first part refers to the definition and specification of food quality labels that are relevant in the Czech Republic. The second part includes the discussion of marketing research results. Data were collected with personal questioning method. Empirical findings on 150 respondents are related to consumer awareness and perception of national and European food quality labels used in the Czech Republic, attitudes to purchases of labelled products, and interest in information regarding the labels. Statistical methods, in the concrete Pearson´s chi-square test of independence, coefficient of contingency, and coefficient of association are used to determinate if significant differences do exist among selected demographic categories of Czech consumers.