The Yi are an ethnic group living mainly in mainland China, with spoken and written language systems of their own that have developed over thousands of years. Ancient Yi is one of the six ancient languages in the world; it records the history of the Yi people and offers documents valuable for research into human civilization. Recognizing ancient Yi characters helps convert these documents into electronic form, which makes them easier to store and disseminate. Owing to historical and regional limitations, research on the recognition of ancient Yi characters is still inadequate. Deep learning was therefore applied to the recognition of these characters. Five models were developed on the basis of a four-layer convolutional neural network (CNN). Alpha-Beta divergence was used as a penalty term to re-encode the output neurons of the five models. Two fully connected layers then compressed the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and the characters whose features had the highest probability were recognized. Tests show that the method achieves higher precision than the traditional CNN model for handwriting recognition of ancient Yi.
This study was designed to find the best-fit probability distribution of annual rainfall, based on a 50-year sample (1966-2015) from the Karkheh River basin in Iran, using six probability distributions: Normal, 2-Parameter Log Normal, 3-Parameter Log Normal, Pearson Type 3, Log Pearson Type 3 and Gumbel. The best-fit probability distribution was selected using the Stormwater Management and Design Aid (SMADA) software, based on the Residual Sum of Squares (R.S.S) between observed and estimated values. Based on the R.S.S values of the fit tests, the Log Pearson Type 3 distribution, followed by the Pearson Type 3 distribution, was found to be the best fit at the Jelogir Majin and Pole Zal rainfall gauging stations. The annual values of expected rainfall were calculated using the best-fit probability distributions and can be used by hydrologists and design engineers in future research in the studied region and in other regions of the world.
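The R.S.S-based selection described above can be sketched in a few lines. This is a minimal illustration, not SMADA's procedure: the Weibull plotting positions, the method-of-moments fits, the restriction to two candidate distributions, and the rainfall values are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean, stdev

def plotting_positions(n):
    # Weibull plotting positions i / (n + 1); an assumed choice.
    return [i / (n + 1) for i in range(1, n + 1)]

def normal_quantiles(sample, probs):
    # Normal distribution fitted by sample mean and standard deviation.
    dist = NormalDist(mean(sample), stdev(sample))
    return [dist.inv_cdf(p) for p in probs]

def gumbel_quantiles(sample, probs):
    # Gumbel distribution fitted by the method of moments.
    beta = stdev(sample) * math.sqrt(6) / math.pi
    mu = mean(sample) - 0.5772156649 * beta  # Euler-Mascheroni constant
    return [mu - beta * math.log(-math.log(p)) for p in probs]

def rss(observed, estimated):
    # Residual sum of squares between observed and estimated values.
    return sum((o - e) ** 2 for o, e in zip(observed, estimated))

def rank_distributions(sample):
    # Returns (name, R.S.S) pairs sorted from best (smallest) to worst.
    obs = sorted(sample)
    probs = plotting_positions(len(obs))
    scores = {
        "Normal": rss(obs, normal_quantiles(obs, probs)),
        "Gumbel": rss(obs, gumbel_quantiles(obs, probs)),
    }
    return sorted(scores.items(), key=lambda kv: kv[1])

# Hypothetical annual rainfall totals in mm (not data from the study).
rainfall_mm = [312.0, 405.5, 288.1, 510.2, 367.9, 420.4, 298.7, 455.0, 390.3, 610.8]
ranking = rank_distributions(rainfall_mm)
print(ranking)
```

The first entry of `ranking` is the candidate with the smallest R.S.S, which is the selection criterion the abstract names.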
Probability distributions are the best method for forecasting extreme hydrologic phenomena such as rainfall and flood flows. In this research, in order to determine a suitable probability distribution for estimating annual extreme rainfall and flood flow (discharge) series with different return periods, precipitation records spanning 40 years and discharge records spanning 58 years were collected from the Karkheh River in Iran. After homogeneity and adequacy tests, the data were analyzed with the Stormwater Management and Design Aid (SMADA) software and the residual sum of squares (R.S.S). The best probability distribution was Log Pearson Type III, with R.S.S value (145.91) and value (13.67) for peak discharge and with R.S.S values (141.08) and (8.95) for maximum discharge at the Jelogir Majin and Pole Zal stations, respectively. The best distribution for maximum precipitation at the Jelogir Majin and Pole Zal stations was the Log Pearson Type III distribution, with R.S.S values (1.74 and 1.90), followed by the Pearson Type III distribution, with R.S.S values (1.53 and 1.69). Overall, Log Pearson Type III distributions are an acceptable distribution type for representing the statistics of extreme hydrologic phenomena in the Karkheh River in Iran, with the Pearson Type III distribution as a potential alternative.
This work proposes data-driven, multiscale quantitative measures to reveal the underlying complexity of the electroencephalogram (EEG), and applies them to a rodent model of hypoxic-ischemic brain injury and recovery. Because real EEG recordings are nonlinear and non-stationary across different frequencies or scales, an approach more suitable than conventional single-scale tools is needed for analyzing EEG data. Here, we present a new framework of complexity measures that accounts for changing dynamics over multiple oscillatory scales. The proposed multiscale complexity is obtained by calculating the entropies of the probability distributions of the intrinsic mode functions extracted by empirical mode decomposition (EMD) of the EEG. To quantify EEG recordings from a rat model of hypoxic-ischemic brain injury following cardiac arrest, the multiscale version of Tsallis entropy is examined. To validate the proposed complexity measure, actual EEG recordings from rats (n=9) that experienced a 7 min cardiac arrest followed by resuscitation were analyzed. Experimental results demonstrate that the multiscale Tsallis entropy leads to better discrimination of injury levels and improved correlation with the neurological deficit evaluation 72 hours after cardiac arrest, suggesting that it is an effective metric for use as a prognostic tool.
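The entropy step described above can be illustrated with a short sketch. It assumes that the probability distribution of each intrinsic mode function is estimated with an equal-width histogram and uses the standard Tsallis form S_q = (1 - Σ p_i^q) / (q - 1). The EMD step itself is not implemented; the function names, bin count, and sample values are hypothetical.

```python
import math

def histogram_probabilities(samples, bins=8):
    # Estimate a discrete probability distribution from signal samples
    # using equal-width bins (an assumed estimator); empty bins are dropped.
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return [1.0]
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(samples)
    return [c / n for c in counts if c > 0]

def tsallis_entropy(probs, q):
    # Tsallis entropy of order q; reduces to Shannon entropy as q -> 1.
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def multiscale_tsallis(imfs, q, bins=8):
    # One entropy value per scale; `imfs` stands in for the output of EMD.
    return [tsallis_entropy(histogram_probabilities(imf, bins), q) for imf in imfs]

# Hypothetical stand-ins for two intrinsic mode functions.
imfs = [
    [math.sin(0.9 * t) for t in range(200)],
    [math.sin(0.1 * t) for t in range(200)],
]
print(multiscale_tsallis(imfs, q=2.0))
```

In practice the list `imfs` would come from an EMD implementation applied to the EEG recording, which is outside the scope of this sketch.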
The exact theoretical expression describing the probability distribution of nonlinear sea-surface elevations derived from the second-order narrowband model has a cumbersome form that requires numerical computations, not well-disposed to theoretical or practical applications. Here, the same narrowband model is reexamined to develop a simpler closed-form approximation suitable for theoretical and practical applications. The salient features of the approximate form are explored, and its relative validity is verified with comparisons to other readily available approximations, and oceanic data.
At-site flood frequency analysis is used to estimate flood quantiles when the at-site record length is reasonably long. In Australia, the FLIKE software has been introduced for at-site flood frequency analysis. The advantage of FLIKE is that, for a given application, the user can compare a number of the most commonly adopted probability distributions and parameter estimation methods relatively quickly using a windows interface. The new version of FLIKE incorporates the multiple Grubbs and Beck test, which can identify multiple potentially influential low flows. This paper presents a case study of six catchments in eastern Australia that compares two outlier identification tests (the original Grubbs and Beck test and the multiple Grubbs and Beck test) and two commonly applied probability distributions (Generalized Extreme Value (GEV) and Log Pearson type 3 (LP3)) using the FLIKE software. It has been found that the multiple Grubbs and Beck test, when used with the LP3 distribution, provides more accurate flood quantile estimates than the LP3 distribution used with the original Grubbs and Beck test. Between these two methods, the differences in flood quantile estimates have been found to be up to 61% for the six study catchments. It has also been found that the GEV distribution (with L moments) and the LP3 distribution with the multiple Grubbs and Beck test provide quite similar results in most cases; however, a difference of up to 38% has been noted in the flood quantiles for an annual exceedance probability (AEP) of 1 in 100 for one catchment. This finding needs to be confirmed with a greater number of stations across other Australian states.
Today, the demand for water resources is increasing rapidly due to population growth. At the same time, some regions are expected to face water shortages and drought because of global warming and climate change. In this context, the evaluation and analysis of hydrological data, such as observed trends and drought and flood prediction of short-term flow, is of great importance. Selecting the most accurate probability distribution is important for describing low-flow statistics in studies related to drought analysis. Like many basins in Turkey, the Gediz River basin is expected to be significantly affected by drought, which will reduce the amount of water available for use. The aim of this study is to derive appropriate probability distributions for the frequency analysis of annual minimum flows at 6 gauging stations in the Gediz Basin. After applying 10 probability distributions, six parameter estimation methods and 3 goodness-of-fit tests, the Pearson 3 and generalized extreme value distributions were found to give optimal results.
Entropy is a key measure in information theory and its many applications. Campbell was the first to recognize that the exponential of Shannon’s entropy is just the size of the sample space when the distribution is uniform. The idea here is to study, for a general probability distribution, the exponentials of Shannon’s entropy and of those entropy generalizations that involve the logarithmic function. In this paper, we introduce a measure of a sample space, called the ‘entropic measure of a sample space’, with respect to the underlying distribution. It is shown in both the discrete and continuous cases that this new measure depends on the parameters of the distribution on the sample space: the same sample space has different ‘entropic measures’ depending on the distributions defined on it. Campbell’s idea is noted to apply to Rényi’s parametric entropy of a given order as well. Since parameters play a role in providing suitable choices and extended applications, the paper also studies parametric entropic measures of sample spaces. Exponential entropies related to Shannon’s entropy and to those generalizations that involve logarithmic functions, i.e. are additive, have been studied for wider understanding and applications. We propose and study exponential entropies corresponding to non-additive entropies of type (α, β), which include the Havrda and Charvát entropy as a special case.
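Campbell's observation, as stated above, is easy to check numerically for the discrete case. The sketch below uses the standard definitions of Shannon entropy and Rényi entropy of order α with natural logarithms; the function names are hypothetical and are not taken from the paper.

```python
import math

def shannon_entropy(probs):
    # Shannon entropy in nats.
    return -sum(p * math.log(p) for p in probs if p > 0)

def renyi_entropy(probs, alpha):
    # Rényi entropy of order alpha (alpha > 0, alpha != 1), in nats.
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)

def entropic_size(probs):
    # Exponential of Shannon entropy: an "effective" sample-space size.
    return math.exp(shannon_entropy(probs))

uniform = [1 / 6] * 6
skewed = [0.5, 0.25, 0.25]

print(entropic_size(uniform))                 # close to 6 for a fair die
print(math.exp(renyi_entropy(uniform, 2.0)))  # also close to 6
print(entropic_size(skewed))                  # strictly below 3
```

For a uniform distribution both exponentials recover the number of outcomes, while a non-uniform distribution on the same sample space yields a smaller value, which is the dependence on the distribution that the abstract describes.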
Missing values are common in real-world data. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, in this paper we define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. Under this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, the value of one of the coordinates is missing, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because the performance of the k nearest neighbor (kNN) classifier depends strongly on the chosen distance measure, we used this classifier to evaluate how accurately our distance reflects object similarity. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance with three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.
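For orientation, the Bhattacharyya distance mentioned above has a closed form for two univariate normal distributions. The sketch below shows only that generic formula; it is not the authors' distance function for missing values, and the function name and parameters are hypothetical.

```python
import math

def bhattacharyya_normal(mu1, sigma1, mu2, sigma2):
    # Bhattacharyya distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).
    var_sum = sigma1 ** 2 + sigma2 ** 2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum
    spread_term = 0.5 * math.log(var_sum / (2.0 * sigma1 * sigma2))
    return mean_term + spread_term

print(bhattacharyya_normal(0.0, 1.0, 0.0, 1.0))  # identical distributions
print(bhattacharyya_normal(0.0, 1.0, 2.0, 1.0))  # shifted mean
print(bhattacharyya_normal(0.0, 1.0, 0.0, 3.0))  # different spread
```

The distance is zero for identical distributions and grows with differences in either the means or the standard deviations.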
Despite the availability of natural-disaster-related time series data for the last 110 years, no forecasting tool is available to humanitarian relief organizations for producing forecasts for emergency logistics planning. This study develops a forecasting tool based on identifying probability distributions. The estimates of the distribution parameters are used to calculate natural disaster forecasts. Further, the determination of aggregate forecasts leads to efficient pre-disaster planning. Based on the research findings, relief agencies can optimize the allocation of resources in emergency logistics planning.
This paper deals with condition monitoring of electric switch machines for railway points. A point machine, a complex electro-mechanical device, switches the track between two alternative routes. There has been increasing interest in railway safety and in the optimal management of maintenance for railway equipment such as point machines, in order to enhance railway service quality and reduce system failures. This paper explores the use of the Kolmogorov-Smirnov (K-S) test to detect certain point failures (external to the machine: slide chairs, fixings, stretchers, etc.) while the point machine itself (inside the machine) is in proper condition. Time-domain stator current signatures of normal (healthy) and faulty points are acquired by 3 Hall effect sensors and analyzed with the K-S test. The test is carried out by creating three types of such failures: placing a hard stone or a soft stone between the stock rail and the switch blades as an obstacle, and slide chair friction. The results for these three faults show that the K-S test can be developed effectively to detect other point failures whose current signatures deviate parametrically from the healthy current signature. As an analysis technique, the K-S test assumes that any defect has a specific probability distribution. Empirical cumulative distribution functions (ECDFs) are used to differentiate these probability distributions. The test is based on the null hypothesis that the ECDF of the target distribution is statistically similar to the ECDF of the reference distribution. Therefore, by comparing a given current signature (the target signal) from an unknown switch state with a number of template signatures (reference signals) from known switch states, it is possible to identify the most likely state of the point machine under analysis.
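The template comparison described above can be sketched with the two-sample K-S statistic, the largest gap between two ECDFs. This is a minimal illustration under stated assumptions: it picks the template with the smallest statistic rather than running a full hypothesis test, and the state labels and current values are hypothetical, not measurements from the paper.

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    # Fraction of the sample that is less than or equal to x.
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def ks_statistic(a, b):
    # Two-sample K-S statistic: maximum absolute difference between ECDFs,
    # evaluated at every observed point of either sample.
    sa, sb = sorted(a), sorted(b)
    return max(abs(ecdf(sa, x) - ecdf(sb, x)) for x in sa + sb)

def closest_state(target, templates):
    # Label of the template whose ECDF is closest to the target's ECDF.
    return min(templates, key=lambda label: ks_statistic(target, templates[label]))

# Hypothetical stator current samples (amperes) for two known switch states.
templates = {
    "healthy": [1.00, 1.10, 0.90, 1.05, 0.95],
    "obstructed": [1.80, 2.00, 1.90, 2.20, 2.10],
}
target = [1.02, 0.98, 1.08, 0.93, 1.00]
print(closest_state(target, templates))
```

A statistic near 0 means the two current signatures have similar distributions, and a statistic of 1 means their value ranges do not overlap at all.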
In view of their importance and usefulness in reliability theory and probability distributions, several generalizations of the inverse Gaussian distribution and the Krätzel function have been investigated in recent years. This has motivated the authors to introduce and study a new generalization of the inverse Gaussian distribution and the Krätzel function, associated with a product of a Bessel function of the third kind K_ν(z) and a Z - Fox-Wright generalized hypergeometric function introduced in this paper. The introduced function turns out to be a unified gamma-type function. Its incomplete forms are also discussed, and several properties of this gamma-type function are obtained. By means of this generalized function, we introduce a generalization of the inverse Gaussian distribution, which is useful in reliability analysis, diffusion processes, radio techniques, etc. The inverse Gaussian distribution thus introduced also provides a generalization of the Krätzel function. Some basic statistical functions associated with this probability density function, such as the moments, the Mellin transform, the moment generating function, the hazard rate function, and the mean residual life function, are also obtained. Keywords: Fox-Wright function, inverse Gaussian distribution, Krätzel function, Bessel function of the third kind.
In this paper we explore the application of a formal proof system to verification problems in cryptography. Cryptographic properties concerning the correctness or security of cryptographic algorithms are of great interest. Besides some basic lemmata, we explore an implementation of a complex function that is used in cryptography. More precisely, we describe formal properties of this implementation that we prove by computer. We describe formalized probability distributions (σ-algebras, probability spaces and conditional probabilities). These are given in the formal language of the formal proof system Isabelle/HOL. Moreover, we prove Bayes' Formula by computer. In addition, we describe an application of the presented formalized probability distributions to cryptography. Furthermore, this paper shows that computer proofs of complex cryptographic functions are possible by presenting an implementation of the Miller-Rabin primality test that admits formal verification. Our achievements are a step towards computer verification of cryptographic primitives and provide a basis for computer verification in cryptography. Computer verification can be applied to further problems in cryptographic research if the corresponding basic mathematical knowledge is available in a database.
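For readers unfamiliar with the Miller-Rabin primality test mentioned above, a minimal Python sketch follows. It is not the Isabelle/HOL implementation from the paper. It uses a fixed set of prime bases, a common choice that makes the test deterministic for 64-bit inputs; the function name is hypothetical.

```python
def is_prime_miller_rabin(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    # Handle small cases and multiples of the (prime) bases directly.
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base does not witness compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # base a proves that n is composite
    return True

print(is_prime_miller_rabin(2**61 - 1))   # a Mersenne prime
print(is_prime_miller_rabin(3215031751))  # composite: 151 * 751 * 28351
```

The second example is a composite number that passes the test for the bases 2, 3, 5 and 7 alone, which is why more than a handful of bases is used here.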