International Science Index

11
10010982
Speaker Recognition Using LIRA Neural Networks
Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.

Paper Detail
23
downloads
10
10000163
Smart Help at theWorkplace for Persons with Disabilities (SHW-PWD)
Abstract:

The Smart Help for persons with disability (PWD) is a part of the project SMARTDISABLE which aims to develop relevant solution for PWD that target to provide an adequate workplace environment for them. It would support PWD needs smartly through smart help to allow them access to relevant information and communicate with other effectively and flexibly, and smart editor that assist them in their daily work. It will assist PWD in knowledge processing and creation as well as being able to be productive at the work place. The technical work of the project involves design of a technological scenario for the Ambient Intelligence (AmI) - based assistive technologies at the workplace consisting of an integrated universal smart solution that suits many different impairment conditions and will be designed to empower the Physically disabled persons (PDP) with the capability to access and effectively utilize the ICTs in order to execute knowledge rich working tasks with minimum efforts and with sufficient comfort level. The proposed technology solution for PWD will support voice recognition along with normal keyboard and mouse to control the smart help and smart editor with dynamic auto display interface that satisfies the requirements for different PWD group. In addition, a smart help will provide intelligent intervention based on the behavior of PWD to guide them and warn them about possible misbehavior. PWD can communicate with others using Voice over IP controlled by voice recognition. Moreover, Auto Emergency Help Response would be supported to assist PWD in case of emergency. This proposed technology solution intended to make PWD very effective at the work environment and flexible using voice to conduct their tasks at the work environment. The proposed solution aims to provide favorable outcomes that assist PWD at the work place, with the opportunity to participate in PWD assistive technology innovation market which is still small and rapidly growing as well as upgrading their quality of life to become similar to the normal people at the workplace. Finally, the proposed smart help solution is applicable in all workplace setting, including offices, manufacturing, hospital, etc.

Paper Detail
2141
downloads
9
17415
Development of a Computer Vision System for the Blind and Visually Impaired Person
Abstract:

Eyes are an essential and conspicuous organ of the human body. Human eyes are outward and inward portals of the body that allows to see the outside world and provides glimpses into ones inner thoughts and feelings. Inevitable blindness and visual impairments may results from eye-related disease, trauma, or congenital or degenerative conditions that cannot be corrected by conventional means. The study emphasizes innovative tools that will serve as an aid to the blind and visually impaired (VI) individuals. The researchers fabricated a prototype that utilizes the Microsoft Kinect for Windows and Arduino microcontroller board. The prototype facilitates advanced gesture recognition, voice recognition, obstacle detection and indoor environment navigation. Open Computer Vision (OpenCV) performs image analysis, and gesture tracking to transform Kinect data to the desired output. A computer vision technology device provides greater accessibility for those with vision impairments.

Paper Detail
3084
downloads
8
16406
Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction
Abstract:

Distant-talking voice-based HCI system suffers from performance degradation due to mismatch between the acoustic speech (runtime) and the acoustic model (training). Mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, affecting speech dynamics inside the room before reaching the microphones. Moreover, as the speech signal is reflected, its acoustical characteristic is also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach in dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room property are used to predict the optimal distance of the speech source. Consequently, pre-computed statistic priors corresponding to the optimal distance is selected to correct the statistics of the generic model which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results to an improved likelihood in predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows voice recognition performance using our method is more robust to the change in distance compared to the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved 24.2% improvement in recognition performance against the best-performing conventional method.

Paper Detail
1500
downloads
7
9982
Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language
Abstract:
Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.
Paper Detail
3719
downloads
6
5405
Accent Identification by Clustering and Scoring Formants
Abstract:
There have been significant improvements in automatic voice recognition technology. However, existing systems still face difficulties, particularly when used by non-native speakers with accents. In this paper we address a problem of identifying the English accented speech of speakers from different backgrounds. Once an accent is identified the speech recognition software can utilise training set from appropriate accent and therefore improve the efficiency and accuracy of the speech recognition system. We introduced the Q factor, which is defined by the sum of relationships between frequencies of the formants. Four different accents were considered and experimented for this research. A scoring method was introduced in order to effectively analyse accents. The proposed concept indicates that the accent could be identified by analysing their formants.
Paper Detail
1658
downloads
5
5268
Search Engine Module in Voice Recognition Browser to Facilitate the Visually Impaired in Virtual Learning (MGSYS VISI-VL)
Abstract:
Nowadays, web-based technologies influence in people-s daily life such as in education, business and others. Therefore, many web developers are too eager to develop their web applications with fully animation graphics and forgetting its accessibility to its users. Their purpose is to make their web applications look impressive. Thus, this paper would highlight on the usability and accessibility of a voice recognition browser as a tool to facilitate the visually impaired and blind learners in accessing virtual learning environment. More specifically, the objectives of the study are (i) to explore the challenges faced by the visually impaired learners in accessing virtual learning environment (ii) to determine the suitable guidelines for developing a voice recognition browser that is accessible to the visually impaired. Furthermore, this study was prepared based on an observation conducted with the Malaysian visually impaired learners. Finally, the result of this study would underline on the development of an accessible voice recognition browser for the visually impaired.
Paper Detail
1701
downloads
4
8564
Coupled Dynamics in Host-Guest Complex Systems Duplicates Emergent Behavior in the Brain
Abstract:

The ability of the brain to organize information and generate the functional structures we use to act, think and communicate, is a common and easily observable natural phenomenon. In object-oriented analysis, these structures are represented by objects. Objects have been extensively studied and documented, but the process that creates them is not understood. In this work, a new class of discrete, deterministic, dissipative, host-guest dynamical systems is introduced. The new systems have extraordinary self-organizing properties. They can host information representing other physical systems and generate the same functional structures as the brain does. A simple mathematical model is proposed. The new systems are easy to simulate by computer, and measurements needed to confirm the assumptions are abundant and readily available. Experimental results presented here confirm the findings. Applications are many, but among the most immediate are object-oriented engineering, image and voice recognition, search engines, and Neuroscience.

Paper Detail
1762
downloads
3
4967
Voice Command Recognition System Based on MFCC and VQ Algorithms
Abstract:
The goal of this project is to design a system to recognition voice commands. Most of voice recognition systems contain two main modules as follow “feature extraction" and “feature matching". In this project, MFCC algorithm is used to simulate feature extraction module. Using this algorithm, the cepstral coefficients are calculated on mel frequency scale. VQ (vector quantization) method will be used for reduction of amount of data to decrease computation time. In the feature matching stage Euclidean distance is applied as similarity criterion. Because of high accuracy of used algorithms, the accuracy of this voice command system is high. Using these algorithms, by at least 5 times repetition for each command, in a single training session, and then twice in each testing session zero error rate in recognition of commands is achieved.
Paper Detail
2638
downloads
2
8832
Through Biometric Card in Romania: Person Identification by Face, Fingerprint and Voice Recognition
Abstract:
In this paper three different approaches for person verification and identification, i.e. by means of fingerprints, face and voice recognition, are studied. Face recognition uses parts-based representation methods and a manifold learning approach. The assessment criterion is recognition accuracy. The techniques under investigation are: a) Local Non-negative Matrix Factorization (LNMF); b) Independent Components Analysis (ICA); c) NMF with sparse constraints (NMFsc); d) Locality Preserving Projections (Laplacianfaces). Fingerprint detection was approached by classical minutiae (small graphical patterns) matching through image segmentation by using a structural approach and a neural network as decision block. As to voice / speaker recognition, melodic cepstral and delta delta mel cepstral analysis were used as main methods, in order to construct a supervised speaker-dependent voice recognition system. The final decision (e.g. “accept-reject" for a verification task) is taken by using a majority voting technique applied to the three biometrics. The preliminary results, obtained for medium databases of fingerprints, faces and voice recordings, indicate the feasibility of our study and an overall recognition precision (about 92%) permitting the utilization of our system for a future complex biometric card.
Paper Detail
1605
downloads
1
4291
A Supervised Text-Independent Speaker Recognition Approach
Authors:
Abstract:

We provide a supervised speech-independent voice recognition technique in this paper. In the feature extraction stage we propose a mel-cepstral based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, and a minimum mean distance classifier.

Paper Detail
1520
downloads