This article reports our investigation of voice recognition. We created a voice database containing different phrases in two languages, English and Spanish, spoken by men and women. As the classifier, we selected the LIRA (Limited Receptive Area) grayscale neural classifier, which was developed for image recognition tasks and demonstrated good results there. We therefore decided to build a voice recognition system around this classifier. The system recognizes a speaker’s voice from a specific set of speakers: it takes spectrograms of the voice signals as input, extracts their characteristics, and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or in smart buildings with different types of intelligent devices.
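To illustrate how a voice signal can become image-like input for a grayscale classifier, here is a minimal sketch of a magnitude spectrogram computed with a plain short-time DFT. The frame length, hop size, Hann window, linear 0 to 255 scaling, and the function names `spectrogram` and `to_grayscale` are all assumptions made for illustration; the abstract does not state how its spectrograms were produced, and the LIRA classifier itself is not shown.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..frame_len/2).

    Illustrative parameters only: frame_len and hop are not taken from the article.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of one bin; fine for a sketch, an FFT would be used in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        rows.append(row)
    return rows


def to_grayscale(spec):
    """Scale magnitudes linearly to integer pixel values in 0..255 (assumed mapping)."""
    peak = max(max(row) for row in spec) or 1.0
    return [[round(255 * value / peak) for value in row] for row in spec]
```

For example, a pure tone with 8 cycles per 64-sample frame produces a spectrogram whose brightest pixel in every frame sits in frequency bin 8. The resulting integer matrix is the kind of grayscale image an image classifier could consume.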
Smart Help for persons with disabilities (PWD) is part of the SMARTDISABLE project, which aims to develop relevant solutions that provide PWD with an adequate workplace environment. It would support the needs of PWD through a smart help component, which allows them to access relevant information and communicate with others effectively and flexibly, and a smart editor, which assists them in their daily work. It will assist PWD in knowledge processing and creation and help them be productive at the workplace. The technical work of the project involves designing a technological scenario for Ambient Intelligence (AmI)-based assistive technologies at the workplace. The scenario consists of an integrated universal smart solution that suits many different impairment conditions and is designed to empower physically disabled persons (PDP) to access and effectively use ICTs in order to execute knowledge-rich working tasks with minimum effort and a sufficient level of comfort. The proposed technology solution will support voice recognition, along with a normal keyboard and mouse, to control the smart help and smart editor through a dynamic auto-display interface that satisfies the requirements of different PWD groups. In addition, the smart help will provide intelligent intervention based on the behavior of PWD, guiding them and warning them about possible misbehavior. PWD can communicate with others using Voice over IP controlled by voice recognition. Moreover, an Auto Emergency Help Response would be supported to assist PWD in case of emergency. The proposed technology solution is intended to make PWD effective and flexible at work by letting them use voice to conduct their tasks.
The proposed solution aims to provide favorable outcomes that assist PWD at the workplace, to offer an opportunity to participate in the PWD assistive technology innovation market, which is still small but rapidly growing, and to improve the quality of working life of PWD to a level comparable to that of other workers. Finally, the proposed smart help solution is applicable in all workplace settings, including offices, manufacturing sites, and hospitals.
Eyes are an essential and conspicuous organ of the human body. They are outward and inward portals of the body: they allow us to see the outside world and provide glimpses into one's inner thoughts and feelings. Blindness and visual impairment may result from eye-related disease, trauma, or congenital or degenerative conditions that cannot be corrected by conventional means. This study emphasizes innovative tools that serve as aids for blind and visually impaired (VI) individuals. The researchers fabricated a prototype that uses Microsoft Kinect for Windows and an Arduino microcontroller board. The prototype facilitates advanced gesture recognition, voice recognition, obstacle detection, and indoor environment navigation. The Open Source Computer Vision Library (OpenCV) performs image analysis and gesture tracking to transform Kinect data into the desired output. Such a computer vision device provides greater accessibility for people with vision impairments.
Distant-talking voice-based HCI systems suffer from performance degradation due to a mismatch between the acoustic speech (runtime) and the acoustic model (training). The mismatch is caused by a change in the power of the speech signal as observed at the microphones. This change is strongly influenced by the change in distance, which affects speech dynamics inside the room before the signal reaches the microphones. Moreover, as the speech signal is reflected, its acoustic characteristics are also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach to distance-induced mismatch that intelligently senses instantaneous voice power variation and compensates the model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room properties, are used to predict the optimal distance of the speech source. Pre-computed statistic priors corresponding to the optimal distance are then selected to correct the statistics of the generic model, which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of the instantaneous speech acoustics at runtime. This results in an improved likelihood of predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows that voice recognition performance using our method is more robust to changes in distance than the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved a 24.2% improvement in recognition performance over the best-performing conventional method.
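The distance selection step described above can be sketched as a maximum-likelihood choice among per-distance GMMs. This is only an illustration under strong assumptions: the features are one-dimensional log-power values, each hypothetical model is a list of `(weight, mean, variance)` components, and the "statistic prior" is reduced to a single additive offset applied to the generic model means. None of these details, nor the names `gmm_loglik`, `select_distance`, and `compensate_means`, come from the abstract.

```python
import math


def gmm_loglik(frames, components):
    """Total log-likelihood of 1-D frames under a GMM given as (weight, mean, variance)."""
    total = 0.0
    for x in frames:
        density = sum(
            w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in components
        )
        # Floor the density so a far-off frame cannot trigger log(0).
        total += math.log(max(density, 1e-300))
    return total


def select_distance(frames, models):
    """Pick the distance whose GMM explains the observed frames best."""
    return max(models, key=lambda distance: gmm_loglik(frames, models[distance]))


def compensate_means(generic_means, offset):
    """Assumed compensation: shift the frozen generic means by a pre-computed offset."""
    return [m + offset for m in generic_means]


# Hypothetical per-distance models (metres -> GMM over log-power in dB).
models = {
    0.5: [(1.0, -10.0, 4.0)],
    1.5: [(1.0, -20.0, 4.0)],
    2.5: [(0.6, -30.0, 4.0), (0.4, -26.0, 9.0)],
}
# Hypothetical pre-computed offsets, one per distance.
priors = {0.5: 0.0, 1.5: -10.0, 2.5: -20.0}

observed = [-29.0, -31.0, -27.5]
best = select_distance(observed, models)
adapted = compensate_means([-10.0, -8.0], priors[best])
```

The generic model stays untouched; only a copy of its statistics is shifted at runtime, which mirrors the idea of post-conditioning a model that was frozen during training.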
The ability of the brain to organize information and generate the functional structures we use to act, think, and communicate is a common and easily observable natural phenomenon. In object-oriented analysis, these structures are represented by objects. Objects have been extensively studied and documented, but the process that creates them is not understood. In this work, a new class of discrete, deterministic, dissipative, host-guest dynamical systems is introduced. The new systems have extraordinary self-organizing properties. They can host information representing other physical systems and generate the same functional structures as the brain does. A simple mathematical model is proposed. The new systems are easy to simulate by computer, and the measurements needed to confirm the assumptions are abundant and readily available. Experimental results presented here confirm the findings. Applications are many, but among the most immediate are object-oriented engineering, image and voice recognition, search engines, and neuroscience.
In this paper, we present a supervised, speech-independent voice recognition technique. For the feature extraction stage, we propose a mel-cepstral-based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, together with a minimum mean distance classifier.
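The classification idea can be sketched as follows, with two caveats. First, the abstract's "special nonlinear metric" is not specified, so the classical Hausdorff distance between finite sets is used here as a stand-in. Second, feature extraction is skipped: each utterance is assumed to already be a set of feature vectors (for example mel-cepstral coefficients), and the tiny 2-D vectors below are made-up data. The names `hausdorff` and `classify` are illustrative.

```python
import math
from statistics import mean


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(x, y) for y in b) for x in a)


def hausdorff(a, b):
    """Classical Hausdorff distance between two finite sets of vectors."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def classify(sample, training):
    """Minimum mean distance rule.

    training maps each speaker to a list of reference utterances, each a set of
    feature vectors. The speaker with the smallest mean distance to the sample wins.
    """
    scores = {
        speaker: mean(hausdorff(sample, ref) for ref in references)
        for speaker, references in training.items()
    }
    return min(scores, key=scores.get), scores


# Made-up 2-D "feature vectors" standing in for mel-cepstral features.
training = {
    "alice": [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]],
    "bob": [[(5.0, 5.0), (6.0, 5.0)]],
}
label, scores = classify([(0.0, 0.5), (1.0, 0.5)], training)
```

Averaging over all reference utterances of a speaker, rather than taking the single nearest one, makes the decision less sensitive to one unusual training recording.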