Speech processing
 
Figure panels: /r/+α(/r/-/l/) | /r/ | ½(/r/+/l/) | /l/ | /l/+α(/l/-/r/)

The objective of this work is to develop speech processing tools that highlight spectral differences between similar phonemes. Such tools have applications in second-language (L2) learning. For example, Japanese speakers have trouble discriminating the English phones /r/ and /l/ (e.g., 'rock' vs. 'lock') because this contrast does not exist in their native language. Emphasizing the spectral differences between these phonemes can help the L2 learner focus on the spectral cues that carry discriminatory information.
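The panel labels of the form /r/+α(/r/-/l/) suggest a simple way to express this emphasis: push one phoneme's spectrum further away from the other's. The following is a minimal sketch of that idea, not the lab's implementation. It assumes each phoneme is summarized by a spectral envelope sampled at the same frequency bins; the function names and the 4-bin envelopes are hypothetical and chosen only for illustration.

```python
def exaggerate(target, contrast, alpha):
    """Return target + alpha * (target - contrast), bin by bin."""
    if len(target) != len(contrast):
        raise ValueError("spectra must have the same number of bins")
    return [t + alpha * (t - c) for t, c in zip(target, contrast)]


def midpoint(a, b):
    """Return the average of two spectra, i.e. 1/2 * (a + b)."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bins")
    return [0.5 * (x + y) for x, y in zip(a, b)]


# Hypothetical 4-bin envelopes (illustrative values, not measured data).
r_env = [1.0, 2.0, 4.0, 1.0]
l_env = [1.0, 2.0, 2.0, 3.0]

r_exaggerated = exaggerate(r_env, l_env, alpha=0.5)  # /r/ + a(/r/ - /l/)
l_exaggerated = exaggerate(l_env, r_env, alpha=0.5)  # /l/ + a(/l/ - /r/)
ambiguous = midpoint(r_env, l_env)                   # 1/2 (/r/ + /l/)
```

Bins where the two envelopes agree are left unchanged, while bins where they differ move further apart as α grows, which is the sense in which the discriminatory cues are emphasized.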

 
Face recognition
 

The objective of this research is to determine the extent to which caricatures improve face recognition by humans. To achieve this objective, we are developing algorithms that generate 3D caricatures of an individual face from a single frontal image. We also conduct perceptual studies to determine whether exaggerating distinctive facial features helps people memorize and recognize faces. A second objective of this research is to study the application of caricaturization algorithms to machine-learning problems, specifically in the area of automatic face recognition.
 
Wearable sensors
 
The objective of this work is to develop a wearable/wireless sensor system that allows us to monitor the physiology, activity, and context of a user on a 24-hour basis. Our hardware design is small and portable to allow users to carry out their normal daily activities. Data processing is based on a combination of nonlinear system identification and statistical pattern recognition techniques. Our long-term goal is to provide users with information and visualizations that allow them to gain a better understanding of their behavior and their habits. The figure shows the power spectral density of a user's respiration signal for a period of 50 minutes; in this example, increases in respiration rate correlate with physical exertion (the user drove, parked his car, walked to the office, and worked on the computer).
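As a rough illustration of how a respiration rate can be read from a power spectral density, the sketch below computes a plain periodogram of a synthetic breathing signal and reports the frequency of the strongest peak. It is a sketch under stated assumptions, not the processing used in the sensor system: the 4 Hz sampling rate, the 0.25 Hz sinusoid, and all function names are hypothetical, and no windowing or averaging is applied.

```python
import cmath
import math


def periodogram(signal, fs):
    """Return (frequencies in Hz, power) at the non-negative DFT frequencies."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def dominant_rate_bpm(signal, fs):
    """Return the strongest non-DC spectral peak in breaths per minute."""
    freqs, power = periodogram(signal, fs)
    peak = max(range(1, len(power)), key=power.__getitem__)
    return 60.0 * freqs[peak]


# Synthetic one-minute respiration trace: 0.25 Hz sampled at 4 Hz (assumed values).
fs = 4.0
resp = [math.sin(2 * math.pi * 0.25 * i / fs) for i in range(240)]
rate = dominant_rate_bpm(resp, fs)
```

Repeating the estimate over consecutive windows of a longer recording would give a time-frequency picture in which shifts of the peak track changes in respiration rate.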
 
Machine olfaction
 
We also work on several aspects of machine olfaction, ranging from instrumentation for metal-oxide chemoresistors to spatio-temporal coding in neuromorphic models of the olfactory pathway.
 

Infrared spectroscopy: The objective of this research is to develop a low-cost infrared (IR) absorption spectroscope based on linear variable filters (LVF). This instrument represents an alternative to electronic-nose devices based on cross-selective gas sensor arrays. Rather than relying on such arrays, the proposed instrument uses the concept of computational “pseudosensors,” where spectral lines in an analytical instrument are clustered into groups and used as independent variables. At the core of our system is an IR detector that combines an LVF and an array of 64 pyroelectric detectors (IR Microsystems). The LVF is a wedge-shaped interference filter, which provides a bank of transmission wavelengths from the thinnest end (short wavelengths) to the thickest (long wavelengths). The LVF sits atop the 64-pixel pyroelectric detector array, which produces a low-resolution spectrum of the transmitted radiation. Because of its particular bonds and molecular structures, each chemical species produces a unique IR spectrum, which can be used for analytical purposes.
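The pseudosensor idea can be sketched in a few lines. The example below averages contiguous channels of a 64-pixel spectrum into a small number of group responses. Grouping by adjacency and equal band width is an assumption made here for simplicity; the text only states that spectral lines are clustered into groups, not how. The function name and the ramp-shaped test spectrum are hypothetical.

```python
def pseudosensors(spectrum, n_groups):
    """Average contiguous channels into n_groups pseudosensor responses."""
    n = len(spectrum)
    if n_groups <= 0 or n % n_groups != 0:
        raise ValueError("channel count must be divisible by n_groups")
    width = n // n_groups
    return [sum(spectrum[i:i + width]) / width for i in range(0, n, width)]


# Hypothetical 64-channel spectrum (a simple ramp, for illustration only).
spectrum = [float(i) for i in range(64)]
features = pseudosensors(spectrum, 8)  # eight pseudosensors of 8 channels each
```

Each entry of `features` can then be treated as one independent variable for a pattern recognition stage, in the same way as the response of one sensor in a gas sensor array.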

 

 

Instrumentation/signal processing: We also investigate temperature modulation procedures to improve the sensitivity and selectivity of metal-oxide chemoresistors. The figure shows the transient response of a TGS sensor driven by a temperature ramp in the presence of allyl alcohol, tert-butanol and benzene at different concentrations. As shown in the figure, the shape of the transient response and the location of the conductance maxima contain information about the identity of the analyte, whereas the transient amplitude can be used to estimate its concentration.
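The separation between identity cues (shape, peak location) and concentration cues (amplitude) can be illustrated with a small feature extractor. This is a minimal sketch, assuming the transient is available as conductance samples taken at known heater temperatures; the function names and the sample values are hypothetical and do not come from the TGS measurements in the figure.

```python
def transient_features(temperatures, conductance):
    """Return (temperature at the conductance maximum, peak-to-baseline amplitude)."""
    if len(temperatures) != len(conductance):
        raise ValueError("one temperature is needed per conductance sample")
    peak = max(range(len(conductance)), key=conductance.__getitem__)
    amplitude = conductance[peak] - min(conductance)
    return temperatures[peak], amplitude


def normalized_shape(conductance):
    """Rescale a transient to the range [0, 1] so that only its shape remains."""
    lo, hi = min(conductance), max(conductance)
    if hi == lo:
        raise ValueError("a flat transient has no shape to normalize")
    return [(g - lo) / (hi - lo) for g in conductance]


# Hypothetical transients of one analyte at two concentrations (illustrative values).
ramp = [200, 250, 300, 350, 400]          # heater temperature, degrees C
low_conc = [0.1, 0.4, 0.9, 0.5, 0.2]
high_conc = [2.0 * g for g in low_conc]

peak_low, amp_low = transient_features(ramp, low_conc)
peak_high, amp_high = transient_features(ramp, high_conc)
```

In this toy case both transients peak at the same temperature and share the same normalized shape, while the amplitude doubles with the assumed concentration change.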

Sensory analysis: This is the grand challenge for machine olfaction: how to correlate instrumental data with the "gold standard," sensory analysis by a trained human panel. In collaboration with the University of Valladolid, we are developing pattern recognition methods to predict the organoleptic properties of Spanish red wines from gas, liquid and color sensor arrays. In collaboration with Duke University and NC State University, we have also used chemical sensors to evaluate biomaterials for odor abatement in swine facilities. The figure shows the correlation coefficients between human scores for irritation and pleasantness of biofiltered hog odors, and their leave-one-out predictions from sensor-array data.
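The evaluation described above (leave-one-out prediction followed by a correlation coefficient against the panel scores) can be sketched as follows. This is an illustrative simplification, not the pattern recognition method used in these studies: it assumes a single sensor feature and an ordinary least-squares line, and the data values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_out_predictions(xs, ys):
    """Predict each sample with a model fitted to all the other samples."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data: one sensor feature and one panel score per sample.
sensor_response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
panel_scores = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]

predicted = leave_one_out_predictions(sensor_response, panel_scores)
r = pearson(panel_scores, predicted)
```

Because every prediction comes from a model that never saw the sample being predicted, the resulting correlation is a less optimistic estimate than one computed on the training fit.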

Neuromorphic models: What are the key signal processing mechanisms in the olfactory system, and how can they be used to process data from chemical sensor arrays? Our investigations to date have explored spatial coding at the glomerular (GL) layer through chemotopic convergence of olfactory receptor neurons (ORN), and pattern completion through phase coding in the KIII neurodynamics system. The figure shows the spatial patterns of a 20x20 GL layer receiving projections from a population of 400,000 ORNs. The top ten images represent the GL pattern for ten different odors at a fixed concentration, whereas the lower ten images show the patterns for odor L1 and the mixture L1+L2 at five increasing concentrations. According to the model (and also to experimental results from neurobiology), odor quality is encoded by a unique spatial pattern across the GL layer, whereas odor intensity is captured by the intensity and the spread of this pattern.
 
Mobile robotics
 
Heterogeneous mobots: With funding from NSF and a donation from Applied Materials, PRISM is sponsoring a number of senior design projects in the area of mobile robotics. These projects address issues related to sensing, such as sensor fusion for dead-reckoning, acoustic navigation, odor plume tracking and multi-robot sensor networks. The figure shows a small robot homing in on a light source. The vehicle is able to communicate its findings to other robots in the network through an RF link. More information about these projects can be obtained from the CPSC 483 class webpage.

Omnidirectional imaging: We have developed a computational range sensor based on omnidirectional vision. The device employs a structure-from-motion strategy, where depth information is extracted from optical flow. These range estimates are then used to build a probabilistic map by means of certainty grids.
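To make the mapping step concrete, the sketch below accumulates evidence in a certainty grid using a log-odds Bayesian update. This is one common formulation of occupancy grids and is an assumption here; the text does not state which update rule the device uses. The class name, the grid size, and the per-reading occupancy probabilities (which would come from a sensor model applied to the range estimates) are hypothetical.

```python
import math


def logit(p):
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))


class CertaintyGrid:
    """2D grid that stores the log-odds of each cell being occupied."""

    def __init__(self, rows, cols, prior=0.5):
        self.prior_log_odds = logit(prior)
        self.log_odds = [[self.prior_log_odds] * cols for _ in range(rows)]

    def update(self, row, col, p_occupied):
        """Fuse one reading: add its log-odds relative to the prior."""
        self.log_odds[row][col] += logit(p_occupied) - self.prior_log_odds

    def probability(self, row, col):
        """Return the current occupancy probability of a cell."""
        return 1.0 / (1.0 + math.exp(-self.log_odds[row][col]))


grid = CertaintyGrid(rows=3, cols=4)
grid.update(1, 2, 0.7)  # two readings suggest an obstacle in cell (1, 2)
grid.update(1, 2, 0.7)
grid.update(0, 0, 0.3)  # one reading suggests cell (0, 0) is free
```

Working in log-odds turns the fusion of independent readings into a sum, so repeated weak evidence for the same cell gradually hardens into a confident estimate.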

 


Prior work: Global self-localization using sonar maps, Kohonen and multilayer perceptrons. Probabilistic models for sonar transducers using multilayer perceptrons. Probabilistic navigation using Bayesian inference and Partially Observable Markov Decision Processes. Perception and navigation using certainty grids. Low-level navigation and obstacle avoidance using potential fields. The figures show a global and a local map constructed from a sonar ring with the certainty grid sensor fusion approach of Moravec and Elfes.
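As an illustration of the potential-field approach mentioned above, the sketch below takes one navigation step under an attractive force toward the goal and repulsive forces from nearby obstacles. It follows the textbook formulation and is not taken from the prior work itself; the gains, the influence radius, the step length, and the function name are all hypothetical.

```python
import math


def potential_field_step(position, goal, obstacles,
                         k_att=1.0, k_rep=1.0, influence=2.0, step=0.1):
    """Move one fixed-length step along the combined force direction."""
    # Attractive force: proportional to the vector from the robot to the goal.
    fx = k_att * (goal[0] - position[0])
    fy = k_att * (goal[1] - position[1])
    # Repulsive forces: only obstacles closer than `influence` contribute.
    for ox, oy in obstacles:
        dx, dy = position[0] - ox, position[1] - oy
        dist = math.hypot(dx, dy)
        if 0.0 < dist < influence:
            magnitude = k_rep * (1.0 / dist - 1.0 / influence) / dist ** 2
            fx += magnitude * dx / dist
            fy += magnitude * dy / dist
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return position  # forces cancel out; the robot does not move
    return (position[0] + step * fx / norm, position[1] + step * fy / norm)


free_move = potential_field_step((0.0, 0.0), (1.0, 0.0), [])
avoiding = potential_field_step((0.0, 0.0), (1.0, 0.0), [(0.5, 0.2)])
```

The zero-force case hints at a known limitation of this approach: the robot can stall in a local minimum where attraction and repulsion balance, so the method is usually paired with a higher-level planner.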
 
Speech-driven facial animation
 

In collaboration with Professors Oscar Garcia and Ardy Goshtasbi (Wright State University), Anna Esposito (Second University of Naples, Italy) and Isaac Rudomin (Tec de Monterrey, Mexico), we have developed a "talking head" (a.k.a. voice puppet), a three-dimensional animation of a human head driven by a speech signal. The figure shows the basic building blocks of the system: audio processing, video tracking, audio-visual prediction and facial animation. For more information on this project, visit the SDFA webpage maintained by Praveen Kakumanu at Wright State.

 
 

PRISM | Computer Science | Dwight Look CoE | TAMU