Current Search: Vector Quantization (x)
View All Items
- Title
- SPEAKER IDENTIFICATION BASED ON DISCRIMINATIVE VECTOR QUANTIZATION AND DATA FUSION.
- Creator
-
Zhou, Guangyu, Mikhael, Wasfy, University of Central Florida
- Abstract / Description
-
Speaker Identification (SI) approaches based on discriminative Vector Quantization (VQ) and data fusion techniques are presented in this dissertation. The SI approaches based on Discriminative VQ (DVQ) proposed in this dissertation are the DVQ for SI (DVQSI), the DVQSI with Unique speech feature vector space segmentation for each speaker pair (DVQSI-U), and the Adaptive DVQSI (ADVQSI) methods. The difference of the probability distributions of the speech feature vector sets from various...
Show moreSpeaker Identification (SI) approaches based on discriminative Vector Quantization (VQ) and data fusion techniques are presented in this dissertation. The SI approaches based on Discriminative VQ (DVQ) proposed in this dissertation are the DVQ for SI (DVQSI), the DVQSI with Unique speech feature vector space segmentation for each speaker pair (DVQSI-U), and the Adaptive DVQSI (ADVQSI) methods. The difference of the probability distributions of the speech feature vector sets from various speakers (or speaker groups) is called the interspeaker variation between speakers (or speaker groups). The interspeaker variation is the measure of template differences between speakers (or speaker groups). All DVQ based techniques presented in this contribution take advantage of the interspeaker variation, which are not exploited in the previous proposed techniques by others that employ traditional VQ for SI (VQSI). All DVQ based techniques have two modes, the training mode and the testing mode. In the training mode, the speech feature vector space is first divided into a number of subspaces based on the interspeaker variations. Then, a discriminative weight is calculated for each subspace of each speaker or speaker pair in the SI group based on the interspeaker variation. The subspaces with higher interspeaker variations play more important roles in SI than the ones with lower interspeaker variations by assigning larger discriminative weights. In the testing mode, discriminative weighted average VQ distortions instead of equally weighted average VQ distortions are used to make the SI decision. The DVQ based techniques lead to higher SI accuracies than VQSI. DVQSI and DVQSI-U techniques consider the interspeaker variation for each speaker pair in the SI group. In DVQSI, speech feature vector space segmentations for all the speaker pairs are exactly the same. However, each speaker pair of DVQSI-U is treated individually in the speech feature vector space segmentation. In both DVQSI and DVQSI-U, the discriminative weights for each speaker pair are calculated by trial and error. The SI accuracies of DVQSI-U are higher than those of DVQSI at the price of much higher computational burden. ADVQSI explores the interspeaker variation between each speaker and all speakers in the SI group. In contrast with DVQSI and DVQSI-U, in ADVQSI, the feature vector space segmentation is for each speaker instead of each speaker pair based on the interspeaker variation between each speaker and all the speakers in the SI group. Also, adaptive techniques are used in the discriminative weights computation for each speaker in ADVQSI. The SI accuracies employing ADVQSI and DVQSI-U are comparable. However, the computational complexity of ADVQSI is much less than that of DVQSI-U. Also, a novel algorithm to convert the raw distortion outputs of template-based SI classifiers into compatible probability measures is proposed in this dissertation. After this conversion, data fusion techniques at the measurement level can be applied to SI. In the proposed technique, stochastic models of the distortion outputs are estimated. Then, the posteriori probabilities of the unknown utterance belonging to each speaker are calculated. Compatible probability measures are assigned based on the posteriori probabilities. The proposed technique leads to better SI performance at the measurement level than existing approaches.
Show less - Date Issued
- 2005
- Identifier
- CFE0000720, ucf:46621
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000720
- Title
- On Distributed Estimation for Resource Constrained Wireless Sensor Networks.
- Creator
-
Sani, Alireza, Vosoughi, Azadeh, Rahnavard, Nazanin, Wei, Lei, Atia, George, Chatterjee, Mainak, University of Central Florida
- Abstract / Description
-
We study Distributed Estimation (DES) problem, where several agents observe a noisy version of an underlying unknown physical phenomena (which is not directly observable), and transmit a compressed version of their observations to a Fusion Center (FC), where collective data is fused to reconstruct the unknown. One of the most important applications of Wireless Sensor Networks (WSNs) is performing DES in a field to estimate an unknown signal source. In a WSN battery powered geographically...
Show moreWe study Distributed Estimation (DES) problem, where several agents observe a noisy version of an underlying unknown physical phenomena (which is not directly observable), and transmit a compressed version of their observations to a Fusion Center (FC), where collective data is fused to reconstruct the unknown. One of the most important applications of Wireless Sensor Networks (WSNs) is performing DES in a field to estimate an unknown signal source. In a WSN battery powered geographically distributed tiny sensors are tasked with collecting data from the field. Each sensor locally processes its noisy observation (local processing can include compression,dimension reduction, quantization, etc) and transmits the processed observation over communication channels to the FC, where the received data is used to form a global estimate of the unknown source such that the Mean Square Error (MSE) of the DES is minimized. The accuracy of DES depends on many factors such as intensity of observation noises in sensors, quantization errors in sensors, available power and bandwidth of the network, quality of communication channels between sensors and the FC, and the choice of fusion rule in the FC. Taking into account all of these contributing factors and implementing a DES system which minimizes the MSE and satisfies all constraints is a challenging task. In order to probe into different aspects of this challenging task we identify and formulate the following three problems and address them accordingly:1- Consider an inhomogeneous WSN where the sensors' observations is modeled linear with additive Gaussian noise. The communication channels between sensors and FC are orthogonal power and bandwidth-constrained erroneous wireless fading channels. The unknown to be estimated is a Gaussian vector. Sensors employ uniform multi-bit quantizers and BPSK modulation. Given this setup, we ask: what is the best fusion rule in the FC? what is the best transmit power and quantization rate (measured in bits per sensor) allocation schemes that minimize the MSE? In order to answer these questions, we derive some upper bounds on global MSE and through minimizing those bounds, we propose various resource allocation schemes for the problem, through which we investigate the effect of contributing factors on the MSE.2- Consider an inhomogeneous WSN with an FC which is tasked with estimating a scalar Gaussian unknown. The sensors are equipped with uniform multi-bit quantizers and the communication channels are modeled as Binary Symmetric Channels (BSC). In contrast to former problem the sensors experience independent multiplicative noises (in addition to additive noise). The natural question in this scenario is: how does multiplicative noise affect the DES system performance? how does it affect the resource allocation for sensors, with respect to the case where there is no multiplicative noise? We propose a linear fusion rule in the FC and derive the associated MSE in closed-form. We propose several rate allocation schemes with different levels of complexity which minimize the MSE. Implementing the proposed schemes lets us study the effect of multiplicative noise on DES system performance and its dynamics. We also derive Bayesian Cramer-Rao Lower Bound (BCRLB) and compare the MSE performance of our porposed methods against the bound.As a dual problem we also answer the question: what is the minimum required bandwidth of thenetwork to satisfy a predetermined target MSE?3- Assuming the framework of Bayesian DES of a Gaussian unknown with additive and multiplicative Gaussian noises involved, we answer the following question: Can multiplicative noise improve the DES performance in any case/scenario? the answer is yes, and we call the phenomena as 'enhancement mode' of multiplicative noise. Through deriving different lower bounds, such as BCRLB,Weiss-Weinstein Bound (WWB), Hybrid CRLB (HCRLB), Nayak Bound (NB), Yatarcos Bound (YB) on MSE, we identify and characterize the scenarios that the enhancement happens. We investigate two situations where variance of multiplicative noise is known and unknown. Wealso compare the performance of well-known estimators with the derived bounds, to ensure practicability of the mentioned enhancement modes.
Show less - Date Issued
- 2017
- Identifier
- CFE0006913, ucf:51698
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006913