The conventional speaker change detection (SCD) method using Bayesian Information Criterion (BIC) has been widely used. However, its performance relies on the choice of penalty factor and suffers from mass calculation. The twostep SCD is less time consuming but generates more detection errors. The limitation of conventional method’s performance originates from the two adjacent data windows. We propose a strategy that inserts an interval between the two adjacent fixed-size data windows in each analysis window. The dissimilarity value between the data windows is regarded as the probability of a speaker identity change within the interval area. Then this analysis window is slid along the audio by a large step to locate the areas where speaker change points may appear. Afterwards we only focus on these areas and locate precisely where the change points are. Other areas where a speaker change point unlikely appears are abandoned. The proposed method is computationally efficient and more robust to noise and penalty factor compared with conventional method. Evaluated on the corpus of China Central Television (CCTV) news, the proposed method obtains 74.18% reduction in calculation time and 22.24% improvement in F1-measure compared with the conventional approach.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.