Download the translation of the paper: Query-by-Example Spoken Term Detection for Out-of-Vocabulary (OOV) Terms

Persian title of the paper:	Spoken Term Detection for Out-of-Vocabulary (OOV) Terms
English title of the paper:	Query-by-Example Spoken Term Detection For OOV

Publication year	2009
Pages in the English paper	6
Pages in the translation	15
Journal	Automatic Speech Recognition
University	Johns Hopkins University, USA
Keywords	–
Publisher	IEEE

Excerpt from the translation:

The goal of spoken term detection (STD) technology is to allow searching for terms in large collections of speech content. In this paper, we address cases in which the search terms of interest (queries) are acoustic examples. Such an example is obtained either by identifying the region of interest in a speech stream or by speaking the query term. Queries often relate to named entities and foreign words, which typically have poor coverage in the vocabulary of large vocabulary continuous speech recognition (LVCSR) systems. Throughout this paper, we focus on query-by-example search for such out-of-vocabulary (OOV) terms. We build upon a finite state transducer (FST) based indexing and search system [1] to address query-by-example search for OOV terms by representing both the query and the index as phonetic lattices from the output of an LVCSR system. We present results comparing different representations and generation mechanisms for both queries and indexes built with word units. We also present a two-pass method that uses query-by-example search with the best hit identified in a first pass, and we show that query-by-example search can yield significantly better performance, measured by Actual Term-Weighted Value (ATWV): 0.479, compared with 0.325 obtained using reference pronunciations for OOVs. Further improvements can be obtained with the two-pass approach and with filtering based on expected counts from the LVCSR system's lexicon.

Excerpt from the English paper:

Abstract—The goal of Spoken Term Detection (STD) technology is to allow open vocabulary search over large collections of speech content. In this paper, we address cases where search term(s) of interest (queries) are acoustic examples. This is provided either by identifying a region of interest in a speech stream or by speaking the query term. Queries often relate to named-entities and foreign words, which typically have poor coverage in the vocabulary of Large Vocabulary Continuous Speech Recognition (LVCSR) systems. Throughout this paper, we focus on query-by-example search for such out-of-vocabulary (OOV) query terms. We build upon a finite state transducer (FST) based search and indexing system [1] to address the query by example search for OOV terms by representing both the query and the index as phonetic lattices from the output of an LVCSR system. We provide results comparing different representations and generation mechanisms for both queries and indexes built with word and combined word and subword units [2]. We also present a two-pass method which uses query-by-example search using the best hit identified in an initial pass to augment the STD search results.

The results demonstrate that query-by-example search can yield significantly better performance, measured using Actual Term-Weighted Value (ATWV), of 0.479 when compared to a baseline ATWV of 0.325 that uses reference pronunciations for OOVs. Further improvements can be obtained with the proposed two-pass approach and filtering using the expected unigram counts from the LVCSR system’s lexicon.
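
The excerpt reports ATWV values without defining the metric. The sketch below assumes the definition commonly used in the NIST 2006 STD evaluation: one minus the average, over query terms, of the miss probability plus a weighted false-alarm probability, with weight 999.9. The names `TermCounts`, `term_weighted_value`, and `relative_gain` are hypothetical and the counts in the usage note are made up for illustration; none of them come from the paper.

```python
from dataclasses import dataclass

# Assumption: false-alarm weight from the NIST 2006 STD evaluation plan.
BETA = 999.9


@dataclass
class TermCounts:
    """Detection counts for one query term at a fixed decision threshold."""
    n_true: int      # reference occurrences of the term in the audio
    n_correct: int   # detections that match a reference occurrence
    n_spurious: int  # detections that match nothing (false alarms)


def term_weighted_value(counts, speech_seconds, beta=BETA):
    """Return 1 - mean(P_miss + beta * P_fa) over terms that occur at least once.

    Assumption: the number of non-target trials for a term is approximated
    as the speech duration in seconds minus the term's true occurrences.
    """
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no term has a reference occurrence")
    total = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total += p_miss + beta * p_fa
    return 1.0 - total / len(scored)


def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`."""
    return (new - baseline) / baseline
```

For example, `term_weighted_value([TermCounts(4, 2, 0)], 3600.0)` evaluates to 0.5, since half of the occurrences are missed and there are no false alarms. Applied to the figures quoted above, `relative_gain(0.479, 0.325)` is about 0.474, that is, roughly a 47% relative improvement over the reference-pronunciation baseline.
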
I. INTRODUCTION
The fast-growing availability of recorded speech calls for efficient and scalable solutions to index and search this data. Spoken Term Detection (STD) is a key technology aimed at open-vocabulary search over large collections of spoken documents. A common approach to STD is to employ a large vocabulary continuous speech recognition (LVCSR) system to obtain word lattices and extend classical Information Retrieval techniques to word lattices. Such approaches have been shown to be very accurate for well-resourced tasks [3], [1].
