Date: 2024-05-09



COM4511/COM6511 Speech Technology - Practical Exercise - Keyword Search
Anton Ragni
Note that for any module assignment full marks will only be obtained for outstanding performance that
goes well beyond the questions asked. The marks allocated for each assignment are 20%. The marks will be
assigned according to the following general criteria. For every assignment handed in:
1. Fulfilling the basic requirements (5%)
Full marks will be given to fulfilling the work as described, in source code and results given.
2. Submitting high quality documentation (5%)
Full marks will be given to a write-up that is at the highest standard of technical writing and illustration.
3. Showing good reasoning (5%)
Full marks will be given if the experiments and the outcomes are explained to the best standard.
4. Going beyond what was asked (5%)
Full marks will be given for interesting ideas on how to extend work that are well motivated and
described.
1 Background
The aim of this task is to build and investigate the simplest form of a keyword search (KWS) system, which makes it possible
to find information in large volumes of spoken data. The figure below shows an example of a typical KWS system, which
consists of an index and a search module. The index provides a compact representation of spoken data. Given a set of keywords,
the search module queries the index to retrieve all possible occurrences ranked according to likelihood. The quality of a KWS
system is assessed based on how accurately it can retrieve all true occurrences of keywords.
[Figure: a typical KWS system, with blocks labelled Keywords, Search, Index and Results.]
A number of index representations have been proposed and examined for KWS. Most popular representations are derived
from the output of an automatic speech recognition (ASR) system. Various forms of output have been examined. These differ
in terms of the amount of information retained regarding the content of spoken data. The simplest form is the most likely word
sequence or 1-best. Additional information such as start and end times, and recognition confidence may also be provided for
each word. Given a collection of 1-best sequences, the following index can be constructed
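$$
\begin{array}{llll}
w_1 & (f_{1,1}, s_{1,1}, e_{1,1}) & \ldots & (f_{1,n_1}, s_{1,n_1}, e_{1,n_1}) \\
w_2 & (f_{2,1}, s_{2,1}, e_{2,1}) & \ldots & (f_{2,n_2}, s_{2,n_2}, e_{2,n_2}) \\
\vdots & & & \\
w_N & (f_{N,1}, s_{N,1}, e_{N,1}) & \ldots & (f_{N,n_N}, s_{N,n_N}, e_{N,n_N})
\end{array}
\tag{1}
$$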
where $w_i$ is a word, $n_i$ is the number of times word $w_i$ occurs, $f_{i,j}$ is the file in which word $w_i$ occurs for the
$j$-th time, and $s_{i,j}$ and $e_{i,j}$ are the corresponding start and end times. Searching such an index for single-word
keywords can be as simple as finding the correct row (e.g. $k$) and returning all tuples
$(f_{k,1}, s_{k,1}, e_{k,1}), \ldots, (f_{k,n_k}, s_{k,n_k}, e_{k,n_k})$.
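As an illustration of the index in (1), the following minimal sketch builds a word-to-occurrences mapping from CTM-style records (the CTM format is described under Resources). The function name `build_index`, the lower-casing of words and the decision to keep the confidence score alongside each tuple are assumptions made for this example, not requirements of the task.

```python
from collections import defaultdict


def build_index(ctm_lines):
    """Map each word to a list of (file, start, end, confidence) tuples."""
    index = defaultdict(list)
    for line in ctm_lines:
        line = line.strip()
        # Blank lines and lines starting with ';;' are ignored.
        if not line or line.startswith(";;"):
            continue
        file_id, _channel, start, duration, word, conf = line.split()
        start, duration = float(start), float(duration)
        # Assumption: words are lower-cased so lookups are case-insensitive.
        index[word.lower()].append(
            (file_id, start, start + duration, float(conf))
        )
    return index


# Records taken from the CTM excerpt in the Resources section.
ctm = [
    "7654 A 11.34 0.2 YES 0.5",
    "7654 A 12.00 0.34 YOU 0.7",
    "7654 A 13.30 0.5 CAN 0.1",
]
index = build_index(ctm)
hits = index.get("yes", [])  # single-word lookup: one row of the index
```

Looking up a single-word keyword is then a dictionary access; a word that never occurs simply yields an empty list.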
The search module is expected to retrieve all possible keyword occurrences. If ASR makes no mistakes such module
can be created rather trivially. To account for possible retrieval errors, the search module provides each potential occurrence
with a relevance score. Relevance scores reflect confidence in a given occurrence being relevant. Occurrences with extremely
low relevance scores may be eliminated. If these scores are accurate each eliminated occurrence will decrease the number of
false alarms. If not then the number of misses will increase. What exactly an extremely low score is may not be very easy
to determine. Multiple factors may affect a relevance score: confidence score, duration, word confusability, word context,
keyword length. Therefore, simple relevance scores, such as those based on confidence scores, may have a wide dynamic range
and may be incomparable across different keywords. In order to ensure that relevance scores are comparable among different
keywords they need to be calibrated. A simple calibration scheme is called sum-to-one (STO) normalisation
$$
\hat{r}_{i,j} = \frac{r_{i,j}^{\gamma}}{\sum_{k=1}^{n_i} r_{i,k}^{\gamma}} \tag{2}
$$
where $r_{i,j}$ is the original relevance score for the $j$-th occurrence of the $i$-th keyword and $\gamma$ is a scale factor
that either sharpens or flattens the distribution of relevance scores. More complex schemes have also been examined. Given a
set of occurrences with associated relevance scores, there are several options for eliminating spurious occurrences. One
popular approach is thresholding: given a global or keyword-specific threshold, any occurrence whose score falls below it is
eliminated. Simple calibration schemes such as STO require thresholds to be estimated on a development set and adjusted to
different collection sizes. More complex approaches such as Keyword Specific Thresholding (KST) yield a fixed threshold across
different keywords and collection sizes.
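The STO normalisation in (2) can be sketched in a few lines. The function name `sto_normalise` and the example scores are hypothetical, and returning zeros when all scores are zero is an assumption made here to avoid division by zero.

```python
def sto_normalise(scores, gamma=1.0):
    """Sum-to-one normalisation of one keyword's relevance scores."""
    powered = [r ** gamma for r in scores]
    total = sum(powered)
    if total == 0:
        # Assumption: with no score mass, return zeros instead of dividing.
        return [0.0 for _ in scores]
    return [p / total for p in powered]


raw = [0.5, 0.7, 0.1]                   # hypothetical confidence scores
flat = sto_normalise(raw, gamma=1.0)    # plain sum-to-one
sharp = sto_normalise(raw, gamma=2.0)   # gamma > 1 sharpens the distribution
```

With these scores the largest normalised value grows from 0.7/1.3 to 0.49/0.75 when the scale is raised from 1 to 2, which is the sharpening effect described above.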
Accuracy of KWS systems can be assessed in multiple ways. Standard approaches include precision (proportion of relevant
retrieved occurrences among all retrieved occurrences), recall (proportion of relevant retrieved occurrences among all
relevant occurrences), mean average precision and term weighted value. A collection of precision and recall values computed
for different thresholds yields a precision-recall (PR) curve. The area under the PR curve (AUC) provides a
threshold-independent summary statistic for comparing different retrieval approaches. The mean average precision (mAP) is
another popular, threshold-independent, precision-based metric. Consider a KWS system returning 3 correct and 4 incorrect
occurrences arranged according to relevance score as follows: ✓, ✗, ✗, ✓, ✓, ✗, ✗, where ✓ stands for a correct occurrence and
✗ stands for an incorrect occurrence. The precision credited at each rank (from 1 to 7) is 1/1, 0/2, 0/3, 2/4, 3/5, 0/6, 0/7,
i.e. the precision at ranks holding a correct occurrence and zero elsewhere. If the number of true occurrences is 3, the mean
average precision for this keyword is (1/1 + 2/4 + 3/5)/3 = 0.7. A collection-level mAP can be computed by averaging
keyword-specific mAPs. Once a KWS system operates at a reasonable AUC or mAP level, it is possible to use term weighted
value (TWV) to assess the accuracy of thresholding. The TWV is defined by
$$
\mathrm{TWV}(K, \theta) = 1 - \frac{1}{|K|} \sum_{k \in K} \left[ P_{\mathrm{miss}}(k, \theta) + \beta P_{\mathrm{fa}}(k, \theta) \right] \tag{3}
$$
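where $k \in K$ is a keyword, $\theta$ is the decision threshold, $P_{\mathrm{miss}}$ and $P_{\mathrm{fa}}$ are the
probabilities of miss and false alarm, and $\beta$ is a penalty assigned to false alarms. These probabilities can be computed by
$$
P_{\mathrm{miss}}(k, \theta) = \frac{N_{\mathrm{miss}}(k, \theta)}{N_{\mathrm{correct}}(k)} \tag{4}
$$
$$
P_{\mathrm{fa}}(k, \theta) = \frac{N_{\mathrm{fa}}(k, \theta)}{N_{\mathrm{trial}}(k)} \tag{5}
$$
where $N_{\langle\mathrm{event}\rangle}$ is the number of events of the given type. The number of trials is given by
$$
N_{\mathrm{trial}}(k) = T - N_{\mathrm{correct}}(k) \tag{6}
$$
where $T$ is the duration of speech in seconds.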
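Equations (3) to (6) can be combined into one small function, sketched below. The name `twv`, the input layout (a dictionary of per-keyword counts already obtained at some threshold) and all numbers in the example are hypothetical. Skipping keywords with no true occurrences, for which (4) is undefined, is an assumption of this sketch rather than something the definition above prescribes.

```python
def twv(counts, total_seconds, beta=20.0):
    """Term weighted value from per-keyword counts at a fixed threshold.

    counts maps keyword -> (n_correct, n_miss, n_fa), where n_correct is
    the number of true occurrences in the reference.
    """
    penalty = 0.0
    n_keywords = 0
    for n_correct, n_miss, n_fa in counts.values():
        if n_correct == 0:
            # Assumption: P_miss is undefined here, so skip the keyword.
            continue
        p_miss = n_miss / n_correct                # equation (4)
        n_trial = total_seconds - n_correct        # equation (6)
        p_fa = n_fa / n_trial                      # equation (5)
        penalty += p_miss + beta * p_fa
        n_keywords += 1
    if n_keywords == 0:
        raise ValueError("no keyword has any true occurrence")
    return 1.0 - penalty / n_keywords              # equation (3)


# Hypothetical counts: (true occurrences, misses, false alarms).
counts = {"yes": (4, 1, 2), "dog walking": (2, 0, 0)}
value = twv(counts, total_seconds=104.0, beta=20.0)
```

In this made-up example the first keyword contributes 0.25 + 20 × 0.02 = 0.65 and the second contributes nothing, so the result is 1 − 0.65/2 = 0.675. Two false alarms cost more than one miss out of four, which shows how strongly the penalty weights false alarms.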
2 Objective
Given a collection of 1-bests, write code that retrieves all possible occurrences of each keyword in the provided list.
Describe the search process, including index format, handling of multi-word keywords, criterion for matching, relevance score
calibration and threshold setting methodology. Write code to assess retrieval performance against reference transcriptions
according to the AUC, mAP and TWV criteria, using β = 20. Comment on the differences between these criteria, including the
impact of the parameter β. Start and end times of hypothesised occurrences must be within 0.5 seconds of true occurrences to
be considered for matching.
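One possible reading of the 0.5-second criterion is sketched below: a hypothesis matches a true occurrence when both are in the same file and both the start times and the end times differ by at most 0.5 seconds. The function names, the tuple layout `(file, start, end)` and the rule that each true occurrence can be matched only once are assumptions of this sketch; other interpretations are possible.

```python
def is_match(hyp, ref, tolerance=0.5):
    """True if hyp and ref share a file and agree in time within tolerance."""
    hyp_file, hyp_start, hyp_end = hyp
    ref_file, ref_start, ref_end = ref
    return (
        hyp_file == ref_file
        and abs(hyp_start - ref_start) <= tolerance
        and abs(hyp_end - ref_end) <= tolerance
    )


def label_hits(hyps, refs, tolerance=0.5):
    """Label each hypothesis as correct (True) or a false alarm (False).

    Assumption: each true occurrence may be claimed by one hypothesis only.
    """
    unused = list(refs)
    labels = []
    for hyp in hyps:
        match = next((r for r in unused if is_match(hyp, r, tolerance)), None)
        if match is not None:
            unused.remove(match)
        labels.append(match is not None)
    return labels


# Hypothetical occurrences of one keyword.
refs = [("7654", 11.30, 11.60)]
hyps = [("7654", 11.34, 11.54), ("7654", 13.30, 13.80), ("7654", 11.40, 11.70)]
labels = label_hits(hyps, refs)
```

Here the first hypothesis is accepted, the second is too far away in time, and the third is rejected because the only true occurrence has already been claimed.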
3 Marking scheme
Two critical elements are assessed: retrieval (65%) and assessment (35%). Note: Even if you cannot complete this task as a
whole you can certainly provide a description of what you were planning to accomplish.
1. Retrieval
1.1 Index Write code that takes the provided CTM files (and any other file you deem relevant) and creates indices in
your own format. For example, if Python is used, the execution of your code may look like
python index.py dev.ctm dev.index
where dev.ctm is a CTM file and dev.index is an index.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.2 Search Write code that takes the provided keyword file and index file (and any other file you deem relevant)
and produces a list of occurrences for each provided keyword. For example, if Python is used, the
execution of your code may look like
python search.py dev.index keywords dev.occ
where dev.index is an index, keywords is a list of keywords and dev.occ is a list of occurrences for each
keyword.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.3 Description Provide a technical description of the following elements
• Index file format
• Handling multi-word keywords
• Criterion for matching keywords to possible occurrences
• Search process
• Score calibration
• Threshold setting
2. Assessment Write code that takes the provided keyword file, the list of found keyword occurrences and the
corresponding reference transcript file in STM format and computes the metrics described in the Background section. For
instance, if Python is used, the execution of your code may look like
python <metric>.py keywords dev.occ dev.stm
where <metric> is one of precision-recall, mAP and TWV, keywords is the provided keyword file, dev.occ is the
list of found keyword occurrences and dev.stm is the reference transcript file.
Hint: To simplify assessment, consider converting the reference transcript from STM file format to CTM file format.
Using the indexing and search code above, obtain a list of true occurrences. The list of found keyword occurrences can then
be assessed more easily by comparing it with the list of true occurrences rather than with the reference transcript file in
STM file format.
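The conversion suggested in the hint might be sketched as follows, using uniform segmentation: each segment's duration is divided equally among its words. The function name `stm_to_word_records` and the output tuple layout are hypothetical, and the sketch assumes the STM field order given under Resources, with the topic label being a single token without spaces.

```python
def stm_to_word_records(stm_line):
    """Split one STM segment into per-word (file, channel, start, duration, word)."""
    fields = stm_line.split()
    # Field order assumed from the Resources section: F H S T E L W...W
    file_id, channel, _speaker, start, end, _label = fields[:6]
    words = fields[6:]
    if not words:
        return []
    start, end = float(start), float(end)
    # Uniform segmentation: every word gets an equal share of the segment.
    step = (end - start) / len(words)
    return [
        (file_id, channel, start + i * step, step, word)
        for i, word in enumerate(words)
    ]


# Segment taken from the STM excerpt in the Resources section.
records = stm_to_word_records("2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought")
```

For this segment the five words each receive (2.03 − 0.10)/5 = 0.386 seconds, so the third word is placed at 0.10 + 2 × 0.386 = 0.872 seconds. These times are approximations only, which is worth keeping in mind when applying the 0.5-second matching tolerance.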
2.1 Implementation
• AUC Integrate an existing implementation of AUC computation into your code. For example, for Python
such an implementation is available in the sklearn package.
• mAP Write your own implementation or integrate any freely available one.
• TWV Write your own implementation or integrate any freely available one.
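For the mAP item, the worked example from the Background section (✓, ✗, ✗, ✓, ✓, ✗, ✗ with three true occurrences) can be reproduced with a short sketch. The function names and the boolean-list input are hypothetical, and returning 0 for a keyword with no true occurrences is an assumption made here.

```python
def average_precision(ranked_labels, n_true):
    """Average precision for one keyword.

    ranked_labels lists the returned occurrences in decreasing order of
    relevance score, True for correct and False for incorrect.
    n_true is the number of true occurrences in the reference.
    """
    hits = 0
    total = 0.0
    for rank, correct in enumerate(ranked_labels, start=1):
        if correct:
            hits += 1
            total += hits / rank  # precision at this rank
    # Assumption: a keyword with no true occurrences scores 0.
    return total / n_true if n_true else 0.0


def mean_average_precision(per_keyword):
    """Average the keyword-level values; per_keyword is a list of (labels, n_true)."""
    values = [average_precision(labels, n_true) for labels, n_true in per_keyword]
    return sum(values) / len(values)


example = [True, False, False, True, True, False, False]
ap = average_precision(example, n_true=3)  # (1/1 + 2/4 + 3/5) / 3
```

Dividing by the number of true occurrences rather than by the number of correct hits means that occurrences the system never returns lower the score.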
2.2 Description
• AUC Plot the precision-recall curve. Report the AUC value. Discuss performance in the high-precision, low-recall
area. Discuss performance in the high-recall, low-precision area. Suggest which keyword search
applications might be interested in good performance specifically in those two areas (either high precision
and low recall, or high recall and low precision).
• mAP Report the mAP value. Report the mAP value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in mAP values.
• TWV Report the TWV value. Report the TWV value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in TWV values. Plot TWV values for a range of threshold values. Report the maximum TWV
value (MTWV). Report the actual TWV value (ATWV) obtained with the method used for threshold selection.
• Comparison Describe the use of AUC, mAP and TWV in the development of your KWS approach. Compare
these metrics and discuss their advantages and disadvantages.
4 Hand-in procedure
All outcomes, however complete, are to be submitted jointly in the form of a package file (zip/tar/gzip) that includes
a directory for each task containing the associated required files. Submission will be performed via MOLE.
5 Resources
Three resources are provided for this task:
• 1-best transcripts in NIST CTM file format (dev.ctm, eval.ctm). The CTM file format consists of multiple records
of the following form
<F> <H> <T> <D> <W> <C>
where <F> is an audio file name, <H> is a channel, <T> is a start time in seconds, <D> is a duration in seconds, <W> is a
word, <C> is a confidence score. Each record corresponds to one recognised word. Any blank lines or lines starting with
;; are ignored. An excerpt from a CTM file is shown below
7654 A 11.34 0.2 YES 0.5
7654 A 12.00 0.34 YOU 0.7
7654 A 13.30 0.5 CAN 0.1
• Reference transcripts in NIST STM file format (dev.stm, eval.stm). The STM file format consists of multiple records
of the following form
<F> <H> <S> <T> <E> <L> <W>...<W>
where <S> is a speaker, <E> is an end time, <L> is a topic and <W>...<W> is a word sequence. Each record corresponds to
one manually transcribed segment of an audio file. An excerpt from an STM file is shown below
2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought
2345 A 2345-b 2.10 3.04 <soap> dog walking is a very
2345 A 2345-a 3.50 4.59 <soap> yes but it’s worth it
Note that exact start and end times for each word are not available. Use uniform segmentation as an approximation. The
duration of speech in dev.stm and eval.stm is estimated to be 57474.2 and 25694.3 seconds, respectively.
• Keyword list keywords. Each keyword contains one or more words as shown below