David Schlangen : Home Page > minutes140408

- InPro, meeting, minutes, 14/04/08
  - present: M, T, G, D
  - Gabriel demoed the current state of Higgins. It displays the duration
    of vocal action (both recognised and own) on a timeline, and uses a
    simple boundary tone classification (up, down), based on
    thresholding, to base decisions on. (This is mostly a test of the
    architecture at the moment; the strategies are very simple.)
  - dysfluencies: what to do with aborted words? Most likely, sphinx
    will recognise rubbish. It would be too unrestrictive to include
    aborted versions of all words; adding other methods (e.g., using
    prosodic info) would require too many changes at the low level of
    the ASR. (Hm. But at some point we'll have frame-level /
    syllable-level prosodic info anyway. It shouldn't be too hard to let
    a classifier judge whether a word was perhaps misrecognised because
    it was a different, aborted one.)
  - Timo and Gabriel will work together on getting a better classifier
    for boundary tone detection to work. Does it need to do speaker
    adaptation?
  - first step on the syntax side: a toy grammar for the Pento domain
    (``Nimm das {Kreuz | Teil | lange Ding} aus der Mitte links
    oben'', roughly "Take the {cross | piece | long thing} from the
    middle, top left") in the Higgins parser.
  - using a grammar as the language model in sphinx apparently doesn't
    work incrementally (it doesn't return results before the top
    category has been found), but using a statistical LM does work.
    (There are still technical problems, but it looks promising.)
  - even if we can't use a grammar, we can still bootstrap an n-gram
    LM with utterances generated from a domain grammar.
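
A minimal sketch of the bootstrapping idea from the last item: enumerate the utterances of a toy grammar and count bigrams over them. The slot-based grammar representation, the function names and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration; only the example utterance comes from the minutes, and nothing here uses the sphinx or Higgins APIs.

```python
import itertools
from collections import Counter

# Hypothetical toy grammar modelled on the example utterance in the minutes.
# Each slot lists its alternatives; an utterance picks one per slot.
TOY_GRAMMAR = [
    ["nimm"],
    ["das"],
    ["kreuz", "teil", "lange ding"],
    ["aus der mitte links oben"],
]


def generate_utterances(grammar):
    """Enumerate every utterance the slot grammar licenses, as token lists."""
    for choice in itertools.product(*grammar):
        yield " ".join(choice).split()


def bigram_counts(utterances):
    """Count bigrams over utterances padded with <s> and </s>."""
    counts = Counter()
    for tokens in utterances:
        padded = ["<s>"] + tokens + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


def bigram_prob(counts, prev, word):
    """Unsmoothed estimate of P(word | prev); 0.0 if prev was never seen."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0


utterances = list(generate_utterances(TOY_GRAMMAR))
counts = bigram_counts(utterances)

for tokens in utterances:
    print(" ".join(tokens))
print(bigram_prob(counts, "das", "kreuz"))
```

A usable LM would need smoothing and a much larger generated corpus; in practice the generated utterances would more likely be written to a text file and handed to an existing LM toolkit than counted by hand like this.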



das, 04/14/08 11:40 (GMT)

Keyword: grammar, Higgins, InPro, meetings, minutes, sphinx
