David Schlangen : Home Page > minutes221007

  - present: Michaela, Timo, David
  - re. end of turn project / paper:
    - prosody: add online speaker adaptivity? (Learning average f0,
      intensity, etc.)
    - syntax: how do you evaluate an incremental parser?
    - infrastructure:
      - `incrementalizer': textual input field that sends out a new word
        each time the space bar is hit. Perhaps also faking dysfluencies
        when backspace is hit. E.g., if input is "I saw Pe^B^BJohn"
        (^B = backspace), output is "I saw Pe- erm John".
        This module should be able to take the place of the ASR in the
        architecture, sending to the parser exactly what it expects.
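
A minimal sketch of how such an incrementalizer could behave, assuming
keystrokes arrive as a character stream in which "\b" stands for
backspace. The function name `incrementalize`, the default filler
"erm", and the flushing of the last word at end of input are
assumptions made for illustration; none of this is an existing InPro
module.

```python
BACKSPACE = "\b"  # assumption: backspace arrives as this character


def incrementalize(keystrokes, filler="erm"):
    """Yield one token per completed word.

    A word that was (partly) deleted with backspace is emitted as a
    cut-off fragment ("Pe-") followed by a filler, once typing resumes.
    """
    buffer = []      # characters of the word currently being typed
    fragment = None  # word form at the first backspace, if a repair is pending

    for key in keystrokes:
        if key == BACKSPACE:
            if buffer:
                if fragment is None:
                    fragment = "".join(buffer)  # remember what was abandoned
                buffer.pop()
            continue  # words already sent out cannot be retracted

        if fragment is not None:
            # Typing resumed after a deletion: voice the abandoned fragment.
            yield fragment + "-"
            yield filler
            fragment = None

        if key == " ":
            if buffer:
                yield "".join(buffer)
                buffer = []
        else:
            buffer.append(key)

    # End of input: flush whatever is still pending.
    if fragment is not None:
        yield fragment + "-"
        yield filler
    if buffer:
        yield "".join(buffer)


print(" ".join(incrementalize("I saw Pe\b\bJohn")))
```

With the example from the minutes this prints "I saw Pe- erm John".
Because the function is a generator, a parser could consume the tokens
one at a time, as it would with ASR output.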

  - discussed domain for InPro system:
    - VM, pro:
      - corpus
      - symmetric task, both talk same amount, both have the same
        roles
    - VM, contra:
      - pretty complex domain, perhaps not terribly easy to restrict
        so that it becomes realistic to model.
        [ Then again, perhaps not. Could always be modelled as
          slot-filling ("on what day do you want to meet?", "at what
          time?", "sorry, that doesn't work. I can offer ___. Does
          this work, yes or no?"). Lots of system initiative. But then
          the same holds for almost any domain, certainly also
          pento. And, if it is modelled with more system initiative,
          it becomes less symmetric, of course, and one loses the
          advantage that the system can model either partner (or both
          can be modelled by system). ]

    - Pentomino / DEAWU setting:

      Computer is Instruction Follower (IF), moves pieces on
      board. Human is Instruction Giver (IG), directs IF to place
      pieces. Most likely, situation should be one where IG sees board
      & outline, & sees what IF is doing.

      - several variations possible, including DEAWU setting where IG
        has numbered solution.

      - pro:
        - corpus
        - existing modules (reference resolution for pieces on board)
        - perhaps more like SDS? (But see above: VM domain can be made
          SDS-like as well.)
      - contra:
        - asymmetric task -- perhaps with little talk by system?
        - brings in new issues like reference resolution and
          clarification. [ But these issues would surface in any kind
          of practical system. ]
        - more specific new issues:
          - action; i.e., moving mouse pointer to target location; and
            allowing user to barge in on *action*, "no, not there!"
            [ but avoidable, if feature of acting incrementally (=
              before end of turn) is not added ]
        - the question: WHY?  Need for a voice-interface to a game
          that is more easily and readily controlled by direct
          manipulation (aka: dragging and clicking) is difficult to
          see.

     - same here: system-initiative can be imposed here as well, even
       if it is not necessarily natural. E.g. "which piece do you want
       to place?", "how shall I rotate it?" etc.


     - would bring in additional way of showing use of incremental
       processing (see above), namely beginning to act
       (non-linguistically) as soon as partial information from the
       utterance allows it. ("Now take the green [moves to direction
       of green piece] piece")

     - turn-taking wise, the challenge here would probably be not so
       much detecting turn endings and acting fast, but rather detecting
       hesitations and *not* acting. I.e., avoiding wrong time-outs.
       --> this could be done on our existing corpus / corpora.
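
One way such a corpus check could look, as a rough sketch: collect the
within-turn pauses (hesitations) and count how many of them a given
time-out would wrongly treat as a turn end. The function name is an
assumption, and the durations in the usage lines are made-up
placeholders, not measurements from our corpora.

```python
def wrong_timeout_rate(hesitation_pauses_ms, timeout_ms):
    """Fraction of within-turn pauses that a fixed time-out would
    wrongly treat as a turn end (pause at least as long as the time-out)."""
    pauses = list(hesitation_pauses_ms)
    if not pauses:
        return 0.0
    wrong = sum(1 for pause in pauses if pause >= timeout_ms)
    return wrong / len(pauses)


# Placeholder durations in milliseconds, for illustration only.
toy_pauses = [200, 450, 900, 1200]
for timeout in (500, 1000):
    print(timeout, wrong_timeout_rate(toy_pauses, timeout))
```

A longer time-out lowers this rate, but it also adds the same amount
of lag at every real turn end, so the two would have to be traded off
against each other.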

   - nice uses of DEAWU setting (IG has solution): solution could be
     known to system as well.
     - only do part of identifying piece on board; system places it
       automatically. I.e., only interface Alex Siebert's thing with
       ASR (& some more GUI).
     - complete fake: system only detects turn ends, then plays
       hesitations, asks the occasional (fake) clarification question,
       and then does what it knows is correct anyway.
       Occasionally places pieces wrongly, etc.

       Actually, any kind of intermediate step is possible: use
       keyword spotting to detect which task is being done at the
       moment (identifying piece, orientation, placement); etc. etc.

     - This could be seen as another point in favour of this domain:
       modules (reference resolution parts, logic) could be faked
       and system / application could still be interesting. Not sure
       how something like this could be done in VM domain.
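
The keyword-spotting intermediate step mentioned above could be
sketched roughly as follows. The three task labels follow the minutes
(identifying piece, orientation, placement); the keyword lists and the
function name `detect_subtask` are hypothetical and would have to be
derived from the corpus.

```python
# Hypothetical keyword lists; real ones would come from the corpus.
TASK_KEYWORDS = {
    "identify": {"take", "piece", "green", "red", "cross"},
    "orient": {"rotate", "turn", "flip", "mirror"},
    "place": {"put", "place", "move", "corner", "there"},
}


def detect_subtask(utterance):
    """Return the subtask whose keywords occur most often in the
    utterance, or None if no keyword occurs at all.
    On a tie, the task listed first in TASK_KEYWORDS wins."""
    words = [w.strip(".,!?\"'") for w in utterance.lower().split()]
    best_task, best_hits = None, 0
    for task, keywords in TASK_KEYWORDS.items():
        hits = sum(1 for w in words if w in keywords)
        if hits > best_hits:
            best_task, best_hits = task, hits
    return best_task


print(detect_subtask("Now take the green piece"))
print(detect_subtask("rotate it to the left"))
```

Since the function also works on partial utterances, it could be
called after every new word, which fits the incremental setting.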

  - general point: turn-taking module (detect end, play hesitation if
    necessary) should be general enough so that it can be wrapped
    around a standard SDS, e.g. one built with the CSLU toolkit. The
    minimal lag of an FSA system (built with time-out) is known (it's
    the time-out setting), so our wrapper could produce "erms" of at
    least this length.
    One could then test whether having these "erms" improves
    perception of an FSA system that is otherwise kept constant.
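
The arithmetic behind "erms of at least this length" is small enough
to sketch. This assumes the wrapper plays fixed-length "erm"
recordings back to back; the function name and the idea of fixed-length
recordings are assumptions, not something decided in the meeting.

```python
def erms_to_cover(lag_ms, erm_ms):
    """Number of back-to-back "erms" whose total length is at least
    lag_ms, where lag_ms is the known time-out setting of the FSA
    system and erm_ms the length of one "erm" recording."""
    if erm_ms <= 0:
        raise ValueError("erm_ms must be positive")
    if lag_ms <= 0:
        return 0
    return -(-lag_ms // erm_ms)  # ceiling division on integers


print(erms_to_cover(1500, 600))
```

For a 1500 ms time-out and 600 ms recordings this prints 3, i.e.
1800 ms of hesitation, which covers the lag.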



das, 10/22/07 04:39 (GMT)

Keyword: inpro, meetings, minutes
