Next: IMPLEMENTATION Up: AN ASYNCHRONOUS VIRTUAL MEETING Previous: ASYNCHRONOUS VIRTUAL MEETING

VISUALIZING VOICE MESSAGES

Another important feature of the system is the visualization of voice messages. We designed a user interface that displays voice messages as if they were written text. In on-line bulletin board systems (BBS), written messages are usually threaded and viewed as trees. When we write messages on a BBS or in e-mail, conventions such as the `>' prefix are used to mark quoted messages.

We designed the user-agent software of our voice messaging system with this look and feel. Figure 4 shows a screen image of the software. The window consists of two panes. The left pane is the tree-view, in which the user can select messages. When the user selects a node in the tree, the selected message and its parent or ancestors in the tree are merged into a single file. The right pane is the text-view, in which the user can read the messages as written text. Each paragraph corresponds to a segment generated by the server (Figure 3). While the sound is playing, each line in the text-view is highlighted in synchronization with the voice.

Figure 4: The user-agent software (VOYAGER). The left pane is the tree-view and the right pane is the text-view.
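
The tree-view selection and the synchronized highlighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VOYAGER implementation: the names `Segment`, `Message`, `thread_to`, `merged_segments`, and `segment_at` are hypothetical, each segment is assumed to carry its audio duration in seconds, and the merged file is assumed to play the ancestors first and the selected message last.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Segment:
    text: str        # transcription shown as one paragraph in the text-view
    duration: float  # length of the audio segment in seconds (assumed field)


@dataclass
class Message:
    msg_id: str
    parent_id: Optional[str]  # None for the root of a thread
    segments: List[Segment] = field(default_factory=list)


def thread_to(messages: Dict[str, Message], selected_id: str) -> List[Message]:
    """Return the ancestors of the selected message, oldest first,
    followed by the selected message itself."""
    chain: List[Message] = []
    seen = set()
    current = messages.get(selected_id)
    # Walk up the parent links; `seen` guards against a malformed cycle.
    while current is not None and current.msg_id not in seen:
        seen.add(current.msg_id)
        chain.append(current)
        if current.parent_id is None:
            break
        current = messages.get(current.parent_id)
    chain.reverse()
    return chain


def merged_segments(chain: List[Message]) -> List[Segment]:
    """Concatenate the segments of every message in playback order."""
    return [seg for msg in chain for seg in msg.segments]


def segment_at(segments: List[Segment], position: float) -> Optional[int]:
    """Return the index of the segment playing at `position` seconds,
    or None when the position lies outside the merged audio."""
    if position < 0:
        return None
    ends: List[float] = []
    total = 0.0
    for seg in segments:
        total += seg.duration
        ends.append(total)  # cumulative end time of each segment
    index = bisect_right(ends, position)
    return index if index < len(segments) else None
```

A text-view could call `segment_at` with the current playback position on every timer tick and highlight the returned paragraph. Recomputing the cumulative end times on each call keeps the sketch short; a real player would probably cache them.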

Figure 5 shows the organization and usage of the proposed system. The whole system consists of the user-agent and the message management server. Each server manages a number of meeting rooms, and all the original utterances are stored on the server.

In AVM, spoken messages may be transcribed in two ways. One possibility is that the user-agent has speech recognition capability: when an utterance is recorded, the voice is transcribed and the result is transmitted to the server. The other possibility is that the server has the speech recognizer: when the server receives speech data from the user-agent, the data is transcribed and stored in the database.

The important point is that both ways can be supported in our system. If the user-agent software has very limited speech recognition ability, the server can compensate for it. If the recognition performance of the user-agent improves, it contributes to better usability of the user-agent software, and the server can use the information received from the user-agent. We can therefore put the system to practical use now and improve the speech processing in the future, while users hold conversations with the present system.

Figure 5: The usage of the proposed system.
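
One way to picture how both transcription paths can coexist is the following Python sketch of the server side. It is an assumption made for illustration and does not describe the actual AVM protocol: `Upload`, `StoredUtterance`, and `store_utterance` are hypothetical names, the server recognizer is passed in as a plain function, and the policy of preferring a transcript sent by the user-agent is only one possible choice.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Upload:
    audio: bytes                             # original utterance, always sent
    client_transcript: Optional[str] = None  # present only if the user-agent recognized it


@dataclass
class StoredUtterance:
    audio: bytes          # the original utterance is kept on the server
    transcript: str
    transcribed_by: str   # "user-agent" or "server"


def store_utterance(upload: Upload,
                    server_recognizer: Callable[[bytes], str]) -> StoredUtterance:
    """Build the record to store, using the user-agent transcript when one
    was supplied and falling back to the server recognizer otherwise."""
    text = upload.client_transcript
    if text is not None and text.strip():
        return StoredUtterance(upload.audio, text, "user-agent")
    # The user-agent sent no usable transcript, so the server compensates.
    return StoredUtterance(upload.audio, server_recognizer(upload.audio), "server")
```

Because the original audio is stored in either case, the server could later re-run an improved recognizer over old utterances without any change to the user-agent.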


Nishimoto Takuya
August 30, 1999, 17:55:09