Information Science and Technology

Exploring the Process of Knowledge Discovery based on Multimedia Technologies

Miki Haseyama, Professor

Graduate School of Information Science and Technology (Course of Media and Network Technologies, Department of Electronics and Information Engineering, School of Engineering)

High school: Hokkaido Sapporo Minami High School

Academic background: Doctorate from Hokkaido University

Research areas: image/video processing, audio processing, music processing, knowledge discovery
Research keywords: image recognition, image restoration, semantic video understanding, image retrieval, video retrieval, knowledge discovery
Website: http://www-lmd.ist.hokudai.ac.jp/

What is your goal?

My goal is to achieve a next-generation multimedia system that understands images and video in the same way as humans. With the expansion of high-speed Internet connections and the spread of large-capacity storage media, we are now surrounded by an overflow of digital data, such as images and video. This data is expected to keep increasing globally, and it is reported that 35 zettabytes of data (a zettabyte is 10²¹ bytes) will be generated in 2020. To give you an idea of how enormous this amount of data is, if all of it were stored on DVDs and the discs were stacked one upon another, the stack would reach as far as half the distance from the Earth to Mars. In fact, the total amount of data generated by humans on Earth has already exceeded the total amount of available storage (devices for recording data, such as hard disk drives and USB memory sticks) in the world. In other words, some data is discarded without ever being saved, even if it contains important information. To save our important information, we need technologies that automatically find the information we want. I am pursuing my goal of “achieving a next-generation multimedia system that understands images and video in the same way as humans” in order to develop such technologies.

What kinds of technologies have been achieved?

To implement “a system that understands images and video in the same way as humans,” we need to develop various methods, including image recognition methods based on the visual and auditory characteristics of humans, methods to analyze the music and audio signals included in video, image restoration methods, and next-generation image coding methods (Figs. 1 and 2 show implementation examples). Alongside the development of these methods, we also need to clarify how humans derive knowledge from all the signals that reach them from outside, such as images, video, audio, and music. Accordingly, I have clarified the process of semantic video understanding from the standpoint of multimedia signal processing research, as well as the process of knowledge discovery by humans.

Fig. 3 Associative image search system (Image Vortex)

Fig. 4 A large-scale image search system. The image search system Image Vortex has been commercialized and is used as Image Cruiser (http://spir.ist.hokudai.ac.jp/shiga_photo.html).

For example, our image search system (Figs. 3 and 4) takes into account how humans obtain the images they want. While general image search systems need query keywords to find the information users want, our system does not need such keywords. Each image in the system is automatically assigned numerical tags that represent its characteristics. Once the user starts an image search, the images move and converge to their optimal positions as if they were communicating with each other. Based on the numerical tags, similar images are arranged closer together and dissimilar ones farther apart. Viewing a large number of images in this way activates our inner knowledge and memories; we become aware of associations between images that we had made unconsciously, and this leads us to the image we really want. I have also developed an associative video search engine that allows us to intuitively search not only images but also video without using keywords (Fig. 5), and a heterogeneous cross-media search engine that enables us to freely obtain music, images, and video according to our preferences (Fig. 6). You can search for music based on images, conversely search for images based on music, and even automatically receive recommendations across networks from users with similar preferences or from users with totally different preferences, none of which was possible before. New discoveries inspire and delight us. I think that in the near future, a search system will be realized that enriches our lives by allowing us to make unexpected discoveries based on all kinds of information around the world, such as images, video, and music.
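The idea of arranging images by their numerical tags can be illustrated with a small Python sketch. This is not the Image Vortex implementation, whose details are not given here. The tag values, the names TAGS and layout, the starting positions, and the update rule (each image is nudged until its on-screen distance to every other image approaches the distance between their tags) are all assumptions made for illustration.

```python
import math

# Hypothetical numerical tags (feature vectors). A real system would
# compute these automatically from the image content.
TAGS = {
    "sunset_a": [0.90, 0.40, 0.10],
    "sunset_b": [0.85, 0.45, 0.15],
    "forest":   [0.10, 0.80, 0.20],
    "ocean":    [0.10, 0.30, 0.90],
}


def layout(tags, steps=300, rate=0.05):
    """Return 2-D positions whose distances approximate the tag distances.

    Assumed scheme: start from the first two tag components, then
    repeatedly move each image toward or away from every other image.
    """
    names = sorted(tags)
    pos = {name: [tags[name][0], tags[name][1]] for name in names}
    for _ in range(steps):
        for i in names:
            for j in names:
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                current = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                target = math.dist(tags[i], tags[j])
                # Positive error: the pair is too far apart, so i moves
                # toward j. Negative error: too close, so i moves away.
                error = (current - target) / current
                pos[i][0] -= rate * error * dx
                pos[i][1] -= rate * error * dy
    return pos


if __name__ == "__main__":
    for name, (x, y) in layout(TAGS).items():
        print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this sketch the two sunset images, whose tags are nearly identical, end up next to each other, while the forest and ocean images settle farther away. No keyword is involved at any point; the arrangement comes only from the tags.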

What gives you joy as a researcher?

The research on the process of knowledge discovery that I have been pursuing can be paraphrased as an investigation into how we can provide “awareness” and “triggers” to humans. I believe that helping many people arrive at such insights can enhance the productivity of society. Naturally, researchers pursue intellectual investigation and contribute to the advancement of science. However, what gives us great motivation as researchers is seeing our investigations put to use in daily life and industry, thereby allowing us to contribute to society. It is our joy as researchers to have our research results prove useful in enriching lives.