Historically, computers have been quite adept at storing and retrieving alpha-numeric data consisting of numbers and characters. However, recent hardware and software advances have created a wide variety of new data types, generally referred to as multimedia (audio, video, pictures, graphs, maps, etc.), that bear little resemblance to alpha-numeric data and do not yet enjoy the same level of support. Specifically, we want to search, modify, and manipulate multimedia data much as current applications search, modify, and manipulate alpha-numeric data. For example, given a database of images, we would like to be able to say "Give me a picture of a pig." Unfortunately, multimedia data is not directly searchable; in its raw form it is relatively meaningless from the computer's perspective. To search multimedia, it is first necessary to identify the semantic content of the data and represent it in a way that is meaningful to, and can be manipulated by, computers. For instance, someone must first identify the pig in an image before anyone can search for pictures of pigs.
Because of the enormous increase in the volume of multimedia, any content-based retrieval system must support automatic identification of semantic content. That is, the system cannot afford to rely on human interaction to identify and extract semantic content. For example, with ten thousand images you can't display each one to the user, ask "What's in this one?", and have the user enter a content description. Fortunately, the system can draw on techniques from image, signal, and vision processing, fields of research that deal with identifying features within images, sounds, graphs, etc. Consequently, a content-based retrieval system should incorporate these processing routines so that semantic content can be identified automatically.
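As a rough illustration of this requirement, the following Python sketch (with purely hypothetical names; it is not the MOODS interface) shows how a library of processing routines might be registered and then applied automatically whenever new media enters the system, with no human in the loop:

    from typing import Callable, Dict, List

    # Each processing routine takes raw media data and returns content labels.
    Routine = Callable[[bytes], List[str]]

    ROUTINE_LIBRARY: Dict[str, Routine] = {}

    def register(name: str):
        """Add a processing routine to the library."""
        def wrap(fn: Routine) -> Routine:
            ROUTINE_LIBRARY[name] = fn
            return fn
        return wrap

    @register("detect_animals")
    def detect_animals(data: bytes) -> List[str]:
        # Stand-in for a real vision routine that recognizes animals.
        return ["pig"] if b"pig" in data else []

    def ingest(data: bytes) -> List[str]:
        """Run every registered routine and collect the identified content."""
        labels: List[str] = []
        for routine in ROUTINE_LIBRARY.values():
            labels.extend(routine(data))
        return labels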
Unfortunately, the amount of information that can potentially be extracted from even a single image is almost limitless ("a picture is worth a thousand words"), and the problem is even worse with video. Consequently, the system must delay identification of some content until it is actually required. On the other hand, some content may be so basic that it should be identified automatically, without waiting for a request. This means the processing operations must be organized so that some can be applied immediately while others are deferred, then applied automatically when users request the information.
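One way to realize this organization, sketched below with hypothetical names rather than the actual MOODS design, is to mark each operation as either basic (run when the data is loaded) or deferred (run only when a query needs its results):

    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class Operation:
        name: str
        run: Callable[[bytes], List[str]]
        deferred: bool = False          # True: apply only when a query needs it

    @dataclass
    class MediaItem:
        data: bytes
        labels: List[str] = field(default_factory=list)
        pending: List[Operation] = field(default_factory=list)

    def load(item: MediaItem, ops: List[Operation]) -> None:
        """Apply basic operations now; remember deferred ones for later."""
        for op in ops:
            if op.deferred:
                item.pending.append(op)
            else:
                item.labels.extend(op.run(item.data))

    def satisfies(item: MediaItem, wanted: str) -> bool:
        """Check known labels first, then run deferred operations on demand."""
        if wanted in item.labels:
            return True
        while item.pending:
            op = item.pending.pop(0)
            item.labels.extend(op.run(item.data))
            if wanted in item.labels:
                return True
        return False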
Processing operations can't do everything, though. There is some content that users are interested in that processing operations alone cannot identify, for example, "U.S. Presidents". Processing operations can identify individual people in an image, but the fact that Bill Clinton is the President is outside the realm of image processing. This means that a content-based retrieval system must provide some way of capturing domain-dependent information, such as the fact that Bill Clinton is the President, and must use that information in resolving user queries.
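A minimal sketch of how such domain knowledge might be captured, again with illustrative names only, is a set of rules that rewrite a domain concept into terms the processing operations can actually identify:

    # Rules map a domain concept to labels the processing layer can recognize.
    KNOWLEDGE_BASE = {
        "u.s. president": ["Bill Clinton"],   # a domain fact, not image content
    }

    def expand_query(term: str) -> list:
        """Rewrite a query term using domain-dependent rules, if any apply."""
        return KNOWLEDGE_BASE.get(term.lower(), [term])

    # expand_query("U.S. President") -> ["Bill Clinton"], which the image
    # processing routines can then look for as a specific, recognizable person.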
Finally, all of this processing and domain-dependent information must be integrated with a database. The database is used to store multimedia data and to resolve user queries, and it must also be capable of invoking the processing operations and the domain-dependent knowledge base to identify the content users ask for.
So what is MOODS? MOODS is a tool for developing content-based retrieval applications that incorporate processing operations, a knowledge base of domain-dependent information, and a database. The tool consists of a database, a library of processing routines, and a knowledge base. Applications are developed by simply writing any routines the library does not already contain, organizing the processing operations to identify semantic content, and developing rules for the knowledge base. Once the application is developed, multimedia data entered into it is automatically processed for content. The data, along with the identified content, is then automatically entered into the database and becomes available for searching. When users make queries, the system first consults the database to find matching data. If nothing is found, the system tries to process some of the data further to see if it matches the user's request. If the system has no appropriate processing capabilities, the knowledge base is consulted to see if it can help locate matching data. With this tool, content-based retrieval applications that identify content in any domain, and for any multimedia format, can be developed easily.
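The query-resolution order just described might look roughly like the following sketch (class and method names are hypothetical, not the MOODS API): consult the database first, then attempt further processing, and finally fall back on the knowledge base:

    from typing import Dict, List

    class Database:
        def __init__(self) -> None:
            self.index: Dict[str, List[str]] = {}   # content label -> media ids
        def search(self, term: str) -> List[str]:
            return self.index.get(term, [])

    class Processor:
        def identify(self, term: str, db: Database) -> List[str]:
            return []                               # stand-in for deferred routines

    class KnowledgeBase:
        def __init__(self) -> None:
            self.rules = {"U.S. President": ["Bill Clinton"]}
        def expand(self, term: str) -> List[str]:
            return self.rules.get(term, [])

    def query(term: str, db: Database, proc: Processor, kb: KnowledgeBase) -> List[str]:
        hits = db.search(term)                      # 1. already-identified content
        if hits:
            return hits
        hits = proc.identify(term, db)              # 2. process data further
        if hits:
            return hits
        for alias in kb.expand(term):               # 3. consult the knowledge base
            hits = db.search(alias) or proc.identify(alias, db)
            if hits:
                return hits
        return []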