Need and relevance
Ever since the idea of convergence was floated in the early 1990s, the media industry has been talking about cross-platform exploitation as a way of producing more exciting content more cost-effectively. But while technology has helped to produce better quality sounds and images, costs continue to rise. Quality digital media production remains very labour-intensive, making it a high-risk, high-cost industry. One reason is that images are crafted at very low levels, quite often at the ‘polygon’ or ‘pixel’ level. In many applications, the existence of more sophisticated digital tools has actually pushed up costs, as more time is spent on complex off-line processes in the quest for quality. It is virtually impossible to re-use items from previous productions (regardless of issues of copyright) in different contexts, as the majority of sounds and images only work in the context and media type for which they were originally made.
One ‘real world’ example of this is the making of the game Bratz. Although the game company was provided with all the original assets of the animation series on which the game is based, not one byte of this data was used. The principal reason was that the animation was produced using Softimage ‘curved surface’ (NURBS) technology. This produces good animation quality, but the results could not readily be converted into polygons suitable for game engines. A second example arose when Raven Software wanted to make the game of Star Trek Voyager. They had access to the 3-D models of the space ship exteriors and interiors as used in the TV series, but found that these had far too high a polygon count, too much texture, and were not created with games in mind. These assets were used only as guides to prepare completely new objects for the game. Even re-using complex 3-D scenes in a sequel by the same production company is horribly difficult, since the ‘interconnections’ in the model are not obvious to anyone other than the author.
Even finding the right assets in the data archive may be too difficult and too expensive.
The ambition is to develop new approaches to reverse the trend toward ever-increasing cost, building on and extending research in media technologies, web semantics, and context-based audiovisual object retrieval. We do not wish to invent new media genres, for which there may not be a market, but to facilitate the production of media types for which there is an established and proven market.
The European media industry is characterised by fragmentation and the presence of many small, high-technology creative companies providing advanced services. European media companies are very popular internationally for the quality and creativity of their work, their flexibility and their cost-effectiveness. The downside is that the scale of the American companies allows them to dominate the market for large-scale, high-cost and high-profit work. Asian companies are simultaneously monopolising the market for low-end animation by the use of very cheap, low-skilled labour.
SALERO’s overall ‘Vision’ is to define and develop ‘intelligent content’ for media production, consisting of multimedia objects with context-aware behaviours for self-adaptive use and delivery across different platforms. ‘Intelligent Content’ should enable the creation and re-use of complex, compelling media by artists who need to know little of the technical aspects of how the tools that they use actually work. Over the next ten years, we expect the application of new tools to transform the way in which media are produced.
Imagine a cross-media ‘TV series / game’, which mixes characters and scenes created as 3D CGI with others from live action digital image capture. Some of the characters have already been produced and used in a ‘city street’ context, but the new production has scenes on horseback. Today, it would be almost impossible to use the 3-D character models on horseback, as they would have been built for ‘city’ action, and quite impossible to put the 2D pixel-based characters onto horses. However, if we can treat the characters and the horses as self-aware objects, a new ‘horse’ object should be able to ‘train’ the character to ride it. The ‘horse’ will also interact with the terrain, galloping on the flat, but trekking or scrambling up hills. It should also interact with the background to generate the right acoustic effects from a sound library, breathing differently when galloping or when trekking. Lighting and visual effects will also have a degree of intelligence. We can imagine a situation where, in the ‘TV’ version, there are two characters in a smoky interior, lit with tracking spots, but the game-play requires the addition of a third or fourth character. In the future, the extra characters should also ‘know’ they need tracking spots, and the ‘smoke effect’ should interact with the extra light sources. The characters may even interact with the ‘plot’ or ‘storyline’ object(s) and act out the work to be produced.
The artistic operator should interact with characters as a film director directs a movie rather than by editing pixels, NURBS surfaces or other mathematical concepts.
Imagine that the original language of the production is Spanish, but we also need an English version. Dialogue should be translated, but sound effects and music should stay the same. For this we would ideally like to have spoken English generated from a script by a voice synthesis engine. Some ‘visual’ objects may also need translation, such as the sign above a shop, but not if this is the name of the shop owner. Finally, the assembled production should be able to adjust itself for transport over the prevailing network configuration and to render out for display at the optimum resolution and format for the delivery platform.
In addition to content-based information retrieval (‘given an image of a horse, find similar images’, etc.), semantically enabled ‘content’ should open avenues for context-sensitive retrieval (‘given the sound of a galloping horse, find an image of a galloping horse’). Since the objects or content in question have intelligence, the associated contextual information should be clearer and more useful, helping the production company to find examples of ‘horse’ and ‘horse with rider’ from other productions with similar actions. From the context of the scenario, it should be possible to infer the underlying need and look for ‘rider objects’ created before, which could be adapted for re-use. We might use any one of a number of search mechanisms: drawing an outline object, querying by example, or using metadata or ontology-based searches.
Complete realisation of SALERO’s vision is a long-term goal. SALERO aims to advance the current state of the art in digital media to the point where it becomes possible to create audiovisual content for cross-platform delivery using intelligent content tools, with greater quality at lower cost, to provide audiences with more engaging entertainment and information at home or on the move. This aim gives rise to three overarching R&D objectives, which are as follows:
Basic research will provide a better understanding of the relations between media types, genres, workflows and styles, as a prerequisite to the adaptation and transfer of content elements across productions and platforms. It will lead to the creation of guidelines for researchers, industry developers and media producers, and to the specification of industry requirements for different media contexts.
Metadata, Media Semantics and Ontologies will be analysed, researched and developed. They will define the parameters necessary for the creation and manipulation of semantically aware content objects of various types.
RTDI for practical methods of context-based information retrieval will be carried out in SALERO. Results will simplify the location and retrieval of characters, sounds, images, movements or behaviours from very large datasets and media storage systems.
The activities will research and develop procedures for manipulating the appearance, sound, movement and behaviour of semantically aware characters and other objects for delivery on different platforms in a range of contexts.
Improved methods and tools for language processing and speech synthesis as a means of supporting the generation of multilingual media content will be researched and developed.
Within SALERO the activities will be devoted to developing and validating software toolkits, software systems, plug-ins and interfaces that allow the control of appearances, sounds, semantic behaviour and properties of intelligent content objects for media production and post-production, that can be used in conjunction with existing industry programs.
SALERO will guide, validate and evaluate the technological R&D by means of a series of experimental productions across a range of important media genres, based on scenarios defined by artists and creative media professionals.
A further goal is to promote the understanding and acceptance of intelligent content technologies within the professional community (including the relevant standardisation bodies).
A special work package will reinforce the European skills base by developing training structures for professionals and researchers.
Market launch of technologies created by SALERO will be prepared through a series of Demonstration Testbeds for media industry professionals.
Project Main Results
- Definitions of media types, genres, workflows and styles for intelligent content to be utilized across productions and platforms. The results will be presented in the form of a report and peer-reviewed scientific publications.
- Specification of industry requirements for different media; guidelines for researchers, industry developers and media producers for cross-platform intelligent content production and usage.
- An ontology language, media ontologies and metadata, describing semantics and context of intelligent content, with appropriate documentation.
- New methods of context-based retrieval of characters, sounds, images, movements or behaviours from very large datasets and media storage systems, verified by peer-reviewed publications and a system for the demonstration of the viability/scalability of the methods proposed.
- Applications for manipulating the appearance, sound, movement and behaviour of semantically aware characters and other objects for delivery on different platforms in a range of contexts.
- Software tools for media ontology creation, manipulation and versioning, with an application for semantic media annotation.
- Tools for language processing and speech synthesis as a means of supporting the generation of multilingual media content. These tools will take the form of software applications capable of generating emotional speech and speech in different languages.
- Software toolkits, software systems, and interfaces, compliant with current industry practices, which allow the control of appearances, sounds, semantic behaviour and properties of intelligent content objects for media production and post-production. Software toolkits and systems will be made available in the form of APIs and the interface definitions will be made public.
- Three experimental productions, covering the application areas “information and entertainment programming”, “pre-school”, and “interactive games”. The experimental productions will be used to evaluate the SALERO tools, systems and applications and will provide the basis for demonstrations and test beds. Parts of the content of the experimental productions will be reused in establishing the end-of-project showcase.
- Master’s syllabus development for universities and a professional training programme for industry. The experience generated at research level in the IP will be used to advise on and re-shape syllabuses at Master’s degree level in Digital Media, with the aim of creating a joint European programme, as well as to draft professional training syllabuses.
- Scientific dissemination through publications in reviewed journals and active participation at conferences and workshops; contribution to standardisation.
- Presentations to industry professionals, at major trade fairs and industry events.
- End-of-project showcase in the form of a DVD containing public documents from the project and examples from the experimental production scenarios.