Building Scalable and Smart Multimedia Applications on the Semantic Web - Michael Hausenblas - E-Book

Building Scalable and Smart Multimedia Applications on the Semantic Web E-Book

Michael Hausenblas

Description

Doctoral Thesis / Dissertation from the year 2009 in the subject Business economics - Information Management, grade: 1,0, University of Graz, language: English.


Year of publication: 2009




Table of Contents
Chapter
1.1 Motivation
1.2 Problem Definition
1.2.1 Performance and Scalability Issues in Distributed Metadata Sources
1.2.2 Efficient and Effective Representation of Multimedia Metadata
1.2.3 Scaleable Multimedia Metadata Deployment on the Semantic Web
1.3 Reader’s Guide
1.4 What this work is NOT about
2.1 Semantic Web Applications
2.2 Multimedia Applications
2.2.1 Smart Multimedia Content
2.2.2 Multimedia Metadata Deployment
2.3 Scalability and Expressivity
2.3.1 Infrastructure Level
2.3.2 Application Level
3.1 Multimedia Container Formats
3.1.1 eXtensible HyperText Markup Language - (X)HTML
3.1.2 Scalable Vector Graphics - SVG
3.1.3 Synchronized Multimedia Integration Language - SMIL
3.2 Aspects of Multimedia Metadata
5.1 Information Flow and Media Semantic Web Stack
5.2 Extraction vs. Annotation
5.2.1 Extraction
5.2.2 Annotation
5.3 How To Deal with the Semantic Gap
5.3.1 Low-level Feature Based Approach
5.3.2 Model-based Approach
5.3.3 Semantic Web Approach
5.3.4 Hybrid Approach
5.4 Multimedia Ontology Engineering
5.4.1 Methodologies
5.4.2 Ontology Engineering Tools
5.4.3 Review of Existing Multimedia Ontologies
6.1 Introduction
6.2 Motivation and Scenarios
6.3 Requirements for the Description of Multimedia Assets
6.4 Environment Analysis: The Semantic Web
6.5 Multimedia Assets on the Semantic Web
6.6 Formal Descriptions of Multimedia Assets
6.6.1 Ontology Languages
6.6.2 Rules
6.6.3 Comparing Formal Descriptions Regarding the Requirements
6.7 Conclusions
7.1 The Semantic Web Stack regarding SWMA
7.2 Design Principles and Common Concepts
7.2.1 Occam’s Razor
7.2.2 Follow-your-nose
7.3 Expressivity on the Semantic Web
7.4 Scalability on the Semantic Web
7.5 Conclusion


Abstract

The Semantic Web has become reality over the past couple of years. While certain practical topics, such as interoperability, have at least partially been addressed, scalability and expressivity issues regarding the utilisation of multimedia metadata on the Semantic Web are still widely neglected. However, existing Web (2.0) applications handling millions of multimedia assets are starting to take advantage of Semantic Web technologies. This work contributes to design decisions regarding scalable and smart multimedia applications on the Semantic Web. Based on an analysis of practical issues, stemming from diverse projects and activities the author has participated in over the past four years, three areas have been identified, namely (i) performance and scalability issues on the data access level, (ii) the effective and efficient representation of multimedia content descriptions, and (iii) the deployment of multimedia metadata on the Semantic Web. The three research areas share a common basis: the trade-off between expressivity and scalability. We present our findings regarding scalable, yet expressive Semantic Web multimedia applications in a number of practical settings and discuss future directions, such as “interlinking multimedia”.


Kurzfassung (German Abstract)

Over the past few years the Semantic Web has become reality. Although some practical issues, such as interoperability, have already been partially addressed, the topics of scalability and expressivity with respect to the utilisation of multimedia metadata on the Semantic Web have so far been neglected. Existing Web (2.0) applications that handle millions of multimedia assets are beginning to benefit from Semantic Web technologies. This work supports design decisions in building multimedia Semantic Web applications. Starting from a comprehensive analysis of practical problems (based on projects in which the author of this work was involved), three areas were identified: first, performance and scalability issues on the data access level; second, the efficient and effective representation of descriptions of multimedia content; and finally, the use of multimedia metadata on the Semantic Web. Common to these research areas is the search for a compromise between expressivity and scalability. The work presents the findings regarding scalable yet expressive Semantic Web applications in the multimedia domain in the context of a number of realistic tasks. Finally, future developments (such as “interlinking multimedia”) are discussed.


I hereby certify that the work presented in this thesis is my own and that work performed by others is appropriately cited.

Michael Hausenblas, June 2008.


Acknowledgements

My utmost thanks and thoughts go to my family: Saphira, Ranya, Iannis and Anneliese. Without your support I could not possibly have done this work. Especially in the darker moments you have motivated me and pointed me towards the light at the end of the tunnel. Thank you.

I especially wish to thank my supervisor, Wolfgang Slany, for his support and for letting me benefit from his rich experience. It was most enjoyable to publish with you, and I learned a lot from you, regarding both scientific work and the practical side; I am deeply thankful.

To my colleagues at JOANNEUM RESEARCH I would like to say thanks for your time and constant willingness to discuss. In no particular order, these colleagues are: Werner Bailer, Werner Haas, Harald Mayer, Wolfgang Halb, Herwig Rehatschek, Martin Umgeher, Rudi Schlatte, Rene Kaiser, Martin Höffernig, Robert Sorschag, Lena Lauber, Hannes Bauer, Georg Mittendorfer, Herwig Zeiner, Selver Softic, Georg Thallinger, Gert Kienast, and Wolfgang Weiss.

I further wish to express my gratitude to my colleagues from the research projects NM2, K-Space, and SALERO: Doug Williams, Ian Kegel and Tim Stevens (all British Telecom), Marian Ursu (Goldsmith College), Maureen Thomas (Cambridge University), further Lynda Hardman, Jacco van Ossenbruggen, Raphaël Troncy, Frank Nack, Zeljko Obrenovic, and Lloyd Rutledge (CWI) as well as Tobias Bürger (STI Innsbruck) and Yves Raimond (Centre for Digital Music, Queen Mary, University of London).

From my work within W3C I learned a lot regarding standardisation processes, and administrative as well as community issues. I would like to express my deep thankfulness to my colleagues at the Multimedia Semantics Incubator Group (especially Jeff Z. Pan, Raphaël Troncy, Vassilis Tzouvaras, Susanne Boll, Tobias Bürger, and Oscar Celma) and the RDFa Task Force of the Semantic Web Deployment Working Group (Ben Adida, Ralph Swick, Mark Birbeck, Steven Pemberton, Shane McCarron, and Manu Sporny) as well as other people within W3C who have strongly influenced my view: Ivan Herman, Dan Connolly, Guus Schreiber, and Fabien Gandon. Thanks for your patience.

Finally, kudos to the genuine Linking Open Data chaps, Chris Bizer, Tom Heath, Richard Cyganiak, Kingsley Idehen, and Yves Raimond; you shaped my understanding regarding the practical side of the Web of Data quite a lot. Thanks for your brilliant and visionary ideas and for sharing them with me. Regarding the practitioner’s point of view I wish to thank Danny Ayers and Keith Alexander (both Talis), Benjamin Nowack (for ARC2), Dan Brickley (FOAF, uF, etc.), Lee Feigenbaum (regarding scovo/SPARQL) and David Peterson (our ISWC07 discussions); working with you certainly helped me to better understand and attack real-world issues.

Last but not least I want to thank my parents Gerhard and Gertrude, as well as my sister Monika and her family (Herbert, Larissa, and Elena) for their patience and support over the past couple of years.


Credits

This thesis is based on Keith Andrews’ wonderful LaTeX template. An up-to-date version can be obtained by visiting the FTP site of the Institute for Information Processing and Computer Supported New Media (http://ftp.iicm.tugraz.at/pub/keith/thesis/).

The fancyhdr package was used for the chapter headings. Kudos for this nice work to Piet van Oostrum. Note that the package is available via the CTAN network (http://www.ctan.org/tex-archive/macros/latex/contrib/fancyhdr/).

Some of the content and metadata stems from the FP6 EU project “New Media for a New Millennium” (NM2) (http://www.ist-nm2.org/). The author expresses deepest gratitude to the creative people in the production teams. The X3D example used in Fig. 3.3 on page 36 and Fig. 3.4 on page 50 have been created within the NM2 project.

The SMIL example in Fig. 3.2 on page 34 has been used by courtesy of http://www.x-smiles.org.

Fig. 4.1 on page 57 originates from The Description Logic Handbook [13].

The W3C Semantic Web stack used in Section 4.3 originally stems from W3C’s Semantic Web activity homepage (http://www.w3.org/2001/sw/); special credits go to the W3C Semantic Web activity lead, Ivan Herman.


8 A Performance and Scalability Metric for Virtual RDF Graphs
8.3.1 Types Of Sources
8.3.2 Characteristics Of Sources

9 Media Semantics Mapping
9.3.1 Data and Metadata
9.3.2 Media Semantics
9.3.3 Spaces of Abstraction
9.3.4 Built-in rules
9.3.5 User-defined rules
9.3.6 The MSM Knowledge Base
9.5.1 The NM2 Workflow
9.5.2 Authoring Of Non-linear Stories
9.5.3 Example NM2 Productions
9.5.4 Lessons Learned
9.5.5 The NM2 Workflow in Terms of Canonical Processes
9.6 Discussion

10 Efficient Multimedia Metadata Deployment
10.1 Motivation
10.1.1 Last Mile of Multimedia Metadata Deployment
10.1.2 Related Work
10.1.3 Design Principles
10.2 Use Cases
10.2.1 Use Case: Annotate and Share Photos Online
10.2.2 Use Case: Purchasing Music Online
10.2.3 Use Case: Describing the Structure of a Video
10.2.4 Use Case: Publishing Professional Content with Metadata
10.2.5 Use Case: Expressing and Using Complex Rights Information
10.2.6 Use Case: Detailed Description of Large Media Assets
10.2.7 Use Case: Cultural Heritage
10.2.8 Derived Requirements from the Use Cases
10.3 RDFa-deployed Multimedia Metadata
10.3.1 ramm.x Vocabulary
10.3.2 ramm.x extensions
10.3.3 Processing ramm.x Descriptions
10.4 Examples
10.4.1 Deploying a Still Image along with Exif Metadata
10.4.2 An Example from Cultural Heritage
10.5 Conclusion and Future Work

IV Conclusion and Outlook

11 Conclusions

12 Outlook
12.1 Semantic Web multimedia applications now and in 10 years time
12.1.1 Emerging Metadata
12.1.2 Advanced Annotation Techniques
12.1.3 Interactive Media
12.2 Future Work
12.2.1 Meshups and More
12.2.2 Multimedia and the Web of Data

V Addendum

A Sources
A.1 RDF Source Codes
A.1.1 Minimalistic Media Ontology Example
A.2 Program Source Code
A.2.1 Performance and Scalability Metric Showcase
A.3 Diagrams
A.3.1 Media Semantics Mapping

B Author’s Contribution
B.1 Publications
B.2 Projects
B.2.1 Media Production
B.2.2 Media Analysis
B.2.3 Other Activities
B.3 Academic Activities
B.4 Activities
B.4.1 W3C participation
B.4.2 ramm.x initiative
B.4.3 Related to MPEG-7

C Reference Material
C.1 Multimedia Ontologies
C.1.1 aceMedia Visual Descriptor Ontology
C.1.2 Mindswap Image Region Ontology
C.1.3 Music Ontology Specification
C.1.4 Kanzaki Audio Ontology
C.1.5 Core Ontology for Multimedia (COMM)
C.2 Multimedia Annotation Tools

Bibliography

Glossary

Index


List of Tables

2.1 Scalability in selected domains.
3.1 Scope of Metadata regarding their Functional Type.
4.1 Description Logics Axioms.
4.2 Terminology for LP clause types.
4.3 Dublin Core Elements.
6.1 Comparison of Formal Descriptions for Media Assets.
8.1 Query Types.
9.1 Overview on the Media Semantics Mapping built-in rules.
C.1 An overview of Multimedia Annotation Tools.


List of Listings

3.1 A sample SVG markup.
3.2 A sample SMIL markup.
3.3 A sample X3D markup.
3.4 Excerpt of an exemplary MPEG-7 document.
3.5 An exemplary MPEG-21 license.
4.1 A sample DL knowledge base.
4.2 Example RDF statements.
4.3 Example SPARQL query.
4.4 A sample SKOS document.
4.5 A sample FOAF document.
4.6 A sample DOAP document.
4.7 An excerpt of a sample JSON document.
9.1 The F/C-Space mapping rule.
9.2 Exemplary transition rule.
9.3 An exemplary user-defined rule.
10.1 XHTML source code excerpt of the deployed media asset.
10.2 Extracted RDF from an historical newspaper page.
10.3 Querying the embedded RDF metadata of the newspaper scan.
12.1 Resulting triples from the hAudio example.
A.1 RDF source code of the minimal media ontology (T-Box).
A.2 RDF source code of the minimal media ontology (A-Box).
A.3 Java source code of the PSIMeter application.


Introduction

When “a message from Chad and Steve”¹ reached the YouTube community in early October 2006, people asked: Why is Google going to put 1.65 billion dollars² on the counter? Without being on the board of Google it is hard to tell, though the core of the story is obvious: it is the multimedia, stupid!

Two fundamental types of resources are at odds on the Web: textual resources and multimedia resources or, more specifically, audio-visual content such as a PNG still image, an MP3 music clip, or an AVI video clip. While for textual resources an array of research [85; 285] and tools³ is available, multimedia issues with respect to the Semantic Web have not yet been widely addressed. In this work we focus on multimedia resources or, more precisely, on their description and the respective usage of the descriptions.

1.1 Motivation

The demand for real-world applications on the Semantic Web is steadily increasing. Simultaneously, existing Web applications handling millions of multimedia assets are starting to take advantage of Semantic Web technologies [237]. Although an increase in research activities in the media semantics area can be noticed over the past five to ten years, several core problems are still not satisfactorily solved: effectively and efficiently accessing distributed data sources, dealing with the Semantic Gap in multimedia content descriptions, and deploying media asset descriptions on a Web scale. These and other related issues stemming from real-world requirements may be among the reasons for the still largely academic reputation of Semantic Web (multimedia) applications.

¹ http://www.youtube.com/watch?v=QCVxQ_3Ejkg

² http://www.google.com/press/pressrel/google_youtube.html

³ http://www.txtkit.sw.ofcd.com/

Different parameters may influence the performance and the functionality of a Semantic Web multimedia application (SWMA). Attempting to build such scaleable and smart applications, one has to research manifold aspects of multimedia metadata generation, representation, and consumption. A multidimensional analysis is necessary to identify the requirements for a successful utilisation of media assets on the Semantic Web. Regarding access to distributed data sources, it can be noted that the RDFising process has not yet been widely researched. Some practical work has been reported, such as [264]. However, performance and scalability issues have by and large been neglected so far. Furthermore, automated understanding of multimedia content is an issue in Semantic Web multimedia applications; it is often referred to as the “Semantic Gap”, which is, following Smeulders et al. [271]:

... the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation.

Although substantial research efforts have been undertaken, a generic, domain-independent solution to the problem is not at hand. Understanding that a set of low-level features, such as colour or shape, actually stands for (that is, “means”) a certain entity in a domain, for example a “tree”, is a non-trivial task.

Most activities or projects addressing the Semantic Gap are little more than research prototypes using toy data sets. While the focus is often put on the expressivity of the description, aspects such as performance and scalability, extensibility, and interoperability still have not been widely addressed. Studer et al. [284] recently claimed:

Another challenge is to manage the expressivity-scalability trade-off of reasoning over declarative knowledge, enabling reasoning over large-scale distributed knowledge bases for suitably expressive knowledge representations. Automated knowledge acquisition will typically yield knowledge that’s uncertain—for example, fuzzy or probabilistic. Such knowledge must be represented and reasoned with in an adequate and scalable way. As knowledge from distributed knowledge bases is aggregated, a deeper semantics can emerge, letting intelligent agents discover patterns across people, roles, and tasks.

This work aims at addressing the expressivity-scalability trade-off in the realm of multimedia applications operating on the Semantic Web. The following example illustrates how easily one may run into trouble when dealing with a detailed description of audio-visual content.

Example 1.1 (Low-level feature description of a media asset with RDF). A video clip with a duration of one hour is described with MPEG-7. Several visual low-level features (F), such as colour, shape, and texture, are extracted for a number of spatial segments (S) per key frame (K). A multimedia ontology is then used to represent the MPEG-7 descriptors formally (on the basis of RDF); an average number of RDF triples is assumed for each descriptor (TD). An estimate of the resulting RDF graph size is then F · K · S · TD. Let us assume that we want to capture 10 features, that some 1000 key frames exist, that 10 spatial segments are marked up, and that 10 triples are required per descriptor. This yields a total RDF graph size of 1 million triples, just for describing the low-level features of an hour of video footage.
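The estimate in Example 1.1 can be reproduced with a few lines of Python. This is only a minimal sketch of the arithmetic; the function and parameter names are illustrative assumptions and do not stem from the thesis.

```python
def estimate_rdf_graph_size(features: int, key_frames: int,
                            segments: int, triples_per_descriptor: int) -> int:
    """Estimate the RDF graph size as F * K * S * TD (see Example 1.1)."""
    values = (features, key_frames, segments, triples_per_descriptor)
    if any(v < 0 for v in values):
        raise ValueError("all factors must be non-negative")
    return features * key_frames * segments * triples_per_descriptor


if __name__ == "__main__":
    # Figures assumed in Example 1.1: 10 features, 1000 key frames,
    # 10 spatial segments, 10 triples per descriptor.
    size = estimate_rdf_graph_size(10, 1000, 10, 10)
    print(f"{size:,} triples")
```

Because the estimate is a plain product, doubling any single factor (for instance the number of key frames) doubles the graph size.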

Finally, even if the above mentioned issue were resolved, another open issue exists: the deployment of multimedia metadata along with the content in the context of the Semantic Web. To the best of our knowledge, no proposal exists that addresses performance and scalability while also enabling formal descriptions of the multimedia resources.


1.2 Problem Definition

Several issues arise when building Semantic Web multimedia applications; based on a thorough analysis (cf. Chapter 6) we identify three issues as most significant regarding scalability and expressivity:

• performance and scalability issues in distributed multimedia metadata sources,

• efficient and effective representation of multimedia vocabularies and instances, and

• scaleable multimedia metadata deployment on the Semantic Web.

The following sections describe each of the above listed areas of research in greater detail and formulate corresponding research questions. Note that although the three selected research areas are not strongly interrelated, they have a recurrent theme: they all focus on both effectiveness and efficiency, hence the name of this thesis, Building scaleable and smart multimedia applications on the Semantic Web. While the three areas may be seen as orthogonal, they address different aspects in the design and implementation of Semantic Web multimedia applications.

1.2.1 Performance and Scalability Issues in Distributed Metadata Sources

A Semantic Web multimedia application needs to process RDF-based metadata stemming from a range of sources. When accessing and processing distributed metadata sources on the RDF level, the application has to deal with real-world limitations such as bandwidth, downtimes, etc.

While from the point of view of a Semantic Web agent it might not be of interest where the triples come from, the human user who has instructed the agent to carry out a task may well be interested in how long a certain operation takes.

Research Questions. What are the characteristics of (multimedia) data sources available on the (Semantic) Web? How can these efficiently be RDFised? Which are practical performance and scalability indicators?

Scope. For this problem, we assume that we deal with global descriptions of multimedia assets. For the performance and scalability indicator, a static, simple setup is assumed; the indicator should be evaluated in a multimedia Web application.
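To make the question of "how long a certain operation takes" concrete, the following sketch measures the wall-clock time and throughput of a single RDFising call. It is an illustrative assumption only, not the metric developed later in the thesis: the helper `measure_rdfising` and the stand-in source `toy_source` are hypothetical names, and the triples are generated locally rather than fetched from a real source.

```python
import time
from typing import Callable, Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def measure_rdfising(rdfise: Callable[[], Iterable[Triple]]) -> Dict[str, float]:
    """Run one RDFising call; report triple count, seconds, and throughput."""
    start = time.perf_counter()
    triples = list(rdfise())  # force the source to produce all triples
    elapsed = time.perf_counter() - start
    rate = len(triples) / elapsed if elapsed > 0 else float("inf")
    return {"triples": len(triples), "seconds": elapsed,
            "triples_per_second": rate}


def toy_source() -> Iterable[Triple]:
    # Hypothetical stand-in for a (remote) metadata source being RDFised.
    for i in range(1000):
        yield (f"http://example.org/asset/{i}",
               "http://purl.org/dc/elements/1.1/title",
               f"Asset {i}")


if __name__ == "__main__":
    print(measure_rdfising(toy_source))
```

In a real setting the callable would wrap network access, so bandwidth and downtimes would dominate the measured time.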

1.2.2 Efficient and Effective Representation of Multimedia Metadata

When building Semantic Web multimedia applications, the content being dealt with has to be described appropriately. In order to describe the content appropriately, a language has to meet a range of requirements. It has to be expressive enough to represent objects, events, and relations. There must be ways to assign descriptions to temporal and spatial segments. The granularity of the content description has to be adjustable. The language has to deal with concrete data types in all their forms (scalars, vectors, and matrices). To find a trade-off between expressivity and scalability, several aspects should be taken into account⁴:

⁴ For a detailed discussion on these aspects, the reader is invited to refer to Chapter 6.


• The granularity of the description usually has an impact on the size of the resulting description. The discriminator here is that of scope: an audio clip might be described globally in terms of genre (this MP3 file is a Jazz clip), or there might be a detailed description of the wave shape, energy, etc. for a certain time period (from time code X to time code Y the following parameters have been extracted: vector of signal parameters).

• The required inferential capabilities of the system influence the choice of the representation. If no or only simple queries are expected (return all documents that are mono and less than 1 min playtime), a simple metadata format (such as ID3 for music) might be sufficient. When advanced and even domain-specific retrieval operations are targeted (find me contemplative scenes with at least two people in it), formally grounded languages such as Description Logics or rule-based languages are usually a good choice.

• The usage of the content, which may further be differentiated into:

- number of users (limited group vs. Web-scale)

- content delivery (streaming, interactive, off-line)

- metadata deployment (embedded vs. referenced)

- access mode (broadcast vs. point-to-point)

- read-only vs. read/write

- personalisation of content
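The simple-query case from the list above (all documents that are mono and less than 1 min playtime) needs no inference at all; a filter over flat metadata records suffices. The sketch below illustrates this under stated assumptions: the record type `ClipMetadata` and its fields are hypothetical and do not mirror the layout of ID3 or any other concrete format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClipMetadata:
    # Hypothetical flat record; field names are assumptions for illustration.
    title: str
    channels: int            # 1 = mono, 2 = stereo
    duration_seconds: float


def mono_and_short(clips: List[ClipMetadata],
                   max_seconds: float = 60.0) -> List[ClipMetadata]:
    """Return all clips that are mono and shorter than max_seconds."""
    return [c for c in clips
            if c.channels == 1 and c.duration_seconds < max_seconds]


if __name__ == "__main__":
    library = [
        ClipMetadata("Jingle", 1, 12.5),
        ClipMetadata("Jazz clip", 2, 45.0),
        ClipMetadata("Lecture", 1, 3600.0),
    ]
    for clip in mono_and_short(library):
        print(clip.title)
```

A query such as "contemplative scenes with at least two people" cannot be phrased as a filter over such fields, which is where the formally grounded languages mentioned above come into play.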

Note that no single language currently covers all the above mentioned aspects. While MPEG-7, for example, is a good choice for representing low-level features, it fails to support the engineer in modelling high-level semantics. OWL, on the other hand, is quite expressive but lacks built-ins for complex concrete data types (such as matrices), temporal descriptions, and support for multimedia description in general⁵.

Research Questions. How can (formal) descriptions of multimedia content be represented effectively and efficiently? Is there a trade-off between scalability and expressivity and if yes, where?

Scope. A closed-world scenario is assumed; we focus on spatio-temporal descriptions of audio-visual material. Common multimedia metadata formats such as MPEG-7 should be taken into account.

1.2.3 Scaleable Multimedia Metadata Deployment on the Semantic Web

Many multimedia metadata formats, such as Exif or MPEG-7, are available to describe what a multimedia asset is about, who has produced it, etc. With the advent of User Generated Content (blogs, Wikis, etc.), a need for deploying these multimedia metadata (M3) formats in (X)HTML pages can be identified. Another motivation stems from the professional content realm, where detailed descriptions of cross-media content are required, along with rights management. Again, in the context of building Semantic Web multimedia applications, one key question regarding the deployment of the metadata is how to enable existing multimedia metadata formats to enter the Semantic Web in order to make them accessible to Semantic Web agents capable of handling RDF-based metadata.
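As a rough illustration of what "entering the Semantic Web" can mean for an existing format, the sketch below turns a flat dictionary of Exif-like key/value pairs into N-Triples lines. It is not the deployment approach proposed in this thesis; the namespace `http://example.org/exif#`, the function names, and the sample values are assumptions made purely for illustration.

```python
from typing import Dict, List

# Hypothetical namespace, used here only for illustration.
EXAMPLE_NS = "http://example.org/exif#"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def exif_to_ntriples(asset_uri: str, exif: Dict[str, str]) -> List[str]:
    """Emit one N-Triples statement per key/value pair, sorted by key."""
    return [
        f'<{asset_uri}> <{EXAMPLE_NS}{key}> "{escape_literal(exif[key])}" .'
        for key in sorted(exif)
    ]


if __name__ == "__main__":
    sample = {"Model": "X100", "Make": "ExampleCam"}  # made-up values
    for line in exif_to_ntriples("http://example.org/photo.jpg", sample):
        print(line)
```

Such a one-to-one mapping keeps the original metadata reusable, but it says nothing yet about where the resulting triples are published alongside the content.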

⁵ We note that the ongoing work regarding OWL 2 has not been taken into consideration; see also http:


Research Questions. How can existing multimedia metadata formats be deployed effectively and efficiently on the Semantic Web? What are the use cases?

Scope. It is assumed that reusability of existing material should be maximised. We consider a prototypical implementation sufficient as a proof of concept. The deployment description should be available as a vocabulary.

1.3 Reader’s Guide

The thesis at hand is roughly structured into five parts as shown in Fig. 1.1.

• Part I introduces the foundations and lists existing and related work;

• Part II discusses methods and requirements regarding scaleable, yet expressive multimedia content descriptions;

• Part III addresses the three core issues of engineering Semantic Web multimedia applications as of the problem definition;

• Part IV contains conclusions and contemplates future directions regarding Semantic Web multimedia applications;

• Part V (Appendix) gathers sources and the author’s contributions.

Readers familiar with both multimedia metadata and Semantic Web technologies may choose to skip Part I and start directly with Part II. The core of the thesis is Part III, as it addresses the research questions given earlier in this chapter (cf. section 1.2).

Note that a detailed explanation of the research this thesis is built upon is given later in section B.1 (Appendix B).

This work was accompanied by the author’s activities within W3C. The author was active in the first Multimedia Semantics Incubator Group (MMSEM-XG) in 2006/2007. Further, the author has been active in the Semantic Web Deployment Working Group (ongoing), focusing on the RDFa specification6, more specifically on the use cases, test cases and the implementation report. Finally, the author has been active in the LinkingOpenData7 project, realising the formalisation and interlinking of statistical data [125] and proposing a new interlinking method [144;143].

In the following, a detailed reader’s guide on the chapter level is given:

Part I - Scope and Foundations. Introduces the foundations and sums up existing work. The goal is to make the reader familiar with the problem domain.

Introduction. Gives a motivation and defines the research questions.

Related and Existing Work. Related work is discussed and critically reviewed.

Multimedia Metadata. Foundations of multimedia metadata (M3) are explained.

Semantic Web. Semantic Web basics are explained.

Part II - Methods and Requirements. Constitutes theoretical elaborations on scalable yet expressive multimedia content descriptions.

Creating Smart Content Descriptions. Elaborates on how multimedia content descriptions are created (from extraction to ontology engineering).

Scalable yet Expressive Content Descriptions. Introduces requirements for scalable yet expressive multimedia content descriptions.

Part III - SWMA Engineering. Addresses three core issues of engineering Semantic Web multimedia applications.

Rationale & Common Concepts. Lists basic design principles and defines common concepts.

A Performance and Scalability Metric for Virtual RDF Graphs. Addresses issues w.r.t. the access of distributed metadata sources.

Media Semantics Mapping. Addresses issues regarding the Semantic Gap in media descriptions.

Efficient Multimedia Metadata Deployment. Addresses multimedia metadata deployment issues.

Part IV - Conclusion and Outlook. Discusses lessons learned and future directions.

Concluding Remarks. The work is reviewed and discussed.

Outlook. A number of possible developments regarding SWMA are presented.

Part V - Appendix. Gathers sources and the author’s contributions.

Sources. Lists sources of RDF graphs and applications in the context of this work.

Author’s Contribution. Lists the author’s contributions in the realm of this thesis.

Reference Material. Offers a collection of good practice material for SWMA.

Glossary. Gives a short explanation of terms used in this work.

1.4 What this work is NOT about

This thesis does not attempt to define a solution for a semantic description of multimedia content. Defining such a formal specification, i.e., an ontology, would contradict the genuine idea behind ontologies: to be based on an agreement of domain experts. The reader is invited to note the plural form; it is due to the fact that ontologies are based on a shared understanding of a domain rather than to the author being unable or unwilling to perform such a task.

Not in the scope of this thesis are multimedia content issues, such as compression, codecs, etc. Further, issues such as data access and delivery (caching, streaming, broadband, etc.) or access control (ACLs, etc.) are not in the primary scope of the thesis. However, we refer to such issues if they have a significant impact on the issues discussed earlier, i.e., issues that are in the scope of the work.

Related and Existing Work

The title of this work, Building Scalable and Smart Multimedia Applications on the Semantic Web, contains terms that have to be clarified and put into context before one is able to go into greater detail. For quite a lot of these terms, no general definition is available. Where appropriate, such a definition is given in the context of the thesis. Hence, this chapter discusses the following terms, along with their interpretation in the context of the work at hand:

•“Semantic Web Applications”, cf. section 2.1

•“Multimedia Applications”, cf. section 2.2

•“Scalability and Expressivity”, cf. section 2.3

For each of the phrases an explanation is given, and relevant existing and related work is discussed. Where applicable, exemplary research projects are listed, some of which the author has participated in.

As an aside, it is worth noting that each technology undergoes certain phases ranging from foundational academic research to practical exploitation. According to Gartner’s Hype Cycle1, Semantic Web technologies are, at the time of writing of this thesis, in the so-called “Technology Trigger” phase. This first phase of a Hype Cycle is the breakthrough, product launch or other event that generates significant press and interest. While from the infrastructural point of view a lot of work has already been done (annotations, languages, services, etc.), practical aspects such as the scalability of metadata have not been widely addressed. However, a range of activities can be noticed in this field, be it grass-roots or educational and outreach activities.

1 A Hype Cycle is a graphic representation of the maturity, adoption and business application of specific technologies, http://www.gartner.com

2.1 Semantic Web Applications

In 2007, the Semantic Web Challenge2 (SWC) is being held for the fifth time in a row. The SWC, an event for demonstrating practical progress towards achieving the vision of the Semantic Web, is organized in conjunction with the International Semantic Web Conference (ISWC). It serves several purposes: the SWC (i) illustrates to society what the Semantic Web can provide, (ii) gives researchers an opportunity to showcase their work and compare it to others, and (iii) stimulates current research towards a higher final goal by showing the state of the art every year.

To ensure a certain level of comparability, the SWC has listed a number of minimal requirements that a Semantic Web application must meet in order to participate in the challenge. These criteria are outlined and discussed in the following.

1. The meaning of data has to play a central role.

•Meaning must be represented using formal descriptions,

•Data must be manipulated/processed in interesting ways to derive useful information, and

•This semantic information processing has to play a central role in achieving things that alternative technologies cannot do as well, or at all.

2. The information sources ...

•Should have diverse ownerships (i.e. there is no control of evolution),

•Should be heterogeneous (syntactically, structurally, and semantically), and

•Should contain real world data, i.e. are more than toy examples.

3. It is required that all applications assume an open world, i.e. assume that the information is never complete.
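The difference between the open-world requirement above and a closed-world reading can be sketched in a few lines. This is a minimal illustration and not part of the SWC criteria: the triples, the `ex:` identifiers and both function names are hypothetical. Under a closed world a missing statement counts as false; under an open world it is merely unknown.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy data; the "ex:" identifiers are illustrative only.
KNOWN: Set[Triple] = {
    ("ex:video1", "ex:depicts", "ex:Graz"),
}


def holds_closed_world(facts: Set[Triple], triple: Triple) -> bool:
    """Closed world: anything that is not stated is taken to be false."""
    return triple in facts


def holds_open_world(facts: Set[Triple], triple: Triple) -> Optional[bool]:
    """Open world: a missing statement is unknown (None), not false."""
    return True if triple in facts else None
```

An application that follows the open-world requirement has to handle the `None` case explicitly, for instance by consulting further sources, instead of treating absence as negation.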

In the context of the Semantic Web, the languages of choice are somewhat limited to being RDF-based, such as OWL and the like.

Regarding the “information sources”: Firstly, the requirement that the sources need to have diverse ownerships is obviously needed to demonstrate the Web characteristic; cf. Definition 2.2. Secondly, asking for real world data rather than for constructed, limited toy examples supports the very issue of this thesis. Regarding the “open world assumption”: Due to the Web-scale reasoning process, this is a non-trivial issue; recently Fensel and van Harmelen [98] elaborated on that issue.

We subscribe to the above stated view on the requirements for Semantic Web applications, and additionally point out that a Semantic Web application is a Web application, after all. The lessons learned in this area should be taken into account as well. Well-known infrastructure, processes, and methodologies3 for handling content and metadata should be utilised. Consequently, before we give a definition of what is to be understood by a Semantic Web application, we define Web application as follows4.

Definition 2.1 (Web Application).

A Web Application is a software program that meets the following minimal requirements:

•It is based on the HyperText Transfer Protocol (HTTP) [169] and Uniform Resource Identifiers (URI) [33;60]5;

•For human agents, the primary presentation format is the Hypertext Markup Language (X)HTML6 [330];

•For software agents, the primary interface is REST-compliant [100] or may be based on Web services7, cf. SOAP [276], WSDL [55], and UDDI [301];

•The application operates on the Internet;

•The number of (concurrent) users is undetermined.
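One common way to serve both kinds of agents named in the list above from a single URI is HTTP content negotiation. The following is a minimal sketch under that assumption; the definition itself does not prescribe it. The format lists, `parse_accept` and `choose_representation` are hypothetical names, and the fallback to HTML is a simplification.

```python
from typing import List

# Hypothetical offer: (X)HTML for human agents, RDF/XML or JSON for software agents.
HUMAN_FORMATS = ["application/xhtml+xml", "text/html"]
AGENT_FORMATS = ["application/rdf+xml", "application/json"]


def parse_accept(header: str) -> List[str]:
    """Return the media types of an HTTP Accept header, best first (by q-value)."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def choose_representation(accept: str) -> str:
    """Pick the first offered format the client accepts; fall back to HTML."""
    offered = HUMAN_FORMATS + AGENT_FORMATS
    for media_type in parse_accept(accept):
        if media_type in offered:
            return media_type
        if media_type == "*/*":
            return "text/html"
    return "text/html"
```

A browser sending `text/html,application/xhtml+xml;q=0.9,*/*;q=0.8` would receive HTML, whereas a software agent asking for `application/rdf+xml` would receive the RDF representation of the same resource.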

Note that where primary is used in Definition 2.1, it is possible and likely that other rendering formats (such as PDF8) or protocols (for example XMPP9) are offered by a Web application in addition to the ones mentioned. Note as well that the last characteristic affects both the scalability and the performance of a Web application. In the next step we give a definition of a Semantic Web application, based on Definition 2.1 and motivated by the requirements of the Semantic Web Challenge.

Definition 2.2 (Semantic Web Application).

A Semantic Web Application is a Web application that, in addition to the requirements listed in Definition 2.1, meets the following minimal requirements:

•The metadata (metadata sets) used in the Web application must be machine readable and machine interpretable10, i.e., it is based on the Resource Description Framework (RDF) [203]11;

•A set of formal vocabularies, potentially based on OWL [239], is used to capture the domain of discourse12; at least one of the utilised vocabularies and/or metadata sets has to be proven not to be under (full) control of the Semantic Web application maintainer;

•SPARQL [253] should be used for querying, and RIF [260] may be utilised for exchanging rules.

The restriction that a (Semantic) Web application is expected to operate on the Internet ensures that Intranet (or, for the sake of correctness, Intraweb) applications utilising (Semantic) Web technologies are not understood as (Semantic) Web applications in the narrower sense per se. The reader is invited to note that this requirement is a matter of the control over the data and the schemas rather than a question of the sheer size of the deployed application.
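The control requirement of Definition 2.2 can be approximated mechanically. The sketch below is an illustration under a strong assumption, namely that the host name of a predicate URI indicates who controls the vocabulary; the thesis does not define such a test. The sample triples and both function names are hypothetical, and only the Dublin Core namespace refers to an existing vocabulary.

```python
from typing import Iterable, Set, Tuple
from urllib.parse import urlparse

Triple = Tuple[str, str, str]

# Hypothetical metadata set describing one media asset as RDF-style triples.
TRIPLES = [
    ("http://example.org/media/clip1",
     "http://purl.org/dc/elements/1.1/creator",
     "Jane Doe"),
    ("http://example.org/media/clip1",
     "http://example.org/vocab#shotCount",
     "12"),
]


def predicate_hosts(triples: Iterable[Triple]) -> Set[str]:
    """Collect the host names of all predicate URIs used in the triples."""
    return {urlparse(predicate).netloc for _, predicate, _ in triples}


def uses_external_vocabulary(triples: Iterable[Triple], own_hosts: Set[str]) -> bool:
    """True if at least one predicate is hosted outside the maintainer's own hosts."""
    return bool(predicate_hosts(triples) - set(own_hosts))
```

For a maintainer who controls only `example.org`, the sample metadata passes the check because the `creator` property is published under `purl.org`. An application that used solely its own `example.org` vocabulary would fail it.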

5 See section 4.3.1 for details

6 http://www.w3.org/html/wg/

7 http://www.w3.org/2002/ws/

8 http://www.adobe.com/devnet/pdf/pdf_reference.html

9 http://www.xmpp.org/rfcs/

10 For a discussion on this issue the reader is invited to refer to [312, Section 1.1]

11 Section 4.3.3

12 See section 4.3.4 for details