[bxmlt2005] Nicola Henze: Personalization on the Semantic Web

This is today’s invited talk. Nicola Henze is a professor at Uni Hannover, which is Germany’s second-best Semantic Web research location, according to Robert Tolksdorf. She talks about the “people” in TBL’s famous definition of the Semantic Web: “… an extension of the current Web … better enabling computers and people to work in cooperation.”

(Slides, PDF)

She reviews personalization approaches on the old web and on mobile devices. Research has shown that users can access services faster on personalized mobile devices, are more satisfied with the service, and ultimately use it more.

There’s a hierarchy of personalization. Services can be unpersonalized, can identify users anonymously, can be aware of the user’s context (his or her location, time, device), or can have complex models of individual users (assessing their goals, interests, requirements).
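To make the hierarchy concrete, here is a minimal Python sketch of those levels. The names and the helper are mine, not from the talk:

```python
from enum import IntEnum

class PersonalizationLevel(IntEnum):
    """The hierarchy from the talk, least to most personalized."""
    UNPERSONALIZED = 0   # same content for everyone
    ANONYMOUS_USER = 1   # the service recognizes a returning user, nothing more
    CONTEXT_AWARE = 2    # knows location, time, device
    USER_MODEL = 3       # models the individual's goals, interests, requirements

def can_use_context(level: PersonalizationLevel) -> bool:
    """Anything at or above context awareness may adapt to location/time/device."""
    return level >= PersonalizationLevel.CONTEXT_AWARE
```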

Enter the Semantic Web. It can be used to improve the existing approaches: metadata about web pages (their subject, the fact that a page is the homepage of X, etc.) can be used to improve navigation and to select the relevant bits and pieces.
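As a rough illustration (my own example, not from the talk), this is the kind of page metadata she means, expressed with rdflib; the URIs are made up:

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC, FOAF

g = Graph()
page = URIRef("http://example.org/~henze/")       # hypothetical homepage
person = URIRef("http://example.org/~henze/#me")  # hypothetical person

g.add((page, DC.subject, Literal("Semantic Web")))  # the page's subject
g.add((person, FOAF.homepage, page))                # "it's the homepage of X"

# A personalization layer could now pick out pages by subject:
for p in g.subjects(DC.subject, Literal("Semantic Web")):
    print(p)
```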

It’s bad when we can’t comprehend why a personalized system does certain things, e.g. presents me with a certain piece of information. The Semantic Web enables proofs of these decisions: the system can explain why it shows me this. [Sounds a bit like TriQL.P.]
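The general idea of carrying a human-readable justification along with every decision could look like this. A toy sketch of mine, not how her system actually works:

```python
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    item: str
    proof: list[str] = field(default_factory=list)  # why the system chose this

def recommend(interests: set[str], items: dict[str, set[str]]) -> list[Recommendation]:
    """Recommend items whose subjects overlap the user's interests,
    keeping the matching facts around as an explanation."""
    results = []
    for item, subjects in items.items():
        hits = interests & subjects
        if hits:
            reasons = [f"you are interested in {h!r} and this item is about {h!r}"
                       for h in hits]
            results.append(Recommendation(item, reasons))
    return results

for r in recommend({"Semantic Web"}, {"Henze 2005": {"Semantic Web", "Personalization"}}):
    print(r.item, "because", "; ".join(r.proof))
```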

Case study: a system for information about scientific publications. [How original!] The problem to be solved here is duplication of information: information about publications originates on the authors’ home pages, but we want different views on it.

They use a product called LIXTO suite to extract the data from web pages. The data is then combined with additional ontologies, CiteSeer data, and personalization rules. RDQL queries are used to get stuff out of the data and show it in a custom interface.
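For flavor, a query of the kind she might mean. I’m guessing at the vocabulary, and rdflib only speaks SPARQL (RDQL’s successor), so this is an equivalent query, not their actual RDQL:

```python
from rdflib import Graph

g = Graph()
g.parse("publications.rdf")  # hypothetical file holding the extracted data

# SPARQL equivalent of an RDQL-style "give me all titles and years" query;
# the Dublin Core properties are an assumption on my part.
query = """
    SELECT ?title ?year WHERE {
        ?pub <http://purl.org/dc/elements/1.1/title> ?title ;
             <http://purl.org/dc/elements/1.1/date>  ?year .
    }
"""
for title, year in g.query(query):
    print(year, title)
```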

She doesn’t give many details. What happens in the backend? There are “Ontology” and “Reasoning” components on the architecture slide; what do they do? How does the rendering of RDF to HTML work? There’s more in the slides.

Basically it’s a portal that aggregates publication information from the authors’ home pages.

She reads from a novel — I missed the title — where someone uses a computer to evaluate candidates’ claims in an election campaign. Giving a few orders to the computer — remove all obvious lies, superfluous stuff, etc. — reduces their programmes to a few words. Very timely: there are elections in Germany next weekend.

The extraction from HTML is done with regular expressions. You need a new extractor for each new data source. If extraction fails because the HTML structure has changed, a warning is raised. This works reasonably well because publication data always looks quite similar.
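A stripped-down sketch of that pattern (one regex per source, warning on failure); the actual LIXTO extractors are surely more elaborate, and this pattern is invented:

```python
import re
import warnings

# One extractor per data source. This pattern is made up for illustration;
# it matches lines like "Doe, J.: Some Title. In: Proc. Foo, 2004".
PUBLICATION_RE = re.compile(
    r"(?P<authors>[^:]+):\s*(?P<title>[^.]+)\.\s*"
    r"(?:In:\s*(?P<venue>[^,]+),\s*)?(?P<year>\d{4})"
)

def extract(line: str) -> dict | None:
    m = PUBLICATION_RE.search(line)
    if m is None:
        # the page structure probably changed -> raise a warning, as in the talk
        warnings.warn(f"extractor failed on: {line!r}")
        return None
    return m.groupdict()

print(extract("Doe, J.: Personalization on the Semantic Web. In: Proc. BXMLT, 2005"))
```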
