Why now?
"As interface stands on the shoulders of infrastructure,
tomorrow's user experience will rest on the foundation of today's
Semantic Web technologies."
Peter Morville, Ambient Findability, 2005, p. 170
Why now? - an analogy...
- The energy infrastructure was initially stand-alone... we foraged, made, or bought locally
- Local electric power generation, particularly co-located with industry sites
- Power lines and the 'grid'... standards, and a visible sign of the infrastructure's existence
- Buried lines... out of sight, but relied upon
- Eventually...
- Nearly invisible, relatively standardized resource
- Additional uses... like audio speakers riding the home wiring
- Different innovations/sources
- ... and local initiatives that give back to the grid
... analogy thanks to Nathalie Barthe
The imperative for Usability and Interaction Design involvement
"Well, you said that for people to be able to handle data they need a lot of skill...
We do not yet have Semantic Web technology available which is that easily usable by grandparents and children. That is true."
Tim Berners-Lee, "The future of the Web as seen by its creator"
in an interview with Peter Moon, IDG Now, July 9, 2007
"After 10+ years of work into various aspects of the Semantic Web...
I am now fully convinced (read: no longer in denial) that most of the remaining challenges to realize the Semantic Web vision have nothing to do with the underlying technologies...
Instead, it all comes down to user interfaces and usability."
Ora Lassila, "Semantic Web Soul Searching"
Wilbur-and-O blog, March 19, 2007
Evolving understandings
- The "Semantic Web"... a grand vision
- The "semantic web"... heterogeneous, interconnected, self-describing
- Web 2.5... or Web 3.0...
- ... or "Linked Open Data"... the "Web of data":
- can we be less 'purist' in markup?
- can we be less 'purist' in data?
- inclusive, more organic process
- is it 'bootstrapping' or 'digression'? ...it is up to our users to shape
It's about the lines, not the points
focus on describing the relationships between things
It's about design
focus on relevance to tasks
expand possibilities
What problems are we trying to solve?
- Relevance
- About-ness
- Directness
- Reliability and provenance
- Scalability
- Usability:
- context
- relationships
- integration
- meaning
Some things to think about
- User-centered, rather than publisher-centered
- Vision of rich context: formal, social, personal, situational
- A seamless user experience... user goals and tasks are facilitated easily, no matter what technologies and applications involved
- Based on frictionless data... data is free to move between applications and uses as needed
- Relevant and pro-active information
Is Metadata back in fashion?
Metadata
- Inward-looking: data about the content or application itself
- Outward looking: the situations, tasks, context, and real world subjects
- Cross-referring: the relationships that an item of content participates in
- Embedded: the internal structure of content and its markup
Metadata is an enabler
Themes
- Navigating information landscapes
- Search and natural language interaction
- Creating and sharing information
- Vocabulary and relationship management
- Personal and social issues: provenance, transparency, shared data and privacy
Themes derived from Lisa Battle's 2006 analysis of Semantic Web research and projects:
Preliminary Inventory of Users and Tasks for the Semantic Web
Navigating Information Landscapes
What is useful? What challenges exist?
Navigating information landscapes
- Increasingly common to see structured data:
- Faceted filtering and browsing
- Related topics, tagging and cross-linking
- Mashups
- Getting easier to implement
- Consistency? Usefulness? Interaction challenges?
Navigating information landscapes
Exploring facets
- Categories that you control and select
- It's becoming common for eCommerce and large site navigation
- Increased relevance for users
- There are two types of interaction: filtering and browsing
- Filtering is about starting with large lists and narrowing... a relevance funnel
- Browsing is about exploring the structure of the data to find useful areas of interest
- The line is blurring between search filtering and browsing...
- The capabilities are now being built into CMSs and other web tools
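The two interaction styles above can be sketched in a few lines. This is a minimal illustration only: the catalogue records, the field names, and both helper functions are hypothetical and are not taken from any of the tools discussed in this deck.

```python
from collections import Counter

# Hypothetical catalogue records; the field names are illustrative only.
ITEMS = [
    {"title": "Trail shoe", "category": "footwear", "colour": "red"},
    {"title": "Road shoe", "category": "footwear", "colour": "blue"},
    {"title": "Rain jacket", "category": "outerwear", "colour": "red"},
]

def filter_items(items, selections):
    """Filtering: keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]

def facet_counts(items, facet):
    """Browsing: summarise which values a facet offers and how many items each holds."""
    return Counter(item[facet] for item in items if facet in item)

# Narrow the list first (the relevance funnel), then explore what remains.
narrowed = filter_items(ITEMS, {"colour": "red"})
print([item["title"] for item in narrowed])
print(dict(facet_counts(narrowed, "category")))
```

Showing the counts next to each facet value is one way the line between filtering and browsing blurs: the same data drives both the funnel and the overview.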
Navigating information landscapes
Exhibit: Simple faceted presentation
Created by the SIMILE group, MIT
- It's how the Web grew: "see what I did, copy it, try it"
- Reusable/sharable controls can lead to a more consistent experience
- Lightweight data structure... emphasis on structure
Navigating information landscapes
mSpace and mSpace Mobile: facet browsing
Created by IAM Group, ECS, Southampton, UK
- Semantic Challenge winner, 2004
- "Throw it over" any structured data - even mashups
- Start with any facet, move in any direction
- Considerations?
- How easy to add new facets?
- Will people easily grasp that they control sequence of facets?
- Locational context - a primary "filter" for mobile
- Start with "what do I want to know here?"
- Add social recommendation - based on my friends
- Focus+Context for small screens
Navigating information landscapes
Multimedia eCulture: Rijksmuseum, Netherlands
Created by CWI, Netherlands
- Semantic Challenge winner, 2006
- Rich metadata, with ability to link to other international data sets
- Faceted search results, browsing rather than filtering
- Challenges:
- Labeling
- What metadata to show, what to hide?
- Is the interaction sufficiently... or too... "serendipitous"?
Navigating information landscapes
IRS TaxMap
Created by InfoLoom
- Starts with highly structured information
- Call centers - high volume, public language
- Adds:
- Index-like topic access
- Alternative terms more in line with the public
- Contextual relationships to other topics
Navigating information landscapes
Using context as a framework to present content
Navigating information landscapes
Using context as a framework to link content
Navigating information landscapes
Tag-based navigation
- Tags are now a common sight, and they solve real needs
- ...but, tags are having some trouble morphing into "folksonomies" and they don't scale well
- They have another challenge - two different motivations:
- Unique identity, for personal findability
- Generic descriptor, for social visibility
- Trend: grouping tags
- Trend: Suggestions based on structured data sets, better semantic integration, e.g.:
- Defined terminology database
- Wikipedia subjects
- Semantic relationships for tags, links, and people
Navigating information landscapes
Visualization overload? The "Big Fat Graphs"
- There are many really good uses for visualization of data. There are also at least as many poor uses in the semantic web community...
- Just because the underlying data model is a graph may not mean that's the best way to display the data for users
- See the considerations outlined by m.c. schraefel and David Karger in The Pathetic Fallacy of RDF (from the 3rd International Workshop on the Semantic Web and User Interaction, Nov 2006, Athens, Georgia)
Navigating information landscapes
Data browsing
- Follow the related paths or "walk" the logic in the ontology
- Useful to see sources of data, and connections between elements
- Tabulator, a project started by Tim Berners-Lee, is a Firefox plug-in
- Tools that an ontology/data specialist or developer is most likely to use?
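As a rough sketch of what "walking" linked data means, the following follows every outgoing link from a starting node. The triples and the `ex:` names are made up for illustration, and this is not how Tabulator itself is implemented.

```python
# Hypothetical (subject, predicate, object) triples; all names are illustrative.
TRIPLES = [
    ("ex:CardSorting", "ex:methodUsedFor", "ex:InformationArchitecture"),
    ("ex:CardSorting", "ex:expectedDuration", "1 hour"),
    ("ex:InformationArchitecture", "ex:partOf", "ex:UserExperience"),
]

def outgoing(subject, triples=TRIPLES):
    """Return the (predicate, object) pairs that leave one node."""
    return [(p, o) for s, p, o in triples if s == subject]

def walk(start, triples=TRIPLES):
    """Follow links breadth-first, yielding each triple reachable from `start`."""
    seen, queue = {start}, [start]
    while queue:
        node = queue.pop(0)
        for predicate, obj in outgoing(node, triples):
            yield (node, predicate, obj)
            if obj not in seen:  # avoid looping forever on cyclic data
                seen.add(obj)
                queue.append(obj)

for triple in walk("ex:CardSorting"):
    print(triple)
```

The output is a flat list of statements, which hints at the design question on this slide: the raw walk suits a data specialist, while most users would need it grouped and labelled.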
Navigating information landscapes
SADIe: a unique use of SW technologies for accessibility
Created by Universities of Manchester and Aberdeen, UK
Search and the possible role for natural language interaction
Is there something other than "keywords"?
Search and natural language interaction
- "Semantic search"... it's still early days
- Full sentence and structured sentence searching - how does it work for users?
- Computer algorithms parsing text in web pages/documents and assigning meaning and relationships
- OK, it's pretty cool (not Cuil?)... but how do we test the underlying algorithms, assumptions, and data sets?
Searching information landscapes
Keywords, meaning, and natural language
- Search is still dominated by keywords and flat results lists
- More focus now on refinement, with richer interaction in results page
- Increasing visualization, use of metadata, clusters/facets
- Many search algorithms are geared toward huge data sets (i.e. the entire Web) and little metadata availability
- Is natural language too "heavy" or too fragile?
- How useful is an intermediate step, such as "structured natural language" for more controlled environments?
Searching information landscapes
Structured natural language search
Dynamic & Distributed information Systems Group, University of Zurich
- User research on different levels of structure, language, interaction
How Useful Are Natural Language Interfaces to the Semantic Web for Casual End-Users?
(pdf) by Esther Kaufmann and Abraham Bernstein, ISWC 2007
- "The results of the study reveal a clear preference for full sentences as query language and confirm that NLIs are useful for querying Semantic Web data."
- "...a highly significant preference for full-sentence queries independent of the retrieval performance."
- "One of the most prominent qualitative results was that several users, who rated Querix as best interface, explicitly stated that they appreciated the 'freedom of the query language.'"
- With full-sentence questions, users can communicate their information need in a familiar and natural way without having to think of appropriate keywords
- People can express more semantics when they use full sentences and not just keywords.
Searching information landscapes
Complex search, where semantic data relationships play a role
- Synthesis and exploration... not just targeted or 'good enough'
- High context with implicit as well as explicit relationships
- For example:
- What Greek restaurants are open after 10pm within three blocks of a movie theatre where I can see the latest DiCaprio film?
...and can I get there in 45 minutes, with current traffic?
- How do I process this claim for a back treatment while the patient was on vacation in Maryland? Is it subject to the new legal requirements now in effect in her home state of Texas?
- A new crop of search tools arriving...
- Parsing search phrase/sentence structure to try and support context identification
- Using concept extraction and ontology integration in back-ends to improve context identification
- Like so much in search, could use more usability work
I think it's fair to say that next generation tools are on the way, once you attach names like Microsoft and Yahoo to the trends...
Creating and sharing information...
rapid acceleration
Data entry and annotation
Sites that share information
Creating and sharing information
- Semantic syntax is becoming part of the "fabric" of applications
- RDFa (a W3C Recommendation since October 2008) is a catalyst
- Strong encouragement: Yahoo (SearchMonkey), Thomson Reuters (Calais), Drupal... and others
- Is metadata really back?
- We can enrich the user experience!
- We can grow the semantic web organically, from the bottom up!
- Of course, our stuff will show up in unexpected places, and be used in unexpected ways
Creating and sharing information
If you don't have the data, what do you have?
- We all know: having users enter metadata is crazy
- ... or is it? Maybe it's just hard?
- Increasing focus on capturing, extracting, sharing, and organizing structured data
- Lightweight data entry interfaces
- Significant public datasets available
Creating and sharing information
Drupal
- Dries Buytaert: RDFa support into Core (v8), reaching for a fully semantic-enabled CMS
- An active community building modules for RDF, and structured/linked data capabilities
- Some examples of what is possible:
- Calais and OpenPublish integration (Thomson Reuters and Phase2)
- RDF data mapping and management from CCK (DERI and the Semantic Group community)
- Solr and Lucene integration (Acquia)
For example, Phase2's architecture stack concept around OpenPublish (presented at Drupalcon 3/2009, Washington, D.C.):
Creating and sharing information
LepTree, Splickr, and Spotter: biology community support
Created by Univ. Maryland Baltimore Campus (part of AToL project)
- LepTree community site: Drupal-based content management system with semantic triple-store
- AJAX techniques allow structured data entry "in place" - using an ontology for selection fields
- Splickr: annotating content with metadata, at the same time providing that metadata to the archive for other uses
- Spotter: map-based locating and accessing blog entries created with Splickr
Creating and sharing information
Semantic Wiki: semantic extensions to the popular MediaWiki
Created by Ontoprise and Institute AIFB, Universität Karlsruhe, Germany
- You are reading a page about "Card Sorting" and see a link to "Information Architecture"
- In a regular wiki, IA would be typed in as [[Information Architecture]]
- In semantic wiki, IA would be typed in as [[method used for::Information Architecture]], which reads (to a semantic parser): "Card Sorting is a method that is used when doing Information Architecture [link]"
- An expected amount of session time might be stated in the article on "Card Sorting"
- In a regular wiki, you can read the anticipated time, but the computer can't
- In semantic wiki, type in [[expected duration:=1 hour]], i.e. "A card sort session has an expected duration of 1 hour"
- Builds on the success of MediaWiki (which drives Wikipedia)
- Syntax for relationships between topics as triples
- Not easy for the "average user" -- so they've been working hard on that...
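To show how the two annotation forms quoted above read as triples, here is a minimal sketch. The regular expression and the `extract_triples` helper are assumptions made for illustration; they are not the extension's actual parser and cover only the `::` and `:=` forms shown on this slide.

```python
import re

# Matches [[property::value]] and [[property:=value]]; plain links such as
# [[Information Architecture]] contain no separator and are ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:]+)(?:::|:=)([^\[\]]+)\]\]")

def extract_triples(page_title, wikitext):
    """Return (subject, property, value) triples, using the page as subject."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]

text = (
    "Card sorting supports [[method used for::Information Architecture]]. "
    "A session has an [[expected duration:=1 hour]]. "
    "See also [[Usability]]."
)
for triple in extract_triples("Card Sorting", text):
    print(triple)
```

The page title supplies the subject, so the author only types the property and the value. That keeps the syntax short, but it is still markup that an "average user" has to learn.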
Creating and sharing information
Semantic Wiki: semantic extensions to the popular MediaWiki
Created by Ontoprise and Institute AIFB, Universität Karlsruhe, Germany
- New interfaces for three tasks:
- Adding relationship, category and property information to terms in the wiki page
- Browsing the ontology in the wiki
- Querying the ontology using structured query syntax to find particular information
Creating and sharing information
Personal Information Scraps
Simile Group, MIT and IAM Group, U. Southampton
- What are scraps?
- If "apps" are rich interaction with structured data, then...
- Scraps: lightweight capture of unstructured data with high contextual relevance
- Easy, lightweight, and flexible for different styles
- Jourknow, Inky, and AtomsMasher: exploratory and very interesting!
- Ethnographic process: study of scientists' PostIt notes, desks, notebooks, and computer filing:
Creating and sharing information
Personal Information Scraps
Simile Group, MIT and IAM Group, U. Southampton
- The group is working on a range of small interfaces to facilitate easy and useful scrap management. Here are a few examples (but keep your eye on this group, because they refine and improve the work regularly):
- Simple data capture
- Exploring context behind a particular scrap
- Relating scraps to other things
Creating and sharing information
Personal Information Scraps: Pidgin
Simile Group, MIT and IAM Group, U. Southampton
In order to facilitate people creating and managing scraps, they are looking at different levels of informal-to-formal syntax that might be used.
| Pidgin level | Example | How it is interpreted |
|---|---|---|
| sloppy pidgin | jane 3pm diesel cafe | "Sloppy parsed" to allow out-of-order matching and recursive nesting of typed templates. |
| tame pidgin | Meet with Jane phone 617-555-1212 tomorrow at diesel cafe about SWUI submission | Hand-written grammars for common domains, with semi-open SW-KB defined lexicon, and support for nested expressions. Not user-extensible or re-orderable. |
| clay pidgin | meet 3pm with jane smith about swui | User-defined N3 macro language using "means" templates written by the user. Support for nesting. No re-ordering clauses. Template: "meet when with whom about what" means [ a :Meeting; vcal:start "when"; xcal:attendees "whom"; xcal:description "what"]. |
| n3+res pidgin | swui mtg a Meeting; starts at: 3pm tomorrow; with jane; location Diesel Cafe | N3 with entity, property, and value resolution. Uses a colon or dash to delimit multi-word properties from their values, and semicolons to delimit clauses. |
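The "clay pidgin" template lends itself to a small worked example. The sketch below turns the quoted template into a regular expression and fills the properties named in its "means" clause. The `compile_template` and `expand` helpers are hypothetical, and the approach is an assumption about how such a template could work, not the group's implementation.

```python
import re

# Slot-to-property mapping copied from the "means" clause of the template.
MEANS = {
    "when": "vcal:start",
    "whom": "xcal:attendees",
    "what": "xcal:description",
}

def compile_template(template, slots):
    """Turn 'meet when with whom about what' into a regex, one named group per slot."""
    parts = []
    for word in template.split():
        if word in slots:
            parts.append(f"(?P<{word}>.+?)")   # slot: capture free text
        else:
            parts.append(re.escape(word))      # literal keyword: must appear as-is
    return re.compile("^" + r"\s+".join(parts) + "$", re.IGNORECASE)

def expand(note, template="meet when with whom about what"):
    """Return a property dict for a matching note, or None if it does not match."""
    match = compile_template(template, MEANS).match(note.strip())
    if match is None:
        return None
    result = {"a": ":Meeting"}
    for slot, prop in MEANS.items():
        result[prop] = match.group(slot)
    return result

print(expand("meet 3pm with jane smith about swui"))
print(expand("jane 3pm diesel cafe"))  # no keywords in order, so no match
```

Because the keywords must appear in template order, this sketch mirrors the "no re-ordering clauses" limit in the table. The sloppy variant would need a more forgiving matcher.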
Creating and sharing information
Annotation
- Annotating specific points within web pages - for research and review
- Key issues:
- When are they findable and useful?
- When found, how do annotations carry the context in which they were created... what were you thinking about at the time?
- How do we design with browser-embedded plug-ins in mind?
- Behind the scenes in Annotea, some of the structural concepts:
Creating and sharing information
PhotoStuff: archiving and annotating your photos
Created by MindLab, Univ. of Maryland
- Most photo sites now have annotations of portions of images... but it wasn't always this way
- PhotoStuff (from 2002-2003) is annotation where data capture is tied to ontologies
- Load ontologies to get the initial classification options available
- For each classification item there is a set of "fields" for specific instance data
- Adding a new required data field can be done by adding an item in an ontology... no new code!
- Annotations are shareable
- They are also portable, if you decide to move from one photo site to another
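The "no new code" point can be illustrated with a toy sketch in which the form is derived entirely from the loaded vocabulary. The `ONTOLOGY` dictionary is a hypothetical stand-in for a real ontology file, and `form_fields` is not PhotoStuff's API.

```python
# Hypothetical stand-in for a loaded ontology: class name -> property names.
ONTOLOGY = {
    "Person": ["name", "depicted_at"],
    "Building": ["name", "architect"],
}

def form_fields(ontology, class_name):
    """Derive the annotation form for a class from the ontology alone."""
    return [{"field": prop, "value": ""} for prop in ontology[class_name]]

# Adding a property to the ontology adds a field; form_fields is unchanged.
ONTOLOGY["Building"].append("year_built")
print([f["field"] for f in form_fields(ONTOLOGY, "Building")])
```

The interface code never names a specific field, so extending the vocabulary is an editing task rather than a programming task.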
Creating and sharing information
Yahoo Pipes: a graphical mashup builder
- Building mashups visually in the browser
- Host the mash-up on their site, or use the API to embed it in yours
Vocabulary and Relationship Management
Vocabulary and relationships
Working with vocabulary
- There is an increasing amount of publicly available, structured terminology to use
- There are also economical ways to extract structure from unstructured content
- We still need good tools, to:
- Create our own domain/organization-specific vocabulary
- Review and make decisions about external or extracted vocabulary
- Integrate vocabularies from different sources, to suit our particular purpose
- Manage vocabulary and tagging over time
Vocabulary and relationships
Existing terminology sets
- There is much to learn from smart people and their published efforts
- Just because the vocabulary sets are available doesn't mean it's effort-free
- Many different formats, structures, purposes -- and licenses
- There is a lot of new data flooding the Internet, for example:
Vocabulary and relationships
Example of a widely-used ontology: FOAF - Friend of a Friend
Collaborative project started by Libby Miller and Dan Brickley, 2000
- Common, sharable information about people... and their relationships to other people
- Increasingly used as the format for describing people
- Challenge: a person can have more than one... work-arounds are able to cross-refer, but they're not really easy
- Creating a FOAF profile could be much easier to do... and so could maintaining one
- Where can they be found?
- People's individual sites
- FOAF Bulletin Board (part of FOAF wiki)
- Semantically-rich commercial sites
- Social community sites like Tribe and LiveJournal
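For readers who have not seen one, a FOAF description is short. The sketch below writes a minimal profile as Turtle text using the FOAF namespace and the `foaf:Person`, `foaf:name`, and `foaf:knows` terms. The `foaf_profile` helper and the example.org identifiers are hypothetical, and the code does not escape special characters in names.

```python
FOAF_NS = "http://xmlns.com/foaf/0.1/"

def foaf_profile(person_id, name, knows=()):
    """Render a tiny FOAF description as Turtle text (no string escaping)."""
    lines = [
        f"@prefix foaf: <{FOAF_NS}> .",
        "",
        f"<{person_id}> a foaf:Person ;",
        f'    foaf:name "{name}"',
    ]
    for friend_id in knows:
        lines[-1] += " ;"                       # more statements follow
        lines.append(f"    foaf:knows <{friend_id}>")
    lines[-1] += " ."                           # close the description
    return "\n".join(lines)

print(foaf_profile(
    "http://example.org/people/alice#me",
    "Alice",
    knows=["http://example.org/people/bob#me"],
))
```

Even at this size, the author has to choose identifiers for themselves and for everyone they link to, which is part of why creating and maintaining a profile is harder than it looks.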
Vocabulary and relationships
Concept extraction and term identifiers
- Concept extraction: TermExtractor (Sapienza, Univ. Roma) and TerMine (National Centre for Text Mining, Univ. Manchester and Tokyo)
- Not many of these out in the "free web" space
- Simple approach to uploading one or more files for rapid analysis
- Quite a lot of control over analysis... if you can understand what the controls offer!
- Gnosis: Firefox plug-in for on-the-fly concept identification (based on Calais)
- Calais service, and extensions for WordPress and Drupal for extracting concept metadata to be used as tags
- Zemanta: background searches based on the content you are creating in popular content management systems and blogs, with plug-ins for WordPress, Movable Type, and Drupal
Vocabulary and relationships
Ontology editing and viewing
Protégé editor, U. Stanford; Crop Circles by MindLab, U. Maryland / U. Manchester; TopQuadrant
What are some of the issues?
- Useful for bridging different sources with different data representations
- Allows construction of complex data relationships
- Requires knowledge representation awareness and domain experience
- Collaborative editing is coming, but still early days
- Interactions and visualizations don't tend to scale well
- Maybe we need to find other ways? An open discussion...
Personal and Social Issues:
Shared data, provenance, transparency and privacy
Hot topics for longer-term consideration
Personal and social issues
- Shared data
- Building on the strength of Web 2.0
- Stronger relationships... between each other... between our data "selves"
- Provenance
- Trust will raise its head, as it does with every technology shift
- Tracing data history... more possible than ever... and more important than ever
- Transparency
- We've been hearing a lot about that lately!
- What does it really mean in a widely inter-linked and structured data world?
- So the data is transparent... what about the manipulations?
- Privacy and informed consent
- Who's in charge?
- How do we know what we've given permission for?
- In future, we're not just giving permission for data... we could empower agents with responsibility for action on our behalf
Personal and social issues
Challenges
- It's about the data !
- The Facebook, Flickr, Google, et al. dramas: whose data is it?
- Mine, and it remains behind the wall until I delete it
- Mine, or someone's, and it's sharable and linkable
- Mine, and I can take it with me when I go
- Mine, and the site owner's...? How deep do I have to read in the License?
- The lurking "freshness" problem...
- Am I comparing apples and oranges? Lemons?
- Does everything I'm interacting with carry similar meanings and purposes?
- How do I know?
- ...and will I be overwhelmed if I do know?
- It's about the data !
Personal and social issues
CS AktiveSpace: mashup of academic data, with provenance
Created by IAM, University of Southampton
- Can source information and provenance have >1 step back?
- Is this a generic browser function, rather than an application-by-application design?
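To make "more than one step back" concrete, here is a minimal sketch that follows a chain of sources until it runs out. The record names and the `DERIVED_FROM` mapping are invented for illustration and do not describe how CS AKTive Space stores provenance.

```python
# Hypothetical provenance records: each item points at the source it came from.
DERIVED_FROM = {
    "mashup:researcher-page": "harvest:publications-db",
    "harvest:publications-db": "source:department-website",
}

def provenance_chain(item, derived_from=DERIVED_FROM):
    """List every source behind `item`, nearest first, guarding against cycles."""
    chain, seen = [], {item}
    while item in derived_from:
        item = derived_from[item]
        if item in seen:      # malformed data could loop back on itself
            break
        seen.add(item)
        chain.append(item)
    return chain

print(provenance_chain("mashup:researcher-page"))
```

Nothing in the function is specific to one application, which is the argument for treating provenance tracing as a generic browser capability.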
Personal and social issues
Inference Web: explanation interfaces
Created by Stanford University
- The logic in ontologies means "indirect" connections between question and answer
- What are queries and agents doing?
- Transparency: "how did you arrive at that answer?"
- Can anything more than the simplest, non-real-world logic be expressed in a way we can understand? What are user expectations?
Personal and social issues
Informed consent: what's unique about the Semantic Web?
From presentation/paper by Paul Shabajee, SWUI2006

Personal and social issues
Recommender systems and understanding "trust"
Jennifer Golbeck, MindLab, University of Maryland
- Do we perceive recommendations differently when we know who provided them?
- How can we better inter-connect our trusted relationships with other people?
- Since many of our relationships are context-specific (e.g. work, or family, or hobbies), how does a recommender system put things in the context of: 1) who we know, and 2) the situations where we interact with them?
- Jen's 2008 paper offers a range of study data on controlling factors and implications: Trust and Nuanced Profile Similarity in Online Social Networks (ACM Transactions on the Web, to appear)
For your consideration...
How do we make sure the Semantic Web is:
- Better than the experience we have today
- So easy, anyone can describe themselves/their information semantically
- A trust-worthy and provable representation of our interests
- Forgiving of differences in language and meaning, being clear and respectful of semantic "shades of gray"
- Able to clearly show what a "good" experience is (complete, understandable, transparent, semantically rich, trustable, not overwhelming), when much of the activity is happening in the background using semantic applications and agents
- Able to grow organically (and with few dependencies), while also moving from the "web of content" to the "web of data"... and even toward the "web of meaning"