source: SMC/trunk/SMC/src/web/docs/userdocs.html @ 6846

Last change on this file since 6846 was 6846, checked in by mateusz.zoltak@oeaw.ac.at, 9 years ago

Bunch of fixes making it easy to deploy a working instance:

  • JS libraries names in the repo now agree with URLs in src/web/index.html
  • sample data added - now data list and locations agree with the list set up in src/web/scripts/config.js
  • src/web/get.php generates allow-origin headers so you can query it with AJAX
  • static pages (examples, userdoc) added (copied from the clarin instance)
  • few fixes in smc.graph.js (most notably it can again read old graph jsons in which nodes in links are denoted by index and not by id)

(Probably) the last SVN commit, we are switching to git

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.11: http://docutils.sourceforge.net/" />
<title>SMC Browser</title>
<link rel="stylesheet" href="../scripts/style/cmds-ui.css" type="text/css" />
</head>
<body>
<div class="document" id="smc-browser">
<h1 class="title">SMC Browser</h1>

<p>Explore the <cite>Component Metadata Framework</cite></p>
<p>In <em>CMD</em>, metadata schemas are defined by profiles that are constructed out of reusable components, i.e. collections
of metadata fields. The components can contain other components, and they can be reused in multiple profiles.
Furthermore, every CMD element (metadata field) refers via a PID to a data category to indicate unambiguously how the content of the field in a metadata description should
be interpreted (Broeder et al., 2010).</p>
<p>Thus, every profile can be expressed as a tree, with the profile component as the root node, the used components as intermediate nodes
and elements or data categories as leaf nodes. The parent-child relationship is defined by inclusion (<tt class="docutils literal">componentA <span class="pre">-includes-&gt;</span> componentB</tt>) or referencing (<tt class="docutils literal">elementA <span class="pre">-refersTo-&gt;</span> datcat1</tt>). The reuse of components in multiple profiles, and especially the referencing of the same data categories in multiple CMD elements, blends the individual profile trees into a graph (directed and acyclic, but not necessarily connected).</p>
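<p>The blending of profile trees into a graph can be illustrated with a minimal Python sketch. The node names and the edge list are made-up sample data, not taken from the Component Registry; each pair stands for one <tt class="docutils literal">includes</tt> or <tt class="docutils literal">refersTo</tt> edge. Nodes with more than one parent are the points where the individual trees merge.</p>
<pre class="literal-block">
```python
from collections import defaultdict

# Hypothetical sample data: two profiles sharing a component and a data category.
# Each (parent, child) pair is one "includes" or "refersTo" edge.
EDGES = [
    ("profileA", "componentX"),
    ("profileB", "componentX"),       # componentX is reused in both profiles
    ("componentX", "elementTitle"),
    ("profileB", "elementName"),
    ("elementTitle", "datcatTitle"),
    ("elementName", "datcatTitle"),   # the same data category is referenced twice
]


def build_graph(edges):
    """Return child and parent adjacency maps for a directed graph."""
    children = defaultdict(set)
    parents = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
        parents[child].add(parent)
    return children, parents


children, parents = build_graph(EDGES)

# Nodes with two or more parents are where the profile trees blend together.
shared = sorted(node for node, ps in parents.items() if len(ps) >= 2)
print(shared)
```
</pre>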
<p>SMC Browser visualizes this graph structure in an interactive fashion. You can have a look at the <a class="reference external" href="examples.html">examples</a> for inspiration.</p>
<p>It is implemented on top of the wonderful JavaScript library <a class="reference external" href="https://github.com/mbostock/d3">d3</a>; the code is checked in to <a class="reference external" href="https://svn.clarin.eu/SMC/trunk/SMC">clarin-svn</a> (and needs refactoring). There is also some preliminary <a class="reference external" href="devdocs.html">technical documentation</a>.</p>
<div class="section" id="data">
<h1>Data</h1>
<p>The graph is constructed from all profiles defined in the <a class="reference external" href="http://catalog.clarin.eu/ds/ComponentRegistry/#">Component Registry</a>.
To resolve the names and descriptions of the data categories referenced in the CMD elements,
the definitions of all (public) data categories from <a class="reference external" href="http://dublincore.org">DublinCore</a> and <a class="reference external" href="http://www.isocat.org">ISOcat</a> (from the <a class="reference external" href="http://www.isocat.org/rest/profile/5.rdf">Metadata Profile</a> [RDF]; retrieving it takes some time!) are fetched. However, only data categories used in CMD become part of the graph. Here is a <a class="reference external" href="smc_stats.html">quantitative summary</a> of the dataset.</p>
<p>When inspecting the numbers, it is important to be aware of the occurrence expansion resulting from the reusability of the components.
For example, if a component C has 2 subcomponents and is reused within one profile by two other components A and B, the resulting profile
will consist of (at least) 8 components (<tt class="docutils literal">[A, B, A/C, B/C, A/C/C1, A/C/C2, B/C/C1, B/C/C2]</tt>), although only 5 distinct components are used.
The same goes for elements in reused components. In most cases the label indicates whether a number reflects distinct items or all (expanded) occurrences.</p>
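<p>The difference between expanded occurrences and distinct components can be reproduced with a short Python sketch. The profile name <tt class="docutils literal">P</tt> and the inclusion map are assumptions that mirror the example above; they are not real registry data.</p>
<pre class="literal-block">
```python
# Hypothetical profile P: A and B both include C; C includes C1 and C2.
INCLUDES = {
    "P": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": ["C1", "C2"],
}


def expand(node, includes, prefix=""):
    """List every occurrence of a component below `node` as a path."""
    paths = []
    for child in includes.get(node, []):
        path = prefix + "/" + child if prefix else child
        paths.append(path)
        paths.extend(expand(child, includes, path))
    return paths


occurrences = expand("P", INCLUDES)
# The last path segment is the component itself, so a set removes repeats.
distinct = {path.split("/")[-1] for path in occurrences}
print(len(occurrences), len(distinct))
```
</pre>
<p>With this input the sketch counts 8 occurrences but only 5 distinct components, matching the numbers in the example.</p>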
<p>(Some of the) numbers in the statistics lead to a list of corresponding terms.
E.g. in the summary for a profile, clicking on the number of components lists all the components of the given profile alphabetically.
Currently there are such lists for:</p>
<blockquote>
<ul class="simple">
<li><tt class="docutils literal">profile <span class="pre">-&gt;</span> components</tt></li>
<li><tt class="docutils literal">profile <span class="pre">-&gt;</span> elements</tt></li>
<li><tt class="docutils literal">profile <span class="pre">-&gt;</span> data categories</tt></li>
<li><tt class="docutils literal">data category <span class="pre">-&gt;</span> profiles</tt></li>
</ul>
</blockquote>
</div>
<div class="section" id="user-interface">
<h1>User Interface</h1>
<p>The user interface is divided into 4 main parts:</p>
<dl class="docutils">
<dt>Index</dt>
<dd>Lists all available Profiles, Components, Elements and used Data Categories.
The lists can be filtered (enter a search pattern in the input box at the top of the index pane).
Clicking on an individual item adds it to the <cite>selected nodes</cite> and renders it in the graph pane.</dd>
<dt>Main (Graph)</dt>
<dd>Pane for rendering the graph.</dd>
<dt>Navigation</dt>
<dd>This is the control panel governing the rendering of the graph. See below for the available <a class="reference internal" href="#options">Options</a>.</dd>
<dt>Detail</dt>
<dd>By default, this pane displays an overall summary of the data,
but its main purpose is to list detailed information about the selected nodes.</dd>
</dl>
<div class="section" id="interaction">
<h2>Interaction</h2>
<p>The following data sets are distinguished with respect to user interaction:</p>
<dl class="docutils">
<dt>all data</dt>
<dd>the full graph with all profiles, components, elements and data categories and the links between them.
Currently this amounts to roughly 4,600 nodes and 7,500 links.</dd>
<dt>selected nodes</dt>
<dd>nodes explicitly selected by the user (see below how to <a class="reference internal" href="#select-nodes">select nodes</a>).</dd>
<dt>data to show</dt>
<dd><p class="first">the subset of data that shall be displayed.</p>
<p class="last">Starting from the selected nodes, connected nodes (and connecting edges)
are determined based on the options (<tt class="docutils literal"><span class="pre">depth-before</span></tt>, <tt class="docutils literal"><span class="pre">depth-after</span></tt>).</p>
</dd>
</dl>
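<p>How the <cite>data to show</cite> could be derived from the selected nodes is sketched below in Python. This is an illustration under assumptions, not the code used by SMC Browser: it assumes that <tt class="docutils literal"><span class="pre">depth-before</span></tt> follows links towards ancestors and <tt class="docutils literal"><span class="pre">depth-after</span></tt> follows links towards descendants, and the function names and the sample chain are made up.</p>
<pre class="literal-block">
```python
def reachable(start, neighbours, depth):
    """Collect nodes within `depth` steps of the `start` set, following `neighbours`."""
    seen = set(start)
    frontier = set(start)
    for _ in range(depth):
        # One breadth-first step: all neighbours not visited so far.
        frontier = {n for node in frontier for n in neighbours.get(node, ())} - seen
        seen |= frontier
    return seen


def data_to_show(selected, children, parents, depth_before, depth_after):
    """Selected nodes plus ancestors and descendants up to the given depths."""
    ancestors = reachable(selected, parents, depth_before)
    descendants = reachable(selected, children, depth_after)
    return ancestors | descendants


# Hypothetical chain: profile includes component includes element refersTo datcat.
children = {"profile": ["component"], "component": ["element"], "element": ["datcat"]}
parents = {"component": ["profile"], "element": ["component"], "datcat": ["element"]}

shown = data_to_show({"element"}, children, parents, depth_before=1, depth_after=1)
print(sorted(shown))
```
</pre>
<p>With both depths set to 1, selecting <tt class="docutils literal">element</tt> shows its parent component and its data category, but not the profile two levels up.</p>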
<p>The nodes are colour-coded by type:</p>
<object data="graph_legend.svg" style="height: 100px;" type="image/svg+xml">
the legend to the graph</object>
<p id="select-nodes">There are multiple ways to select/unselect nodes:</p>
<dl class="docutils">
<dt>select from index</dt>
<dd><p class="first">clicking an individual item in the index list <strong>adds</strong> it to the selected nodes</p>
<p class="last">clicking on an already selected item unselects it</p>
</dd>
<dt>select in graph</dt>
<dd><p class="first">clicking on a visible node in the graph <strong>adds</strong> it to the selected nodes</p>
<p class="last">clicking on an already selected node unselects it</p>
</dd>
<dt>select area in graph</dt>
<dd>by dragging a rectangle in the graph pane (hold the mouse button down and pull), all nodes within that rectangle get selected;
all other nodes will be unselected</dd>
<dt>unselect in detail pane</dt>
<dd>clicking on an item in the detail pane unselects it</dd>
<dt>select in statistics</dt>
<dd>as mentioned in <a class="reference internal" href="#data">Data</a>, (some) numbers in the statistics reveal a list of corresponding terms.
Clicking on one of these terms in the statistics page leads to the browser, with the given term as the selected node (and default settings)</dd>
<dt>select in statistics in the detail pane</dt>
<dd>the numbers from the statistics page are also shown in the detail pane for selected nodes.
Here, clicking on a term from these lists adds it to the graph as a selected node.</dd>
<dt>mouseover</dt>
<dd>hovering over a node highlights all nodes connected to it (and the connecting links) within the visible sub-graph
and fades all other nodes and links</dd>
<dt>drag a node</dt>
<dd>by clicking and holding a node, you can move it around; however, the layout is usually stronger
and puts the node back in its original position. This is not the case with the freeze layout, which fixes all nodes in place and lets you move them around freely</dd>
</dl>
</div>
<div class="section" id="options">
<h2>Options</h2>
<p>The navigation pane provides the following options to control the rendering of the graph:</p>
<dl class="docutils">
<dt>depth-before</dt>
<dd>how many levels of connected ancestor nodes shall be displayed</dd>
<dt>depth-after</dt>
<dd>how many levels of connected descendant nodes shall be displayed</dd>
<dt>link-distance</dt>
<dd>approximate distance between individual nodes
(not exact, because it is just one of multiple factors in the layout of the graph)</dd>
<dt>charge</dt>
<dd>the higher the charge, the more the nodes tend to drift apart</dd>
<dt>friction</dt>
<dd>factor for &quot;cooling down&quot; the layout; lower numbers (50-70) stabilize the graph more quickly,
but possibly too early; with higher numbers (95-100) the layout has more time/freedom to arrange itself,
but may get jittery</dd>
<dt>node-size</dt>
<dd><p class="first">N = all nodes have the given diameter N;</p>
<p class="last">usage = a node is scaled based on how often it appears in the complete dataset,
i.e. often reused elements (like description or language) will be bigger</p>
</dd>
<dt>labels</dt>
<dd>show/hide all labels.
Hiding the labels accelerates the rendering significantly, which may matter when many nodes are displayed.
Irrespective of this option, on mouseover labels are displayed for exactly the highlighted nodes</dd>
<dt>curve</dt>
<dd>straight or arc (better visibility)</dd>
<dt>layout</dt>
<dd><p class="first">Several layout algorithms are provided. None of them is optimal, but most of the time they deliver quite good results.
Depending on the data displayed, a different algorithm may be more appropriate:</p>
<dl class="last docutils">
<dt>force</dt>
<dd>undirected layout, trying to spread the nodes in the pane optimally, equally in all directions.
This is the underlying <a class="reference external" href="https://github.com/mbostock/d3/wiki/Force-Layout">layouting algorithm</a>. All the other layouts build on top of it by adding further constraints.</dd>
<dt>vertical-tree</dt>
<dd>top-down layout that respects the direction of the edges; children are always below their parents</dd>
<dt>horizontal-tree</dt>
<dd>left-right layout that respects the direction of the edges; children are always to the right of their parents
(at least they should be; currently, in certain configurations, the layout does not get the orientation of some links right)</dd>
<dt>weak-tree</dt>
<dd>a layout that &quot;tends&quot; towards a left-to-right arrangement, but not strictly so (experimental)</dd>
<dt>dot</dt>
<dd>strict left-to-right layout reusing the x-positioning as determined by <a class="reference external" href="http://www.graphviz.org/">dot</a>.
Arranges the nodes in strict ranks (typical for the dot layout).
This is done in a separate preprocessing step for the whole graph, so the positioning may be suboptimal
for a given subgraph. The y-coordinate is approximated on the fly by the base algorithm.</dd>
<dt>freeze</dt>
<dd>this is actually a &quot;no-layout&quot;: the nodes just stay fixed in their last position.
However, individual nodes can still be dragged around, so this can be used to adjust a few nodes for better legibility (or aesthetics).
Only when you start moving individual nodes around will you learn to appreciate the great (and tedious) work of the layout algorithms,
so generally you will want to play around with the other settings to achieve a satisfying result.</dd>
</dl>
</dd>
</dl>
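<p>Why lower <tt class="docutils literal">friction</tt> values settle the layout sooner can be illustrated with a small Python sketch. In the d3 force layout, the velocity of each node is scaled by the friction at every tick. The mapping of the option value (50-100) to a factor between 0 and 1, the threshold and the function name are assumptions made for this illustration.</p>
<pre class="literal-block">
```python
def ticks_until_still(friction_percent, velocity=1.0, threshold=0.01, max_ticks=10000):
    """Count ticks until a velocity scaled by the friction drops below the threshold."""
    friction = friction_percent / 100.0  # assumed mapping of the option to [0, 1]
    ticks = 0
    while velocity >= threshold and ticks != max_ticks:
        velocity *= friction  # one tick: the velocity decays by the friction factor
        ticks += 1
    return ticks


for setting in (50, 70, 95):
    print(setting, ticks_until_still(setting))
```
</pre>
<p>A node needs many more ticks to come to rest at 95 than at 50, which is the extra time/freedom (and the potential jitter) mentioned for higher values.</p>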
</div>
<div class="section" id="linking-export">
<h2>Linking, Export</h2>
<p>The navigation pane exposes a <strong>link</strong> that captures the exact current state of the interface
(just the options and the selection, not the positioning of the elements),
so that it can be bookmarked, emailed etc.</p>
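<p>The idea of capturing the interface state in a link can be sketched in Python as follows. The base URL and the parameter names are hypothetical; the link generated by SMC Browser may well use different ones.</p>
<pre class="literal-block">
```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical option names and base URL; the real link may use different ones.
state = {
    "depth-before": 2,
    "depth-after": 1,
    "layout": "horizontal-tree",
    "selected": "profileA,datcatTitle",
}
link = "http://example.org/smc-browser/index.html?" + urlencode(state)
print(link)

# Reading the query string back yields the options and the selection again
# (all values come back as strings).
restored = {key: values[0] for key, values in parse_qs(urlsplit(link).query).items()}
```
</pre>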
<p>Furthermore, there is the <strong>download</strong> link, which exports the current graph as SVG.
This is accomplished without a round trip to the server, with a <a class="reference external" href="https://groups.google.com/forum/?fromgroups=#!topic/d3-js/aQSWnEDFxIc">javascript trick</a>
serializing the SVG as base64 data into the URL (so you don't want to save (or see) the exported URL itself).
You can either right-click the link and choose [Save link as...], or click on the link, which opens the SVG in a new tab
where you can view, resize, print and save it.
Employing this simple method also means that there is no possibility to export the graph as PNG, PDF or any other format,
because this would require <a class="reference external" href="http://d3export.cancan.cshl.edu/">server-side processing</a>. (However, this is a planned future enhancement.)</p>
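<p>The encoding behind the download link can be sketched in Python. SMC Browser does this in JavaScript in the browser; the sketch below only illustrates the same idea, a base64 <tt class="docutils literal">data:</tt> URI, and the tiny sample drawing and the function name are made up.</p>
<pre class="literal-block">
```python
import base64
import xml.etree.ElementTree as ET


def svg_data_uri(svg_text):
    """Serialize SVG markup into a base64 data URI."""
    payload = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;base64," + payload


# Build a tiny sample SVG (one circle) instead of exporting a real graph.
root = ET.Element("svg", {"xmlns": "http://www.w3.org/2000/svg", "width": "10", "height": "10"})
ET.SubElement(root, "circle", {"cx": "5", "cy": "5", "r": "4"})
svg_text = ET.tostring(root, encoding="unicode")

uri = svg_data_uri(svg_text)
print(uri[:40])
```
</pre>
<p>Because the whole drawing is embedded in the URI, its length grows with the size of the graph, which is why the URL itself is not meant to be read or stored.</p>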
</div>
</div>
<div class="section" id="issues">
<h1>Issues</h1>
<dl class="docutils">
<dt>Performance</dt>
<dd>Chrome is by far the fastest, followed by IE(9).
A serious performance degradation was observed for graphs above 200 nodes on Firefox.
Showing labels also significantly affects performance.</dd>
<dt>Bounds</dt>
<dd>When the graph gets too big, it does not fit in the viewing pane.
This will be tackled soon (either scrollbars or applying boundaries). Meanwhile,
you can reduce the link-distance and charge parameters or change the layout.</dd>
</dl>
</div>
<div class="section" id="plans-and-todos">
<h1>Plans and ToDos</h1>
<p>Substantial issues:</p>
<ul class="simple">
<li>Add information from <strong>RelationRegistry</strong> (relations between DatCats)</li>
<li>Blend in instance data from <strong>MDRepository</strong> (allow search on MDRepository)</li>
<li>graph operations (intersect, difference of subgraphs)</li>
</ul>
<p>Smaller enhancements of the user interface:</p>
<ul class="simple">
<li>select nodes by querying the names (e.g. show me all nodes with &quot;Access&quot; in their name)</li>
<li>option to show only selected types of nodes (e.g. only profiles and datcats)</li>
<li>detail-info on hover</li>
<li>full HTML-rendering of a node (Profile, Component)</li>
<li>backlinking from detail (e.g. view all the profiles a data category is used in by clicking on the number ('used in profiles'))</li>
<li>store/export SVG/PDF/PNG-renderings of the graphs</li>
<li>add edge-weight: scale based on usage, i.e. how often the relation appears in the complete dataset,
so that often reused combinations of components/elements will be nearer</li>
<li>allow to blend in further (private) CMD-profiles dynamically</li>
</ul>
</div>
</div>
</body>
</html>