.. Source: SMC/trunk/SMC/docs/userdocs.rst @ 2480.
   Last change: revision 2480, checked in by vronk ("adding user documentation").

***********
SMC Browser
***********

Explore the `Component Metadata Framework`_.

.. _Component Metadata Framework: http://clarin.eu/cmdi

In *CMD*, metadata schemas are defined by profiles, which are constructed out of reusable components
(collections of metadata fields). Components can contain other components, and they can be reused in multiple profiles.
Furthermore, every CMD element (metadata field) refers via a PID to a data category, which indicates unambiguously
how the content of the field in a metadata description should be interpreted (Broeder et al., 2010).

Thus, every profile can be expressed as a tree, with the profile component as the root node, the used components as intermediate nodes,
and elements or data categories as leaf nodes. The parent-child relationship is defined by inclusion (``componentA -includes-> componentB``)
or by reference (``elementA -refersTo-> datcat1``). The reuse of components in multiple profiles, and especially the referencing of the same
data categories by multiple CMD elements, blends the individual profile trees into a graph (directed and acyclic, but not necessarily connected).
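
The blending of profile trees into one graph can be illustrated with a minimal sketch. The profile, component, element and data category names below are invented for illustration, and the code is plain Python rather than the JavaScript used by SMC Browser; it only shows how shared nodes end up with several parents once the edge lists are merged.

```python
# Hypothetical toy data: two profiles given as (parent, relation, child) triples.
PROFILE_A = [
    ("profileA", "includes", "componentX"),
    ("componentX", "includes", "elementTitle"),
    ("elementTitle", "refersTo", "datcat:title"),
]
PROFILE_B = [
    ("profileB", "includes", "componentX"),       # reuses componentX
    ("profileB", "includes", "elementName"),
    ("elementName", "refersTo", "datcat:title"),  # refers to the same data category
]


def blend(*profiles):
    """Merge the edge lists of several profile trees into one directed graph."""
    graph = {}
    for triples in profiles:
        for parent, _relation, child in triples:
            graph.setdefault(parent, set()).add(child)
            graph.setdefault(child, set())  # make sure leaf nodes are present too
    return graph


def parents_of(graph, node):
    """Return the sorted list of nodes that point to ``node``."""
    return sorted(parent for parent, kids in graph.items() if node in kids)


graph = blend(PROFILE_A, PROFILE_B)
print(parents_of(graph, "componentX"))
print(parents_of(graph, "datcat:title"))
```

In each separate tree, ``componentX`` and ``datcat:title`` have a single parent; in the merged graph each of them has two, which is what turns the set of trees into a graph.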

SMC Browser visualizes this graph structure in an interactive fashion.

It is implemented on top of the wonderful JavaScript library d3_; more technical documentation will follow soon.

.. _d3: https://github.com/mbostock/d3


Data
====
The graph is constructed from all profiles defined in the `Component Registry`_.
To resolve the names and descriptions of the data categories referenced in the CMD elements,
the definitions of all (public) data categories are fetched from `DublinCore`_ and `ISOcat`_
(the latter from the `Metadata Profile`_ [RDF]; retrieving it takes some time!).
However, only data categories used in CMD become part of the graph. Here is a `quantitative summary`_ of the dataset.

When inspecting the numbers, it is important to be aware of the occurrence expansion that results from the reuse of components.
For example, if a component C has 2 subcomponents (C1 and C2) and is reused within one profile by two other components A and B, the resulting profile
consists of (at least) 8 component occurrences (``[A, B, A/C, B/C, A/C/C1, A/C/C2, B/C/C1, B/C/C2]``), although only 5 distinct components are used.
The same goes for elements in reused components. In most cases the label indicates whether a number reflects distinct items or all (expanded) occurrences.
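
The difference between expanded occurrences and distinct components can be checked with a short sketch. The ``INCLUDES`` mapping encodes the A/B/C example from the paragraph above; the function names are illustrative and are not taken from the SMC code base.

```python
# Component hierarchy of the example: A and B both include C,
# and C includes the two subcomponents C1 and C2.
INCLUDES = {
    "A": ["C"],
    "B": ["C"],
    "C": ["C1", "C2"],
    "C1": [],
    "C2": [],
}


def expand(component, includes, prefix=""):
    """Return every occurrence path rooted at ``component`` (depth-first)."""
    path = f"{prefix}/{component}" if prefix else component
    paths = [path]
    for child in includes[component]:
        paths.extend(expand(child, includes, path))
    return paths


def distinct(roots, includes):
    """Return the set of distinct components reachable from ``roots``."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(includes[name])
    return seen


roots = ["A", "B"]  # components used directly by the profile
occurrences = [path for root in roots for path in expand(root, INCLUDES)]
print(len(occurrences), len(distinct(roots, INCLUDES)))
```

Counting paths yields the expanded number (8), while counting reachable names yields the distinct number (5).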

.. _Component Registry: http://catalog.clarin.eu/ds/ComponentRegistry/#
.. _ISOcat: http://www.isocat.org
.. _Metadata Profile: http://www.isocat.org/rest/profile/5.rdf
.. _DublinCore: http://dublincore.org
.. _quantitative summary: smc_stats.html

User Interface
==============

The user interface is divided into 4 main parts:

Index
   Lists all available Profiles, Components, Elements and used Data Categories.
   The lists can be filtered (enter a search pattern in the input box at the top of the index pane).
   Clicking an individual item adds it to the `selected nodes` and renders it in the graph pane.

Main (Graph)
   Pane for rendering the graph.

Navigation
   The control panel governing the rendering of the graph. See below for the available `Options`_.

Detail
   By default, this pane displays an overall summary of the data,
   but its main purpose is to list detailed information about the selected nodes.


Interaction
-----------

The following data sets are distinguished with respect to user interaction:

all data
   The full graph with all profiles, components, elements and data categories, and the links between them.

   Currently this amounts to roughly 2,000 nodes and 3,000 links.

selected nodes
   Nodes explicitly selected by the user (see below for how to `select nodes`_).

data to show
   The subset of the data that is displayed.

   Starting from the selected nodes, connected nodes (and connecting edges)
   are determined based on the options ``depth-before`` and ``depth-after``.
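
One plausible way to derive the *data to show* from the selected nodes is a depth-limited breadth-first search in both edge directions. The sketch below is an assumption about how such a computation could look, not the actual SMC Browser implementation; the graph and all function names are hypothetical.

```python
from collections import deque

# Hypothetical directed graph: parent -> list of children.
CHILDREN = {
    "profile": ["compA"],
    "compA": ["compB"],
    "compB": ["element"],
    "element": ["datcat"],
    "datcat": [],
}


def invert(children):
    """Build the child -> parents mapping from a parent -> children mapping."""
    parents = {node: [] for node in children}
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].append(parent)
    return parents


def within(start, neighbours, depth):
    """Breadth-first search: nodes reachable from ``start`` in at most ``depth`` steps."""
    seen = set(start)
    queue = deque((node, 0) for node in start)
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # depth limit reached, do not expand further
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


def data_to_show(selected, children, depth_before, depth_after):
    """Selected nodes plus their ancestors and descendants up to the given depths."""
    ancestors = within(selected, invert(children), depth_before)
    descendants = within(selected, children, depth_after)
    return ancestors | descendants


print(sorted(data_to_show(["compB"], CHILDREN, depth_before=1, depth_after=2)))
```

With ``depth_before=1`` and ``depth_after=2``, selecting ``compB`` shows one level of ancestors and two levels of descendants; the links to draw would then be those whose endpoints are both in the returned set.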

The nodes are colour-coded by type:

.. image:: graph_legend.svg
   :alt: the legend to the graph
   :height: 80px

.. _select nodes:

There are multiple ways to select and unselect nodes, and to interact with them:

select from index
   Clicking an individual item in the index list **adds** it to the selected nodes.

   Clicking an already selected item unselects it.

select in graph
   Clicking a visible node in the graph **adds** it to the selected nodes.

   Clicking an already selected node unselects it.

select area in graph
   Dragging a rectangle in the graph pane (hold the mouse button down and pull) selects all nodes within that rectangle.
   All other nodes are unselected.

unselect in detail pane
   Clicking an item in the detail pane unselects it.

mouseover
   Hovering over a node highlights all nodes connected to it (and the connecting links) within the visible subgraph,
   and fades all other nodes and links.

drag a node
   Click and hold a node to move it around. Usually, however, the layout is stronger
   and puts the node back in its original position.
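
The selection rules listed above (click toggles, area selection replaces, detail pane removes) can be summarized as a small model. The ``Selection`` class is a hypothetical illustration in Python, not the browser's actual code.

```python
class Selection:
    """Hypothetical model of the selected-nodes set."""

    def __init__(self):
        self.nodes = set()

    def click(self, node):
        """Index or graph click: add the node, or remove it if already selected."""
        if node in self.nodes:
            self.nodes.remove(node)
        else:
            self.nodes.add(node)

    def select_area(self, nodes_in_rectangle):
        """Rectangle drag: the enclosed nodes replace the previous selection."""
        self.nodes = set(nodes_in_rectangle)

    def unselect(self, node):
        """Detail-pane click: only ever removes the node."""
        self.nodes.discard(node)


selection = Selection()
selection.click("Title")
selection.click("Description")
selection.click("Title")                     # second click unselects
selection.select_area(["Access", "Contact"])  # drops "Description"
selection.unselect("Access")
print(selection.nodes)
```

The only operation that can drop nodes the user did not touch is the area selection, which is worth keeping in mind when building up a larger selection by clicking.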

Options
-------
The navigation pane provides the following options to control the rendering of the graph:

depth-before
   How many levels of connected ancestor nodes are displayed.

depth-after
   How many levels of connected descendant nodes are displayed.

link-distance
   Approximate distance between individual nodes
   (not exact, because it is just one of multiple factors in the layout of the graph).

charge
   The higher the charge, the more the nodes tend to drift apart.

node-size
   N = all nodes have the given diameter N;

   usage = each node is scaled by how often it appears in the complete dataset,
   i.e. frequently reused elements (like description or language) are bigger.

labels
   Show or hide all labels.
   Hiding the labels speeds up rendering significantly, which matters when many nodes are displayed.
   Irrespective of this option, labels are displayed on mouseover for all (and only) the highlighted nodes.

curve
   Straight or arc (better visibility).

layout
   Several layout algorithms are provided;
   depending on the data displayed, a different algorithm may be more appropriate:

   force
      Undirected layout that tries to spread the nodes optimally across the pane, equally in all directions.
      This is the underlying `layouting algorithm`_. All the other layouts build on top of it by adding further constraints.
   vertical-tree
      Top-down layout respecting the direction of the edges; children are always below their parents.
   horizontal-tree
      Left-to-right layout respecting the direction of the edges; children are always to the right of their parents.
   weak-tree
      A layout that "tends" towards a left-to-right arrangement, but not strictly so (experimental).
   dot
      Strict left-to-right layout reusing the x-positioning as determined by dot_.
      Arranges the nodes in strict ranks (typical for the dot layout).
      This is done in a separate preprocessing step for the whole graph, so the positioning may be suboptimal
      for a given subgraph. The y-coordinate is approximated on the fly by the base algorithm.
   freeze
      This is actually a "no-layout": the nodes just stay fixed in their last position.
      However, individual nodes can still be dragged around, so this can be used to adjust a few nodes for better legibility (or aesthetics).
      Only when you start moving individual nodes around will you learn to appreciate the great (and tedious) work of the layout algorithms,
      so generally you will want to play around with the other settings first to achieve a satisfying result.

.. _layouting algorithm: https://github.com/mbostock/d3/wiki/Force-Layout
.. _dot: http://www.graphviz.org/



Linking, Export
---------------

The navigation pane exposes a **link** that captures the exact current state of the interface
(just the options and the selection, not the positioning of the elements),
so that it can be bookmarked, emailed, etc.
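
Such a state link is an ordinary query string, so it can also be generated or inspected outside the browser. The sketch below uses Python's standard ``urllib.parse`` module; the parameter names come from the example link in the Examples section, while the helper functions themselves are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlencode

# Query string of the "DCMI terms" example link from this document.
QUERY = ("?link-distance=24&charge=107&layout=force"
         "&selected=clarin_eucr1p_1288172614023,clarin_eucr1p_1288172614026&")


def parse_state(query):
    """Split a state query string into the options and the list of selected node ids."""
    parsed = parse_qs(query.lstrip("?"))
    params = {key: values[0] for key, values in parsed.items()}
    selected = params.pop("selected", "")
    return params, [node for node in selected.split(",") if node]


def build_state(options, selected):
    """Serialize options and selection back into a query string."""
    # safe="," keeps the comma-separated id list readable instead of escaping it.
    return "?" + urlencode({**options, "selected": ",".join(selected)}, safe=",")


options, selected = parse_state(QUERY)
options["layout"] = "dot"  # same selection, different layout
print(build_state(options, selected))
```

Because node positions are not part of the link, two people opening the same link see the same nodes and options but not necessarily the same arrangement.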

Furthermore, there is a **download** link that exports the current graph as SVG.
This is accomplished without a round trip to the server, with a `javascript trick`_
that serializes the SVG as base64 data into the URL (so you don't want to save, or even see, the exported URL).
You can either right-click the link and choose [Save link as...], or click the link, which opens the SVG in a new tab
where you can view, resize, print and save it.
This simple method also means that the graph cannot be exported as PNG, PDF or any other format,
because that would require `server-side processing`_. (However, this is a planned future enhancement.)
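
The principle behind the export, embedding the whole SVG document in a base64 ``data:`` URL, can be sketched as follows. SMC Browser does this in JavaScript inside the browser; the Python version below only illustrates the encoding, and the tiny SVG string is a made-up placeholder.

```python
import base64


def svg_data_url(svg_markup):
    """Encode SVG markup as a base64 ``data:`` URL."""
    payload = base64.b64encode(svg_markup.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;base64," + payload


def svg_from_data_url(url):
    """Inverse of ``svg_data_url``: recover the SVG markup from the URL."""
    _header, payload = url.split(",", 1)
    return base64.b64decode(payload).decode("utf-8")


# Placeholder graph: a single circle.
SVG = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>'

url = svg_data_url(SVG)
print(len(SVG), len(url))
```

Base64 inflates the payload by about a third, which is why the resulting URL for a real graph is long and not meant to be read or stored as a link.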

.. _javascript trick: https://groups.google.com/forum/?fromgroups=#!topic/d3-js/aQSWnEDFxIc
.. _server-side processing: http://d3export.cancan.cshl.edu/


Examples
========

`DCMI terms`_

.. _DCMI terms: ?link-distance=24&charge=107&layout=force&selected=clarin_eucr1p_1288172614023,clarin_eucr1p_1288172614026&

Issues
======

Performance
   Chrome is by far the fastest browser, followed by IE (9).
   A serious performance degradation was observed on Firefox for graphs above 200 nodes.
   Showing labels also affects performance significantly.

Bounds
   When the graph gets too big, it does not fit in the viewing pane.
   This will be tackled soon (either with scrollbars or by applying boundaries). Meanwhile,
   you can reduce the link-distance and charge parameters or change the layout.

Plans and ToDos
===============

Substantial issues:

* Add information from **RelationRegistry** (relations between DatCats)
* Blend in instance data from **MDRepository** (allow search on MDRepository)

Smaller enhancements of the user interface:

* select nodes by querying the names (e.g. show me all nodes with "Access" in their name)
* option to show only selected types of nodes (e.g. only profiles and datcats)
* detail info on hover
* full HTML rendering of a node (Profile, Component)
* backlinking from detail (e.g. view all the profiles a data category is used in by clicking on the number 'used in profiles')
* store/export SVG/PDF/PNG renderings of the graphs
* add layout ``freeze``: a static layout where individual nodes can be moved around freely
* add edge weight: scale based on usage, i.e. how often the relation appears in the complete dataset,
  so that frequently reused combinations of components/elements are placed nearer to each other