***********
SMC Browser
***********

Explore the `Component Metadata Framework`_

.. _Component Metadata Framework: http://clarin.eu/cmdi

In *CMD*, metadata schemas are defined by profiles, which are constructed out of reusable components, that is, collections
of metadata fields. Components can contain other components, and they can be reused in multiple profiles.
Furthermore, every CMD element (metadata field) refers via a PID to a data category, which indicates unambiguously how the content of the field in a metadata description should
be interpreted (Broeder et al., 2010).

Thus, every profile can be expressed as a tree, with the profile component as the root node, the components used as intermediate nodes,
and elements or data categories as leaf nodes. The parent-child relationship is defined by inclusion (``componentA -includes-> componentB``) or referencing (``elementA -refersTo-> datcat1``). The reuse of components in multiple profiles, and especially the referencing of the same data categories by multiple CMD elements, blends the individual profile trees into a graph (directed and acyclic, but not necessarily connected).

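
The blending of profile trees into one graph can be illustrated with a small sketch. The profile, component, element and data category names are hypothetical, and the edge-list representation is an assumption made for illustration; it is not necessarily how SMC Browser stores the data.

.. code-block:: python

   from collections import defaultdict

   def merge_profiles(profile_edges):
       """Merge per-profile (parent, child) edge lists into one adjacency map."""
       graph = defaultdict(set)
       for edges in profile_edges.values():
           for parent, child in edges:
               graph[parent].add(child)
               graph[child]  # touching the key registers leaf nodes as well
       return graph

   def parents_of(graph, node):
       """Return all nodes that have an edge pointing at ``node``."""
       return sorted(p for p, children in graph.items() if node in children)

   # Hypothetical profiles: both reuse compX and both reference datcat1.
   profiles = {
       "profileA": [("profileA", "compX"), ("compX", "elemTitle"),
                    ("elemTitle", "datcat1")],
       "profileB": [("profileB", "compX"), ("profileB", "elemName"),
                    ("elemName", "datcat1")],
   }
   graph = merge_profiles(profiles)
   print(parents_of(graph, "compX"))    # ['profileA', 'profileB']
   print(parents_of(graph, "datcat1"))  # ['elemName', 'elemTitle']

A node with more than one parent (here ``compX`` and ``datcat1``) is exactly what turns the separate trees into a graph.
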
SMC Browser visualizes this graph structure in an interactive fashion.

It is implemented on top of the JavaScript library d3_; more technical documentation will follow.

.. _d3: https://github.com/mbostock/d3


Data
====
The graph is constructed from all profiles defined in the `Component Registry`_.
To resolve the names and descriptions of the data categories referenced in CMD elements,
the definitions of all (public) data categories are fetched from `DublinCore`_ and `ISOcat`_ (the latter from the `Metadata Profile`_ [RDF]; retrieval takes some time). However, only data categories used in CMD become part of the graph. Here is a `quantitative summary`_ of the dataset.

When inspecting the numbers, it is important to be aware of the occurrence expansion resulting from the reusability of components.
For example, if a component C has 2 subcomponents and is reused within one profile by two other components A and B, the resulting profile
will consist of (at least) 8 components (``[A, B, A/C, B/C, A/C/C1, A/C/C2, B/C/C1, B/C/C2]``), although only 5 distinct components are used.
The same goes for elements in reused components. In most cases the label indicates whether a number reflects distinct items or all (expanded) occurrences.
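
The difference between expanded occurrences and distinct components can be reproduced with a few lines. This is only a sketch of the counting described above; the ``children`` mapping and the ``expand`` helper are hypothetical names, not part of SMC Browser.

.. code-block:: python

   def expand(component, children, path=""):
       """Yield one path per occurrence of ``component`` and its descendants."""
       here = f"{path}/{component}" if path else component
       yield here
       for child in children.get(component, []):
           yield from expand(child, children, here)

   # The example from the text: A and B both include C, which has C1 and C2.
   children = {"A": ["C"], "B": ["C"], "C": ["C1", "C2"]}

   occurrences = [p for top in ("A", "B") for p in expand(top, children)]
   distinct = {p.rsplit("/", 1)[-1] for p in occurrences}

   print(len(occurrences))  # 8
   print(len(distinct))     # 5
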

.. _Component Registry: http://catalog.clarin.eu/ds/ComponentRegistry/#
.. _ISOcat: http://www.isocat.org
.. _Metadata Profile: http://www.isocat.org/rest/profile/5.rdf
.. _DublinCore: http://dublincore.org
.. _quantitative summary: smc_stats.html

User Interface
==============

The user interface is divided into 4 main parts:

Index
  Lists all available Profiles, Components, Elements and used Data Categories.
  The lists can be filtered (enter a search pattern in the input box at the top of the index pane).
  Clicking individual items adds them to the `selected nodes` and renders them in the graph pane.

Main (Graph)
  Pane for rendering the graph.

Navigation
  The control panel governing the rendering of the graph. See below for the available `Options`_.

Detail
  By default, this pane displays an overall summary of the data;
  its main purpose is to list detailed information about the selected nodes.


Interaction
-----------

The following data sets are distinguished with respect to user interaction:

all data
  The full graph with all profiles, components, elements and data categories, and the links between them.

  Currently this amounts to roughly 2,000 nodes and 3,000 links.

selected nodes
  Nodes explicitly selected by the user (see below how to `select nodes`_).

data to show
  The subset of data that is displayed.

  Starting from the selected nodes, connected nodes (and connecting edges)
  are determined based on the options (``depth-before``, ``depth-after``).

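
How the "data to show" could be derived from the selected nodes is sketched below. It assumes that ``depth-before`` follows edges only upwards (towards ancestors) and ``depth-after`` only downwards (towards descendants); the function name and the edge-list input are hypothetical, and the actual implementation may differ.

.. code-block:: python

   def neighbourhood(edges, selected, depth_before, depth_after):
       """Return the selected nodes plus ancestors/descendants within the given depths."""
       children, parents = {}, {}
       for parent, child in edges:
           children.setdefault(parent, set()).add(child)
           parents.setdefault(child, set()).add(parent)

       def walk(adjacency, depth):
           # Breadth-first expansion, one level per iteration.
           seen, frontier = set(selected), set(selected)
           for _ in range(depth):
               frontier = {n for f in frontier for n in adjacency.get(f, ())} - seen
               seen |= frontier
           return seen

       return walk(parents, depth_before) | walk(children, depth_after)

   # Hypothetical chain: profile -> compA -> compB -> elem -> datcat
   edges = [("profile", "compA"), ("compA", "compB"),
            ("compB", "elem"), ("elem", "datcat")]
   print(sorted(neighbourhood(edges, {"compB"}, depth_before=1, depth_after=2)))
   # ['compA', 'compB', 'datcat', 'elem']

With ``depth_before=1`` the root ``profile`` stays hidden, because it is two levels above the selected node.
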
The nodes are colour-coded by type:

.. image:: graph_legend.svg
   :alt: the legend to the graph
   :height: 80px

.. _select nodes:

There are multiple ways to select/unselect nodes:

select from index
  Clicking an individual item in the index list **adds** it to the selected nodes.

  Clicking an already selected item unselects it.

select in graph
  Clicking a visible node in the graph **adds** it to the selected nodes.

  Clicking an already selected node unselects it.

select area in graph
  Dragging a rectangle in the graph pane (hold the mouse button down and pull) selects all nodes within that rectangle;
  all other nodes are unselected.

unselect in detail pane
  Clicking an item in the detail pane unselects it.

mouseover
  Hovering over a node highlights all nodes connected to it (and the connecting links) within the visible sub-graph
  and fades all other nodes and links.

drag a node
  Click and hold a node to move it around. Usually, however, the layout is stronger
  and puts the node back in its original position.

Options
-------
The navigation pane provides the following options to control the rendering of the graph:

depth-before
  how many levels of connected ancestor nodes are displayed

depth-after
  how many levels of connected descendant nodes are displayed

link-distance
  approximate distance between individual nodes
  (not exact, because it is just one of multiple factors in the layout of the graph)

charge
  the higher the charge, the more the nodes tend to drift apart

node-size
  N = all nodes have the given diameter N;

  usage = each node is scaled by how often it appears in the complete dataset,
  i.e. frequently reused elements (like description or language) are bigger

labels
  show/hide all labels.
  Hiding the labels speeds up rendering significantly, which matters when many nodes are displayed.
  Irrespective of this option, on mouseover the labels of the highlighted nodes (and only those) are displayed.

curve
  straight or arc (better visibility)

layout
  Several layout algorithms are provided;
  depending on the data displayed, a different algorithm may be more appropriate:

  force
    undirected layout that tries to spread the nodes optimally in the pane, equally in all directions.
    This is the underlying `layouting algorithm`_. All the other layouts build on top of it by adding further constraints.
  vertical-tree
    top-down layout that respects the direction of the edges; children are always below their parents
  horizontal-tree
    left-to-right layout that respects the direction of the edges; children are always to the right of their parents
  weak-tree
    a layout that "tends" towards a left-to-right arrangement, but not strictly so (experimental)
  dot
    strict left-to-right layout reusing the x-positioning as determined by dot_.
    Arranges the nodes in strict ranks (typical for the dot layout).
    This is done in a separate preprocessing step for the whole graph, so the positioning may be suboptimal
    for a given subgraph. The y-coordinate is approximated on the fly by the base algorithm.
  freeze
    effectively a "no-layout": the nodes stay fixed in their last position.
    Individual nodes can still be dragged around, so this can be used to adjust a few nodes for better legibility (or aesthetics).
    However, once you start moving individual nodes around, you will learn to appreciate the great (and tedious) work of the layout algorithms,
    so generally you will want to play with the other settings first to achieve a satisfying result.

.. _layouting algorithm: https://github.com/mbostock/d3/wiki/Force-Layout
.. _dot: http://www.graphviz.org/



Linking, Export
---------------

The navigation pane exposes a **link** that captures the current state of the interface
(just the options and the selection, not the positioning of the elements),
so that it can be bookmarked, emailed etc.
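
To illustrate what such a link carries, the sketch below decodes the query string of the `DCMI terms`_ example listed under Examples. The parameter names are taken from that link; the ``parse_state`` helper itself is hypothetical, and the browser-side code may read the parameters differently.

.. code-block:: python

   from urllib.parse import parse_qs

   def parse_state(query):
       """Turn a state link's query string into a dict of options."""
       raw = parse_qs(query.lstrip("?"))
       state = {key: values[0] for key, values in raw.items()}
       if "selected" in state:
           # The selection is a comma-separated list of node ids.
           state["selected"] = state["selected"].split(",")
       return state

   link = ("?link-distance=24&charge=107&layout=force"
           "&selected=clarin_eucr1p_1288172614023,clarin_eucr1p_1288172614026&")
   state = parse_state(link)
   print(state["layout"])         # force
   print(len(state["selected"]))  # 2
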

Furthermore, there is the **download** link, which exports the current graph as SVG.
This is accomplished without a round trip to the server, using a `javascript trick`_
that serializes the SVG as base64 data into the URL (so you do not want to save, or even look at, the exported URL itself).
You can either right-click the link and choose [Save link as...], or click the link, which opens the SVG in a new tab
where you can view, resize, print and save it.
This simple method also means that the graph cannot be exported as PNG, PDF or any other format,
because that would require `server-side processing`_. (However, this is a planned enhancement.)
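
The idea behind the export can be sketched as follows. SMC Browser does this in JavaScript inside the browser; the Python version below only illustrates the encoding step, and the ``svg_data_url`` helper and the sample SVG are assumptions made for the example.

.. code-block:: python

   import base64

   def svg_data_url(svg_markup):
       """Wrap SVG markup in a base64 ``data:`` URL that a link can point to."""
       payload = base64.b64encode(svg_markup.encode("utf-8")).decode("ascii")
       return f"data:image/svg+xml;base64,{payload}"

   # A tiny hypothetical graph export.
   svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
          '<circle cx="5" cy="5" r="4"/></svg>')
   url = svg_data_url(svg)
   print(url[:26])  # data:image/svg+xml;base64,

Because the whole image is embedded in the URL, its length grows with the size of the graph, which is why the URL itself is not meant to be read or stored.
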

.. _javascript trick: https://groups.google.com/forum/?fromgroups=#!topic/d3-js/aQSWnEDFxIc
.. _server-side processing: http://d3export.cancan.cshl.edu/


Examples
========

`DCMI terms`_

.. _DCMI terms: ?link-distance=24&charge=107&layout=force&selected=clarin_eucr1p_1288172614023,clarin_eucr1p_1288172614026&

Issues
======

Performance
  Chrome is by far the fastest, followed by IE(9).
  A serious performance degradation was observed for graphs above 200 nodes on Firefox.
  Showing labels also significantly affects performance.

Bounds
  When the graph gets too big, it does not fit in the viewing pane.
  This will be tackled soon (either with scrollbars or by applying boundaries). Meanwhile,
  you can reduce the link-distance and charge parameters or change the layout.

Plans and ToDos
===============

Substantial issues:

* Add information from **RelationRegistry** (relations between DatCats)
* Blend in instance data from **MDRepository** (allow search on MDRepository)

Smaller enhancements of the user interface:

* select nodes by querying the names (e.g. show me all nodes with "Access" in their name)
* option to show only selected types of nodes (e.g. only profiles and datcats)
* detail-info on hover
* full HTML-rendering of a node (Profile, Component)
* backlinking from detail (e.g. view all the profiles a data category is used in by clicking on the number ('used in profiles'))
* store/export SVG/PDF/PNG-renderings of the graphs
* add layout ``freeze``: a static layout where individual nodes can be moved around freely
* add edge-weight: scale based on usage, i.e. how often the relation appears in the complete dataset,
  so that frequently reused combinations of components/elements end up closer together
