Feedback on LaBRI demos

These contents were recompiled from the document Feedback on demos #1 on 5.4.2017. From now on, feedback notes will be collected for each partner separately.

Both LaBRI's and EISI's demos made it clear to me that we also need to start thinking about a query-building system which bridges the gap between user-defined research interests and the need to get users to set parameters for algorithms (for now DOI). For the query building, I wonder how we can get an effective interplay between user input (as in known-item searches) and user selection from graph-based recommendations. More on this in the last segment. This is particularly relevant for tasks 1 and 2 in the project vision.

Suggested priorities for the next demo

  1. Pau and Bordeaux(?): Revised layout for nodes which is easier to read and less complex. Could parallel coordinates be part of the solution?
  2. Bordeaux: Allow more complex queries and allow for quick assessment of which nodes and which node types have been selected. Either list view or csv export (list view preferred).
  3. Marten: Are co-occurrences in our case best computed on sentence, paragraph or n-word-window level? These could be transformed into multilayer data as well (increasing likelihood of a meaningful relationship).
  4. Bordeaux and Pau: Create a bridge between data visualization and content: Can we add links to the documents to be displayed e.g. in a browser for further assessment?
  5. Marten: Find out about best practices in evaluation of such applications and algorithms.
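
To make the question in priority 3 more concrete, here is a minimal sketch comparing sentence-level and n-word-window co-occurrence counts. It is not our pipeline: the function names, the naive sentence splitter, the example text and the restriction to single-token entity names are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrences_by_sentence(text, entities):
    """Count entity pairs that appear in the same sentence."""
    counts = Counter()
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        present = sorted(e for e in entities if e.lower() in lowered)
        counts.update(combinations(present, 2))
    return counts


def cooccurrences_by_window(tokens, entities, window=5):
    """Count entity pairs whose tokens lie at most `window` positions apart."""
    wanted = {e.lower() for e in entities}
    positions = [(i, t.lower()) for i, t in enumerate(tokens) if t.lower() in wanted]
    counts = Counter()
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and j - i <= window:
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Made-up example text, not taken from the corpus.
text = ("Adenauer met Schuman in Paris. Monnet wrote a long report. "
        "Adenauer later replied to Monnet.")
entities = ["Adenauer", "Schuman", "Monnet"]

by_sentence = cooccurrences_by_sentence(text, entities)
by_window = cooccurrences_by_window(re.findall(r"\w+", text), entities, window=5)
```

In this toy text the window counts Schuman and Monnet as co-occurring although they never share a sentence, because the window crosses the sentence boundary. Each granularity could be stored as its own layer in the multilayer data.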

Notes on demo 1 LaBRI

  1. On DOI: DOI is best suited for the creation of ego networks from scratch and driven by user interest. For these first demos and with regard to the project vision, Task 1 Diversity and continuous coverage is best suited for the DOI approach: We strive to get a general overview of the presence of a node (person, institution etc.) in the corpus.
  2. DOI and Project vision Task 2: The particular strength of the DOI approach, free-flowing definitions of user interest, may also become relevant for task 2, “Search by tag”, at a later stage. User interest can be user-driven (“I already know which nodes I would like to include”), assisted by recommendations (“I select nodes from a list of recommendations”), or both.
  3. Demo and documentation: No problem running the Python scripts in Tulip; the instructions by Antoine are clear. The only bit I struggle with is the last two slides on the current setup of the algorithm.
  4. More precise queries needed: While testing, I realized that for an assessment of “meaningfulness” of the DOI, I need to be able to run more specific queries. I need to apply additional conditions to further narrow down time periods and have “co-occurs with” AND and OR conditions to create more specific queries. If this is hard to implement in Tulip at this stage, we can discuss more complex demo queries and ask you to extract subgraphs directly.
  5. Hard to spot (multimodal) nodes: Same as in Pau-Demo: issues with distinguishing node types. I need to get a quick sense of which nodes and node-types are present in a subgraph. Reading labels of documents is hard because they are too long but abbreviating them does not make sense.
  6. Enrich with captions and links to CVCE docs: I need easy access to the captions and ideally a link to the original document. As a temporary solution, could we create a link either to histograph (preferred) or to the CVCE site (based on the example of a CVCE URL using CVCE-DOIs)?
  7. Spreadsheet: To get around the node-type problem in point 5, I tried the spreadsheet view to get an overview of which nodes are present. However, it only shows a limited number of characters, so I can't read captions and longer titles.
  8. Spreadsheet: How can I download csv files of the spreadsheet view to show it to my colleagues? If I just copy them from the spreadsheet view I seem to get different data.
  9. IDs: I used the slug (“konrad-adenauer”) as a unique identifier to find specific nodes, but of course this does not work for ePubs or documents.
  10. Need to filter node types: ePublication nodes are dominant (center node in the screenshot below with node “konrad-adenauer” in blue).
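
To illustrate the kind of query I mean in note 4, here is a rough sketch of a time-period filter combined with “co-occurs with” AND and OR conditions. It runs on plain Python objects, not on Tulip, and the Document class, its fields, the function name and the example records are hypothetical; only the slug “konrad-adenauer” comes from the demo.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str                                  # hypothetical identifier
    year: int                                    # year used for the time filter
    mentions: set = field(default_factory=set)   # slugs of co-occurring entities


def select_documents(docs, start_year, end_year, all_of=(), any_of=()):
    """Keep documents dated within [start_year, end_year] that mention every
    slug in `all_of` (AND) and, if `any_of` is given, at least one of its
    slugs (OR)."""
    selected = []
    for doc in docs:
        if not start_year <= doc.year <= end_year:
            continue
        if not set(all_of) <= doc.mentions:
            continue
        if any_of and not (set(any_of) & doc.mentions):
            continue
        selected.append(doc)
    return selected


# Made-up records for illustration only.
docs = [
    Document("doc-1", 1950, {"konrad-adenauer", "robert-schuman"}),
    Document("doc-2", 1957, {"konrad-adenauer", "jean-monnet"}),
    Document("doc-3", 1963, {"konrad-adenauer", "robert-schuman"}),
    Document("doc-4", 1955, {"jean-monnet"}),
]

hits = select_documents(
    docs, 1949, 1958,
    all_of=["konrad-adenauer"],
    any_of=["robert-schuman", "jean-monnet"],
)
hit_ids = [doc.doc_id for doc in hits]
```

Here doc-3 falls outside the time period and doc-4 lacks the required slug, so only doc-1 and doc-2 remain. If extracting subgraphs by hand is easier for now, a description in this form could serve as the specification for each demo query.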

Tulip bugs?

This happens when I add a panel to Tulip on Mac. Is this a bug? Is there a better way to arrange a side-by-side view of graph and spreadsheet?

This happened when I tried to re-add a graph panel.