Thursday, March 6, 2014

The Mind Map project: a new concept of Enterprise Knowledge Management

Abstract
A project to build an innovative Information Management tool to extract, correlate and expose unstructured information.
A screenshot of one of the mind maps generated by the tool.
Different colors are used to depict sub-concepts.


The demo
Some time ago I published a few posts on this blog presenting algorithms I designed to extract relevant information from documents and, more generally, from unstructured content (such as tweets, blog posts and web pages).
I don't want to spend too many words: I think a demo describes the idea I have in mind better.
It's still a prototype and a lot of work is still required, but I hope this video lets you appreciate the idea.
In the video I test the prototype of the application on a Wikipedia page.

PS
For the best viewing experience, watch the video on YouTube with the "HD" quality option.

...Looking forward to receiving feedback!


Stay Tuned
cristian

2 comments:

  1. Some questions about it:
    - what is your definition of phrase?
    - how does your algorithm select the most relevant ones? Does it rank with respect to frequency?
    - Does concept equal word? How do you define a sub-concept?
    - What kind of information retrieval algorithm do you have in mind? (Input, output, logic)
    - What is the meaning of the arrows' direction? Syntactic ( x->y ::= word x precedes word y)?

    The graphs seem reasonable, but I was wondering what the MindGraph would look like if we applied the technique to the entire English Wikipedia. What do you expect?

    I see some clear overlaps with at least two pieces of research I am currently working on. If you like, we can have a chat about it. :)

    Thanks,
    michele.

  2. Ciao Michele,
    -> The definition of a phrase is determined by the algorithm: it chooses autonomously how to chunk the text into phrases (no punctuation rules are used).
    -> The ranking is not based on a frequency approach. It works with graph theory methodologies (...it's part of the research I'm working on).
    -> The sub-concepts are defined by a graph clustering technique (for the time being I'm using something standard).
    -> The arrow: x->y ::= word x precedes word y.
    It doesn't make sense to build the graph for the entire Wikipedia: it's much more helpful to aggregate homogeneous info and find relationships.
    The idea is to have a set of mindgraphs (I like your definition!) for each document.
    Sure we can set up a virtual coffee whenever you want!
    cheers
    c.
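
To make the arrow convention from the reply above concrete (x->y ::= word x precedes word y), here is a minimal sketch of a word-precedence graph. It is only an illustration and not the prototype's code: the function name `build_precedence_graph`, the lowercase alphabetic tokenization and the reading of "precedes" as "immediately precedes" are all assumptions, and neither the ranking nor the clustering step is covered.

```python
import re
from collections import defaultdict


def build_precedence_graph(text):
    """Return {x: {y: count}}: an edge x -> y for each time word x is
    immediately followed by word y in the text."""
    # Assumption: words are runs of lowercase letters. The prototype's own
    # chunking is not described in the post, so this is just a stand-in.
    words = re.findall(r"[a-z]+", text.lower())
    graph = defaultdict(lambda: defaultdict(int))
    # Pair every word with its successor and count the resulting edge.
    for x, y in zip(words, words[1:]):
        graph[x][y] += 1
    # Convert to plain dicts so that lookups of missing words raise KeyError.
    return {x: dict(successors) for x, successors in graph.items()}


graph = build_precedence_graph("the mind map shows the mind graph")
for x, successors in graph.items():
    for y, count in successors.items():
        print(f"{x} -> {y} (seen {count}x)")
```

Words that never precede another word (here the final "graph") appear only as targets of arrows, so they have no entry of their own in the dictionary.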
