Text & Data Mining by practical means

Thursday, March 6, 2014

The Mind Map project: a new concept of Enterprise Knowledge Management

Abstract
A project to build an innovative Information Management tool to extract, correlate and expose unstructured information.

A screenshot of one of the mind maps generated by the tool.
Different colors depict sub-concepts.

The demo
Some time ago I published a few posts on this blog presenting algorithms I designed to extract relevant information from documents and, more generally, from unstructured content (such as tweets, blog posts, and web pages).
I don't want to spend too many words on it: I think a demo describes the idea I have in mind better.
It's still a prototype, and a lot of work is still required, but I hope the video lets you appreciate the idea.
In the video I test the prototype of the application on a Wikipedia page.

PS
For the best viewing experience, watch the video on YouTube with the "HD" quality option.

Looking forward to receiving your feedback!

Stay Tuned

cristian

5 comments:

Michele Filannino, March 9, 2014 at 3:55 AM
Some questions about it:
- What is your definition of a phrase?
- How does your algorithm select the most relevant ones? Does it rank by frequency?
- Does a concept equal a word? How do you define a sub-concept?
- What kind of information retrieval algorithm do you have in mind? (Input, output, logic)
- What is the meaning of the arrows' direction? Is it syntactic (x->y ::= word x precedes word y)?

The graph seems reasonable, but I was wondering what the MindGraph would look like if we applied the technique to the entire English Wikipedia. What do you expect?

I see some clear overlaps with at least two pieces of research I am currently working on. If you like, we can have a chat about it. :)

Thanks,
michele.
Cristian Mesiano, March 9, 2014 at 4:37 AM
Ciao Michele,
- The definition of a phrase is determined by the algorithm: it chooses autonomously how to chunk the text into phrases (no punctuation rules are used).
- The ranking is not based on a frequency approach. It works with graph theory methodologies (it's part of the research I'm working on).
- The sub-concepts are defined by a graph clustering technique (for the time being I'm using something standard).
- The arrows: x->y ::= word x precedes word y.
It doesn't make sense to build the graph for the entire Wikipedia: it's much more helpful to aggregate homogeneous information and find relationships.
The idea is to have a set of mindgraphs (I like your definition!) for each document.
Sure, we can set up a virtual coffee whenever you want!
cheers
c.
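
The arrow convention given in the reply above (x->y means word x precedes word y) can be illustrated with a minimal Python sketch. Everything beyond that convention is an assumption made for illustration: the whitespace tokenizer, the function names `build_precedence_graph` and `pagerank`, and above all the use of PageRank. The reply only says that the ranking relies on graph theory rather than frequency and does not disclose the method, so PageRank is a stand-in here, not the author's algorithm.

```python
def build_precedence_graph(text):
    """Return a directed graph where an edge x -> y means word x precedes word y."""
    # Naive tokenizer (assumption): split on whitespace, trim punctuation, lowercase.
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    graph = {w: set() for w in words}
    for x, y in zip(words, words[1:]):
        if x != y:
            graph[x].add(y)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration; a stand-in for the undisclosed ranking."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by words with no outgoing edge is spread evenly over all words.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for successor in graph[node]:
                new_rank[successor] += damping * rank[node] / len(graph[node])
        rank = new_rank
    return rank


# Hypothetical sample text, chosen only to exercise the two functions.
sample = "knowledge management tools extract knowledge from documents"
graph = build_precedence_graph(sample)
ranks = pagerank(graph)
for word in sorted(ranks, key=ranks.get, reverse=True)[:3]:
    print(word, round(ranks[word], 3))
```

Because the score of a word depends on which words point to it rather than on how often it occurs, this kind of ranking is not frequency-based, which is the only property the sketch shares with the description in the reply.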
leo, June 27, 2019 at 1:08 AM
Amazing content.
Data Mining Process
sri, April 11, 2022 at 5:05 AM
Thanks for sharing this.
csm training
Scrum master Training
Ziya, June 26, 2023 at 5:50 AM
I always like to read well-written articles, like this one I found in your post. Everyone will thank you for sharing this knowledge because it is really useful. Fantastic work.
best inventory management software