Tech Support > Dr. Bizaramor Strikes Back
Data Mining and Data Visualization Tools
Megan:
--- Quote from: name on May 01, 2012, 03:39:39 PM ---...I made the Fucilla Graph by simply scraping the data about persons and organizations by hand from the first page of the thread and saving it as CSV, then doing the same for the edges, followed by some adjustment by hand. I am looking forward to seeing what you are doing...
--- End quote ---
I am able to write out a GEXF file now and read it into Gephi. I need to shape the data differently, however, and I need to learn how to use Gephi.
The simple node lists I am creating now are too simple. They contain "parallel edges" that Gephi can't process. I can solve that by adding hierarchy -- instead of having a flat list of message nodes the graph can contain each "term" as a child of each message (forum post) in which it appears. The edges would then connect terms instead of messages. While I am at it, I should include the board & topic hierarchy above the messages.
Would that be a useful graph? Now that I have a basic set of components for building graphs, I can construct them any way that would work for visualization, but what would work well?
Presently, the only edges in the graph are from a "repeats" relationship, where an edge represents a word in a message that is a repeat of that same word in some earlier message. I can add another set of edges for a "quotes" relationship, where an edge connects an earlier message with a later one that quotes from it. I also have data for each message about who posted it (name and forum profile link) and who started the topic (name and forum profile link) that contains it.
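A minimal sketch of how the "repeats" edges described above could be computed, in Python. This is not the actual code behind the post: the function name `repeat_edges`, the message ids, the sample texts, and the choice to link each word to the most recent earlier message containing it are all assumptions made for illustration.

```python
import re

def repeat_edges(messages):
    """Return (earlier_id, later_id, word) for each word that repeats
    a word from an earlier message.

    `messages` is a list of (message_id, text) pairs in posting order.
    Assumption: a repeat links to the most recent earlier message that
    contains the same word.
    """
    last_seen = {}  # word -> id of the most recent message containing it
    edges = []
    for msg_id, text in messages:
        words = set(re.findall(r"[a-z']+", text.lower()))
        for word in sorted(words):
            if word in last_seen:
                edges.append((last_seen[word], msg_id, word))
            last_seen[word] = msg_id
    return edges

# Hypothetical sample data, not taken from the forum.
messages = [
    ("m1", "Gephi reads GEXF files"),
    ("m2", "I wrote a GEXF file for Gephi"),
]
edges = repeat_edges(messages)
```

With this sample, m1 and m2 share two words, so two edges run between the same pair of message nodes. Those are the "parallel edges" mentioned above.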
parallel:
Most of the tech jargon and acronyms here are over my head, but I find it an interesting topic and aim to learn more about it (I recently did my first MySQL).
--- Quote from: Megan on May 04, 2012, 06:37:41 AM ---While I am at it, I should include the board & topic hierarchy above the messages.
Would that be a useful graph? Now that I have a basic set of components for building graphs, I can construct them any way that would work for visualization, but what would work well?
--- End quote ---
Is it possible to do a layered graph, so that one could choose what level of complexity to view or combine? Like board/topic data on one layer, names in the network on another, additional related data on a third layer, and so forth? That way one could match the presentation to its context by adding or subtracting layers.
name:
--- Quote from: Megan ---I am able to write out a GEXF file now and read it into Gephi.
--- End quote ---
Cool! Congratulations on your progress, and thanks for your interest in this and the great work!
--- Quote from: Megan ---... Would that be a useful graph? Now that I have a basic set of components for building graphs, ...
--- End quote ---
I don't know. I suppose that it depends on what one wants to understand with such a graph.
My original idea was to use graphs to help understand the relationships between the people, organizations... mentioned in threads such as the Fucilla one, or the longer Rense thread (for example), where people discuss a subject and contribute information about people, entities, events ... over time. I am currently trying to make KNIME extract the entities (see http://en.wikipedia.org/wiki/Named_entity_recognition) from a text, and in the next step I'll see whether I can make it find relations between these entities and whether there is a way to tag them in a sensible way. What I eventually want is a graph that more or less resembles something I'd also do by hand.
The "parallel edges" you mention amount to the weight of the relation between those two particular nodes. Instead of writing one edge per relation, keep a single edge and use its "weight" field so that Gephi can process it, counting up each time you find another relation between the same two nodes - see the GEXF draft primer at "2.3.3 Declaring an Edge" on page 7.
You may also want to look at http://gexf.net if you haven't discovered it yet.
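The weight suggestion above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `weighted_gexf` and the sample edge list are hypothetical, and the namespace and version strings are those of the GEXF 1.2 draft, which should be checked against the primer for the version actually in use.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def weighted_gexf(edge_list):
    """Collapse parallel edges into one weighted edge per (source, target)
    pair and return the graph as a GEXF string."""
    # Count how many relations were found between each ordered pair.
    weights = Counter((s, t) for s, t, *_ in edge_list)

    # Assumption: GEXF 1.2 draft namespace and version.
    gexf = ET.Element("gexf", {"xmlns": "http://www.gexf.net/1.2draft",
                               "version": "1.2"})
    graph = ET.SubElement(gexf, "graph", {"defaultedgetype": "directed"})

    nodes = ET.SubElement(graph, "nodes")
    for node_id in sorted({n for pair in weights for n in pair}):
        ET.SubElement(nodes, "node", {"id": node_id, "label": node_id})

    edges = ET.SubElement(graph, "edges")
    for i, ((s, t), w) in enumerate(sorted(weights.items())):
        ET.SubElement(edges, "edge", {"id": str(i), "source": s,
                                      "target": t, "weight": str(w)})
    return ET.tostring(gexf, encoding="unicode")

# Hypothetical input: two relations found between the same two messages.
xml_text = weighted_gexf([("m1", "m2", "gephi"), ("m1", "m2", "gexf")])
```

The two relations between m1 and m2 come out as a single edge with weight 2 instead of two parallel edges.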
Megan:
--- Quote from: parallel on May 04, 2012, 11:25:03 PM ---Most of the tech jargon and acronyms here are over my head, but I find it an interesting topic and aim to learn more about it (I recently did my first MySQL).
--- End quote ---
It wants to be over my head too, but I keep batting it back down. When I started in this field, a long time ago, it was actually possible to understand what you were doing, and I understood what was going on right down to the circuit level. Now you learn things at a certain high level and don't worry about the rest unless it quits working -- there are too many levels. How far down you go depends on how fascinated you are with it. I am not fascinated with it much at all; I just want to be able to do something with it.
Part of the challenge is that "lazy" system 2. It doesn't want to deal with so much complexity, and you have to keep telling it that it isn't that bad and to hang in there. I learned long ago to divide problems into "chunks" of limited size, each of which (hopefully) is not overwhelming. The chunks might be nested one within another, or they might be parts of a system flow where one chunk does its work and hands off the results to the next chunk.
--- Quote ---Is it possible to do a layered graph, so that one could choose what level of complexity to view or combine? Like board/topic data on one layer, names in the network on another, additional related data on a third layer, and so forth? That way one could match the presentation to its context by adding or subtracting layers.
--- End quote ---
I think that is what making the node list hierarchical is going to do. I don't know Gephi well enough yet to predict how changes to the input file are going to manifest in the visualizations. Once I have a good set of input data I will gain experience with Gephi.
Megan:
--- Quote from: name on May 05, 2012, 12:51:35 AM ---
--- Quote from: Megan ---I am able to write out a GEXF file now and read it into Gephi.
--- End quote ---
Cool! Congratulations on your progress, and thanks for your interest in this and the great work!
--- Quote from: Megan ---... Would that be a useful graph? Now that I have a basic set of components for building graphs, ...
--- End quote ---
I don't know. I suppose that it depends on what one wants to understand with such a graph.
My original idea was to use graphs to help understand the relationships between the people, organizations... mentioned in threads such as the Fucilla one, or the longer Rense thread (for example), where people discuss a subject and contribute information about people, entities, events ... over time. I am currently trying to make KNIME extract the entities (see http://en.wikipedia.org/wiki/Named_entity_recognition) from a text, and in the next step I'll see whether I can make it find relations between these entities and whether there is a way to tag them in a sensible way. What I eventually want is a graph that more or less resembles something I'd also do by hand.
--- End quote ---
Thanks. I'll digest that when I have some free time. :)
--- Quote ---The "parallel edges" you mention amount to the weight of the relation between those two particular nodes. Instead of writing one edge per relation, keep a single edge and use its "weight" field so that Gephi can process it, counting up each time you find another relation between the same two nodes - see the GEXF draft primer at "2.3.3 Declaring an Edge" on page 7.
You may also want to look at http://gexf.net if you haven't discovered it yet.
--- End quote ---
Gephi just flat ignored the parallel edges; said it wasn't implemented yet. The problem should go away once I implement a board/topic/message/term hierarchy in the node list.
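A rough Python sketch of the board/topic/message/term hierarchy mentioned above, showing why it avoids parallel edges. It is not the actual implementation: the function name `term_hierarchy`, the ids `b1` and `t1`, the `message/term` id scheme, and the sample texts are all assumptions. How the parent links would be written into GEXF is left out and should be checked against the GEXF primer.

```python
import re

def term_hierarchy(board, topic, messages):
    """Return (parent_of, edges).

    parent_of maps each node id to its parent id: board, then topic,
    then message, then term.
    edges connect a term node to the most recent earlier node for the
    same term, so the edges run between terms, not between messages.
    """
    parent_of = {board: None, topic: board}
    last_node = {}  # term -> id of the most recent node for that term
    edges = []
    for msg_id, text in messages:
        parent_of[msg_id] = topic
        for term in sorted(set(re.findall(r"[a-z']+", text.lower()))):
            node_id = f"{msg_id}/{term}"  # hypothetical id scheme
            parent_of[node_id] = msg_id
            if term in last_node:
                edges.append((last_node[term], node_id))
            last_node[term] = node_id
    return parent_of, edges

# Hypothetical sample data.
parent_of, edges = term_hierarchy("b1", "t1", [
    ("m1", "Gephi reads GEXF files"),
    ("m2", "I wrote a GEXF file for Gephi"),
])
```

Because every term occurrence is its own node, the two words shared by m1 and m2 give two edges between different node pairs (m1/gephi to m2/gephi, and m1/gexf to m2/gexf) instead of two parallel edges between m1 and m2.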
Apparently I can also model changes in the graph over time. That sounds like loads of fun!