Data Mining and Data Visualization Tools
dant:
@Vulcan59:
I am not sure how best to go about setting up
projects using the forum. Perhaps some suggestions
could come from the SOTT developers/IT, since they
understand how to set up and manage projects. Using
a forum will present some challenges, though, so maybe
there is a better way?
Consider this structure - how can this work in a forum?
Maybe this is overcomplicating and could be simpler?
Sott Projects
+===============+ ...
| |
Data Mining & Visualization Project #2 ...
|
+=====+===+===+
| | | |
CoZion Dutroux FoS Fucilla
|
+=======+======== ...
| ...
Bush
The only reason I suggested the merge is because we may
not want this to be public, may not want to clutter the original
post, and maybe for other reasons.
seek10:
Something to keep in mind when setting expectations; I don't mean to discourage. This tool has a LOT of potential, judging by the questions on their forum. People are using it even for brain mapping, etc.
_http://forum.gephi.org/viewtopic.php?t=1816
--- Quote ---Re: Which upgrade for best gephi performance?
Postby jonswords » 23 Apr 2012 21:01
The graphs i'm currently working with are c.15000 nodes and 10-12000 edges. This will likely increase in size in the near future.
Re: Which upgrade for best gephi performance?
Postby seinecle » 24 Apr 2012 09:38
I would go for 8Gb RAM minimum and as good a graphic card as you can.
I've read somewhere that the roadmap for Gephi includes the development of GPU (graphic cards)-based computations, because this provides a huge boost in processing power (not just for the graphic rendering, but also for many sorts of computation). Even if these developments are surely not for the next 6 months, you could future-proof your laptop by having a very good GPU on it.
Best,
Clement
--- End quote ---
I wondered why it was slow on my laptop when I imported an example with 5000 nodes.
Megan:
I am finally starting to make sense out of this thread, and I am looking at Gephi and Neo4j. I don't yet understand how the network was built that produced the graph above, but I think it will be clearer once I have these things running. I had already been thinking about ways to build graphs from forum messages, and data collected that way might feed into these tools.
Megan:
Neo4j is becoming my introduction to NoSQL databases. As an SQL/MDX developer I haven't had any reason to look at them, but it makes sense for this kind of data. Reading about Neo4j doesn't, however, answer my questions about software for building graphs from forum messages or other textual sources.
I am thinking about collecting data from the forum using either RSS or the SMF ".xml" feature. This would also make it possible to collect from other sources such as blogs using more or less the same code.
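A minimal sketch of the RSS side of this idea, assuming the feed is plain RSS 2.0 with the usual title, link, pubDate and description elements. The sample feed, the example.org URLs and the function name parse_rss are made up for illustration; a real forum feed may expose different elements. Because it only relies on the generic item structure, the same function should work on a blog feed as well.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content; a real forum or blog feed may differ.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example forum</title>
    <item>
      <title>Re: Data Mining and Data Visualization Tools</title>
      <link>http://forum.example.org/index.php?topic=1.msg10#msg10</link>
      <pubDate>Mon, 23 Apr 2012 21:01:00 GMT</pubDate>
      <description>First message body</description>
    </item>
    <item>
      <title>Re: Data Mining and Data Visualization Tools</title>
      <link>http://forum.example.org/index.php?topic=1.msg11#msg11</link>
      <pubDate>Tue, 24 Apr 2012 09:38:00 GMT</pubDate>
      <description>Second message body</description>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return one dict per <item> element of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    messages = []
    for item in root.iter("item"):
        raw_date = item.findtext("pubDate")
        messages.append({
            "subject": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "posted": parsedate_to_datetime(raw_date) if raw_date else None,
            "body": item.findtext("description", default=""),
        })
    return messages


if __name__ == "__main__":
    for msg in parse_rss(SAMPLE_RSS):
        print(msg["posted"].isoformat(), msg["link"])
```

Fetching the feed text itself (for example with urllib.request) is left out so the sketch runs without network access.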
What I am trying to figure out now is what to collect. Possibilities I see so far include
* Board ID
* Topic ID
* Message ID
* Poster ID
* Subject
* Links to other posts
* Links to other websites
* Body text
* Nouns & noun phrases
Most of these fields are specific to the forum (and maybe others like it), but the last three are more general. One of the things I would like to do is to analyze text for nouns and noun phrases, and build a relational database of those that are shared. I am not really sure how to go about it; all I can do is try it and refine as I go.
Some of these items represent nodes or node attributes in a network (IDs, subject, body) while others represent relationships (links, nouns). Noun and noun phrase relationships could be very useful for connecting things, but they don't seem to have an obvious "direction" and I am not sure how this might affect their use in Gephi. Or maybe they do at least sometimes have an implied direction deriving from the way the text in which they appear may be linked to other posts via quoting, in the case of forum posts. I could give them a direction based on time stamps, I suppose.
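The time stamp idea can be sketched roughly as follows, assuming the noun phrases have already been extracted by some other tool and stored as a set per message. The message records, the field names and the function names are hypothetical. Each pair of messages that shares at least one term gets a single edge pointing from the earlier message to the later one, weighted by the number of shared terms. The CSV uses Source, Target, Type and Weight columns, which is my understanding of what Gephi's spreadsheet import expects for an edge table.

```python
import csv
import io
from itertools import combinations

# Hypothetical records: message ID, ISO time stamp, extracted terms.
MESSAGES = [
    {"id": 103, "posted": "2012-04-25T10:00", "terms": {"neo4j", "nosql"}},
    {"id": 101, "posted": "2012-04-23T21:01", "terms": {"gephi", "graph", "laptop"}},
    {"id": 102, "posted": "2012-04-24T09:38", "terms": {"gephi", "gpu", "laptop"}},
]


def shared_term_edges(messages):
    """Return (source, target, weight) tuples, earlier message first."""
    # ISO time stamps sort correctly as plain strings.
    ordered = sorted(messages, key=lambda m: m["posted"])
    edges = []
    # combinations() keeps the sorted order, so the first element of
    # each pair is always the earlier message.
    for earlier, later in combinations(ordered, 2):
        shared = earlier["terms"] & later["terms"]
        if shared:
            edges.append((earlier["id"], later["id"], len(shared)))
    return edges


def edges_to_csv(edges):
    """Render the edges as CSV text with an edge-table header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for source, target, weight in edges:
        writer.writerow([source, target, "Directed", weight])
    return buffer.getvalue()


if __name__ == "__main__":
    print(edges_to_csv(shared_term_edges(MESSAGES)))
```

Comparing every pair of messages grows quadratically with the number of messages, so a larger collection would probably need an index from term to messages instead.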
Megan:
I managed to build something that downloads forum messages into a relational database automatically, a first step toward doing other things. The term extraction tool I am using is less flexible than I thought, and I am going to have to figure out a better way of doing that, or else come up with a different one. In the meantime, at least I am collecting data.
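For anyone wanting to try the same thing, here is a minimal sketch of what such a store could look like, using SQLite and the fields listed earlier in the thread. This is not the actual tool described above; the table layout and the names open_store and save_message are assumptions made for illustration. INSERT OR IGNORE keeps repeated downloads of the same message from creating duplicate rows.

```python
import sqlite3

# Hypothetical schema based on the fields listed earlier in the thread.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    message_id INTEGER PRIMARY KEY,
    board_id   INTEGER NOT NULL,
    topic_id   INTEGER NOT NULL,
    poster_id  INTEGER NOT NULL,
    subject    TEXT,
    body       TEXT
);
CREATE TABLE IF NOT EXISTS links (
    message_id INTEGER NOT NULL REFERENCES messages(message_id),
    url        TEXT NOT NULL,
    PRIMARY KEY (message_id, url)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_message(conn, msg, urls=()):
    """Store one message and its outgoing links; duplicates are skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO messages "
            "(message_id, board_id, topic_id, poster_id, subject, body) "
            "VALUES (:message_id, :board_id, :topic_id, :poster_id, "
            ":subject, :body)",
            msg,
        )
        conn.executemany(
            "INSERT OR IGNORE INTO links (message_id, url) VALUES (?, ?)",
            [(msg["message_id"], url) for url in urls],
        )


if __name__ == "__main__":
    store = open_store()
    example = {
        "message_id": 10, "board_id": 1, "topic_id": 1, "poster_id": 7,
        "subject": "Example subject", "body": "Example body",
    }
    save_message(store, example, urls=["http://forum.example.org/a"])
    save_message(store, example, urls=["http://forum.example.org/a"])
    print(store.execute("SELECT COUNT(*) FROM messages").fetchone()[0])
```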