GUESS

The Graph Exploration System

Version 1.0.1 (beta)

Eytan Adar


 

1. GUESS features

 

This tool is/includes:

 

This tool isn’t:


1. GUESS features

2. Getting Started

2.1 Installation

2.2 Running

3. Tutorial

4. Getting Your Data In

4.1 The GUESS .gdf format

4.2 GraphML

5. Manipulating and Querying Nodes and Edges

5.1 Queries and Sets

5.2 The Information Window

6. Laying out Graphs

7. Analysis Commands

7.1 Clustering

7.2 Visualizing Fields

7.3 Field, Graph, Node, Edge Statistics

7.4 Random Graph Generation

8. Modifying graphs

9. Output Commands

10. Subgraphs

11. States and Animations

11.1 State Sensitive Queries

11.2 State Alternative: Ranges

12. Legends

13. Interface to R

14. Convex Hulls

15. Modifying the Interface/Expanding GUESS

15.1 Example 1: A Simple Button

15.2 Example 2: A Threshold Slider

15.3 Example 3: A Network Monitor

15.4 Example 4: Remote control of GUESS

15.5 Responding to clicks and other code bits

16. Applets and Applications

16.1 Signing GUESS

16.2 Compiling Your Code

16.3 Advanced Applet Features

16.4 Building your own Application

17. Front-end Alternatives

18. Command Line Options

19. Additional Information

Appendix A. Colors

Appendix B. Changes from regular Jython/Python


 

2.  Getting Started

2.1 Installation

You’re going to need 3 things:

 

If you’re running on Windows: We’ve included a sample guess.bat script which will launch GUESS.  You’ll want to set GUESS_HOME either as a global environment variable or directly in the script.

If you’re running on UNIX: Same as above, but with the guess.sh script.

If you’re running on a Mac: Andrea Wiggins wrote a very helpful manual on getting started with GUESS on the Mac.  It’s available here:  http://graphexploration.cond.org/MacGUESSinstall.pdf

2.2 Running

We’ve greatly simplified running GUESS.  You no longer have to build the database as a separate step (as in Zoomgraph).  Just double click on guess.bat and you’ll be in the system.  If you don’t want to type commands into the console window, you can run guessallgui.bat instead (you will also need to set GUESS_HOME in that script), which will give you an interpreter window inside the main UI window.


 

3. Tutorial

Let’s start with a simple example.  There is a sample database (sample.gdf) in the zip file.  It includes about 400 nodes and 700 edges.  Take a look at it to get a sense of what goes into a data definition file.  Don’t be intimidated, though: almost none of it is required.

 

Run guess.bat and we’ll be up and running.  The first thing you’ll see is a dialog asking if you want to open an existing database or load a new file.  We’re going to start with a new file so click the middle button.  When the file chooser dialog comes up pick the sample.gdf file.  You’ll then be asked if you want the new database to be persistent or in memory.  Just make it in memory for now.  This means that if you make changes GUESS will forget about it when you quit, but that’s fine for now.  You should see something that looks like Figure 1.

 

The graph that popped up represents a corporate communication network.  Each node represents an employee (with a department property), and each edge represents communication between two employees (with a frequency property on the edge indicating the number of undirected communications).

 

First, some basics: Try moving around in this space.  If you hover over a node or edge you can see some details pop up.  If you click on the node it will center in the display.  Clicking on an edge will bring both end points into view dynamically.  Left clicking and dragging on the background will allow you to move the display around.  Right clicking and moving the mouse will zoom you in and out of the display.  If you hold down the shift key while left dragging on the background you’ll be able to draw a rectangle to zoom to.

A new feature of GUESS is the ability to more easily move, delete, and edit nodes and edges.  Notice the 5 buttons at the bottom of the screen.  The first button on the left is the browsing mode, which you start out in.  The next two allow you to select nodes and edges respectively.  The fourth is for manipulating convex hulls and the last allows you to annotate the document.  If you click on the node tool, clicking on a node will select it.  You can then move the node around by dragging it around the screen or pull on the handles to change its size.  By holding down shift as you click on the background you will be able to select multiple nodes at the same time and move them all at once (currently there is no way to resize all the nodes at the same time).

 

Ok, now back in the command prompt where you started GUESS you should see a prompt that looks like this “>.”  Unlike Zoomgraph, GUESS uses a modified Jython interpreter (which in turn is based on Python).  You can now write full programs in the GUESS language.  If you type “2+2” it will evaluate to 4.  If you type “test = 4” the newly defined test variable will be set to 4.  The interpreter also understands if you want to enter longer routines or function definitions.  For example, let’s say we want to define a factorial function.  Start by typing “def fact(a):” and hit enter.  The prompt will now change from a “>” to a “.” indicating that you want to write more before the command gets evaluated/executed.  Now you can start typing in the rest of your code (don’t forget that in Python white space defines code blocks).  When you’re done simply hit enter on a blank line.

 

> def fact(a):

.       if (a == 1):

.               return(1)

.       else:

.               return(a*fact(a-1))

.

> fact(5)

120

 

Through the interpreter you can also control what you see on the screen.  For example type “center” and hit enter.  The display will automatically center to include all the nodes (assuming you moved around in the initial layout).  (note: type quit at any time to exit or just close the display window… don’t ctrl-c as you may corrupt your database).

 

Nodes can either be selected by name or through a query on their properties.  For example, try typing: “(node5,node6).color = red”  This will make nodes 5 and 6 red.  Our sample database has other properties on nodes.  Specifically, nodes here have a department.  So for example, “(dept == ‘dept5’).color = black” will set all the people in department 5 to a black color. 

 

Edges are accessed in a slightly different way.  Edges have names that are the start and end nodes.  For example, “(node67-node76).color = red” changes the edge between person 67 and 76 to red.  You can also access edges by query.  As mentioned earlier, edges in this case have an attribute called freq (frequency).  So if we wanted to hide edges where the communication frequency was under 100 we would type: “(freq < 100).visible = 0”  The ‘-‘ also implies directionality.  If the database indicated directions (which this one doesn’t) you could talk about: node67->node76, node67<-node76, node67<->node76, or node67?node76[1].

 

The last mechanism for accessing edges is by defining node sets.  Let’s say we only care about communications between dept 4 and 9. 

  1. Let’s hide everything: “g.nodes.visible = 0”
  2. Then show only the nodes in departments 4 or 9: “((dept == ‘dept4’) | (dept == ‘dept9’)).visible = 1”
  3. Finally, we can change the color for inter-departmental edges by typing: “((dept == ‘dept4’)-(dept == ‘dept9’)).color = red”  This command tells GUESS to find all nodes in department 4 and all nodes in department 9 and then find all edges between them (in this case only one).
  4. We can also do “((dept == ‘dept9’)-(dept == ‘dept9’)).color = blue” to color just the intra-departmental links of department 9 blue.  You should see something like Figure 2.

 

Because we are using a real language like Python in the background we could have made things much simpler by declaring some intermediate variables.  For example, the following commands would have led to the same results:

 

g.nodes.visible = 0

dept4 = (dept == 'dept4')

dept9 = (dept == 'dept9')

(dept4,dept9).visible = 1

(dept4-dept9).color = red

(dept9-dept9).color = blue

 

 

The GUESS system also contains a number of analysis modules to simplify basic tasks (calculating graph metrics, etc.).  These are described in much more detail elsewhere, but just to give you a flavor try this.  First, reset the graph to its starting state.  Type: ‘g.nodes.visible = 1’.  Then type: ‘g.edges.color = green’ and finally ‘g.nodes.color = blue’ (you should see the same thing as what we started with).  Type “density”.  This should calculate the density of the graph (.00827…).
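
For reference, graph density is commonly defined as the fraction of possible edges that are actually present.  The sketch below is plain Python rather than GUESS code, and it is an assumption that the GUESS density command follows this common definition; the function name and its parameters are made up for illustration.

```python
def graph_density(num_nodes, num_edges, directed=False):
    """Fraction of possible edges that are present (common definition).

    Assumption: no self-loops and no duplicate edges.  This is not
    taken from the GUESS source; it only illustrates the metric.
    """
    possible = num_nodes * (num_nodes - 1)
    if not directed:
        # An undirected pair counts once, not twice.
        possible = possible / 2.0
    if possible == 0:
        return 0.0
    return num_edges / float(possible)


# A 4-node star (one hub, three leaves) has 3 of its 6 possible edges.
star_density = graph_density(4, 3)
```

Sparse graphs like the sample network give values close to zero under this definition.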

Other analysis modules do more interesting things.  For example, colorize will color nodes and edges by different features.  Try typing “colorize(dept)”  Each node will now be a different (random) color.  The colorize function will also generate a bunch of subgraphs.  Then try “colorize(freq,blue,red)” which will assign a color over a linear range (from blue to red) based on the frequency of communications.

 

Ok, let’s try something a little more interesting:

·         Let’s reset everything…

g.nodes.visible = 1

g.nodes.color = blue

g.edges.color = green

·         For every department we can assign a random color

colorize(dept)

·         Let’s say we want to create a legend so we can tell which color goes with each department.  The first thing is to get GUESS to group nodes by department for us

deptg = groupBy(dept)

·         The variable deptg is a set of sets where each internal set holds the nodes of one department.  So let’s create a legend for ourselves

xy = Legend()

·         What you’ll now see is a blank legend screen.  We’re going to populate the legend with the first element in each of deptg’s subsets.

for d in deptg: xy.add(d[0],d[0].dept)

 

The last line translates to: iterate over all subsets in deptg, setting each one in turn to the variable d and then running the add(..) command on the legend.  The add command takes a “prototype” node as input and a text string to put next to the prototype.  So we’ll take the first node from each group (i.e. a sample node from the department) and add it to the legend along with that node’s department name.  You’ll hopefully see something like Figure 3 at this point.  This is fairly standard Python syntax but you can get some great refresher materials on the web (see Additional Information section).
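
If the grouping step is unfamiliar, the following plain-Python sketch mimics it outside of GUESS.  Nodes are modeled as dictionaries, and group_by is a hypothetical stand-in for the GUESS groupBy command, not its actual implementation.

```python
def group_by(nodes, field):
    """Group node dictionaries by the value of one field.

    Returns a list of groups (lists of nodes), ordered by the first
    appearance of each field value.
    """
    groups = {}
    order = []
    for node in nodes:
        key = node[field]
        if key not in groups:
            groups[key] = []
            order.append(key)
        groups[key].append(node)
    return [groups[key] for key in order]


nodes = [
    {"name": "bob", "dept": "dept1"},
    {"name": "john", "dept": "dept1"},
    {"name": "alice", "dept": "dept2"},
]

# One legend entry per group: a prototype node plus the label text.
legend_entries = [(group[0]["name"], group[0]["dept"])
                  for group in group_by(nodes, "dept")]
```

Each entry pairs the first node of a group with that group's department name, which is the same information the loop above hands to the legend.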


4. Getting Your Data In

 

In GUESS we are actively working to disentangle the frontend visualization from the backend data.  However, for now we are still using the HSQLDB database to persistently store and access this data.  The database data can be persistent or in memory (for use in applets or if you just want to do some quick experiments).

 

There are three ways to add data into the database.  The first is to apply node and edge creation commands inside the GUESS interface.  These nodes and edges will automatically get pushed into the database.  The second is a limited, but fairly functional, GraphML loader (see notes in section 4.2).  The final, and perhaps best, way to get your data in is to create a GUESS data file.

 

When you ran GUESS in the tutorial you may have noticed the set of questions in the beginning that walked you through loading the file/databases.  The first menu lets you load up an existing (persistent) database or pick a file to load.  This file can either be in GUESS format or GraphML (GraphML files must have the file type .xml or .graphml).  If you select “Load New” you will select the file that contains your data.  You will then be asked if you want this new file to be persistent or in memory.  Selecting “in memory” will load the data into a memory-resident database which will vanish once you quit GUESS.  If you select “persistent” you will be instructed to pick a directory for your database files and then a name for the database (several files are needed to make the database persistent, so you may want to create a “database” directory if you don’t like your directories to be littered with files).

4.1 The GUESS .gdf format

The file structure for the .gdf files is very simple.  We will basically define the nodes with their properties followed by the edges with theirs.

 

The node definition section starts with the line: “nodedef> name”

 

The nodedef line will tell GUESS what the format is of the following lines that actually describe nodes.  In the simple case we are just going to have one column on each line, the node name.  Nodes are required to have unique names (identifiers).  You will want to avoid using anything that is not a valid Python variable name here if you want to access the nodes by this name.  When GUESS starts up, it will automatically create variables for you for each node.  So if you have a node called foobarbaz, you’ll be able to talk about foobarbaz.color.

 

The simplest file looks something like this:

 

nodedef> name

foobar

 

which tells GUESS that we want a node called foobar.  All other aspects of the node (color, visibility, style) will be extracted from defaults.  After name (the only required column), you may use pre-defined columns and new columns to set and control extra node properties.  Pre-defined columns are:

·         x – a double representing the node’s x location (default: random)

·         y – a double representing the node’s y location (default: random)

·         visible – a boolean indicating if the node should be displayed (default: true)

·         color – a string, the default color of the node (default: “blue”).  We have a long list of color names that we know about, but if you didn’t want to use one of those you could quote an rgb triplet (e.g. “124,234,222”)

·         fixed – boolean, can the node be moved? (default: false)

·         style – an int indicating which style of node to use (default: 1).  Currently GUESS maps: rectangle = 1, ellipse = 2, rounded rectangle = 3, text inside a rectangle = 4, text inside an ellipse = 5, text inside a rounded rectangle = 6, and an image = 7

·         width – double, node width (default: 4)

·         height – double, node height (default: 4)

·         label – string, a label for the node in the visualization (default is the name)

·         labelvisible – boolean, should we show the label? (default: false)

·         image – string, a filename of the image to use if the node style = 7

 

These properties can also be controlled and accessed once GUESS is actually running.  You can type “foobar.x” to get the x coordinate for foobar and “foobar.height = 20” to set foobar’s height. 

 

These pre-defined attributes can be overridden by simply adding them to the list in the nodedef line.  For example:

 

nodedef> name,x,y,color

foo,0,0,blue

bar,100,100,red

 

This will tell GUESS that you want two nodes: a blue one called foo at (0,0) and a red one called bar at (100,100).  Notice that you don’t have to quote things explicitly (strings versus numbers).  The system should figure that out for you (unless your string has a comma in which case you’ll want to put it in quotes). 

 

This pre-defined list is simply our choice on node properties that have a specific meaning to the visualization.  We may add more in the future (font sizes, colors, complex shape definitions, etc.), but this is the set for now.  Usually you will want to add extra attributes to the node definitions.  For example, you may want to have a department property or maybe a salary.  Unlike the pre-defined nodes you will need to tell GUESS what kind of property this is (string, integer, etc.).  We use standard SQL to define these aspects.  For example:

 

nodedef> name,style,dept VARCHAR(32),salary INT default 40000

foo,1,dept1,50000

bar,2,dept2,52000

 

This file tells GUESS that you want to have two user defined columns, dept (the department) and salary.  Notice that we can define a default salary so that any new nodes added after the load will take on the default value.  After running GUESS on this .gdf file you will have two nodes and be able to access these properties in the same way as the pre-defined ones.  For example, typing “foo.style” will return 1 and “foo.salary” will return 50000.
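
To make the header syntax concrete, here is a small plain-Python sketch that splits a nodedef or edgedef line into column names and optional SQL type declarations.  It is not the parser GUESS uses; parse_def_line is a hypothetical helper, and it assumes that no type declaration contains a comma.

```python
def parse_def_line(line):
    """Split a 'nodedef>' or 'edgedef>' header into (kind, columns).

    Each column is a (name, sql_type) pair.  sql_type is None for
    columns declared without a type, such as the pre-defined ones.
    Assumption: type declarations contain no commas.
    """
    kind, _, rest = line.partition(">")
    columns = []
    for spec in rest.split(","):
        parts = spec.strip().split(None, 1)
        name = parts[0]
        sql_type = parts[1] if len(parts) > 1 else None
        columns.append((name, sql_type))
    return kind.strip(), columns


header = "nodedef> name,style,dept VARCHAR(32),salary INT default 40000"
kind, columns = parse_def_line(header)
```

For the header above, the pre-defined columns name and style come back without a type, while dept and salary carry the SQL text that follows their names.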

 

Edges are defined in a very similar way, the only required columns for edges are “node1” and “node2” which are the names of the two nodes you are connecting.  A simple example is something like:

 

nodedef> name

a

b

c

d

edgedef> node1,node2

a,b

a,c

a,d

 

Which defines a star network centered on node a.  It will look something like Figure 5. 
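
Files like this are easy to generate from a script.  The sketch below builds the same star network as GDF text in plain Python; star_gdf is a hypothetical helper written for this illustration and is not part of GUESS.

```python
def star_gdf(center, leaves):
    """Return GDF text for an undirected star centered on `center`."""
    lines = ["nodedef> name", center]
    lines.extend(leaves)
    lines.append("edgedef> node1,node2")
    # One edge from the center to every leaf.
    lines.extend("%s,%s" % (center, leaf) for leaf in leaves)
    return "\n".join(lines) + "\n"


gdf_text = star_gdf("a", ["b", "c", "d"])
```

Writing gdf_text to a file with a .gdf extension should give a file equivalent to the example above.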

 

Edges, like nodes, can contain pre-defined and user-defined attributes in the definition lines.  Valid pre-defined edge properties are:

·         visible – a boolean indicating if the edge should be displayed (default: true)

·         color – a string, the default color of the edge (default: “green”).

·         weight – a double indicating the edge weight (default: 1, but not currently used for calculations)

·         width – double, edge width (default: .3)

·         directed – boolean, indicating edge directionality (default: false, undirected/bidirected). If true, this will assume node1 is the source and node2 is the destination.

·         label – string, a label for the edge in the visualization (default is the edge weight)

·         labelvisible – boolean, should we show the label? (default: false) 

 

 

One critical thing to note is that duplicated edges are not supported.  That is, you cannot create more than one edge of the same direction between two nodes.  At most you can have 3 edges between two nodes (a->b, b->a, and a-b).  Recall that a-b and a<->b are considered to be the same thing.  GUESS will try to remove duplicate edges (e.g. a->b and b<-a) for you when loading the file, but sometimes this will fail and you will get an exception.  If you need the effect of duplicate edges, you can simulate it by adding extra fields to a single edge.
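
If your data may contain repeated edges, it can be safer to remove them before writing the .gdf file.  The following plain-Python sketch shows one way to do that under the rules described above; edge_key and drop_duplicate_edges are hypothetical helpers, not GUESS functions.

```python
def edge_key(node1, node2, directed):
    """Key under which two edge rows count as duplicates.

    Assumption (from the rules above): a directed edge is identified
    by its ordered (source, destination) pair, an undirected edge by
    the unordered pair of end points.
    """
    if directed:
        return ("directed", node1, node2)
    return ("undirected",) + tuple(sorted((node1, node2)))


def drop_duplicate_edges(rows):
    """Keep the first occurrence of each edge.

    rows is a list of (node1, node2, directed) tuples.
    """
    seen = set()
    unique = []
    for node1, node2, directed in rows:
        key = edge_key(node1, node2, directed)
        if key not in seen:
            seen.add(key)
            unique.append((node1, node2, directed))
    return unique


rows = [
    ("a", "b", True),   # a->b
    ("b", "a", True),   # b->a
    ("a", "b", False),  # a-b
    ("b", "a", False),  # same undirected edge as a-b
    ("a", "b", True),   # repeated a->b
]
unique_rows = drop_duplicate_edges(rows)
```

Three rows survive, matching the limit of three edges between any two nodes.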

 

Again, just as in the case of nodes, any user-defined edge attributes can be added by putting them on the edgedef line.  Extending our previous example:

 

nodedef> name,style,dept VARCHAR(32),salary INT default 40000

bob,1,dept1,50000

john,1,dept1,49000

alice,2,dept2,52000

edgedef> node1,node2,directed,relationship VARCHAR(32)

bob,alice,true,reports to

john,alice,true,reports to

bob,john,false,colleague of

 

Using something like this you will be able to say “(bob-john).relationship” in GUESS and get back “colleague of.”

4.1.1 Node Styles

 

As we described above there are a number of predefined node styles.  The following image shows these styles:

 

 

 

 

Node style 7 has its “image” field set to hplogo.jpg (a local file). It is also possible to “push” an image (a java object) to an image style node.  This allows you to create new images dynamically.  For example, CustomNodes.py in the scripts directory will generate an “aqua” style button for each node when convertAllToAqua() is called.  The result would look something like:

 

Finally, it is now possible to define your own polygon shapes for nodes.  By generating a style id (an integer > 100) and associating it with a Shape object in the shapeDB you will be able to create your own shapes.  For example, we can create diamond, triangle, or star shapes using the following code (also in shapetest.py):

 

from java.awt.geom import GeneralPath

from java.awt import Polygon

import jarray

 

xpoints = jarray.array((10,5,0,5),'i')

ypoints = jarray.array((5,10,5,0),'i')

diamond = Polygon(xpoints,ypoints,4)

shapeDB.addShape(104,diamond)

 

xpoints = jarray.array((55, 67, 109, 73, 83, 55, 27, 37, 1, 43),'i')

ypoints = jarray.array((0, 36, 36, 54, 96, 72, 96, 54, 36, 36),'i')

star = Polygon(xpoints,ypoints,10)

shapeDB.addShape(105,star)

 

triangle = GeneralPath()

triangle.moveTo(5,0)

triangle.lineTo(10,5)

triangle.lineTo(0,5)

triangle.lineTo(5,0)

shapeDB.addShape(106,triangle)

Running this script and applying the commands v0.style = 104, v1.style = 105, and v2.style = 106 results in the following picture:

 

 

4.1.2 Exporting GDF Files

 

To export the current graph as a GDF file you can simply type exportGDF(“filename.gdf”) which will output the current database as a GDF file named filename.gdf.

 

4.2 GraphML

 

GraphML (http://graphml.graphdrawing.org/) is an XML file format for representing graphs.  We support a limited set of this format (no subgraphs or hyperedges).  The main constraint is that nodes need to be defined before edges (this will be fixed later on).   Attribute names in keys that have the same names as those pre-defined node and edge properties above can be used to control the visual aspects of the graph.  Keys with other attribute names will be used to construct additional properties for nodes and edges.  Take a look at the file test.xml as a sample.

5. Manipulating and Querying Nodes and Edges

 

From the discussion on databases it should be evident that whatever attributes you define (or are pre-defined) on nodes and edges are accessible through the interpreter.  You would access these as you would any attribute of a Python object.  When the graph is loaded into memory GUESS will create node and edge objects for you in the top level name space of the interpreter.  Nodes will have the same name as their name property.

 

Going back to our simple office example:

 

nodedef> name,style,dept VARCHAR(32),salary INT default 40000

bob,1,dept1,50000

john,1,dept1,49000

alice,2,dept2,52000

edgedef> node1,node2,directed,relationship VARCHAR(32)

bob,alice,true,reports to

john,alice,true,reports to

bob,john,false,colleague of

 

We will have 3 nodes defined in the top level namespace, bob, john, and alice.  We can get a property by using the construct: <nodename>.<propertyname> and set a property by doing <nodename>.<propertyname> = <value>.  If the property represents a visual aspect of the node or edge the set operation will immediately modify the visualization.  Any changes will also be committed to the database.  Some examples are:

 

Nodes also have a few “special” attributes which are basically shortcuts.  For example, the size attribute will set both the width and the height of a node at the same time.  So bob.size = 20 is the same as bob.width = 20 and bob.height = 20.

 

Edges work in a very similar way but they are not set in the top level namespace.  Instead you would access the edge between two nodes by placing a “-“ between them (or “<-“, “->”, or “<->” for directed edges).  If you want to modify or access an edge property you need to remember to place the edge definition in parentheses.  For example:

 

For an edge you can always ask for the attributes node1 and node2 to get the end points (e.g. (bob-john).node1).  For directed edges you can additionally ask for the source and destination attributes (e.g. (bob->alice).source returns bob).

At certain times you may want to group nodes and/or edges together to modify or access their attributes at the same time.  This is simply done by putting objects in parentheses (standard Python).  What is unique to the Python enhancement used by GUESS is that you can now modify an attribute of every node in that set at the same time.  For example (bob,john).color = red will set both the bob and john nodes to red.  This will also work for (bob,john,alice-bob).color = blue because nodes and edges have the same attribute (color).  The graph attributes nodes (e.g. g.nodes) and edges (g.edges) return a group of all nodes and edges in the graph respectively.  So the command g.nodes.color = black will set all nodes in the graph to black.

 

One trick of this syntax is finding edges between two sets of nodes.  A simple example is the command: (bob,john)-alice which will return two edges bob-alice and john-alice (we could have also typed (bob,john)-(alice) but since the second set was only composed of one node the extra parentheses were not necessary).  If you are looking for additional ways to group nodes take a look at the subgraph section later on in this manual.
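
The set-to-set edge lookup can be pictured with a short plain-Python sketch.  Edges are modeled as (node1, node2) name pairs and treated as undirected; edges_between is a hypothetical helper that only illustrates the idea, not how GUESS evaluates the “-” operator.

```python
def edges_between(set_a, set_b, edges):
    """Return edges with one end point in set_a and the other in set_b.

    edges is an iterable of (node1, node2) name pairs, treated as
    undirected for this illustration.
    """
    set_a, set_b = set(set_a), set(set_b)
    return [
        (n1, n2)
        for n1, n2 in edges
        if (n1 in set_a and n2 in set_b) or (n1 in set_b and n2 in set_a)
    ]


office_edges = [("bob", "alice"), ("john", "alice"), ("bob", "john")]

# Analogous to (bob,john)-alice
to_alice = edges_between(["bob", "john"], ["alice"], office_edges)

# Analogous to (bob,john)-(bob,john): edges inside one group
within_group = edges_between(["bob", "john"], ["bob", "john"], office_edges)
```

Passing the same set twice yields the edges inside a group, which mirrors the intra-departmental query from the tutorial.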

 

A final note on edge and node visibility:

5.1 Queries and Sets

In addition to defining nodes for you in the top level namespace, GUESS will also define special objects called fields which will have the same name as node and edge properties.  For example, the field “height” exists as a variable named height.  These variables are used in the construction of queries that select specific nodes.  For example, for our previous example GUESS generated additional fields called dept, salary, and relationship.  Using these and the operators ==, <=, >=, !=, <, >, and like we can begin to select out nodes and edges that match certain criteria.  Some examples in this instance are: 

 

 

 

 

You can also make more complicated queries by combining sub-expressions with the & and | operators (and/or respectively).  For example:

 

One important note is that for fields that belong to both nodes and edges (e.g. color, visible, width) the field object in the top level namespace will default to node attributes.  The expression “color == blue” will return nodes that are blue.  In order to specify that you want an edge or node attribute, prepend the attribute name with “Node.” or “Edge.”  This means that to get all blue edges we would use the expression: “Edge.color == blue.”

 

In order to find all nodes or edges not in a given set, you may use the complement method.  The method will take either a set or individual node/edge as an argument and return a set of all nodes/edges not in the set.  If the argument set contains only nodes the result will contain only nodes.  Similarly if the set contains only edges the result will contain only edges.  If both nodes and edges are in the input set nodes and edges will be returned in the output set.  Example usage includes:

 

complement(v44) – all nodes other than v44

complement(v44?v55)   - all edges not between node 44 and 55

complement((v44,v44?v55))   - all nodes AND edges not in the set

complement(dept == ‘dept1’) – all nodes not in department 1

 

If a set contains a mixture of nodes and edges you can apply the findEdges() and findNode() methods to extract just the nodes or edges.
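
The rules for mixed sets can be summarized in a few lines of plain Python.  Here nodes are strings and edges are tuples of node names; complement_of is a hypothetical function that imitates the behavior described above and is not the GUESS implementation.

```python
def complement_of(selection, all_nodes, all_edges):
    """Return everything of the selected kinds that is not selected.

    Nodes are returned only if the selection contains a node, and
    edges only if it contains an edge, as described above.
    """
    selection = set(selection)
    has_nodes = any(item in all_nodes for item in selection)
    has_edges = any(item in all_edges for item in selection)
    result = []
    if has_nodes:
        result.extend(n for n in all_nodes if n not in selection)
    if has_edges:
        result.extend(e for e in all_edges if e not in selection)
    return result


all_nodes = ["bob", "john", "alice"]
all_edges = [("bob", "alice"), ("john", "alice"), ("bob", "john")]

other_nodes = complement_of(["bob"], all_nodes, all_edges)
other_edges = complement_of([("bob", "john")], all_nodes, all_edges)
mixed = complement_of(["bob", ("bob", "john")], all_nodes, all_edges)
```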

 

In addition to the operations listed above there are a number of constructs/operators that are specific to state based operations.  Take a look at the States and Animations chapter to get a sense of those.

 

 

 

 

 

 

 

 

 

 

 

 

5.2 The Information Window

In order to see and modify node and edge properties quickly GUESS contains an information window that can be opened by typing: infowindow(). As you mouse over nodes and edges you will see their fields and values displayed in this window (see Figure n).  You will also be able to modify these fields by clicking on the value you want to change and simply typing the new value and hitting enter (immutable values such as name will not change).

 

You can force the information window to display details for a specific node or edge by typing InfoWindow.details(name) where name is the name of the node or edge you would like to see.

 


 

[Figures: sample layouts produced by g.randomLayout(), g.frLayout(), g.gemLayout(), g.circleLayout(), g.isomLayout(), g.mdsLayout() (after weight set to freq), g.physicsLayout(), g.radialLayout(v5), and g.springLayout()]

 

6. Laying out Graphs

One of the most critical aspects of visualizing graphs is to get nice looking layouts.  The figures above represent a sampling of the layouts provided by GUESS which you can apply to your graphs.  A number of the GUESS layouts are iterative.  That is, they are always trying to “improve” and thus may never converge.  For these layouts you may predefine how many loops they should run for.  If you choose not to define a limit, the layout algorithms will run until convergence or will show a dialog every 30 seconds asking if you would like to continue.

Currently GUESS provides the following layouts

 

                                                           

 

By and large layouts are executed in their own loops so that they do not take over the UI rendering pipeline.  Layouts will also center the display to fit all the nodes.  This is also an asynchronous process.  At times you may want to add layout operations into your scripts.  You may wish to make use of the commands:

 

[to add: discussion on more programmatic/finer control of layouts]

7. Analysis Commands

7.1 Clustering

Because we are making use of the JUNG system in GUESS we can take advantage of the many clustering algorithms already implemented there.  These commands will generally generate a set of sets that can then be used in any way you want.  Current commands include:

 

A simple use of these commands is to color each cluster differently.  For example:

 

clusts = weakComponentClusters()

for z in clusts:

          z.color = randomColor()

 

GUESS will also generate groupings (and sorts) based on any field.  This is done by the groupBy(field) and groupAndSortBy(field) methods.  Using these we could color each edge in the sample database by frequency.

 

clusts = groupAndSortBy(freq)

clustcol = generateColors(blue,red,len(clusts))

for z in range(0,len(clusts)):

          clusts[z].color = clustcol[z]
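
To see what a color ramp of this kind might contain, here is a plain-Python sketch of linear interpolation between two RGB colors.  It is an assumption that generateColors works along these lines; interpolate_colors and the RGB constants below are made up for illustration.

```python
def interpolate_colors(start, end, count):
    """Return `count` RGB tuples evenly spaced from start to end.

    start and end are (r, g, b) tuples with components in 0..255.
    """
    if count == 1:
        return [tuple(start)]
    colors = []
    for i in range(count):
        t = i / float(count - 1)  # 0.0 at the start color, 1.0 at the end
        colors.append(tuple(int(round(s + (e - s) * t))
                            for s, e in zip(start, end)))
    return colors


BLUE = (0, 0, 255)
RED = (255, 0, 0)
ramp = interpolate_colors(BLUE, RED, 3)
```

The first and last entries are the two end colors, and the entries in between blend them in equal steps, one per group.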

 

Because resizing and coloring nodes and edges is a fairly straightforward operation we have created a number of shortcuts described below.

 

You may also make use of the groupBy/sortBy/groupAndSortBy methods when dealing with sets.  For example, say we pull out only a subset of nodes (e.g. all those in department 1) and would like to see them ordered by salary (note that we don’t actually have a salary field defined in the sample data set):

 

dept1 = (dept == 'dept1')

dept1.sortBy(salary)

 

or if we wanted to group them by job function we could do:

 

dept1.groupBy(jobfunc)

7.2 Visualizing Fields

As shortcuts to groupAndSortBy followed by color and size changes, we provide a number of commands:

 

 

7.3 Field, Graph, Node, Edge Statistics

There are a number of special properties on (numerical) fields that allow you to get a quick sense of the average, minimum, maximum, and summed values.  This can be done by appending the field variable with .avg, .min, .max, and .sum respectively (e.g. freq.min will return the minimum frequency value).

A number of node and edge fields are calculated when they are first accessed.  These include: betweenness, pagerank, degrank, hits, and rwbetweeness which correspond to the Betweenness, PageRank, Degree Distribution Rank, HITS rank, and Random-Walk Betweenness ranks.  Also available are indegree, outdegree, and totaldegree.  Because many of these take a long time to compute, the first time you access the property the value is calculated and cached.  Changes to the graph will require an update to these ([need to describe]).

 

For example, we can calculate and color based on betweenness by doing:

 

v1.betweenness

g.colorize(Node.betweenness,red,blue)

 

You can also ask a node for the shortest path to other nodes by either applying the unweightedShortestPath(target) or dijkstarShortestPath(target) methods.  A list of edges representing the shortest path will be returned to you.  In figure n we have found the shortest path between v291 and v376 and changed the color to blue through the command:

 

(v291.unweightedShortestPath(v376)).color = blue
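
For readers curious about what an unweighted shortest path search involves, the plain-Python sketch below finds one with a breadth-first search and returns it as a list of edges.  It is an illustration only; the function, its parameters, and the small edge list are hypothetical and say nothing about how GUESS or JUNG implement the method.

```python
from collections import deque


def shortest_path_edges(edges, source, target):
    """Return the edges of a shortest path, or None if unreachable.

    edges is a list of (node1, node2) pairs, treated as undirected
    and unweighted.  Breadth-first search visits nodes in order of
    hop count, so the first path found to target is a shortest one.
    """
    neighbors = {}
    for n1, n2 in edges:
        neighbors.setdefault(n1, []).append(n2)
        neighbors.setdefault(n2, []).append(n1)

    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    if target not in parent:
        return None

    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while parent[node] is not None:
        path.append((parent[node], node))
        node = parent[node]
    path.reverse()
    return path


sample_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")]
path = shortest_path_edges(sample_edges, "a", "d")
```

In this small graph the two-hop route through e is preferred over the three-hop route through b and c.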

 

7.4 Random Graph Generation

 

If you would like to have GUESS generate a random graph for you there are a number of existing options available in JUNG and exposed through GUESS.  To use these you may want to start with an empty database and use one of the following:

 

 

[need to add descriptions]

8. Modifying graphs

 

Adding nodes and edges is a fairly straightforward process.  To add a new node you simply invoke the addNode(name) command which will create a new node with the default characteristics and the name “name.”  Adding an edge is also simple but there are two commands depending on if you want a directed or undirected/bidirected edge.
