Andreas Hess // Projects // GridWeka2
GridWeka2 is a modified version of the well-known Weka machine learning and data-mining software in Java, written by Eibe Frank and other people from the University of Waikato in New Zealand.
GridWeka2 can run cross-validations in the Weka Explorer and Experimenter in parallel, distributed over several machines, while remaining easy to configure. GridWeka2 is currently based on Weka 3.4.3.
I wrote GridWeka2 for two reasons:
Note that GridWeka2 works in the Explorer and the Experimenter, but not in the CLI. GridWeka2 is a completely new project and is not based on Weka Parallel or GridWeka.
Like Weka itself, GridWeka2 is licensed under the terms of the GPL.
GridWeka2 is very easy to configure and usage is transparent. You can run Weka as normal by typing:
java -cp GridWeka2.jar weka.gui.GUIChooser
However, to make use of the parallelisation, you need two more things. First, you will have to start at least one Weka server, either on the local machine, on some other machine that is reachable over the net, or both! The following command starts a server on port 6714 that allows up to 3 concurrent requests:
java -cp GridWeka2.jar weka.ucd.WekaServer 6714 3
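Before pointing a client at a server, it can help to confirm that something is actually accepting connections on the chosen port. The following is a minimal sketch, not part of GridWeka2: the class name PortCheck and the method isListening are hypothetical, and it assumes the server listens on a plain TCP socket. It only tests reachability and does not speak the GridWeka2 protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of GridWeka2.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, unknown host, or timeout: treat as not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults match the example command above; override with: PortCheck <host> <port>
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6714;
        String state = isListening(host, port, 2000)
                ? " is accepting connections"
                : " is not reachable";
        System.out.println(host + ":" + port + state);
    }
}
```

A successful check only means the port is open. It does not prove that the process behind it is a Weka server.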
The last thing you have to do is tell your client about the Weka servers that you want to use. To do that, create a file called servers.csv and place it in the directory from which you start your Weka client (i.e. the GUI Chooser, the Explorer or the Experimenter). A simple servers.csv file looks like this:
localhost,6714,-,-
myothermachine,6714,-,-
This tells Weka that there is a server running on localhost and another server running on myothermachine, and both listen on port 6714. The two minus signs are reserved entries: In a future version they will be used for user name and password based authentication.
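To make the file format concrete, here is a minimal sketch of how lines of the form host,port,-,- could be read. This is not GridWeka2's actual parser: the class ServersCsvSketch and the type ServerEntry are hypothetical names, and the handling of blank lines and malformed rows is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the servers.csv format, not GridWeka2 code.
public class ServersCsvSketch {

    /** Host and port of one server entry. */
    public static final class ServerEntry {
        public final String host;
        public final int port;

        ServerEntry(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Parses lines of the form host,port,-,- and skips blank lines. */
    public static List<ServerEntry> parse(List<String> lines) {
        List<ServerEntry> servers = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] fields = trimmed.split(",");
            if (fields.length != 4) {
                throw new IllegalArgumentException("Expected 4 fields: " + trimmed);
            }
            // fields[2] and fields[3] are the two reserved "-" entries; ignored here.
            servers.add(new ServerEntry(fields[0].trim(),
                                        Integer.parseInt(fields[1].trim())));
        }
        return servers;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "localhost,6714,-,-",
                "myothermachine,6714,-,-");
        for (ServerEntry s : parse(lines)) {
            System.out.println(s.host + ":" + s.port);
        }
    }
}
```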
After you have created the servers.csv file, just start the GUIChooser as shown above. If there is no servers.csv file, computation will be done on the client as normal.
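Because the client looks for servers.csv in the directory it is started from, a quick way to predict which mode you will get is to check for the file in the working directory. The sketch below is a hypothetical helper (ServersFileCheck is not part of GridWeka2) and assumes the lookup is a plain existence check on that file name.

```java
import java.io.File;

// Hypothetical helper, not part of GridWeka2.
public class ServersFileCheck {

    /** Returns true if a regular file named servers.csv exists in the given directory. */
    public static boolean hasServersFile(File dir) {
        return new File(dir, "servers.csv").isFile();
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started from.
        File workingDir = new File(System.getProperty("user.dir"));
        if (hasServersFile(workingDir)) {
            System.out.println("servers.csv found in " + workingDir);
        } else {
            System.out.println("No servers.csv in " + workingDir
                    + ": computation would run on the client.");
        }
    }
}
```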
There are a lot of functions that would be desirable in GridWeka2, but are not implemented. Some of these functions are missing because GridWeka2 is a research prototype and not finished software. Therefore there is also no warranty of any kind. This is a list of known limitations:
There are a few other changes in GridWeka2 when comparing it with the standard Weka 3.4.3: