PADRES User Guide

Introduction

PADRES is a distributed content-based publish/subscribe system. It consists of a set of clients connected to brokers organized in an overlay network. A client can be a publisher, a subscriber, or both. Publishers produce information (publications) that subscribers may be interested in, and subscribers consume it. Initially, publishers broadcast their publication schemas by issuing advertisements. Subscribers register their interest by issuing subscriptions, which are routed towards candidate publishers (those whose advertisements match the subscriptions). When a new publication is published, it is matched against all registered subscriptions and routed to the subscribers who sent matching subscriptions.

A topic-based pub/sub system makes the routing decisions based on the "topic" attribute that classifies publications into pre-defined classes. In contrast, the content-based routing used in PADRES can use any attribute in the publications/subscriptions to make the matching/routing decision, providing a highly flexible routing mechanism. PADRES is also different from many other content-based pub/sub systems, due to its fully decentralized architecture: routing decisions are taken in a distributed broker overlay. This provides a scalable solution that can span across a large enterprise or even throughout the Internet.
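
The difference between topic-based and content-based matching can be illustrated with a minimal Python sketch. It is not PADRES code: the function names, the dictionary representation of a publication, and the restriction to the eq, < and > operators are assumptions made for illustration only.

```python
import operator

# Assumption: only the three operators that appear in this guide are modelled.
OPERATORS = {
    "eq": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}

def topic_match(subscribed_topic, publication):
    """Topic-based matching: only the 'topic' attribute is inspected."""
    return publication.get("topic") == subscribed_topic

def content_match(subscription, publication):
    """Content-based matching: every (attribute, operator, value) predicate must hold."""
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if not OPERATORS[op](publication[attribute], value):
            return False
    return True

# A subscription that constrains three different attributes.
subscription = [("class", "eq", "temp"), ("area", "eq", "tor"), ("value", "<", 0)]
cold = {"class": "temp", "area": "tor", "value": -10}
warm = {"class": "temp", "area": "tor", "value": 15}

cold_matches = content_match(subscription, cold)
warm_matches = content_match(subscription, warm)
```

Here cold satisfies all three predicates while warm fails the value predicate, a distinction that a single topic attribute could not express.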

PADRES Installation

Follow the instructions from the PADRES download page.

Running PADRES

A PADRES overlay is deployed by instantiating a set of brokers and clients (see Section Introduction). First, you have to decide on the topology of the broker overlay and client-broker connections. You can additionally instantiate a monitor in order to monitor the status of the brokers and clients. For this user guide, let us consider the PADRES system topology shown in Figure 1.

padres_nw01.jpg
Figure 1: A simple PADRES network.

PADRES system components are typically deployed on different physical machines. However, it is also useful to learn to run the whole system on a single physical machine for verification and debugging purposes. We first describe how to instantiate the system shown in Figure 1 on a single physical machine, and then in a distributed setting.

Running a PADRES System on a Single Physical Machine

The parameters of the PADRES system shown in Figure 1, running on a single machine, are given in Table 1. The port numbers are chosen arbitrarily. However, it is better to choose high port numbers to avoid conflicts with the ports used by other services.

Table 1: PADRES system parameters.

Broker    ID       Location   Port  Neighbors
Broker A  BrokerA  localhost  1100  Broker B
Broker B  BrokerB  localhost  1101  Broker A, Broker C
Broker C  BrokerC  localhost  1102  Broker B

Client    Location   Broker
Client X  localhost  Broker A
Client Y  localhost  Broker C

When launching a PADRES system, the broker overlay has to be created before instantiating the clients. The broker overlay can be created using the following commands:

$ startbroker -i BrokerA -p 1100 -n localhost:1101
$ startbroker -i BrokerB -p 1101 -n localhost:1100,localhost:1102
$ startbroker -i BrokerC -p 1102 -n localhost:1101
Notes:
  • Option -i is used to specify the broker ID, option -p to specify port number, and option -n to specify the neighbors.
  • Broker IDs should be unique and should not contain the '-' (hyphen) character.
  • Only one broker can listen at a particular port.
  • The neighbors are given as a comma-separated list (with no spaces) of <location>:<port#> entries. A location can be a DNS name or an IP address.
  • Although the example above shows every broker specifying its neighbors, it is enough to specify each link on one side only. For example, Broker A and Broker B are neighbors; if that relation is specified when starting Broker A, it need not be specified when starting Broker B.
  • There is no stipulated order in which the brokers have to be started. Brokers can detect whether the specified neighbors are alive or not using heartbeat messages.
  • For a detailed description of the options of the startbroker command, run it with the -help option.
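
The notes above can be turned into a small helper that assembles startbroker command lines. This is only a sketch in Python: the function name broker_command is hypothetical, and it uses nothing beyond the -i, -p and -n options described above. Each link is declared on one side only, which the notes say is sufficient.

```python
def broker_command(broker_id, port, neighbors=()):
    """Build a startbroker command line from an ID, a port and (host, port) neighbors."""
    if "-" in broker_id:
        raise ValueError("broker IDs should not contain the '-' character")
    parts = ["startbroker", "-i", broker_id, "-p", str(port)]
    if neighbors:
        # Neighbors are joined with commas and no spaces, as <location>:<port#>.
        parts += ["-n", ",".join(f"{host}:{nport}" for host, nport in neighbors)]
    return " ".join(parts)

# The topology of Table 1, with each link specified on one side only.
commands = [
    broker_command("BrokerA", 1100, [("localhost", 1101)]),  # link A-B
    broker_command("BrokerB", 1101, [("localhost", 1102)]),  # link B-C
    broker_command("BrokerC", 1102),
]

for command in commands:
    print(command)
```

The helper only builds strings; it does not check that ports are unique or that a neighbor is actually reachable.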

Now, start the clients:

$ startclient 1100      # this is client X, connecting to Broker A via port 1100
$ startclient 1102      # this is client Y, connecting to Broker C via port 1102
The startclient command brings up a GUI for each client, which can be used to interact with that client (see the startclient help). The layout of the client GUI is shown in Figure 2.

client_gui.gif
Figure 2: PADRES RMI Client GUI.

The client GUI provides a way to perform the basic operations in a PADRES network: advertise, subscribe, and publish. A client can publish only after it has sent out a matching advertisement. For example, if Client X is going to publish temperature data for the Toronto area, it must first issue an advertisement as below (enter this command into the user interface area of Client X):

a [class,eq,'temp'],[area,eq,'tor'],[value,<,100]
Here, the client advertises that it is going to publish temperature data for the Toronto area, with values less than 100 degrees Celsius (note that the interface is space sensitive). If Client Y wants to be informed once the temperature in the city of Toronto drops below zero, it can register a subscription via its GUI as below (enter this command into the user interface area of Client Y):
s [class,eq,'temp'],[area,eq,'tor'],[value,<,0]
Now, Client X can publish new temperature data via its GUI (enter this command into the user interface area of Client X):
p [class,'temp'],[area,'tor'],[value,-10]
This data will be routed through the broker overlay, and Client Y will be informed about it via the output area of its GUI (the following message should appear in the client output area of Client Y once the previous publication has been sent):
Got Publication: [class,temp],[area,"tor"],[value,-10];Thu Jun 19 17:14:58 GMT 2008

Advertisements and subscriptions have (attribute, operator, value) tuple format, whereas publications have (attribute, value) tuple format. More information on adv/sub/pub patterns and operators can be found here.
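
To make the two tuple formats concrete, the following Python sketch parses the command strings used in the examples above into tuples. It is not the PADRES parser: the function names are hypothetical, and the sketch assumes that values contain no commas or brackets.

```python
import re

def parse_value(text):
    """Quoted values become strings; unquoted values are tried as numbers."""
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return text

def parse_tuples(body):
    """Turn '[a,op,v],[b,op,w]' or '[a,v],[b,w]' into a list of tuples."""
    tuples = []
    for group in re.findall(r"\[([^\]]*)\]", body):
        *head, value = group.split(",")
        tuples.append((*head, parse_value(value)))
    return tuples

def parse_command(line):
    """Split a GUI command into its kind ('a', 's' or 'p') and its tuples."""
    kind, body = line.split(" ", 1)
    return kind, parse_tuples(body)

adv = parse_command("a [class,eq,'temp'],[area,eq,'tor'],[value,<,100]")
pub = parse_command("p [class,'temp'],[area,'tor'],[value,-10]")
```

With this sketch, adv carries three-element tuples such as ('value', '<', 100), while pub carries two-element tuples such as ('value', -10).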

Note: currently the client GUI does not support "unadvertise" and "unsubscribe".

The PADRES system can be graphically viewed using the PADRES monitor. To start the monitor, use the following command:

$ startmonitor

It will bring up a GUI with a blank canvas initially. Use Federation -> Connect to Federation... (Main -> Connect to Federation... in some versions) to bring up a connection dialog. Enter the hostname and port number of a valid active broker in the system and press 'OK'. For example, if you want to connect to Broker A, use 'localhost' and '1100'. After some time, the monitor will show the current system layout as in Figure 3. You can also use Federation -> Refresh (Main -> Refresh Federation) or press F1 (F5) to refresh the view.

padres_monitor.jpg
Figure 3: PADRES Monitor GUI.

Instructions on the basic usage of the monitor can be found here.

You can stop the clients and the monitor just by closing the GUIs. The brokers can be stopped using the stopbroker command:

$ stopbroker BrokerA
$ stopbroker BrokerB
$ stopbroker BrokerC

Running a Distributed PADRES System

Using the same topology as in Figure 1 but running each process on a separate node, we have a slightly different setup compared to the previous configuration, as shown in Table 2 below:

Table 2: Distributed PADRES system parameters.

Broker    ID       Location  Port   Neighbors
Broker A  BrokerA  10.0.1.1  10000  Broker B
Broker B  BrokerB  10.0.1.2  10001  Broker A, Broker C
Broker C  BrokerC  10.0.1.3  10002  Broker B

Client    Location  Broker
Client X  10.0.1.4  Broker A
Client Y  10.0.1.5  Broker C

You may start the brokers and clients individually, as described above, by logging into each node separately. A better option is to use the provided tool called PANDA to simplify and speed up your deployment.

PANDA

Padres Automated Node Deployer and Administrator (PANDA) allows you to deploy and manage a network of brokers and clients. In addition to starting and terminating remote processes, PANDA also supports installing and uninstalling rpms and tarballs, as well as fetching and cleaning log files on remote nodes. The user is free to choose the topology to deploy, since all deployment details are captured in a user-defined topology file. Internally, PANDA consists of a Java program with many helper shell scripts that interact with the remote nodes. With PANDA, you no longer need to log into every node manually, because these tasks are automated.

The current version of PANDA is available only on the Linux platform. It requires OpenSSH 3.9 or later (with the ConnectTimeout and StrictHostKeyChecking options), requires all remote machines to be accessible via ssh (remote machines must host ssh servers), and requires the screen application to be installed.

Running a Distributed PADRES System with PANDA

Before deploying any PADRES processes, make sure that the machines on which you wish to run PADRES have Java 6 and screen installed, as well as the PADRES binaries and libraries. Java 6 and screen can be installed on PlanetLab via the install command in PANDA. By default, PANDA assumes that Java is installed in the home directory under "java/". Additionally, the script named javahome, located in the distribution's etc/panda/setup, must be present in the home directory of the remote nodes. The PADRES binaries and libraries can be uploaded with the upload command, which uploads and extracts a tarball containing the required files. Please complete PANDA's configuration before proceeding. More help on PANDA commands can be found by typing "help" in the PANDA console.

Using the PADRES system configuration in Table 2, to deploy Broker A using PANDA, first start panda by running the startpanda command. Once you get the PANDA console, type the following command into PANDA's console:

$ startpanda
Type 'help' or '?' for help.
> 0.0 ADD BrokerA 10.0.1.1 startbroker -Xms 64 -Xmx 128 -hostname 10.0.1.1 -p 10000 -i BrokerA

The 0.0 value at the beginning of the line marks the time when the broker should be started (0.0 implies an immediate action, also see below.) All node addresses must strictly be IP addresses. To stop the deployed broker, issue the command below. Note that the ID of the process and IP address of the node must match with the previous ADD command.

> 0.0 REMOVE BrokerA 10.0.1.1

Instead of typing ADD/REMOVE commands separately for each broker/client process, it is possible to group all the console commands into a file (called a PANDA topology file) to be imported into PANDA. Below is the PANDA topology file that captures the setup illustrated in Table 2. This topology file utilizes PANDA's 2-phase deployment where PANDA ensures all brokers marked with time 0.0 are fully up and connected in phase-I before deploying clients with time > 0.0 in phase-II.

# Phase 1, deploy the 3 brokers
0.0 ADD BrokerA 10.0.1.1 bin/startbroker -Xms 64 -Xmx 128 -hostname 10.0.1.1 -p 10000 -i BrokerA
0.0 ADD BrokerB 10.0.1.2 bin/startbroker -Xms 64 -Xmx 128 -hostname 10.0.1.2 -n 10.0.1.1:10000 -p 10001 -i BrokerB
0.0 ADD BrokerC 10.0.1.3 bin/startbroker -Xms 64 -Xmx 128 -hostname 10.0.1.3 -n 10.0.1.2:10001 -p 10002 -i BrokerC

# Phase 2, deploy Client X and Client Y.
# Client X is a publisher deployed at 1s after successful broker deployment that publishes stock quote publications of symbol 
# ANTP at 60 msgs/min to BrokerA with 0 delay before initial publication.  demo/stockquote/startSQpublisher.sh is the script that starts this 
# automated stock quote publisher
1.01 ADD ClientX 10.0.1.4 demo/bin/stockquote/startSQpublisher.sh -hostname 10.0.1.4 -i ClientX -s ANTP -r 60 -d 0 -b 10.0.1.1:10000

# ClientY is a subscriber deployed at 10s after successful broker deployment that subscribes to [class,eq,'STOCK'],[volume,>,0] at BrokerC.  
# demo/stockquote/startSQsubscriber.sh is the script that starts this automated stock quote subscriber
10 ADD ClientY 10.0.1.5 demo/bin/stockquote/startSQsubscriber.sh -hostname 10.0.1.5 -i ClientY -s "[class,eq,'STOCK'],[volume,>,0]" -b 10.0.1.3:10002
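
The structure of a topology file, and the way its time stamps separate the two deployment phases, can be sketched in Python as follows. This is not PANDA's parser: the function names are hypothetical, the field layout (time, action, process ID, address, command) is inferred from the examples above, and the sample commands are abbreviated.

```python
def parse_topology(text):
    """Parse topology lines into dicts, skipping comments and blank lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 4)
        time, action, process_id, address = fields[:4]
        # REMOVE lines carry no command, so the fifth field is optional.
        command = fields[4] if len(fields) > 4 else ""
        entries.append({
            "time": float(time),
            "action": action,
            "id": process_id,
            "address": address,
            "command": command,
        })
    return entries

def split_phases(entries):
    """Phase 1 holds entries at time 0.0; phase 2 holds later entries in time order."""
    phase1 = [e for e in entries if e["time"] == 0.0]
    phase2 = sorted((e for e in entries if e["time"] > 0.0), key=lambda e: e["time"])
    return phase1, phase2

SAMPLE = """
# Phase 1
0.0 ADD BrokerA 10.0.1.1 bin/startbroker -p 10000 -i BrokerA
0.0 ADD BrokerB 10.0.1.2 bin/startbroker -n 10.0.1.1:10000 -p 10001 -i BrokerB
# Phase 2
10 ADD ClientY 10.0.1.5 demo/bin/stockquote/startSQsubscriber.sh -i ClientY
1.01 ADD ClientX 10.0.1.4 demo/bin/stockquote/startSQpublisher.sh -i ClientX
"""

phase1, phase2 = split_phases(parse_topology(SAMPLE))
phase1_ids = [e["id"] for e in phase1]
phase2_ids = [e["id"] for e in phase2]
```

In this sample the brokers fall into phase 1, and the clients fall into phase 2 ordered by their time stamps, ClientX before ClientY.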

To deploy this topology using PANDA, run startpanda with the topology file as its argument (assumed here to be named topology.txt). Note: you must configure PANDA before using it; see the section Configuring PANDA below.

$ startpanda topology.txt

Alternatively, you may load the topology file after running panda without the command line parameter by using the load command in panda's console:

$ startpanda
> load topology.txt

Loading a topology file does not automatically start the broker/client processes. Once the topology file has been loaded successfully and PANDA has verified that all nodes referenced by the file are reachable, issue the deploy command to deploy the processes:

$ startpanda topology.txt
Checking reachability of referenced nodes in topology file ...
10.0.1.1        OK
10.0.1.2        OK
10.0.1.3        OK
10.0.1.4        OK
10.0.1.5        OK
topology.txt successfully loaded

Type 'help' or '?' for help.
> deploy

After you issue the deploy command, PANDA asks whether you want to skip its automated 2-phase deployment process. If you skip it, you will be prompted to decide whether or not to proceed with the phase 2 deployment. If you do not skip it, PANDA uses an internal monitoring client to detect when all brokers and links of phase 1 are established, and then deploys phase 2 automatically.

To stop the deployment at any time, use the stop command. Ignore any innocuous error messages.

The full list of PANDA commands, syntaxes, and descriptions can be found here.

Configuring PANDA

By default, all of PANDA's configuration is contained in $PADRES_HOME/etc/panda/panda.properties. You may use the -c config_file_path command line argument of startpanda.sh to load your own configuration file for PANDA.

  • Configure remote login name
    This is the login name used to log into all of the remote nodes.
    scripts.env.SLICE=<login name>
    
  • Configure remote PADRES_HOME
    This is a path relative to the home directory on the remote machines.
    remote.padres.path=<path to padres home directory>
    
  • Configure SSH keys
    PANDA uses public/private ssh keys to log into remote nodes. See here, or search online, for how to generate public/private ssh keys. Note that PANDA requires you to use an empty passphrase. Put the public key in the $HOME/.ssh directory on all remote nodes. Then modify the line shown below in panda.properties to reflect the path to your private key. Note that the private key file must grant read and write permission to its owner only.
    scripts.env.IDENTITY=<path to private ssh key>
    
  • Configure tarball package
    Essentially, the tarball is the complete PADRES package with the 3rd party library jar files. PANDA will download the tarball onto the remote nodes and extract it in the $HOME directory (not $PADRES_HOME). Therefore, it is recommended that the tarball be built so that a directory containing the contents of PADRES is created automatically upon extraction. Note that the remote.padres.path property value mentioned above should match this directory. To enable uploading of the PADRES tarball onto the remote nodes, configure the line below in panda.properties to point to the URL of your tarball. PANDA uses wget for this operation, so the URL should be a valid HTTP address prefixed with http://.

    scripts.env.TARBALL=<url to tarball>
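
The permission requirement on the private ssh key mentioned above can be checked before PANDA is started. The following Python sketch is not part of PANDA: the function name key_is_private is hypothetical, the check assumes a POSIX file system, and it is demonstrated on a temporary file rather than on a real key.

```python
import os
import stat
import tempfile

def key_is_private(path):
    """True if the owner can read the file and group/others have no permissions."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    owner_can_read = bool(mode & stat.S_IRUSR)
    open_to_others = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
    return owner_can_read and not open_to_others

# Demonstrate on a throwaway file instead of a real private key.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o600)   # read/write for the owner only
private_ok = key_is_private(demo_path)

os.chmod(demo_path, 0o644)   # readable by group and others
too_open = key_is_private(demo_path)

os.remove(demo_path)
```

In practice the function would be called with the path configured in scripts.env.IDENTITY.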
    

PADRES Configuration

The operation of the PADRES components can be configured using configuration files. These files have the format of a standard Java property file. The default PADRES configuration files can be found in $PADRES_HOME/etc/. However, PADRES components generally provide a command line option with which you can specify your own configuration file. In addition, PADRES components provide other command line options that can be used to configure individual parameters. Refer to the documentation on each PADRES component below for further details.

Note: Command line options override settings from the user-specified configuration file; settings from the user-specified configuration file override those from the default configuration file; and settings from the default configuration file override the default parameters hard-coded in the program.
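
This precedence order can be modelled as a chain of lookups, sketched here in Python with collections.ChainMap. The sketch is illustrative only: the property names are borrowed from the sample broker configuration in this guide, the values are invented, and representing command line options under the same keys is an assumption.

```python
from collections import ChainMap

# Lowest priority: defaults hard-coded in the program (values are invented).
hard_coded = {"padres.port": "1099", "padres.remoteBrokers": ""}
# Default configuration file.
default_file = {"padres.brokerID": "Broker1"}
# User-specified configuration file.
user_file = {"padres.brokerID": "BrokerA", "padres.port": "1100"}
# Highest priority: command line options.
command_line = {"padres.port": "1101"}

# ChainMap searches its mappings from left to right and returns the first hit,
# so the highest-priority source is listed first.
effective = ChainMap(command_line, user_file, default_file, hard_coded)

port = effective["padres.port"]
broker_id = effective["padres.brokerID"]
remote_brokers = effective["padres.remoteBrokers"]
```

The port comes from the command line, the broker ID from the user-specified file, and the neighbor list falls through to the hard-coded default.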

Configuring PADRES Broker

By default, $PADRES_HOME/etc/broker.properties is used to configure PADRES brokers. This can be overridden by specifying a different configuration file using the -c command line option.

A few of the available configuration parameters and their descriptions are shown below. For the complete set of configuration options, please check the property file in $PADRES_HOME/etc/.

# Sample configuration/properties file for the PADRES broker.

# REQUIRED.  This key specifies the identifier of the broker.  This ID 
# must be unique across all brokers in the same federation, otherwise,
# there will be duplicate message IDs, and erroneous routing will occur.
# Needless to say more, it will result in a catastrophy.
padres.brokerID=Broker1

# OPTIONAL (default=1099).  This tells the broker which RMI port it 
# should bind its transport handler to receive messages from clients and 
# neighboring brokers.  Note: An RMI registry should already be running 
# at this port before you start the broker, otherwise you will get an
# RMI exception.
padres.port=1099

# OPTIONAL (default="").  You can specify here one or more neighbors for 
# this broker to connect to upon joining the federation.  An address of 
# a broker typically consists of an IP address and an RMI port.  Multiple 
# broker addresses must be separated by a comma, as shown in the example 
# below.  It is OK to leave this field blank, especially when you are 
# starting the first broker that has no available brokers to which to 
# connect.  There is no default value for this parameter.
#padres.remoteBrokers=128.100.241.50:1100,128.100.241.51:1099

For the command line options that override these settings, refer to the help document on the startbroker command.
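
For quick inspection of such a file outside of Java, a simplified reader can be sketched in Python. It handles only key=value lines and # comments as they appear in the sample above; the full Java property file format (escapes, line continuations, other separators) is deliberately not covered, and the function name load_properties is hypothetical.

```python
def load_properties(text):
    """Read key=value lines, ignoring blank lines and lines starting with '#'."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

SAMPLE = """
# Sample configuration/properties file for the PADRES broker.
padres.brokerID=Broker1
padres.port=1099
#padres.remoteBrokers=128.100.241.50:1100,128.100.241.51:1099
"""

props = load_properties(SAMPLE)
```

The commented-out padres.remoteBrokers line is skipped, so only the broker ID and the port end up in props.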

Configuring PADRES Client

The PADRES client uses $PADRES_HOME/etc/client.properties as its default configuration file. Currently there is no command line option to specify a user-specific configuration file.

Configuring PADRES Logs

PADRES uses the log4j.properties and log4j.properties files to configure the logging of messages from the broker and the client, respectively. At present, there is no command line option to specify a user-defined configuration file for the PADRES logging engine.

By default, the logs are created in the ~/.padres/logs/ directory. You can change the location of the log directory using the -ll option when launching the commands.

Additional Information

Topic attachments
topology.txt (1.2 K, 2009-03-23 17:31, AlexCheung): sample Panda topology file
Topic revision: r44 - 2009-05-25 - 19:34:44 - AltonChiu
 
Copyright © Middleware Systems Research Group.