Spring 2005
Programming Assignment 2: A Distributed Online Banking System
Due: April 15th
(for off-campus students: 14 days after viewing Lecture 15)
- You may work in groups of two for this lab assignment.
- A link to a FAQ for this project.
The link will be updated to answer common questions about the assignment.
- Here are some useful references for this project.
1 The Problem
In this programming assignment you will implement a Distributed Online Banking System.
The assignment uses the concepts of Replication, Consistency, Distributed Locking
and Load Balancing.
A pictorial representation of the system is shown in the figure below.
Figure 1: Distributed Online Banking System and Components
The system has two important components:
- Replicated Database Servers
The bank database of accounts (and corresponding information of each) is split
across three servers. Additionally, each account is replicated on two of the servers.
E.g.: An account with id 100 could be stored on Servers 1 and 2, and another
account with id 120 may be stored on Servers 1 and 3, and so on.
- Load Balancer
The load balancer acts as the interface to the distributed banking database.
Each client sends its request to the Load Balancer which in turn forwards
the request to one of the servers to perform the desired action.
Additionally, the load balancer also fetches any response from the server
and sends it back to the client. The load balancer can use different
forwarding policies as discussed below.
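The placement of accounts on servers can be sketched as follows. This is only an illustration, assuming a hypothetical rule that picks one of the three possible server pairs from the account id; the assignment does not prescribe any particular placement, and the names `REPLICA_PAIRS`, `replicas_for` and `accounts_on` are made up for this example.

```python
# The three possible two-server replica sets in a three-server system.
# The ordering is chosen (as an assumption) so that the examples in the
# text hold: account 100 -> Servers 1 and 2, account 120 -> Servers 1 and 3.
REPLICA_PAIRS = [(1, 3), (1, 2), (2, 3)]


def replicas_for(account_id: int) -> tuple:
    """Return the two servers that store the given account (hypothetical rule)."""
    return REPLICA_PAIRS[account_id % len(REPLICA_PAIRS)]


def accounts_on(server: int, account_ids) -> list:
    """Return the accounts that a given server would load into its database."""
    return [a for a in account_ids if server in replicas_for(a)]
```

For example, `replicas_for(100)` returns `(1, 2)` and `replicas_for(120)` returns `(1, 3)`. Any other rule works equally well as long as every account ends up on exactly two of the three servers.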
2 Functionalities of the System:
- Replicated Database Servers
- Replication:
All accounts of the bank are split across the servers, with each account stored
on two of the three servers. For the assignment, assume that all accounts are already
created and that the client operations are deposit, withdraw and balance check.
- Distributed Locking:
Because an account is stored on more than one machine,
the system uses a distributed locking mechanism to protect the critical section
while an account's information is being updated. The servers use the Ricart and Agrawala distributed locking mechanism
to obtain and release locks. A detailed explanation of the algorithm is in AST Chapter 5, Pages 266-269.
Note that the original algorithm depends on Lamport's clocks or other ordering techniques, but
your implementation can use the physical clocks on the machines for timestamping and ordering.
- Consistency:
The distributed locking mechanism provides mutual exclusion.
The servers also maintain strict consistency using release consistency, i.e., by synchronizing
at the boundaries of the critical section. As a result, each server forwards its updates to the other replica before releasing the
lock for the account, so that all replicas maintain a strictly consistent view of account updates.
- Load Balancer
Each request sent by the client is forwarded to one of the servers. The load balancer uses two different
policies to forward requests to servers:
- Per-account Round Robin:
For each account, the load balancer issues requests to the (two) servers in a round-robin manner.
E.g.: If the account with id 100 is on Servers 1 and 2, three consecutive requests for it can be
sent to the servers in the following order: 1, 2 and 1.
- Request Load-based:
In this scheme, the load balancer maintains an estimate of the load at each server.
E.g.: Requests/min being processed at each server, or the number of requests currently being processed
at the server.
The load balancer maintains a running estimate of this metric and uses it to forward
a request to the least loaded server where the account information is stored (note: only 2
of the 3 servers store information for each account).
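The two forwarding policies can be sketched as follows. This is a minimal sketch rather than a required design: it assumes the load estimate is the number of requests currently in progress at each server (one of the two metrics suggested above), and the class and method names (`LoadBalancer`, `pick_round_robin`, `pick_least_loaded`, `request_started`, `request_finished`) are hypothetical.

```python
from collections import defaultdict


class LoadBalancer:
    """Chooses a server for each request under one of two policies."""

    def __init__(self, placement):
        # placement: account id -> tuple of the two servers that store it
        self.placement = placement
        self.next_index = defaultdict(int)  # per-account round-robin cursor
        self.in_flight = defaultdict(int)   # server -> requests in progress

    def pick_round_robin(self, account_id):
        """Alternate between the replicas of this account."""
        servers = self.placement[account_id]
        i = self.next_index[account_id]
        self.next_index[account_id] = (i + 1) % len(servers)
        return servers[i]

    def pick_least_loaded(self, account_id):
        """Pick the less loaded replica; only servers holding the account qualify."""
        # On a tie, min() returns the server listed first in the placement.
        return min(self.placement[account_id], key=lambda s: self.in_flight[s])

    def request_started(self, server):
        self.in_flight[server] += 1

    def request_finished(self, server):
        self.in_flight[server] -= 1
```

With `LoadBalancer({100: (1, 2)})`, three consecutive calls to `pick_round_robin(100)` return 1, 2 and 1, matching the example above. A requests/min estimate could replace `in_flight` without changing the selection logic.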
Some things to keep in mind:
- Assume each server reads in an input file to create its database of account records.
For this, you can create three input files, one for each server, and distribute
the entire set of accounts, such that each account is stored at two servers.
- To keep the replication simple, assume clients do not create new accounts
and only query or update existing accounts.
- Assume physical clocks for timestamping and ordering in Ricart and Agrawala's
distributed locking mechanism. This may produce effects such as a slow machine always
winning a tie. You may report such cases if you find them in your implementation.
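The reply-or-defer decision at the heart of the Ricart and Agrawala mechanism can be sketched as follows. This is a simplified, single-account sketch with no networking: it assumes requests are ordered by the pair (physical timestamp, server id), where the server id is used only to break ties between equal timestamps, and the names `AccountLock`, `on_request`, `on_all_replies` and `release` are hypothetical. Refer to AST for the full algorithm.

```python
import time

RELEASED, WANTED, HELD = "released", "wanted", "held"


class AccountLock:
    """Lock state kept by one server for one account."""

    def __init__(self, server_id, clock=time.time):
        self.server_id = server_id
        self.clock = clock        # physical clock, as the assignment allows
        self.state = RELEASED
        self.my_request = None    # (timestamp, server_id) while WANTED or HELD
        self.deferred = []        # peers whose OK reply is being held back

    def request(self):
        """Start acquiring the lock; returns the request to send to the peers."""
        self.state = WANTED
        self.my_request = (self.clock(), self.server_id)
        return self.my_request

    def on_request(self, their_request):
        """Return True to reply OK immediately, False if the reply is deferred."""
        sender = their_request[1]
        # Defer if we hold the lock, or if we want it and our request is older.
        if self.state == HELD or (
            self.state == WANTED and self.my_request < their_request
        ):
            self.deferred.append(sender)
            return False
        return True

    def on_all_replies(self):
        """Called once every peer has replied OK: enter the critical section."""
        self.state = HELD

    def release(self):
        """Leave the critical section; returns the peers now owed an OK reply."""
        self.state = RELEASED
        self.my_request = None
        owed, self.deferred = self.deferred, []
        return owed
```

Because requests are compared as tuples, the request with the smaller timestamp wins, so a server whose clock lags behind the others will tend to win conflicts. This is the kind of effect worth reporting if you observe it.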
3 Evaluation and Measurement
Correctness
Demonstrate that your system works correctly according to requirements stated in the
description and functionality of the system. In particular:
- Show that the bank database is distributed and replicated.
- Demonstrate that the distributed locking mechanism works correctly, i.e., show in action
(via output logs, for example) that simultaneous updates to an account at different servers
are handled correctly by the locking mechanism.
- Demonstrate that consistency is maintained across servers during updates, i.e.,
update an account at one server, then query or update the same account at a different server
and check that the record is consistent with the previous update.
- Demonstrate the proper functioning of the load balancer using the two different
techniques to forward incoming requests.
Scalability
Additionally, experiment with your system to measure its performance with different
workloads. A few scenarios are described below.
- Measure the average time for the three different types of requests
(balance, deposit and withdraw) with a lightly loaded system or a system with only
one request being processed
(note that balance does not need locking and should complete faster).
This will tell you how much time each operation takes with no competing requests.
- Keeping request rate of each client constant, vary the number of clients and
measure the latency of each request.
- Keep number of clients constant, but vary request rate to measure latency
of each request.
- Perform the above experiments with both load balancer forwarding techniques
and compare the results.
- Measure the average load on each server with the different load balancing mechanisms,
i.e., you can report the average load estimate or the requests/min at each server
with the different schemes.
It is important that you describe the results of your experiments and not just describe
what each experiment did. Please state what the experiment demonstrates, what you expected, and what
you observed.
These are guidelines only, so be creative in what can be evaluated and measured as
part of your experiments to test the system.
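One way to collect the average latency per request type is sketched below. It is only a starting point: `send_request` is a hypothetical stand-in for whatever client call your implementation uses to send one request through the load balancer and block until the response arrives.

```python
import time
from collections import defaultdict
from statistics import mean


def measure(operations, send_request, timer=time.perf_counter):
    """Return the average latency in seconds for each operation type.

    operations:   iterable of (op_name, account_id, amount) tuples
    send_request: hypothetical callable that performs one request and
                  returns only after the response has been received
    """
    samples = defaultdict(list)
    for op, account_id, amount in operations:
        start = timer()
        send_request(op, account_id, amount)
        samples[op].append(timer() - start)
    return {op: mean(values) for op, values in samples.items()}
```

Running the same list of operations against both forwarding policies, and with different numbers of concurrent clients, gives directly comparable averages for the scenarios listed above.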
4 What you will submit
When you have finished implementing the complete assignment as
described above, put all the code in a separate directory in your edlab account
(/cs677/project2).
You are required to submit your solution in the form of
printouts (please attach only relevant outputs that support your points and demonstrate functionality;
DO NOT print out entire output logs and source code).
Each program must work correctly and be documented. You
should hand in:
- Outputs generated by running your program. (in EdLab account)
- Outputs to demonstrate correct working of the system. (in EdLab account and Printout)
This is important, as it will show that your system works according to the requirements.
- A separate (typed) document of approximately two pages describing
the overall program design, a description of "how it works", and design
tradeoffs considered and made.
Describe clearly how each system is designed and implemented.
Also describe possible improvements and
extensions to your program (and sketch how they might be made). (Edlab and Printout)
- A list of the design considerations you made while designing your system, with a
brief description of each. This is similar to the design considerations for the Email
system discussed in class, on the last slide of Lecture 2.
(in Edlab and Printout)
- A program listing containing in-line documentation. (in Edlab account)
- Instructions to compile and run the code from 677/project2. (in Edlab account)
- A separate description of the tests you ran on your program to convince
yourself that it is indeed correct. Also describe any cases for which your
program is known not to work correctly. (in Edlab account and Printout)
- Performance results to test scalability and performance parameters. (in Edlab and Printout)
Let us not waste a lot of trees. So, if any of the above turn out
to be large, just save the relevant information in a file, leave it on your
EDLAB account and submit the name of the file.
5 Grading policy for all programming assignments
Grading:
- Program Listing
- works correctly ------------- 50%
- in-line documentation -------- 15%
- Design Document
- quality of design ------------ 15%
- understandability of doc ------- 10%
- Thoroughness evaluation ---------- 10%
Grades for late programs will be lowered 12 points per day late.