Must-have Component A. 2
Description:
In this project we implement the simplest possible scheduler of transactions. The
scheduler will accept two transactions (just two) and produce a conflict-serializable schedule that maximizes concurrency.
For example, assume there are two transactions with 4 actions each, say
T1: a1,a2,a3,a4 and T2: b1,b2,b3,b4
The serial schedule could look like this:
T1 | T2
a1 |
a2 |
a3 |
a4 |
   | b1
   | b2
   | b3
   | b4
Now, if there is no conflict between a4 and b1,
we could move action b1 up, and get
T1 | T2
a1 |
a2 |
a3 |
   | b1
a4 |
   | b2
   | b3
   | b4
And if there is no conflict between b1 and a3,
we could move b1 up again, getting
T1 | T2
a1 |
a2 |
   | b1
a3 |
a4 |
   | b2
   | b3
   | b4
If we could move b1 and b2 higher than they are, we should try (why?). Clearly, I
have presented you with an algorithm.
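The moving-up procedure above can be sketched in Python. This is only one possible reading, not a reference solution. It assumes that an action is a pair (op, item) such as ("R", "A"), and that two actions of different transactions conflict when they touch the same item and at least one of them is a write. The names conflicts and interleave are made up for this illustration. The sketch only builds a schedule equivalent to T1T2 and does not enforce any limit on how early T2 may start.

```python
def conflicts(a, b):
    """Assumed conflict rule: same item and at least one write."""
    return a[1] == b[1] and "W" in (a[0], b[0])


def interleave(t1, t2):
    """Start from the serial schedule T1T2 and move T2's actions up.

    Each entry of the result is (transaction label, action).
    Only adjacent, non-conflicting actions are swapped, so the
    result stays conflict-equivalent to T1T2.
    """
    sched = [("T1", a) for a in t1] + [("T2", b) for b in t2]
    for start in range(len(t1), len(sched)):
        i = start  # position of the next T2 action to move
        # Move up while the action above belongs to T1 and does not conflict.
        # Stopping at a T2 action keeps T2's internal order intact.
        while (i > 0 and sched[i - 1][0] == "T1"
               and not conflicts(sched[i - 1][1], sched[i][1])):
            sched[i - 1], sched[i] = sched[i], sched[i - 1]
            i -= 1
    return sched


if __name__ == "__main__":
    t1 = [("R", "A"), ("W", "A")]
    t2 = [("R", "B"), ("R", "A")]
    for txn, (op, item) in interleave(t1, t2):
        print(txn, f"{op}({item})")
```

Each action of T2 stops either at the first action of T1 it conflicts with or right below the previous action of T2, which is the same stopping rule as in the tables above.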
Testing and the methodology for selection of the solution
- Terminology: when S is a schedule, |S| denotes its length
- floor(x) is the function assigning the integer part to a fraction (e.g. floor(2.7) = 2)
- When you compute a schedule, it may be equivalent to either T1T2 or T2T1. You
should return the better of the two possible schedules.
So we need a criterion for what it means that one schedule is better than the other.
- While many proposals for the quality of such schedules are possible, here is one. Let
us assume we computed a schedule S which is equivalent to T1T2 (i.e. T1
-> T2 in the graph). We measure its quality as follows (a very crude
measure):
r(S) = (|T'| + |T''|)/(|T1| + |T2|)
where T' is the part of T1 that executes (in S) before T2
starts, and T'' is the part of T2 that executes after T1 ends.
For the serial schedule T1T2 we have T' = T1 and T'' = T2, so r(S) = 1.
- You will return the schedule where the value r(S) is minimal
- The issue of data input. As you will see, my test cases are pretty long (not
counting COMMIT they have up to 6 actions). Showing them in linear execution
vertically may be inconvenient. You may try to show them horizontally, but
the decision is yours. Below I will enter my transactions horizontally.
- Here are my two test cases. You will have at least two other test cases. Your
test cases must interleave transactions of length at least 5.
- T1 = R(A); W(A); R(C); R(D); W(C); W(D); and T2 = R(E); W(E); R(A); W(A);
R(C); R(B); W(C)
- T3 = R(A); R(B); R(C); R(D); W(A); W(B); W(C); and T4 = R(A); R(C); R(E);
R(B); R(D);
- I disregarded COMMIT actions; you may disregard them, too
- The issue of data input. You can either read the transaction data from two
files, or let the user enter the data directly.
This is not an issue in this project. But in either case you need to check
the validity of the input. For instance the sequence:
The only actions in your transactions are R(X) and W(X) (where X can be
any nonempty ASCII string without blanks).
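Two items from the list above, the validity check of the input and the measure r(S), can be sketched in Python as follows. This is an illustration under assumptions, not a required design. The names parse_transaction and quality are hypothetical. The input is assumed to be one line of actions separated by ";", and item names are assumed not to contain ";", "(" or ")" (a restriction added here only to keep the parsing simple). A schedule is assumed to be a list of (transaction label, action) pairs.

```python
import re

# One action: R(X) or W(X); X is nonempty, has no blanks and, by the
# assumption stated above, no ';', '(' or ')'.
ACTION = re.compile(r"([RW])\(([^\s;()]+)\)")


def parse_transaction(text):
    """Turn 'R(A); W(A); ...' into [('R', 'A'), ('W', 'A'), ...].

    Raises ValueError on anything that is not a valid action.
    """
    actions = []
    for token in text.split(";"):
        token = token.strip()
        if not token:
            continue  # tolerate a trailing ';'
        m = ACTION.fullmatch(token)
        if m is None or not m.group(2).isascii():
            raise ValueError(f"invalid action: {token!r}")
        actions.append((m.group(1), m.group(2)))
    if not actions:
        raise ValueError("empty transaction")
    return actions


def quality(schedule, first="T1", second="T2"):
    """r(S) for a schedule S equivalent to first followed by second."""
    owners = [txn for txn, _ in schedule]
    start_second = owners.index(second)                       # where second starts
    end_first = len(owners) - 1 - owners[::-1].index(first)   # where first ends
    t_prime = owners[:start_second].count(first)              # |T'|
    t_dprime = owners[end_first + 1:].count(second)           # |T''|
    return (t_prime + t_dprime) / len(owners)                 # len = |T1| + |T2|
```

To choose between a schedule equivalent to T1T2 and one equivalent to T2T1, one could call quality on both (with first and second swapped for the latter) and keep the one with the smaller value.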
Implementation requirements
- The inputs are two transactions T1, T2
- The output is a schedule, conflict-equivalent to T1T2, with interleaving (if
possible), but with the first action of T2 no earlier than halfway through T1
- The first action of T2 must be placed as far as it can go, subject to the
limitations above
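One way to read these requirements is sketched below in Python. Several points are assumptions: "half of the schedule T1" is taken to mean that at least floor(|T1|/2) actions of T1 come before the first action of T2; "as far as it can go" is taken to mean as early as the conflicts allow; an action is a pair (op, item); and two actions conflict when they touch the same item and at least one is a write. The name constrained_schedule is hypothetical. The sketch places the remaining actions of T2 by the same rule and does not consider the order T2T1.

```python
def conflicts(a, b):
    """Assumed conflict rule: same item and at least one write."""
    return a[1] == b[1] and "W" in (a[0], b[0])


def constrained_schedule(t1, t2):
    """Return a list of (label, action) conflict-equivalent to T1T2."""
    # slot = number of T1 actions that come before a given T2 action.
    slot = len(t1) // 2  # assumed reading of "half of the schedule T1"
    slots = []
    for b in t2:
        # b must stay below the last T1 action it conflicts with ...
        last = max((i + 1 for i, a in enumerate(t1) if conflicts(a, b)),
                   default=0)
        # ... and below the previous T2 action, so slot never decreases.
        slot = max(slot, last)
        slots.append(slot)

    schedule, j = [], 0
    for i in range(len(t1) + 1):
        # Emit every T2 action whose slot is i, i.e. right before t1[i].
        while j < len(t2) and slots[j] == i:
            schedule.append(("T2", t2[j]))
            j += 1
        if i < len(t1):
            schedule.append(("T1", t1[i]))
    return schedule


if __name__ == "__main__":
    t1 = [("R", "A"), ("W", "A"), ("R", "C"), ("R", "D"), ("W", "C"), ("W", "D")]
    t2 = [("R", "E"), ("W", "E"), ("R", "A"), ("W", "A"),
          ("R", "C"), ("R", "B"), ("W", "C")]
    print(" ".join(f"{txn}:{op}({item})"
                   for txn, (op, item) in constrained_schedule(t1, t2)))
```

Computing a slot for each action of T2 avoids repeated swapping; whether this placement matches the intended meaning of the requirements depends on the assumptions listed above.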