This is an example of the type of help Nimcha Consulting can provide. Programming
mentoring and helping do job optimization requires code review. And doing code review
has different approaches. On one hand we can recommend changes that purely effect
speed and efficiency; and on the other hand we can recommend changes to make code
easier to turn over to less experienced programmers. Just because you can code some
tricky algorithm doesn't mean you should.
The following is a repgen that a CU wrote and had trouble with. They were trying
to pull a transaction count by share and loan types for branch 01 and they were
not getting any results.
Click here to view source
1: TARGET=ACCOUNT
2:
3: DEFINE
4: #INCLUDE "RD.GETDATA.DEF"
5: TRUE=1
6: FALSE=0
7: STARTDATE=DATE
8: ENDDATE=DATE
9: INDEXS=NUMBER
10: INDEXL=NUMBER
11: SHTRANCOUNT=NUMBER ARRAY (999)
12: LNTRANCOUNT=NUMBER ARRAY (999)
13: SHAREDESC=CHARACTER ARRAY (999)
14: LOANDESC=CHARACTER ARRAY (999)
15: END
16:
17: SETUP
18: STARTDATE=DATEREAD("Enter start date of transactions")
19: ENDDATE=DATEREAD("Enter end date of transactions")
20: END
21:
22: SELECT
23: ACCOUNT:CLOSEDATE='--/--/--' AND ACCOUNT:ACTIVITYDATE>=STARTDATE
24: END
25:
26: PRINT TITLE="Transactions By Share and Loan Types"
27:
28: INDEXS=SHARE:TYPE
29: INDEXL=LOAN:TYPE
30:
31: FOR INDEXL=0 TO 99
32: DO
33: LOANDESC(INDEXL)=GETDATACHARACTER(GETDEFAULTLOAN,INDEXL,77)
34: END
35:
36: FOR INDEXS=0 TO 99
37: DO
38: SHAREDESC(INDEXS)=GETDATACHARACTER(GETDEFAULTSHARE,INDEXS,55)
39: END
40:
41: FOR EACH LOAN WITH LOAN:CLOSEDATE='--/--/--'
42: DO
43: FOR EACH LOAN TRANSACTION WITH (LOAN TRANSACTION:POSTDATE>=STARTDATE AND
44: LOAN TRANSACTION:POSTDATE<=ENDDATE AND
45: LOAN TRANSACTION:VOIDCODE=FALSE AND
46: LOAN TRANSACTION:COMMENTCODE=FALSE AND
47: LOAN TRANSACTION:BRANCH=0001)
48: DO
49: LNTRANCOUNT(INDEXL)=LNTRANCOUNT(INDEXL)+1
50: END
51: UNTIL LOAN TRANSACTION:POSTDATE<STARTDATE
52: END
53:
54: FOR EACH SHARE WITH SHARE:CLOSEDATE='--/--/--'
55: DO
56: FOR EACH SHARE TRANSACTION WITH (SHARE TRANSACTION:POSTDATE>=STARTDATE AND
57: SHARE TRANSACTION:POSTDATE<=ENDDATE AND
58: SHARE TRANSACTION:VOIDCODE=FALSE AND
59: SHARE TRANSACTION:COMMENTCODE=FALSE AND
60: SHARE TRANSACTION:BRANCH=0001)
61: DO
62: SHTRANCOUNT(INDEXS)=SHTRANCOUNT(INDEXS)+1
63: END
64: UNTIL SHARE TRANSACTION:POSTDATE<STARTDATE
65: END
66: END
67:
68: TOTAL
69: FOR INDEXS = 0 TO 99
70: DO
71: IF SHTRANCOUNT(INDEXS)>0 THEN
72: DO
73: COL=4 INDEXS
74: COL=30 SHAREDESC(INDEXS)
75: COL=60 SHTRANCOUNT(INDEXS)
76: NEWLINE
77: NEWLINE
78: END
79: END
80: NEWLINE
81: NEWLINE
82: FOR INDEXL = 0 TO 99
83: DO
84: IF LNTRANCOUNT(INDEXL)>0 THEN
85: DO
86: COL=4 INDEXL
87: COL=30 LOANDESC(INDEXL)
88: COL=60 LNTRANCOUNT(INDEXL)
89: NEWLINE
90: NEWLINE
91: END
92: END
93: END
Issues and other discussion points:
1) The CU stated they were not getting any output. Taking a quick look through the
code shows no PRINT or COL statements anywhere but the TOTAL division. The printing
in the TOTAL division was conditional on the Share/Loan Tran count being greater
than 0. Since they were not getting output the count is never greater than 0. Looking
back at the PRINT division shows that the count is incremented based on INDEXS/INDEXL.
Tracking INDEXS/INDEXL you can see the problem was in lines 31-39. Because INDEXS/INDEXL
was used in the FOR loops, upon exiting the loop the values would be 100. That places
it outside the range of the array they were storing the information. Total time
to find (but not fix)the bug was less than 60 seconds!
2) One fix could be to initialize INDEXS/INDEXL after lines 31-39. But it is better
to evaluate these lines and then change what they do. The purpose of these lines
is to load the Share/Loan description into an Array so that it can be used in the
TOTAL division. Doing it as written in the PRINT division under a TARGET of ACCOUNT
causes the Repgen to iterate through Parameters and load the array with the same
information for every single ACCOUNT that passes the SELECT division. Needless to
say this is a waste of CPU. Since the data is used only once it should be moved
to either the SETUP or TOTAL division.
To fix this then we need to weigh the options. Do we go for speed optimization or
for code readability and flexibility? There are basically two choices in fixing
this. Preload the array and reference it later. Or just reference the parameters
on the fly. Preloading is very quick but does take up a bit of memory and does involve
code someone needs to "maintain". Referencing the parameters on the fly uses no
memory and the code should be very easy to understand. However, referencing on the
fly does mean the data is "moved" for each item. So if this is just a small number
then the time to do it is small. But if the number of items could be large then
you might not want to do this.
In this particular repgen the number is small enough that there are no speed concerns.
So really the choices just come down to aesthetics. Do you need to ever reference
the data before the TOTAL division? Will this code ever be maintained by inexperienced
programmers?
3) Lines 31-39 are also inefficient as well as redundant. In addition to moving
them to the SETUP division they should be rewritten. There are two separate loops
being used to pull the information from Parameters. This can be rewritten to use
only one loop. Since using INDEXS/INDEXL is completely arbitrary and indeed is the
source of the original "bug", it would be better to use some other variable to loop
on. Additionally, even though there is no code advantage for doing this it is good
practice to not use variable names for multiple things. One should be able to read
a variable name and know exactly what it is used for. So for example in this loop
I would use something more generic:
31: FOR XXX = 0 TO 99
32: DO
33: SHAREDESC(XXX)=GETDATACHARACTER(GETDEFAULTSHARE,XXX,55)
34: LOANDESC(XXX)=GETDATACHARACTER(GETDEFAULTLOAN,XXX,77)
35: END
The time savings of making one loop instead of two is not much but the point is
that it could be written more efficiently. This is a useful change for future programmers.
Also, even with the small CPU usage decrease it has always been my opinion that
a hundred such changes will possibly be noticeable in huge time consuming operations
like Month End.
4) Lines 28-29 set the INDEXS and INDEXL to the respective Share/Loan types. But
they are not actually on a Share or Loan at the moment so the data is invalid. These
lines need to be moved into the actual SHARE/LOAN loops in order to be set correctly.
5) Lines 22-24 The SELECT division is problematic because it will exclude results
the CU is actually asking for. The stated purpose of the Repgen is to calculate
the number of transactions at a Branch during a particular time frame. The SELECT
division excludes closed Accounts so any Account that is closed at run time but
that had Transactions at the Branch during the period being counted will be missed.
It should be changed to include Accounts opened during the period regardless of
current status.
6) Lines 41 and 54 Similar to number 5, the exclusion of closed Shares/Loans will
cause Transactions to be missed. It should include Shares/Loans opened during the
period.
7) Lines 11-14 The declared Array size is way to big and is using memory needlessly.
I think they had to set it to 999 instead of 99 because of the original bug. Since
INDEXS/INDEXL was accidentally incremented to 100 they probably needed to increment
the array from 99 to something else in order to not get an "Array out of Bounds"
error. They should have been able to use 100 but they chose 999 instead. Essentially
they are now using something like 270K of memory just to store just a tiny bit of
information.
|