Home Page

Code review and programming practices

This is an example of the type of help Nimcha Consulting can provide. Programming mentoring and helping do job optimization requires code review. And doing code review has different approaches. On one hand we can recommend changes that purely effect speed and efficiency; and on the other hand we can recommend changes to make code easier to turn over to less experienced programmers. Just because you can code some tricky algorithm doesn't mean you should.

The following is a repgen that a CU wrote and had trouble with. They were trying to pull a transaction count by share and loan types for branch 01 and they were not getting any results.

Click here to view source

  1: TARGET=ACCOUNT
  2:
  3: DEFINE
  4:  #INCLUDE "RD.GETDATA.DEF"
  5:  TRUE=1
  6:  FALSE=0
  7:  STARTDATE=DATE
  8:  ENDDATE=DATE
  9:  INDEXS=NUMBER
 10:  INDEXL=NUMBER
 11:  SHTRANCOUNT=NUMBER ARRAY (999)
 12:  LNTRANCOUNT=NUMBER ARRAY (999)
 13:  SHAREDESC=CHARACTER ARRAY (999)
 14:  LOANDESC=CHARACTER ARRAY (999)
 15: END
 16:
 17: SETUP
 18:  STARTDATE=DATEREAD("Enter start date of transactions")
 19:  ENDDATE=DATEREAD("Enter end date of transactions")
 20: END
 21: 
 22: SELECT
 23:  ACCOUNT:CLOSEDATE='--/--/--' AND ACCOUNT:ACTIVITYDATE>=STARTDATE
 24: END
 25:
 26: PRINT TITLE="Transactions By Share and Loan Types"
 27:
 28:  INDEXS=SHARE:TYPE
 29:  INDEXL=LOAN:TYPE
 30:
 31:  FOR INDEXL=0 TO 99
 32:   DO
 33:    LOANDESC(INDEXL)=GETDATACHARACTER(GETDEFAULTLOAN,INDEXL,77)
 34:   END
 35:
 36:  FOR INDEXS=0 TO 99
 37:   DO
 38:    SHAREDESC(INDEXS)=GETDATACHARACTER(GETDEFAULTSHARE,INDEXS,55)
 39:   END
 40:  
 41:  FOR EACH LOAN WITH LOAN:CLOSEDATE='--/--/--'
 42:   DO
 43:    FOR EACH LOAN TRANSACTION WITH (LOAN TRANSACTION:POSTDATE>=STARTDATE AND
 44:                                    LOAN TRANSACTION:POSTDATE<=ENDDATE AND
 45:                                    LOAN TRANSACTION:VOIDCODE=FALSE AND
 46:                                    LOAN TRANSACTION:COMMENTCODE=FALSE AND
 47:                                    LOAN TRANSACTION:BRANCH=0001)
 48:     DO
 49:      LNTRANCOUNT(INDEXL)=LNTRANCOUNT(INDEXL)+1
 50:     END
 51:    UNTIL LOAN TRANSACTION:POSTDATE<STARTDATE
 52:   END
 53:
 54:  FOR EACH SHARE WITH SHARE:CLOSEDATE='--/--/--'
 55:   DO
 56:    FOR EACH SHARE TRANSACTION WITH (SHARE TRANSACTION:POSTDATE>=STARTDATE AND
 57:                                     SHARE TRANSACTION:POSTDATE<=ENDDATE AND
 58:                                     SHARE TRANSACTION:VOIDCODE=FALSE AND
 59:                                     SHARE TRANSACTION:COMMENTCODE=FALSE AND
 60:                                     SHARE TRANSACTION:BRANCH=0001)
 61:     DO
 62:      SHTRANCOUNT(INDEXS)=SHTRANCOUNT(INDEXS)+1
 63:     END
 64:    UNTIL SHARE TRANSACTION:POSTDATE<STARTDATE
 65:   END
 66: END
 67:
 68: TOTAL
 69:  FOR INDEXS = 0 TO 99
 70:   DO
 71:    IF SHTRANCOUNT(INDEXS)>0 THEN
 72:     DO
 73:      COL=4 INDEXS
 74:      COL=30 SHAREDESC(INDEXS)
 75:      COL=60 SHTRANCOUNT(INDEXS)
 76:      NEWLINE
 77:      NEWLINE
 78:     END
 79:   END
 80:  NEWLINE
 81:  NEWLINE
 82:  FOR INDEXL = 0 TO 99
 83:   DO
 84:    IF LNTRANCOUNT(INDEXL)>0 THEN
 85:     DO
 86:      COL=4 INDEXL
 87:      COL=30 LOANDESC(INDEXL)
 88:      COL=60 LNTRANCOUNT(INDEXL)
 89:      NEWLINE
 90:      NEWLINE
 91:     END
 92:   END
 93: END

Issues and other discussion points:

1) The CU stated they were not getting any output. Taking a quick look through the code shows no PRINT or COL statements anywhere but the TOTAL division. The printing in the TOTAL division was conditional on the Share/Loan Tran count being greater than 0. Since they were not getting output the count is never greater than 0. Looking back at the PRINT division shows that the count is incremented based on INDEXS/INDEXL.

Tracking INDEXS/INDEXL you can see the problem was in lines 31-39. Because INDEXS/INDEXL was used in the FOR loops, upon exiting the loop the values would be 100. That places it outside the range of the array they were storing the information. Total time to find (but not fix)the bug was less than 60 seconds!

2) One fix could be to initialize INDEXS/INDEXL after lines 31-39. But it is better to evaluate these lines and then change what they do. The purpose of these lines is to load the Share/Loan description into an Array so that it can be used in the TOTAL division. Doing it as written in the PRINT division under a TARGET of ACCOUNT causes the Repgen to iterate through Parameters and load the array with the same information for every single ACCOUNT that passes the SELECT division. Needless to say this is a waste of CPU. Since the data is used only once it should be moved to either the SETUP or TOTAL division.

To fix this then we need to weigh the options. Do we go for speed optimization or for code readability and flexibility? There are basically two choices in fixing this. Preload the array and reference it later. Or just reference the parameters on the fly. Preloading is very quick but does take up a bit of memory and does involve code someone needs to "maintain". Referencing the parameters on the fly uses no memory and the code should be very easy to understand. However, referencing on the fly does mean the data is "moved" for each item. So if this is just a small number then the time to do it is small. But if the number of items could be large then you might not want to do this.

In this particular repgen the number is small enough that there are no speed concerns. So really the choices just come down to aesthetics. Do you need to ever reference the data before the TOTAL division? Will this code ever be maintained by inexperienced programmers?

3) Lines 31-39 are also inefficient as well as redundant. In addition to moving them to the SETUP division they should be rewritten. There are two separate loops being used to pull the information from Parameters. This can be rewritten to use only one loop. Since using INDEXS/INDEXL is completely arbitrary and indeed is the source of the original "bug", it would be better to use some other variable to loop on. Additionally, even though there is no code advantage for doing this it is good practice to not use variable names for multiple things. One should be able to read a variable name and know exactly what it is used for. So for example in this loop I would use something more generic:

31:  FOR XXX = 0 TO 99
32:   DO
33:    SHAREDESC(XXX)=GETDATACHARACTER(GETDEFAULTSHARE,XXX,55)
34:    LOANDESC(XXX)=GETDATACHARACTER(GETDEFAULTLOAN,XXX,77)
35:   END

The time savings of making one loop instead of two is not much but the point is that it could be written more efficiently. This is a useful change for future programmers. Also, even with the small CPU usage decrease it has always been my opinion that a hundred such changes will possibly be noticeable in huge time consuming operations like Month End.

4) Lines 28-29 set the INDEXS and INDEXL to the respective Share/Loan types. But they are not actually on a Share or Loan at the moment so the data is invalid. These lines need to be moved into the actual SHARE/LOAN loops in order to be set correctly.

5) Lines 22-24 The SELECT division is problematic because it will exclude results the CU is actually asking for. The stated purpose of the Repgen is to calculate the number of transactions at a Branch during a particular time frame. The SELECT division excludes closed Accounts so any Account that is closed at run time but that had Transactions at the Branch during the period being counted will be missed. It should be changed to include Accounts opened during the period regardless of current status.

6) Lines 41 and 54 Similar to number 5, the exclusion of closed Shares/Loans will cause Transactions to be missed. It should include Shares/Loans opened during the period.

7) Lines 11-14 The declared Array size is way to big and is using memory needlessly. I think they had to set it to 999 instead of 99 because of the original bug. Since INDEXS/INDEXL was accidentally incremented to 100 they probably needed to increment the array from 99 to something else in order to not get an "Array out of Bounds" error. They should have been able to use 100 but they chose 999 instead. Essentially they are now using something like 270K of memory just to store just a tiny bit of information.