Elegant Programming: Code Writing Style | Part 2

Code writing style

Local Variables Declaration
Declare local variables at the beginning (the top) of the function, before the first executable line. This will:

  • Make it easier to detect all the variables used in the function and to follow them
  • Keep as little stuff as possible in executable code fragments, where programmers should concentrate on business logic.

The declaration of a local variable is not an executable command in PowerBuilder. The memory is allocated on the stack exactly at the moment when the function is called together with the parameters. It doesn't make sense to declare a variable inside an "if" construction hoping to improve performance - the declaration will occur anyway.
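As a rough illustration of this layout (a sketch only; the column name "cust_name" is a hypothetical placeholder, and dw_cust is borrowed from the examples later in this article):

```powerscript
// All declarations first, before any executable line
long ll_row
long ll_row_count
string ls_cust_name

// Executable code starts here and contains only logic
ll_row_count = dw_cust.RowCount()
for ll_row = 1 to ll_row_count
    ls_cust_name = dw_cust.GetItemString(ll_row, "cust_name")
    // [do something with the customer name]
next
```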

Over-Indenting Is Enemy Number One
Indent code fragments with as few tabs as possible.

Don't get carried away with indenting. In most cases, an indentation of three tabs suggests that the code could be written more elegantly, and an indentation of more than three tabs is a "red light" that signals a badly organized code structure. If your code has long "staircases" of END IF/END CHOOSE/NEXT/LOOPs, it's a good idea to look at your habits critically. Try not to exceed two tabs; obeying this rule will keep your subprograms simple.

Loops with CONTINUE
Use continue in loops instead of indenting the subsequent code with one more tab. This method is a strong weapon in the war against over-indenting:

*** BAD code: ***

do while [condition]
    if this.uf_data_ok_1() then
        if this.uf_data_ok_2() then
            if this.uf_data_ok_3() then
                [code fragment with own indenting]
            end if
        end if
    end if
loop

*** GOOD code: ***

do while [condition]
    if not this.uf_data_ok_1() then continue
    if not this.uf_data_ok_2() then continue
    if not this.uf_data_ok_3() then continue
    [code fragment with its own indenting levels]
loop

Indentation in CHOOSE CASE
Avoid unnecessary indenting inside a "choose case" construction. Don't put an extra tab between "choose case..." and "end choose". Write:

choose case ...
case ...
    ...
case ...
    ...
end choose

instead of

choose case ...
    case ...
        ...
    case ...
        ...
end choose

Logically, the statements in each "case" are at the same nesting level as they would be in an "if" statement, so why use more tabs to express the same level of logical nesting? This style looks pretty shocking at first, but try it, and you'll enjoy scripts that are easier to understand, especially when "choose case" constructions are nested (or mixed with "if" statements and loops).

No Loop Condition Duplicating
Don't write the stop condition of a loop twice (before the loop and inside it). You may ask why, given that most programming books duplicate the condition in loops whose body may not be executed even once. My "unofficial" answer is simple: why write something twice, making the code longer, when writing it once achieves the same goal? The "official" answer is probably known to you: we avoid this for the same reason we avoid any sort of code duplication. If the condition must change, both copies must be updated, and one of them can be mistakenly skipped.

*** BAD code: ***

long ll_row = 0

ll_row = dw_cust.GetNextModified(ll_row, Primary!)
do while ll_row > 0
    [do something with the modified row]
    ll_row = dw_cust.GetNextModified(ll_row, Primary!)
loop

An infinite loop with "exit" is a very elegant solution in this situation:

*** GOOD code: ***

long ll_row = 0

do while true
    ll_row = dw_cust.GetNextModified(ll_row, Primary!)
    if ll_row < 1 then exit
    [do something with the modified row]
loop

One Executable Statement per Line
Don't write more than one executable statement in one line of code. It not only can be messy and inelegant but it also makes debugging more difficult. If you obey this rule, you will always see which statement the debugger's cursor is standing near. For example, if your IF statement is written on one line, you can't clearly see if the program flow has stepped into the IF (and cannot put a breakpoint in it).

*** BAD code: ***

if IsNull(li_age) then li_age = uf_get_age(ll_emp_id)

*** GOOD code: ***

if IsNull(li_age) then
    li_age = uf_get_age(ll_emp_id)
end if

This idea is especially important in the fight against expression complexity. Good developers break one long function into a number of short subfunctions. The same philosophy works for complex, deeply nested expressions: we should break them down into standalone, simple lines:

*** BAD code: ***

ls_country_name = uf_get_country_name_by_country_id( &
    uf_get_country_id_by_city_id( &
        uf_get_city_id(ll_row_num)))

*** GOOD code: ***

ls_city_id = uf_get_city_id(ll_row_num)
ls_country_id = uf_get_country_id_by_city_id(ls_city_id)
ls_country_name = uf_get_country_name_by_country_id(ls_country_id)

The last example demonstrates how this approach simplifies debugging (in addition to improving readability): if the variable ls_country_name has not been populated as expected, you can see in the debugger (without STEP IN) exactly which step fails. In the nested version, if you want to STEP IN to uf_get_country_name_by_country_id to debug it, you are first forced to STEP IN to (and STEP OUT from) each one of the inner functions, beginning with the most nested one, uf_get_city_id.

Store Values, Returned by Subroutines and Expressions, in Variables
Don't use values returned by subroutines and expressions directly as input to other subroutines and expressions. Instead, store them in interim variables and use those variables in the rest of the function.

There are at least two reasons to do that:

  1. It makes debugging easier - you can see the returned value immediately, before it's "buried" in some complicated collection or data set, or even disappears altogether after a comparison (like "if li_curr_year = uf_get_operation_year() then...").
  2. It helps to break one complicated line of code into two to three simpler and clearer lines if the called function is nested in an expression.

These two reasons mean that you should always store the return value of a function in a variable, even if the function is called only once (if it's called more than once, there is no question at all). The only exception to this rule is a well-named Boolean function that is called only once in the script and used directly in an "if" statement. In this case we can tell the returned value (true or false) in the debugger by watching whether the program flow goes into the "if" or skips it (or goes into the "else" section). Make sure the called function can't return null, which the program flow treats as false.
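The comparison mentioned in the first reason could be rewritten along these lines (a sketch; the variable li_oper_year is a name made up for illustration, and both variables are assumed to be integers):

```powerscript
integer li_oper_year

// Store returned value so it stays visible in debugger after comparison
li_oper_year = uf_get_operation_year()

if li_curr_year = li_oper_year then
    [code fragment for current operation year]
end if
```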

A deterministic function is one that returns the same output for the same input in the given context. If you need to call a deterministic function more than once in order to reuse its return value in different fragments, there are additional advantages to storing the return value in a variable:

  1. Better performance (the function is run only once), especially if the function is not extremely efficient.
  2. Signals that the called function is deterministic (otherwise developers can think it returns different values on each call).

Call the function only once and use the returned value in the rest of the script as many times as you need. This advice is especially important when the result of a function is used as a stop condition in a loop. For example:

for ll_row = 1 to dw_cust.RowCount()
    ...
next

Imagine that the function returns 10000. That means it is called 10000 times, once per iteration. The solution is obvious:

long ll_row_count

ll_row_count = dw_cust.RowCount()
for ll_row = 1 to ll_row_count
    ...
next

But be careful - if the loop can change the number of rows in the collection, the function is not deterministic.
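One common way to keep a stored row count safe when the loop itself deletes rows is to iterate backward, so that removing a row never shifts the rows still waiting to be visited. The following is only a sketch under assumptions: the column name "status" and the literal "INACTIVE" are hypothetical placeholders.

```powerscript
long ll_row
long ll_row_count

ll_row_count = dw_cust.RowCount()

// Walk from last row to first: deleting row N does not renumber rows 1..N-1
for ll_row = ll_row_count to 1 step -1
    if dw_cust.GetItemString(ll_row, "status") = "INACTIVE" then
        dw_cust.DeleteRow(ll_row)
    end if
next
```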

Process Impossible Options (Yes, Yes, It's Possible!)
Process all existing options in "choose case" constructions. Signal an error if an unexpected option is supplied to the construction at runtime. If you are an advocate of defensive programming, you'll appreciate this idea.

When your code processes a number of pre-defined options (such as different statuses or modes - usually using the "choose case" construction), don't forget that the list of possible options can grow in the future, and you (or other developers) can forget to process the newborn option(s). What can we do to prevent that? We can force the code to complain: "Developer, open me and think how to process the new option!" How to force? By doing two simple things:

  1. In the "choose case" construction, add a new case for each option that exists in the system, but is not currently processed. These cases will not perform any real action, so write a comment like "do nothing," "irrelevant" or "not applicable" in their executed parts to let developers know you left them empty intentionally, not mistakenly.
  2. Add a "case else" section that will signal an unexpected option (throw an exception, display an error message or whatever).

In short, compare the following two fragments. The example uses three customer statuses: ACTIVE, PENDING and INACTIVE, but only active and pending customers are currently processed by the business logic:

*** BAD code ***
(inactive status is not even mentioned...):

choose case ls_cust_status
case n_cust_status.ACTIVE
    [code fragment processing active customers]
case n_cust_status.PENDING
    [code fragment processing pending customers]
end choose

*** GOOD code ***
(inactive status is listed explicitly in a specially created "case"):

choose case ls_cust_status
case n_cust_status.ACTIVE
    [code fragment processing active customers]
case n_cust_status.PENDING
    [code fragment processing pending customers]
case n_cust_status.INACTIVE
    // do nothing
case else
    f_throw(PopulateError(0, "No case defined for customer status " + ls_cust_status))
end choose

If a new status (for example, DECEASED) is added to the business after many years, the code fragment will force the developers to update the logic. If the variable ls_cust_status contains the value of n_cust_status.DECEASED, the program flow will go to the "case else" section and the exception will be thrown. That will usually happen in the early stages of development or during unit testing. Even if the exception is thrown in production, that's better than living for a long time with a hidden bug (which can potentially be very expensive). The new case may need to be treated in a special way, and you will have to decide (or discuss with business analysts) how to process it. If the new status doesn't require any special treatment, simply add it to the "do nothing" case.

Never write business code in the "case else" section. If a few options must be treated in the same way, physically list them all in one "case". There are three reasons for that:

  1. The "case else" section can now be used for reporting an error as described earlier.
  2. Global Text Search will find all places in the application where the searched option is processed. If the option is processed in a "case else" section, it won't be found, so you are in trouble and must waste time investigating. There's a good chance you won't find everything you want.
  3. Looking at a "choose case" construction, developers see the whole picture (not only its fragment), so they don't have to guess which options "fall" into "case else". As people say, a half-truth is often a falsehood...

We discussed "choose case" constructions, but the concept also works for "if" statements. Usually you have to replace the "if" construction with "choose case" to prevent a terrible heap of "else if" branches:

*** BAD code: ***

if is_curr_entity = n_entity.CAR then
    this.Retrieve(il_car_id)
else
    this.Retrieve(il_bike_id)
end if

*** GOOD code: ***

choose case is_curr_entity
case n_entity.CAR
    this.Retrieve(il_car_id)
case n_entity.BIKE
    this.Retrieve(il_bike_id)
case else
    f_throw(PopulateError(0, "No case defined for entity " + is_curr_entity))
end choose

Don't use Boolean flags if there are more than two options. Let's say you have one class for both cars and bikes (because these entities are processed in a very similar way). Sometimes you want to know in which of the two modes the object (the instance) was created: car mode or bike mode. You might think of a Boolean flag, ib_is_car_mode, initialized to true or false depending on the mode and used this way:

*** BAD code: ***

if ib_is_car_mode then
    this.Retrieve(il_car_id)
else
    this.Retrieve(il_bike_id)
end if

The best solution in this situation is to create two constants, n_entity.CAR and n_entity.BIKE, even though there are only two possible options (see the previous "GOOD code"). If one day you want to use the class for boats as well, simply create an n_entity.BOAT constant, initialize is_curr_entity with it, run the application and... follow your own instructions, written months or years ago.

Sometimes it makes sense not to follow this advice - for example, if there really are many options but you process one or two of them.

Mnemonic Strings as Application Codes
Make system codes' values self-explanatory, so developers don't need to look in the catalog tables to understand their meaning.

We can avoid the problem described earlier if our application codes (statuses, modes, etc.) are textual rather than meaningless numbers (1, 2, 3...). Self-explanatory codes are a great idea because they make debugging, reading code and exploring data in DB tables much easier. It's better to see 'CLOSED' (instead of the mystery number 3) as a variable value in the debugger or a value in a DB table when your brain is busy looking for a bug, isn't it? These mnemonic strings don't need to be long - VARCHAR(10) is enough; we can always shorten words if needed (while still keeping them understandable).

*** BAD approach ***

Developers need to look in the codes catalog table to understand the numeric code.

*** GOOD approach ***

Codes are self-explanatory; the codes catalog is kept only for referential integrity and to display codes in the GUI.

In any case, we should use constants in our code rather than the literal values:

*** BAD code: ***

if ls_new_status = 'CLOSED' then...

*** GOOD code: ***

if ls_new_status = n_order_status.CLOSED then...

The first approach is quite legible, but bug-prone; we can mistype the status and no compilation error will be received.
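A minimal sketch of where such constants might be declared, assuming n_order_status is a non-visual object that holds them in its instance variables section. Only CLOSED comes from the example above; SHIPPED is a hypothetical extra status added for illustration, and how the object is made available to other scripts is left open here.

```powerscript
// Instance variables of n_order_status (assumed non-visual object)
constant string SHIPPED = "SHIPPED"   // hypothetical status, for illustration only
constant string CLOSED = "CLOSED"
```

With the values defined in one place, a misspelled name such as n_order_status.CLOSSED is rejected by the compiler, whereas a misspelled literal would simply never match.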

Avoid Loops on DataWindows
Don't process DataWindows and DataStores row by row (in a loop) if the task can be accomplished by other, more efficient means.

For example:

  • When you are looking for a value (or a combination of values), use the Find function instead of a row-by-row comparison. If you want one row (for example, the current row) to be excluded from the search (this can happen when you want to check whether other rows have the same value(s) as the current row), call Find twice - from the first row to the row just before the skipped one, and then from the row just after it to the last row.
  • The function Describe() can be useful; for example, the following code obtains the maximum Effective Date of all rows:

ld_eff_date = Date(ids_XXX.Describe("Evaluate('Max(eff_date)',1)"))

  • Use a computed field instead of making the calculation in code and assigning the results to columns in each row.
  • If you need to assign the same value to a column in all the rows (like a coefficient by which another column should be multiplied or divided), make that column a computed field with a very simple expression, "1", and change that expression programmatically. Suppose a variable ll_coef_to_divide contains the result of a calculation in PB code. Here is the bad (inefficient) solution (assuming that the field is not computed but exists in the DW's data source):

for ll_row = 1 to ll_row_count
    dw_XXX.object.coef_to_divide[ll_row] = ll_coef_to_divide
next

To make the assignment in one stroke, the field should be a computed one. The value, returned by it, is assigned this way:

dw_XXX.object.c_coef_to_divide.Expression = String(ll_coef_to_divide)
dw_XXX.GroupCalc()
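
The "call Find twice" advice from the first bullet of the list above could look roughly like this. It is a sketch under assumptions: the numeric column "cust_id" is a hypothetical name, and dw_cust is borrowed from the earlier examples. Each call is guarded because Find searches backward when the end row is smaller than the start row.

```powerscript
long ll_curr_row
long ll_row_count
long ll_found = 0
string ls_expr

ll_curr_row = dw_cust.GetRow()
ll_row_count = dw_cust.RowCount()
ls_expr = "cust_id = " + String(dw_cust.GetItemNumber(ll_curr_row, "cust_id"))

// Search rows above current row
if ll_curr_row > 1 then
    ll_found = dw_cust.Find(ls_expr, 1, ll_curr_row - 1)
end if

// If nothing was found there, search rows below current row
if ll_found = 0 and ll_curr_row < ll_row_count then
    ll_found = dw_cust.Find(ls_expr, ll_curr_row + 1, ll_row_count)
end if

// ll_found > 0: duplicate row number; 0: no duplicate; negative: Find reported an error
```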

/* Comments */

Comment your code if it's not completely straightforward. It's necessary not only for other developers, who will try to understand your code, but for yourself too. Now you remember what the code does, but will you after a week, after a month, after a year?

Don't leave comments if they are not really necessary (like 'Declare variables:' just before a variable declaration section or 'Call function XXX:' just before calling function XXX).

Comments can help you not only understand existing functions, but also write new ones. After creating a new function, write a comment for each future code fragment that performs a distinct logical task (as if you were commenting existing code), and only then begin to write the executable code. The following is an example of an initial skeleton (built of comments) for a function that calculates a new balance for a customer (note also that we can omit the articles "the" and "a" in comments, which shortens them, and that is always good):

public function decimal uf_calc_new_balance (long al_cust_id, decimal adec_amt_to_add)
decimal ldc_new_balance

// Validate parameters:
// Retrieve existing balance of customer:
// Retrieve their allowed overdraft:
// Calculate new balance:
// If it exceeds allowed overdraft then return original balance:
// If this code reached then new amount is ok - return it:

return ldc_new_balance

In the very early stage of function writing it's easier to concentrate on what should be done by the function (I would call that "upper-level business logic"), and only after the comments skeleton has been written can we begin writing executed code (each fragment just after its corresponding comment) with a much lower chance of having to rewrite code.

Write RETURN at the End of Functions That Return "(none)"
Always close functions that return "(none)" with a "return" statement.

The compiler doesn't force us to write "return" as the last statement of subroutines that return nothing, but it can be pretty useful in debugging when the program flow can either enter or jump over the last executable construction of the function (for example, if the script's last construction is an "if", a "choose case" or a loop). If the program flow skips that construction (say, it doesn't go inside an "if"), the debugger's cursor will stop on the final "return", so you have a chance to do something before the debugged function exits. Don't analyze whether the final "return" can be useful or not - simply write it at the end of every function that returns "(none)"; it won't do any harm.
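For illustration, here is the body of a hypothetical subroutine (returning "(none)") whose last construction is an "if". This is a sketch that assumes dw_cust is an updatable DataWindow, as in the earlier examples.

```powerscript
// Save changes only if something was modified
if dw_cust.ModifiedCount() > 0 then
    dw_cust.Update()
end if

// When "if" above is skipped, debugger still stops here before function exits
return
```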

Pass Inter-Objects Parameters by Name
When passing parameters between objects, use a method in which parameters are accessed by a mnemonic name rather than by a meaningless array index. For example, use structures (although you will have to create many of them, one for each specific case), or a universal class (one for the whole application) like the one described here: http://www.zuskin.com/dwspy/parm.htm.
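
The structure-based option might look like the sketch below. All names in it are hypothetical and serve only as illustration: the structure str_cust_parm with its fields cust_id and read_only, the window w_cust_details, and the variables ll_cust_id, il_cust_id and ib_read_only. The two fragments belong to two different scripts.

```powerscript
// --- Calling script: fill structure and pass it to window ---
str_cust_parm lstr_parm

lstr_parm.cust_id = ll_cust_id
lstr_parm.read_only = true
OpenWithParm(w_cust_details, lstr_parm)

// --- Open event of w_cust_details: read parameters by name ---
str_cust_parm lstr_parm

lstr_parm = Message.PowerObjectParm
il_cust_id = lstr_parm.cust_id
ib_read_only = lstr_parm.read_only
```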

More Stories By Michael Zuskin

Michael Zuskin is a certified software professional with sophisticated programming skills and experience in Enterprise Software Development.
