By Michael Zuskin
February 2, 2013 02:00 PM EST
Functions Must Be Short
Create a separate function for each logical sub-task, i.e., divide one long program into a number of short subprograms. This idea is known as "separation of concerns." Do it not only when the code will be reused (i.e., called from more than one place) but even when it will be called only once. It's not a problem to have a lot of functions belonging to one task or business flow, even dozens: a developer can always focus on just one of them. On the other hand, it's very difficult to understand how one intricate, toilet-paper-long script works. Adhering to this rule produces simple code even if the whole system is extremely complex, like software for a spaceship or for brain surgery. The following tips will help you write simple code:
- Ideally, a function should be no longer than one screen (not including the header comments). Two screens are still acceptable, but three screens suggest that the functions are poorly organized, unless the function performs long routine work that cannot (or should not) be broken into pieces, for example, processing a large number of fields received from an external service, where each field takes a few lines.
- Another acceptable guideline, which I found in a programming book: "functions should contain up to approximately 100 lines of code not including comments."
Note that the problem of overly long functions usually goes hand in hand with the problem of extra indentation, discussed in Part 2.
Let's read what Jorn Olmheim wrote in the book 97 Things Every Programmer Should Know:
There is one quote, from Plato, that I think is particularly good for all software developers to know and keep close to their hearts:
"Beauty of style and harmony and grace and good rhythm depends on simplicity."
In one sentence, this sums up the values that we as software developers should aspire to. There are a number of things we strive for in our code:
- Speed of development
- The elusive quality of beauty
Plato is telling us that the enabling factor for all of these qualities is simplicity.
I have found that code that resonates with me, and that I consider beautiful, has a number of properties in common. Chief among these is simplicity. I find that no matter how complex the total application or system is, the individual parts have to be kept simple: simple objects with a single responsibility containing similarly simple, focused methods with descriptive names.
The bottom line is that beautiful code is simple code. Each individual part is kept simple with simple responsibilities and simple relationships with the other parts of the system. This is the way we can keep our systems maintainable over time, with clean, simple, testable code, ensuring a high speed of development throughout the lifetime of the system.
Beauty is born of and found in simplicity.
Steve McConnell writes in Code Complete:
A large percentage of routines in object-oriented programs will be accessor routines, which will be very short. From time to time, a complex algorithm will lead to a longer routine, and in those circumstances, the routine should be allowed to grow organically up to 100-200 lines.
Let issues such as the routine's cohesion, depth of nesting, number of variables, number of decision points, number of comments needed to explain the routine, and other complexity-related considerations dictate the length of the routine rather than imposing a length restriction per se.
That said, if you want to write routines longer than about 200 lines, be careful. None of the studies that reported decreased cost, decreased error rates, or both with larger routines distinguished among sizes larger than 200 lines, and you're bound to run into an upper limit of understandability as you pass 200 lines of code.
And now ideas from different developers, found on the Internet:
"When reading code for a single function, you should be able to remember (mostly) what it is doing from beginning to the end. If you get partway through a function and start thinking "what was this function supposed to be doing again?" then that's a sure sign that it's too long..."
"Usually if it can't fit on my screen, it's a candidate for refactoring. But, screen size does vary, so I usually look for under 25-30 lines."
"IMO you should worry about keeping your methods short and having them do one "thing" equally. I have seen a lot of cases where a method does "one" thing that requires extremely verbose code - generating an XML document, for example, and it's not an excuse for letting the method grow extremely long."
"...you should make functions as small as you can make them, as long as they remain discrete sensible operations in your domain. If you break a function ab() up into a() and b() and it would NEVER make sense to call a() without immediately calling b() afterwards, you haven't gained much. Perhaps it's easier to test or refactor, you might argue, but if it really never makes sense to call a() without b(), then the more valuable test is a() followed by b(). Make them as simple and short as they can be, but no simpler!"
"As a rule of thumb, I'd say that any method that does not fit on your screen is in dire need of refactoring (you should be able to grasp what a method is doing without having to scroll. Remember that you spend much more time reading code than writing it). ~20 lines is a reasonable maximum, though. Aside from method length, you should watch out for cyclomatic complexity i.e. the number of different paths that can be followed inside the method. Reducing cyclomatic complexity is as important as reducing method length (because a high CC reveals a method that is difficult to test, debug and understand)."
"During my first years of study, we had to write methods/functions with no more than 25 lines (and 80 columns max for each line). Nowadays I'm free to code the way I want but I think being forced to code that way was a good thing ... By writing small methods/functions, you more easily divide your code into reusable components, it's easier to test, and it's easier to maintain."
"I often end up writing methods with 10 - 30 lines. Sometimes I find longer methods suitable, when it's easier to read/test/maintain."
"My problem with big functions is that I want to hold everything that's happening in the code, in my head all at once. That's really the key. It also means that we're talking about a moving target. Because the goal is usability, the one screen rule really does make sense even though you can point to seeming flaws like varying screen resolutions. If you can see it all at once without paging around the editor, you are very much more likely to handle it all as a block.
What if you're working on a team? I suppose the best thing for the team would be to determine the lowest common denominator and target that size. If you have someone with a short attention-span or an IDE set up displaying around 20 lines, that's probably a good target. Another team might be good with 50.
And yeah, whatever your target is, sometimes you'll go over. 40 lines instead of 25 when the function can't really be broken up reasonably is fine here and there. You and your partners will deal with it. But the 3K line single-function VB6 modules that exist in some of my employer's legacy suites are insane!"
"I prefer to try to keep everything in a function on the same level of abstraction and as short as possibly. Programming is all about managing complexity and short one purpose functions make it easier to do that."
Keep Functions Simple, Part 2
The problem of mixed levels of abstraction
Don't mix different levels of abstraction in one function. The main function should call well-named sub-functions that call sub-sub-functions (and so on, and so on...), so a developer can easily "travel" up and down between different levels of abstraction (each time concentrating only on the current one) through hierarchies of any depth.
Kent Beck wrote in his book Smalltalk Best Practice Patterns: "Divide your program into methods that perform one identifiable task. Keep all of the operations in a method at the same level of abstraction. This will naturally result in programs with many small methods, each a few lines long."
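As a rough illustration of keeping one level of abstraction per function, here is a minimal sketch. It is written in Python rather than PowerScript only so that it can be run as is; the order-processing scenario and all the names in it (process_order, validate_order, calculate_total, format_receipt) are hypothetical and not taken from any real application.

```python
def validate_order(order):
    """Low level: check the raw fields of a single order."""
    if not order["items"]:
        raise ValueError("order has no items")
    for item in order["items"]:
        if item["qty"] <= 0 or item["price"] < 0:
            raise ValueError(f"bad item: {item['name']}")


def calculate_total(order):
    """Low level: the arithmetic lives here and nowhere else."""
    return sum(item["qty"] * item["price"] for item in order["items"])


def format_receipt(order, total):
    """Low level: presentation details."""
    return f"Order {order['id']}: {len(order['items'])} item(s), total {total:.2f}"


def process_order(order):
    """High level: reads like a table of contents of the whole task."""
    validate_order(order)
    total = calculate_total(order)
    return format_receipt(order, total)


# Hypothetical sample data, used only to exercise the functions.
order = {
    "id": 17,
    "items": [
        {"name": "bolt", "qty": 10, "price": 0.25},
        {"name": "nut", "qty": 4, "price": 0.5},
    ],
}
print(process_order(order))
```

A reader of process_order sees only the three steps of the task; the details of each step are one level down and can be studied (or ignored) separately.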
Code Blocks Must Be Short
Don't let code blocks grow too long. A code block is a fragment placed between opening and closing operators. These operators are:
- Code branching operators (like IF ... ELSE ... END IF).
- Looping operators (FOR ... NEXT, LOOP ... END LOOP, DO ... WHILE etc.)
A good practice is to keep the opening and closing operators on one screen. If that is impossible, think about extracting the block into a new function. This can also reduce indentation; for example, the fragment:
if [condition] then
   [very long code fragment with its own indents]
end if
will look like this in the new function:
if not [condition] then return
[very long code fragment with its own indents]
But that is the subject of the next section:
Code After Validations
If a large code fragment is executed after a few validations (and is nested inside a few if-s), move them all (the fragment and the validations) into a new function and exit that function as soon as one of the validations fails. This not only saves your code from extra indentation but also conveys the following information: the whole algorithm (not just a part of it) is executed only after all the preliminary validations have passed. For example, if the first line of a function is "if not <condition> then return" and the code is longer than one screen, readers immediately know that everything that follows is done only if the condition is satisfied. If instead there is an "if" block like "if <condition> then <many-screens-code-fragment> end if", readers are forced to scroll down to see whether any code follows the "end if" (and is therefore always executed). See how you can convert code from monstrous to elegant:
*** BAD code: ***
if this.uf_data_ok_1() then
   if this.uf_data_ok_2() then
      if this.uf_data_ok_3() then
         [code fragment with its own indents]
      end if
   end if
end if
*** GOOD code (taken into a new function): ***
if not this.uf_data_ok_1() then return
if not this.uf_data_ok_2() then return
if not this.uf_data_ok_3() then return
[code fragment with its own indents]
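The same conversion can be tried out in runnable form. The sketch below uses Python rather than PowerScript only so that it can be executed directly; the validation rules and the names save_customer_nested and save_customer_flat are made up for illustration.

```python
def save_customer_nested(customer, storage):
    """BAD: the real work is buried under three levels of indentation."""
    if customer.get("name"):
        if customer.get("age", 0) >= 18:
            if "@" in customer.get("email", ""):
                storage.append(customer["name"])


def save_customer_flat(customer, storage):
    """GOOD: leave as soon as a validation fails; the work stays unindented."""
    if not customer.get("name"):
        return
    if customer.get("age", 0) < 18:
        return
    if "@" not in customer.get("email", ""):
        return
    storage.append(customer["name"])


# Hypothetical sample data: one record passes all checks, one fails the age check.
valid = {"name": "Ann", "age": 30, "email": "ann@example.com"}
minor = {"name": "Bob", "age": 12, "email": "bob@example.com"}

nested, flat = [], []
for customer in (valid, minor):
    save_customer_nested(customer, nested)
    save_customer_flat(customer, flat)

print(nested == flat)  # both versions accept and reject the same input
```

Both functions behave identically; the difference is only in how quickly a reader can see that nothing happens unless every validation passes.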
Here you may ask: what about the "single point of exit" rule? I don't want to discuss it in depth here, but this idea produces more problems than it solves. I agree with Dijkstra, who was strongly opposed to the concept of a single point of exit (it can simplify debugging in particular circumstances, but why should I suffer from working every day with more complicated code only to simplify debugging that may never happen?).
If the code fragment that follows many validations is not very long and you don't want to extract it into a separate function, use the exception mechanism: put the whole fragment between try and catch, throw an exception when any of the sub-validations fails, and process it locally (without propagating it outwards). If you don't want to use exceptions for any reason, use one of the following tricks: the "flag method" or the "fake loop method", but not the "multi-indents method".
*** GOOD code ("flag method"): ***
lb_ok = this.uf_data_ok_1()
if lb_ok then
   lb_ok = this.uf_data_ok_2()
end if
if lb_ok then
   lb_ok = this.uf_data_ok_3()
end if
if lb_ok then
   [code fragment with its own indenting levels]
end if
*** Another GOOD code ("fake loop method"): ***
do while true
   lb_ok = false
   if not this.uf_data_ok_1() then break
   if not this.uf_data_ok_2() then break
   if not this.uf_data_ok_3() then break
   lb_ok = true
   break
loop
if lb_ok then
   [code fragment with its own indenting levels]
end if
As you can see, the fake loop is an infinite loop with an unconditional break at the end of the first iteration, so a second iteration never happens. This solution looks strange (a loop construct that never loops), but it works.
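The exception-based variant mentioned before the flag method has no listing of its own, so here is a minimal sketch of it, written in Python for the sake of runnability. ValidationFailed, the check_* helpers and process_payment are hypothetical names; the only point is that the exception is raised and caught within the same function and never propagates outwards.

```python
class ValidationFailed(Exception):
    """Raised by a sub-validation; handled locally in process_payment."""


def check_amount(payment):
    if payment["amount"] <= 0:
        raise ValidationFailed("amount must be positive")


def check_currency(payment):
    if payment["currency"] not in ("USD", "EUR"):
        raise ValidationFailed("unsupported currency")


def process_payment(payment, ledger, rejected):
    try:
        check_amount(payment)
        check_currency(payment)
        # The main fragment sits at one indentation level, after all checks.
        ledger.append((payment["amount"], payment["currency"]))
    except ValidationFailed as failure:
        # Processed locally: nothing propagates to the caller.
        rejected.append(str(failure))


# Hypothetical sample data: one good payment and two that fail a validation.
ledger, rejected = [], []
process_payment({"amount": 50, "currency": "USD"}, ledger, rejected)
process_payment({"amount": -5, "currency": "USD"}, ledger, rejected)
process_payment({"amount": 20, "currency": "XYZ"}, ledger, rejected)
print(ledger)
print(rejected)
```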
Functions must return values to the calling script only when it makes sense. A function is allowed to return a value using a return statement only if at least one of the following conditions is true:
- The main purpose of the function is to obtain the value, and there is only one value to return.
- The main purpose of the function is to perform some action, but the returned value is important for the calling script (not for error processing - there are exceptions for that), and there is only one value to return, for example, the main purpose of uf_retrieve is to retrieve data from the database, but, in addition, it returns the number of retrieved rows so the calling script is more efficient because it doesn't need to call RowCount().
A function must not return a value using a return statement (i.e., "(none)" must be selected as the returned type in the function's signature) if at least one of the following conditions is true:
- The function's purpose is to perform some action, and there is nothing useful to return to the calling script.
- The function (of any purpose) must return more than one value - they all (!!!) are returned using "ref" arguments. It's considered a very bad programming style if both the mechanisms are used: return statement and by-reference arguments!
You may ask: what is the problem with having a meaningless but harmless return 1 at the bottom of the script? Nothing catastrophic, but the return value is part of the function's contract with the outside world, and each detail of that contract is important and must make sense. Looking at the function's interface, developers draw conclusions about its functionality: if a value is returned, there must be a reason for it, and the returned value should somehow be processed in the calling script. It's like adding to a function one extra argument of type int that is not used inside, and always passing 1 through it. That argument would also be harmless, but it would be unnecessary and foolish in the same way as the "return 1" discussed here.
Use REF Keyword
When you pass actual arguments to a function by reference, always add "ref". In fact, this short keyword plays the role of a comment:
dw_main.GetChild("status", ref ldwc_status)
It really helps readers understand scripts, especially when the called functions have multiple arguments in both logical directions, "in" and "out". It was a poor decision by PB's creators to make ref an optional keyword; let's make it required of our own free will.
No Global Variables and Functions
Never create global variables and functions! They are an atavism that has survived from the early days of programming. Modern technologies (like Java and .NET) don't have these obsolete features at all. PB has them only for backward compatibility, so don't create new ones (there is only one exception - global functions, used in DW expressions if other solutions are more problematic).
All developers, using the object-oriented approach, know about encapsulation, so usually there are no questions about global variables - they are an "obvious evil." But what's so bad about global functions? If you have a small, simple function, making it a public function of an NVO (instead of a global function) seems to provide no advantage at first glance, but...
- If you want to extract a part of the function's code into a new function (following the principle of creating a subprogram for each logical task, or simply because the script has become too long), it will be impossible without creating another global function. And what if you need 20 such functions in the future? You have two bad choices: to create additional global functions or not to create them (in the latter case the script will be left as a long, unreadable, hard-to-manage, bug-prone muddle). But if you have created an NVO as a container for your function (which is declared "public"), then you can add to that NVO any number of additional "service" functions ("private" if they are not intended to be called from outside).
- If you need to create a number of different functions related to the same task/flow, putting them in one NVO will not only decrease the number of objects in the PBL, but will also signal that they are somehow related to each other. It's definitely better than a PBL overloaded with a crazy mix of dozens or even hundreds of global functions belonging to different logical units.
- The programmed process may require you to store data (for example, between calls to the function, or to cache data that is retrieved once and then used many times by different consumers across the application). If a global function is used, your bad choices are global variables and another NVO (in the latter case you will have related stuff in different locations). But if you have created the function in an NVO, then there is no problem: declare instance variables (as well as constants for safer and more elegant code).
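The advantages above can be sketched outside PowerBuilder as well. In the following Python example (Python is used only to keep the example runnable; it is not PowerScript) a class plays the role of the NVO. The class name CurrencyConverter, its methods and the hard-coded rates are all hypothetical.

```python
class CurrencyConverter:
    """Container for one public function plus its helpers and cached data."""

    def __init__(self):
        self._rates = None     # instance variable: cache filled on first use
        self.load_count = 0    # how many times the "expensive" load has run

    def convert(self, amount, currency):
        """Public entry point (the 'public' function of the NVO)."""
        rate = self._get_rate(currency)
        return round(amount * rate, 2)

    def _get_rate(self, currency):
        """Service function, private by convention (leading underscore)."""
        if self._rates is None:
            self._rates = self._load_rates()
        return self._rates[currency]

    def _load_rates(self):
        """Stand-in for a one-time database or service call."""
        self.load_count += 1
        return {"USD": 1.0, "EUR": 2.0}


converter = CurrencyConverter()
print(converter.convert(10, "EUR"))
print(converter.convert(5, "USD"))
print(converter.load_count)  # the rates were loaded only once
```

The public function stays short because its sub-tasks live in private helpers next to it, and the cached rates sit in an instance variable rather than in a global one.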
Refactor Identical Code
Merge functions with duplicated functionality into one generic function. If such a function appears in classes inherited from the same ancestor, create the generic function in the ancestor and call it from the descendants. If the classes are not inherited from one ancestor, then create the generic function in a third class (even if you have to create that class for only one function). If you find yourself wondering whether to duplicate code using copy-paste (10 minutes of work) or to move it to a third place (two hours including testing), stop thinking immediately. Never think about that, even in the last days of your contract: simply move the code to a third place and call it from wherever it is needed. If you are still in doubt about spending your time (which really belongs to the company), ask your manager, but it's better to do the work well and afterwards explain to the manager why it took longer. If the manager understands what quality programming is, your effort will be appreciated.
Refactor Similar Code
Merge functions with similar functionality into one generic function. Different parts of the application must supply specific (uncommon) data to a generic (universal) algorithm implemented only once.
If the functions you are moving to a third place to prevent duplication are very similar but not exactly identical, you need to use your brain a little more. Do the following:
- Merge the original scripts as described in "Refactor Identical Code", removing as much code duplication as you can.
- Supply the differing parts (unique to each original function) from the application areas where the original functions used to be.
For example, we have two original functions in different classes that look like this (fragments 1 and 2 of the second class are exactly the same as fragments 1 and 2 of the first class, but the entities are different):
*** BAD code: ***
uf_some_function() of the first class:
[fragment 1]
is_entity = "Car"
[fragment 2]
uf_some_function() of the second class:
[fragment 1] // exactly like [fragment 1] in the first class
is_entity = "Bike" // oops, it's different from the same place in the first class...
[fragment 2] // exactly like [fragment 2] in the first class
*** GOOD code: ***
uf_some_function() moved into the ancestor class:
[fragment 1]
is_entity = this.uf_get_entity()
[fragment 2]
We use the function uf_get_entity to overcome the difference between the two functions discussed. uf_get_entity is created in the ancestor class (as a placeholder, returning NULL or an empty string) and implemented in the descendants to supply the specific entity names: in the first descendant the function should be coded as "return "Car"", in the second one as "return "Bike"".
If the function is moved to a third class (one that doesn't belong to the inheritance hierarchy), then the specific (different) data can be supplied as argument(s) of the new merged function, so the fragment "is_entity = this.uf_get_entity()" becomes "is_entity = as_entity".
Finally, there is one more way of achieving the goal: we can populate is_entity while initializing the instance (for example, in its constructor), but this approach is not always applicable.
Of course, it's better to spend some time before development and think about how to organize classes instead of thoughtless straightforward coding that forces the Ctrl+C and Ctrl+V keys to work hard.
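For readers who want to experiment with the merge described in this section, here is a minimal runnable sketch in Python (not PowerScript). The class and function names (Vehicle, Car, Bike, describe, get_entity, describe_entity) and the two-line "report" are hypothetical stand-ins for the fragments discussed above.

```python
class Vehicle:
    """Ancestor: holds the single copy of the shared algorithm."""

    def get_entity(self):
        """Placeholder; descendants supply the specific entity name."""
        return ""

    def describe(self):
        """The merged function: fragment 1, the varying part, fragment 2."""
        header = "=== report ==="       # stands in for [fragment 1]
        entity = self.get_entity()      # the only part that differs
        footer = f"entity: {entity}"    # stands in for [fragment 2]
        return [header, footer]


class Car(Vehicle):
    def get_entity(self):
        return "Car"


class Bike(Vehicle):
    def get_entity(self):
        return "Bike"


def describe_entity(entity):
    """Variant for an unrelated third class: the difference is an argument."""
    return ["=== report ===", f"entity: {entity}"]


print(Car().describe())
print(Bike().describe())
print(describe_entity("Bike"))
```

In both variants the common algorithm exists exactly once; only the way the differing piece of data reaches it changes (an overridden function in the first case, an argument in the second).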
Forget the Keyword "DYNAMIC"
Don't call functions and events dynamically. Instead, cast to the needed data type (the one that has the function/event) and call it statically. Instead of:
lw_transport = uf_get_transport_window()
li_wheels_qty = lw_transport.dynamic wf_get_wheels_qty()
write:
lw_transport = uf_get_transport_window()
ls_win_name = lw_transport.ClassName()
choose case ls_win_name
   case "w_car"
      lw_car = lw_transport
      li_wheels_qty = lw_car.wf_get_wheels_qty()
   case "w_bike"
      lw_bike = lw_transport
      li_wheels_qty = lw_bike.wf_get_wheels_qty()
   case "w_plane"
      lw_plane = lw_transport
      li_wheels_qty = lw_plane.wf_get_wheels_qty()
   case else
      f_throw(PopulateError(0, "Unexpected window " + ls_win_name))
end choose
That approach requires more lines of code, but it has the following advantages:
- Clarity for code readers. Developers immediately see the whole picture (all the possible situations), whereas dynamic calls hide the picture and require guessing or an annoying investigation (if that information is needed).
- Type safety. If one day uf_get_transport_window() returns w_boat, a readable message will be generated (saying what the problem is and where it occurred) instead of an application failure. Possibly, the developer will decide to extend the "choose case" construction with case "w_boat" (which will not call wf_get_wheels_qty()).