"I have achieved Zero Defects",
A dream of every software engineer.
Is it possible to achieve zero defects? Nothing is impossible. We have even reached the moon! But seriously, do you think zero defects is possible? Fred Brooks, the author of The Mythical Man-Month, says "For each bug found during the testing phase, one hidden bug goes to the customer". These words are spoken from experience.
Now ask the same question to a manager who has not written a single line of code in the last 5 years. The reply would probably be "Defects are introduced because of the carelessness of the engineers". I am not exaggerating. This happens in our everyday life, in the world around us. After all, the world is not that fair.
Ask the same question to the so-called careless engineer. The reply could be "It's not possible to achieve zero defects, even though we can strive to reduce the defect count. We have even created tools to track the bug count ...".
I am an engineer. I represent the careless engineer that you can see and feel in every software firm. When the blame for the defects is put on me, I sometimes become emotional. I scream. I shout. "Hey Manager, we engineers are careless because we too are human. And don't you know that to err is human?"
Can we blame the manager for all those bad deeds? After all, managers have to deal with the end customer who, sometimes, doesn't even know what a 'program' is. I know a customer who was not ready to accept even a single defect. He said "I am paying 10 times more for your damn software than I pay for the hardware. You are so expensive. If your product is not defect free, then why should I invest such a huge amount in your service?"
That’s a reasonable question. If the customer pays, can't he expect a decent product? And when he pays a huge amount, can't he expect a defect-free product? After all, we engineers also live in the financial world. We also know the importance of money. In fact, we too save money to buy products.
Customers, managers and engineers. Who is the actual culprit? Who can we put the blame on? We forgot one entity… the point of our discussion: the software system which carries the burden of the defects. But there is one difference. Software systems are not part of our real world. They live in the virtual world. The rules and laws in the virtual world are different.
The hardware for which the customer pays lives its life in the real world. You can touch it and feel it. In the real world, systems and events are more predictable, because the laws of physics apply here. If your mobile phone slips from your hand, your genius mind can work out the moment when it is going to hit the ground. You just need to know the height from which it falls; if air resistance is negligible, the weight of the phone does not even matter. You simply apply the formula you learned in your school days. This is possible because the falling mobile phone is a continuous system. The engineers at ISRO and NASA send rockets into outer space using the same logic. You can predict the output of continuous systems.
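The school formula in question is presumably the free-fall equation h = (1/2) g t^2. A minimal Java sketch, assuming the phone is dropped from rest and air resistance is ignored; the class name, method name, and the 1.5 m drop height are illustrative choices, not from the post:

```java
// Sketch: time for an object dropped from rest to reach the ground,
// ignoring air resistance. Names are illustrative assumptions.
class FreeFall {
    // Standard gravity in metres per second squared.
    static final double G = 9.80665;

    // From h = (1/2) * g * t^2, solving for t gives t = sqrt(2h / g).
    static double secondsToFall(double heightMetres) {
        if (heightMetres < 0) {
            throw new IllegalArgumentException("height must be non-negative");
        }
        return Math.sqrt(2.0 * heightMetres / G);
    }

    public static void main(String[] args) {
        // A phone slipping from a hand held about 1.5 m above the floor.
        System.out.println(secondsToFall(1.5) + " seconds");
    }
}
```

Under these assumptions the answer is roughly 0.55 seconds, and it depends only on the height, not on the mass of the phone.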
Take the values of all the variables in a software system, along with the method stacks of its different threads, at any moment. This collection is the state of the software system at that moment. The virtual world is digital. In the digital world all systems are discrete. That means the software system is nothing but a system with a finite number of states, say 'n'. You can test these n states with n test cases. This will ensure that the software system is defect free. Then you can make the big announcement: "I have achieved zero defects".
Suppose you have a simple Java program that uses just one primitive 'int' variable. An int is 4 bytes wide and can take values from -2,147,483,648 to 2,147,483,647. These are the different states this single variable can hold. I certainly cannot write test cases for all of them. Instead, I will test three generic conditions: what happens to this small system when the int takes a negative value, zero, and a positive value. I have now reduced the states I need to test from 4,294,967,296 to 3. That leaves a chance for a hidden bug in the remaining 4,294,967,293 states that I have not tested.
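To make the hidden bug concrete, here is a small Java sketch. The `magnitude` method is a hypothetical unit under test, assumed to promise a non-negative result. An int has 2^32 = 4,294,967,296 possible values, and the three generic test cases all pass, yet one untested state breaks the promise, because `Math.abs(Integer.MIN_VALUE)` overflows and returns `Integer.MIN_VALUE`:

```java
class HiddenState {
    // Number of distinct values an int can hold: 2^32.
    static long intStateCount() {
        return (long) Integer.MAX_VALUE - Integer.MIN_VALUE + 1;
    }

    // Hypothetical unit under test: expected to never return a negative value.
    static int magnitude(int x) {
        return Math.abs(x);
    }

    public static void main(String[] args) {
        System.out.println(intStateCount()); // 4294967296

        // The three "generic" test cases: negative, zero, positive.
        System.out.println(magnitude(-5) >= 0); // true
        System.out.println(magnitude(0) >= 0);  // true
        System.out.println(magnitude(5) >= 0);  // true

        // One of the untested states: the most negative int has no
        // positive counterpart, so the result is still negative.
        System.out.println(magnitude(Integer.MIN_VALUE) >= 0); // false
    }
}
```

A test suite built only from the three generic cases would report this method as correct.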
This example is a slight exaggeration. But if this is the case for a simple program with a single variable, what will be the case for a huge software system with thousands of variables and hundreds of threads? In theory, you can ensure zero defects by testing all those different states. If you are a manager, learn a bit about permutations and combinations so that you can calculate the number of test cases your test engineers should handle.
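For managers who want to do that calculation, here is a rough Java sketch. It assumes a simplified model in which the state is made up of independent int variables only (threads and call stacks are ignored, so the real number is larger); the class and method names are illustrative:

```java
import java.math.BigInteger;

class StateSpace {
    // Each int has 2^32 possible values, so n independent ints
    // can be combined in (2^32)^n = 2^(32n) ways.
    static BigInteger combinations(int intVariables) {
        return BigInteger.ONE.shiftLeft(32 * intVariables);
    }

    public static void main(String[] args) {
        System.out.println(combinations(1)); // 4294967296
        System.out.println(combinations(2)); // 18446744073709551616

        // For a thousand variables, print only how many decimal
        // digits the count has, since the number itself is huge.
        System.out.println(combinations(1000).toString().length());
    }
}
```

Under this model, a thousand int variables already give a count of combinations that runs to more than 9,600 decimal digits.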
Coming back to reality, you cannot test all those states in a software system. Exhaustive testing would need an unacceptable amount of time and patience. By the time you finish testing, years would have passed and even the need for your software system would be gone. Your test engineers would have gone crazy; you would be completely lost and depressed, and would not have seen daylight for years. Even your family would have forgotten you. Man….
The facts speak for themselves: the real world where we live has strict deadlines and market pressures. Time-to-market for products is getting shorter and shorter. We have to build systems in a matter of months. We need to cater to requirement changes that come every now and then. We are left with millions of untested states in the software, each of which may hide bugs beyond our reach.
Hence, I share the feeling of those who have said it before me: "Zero defects is not possible". But you can reduce the number of defects.
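A back-of-the-envelope Java sketch of how long that would take. The rate of one billion test cases per second is an assumed, optimistic figure, and the state is again modelled as independent int variables; all names are illustrative:

```java
class ExhaustiveTestTime {
    static final double SECONDS_PER_YEAR = 365.0 * 24 * 60 * 60;

    // Years needed to try every combination of the given number of int
    // variables at the given (assumed) number of test cases per second.
    static double yearsToTest(int intVariables, double testsPerSecond) {
        double states = Math.pow(2.0, 32.0 * intVariables);
        return states / testsPerSecond / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        // One int variable: a few seconds of work.
        System.out.println(yearsToTest(1, 1e9) * SECONDS_PER_YEAR + " seconds");
        // Two int variables: centuries.
        System.out.println(yearsToTest(2, 1e9) + " years");
    }
}
```

With these assumptions, one variable takes about 4.3 seconds and two variables take roughly 585 years. Every further int variable multiplies the time by about 4.3 billion.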
How?
- Follow the Manifesto for Software Craftsmanship.
- Recruit the right people: engineers who have a passion for the craft and the attitude to improve their skills. They will contribute to the success of your organization without further supervision.
- Give the best possible infrastructure to your engineers: the latest machines, uncensored internet, a library, and gaming facilities to help them relax. Productivity is highest when the mind is relaxed.
- Provide flexible working hours. After all, the human mind has its own mood swings. Software cannot be written by copy-paste alone. You need to harness the creativity of the engineers. Creativity springs out of the mind when the time is right. You cannot ask someone to come up with a creative idea in, say, the next hour.
- Provide a quiet place to work. An engineer is most productive when writing programs in a meditative state, where the mind finds the best solutions to the domain problems.
- Offer a work-from-home option.
- If defects are found, do not blame the engineer who wrote the program or the engineer who tested the app. Face the reality.
- Create a culture of knowledge sharing. Remember, people don't share knowledge in the absence of trust. Create a culture of trust.
- Do not micromanage. Micromanagement leads to mediocrity. Creativity is highest when the engineer is left alone. Let the creativity of the engineer flow freely to create the best quality software systems. Remember the old management quote: "You can't manage what you can't measure".
- Boost the motivation level of your engineers.
madhu
http://eclipsebible.com/
References - The Mythical Man-Month by Fred Brooks; Object-Oriented Analysis and Design with Applications by Grady Booch and others.
I'm sorry, but I totally disagree with you. The real world is simpler than the software system? How's that, when the system is a model of reality?
Continuous systems are simpler than discrete ones? How's that, when discrete states are a simplification (sampling) of a continuous space?
I don't really think that the hardware being produced is less complicated than the software. The key to its relatively high quality is intensive testing of small independent units. We should do the same in the software industry, e.g. by keeping testable units (e.g. methods) small and simple, with as few parameters as possible.
I have never said that the real world is simpler than software systems. But unlike the real world, software systems are created by people.
The whole idea of the blog is to convey that it is not possible to test all the 'states' in a discrete system.
One way to handle this complexity is 'separation of concerns'. As you said, write small independent units with clear boundaries. This helps us make each unit more testable.
But still, when you integrate the independently testable units into a complete software system, the 'state space' of the software system skyrockets...