Are You Certified?
There are about 200 certificates in the field of software today. They are offered by organizations such as the IEEE or by giant software companies. Some are for individuals, such as the Microsoft Certified Systems Engineer certificate; others are for organizations, such as the CMMI certificate.
Giant companies offer certificates in their specific technologies. One of the most famous examples is Microsoft, with its administration and development certificates. Other examples include Oracle, Cisco, etc.
I think two problems face software certificates and prevent them from having a high value.
The first problem is that it is easy to pass the exam without being really qualified, simply because you know the answers to the questions. I am a Microsoft Certified Application Developer (MCAD). The course is excellent in my opinion, and it gives you a very good overview and basic knowledge of the fields it covers. I studied the course intending to understand the material and gain real knowledge of those fields, but many others just get previous exams or sample exams from TestKing or similar organizations, study the questions, memorize the answers, and pass. What helps them is that the questions do not change much over time and that the exam is multiple choice. I think the exams should change more substantially over time and the questions should get more difficult.
I have met some people who have never written a line of code in their entire lives and who take the Microsoft exams just to get a certificate that may give them a better chance at a good job.
I think one of the most reliable certificates in software development is the Microsoft Certified Architect certificate, because a candidate must be nominated by an existing Microsoft Certified Architect and must present his previous work to a review board. Actually showing your work to people who can review it and discuss it with you gives a better result than just passing a multiple choice exam.
A similar thing applies to certificates offered to organizations, such as the CMMI certificate. For a company to reach CMMI level x, it has to present certain things, such as specific analysis documents for its projects. The problem is that companies order their software developers to fill in papers, even ones that are not important for the project, just to get a higher CMMI level. A friend of mine, a project manager, recently moved to a CMMI level 4 company, and he was astonished to find that the project managers there have a culture of just filling in specific documents with specific fields (he found that they had memorized the fields so well that they no longer referred to the template documents). They don’t do serious analysis or design; they just fill in placeholders to get to a higher CMMI level.
The criteria of CMMI certification are stupid. They rely mainly on documents, which are not an indicator of anything. Anyone can fill in a bunch of documents without serious software engineering practices. Also, some life cycles such as Scrum don’t produce a lot of documents by nature. Regardless of your opinion of Scrum, some big companies such as Microsoft use it in some projects. So under the stupid CMMI certification criteria, Microsoft might not be able to get a high CMMI level.
I remember reading an article stating that companies with a very high CMMI level have a higher bug ratio in their code than is normal in the software industry for the same project types and sizes, because staff focus mainly on filling in documents.
The second thing that decreases the value of software certificates is the nature of the courses themselves. Generally, any software teaching course, even in colleges, focuses mainly on code and technology rather than on the practice itself. A lot of developers give great care to code and technology and little or no care to software engineering. With the great increase in the size of the IT industry, this has created what Steve McConnell calls “Software No Man’s Land”. The field contains a small number of real software engineers. Teaching code can create a good coder but not a great developer. I think this is one of the main reasons for the high rate of failing software projects.
There are a number of courses that focus on the study of software engineering, such as the IEEE’s Certified Software Development Professional certificate, but the problem is that these certificates are less well known than those offered by big companies such as Microsoft.
Even courses that teach software engineering tend to be more theoretical than practical in many cases.
I think it would be a great addition to software certification if big companies such as Sun and Microsoft offered certificates that focus on software engineering practices, concepts and skills, with more reliable tests.
Decreasing Communication Burden Between Team Members
In his famous book The Mythical Man-Month, Fred Brooks stated one of the most famous (and least applied) principles of software engineering: adding developers to the team of a project that is behind schedule increases the cost AND increases the time; it doesn’t save the schedule.
There are two reasons for this:
- The new team members need time to understand the project, the design and the coding standards for the project.
- The communication burden increases. If the team contains x developers, then there are x(x-1) communications taking place (counting each direction between two developers separately). With 5 developers there are 20 communications. If we add 2 new developers, there are 42. The number of communications grows quadratically, not linearly, while the cost grows linearly.
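The arithmetic in the second point can be checked with a small sketch. The function names below are made up for illustration. The first function uses the x(x-1) count from the text; the second shows the n(n-1)/2 form that Brooks himself uses, where a conversation between two developers counts once instead of twice.

```cpp
#include <cassert>
#include <cstdint>

// Communications in a team of `team_size` developers, using the x(x-1)
// count from the text: every ordered pair (A talks to B) is counted.
std::int64_t directed_communications(std::int64_t team_size) {
    assert(team_size >= 0);
    return team_size * (team_size - 1);
}

// The same idea counting each pair of developers once: n(n-1)/2.
std::int64_t pairwise_channels(std::int64_t team_size) {
    return directed_communications(team_size) / 2;
}
```

Going from 5 to 7 developers is a 40% increase in head count, but the number of communications more than doubles under either way of counting.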
The only solution for a late schedule, in Brooks’s opinion, is to cut features, not to add more developers.
Should we always stick to backward compatibility?
Backward compatibility is always a major concern when writing a new application or adding a new feature to a technology standard.
The problem is that standards are in many cases wrong or at least not complete. Many cases can easily be found, such as C++, the web standards (HTML for instance) and SQL.
One of the things that prevents us from destroying those wrong standards and replacing them with totally new ones is backward compatibility. But is it really worth it?
I think backward compatibility is important when building a new application. You want your users to be able to open their old files even though you have made a new file format. This is something worth working for, but for a technology or a standard? I think here we should stop and think for a while.
Take the web as an example. How many problems do web developers face and have to hack around because of the different ways browsers deal with HTML and asynchronous JavaScript calls? One of the main causes is that the possible ways you can deal with web standards are endless. Why don’t we ignore backward compatibility completely in such a situation?
Let’s say we are about to set new web standards, browsers, and protocols. Those who are satisfied with the old standards, or whose applications contain old code, can continue to use the old standards and web development tools; those who are creating new applications from scratch can use the new standards. Developers of totally new applications would not have to face all those problems just because “we cannot change the standards completely for backward compatibility reasons”.
Take a look at C++ and the number of problems, difficulties and pitfalls that are found in some parts of the language and known to every C++ developer, templates for example.
When setting new C++ standards, why don’t we ignore the past completely? There is a good number of compilers for the old C++ that can be used for editing old code and working with old libraries. We can start a new C++ era with a new language that totally ignores the old problems, and make new compilers for it. Of course the industry will not move from the old language to the new one in a second, if only because of the large number of C++ libraries, but those libraries will be replaced sooner or later by big and small companies. See for example the .NET base class library and how it was developed from scratch.
The end result we would get in a few years, if we ignored backward compatibility when setting new technology standards, is a new software development era with lower development cost, less time, fewer bugs, and happier developers and users.
Microsoft JET Engine
The Microsoft JET engine is one of the oldest database engines still in use today, handling simple operations on simple databases such as Microsoft Access databases.
I knew it was a simple database engine, but I didn’t think it was this bad.
I am working on a small C# business application for a company. The application needs simple database operations, so I decided to use Microsoft SQL Server Express as my database management system. After designing the database, however, I found that the computer on which the application is to be deployed has little memory, and SQL Server Express takes about 80 MB of it. Another important reason was the backup operation. The people who will work with this application have a very low technical level, so backing up and restoring the database from SQL Server Management Studio would be hard for them, while for an Access database you only have to copy and replace the mdb file. For these reasons I switched to an Access database. Everything was going well for a while, but after some coding and adding more functionality to the program, I got three great surprises.
1- There are no real transactions in the JET engine; it only takes database locks when you ask for a transaction. Once the locks are taken, you get exceptions when you try to read and write the same record during the same transaction, whatever isolation level you are using!
2- Multiple connections are not allowed. If two objects (on the same thread) open a connection to the database at the same time, they are considered two users, and when one of them starts a query on a table, the table is locked for that user. The other user (object) cannot read or write in the same table.
3- Manual processing and filtering of data inside C# code is faster than using the database engine.
The data I store in the database is a copy of the file and folder hierarchy of a CD-ROM, so I have folders, each of which can have child folders and files.
When I read the data from the database with multiple queries to load this hierarchy into a TreeView, the application becomes very slow. This was very clear when I added about 2000 folders and 5000 files to the database and then tried to read them recursively.
I got a mad idea. I fetched all the records from the database in two queries and did the filtering manually in C# code. It took less than 1/8 of the time of the old technique.
I had assumed that a database engine should perform queries faster than manual processing of the data, but I was wrong!
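The "two queries, then filter in code" idea can be sketched roughly as follows. This is not the application's actual code (which was C# and is not shown in this post); it is a minimal C++ illustration, and the row shapes, field names, and the convention that a parent id of 0 means "top level" are all assumptions made for the example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical row shapes: one vector per "select everything" query.
struct FolderRow { int id; int parent_id; std::string name; };  // parent_id 0 = top level (assumed)
struct FileRow   { int id; int folder_id; std::string name; };

struct FolderNode {
    std::string name;
    std::vector<int> child_folder_ids;
    std::vector<std::string> file_names;
};

// Builds the whole hierarchy in memory with one pass over each result set,
// instead of issuing one query per folder while walking the tree.
std::unordered_map<int, FolderNode> build_tree(const std::vector<FolderRow>& folders,
                                               const std::vector<FileRow>& files) {
    std::unordered_map<int, FolderNode> nodes;
    nodes[0].name = "<root>";  // synthetic root that holds the top-level folders
    for (const FolderRow& f : folders) nodes[f.id].name = f.name;
    for (const FolderRow& f : folders) nodes[f.parent_id].child_folder_ids.push_back(f.id);
    for (const FileRow& file : files) nodes[file.folder_id].file_names.push_back(file.name);
    return nodes;
}
```

The recursive version pays the per-query overhead once for every folder, which with about 2000 folders means thousands of round trips to the engine; here the grouping is done with hash lookups after only two round trips.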
LINQ
For a long time, SQL has been regarded by some technology experts as a disaster, or as a barrier that prevents us from getting many of the benefits of relational algebra in software applications.
Problems such as redundant queries, nullable fields, and lack of support for objects were widely discussed with no proposed solution. Three manifestos were written to describe what we need in a new query language that could solve the problems of SQL, but they stopped at the stage of giving heuristics for solving the problem, without giving the solution itself.
The Microsoft guys are working on a new API called LINQ, which stands for Language Integrated Query. It is a new extension to the .NET languages that is mainly responsible for interacting with data of all sorts, such as databases, XML data, arrays and collections, and registry data, and it allows interaction and transformation between all this data and objects in code.
LINQ itself can be regarded as a sign of the deficiency of SQL. Some technology experts see new query languages and data representations, such as XQuery, object databases, and XML, as a sign of the problems of SQL; in their view these try to solve some of those problems, but they are useless solutions. I think LINQ is the best solution proposed so far.
LINQ offers a very good set of features, such as compiler support, compile-time type checking, interaction between structured data and objects, interaction between nullable and non-nullable types, and much more.
From what has been announced so far, it seems to me that LINQ solves some of the errors, not all of them, that were already known in SQL and XML and that nobody knew how to solve.
For example, this is an ordinary SQL select statement:
SELECT CompanyName FROM Customers WHERE (City = 'London')
There’s a problem with the flow of data in SQL. You have to evaluate the FROM clause before you can process the SELECT clause. In other words, the scope of data flows upwards.
In LINQ, the scope of data flows downwards, which is the natural thing. The above statement would be written like this in LINQ:
var q = from c in Customers where c.City == "London" select c.CompanyName;
In the above code, we notice a new keyword, var. A variable declared with var is still statically typed; the compiler infers its type from the value you assign to it.
LINQ also solves some of the known problems of the W3C DOM, such as support for namespaces, memory consumption, and document centricity, with XLINQ, the part of LINQ responsible for interaction with XML.
LINQ is a major design change, or addition, to the .NET programming languages, a change oriented towards describing what we want done with the data instead of how to do it.
Now we come to the downside. When interacting with databases, these LINQ commands would be translated into SQL queries to be executed on the database. I think this is a point of weakness, or at least a point where future development is possible. SQL has problems of its own on the database side, which remain unsolved. One example is redundancy. SQL allows us to get a given result with more than one query, each having its own execution path. It is the responsibility of the database engine to look at the query and optimize it into the form that takes the shortest execution path. This introduces a new problem: the database engine has to use some sort of execution path caching to avoid working out an optimized execution path for every query it receives. Unfortunately, not all database engines succeed in doing this effectively.
In an amazing experiment, a technology expert wrote two queries that both return the same result and executed both of them on a database (he didn’t mention the name of the DBMS). One of the queries ran in 2 seconds, and the other in 2500 seconds!
So the way you write SQL queries may affect the performance of your application unless you are working with a powerful database. This is an example of a problem that LINQ doesn’t address.
LINQ will be released with C# 3.0. Dan Fernandez has a post on his blog about LINQ that contains sample code. You can read it here.
Anders Hejlsberg has a video speaking about LINQ here.
New Methodology for Explaining Software Engineering
Software engineering is a delicate science that is difficult to fully understand and grasp. Programming is generally a difficult science, which sometimes cannot be understood or mastered by people who are considered smart by general measures. Bill Gates says that any programmer who will ever be good will be good within a few years. After that, whether the programmer is good or not is cast in concrete. But software engineering is more difficult to master even for good programmers, and moreover it is difficult to measure how well software engineering students understand it and how well they can apply what they have learnt in practical software projects.
One of the noted phenomena in software engineering is that known problems are always repeated, even if the project manager or the development team has read about these problems and their solutions before.
One of the most famous software engineering books, “The Mythical Man-Month”, is known as the software engineering bible. Although the book is relatively old, the problems it describes are repeated again and again in software development projects. Fred Brooks, the author of the book, comments on this phenomenon with a sense of humor, saying “That’s why they call it the software engineering bible, because everybody reads it and no one applies what is found in it”.
Here I propose a new methodology for explaining software engineering that I think can increase the level of understanding of it.
Difficulty of understanding and applying software engineering:
Software engineering has two points of difficulty:
- Software engineering is a heuristic science. A heuristic science is one that does not give you solutions to your problems; it just gives you paths you can follow to solve them. Other examples of heuristic sciences are psychology and architectural design.
There’s one characteristic shared by heuristic sciences: they all accept innovation more than other sciences. In fact, they need some sort of innovation when they are applied.
- The natural method of thinking makes software engineering difficult to understand and apply correctly. To understand this, we have to take a look at how the human mind works.
The Human Mind’s Boxing:
The human mind has a strong natural ability for pattern recognition and for applying previously known solutions, a mechanism that is called boxing.
The human mind tends to search for characterizing attributes of situations and problems. It has metaphoric boxes, each with the name of a situation or problem on it. When the attributes of a specific situation or problem are recognized, the situation or problem is put into the metaphoric box that has its name. Once the problem or situation is boxed, the human mind deals with it based on its previous knowledge of that problem or situation.
A perfect situation for using boxing is solving mathematical equations. First the mind tries to classify the equation inside one big box. If the equation contains differentiation symbols, then the problem is put inside the box that has the words differential equations on it. Here the differentiation symbols are the attributes of the problem.
Further boxing can be done, for example to find the degree and order of the equation. If the equation is a second order, second degree one, then it is boxed inside a smaller box that has the words second order second degree on it. After that, the mind applies previously known methods of solving second order, second degree differential equations.
This default natural mechanism cannot be used in software engineering. Actually it is the way in which software engineering is written in books or explained in classes that makes it difficult to use boxing with software engineering.
Boxing and Software Engineering:
Most software engineering books and publications make three mistakes:
- They mention problems without clearly identifying them. What are the symptoms of this problem? What are its causes? Etc. This makes it difficult for the software engineer to use the natural boxing mechanism to detect problems or wrong development models at early stages of the development cycle.
- They mention development models and procedures (for example the waterfall model) without clearly stating what problems this solution or set of procedures addresses. This also decreases the ability of software engineers to use the natural boxing mechanism to select the appropriate solution or development model when they identify problems in their software development lifecycle.
- They mention solutions and software development models without explaining how these models or solutions try to solve a given problem. This decreases the ability of software engineers to evaluate a given model and compare it with other models. It can even decrease their ability to create new solutions or modify an old one to fit a given situation. There’s no panacea, and every solution for a problem is only a possible solution, not the only one.
How to improve the level of understanding of software engineering?
Software engineering writers and teachers should take care of three points:
- Problems should be clearly defined. The attributes or symptoms of each problem should be stated precisely.
- When explaining a development model, the problems that it addresses should be clearly mentioned.
- It should be explained how every development model tries to solve the given problems, with a focus on the idea that this is a possible solution, not a panacea. The solution or model can be modified or replaced by a better one to fit a given situation.
PInvoke and IJW
One of the additions to C++/CLI was a new method of calling unmanaged code from managed code, and the Microsoft guys called it IJW, which is short for “It Just Works”.
The old method of invoking unmanaged code from managed code in C# is called “Platform Invoke”, or simply PInvoke.
I was very interested when I read in an article that the performance of IJW is much higher than that of PInvoke. The writer wrote two small programs: one was written in C# and used PInvoke to call an unmanaged function, and the other was written in C++ and used the IJW method to call the same unmanaged function.
But in one of the comments on the article, someone noted that there’s a performance overhead in the C# code for converting a StringBuilder into a char array. When this overhead was removed by using a data type other than StringBuilder, the C# program using PInvoke became the faster one!
I thought of trying it myself. I modified the two programs to look as similar as possible. The input to the unmanaged method was an array of TCHAR. I used a char array in C# instead of a StringBuilder or even a string, to make the C++ and the C# code as similar as possible. I used the QueryPerformanceCounter function to measure the elapsed time as accurately as possible.
Here are the two programs. I am sorry that this blogging application doesn’t have C++ or C# syntax highlighting, so the code may be somewhat difficult to read.
The C++ program:
void Test (int x)
{
TCHAR Buffer[512];
for (int i=0;i
- Theoretically, when using PInvoke, a managed data type should be marshaled to the same unmanaged type as when using IJW. This is not true for all cases. Arjun Bijanki from the Visual C++ team has noted in a discussion that some data types can be marshaled to different types when using IJW than when using PInvoke.
- The MSIL code generated for PInvoke is not the same as that generated for IJW, even for very similar code.
- Not all things mentioned in the docs are true!
- From the syntax point of view, the IJW method is easier and more elegant. It is strange to find something that can be done in C++ in an easier and more convenient way than in C#, but it is true. Microsoft has played it well this time!
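For readers who want to repeat this kind of comparison, the measuring part can be sketched as below. This is not the code used in the test above: it swaps the Windows-specific QueryPerformanceCounter for the portable std::chrono::steady_clock, and the helper name time_calls_us is made up for the example.

```cpp
#include <chrono>

// Calls `fn` `iterations` times and returns the elapsed time in microseconds.
// std::chrono::steady_clock is used here as a portable stand-in for
// QueryPerformanceCounter; it is monotonic, so clock adjustments made while
// the loop runs do not distort the result.
template <typename Fn>
long long time_calls_us(Fn fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Timing many calls in one loop, rather than a single call, matters here because the cost of one managed-to-unmanaged transition is far below the resolution at which a single measurement is trustworthy.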