Chapter 7 Good programming practice and declaring data types

7.1 Good programming practice

Writing clear code, just as with any language, requires a certain amount of thought. A good piece of code should be easy to understand by any reader, clearly laid out so that the structure can be followed, and (ideally) efficiently designed so that it will run clearly. For more complicated programs it is a good idea to have the code broken down into smaller sections so that different tasks are consigned to specific parts, as this allows for the re-use of certain key components and makes modifying the code much easier.

There are two main reasons why you should care about the clarity of your code. The first is that this will make it easier for you to debug your code, or to modify it at a later date. The second is that it will make it easier for other people who use your code to understand what it is doing and adapt or extend it.

In this section we can only touch on some of the most basic ways to encourage a clear coding style. We will concentrate on four main methods:

  • Using informative variable names.
  • Using indentation as a tool for exposing the structure of your program.
  • Adding line breaks to make your code more readable.
  • Adding comments to help the reader understand your code.

Later we will discuss two other aspects:

  • Declaring variables using types.
  • Using subroutines to structure large programs.

To illustrate our points we will consider the following example of some VBA code.

Function mm()
Dim mm1 As Variant
Dim mm2 As Variant
Dim mm3(1 To 3, 1 To 3)
mm1 = Range("A1:C3").Value
mm2 = Range("E1:G3").Value
For i = 1 To 3
For j = 1 To 3
mm3(i, j) = 0
For k = 1 To 3
mm3(i, j) = mm3(i, j) + mm1(i, k) + mm2(k, j)
Next
Next
Next
mm = 0
For i = 1 To 3
mm = mm + mm3(i, i)
Next
End Function

This is intended to be hard to follow, so do not worry about what it does for the moment. But notice that it is hard to tell where different operations begin or end, and impossible to work out what the different variables might be referring to. We will show how the first four principles outlined above can be used to make it more readable.

In fact, this code is designed to take two matrices, multiply them together, and calculate the trace of the resulting matrix. This is not a complicated procedure, and with better presentation the code can be quite easily followed. There are a few features (such as the lines beginning with Dim) which we have not seen yet, but we will not worry about these for now.

Next we will illustrate these methods by rewriting the code above. We have not yet explained all of the commands used in the example, but you should find that the new version is much easier to understand.

The first thing we will do to make it more comprehensible is replace all of the variable names by something more informative. In this example we are working with two matrices \(A\) and \(B\), so we will call them by these names instead of mm1 and mm2. We have to work out the entries of a new matrix \(C\), and we would like to refer the entry by the variables row and column; however Row is already used by VBA, so we will use rowC and colC instead.

Next we will indent the code (as described in Section 5.3) so that the structure of the loops and other commands is easier to follow. This is particularly useful if your code contains nested loops or other complicated structures.

Function traceAB()
' will calculate the trace of the matrix C=AB where
' A is the matrix in cells A1:C3
' B is the matrix in cells E1:G3
' first set up the array types of A, B, and C
Dim A As Variant
Dim B As Variant
Dim C(1 To 3, 1 To 3)
' read A and B from the worksheet
A = Range("A1:C3").Value
B = Range("E1: G3").Value
' calculate each entry in C
For rowC = 1 To 3
    For colC = 1 To 3
        ' calculate entry of the matrix C in row rowC
        C(rowC, colC) = 0
        For i = 1 To 3
            C(rowC, colC) = C(rowC, colC) _
                    +A(rowC, i) * B(i, colC)
        Next
    Next
Next
' calculate the trace of AB
traceAB = 0
For i = 1 To 3
    traceAB = traceAB + C(i, i)
Next

When lines of code become very long they are hard to read. If you wish to break a long line into two (or more) shorter ones, then end each part of the line (except the last) with a space followed by an underscore character: _. As an example, we have taken the line of code which calculates the entries in the product matrix and split it over two lines in the new version. When a line of code is formatted in this way, the continuations are usually indented to highlight that they are part of the preceding line, and in this case we have lined the indented text up with the equality sign to make it clear what we are calculating. Splitting lines becomes more important if you are using long variable names to make your code more readable.

Finally, we will explain what our code does by inserting comments in various places. A comment is simply a fragment of text which is inserted into the code to explain what is going on at that point. To avoid confusing comments with code, we use the apostrophe symbol at the start of each comment. VBA will then ignore everything that follows the apostrophe until the end of that line. You will see that all of my comments occupy an entire line. Some people do add comments at the end of lines of code, but I find this harder to read and more cluttered.

Comments are a good way to break up the code into chunks that do different things. In the example the first set of comments sets up the problem and defines some variables. Then we read in the input matrices. The entries of \(C\) are calculated, and finally we determine the trace of this matrix. As long as the reader is familiar with matrices and traces, they should be able to work through and understand the code even if they are not very familiar with VBA.

We could even improve the example code a little more by inserting blank lines to separate out the different stages of the process, in this case just after we read B and just before we calculate the trace.

7.2 Data types in VB

We have seen in lectures that variables can be assigned a type. Each type corresponds to a different kind of variable, and tells VBA how the variable must be regarded, and how much memory to assign to it. Computer memory is built up of sequences of 0s and 1s. A single bit is the smallest unit of memory, and can take one of two values (0 or 1). Eight bits form a byte, which can hold \(2^8\) different possible values. We will consider a variety of different data types, and describe the amount of bytes need to store them.

Even if we only restrict ourselves to numbers, there are many possible choices. Here are the standard types, the number of bytes of memory they occupy, and the range of values they can take.

Data type Bytes used Range of Values taken
Integer 2 -32,768 to 32,767
Long 4 -2,147,483,648 to 2,147,483,647
Single 4 -3.402823 x \(10^{38}\) to -1.401298 x \(10^{-45}\)
and 1.401298 x \(10^{-45}\) to 3.402823 x \(10^{38}\)
Double 8 -1.79769313486232 x \(10^{308}\) to -4.94065645841247 x \(10^{-324}\)
and 4.94065645841247 x \(10^{-324}\) to 1.79769313486232 x \(10^{308}\)

You will see that there are two kinds of integer types: Integer and Long. With modern computers, and for the kind of programs that you are likely to produce, there would be no harm in using the Long type to reduce the risk of answers being two large or two small for the variable type to hold. But notice that the Long type only goes up a few billion in size. This can be a problem in some applications.

For real numbers, there are again two main types: Single and Double. Again the difference is one of size, with Single having a much smaller range of values than double. Although we say these can be used to represent real numbers, they actually round them slightly. Due to the degree of precision the rounding errors will be very small, but can occasionally cause problems.

There are other data types apart from numerical one: here are a few that you are likely to come across.

Data type Bytes used Range of values taken
Date 8 1/1/100 to 31/12/9999, and 0:00:00 to 23:59:59
Currency 8 -922,337,203,685,477.5808 to 922,337,203,685,477.5807
Boolean 2 TRUE or FALSE
Variant varies any type
String 1 per character any sequence of characters

It may appear that the only type needed is Variant, as such variables can be of any type. However, this defeats the point of giving variables types, and Variant should not be used unless the type of the variable is going to vary. In exams when you are asked to use types you will lose marks for using Variant where a more appropriate type can be used.

The Currency type works with decimals with up to 15 digits to the left of the decimal point, and up to 4 to the right. It has the advantage of not introducing rounding errors (provided we work with numbers of this kind).

The Date type can be a little confusing, so we will consider it in more detail. Dates are stored as numbers, where the integer part gives the date, while the decimal part gives the time. A given date is stored as the number of days which it is from 30 December 1899, with earlier dates having negative values, and later dates having positive values. Thus 30 December 1899 corresponds to the number 0, while 1 December 1899 corresponds to -29. For the decimal part, this corresponds to the fraction of a day which has passed. So if the fractional part is 0.5 then half a day has passed and the time is 12:00:00.

Because VBA associated dates to numbers, it makes sense inside a program to “subtract” two dates. This will give the number of days that lie between the two dates.

There are several useful VBA functions that take variables of date type as input, and can be used inside your programs. Here are some of the most useful ones.

Function Output
Month(date) a number between 1 and 12 representing the month in which that date occurs
Weekday(date) a number between 1 and 7 representing the day with Sunday=1, Monday =2, etc.
Year(date) a number between 100 and 9999 representing the year
Hour(date) a number between 0 and 23 representing the month
Minute(date) a number between 0 and 59 representing the minute
Second(date) a number between 0 and 59 representing the second

As in Excel, the function Now() returns the current date and time.

To illustrate some of the above functions, consider the simple function given below.

Function age(birthdate As Date) As Integer
age = Int((Now()-birthdate) / 365)
End Function

The input is the date of birth (which takes the Date type of course). The output is intended be the age today of someone with that birthdate (which will be of Integer type).

Here the expression (Now()-birthdate) calculates the current age in days. Two convert this into years we divide by 365. The number obtained will not be an integer, so we use the function Int to find the integer part of this number. (For example, Int(23.4)=23.)

Why will this function not always give the right answer?

7.3 Declaring variables and constants

When writing VBA programs, we usually need to use variables which only exist inside the program, and are not part of the input data. For example, in Section 7.1 there were a variety of variables (such as rowC and colC, and i which only appear inside the code itself.

It is often useful to define these variables at the start of the code in which they appear. This has the same advantages as declaring the types of variables in general, and can make your code tidier and easier to understand.

In order to define such variable we write Dim variablename [As Type] where variablename is the name of the variable being defined, and Type is the type of this variable. If you do not include the As Type part then the variable will be considered to be of Variant type. Dim here is short for dimension.

In the code in Section 7.1, we only defined the variables A, B, and C. To be completely rigorous we should have added the lines

Dim rowC As Integer
Dim colC As Integer
Dim i As Integer

It is usual to collect all the Dim statements together and put them at the start of the code, so that they are easy to find. But it is only necessary that they occur before the line where the defined variable is first used.

It is possible to force your code to require every variable to be declared. This is a very good way to force yourself to adopt a good coding style. If you add the line Option Explicit at the start of your code, then the code will not run if there are any undeclared variables, but will instead produce an error message. If you have more than one module open, then the Option Explicit statement will only apply to the module in which it occurs.

One advantage of using the Option Explicit statement is that it helps you to spot small typos. Suppose that you are using a function called Mississippi. It would be very easy to mistype this and write a statement such as

Missisippi=1.47

Without the Option Explicit statement, VBA will treat this as a new variable it has not seen before, and may give rise to the wrong answer without you realising that an error has occurred. With Option Explicit included, VBA will generate an error and highlight the rogue statement.

Another advantage of declaring variables, which is particularly useful when combined with the use of long variable names to improve comprehensibility, is that declared variables can be entered via shortcuts. If you type the first few letters in a variable name and then press Ctrl+Space then VBA will either complete the variable name or list the various variables with those initial letters for you to select from. This also works for in-built functions.

It is even possible to ensure that Option Explicit always appears automatically at the start of every module. If you wish to do this, select Tools and Options in the menu bar and turn on the Require Variable Definition option in the Editor tab.

As well as variables, you may sometimes want to use constants in your code. These are declared much like variables, except that we use a statement of the form

Const name [As Type]=value

where value is the value that the constant name takes. Such constants are only available inside the current program. If you wish then to be available in the entire worksheet, you should instead use a statement of the form

Public Const name [As Type]=value

and put it before the Function statement at the start of the module. The value of a constant cannot be changed by other statements in the code, and attempts to do so will generate an error.