June 2009 - Posts

If you're a member of Facebook (who isn't? - don't answer that), there's a free draw for a book on Excel 2007 Pivot Tables.

And here's where it gets good: it's not written by just anyone, but Debra Dalgleish, undisputed heavyweight champion of Pivot Tables.

Here's the offering:

http://www.facebook.com/home.php?ref=home#/event.php?eid=109974158504

As I mentioned in my free, online Excel resources blog entry:

http://msmvps.com/blogs/nateoliver/archive/2009/05/29/online-excel-resources.aspx

One can learn a lot about Pivot Tables by reading Debra's insights, e.g.,

http://contextures.com/xlfaqPivot.html

While I haven't had the chance to read this offering, yet, myself, I can only expect it to be much more comprehensive than her [great] website.

So, if you're a Facebook member, and are feeling lucky, sign up. This appears to end tomorrow, so don't hesitate.

Incidentally, Debra also does a great Jon Peltier impression.

Top-5 Primer Worksheet Functions 

Excel’s a robust and dynamic piece of software offering a variety of capabilities. Today we will discuss the most common use of Excel, an analytical tool with given data sets and 5 must-know, general usage Worksheet Functions to begin to work with Excel. These 5 Worksheet Functions include Sum(), Count(), SumIf(), CountIf() and VLookup().

 Sum()

Sum() allows one to add values in a variety of manners. The most common use would be to sum a given, contiguous range, e.g., =Sum(A1:D10), the result being every cell in A1 through D10 being added together.

Sum() also allows for arguments via the use of a comma, which allows for multiple, non-contiguous ranges to be added. Two differing examples would include both single-cell, non-contiguous ranges, e.g., =Sum(A1,D1,F1) and multi-cell, non-contiguous ranges, e.g., =Sum(A1:A5,C1:C5,E1:E5).

One thing to note with Sum() is that it does not ignore error values in your data, e.g., if you have 1,000 valid cells of data and a single cell in the calculation range contains an error (e.g., DIV/0 – division by 0), Sum() will not return the sum of the range, but the error, DIV/0. This behavior is by design, to alert you that there is a potential issue with your data. However, later we will explore a simple work-around (see SumIf()).

Count()

Count() allows one to gather a count numeric data in their spreadsheet. As with the previously discussed Sum() function, Count() can work with single ranges, non-contiguous ranges and multi-cell ranges - the latter with comma-separated arguments.

SumIf()

Sum() works great if you want to add everything in sight, but what if we want to add values based on a specific criteria?

Enter SumIf(), which will sum a given range of values, if a given criteria is met. In pseudo code, SumIf() reads as =SumIf(Range of Criteria Values, Your Criteria, Range to be Summed). The last argument, separated by commas, is optional – if omitted, the Range of Criteria Values is summed.

Note that SumIf() supports wildcard characters and mathematical operators, as mentioned  here:

http://office.microsoft.com/en-us/excel/HP052092921033.aspx?pid=CH062528291033

As promised, a quick and risk-free work-around to summing ranges that house error values within its cells. It’s not recommended that Excel users allow error values to flow through their spreadsheets; however, it is imaginable that an individual has a pressing engagement, say a tee-time, and needs to provide a colleague a sum of data that houses errors, much sooner than later.

SumIf() allows one to do this by invoking a simple mathematical operator, which forces SumIf() to ignore all non-numeric data, of any sort, including error values. So, the following transaction is risk-free: =SUMIF(A1:A10,"<=9.99999999999999E+307")

The reason that this transaction is risk-free is that 9.99999999999999E+307 is the largest number allowable in Excel (see the help file on “Limitations”), so all functional numbers in Excel must either equal or be less than this number, and only numeric cell entries will be considered at this point.

Be careful with 9.99999999999999E+307, there’s a reason I’m burying it in quotes, Excel can do strange things with numbers, due to its Data Types – a subject for another day.

CountIf()

Like SumIf(), CountIf() provides us with a manner of conditional counting. The slight variance is that it takes two arguments, the range you want to evaluate and the criteria to evaluate that range by, e.g., =CountIf(Range you want to count, Criteria to be met).

CountIf() also supports wildcard characters and mathematical operators, as should be noted, here:

http://office.microsoft.com/en-us/excel/HP052090291033.aspx?pid=CH062528311033

VLookup()

VLookup() is an extremely useful worksheet function, in Excel, and chances are if you’re interviewing for a job that requires somewhat advanced Excel experience, you will be asked about it. While we won’t cover the several underlying algorithms, let’s have a look at really understanding the implementation.

Most of Excel’s worksheet functions are designed with names that are memorable, in an English-sense, and VLookup() is no exception, in layman’s terms, it is short for “Vertical Lookup”. This function performs a search on a column of values, a column being vertical, and if said value is found, offsets a number of specified columns, and returns a corresponding value.

Before using, I recommend reviewing the help file for this function, as well:

http://office.microsoft.com/en-us/excel/HP052093351033.aspx?pid=CH062528281033

So what does this mean? First of all, pretend you don’t know the last argument, for now, let’s stick with False. Here in pseudo code is what this means:

=VLOOKUP(Value to Find, Range of Values – Including the Offset Range, Number of Columns to Offset from first Column, False - for now)

The left-most column of the 2nd argument refers to the column you’re actually searching, and the right-most isn’t what this is limited to, but is generally recommended to be - the farthest column you want to offset to, to return your value. So, if you want to return a value from column B, that is dependant on a match from column A, the example could look something like this:

=VLOOKUP("August",A1:B12,2,FALSE)

Where 2 indicates to return a value in the 2nd column of the specified Range of values.

Returning to the 4th argument of VLookup() and the True argument. This can be more efficient due to the manner in which the underlying algorithm processes. But, as noted in the help file, your first column has to be sorted in ascending order for it to be trustworthy, due to a difference in the way it searches.

Conclusion

There are quite a few of differing, very useful worksheet functions in Excel, and most of them are well documented. This is a quick-hit list of some that might prove to be more frequently used and useful for general purposes.

Chances are pretty good that if you're reading this blog, you use your computer quite a bit. Today I wanted to mention a nice freeware utility to help you manage your resources, CCleaner:

http://www.ccleaner.com

What is CCleaner? In their words:


"CCleaner is a freeware system optimization, privacy and cleaning tool. It removes unused files from your system - allowing Windows to run faster and freeing up valuable hard disk space. It also cleans traces of your online activities such as your Internet history. Additionally it contains a fully featured registry cleaner. But the best part is that it's fast (normally taking less than a second to run) and contains NO Spyware or Adware! :)"


That's a mouthful! Basically, it's free software that removes junk that accumulates in your file space and registry. If you run CCleaner, you might be surprised how much of aforementioned junk accumulates. I ran CCleaner last Sunday and it removed 700 MBs of temporary junk.

The registry cleaner is great, too. It's not recommended that you play games with your registry without a utility like this, unless you have a very good handle on what you're doing - a false step in the registry could mean an OS reinstall. And some of the junk that accumulates in your registry is non-trivial. E.g., while upgrading Ad-Aware, the upgrade process left an old Ad=Aware registry entry in place, not only being unnecessary, but actually causing Ad-Aware to simply not work. Clean up the registry, and you're in like Flynn.

There's two other nice utilities, as well, an Application uninstaller, and a memory manager which allows you to control what on your system is loaded at start up. The latter's nice, e.g., I want Windows Live on my PC, as an example, but I don't want it to launch on every start up. A lot of programs you want and install have a nasty habbit of doing this for you - problem solved.

The GUI is intuitive and the program itself is small and has a light footprint in memory. Others seem to agree that this is nice software:

http://www.ccleaner.com/reviews

So, if interested, you can download, here:

http://www.ccleaner.com/download

Personally, I can't recall life before using CCleaner.

Now, it's go-time. We had some fun considering the differences between various growth models, we know how to handle the compounding one, and why it's of interest, i.e.,

http://msmvps.com/blogs/nateoliver/archive/2009/06/06/the-cagr-installment-1.aspx

So, let's handle it!

I wish I was smart enough to have conceptualized this, but it comes from a question, from the University of Florida - I'm a Big 10 guy. And you can tell I'm not enthused, at first:

http://www.mrexcel.com/forum/showpost.php?p=1583022


"You're talking compounding growth from 1985-1990? And you want to do that scenario for each state for every possible combination of years?

That would require a ton of iterations on each state. It's doable, and my CAGR example is how you do it, but I didn't set it up to easily do that. You'd need a pretty big table for each state (unless you did this with VBA) to accomplish this. I suppose I could do a UDF..."


She wants every single possibility where the CAGR, in any given year, exceeds her threshold. She's not looking at preferred dividends, she's looking at real, real estate price indexes. Real as in not nominal - inflation adjusted.

It's a cool problem, from an analytical standpoint - it won't brew your morning coffee, but it will flag periods that might be worth reviewing and attempting to figure out why they grew at such strong rates.

In order to solve this problem I took a night off. Sounds odd, right? I often don't write my best code while looking at my computer. Here's what I came up with:

 

Public Function CAGRThresh( _
    ByRef rngYears As Range, _
    ByRef rngPrincipal As Range, _
    ByVal curThresh As Currency) As String
Dim varYears() As Variant, varPrincipal() As Variant
Dim strRet() As String
Dim i As Long, j As Long, lngCount As Long
Dim lngUpper As Long
Dim curCAGR As Currency
Let varYears = rngYears.Value
Let varPrincipal = rngPrincipal.Value
Let lngUpper = UBound(varPrincipal, 2)
ReDim strRet(1 To (lngUpper ^ 2 * 0.5 + lngUpper * -0.5))
For i = LBound(varPrincipal, 2) To lngUpper - 1
    For j = i + 1 To lngUpper
        Let curCAGR = ((varPrincipal(1, j) / _
            varPrincipal(1, i)) ^ (1 / (j - i))) - 1
        If curCAGR >= curThresh Then
            Let strRet(lngCount + 1) = varYears(1, i) & _
                "-" & varYears(1, j) & ": " & _
                Format$(curCAGR, "0.00%")
            Let lngCount = lngCount + 1
        End If
    Next j
Next i
If lngCount > 0 Then
    ReDim Preserve strRet(1 To lngCount)
    Let CAGRThresh = Join$(strRet, ", ")
    Else: Let CAGRThresh = "N/A"
End If
End Function



There's a few things to note in all of this. As we've been discussing, this is a buffered Array:

ReDim strRet(1 To (lngUpper ^ 2 * 0.5 + lngUpper * -0.5))


How did I come to this conclusion? I used multiple-regression - we'll talk about this, soon.

Your data needs to be cross-tabbed, or pivoted, whichever terminology you prefer. Your data can be anything given a compounding growth model, as long as you have items to the left and time periods across the top.

I'll upload an example, but note, it has two VBA UDFs, two Subroutines and an ActiveX control embedded in it. If you or your organization aren't comfortable with this, I advise you to ignore said attachment - I provide it with no assurances, of any sort.

The first Worksheet in question speaks to the CAGR example. The second speaks to arithmetic growth, and why it doesn't work.

The third worksheet speaks to why I picked such a wild return format, a String? You can quickly parse all of this with Text-to-Columns, given a careful construct.

And, of course, that approach is limited in context, too. E.g., you could also stack an Array or disconnected Recordset.

I love saying "the CAGR!" It reminds me of the movie 'Highlander', and the Kurgan. There can be only one! I use a full-on Scottish accent when I say it, too.

So, what is the CAGR? It's a financial acronym, which is Compounding Annual Growth Rate. And it's widely used, as the Motley Fool mentions here:

http://www.fool.com/workshop/2000/workshop000302.htm

The term's a little misleading, as it implies the periods have to be annual, and they don't. The CAGR is to Financial Analysts what would be otherwise known as Geometric or Exponential Growth:

http://en.wikipedia.org/wiki/Exponential_growth

This phenomenon occurs in many different walks of life, investment values hopefully grow at an exponential rate (the S&P 500 tends to), biological cell growth could show this, rabbit population tends to follow this pattern, etc... The other day, while discussing String Concatenation and performance degradation, we were looking at geometric decay.

Why is the CAGR important? It shows trends. So, for an example, assuming resources are scarce, and you as a business planner have various choices as to where to invest, how do you choose? The goal is to maximize, and your best bet could be the highest area of compounding growth - noting history is not necessarily an indicator of the future.

Right, so how do we calculate the CAGR in Excel? There is no native way in Excel, that I know of, to calculate a CAGR in general, but a little arithmetic gets us there. E.g.,

=((B9/B2)^(1/7))-1

Or, in pseudo-speak:

=((Last Value/First Value)^(1/Number of periods less 1))-1

Make sure your time-periods are equally spaced with this logic.

I'm also uploading an example Excel File (w/ no VBA) - a fairly contrived one. With that, my next entry will explore a more extreme CAGR implementation. If you're an analyst, looking at compounding growth patterns, you might enjoy this.

This isn't perfect, and it isn't high-end A/V - but it's a potential approach to snagging a segment of a song. Perhaps a party trick, literally.

So, in the following we use Binary File Access and try to figure out where we are in terms of Bytes and Seconds within an existing song file and create a new file with the cropped song of our specifications:

 

Sub CropMp3s()

Dim b() As Byte, i As Long
Dim FileNum As Long
Dim lngFileLength As Long

'File length variables
Dim lngFileSeconds As Long
Dim lngStart As Long
Dim lngFinish As Long
Dim lngTimeDiff As Long

'\\\\File Input\Output
Const strFile As String = "c:\SongIn.mp3"
Const strOutput As String = "c:\SongIn_cropped.mp3"
'\\\\\

'\\\\Song Constants
Const lngStartSeconds As Long = 20 'Starting seconds into song
Const lngStartMinutes As Long = 4 'Starting minutes into song

Const lngEndSeconds As Long = 21 'Ending seconds into song
Const lngEndMinutes As Long = 5 'Ending minutes into song

Const lngSongSeconds As Long = 9 'Song's entire ending ending length in seconds
Const lngSongMinutes As Long = 7 'Song's entire ending ending length in minutes
'\\\\

Let lngFileLength = FileLen(strFile)
Let lngFileSeconds = (TimeSerial(0, lngSongMinutes, lngSongSeconds) * 86400)
Let lngStart = (TimeSerial(0, lngStartMinutes, lngStartSeconds) * 86400) * _
    lngFileLength / lngFileSeconds
Let lngFinish = (TimeSerial(0, lngEndMinutes, lngEndSeconds) * 86400) * _
    lngFileLength / lngFileSeconds
Let lngTimeDiff = lngFinish - lngStart

ReDim b(1 To lngTimeDiff)

Let FileNum = FreeFile
Open strFile For Binary As #FileNum
For i = LBound(b) To UBound(b)
    Get FileNum, i + lngStart, b(i)
Next
Close FileNum

Let FileNum = FreeFile
Open strOutput For Binary As #FileNum
Put FileNum, , UBound(b)
Put FileNum, , b
Close FileNum

End Sub

 

I've set up a few constants that need to be changed, first the file names. Then, the starting point you want, separately for both seconds and minutes, the ending point you want, again, separately for both seconds and minutes, and the the total length of the song, in seconds and minutes. I did not include hours into this, figuring that might be extreme...

A few questions you might have. One, why would anyone do this?

  1. For the sake of doing it
  2. While there's plenty of Freeware .exe's out there to do this, I'm not crazy about downloading random .exe files - hard to say what's in them

Two, why doesn't it work consistently?

I haven't accounted for a few potential Bytes on the front or tail-end of the file structure, which appear in the form of ID Tags, I'm not intimate with the .MP3 file structure. Otherwise, the ratios should work, in terms of kbps, and consequent file-size. It gets you pretty close, it's a matter of tinkering...

As I said, not really close to perfect, but a potential approach/start. If you're interesting in Binary File Access, or File I/O in general, see the following:

http://msdn.microsoft.com/en-us/library/aa716281.aspx

Sequential and Random File Access are worth reading up on, too.

As I was writing today's blog entry, I realized I needed to bake in a little arithmetic. I could have done this in the front-end of Excel, but this is a VBA sample, and we're in RAD-mode. So, I, without much thought, tried the following:


Sub foo1()
Debug.Print (24 * 3600)
End Sub


Whoops - overflow error. Serves me right, eh? I should be doing this sort of thing in my head.

VBA is treating these numbers as Integer Data Types and the result, which wants to also be an Integer, is out of the allowable range of this Data Type (in VBA/VB6). The range of numbers being documented in the Help File.

How do we deal with this? There are two workarounds, both of which have come up recently. The first is to use a Type-Declaration Character as explained, here, to make sure we're working with Long Values:

http://msdn.microsoft.com/en-us/library/s9cz43ek.aspx

And this works well, e.g.,


Sub foo2()
Debug.Print (24& * 3600&)
End Sub


Actually, you can make sure the first value in question, only, is a Long, and you're fine from there, e.g.,


Sub foo3()
Debug.Print (24& * 3600)
End Sub


The second approach is to use VBA's explicit Type Conversion functions, that are mentioned at the end of the following article:

http://msdn.microsoft.com/en-us/library/aa716185.aspx

And an example would be as such:


Sub foo4()
Debug.Print (CLng(24) * 3600)
Debug.Print TypeName((CLng(24) * 3600))
End Sub


As you can see, I've introduced something different here - a call to TypeName(). TypeName(), is described here:

http://msdn.microsoft.com/en-us/library/aa263394.aspx

TypeName() can be a very useful function when in doubt as to what Data Type, and limitations, you're working with. You can end up with a lot of funky Types in VBA, whether it's due to you performing implicit Type conversion or under-the-hood workings that you didn't intend - something to keep in mind. I know I've made some surprising calls to TypeName(), in the past...

And, as a last example, we can see that the Type-Declaration character does indeed change our perspective on the matters at hand:


Sub foo5()
Debug.Print TypeName(24)
Debug.Print TypeName(24&)
End Sub


This behavior's a little quirky, seeing as it's my understanding that 32-bit VBA wants a Long-integer vs. an Integer-integer, either way. But it's what we have, so we'll work with it. And, today's entry will be deferred until tomorrow - until then.