Tuesday, November 13, 2012

excel data analysis 1.0 (data sorting, general stats)

This is a series of articles to demonstrate use of Microsoft Excel for financial data analysis; it will cover a pretty wide range of functions such as volatility forecasting, backtesting, cross-product analysis. I want to share some of the things I’ve found useful, hopefully for the readers too.

This post deals with an overview of historical data for NASDAQ listed CSCO, Cisco Systems

Getting the historical data
Historical daily prices are available at Yahoo Finance. I got the data going from Mar. 26 1990 to Nov. 13 2012, about 5,700 days (rows) of data. It is usually a good idea to save the data in .xlsx format (different if you don’t use office 2007) instead of the default .csv so that the later applied formulas would be retained.  

Sorting data vertically
The default spreadsheet from yahoo has the latest values at the top and the earliest bottom. I like my data the opposite, feel free to skip to next part unless you also want your data going from top to bottom. To do this, we highlight all the data EXCEPT the column headers (top row), and click on Sort.

Overview of quartile statistics
To keep this simple, I’ve kept only the Adjusted Closing values for the sample period.
I usually insert a few rows at the top so that we could have room to see the quartile and average values for each column of data. In this case I highlighted rows 2 – 10 for the insert, and hit Freeze Pane so that these rows would always show when we scroll down the spreadsheet.

Why Quartiles and not Standard Deviations? Because financial data sets are usually not explained by normal/Gaussian distributions. The excel quartile formula set allows us to see each data column’s 25% ranges going from minimum, median, to maximum. It provides a much more practical overview of the data from a risk management perspective. From the above screenshot, we can see that since inception, CSCO has spent about 50% of the time above $16.6/share, with the average price at $15.86/share.

Excel formulas (press Ctrl + ` to make formulas show in excel):
Note that I've made the ranges reach 11,000+, it is so I wouldn't have to update the formulas until the spreadsheet exceeds that many rows.

dX (change in price)
The change in price, here I use 1Day per time step, is needed for lots of more complex analysis. Sticking with mathematical finance, I use the logged return formula ln[X(t)/X(t-1)] where
X = last adjusted closing price
t =  time step units, in this case the day where X is observed, and X(t-1) would be the price from the previous day.

Excel formulas (press Ctrl + ` to make formulas show in excel):

So, we end up looking at this:
From the above we can immediately notice a few things about CSCO. It has grown significantly since inception, with the average day to day return at 0.09% before transaction costs and etc. From the quartile analysis, we can see that 50% of the time it gets quite volatile, where the single day moves have exceeded 20%.

Most importantly however, you must keep in mind that the state of the company today is in no way identical to where it was in March of 1990. Therefore, this 0.09%/day return is highly unlikely to persist into the future.

I will finish this post at this point. Feel free to contact me for any help with excel data analysis.

0 Reflections: