Oct 1, 2014

Some Great Data Science and Big Data Links

The analytical hub of Data Science Central did extensive research on the sites and blogs most liked or mentioned by its member base. The result is a comprehensive list of the best data science sources. Please find the list here.

If you are looking to expand your list of regularly visited blogs, check out the similar list of 50 Data Science and Statistics Blogs Worth Reading. There are some true gems.

In case you need some large data sets to sharpen your skills, or for any other purpose, go to 20 Big Data Repositories You Should Check Out.

Of course, you can always find some original content and random analytics-related thoughts here.

Sep 24, 2014

Used Car Market in Bulgaria - Where is The Data?

A couple of weeks ago I shared some difficulties that come with forecasting the new car market. This market is interesting for marketers for obvious reasons, but it forms the smaller portion of the total car market. The second-hand car market caught my interest and I looked around for some data. I focused on my home country, where I wanted to test some popular myths against hard data. It turned out that there is virtually no data at all and I had to do some digging for details.

Sep 9, 2014

The Winding Roads of Car Sales Forecasting

Forecasting the sales of a car dealer is a tough business, much tougher than predicting the total market. Winning a car race is a matter of the right combination of engine tuning, tire type and pressure, quality of petrol and other fluids, race track parameters, weather conditions, the driver's mental and physical state and many other factors, as well as all of these for the competitors. Similarly, sales results depend on a plethora of interconnected factors, which makes forecasting them a Heracles-grade labour. However, the challenges in predicting car sales are not unique, and they are a good illustration of some common problems.

Sep 5, 2014

Nice Article About the Application of Analytics in the Restaurant Business

Restaurants are probably among the most ancient businesses. Analytics builds on the vast experience in the field, adds new insights and creates a lot more opportunities. The article Tables, Tablets, Data And Eating, published on techcrunch.com, has some nice examples of that. There are some good points about retail analytics as well. Enjoy!

Aug 29, 2014

ModelOff 2014 is Now Open

The biggest and best Excel challenge is now open! Professionals and students in Finance, Banking, Accounting, Investments and Quantitative industries who love and use Microsoft Excel can accept the challenge and compete with the best from 100+ countries. The tasks are very challenging but the fun is guaranteed - I have done most of the tasks from the previous competitions, so I know for sure. There are three rounds - the first two are online case studies and the third is a live event in New York City.

Please visit http://www.modeloff.com/ for further details. If you are not going to participate, make sure you review the past questions and learn some new tricks.

Aug 28, 2014

VBA or Formulas?

When developing complex spreadsheets in Excel there is always the question of how to balance VBA macros against formulas. There is no single answer; the choice depends on the target users, the complexity of operations and logic, the size of the sheet and other factors. I would like to share some thoughts on this matter.

Aug 20, 2014

Transition from Excel to R

The R language is a powerful tool, but the learning curve can be very steep and intimidating. However, as a business user you might be looking into it in search of speed, flexibility and better handling of larger data. An Excel user would be thinking in terms of the most common tasks in Excel - summaries, pivots, look-ups, filtering and charting - and the first questions about R would naturally be how to perform these tasks. The answers are not always easy to find. Fortunately, there are some good people on the Internet who come to help. I have put together a short list of blogs and sites that could be very useful for the initial part of the Excel to R transition.

Aug 13, 2014

How to Deal With Somebody Else's Excel Workbook

No matter how closely you follow your religion's prescriptions for a righteous life, sooner or later life will serve you the task of dealing with an Excel model made by somebody else. The clash with a different design style, approach to problem solving and sheet organization is definitely not a pleasant experience. However, over time I have developed a sort of methodology to follow in this process that delivers fast results with minimum effort and keeps my sanity intact.

Jul 23, 2014

Perception vs Data: Is This The Rainiest Summer Ever?

Weather report on bTV
"Oh, not again!" This is the thought that flashed through my sleepy head this morning when the weather woman stood in front of a map densely covered by small pictures of rain and lightning. It has been a very rainy summer. The never-ending rain and the missing sunshine have been a major topic of conversation for the last few months. Everyone around is sort of angry about the lost opportunities for good times outdoors. The phrases of the day are "This is the rainiest summer ever!", "There has never been such a summer before" and "I don't remember a summer like this one". It got me thinking whether this really is the case or our perception is playing tricks on us. I pondered a similar question in my World Cup post, where I tried to find out why this tournament is considered to be a fantastic one. In contrast to emotionally charged football tournaments, the perception of weather should be relatively simple to analyze: the weather has been well recorded for a long time, there is hard data on it, and there are far fewer factors to consider.

Jul 3, 2014

Is This World Cup Really Better Than the Previous Two?

The World Cup in Brazil is perceived to be the best World Cup tournament among the last few. This is according to my circle of friends and acquaintances, and to me, as much as I could be a reliable source. It made me think about what makes a football tournament a good one and why exactly this one is better than the ones in 2006 and 2010. I asked around to gather some opinions and I also decided to look at some data to find out whether it could tell a bit more about that.

Jun 19, 2014

Friday Funny: The Fastest Way Ever to Becoming a Data Scientist

Data scientist and data analyst are among the most wanted positions. It also sounds very cool to be a scientist, despite Sheldon Cooper. There is one little thing that makes it hard to get on board, and it is the amount of work one needs to put into developing the required skills. I have good news for you! Recent research has proved that this is an absolute misconception and that data-science-related skills can be developed much faster than previously thought. There is a recipe to open the door to this lucrative field and it is based on numerous observations of the evolution of experts in the field, so it can be trusted.

Jun 13, 2014

Can You Do Better? Some Great Examples of Excel Dashboards

Every year the awesome site of Chandoo organizes an Excel dashboard contest. The site has just released the best entries of the latest competition and there is some great stuff. The participants took time to produce fine examples of dashboards. There are concise comments on what is good and what is bad. Also, every dashboard is available for download if you would like to get into the details of how it is made. Go to the post and get some ideas, learn and enjoy.

Jun 11, 2014

Analytics for The Small Business: Mission Possible

A small business usually has to be very smart to compete in the market. "Smart" includes not only the personal qualities of the people running it but also, with growing importance, the ability to analyze business data. The sad truth is that these businesses are left behind by the majority of analytics vendors and the owners struggle to find solutions that meet their needs. The question is which solutions could meet this demand.

Jun 3, 2014

My Favourite Job-related Joke

Once there was a company and it had a computer system for its intensive operations. One day, out of the blue, the system broke down. The company froze - it could not do anything without it - no sales, no orders processed, no purchases. All the gurus from the IT department sweated over restoring the system, but nothing worked; even Google could not give any advice.

May 21, 2014

Great Online Course for Data Mining!

The appeal of data mining for companies and analytics practitioners is growing by the day. So where should you start with it? Recently I have been evaluating data mining software and courses and I came across a very good one that I can recommend without any reservations. This is the MOOC organized by the University of Waikato. MOOC stands for "massive open online course" but do not be fooled by the name - it's a serious course that delivers right on target.

May 14, 2014

The Rise of the Data Scientist - Have We Seen That Before?

In a recent conversation somebody was very excited about the marketability of skills in R and similar tools, as well as about the growing demand for people who have them. The story was about bright career prospects - money, a good position in the management hierarchy, fame and Aston Martins with Victoria's Secret models in them. This person is not alone, and his opinion is obviously backed by the growing number of job ads requiring R or similar skills. However, I beg to disagree, because we have all seen something very similar before and things developed differently.

May 12, 2014

Age of Miss America Correlated to Murders by Steam?

Age of Miss America And Murders by steam, hot vapors and hot objects

One of the dangers of too much data and too many "scientists" is spurious correlations. These are correlations that happen purely by chance. If you have a time series to explain and a massive number of data sets to test, sooner or later you will find a seemingly meaningful correlation. I posted about that some time ago, but today a co-worker sent a link with some excellent illustrations of the point and it is too good not to share. Go to Spurious Correlations to see them all - some are very funny, others are puzzling. I did not know that somebody could die by becoming tangled in their bedsheets! Scary bedsheets!
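
The "sooner or later" part is easy to try at home. Below is a minimal sketch in Python, using only the standard library. It is my own illustration and has nothing to do with how the Spurious Correlations site works: the function names, the number of candidate series, the series length and the seed are all arbitrary assumptions. It generates one random "target" series, then tests many unrelated random series against it and keeps the strongest Pearson correlation found.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_match(n_candidates=1000, n_points=10, seed=42):
    """Return the strongest correlation between one random series and
    n_candidates other random series that are unrelated to it by design.

    All parameters are illustrative assumptions, not values from the post.
    """
    rng = random.Random(seed)
    # The "time series to explain": pure noise, 10 yearly points by default.
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is independent noise, so any correlation is chance.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "candidates tested, strongest r =", round(best_spurious_match(n), 3))
```

With short series and enough candidates the strongest correlation tends to look impressive, even though every series is noise. The more data sets you test, the better the best match gets.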

May 9, 2014

Big Data Deja Vu?

I was running a series of machine learning algorithms on a few huge files the other day in search of some meaningful information. I was enjoying all the fun that comes with large volumes of data - painfully long processing times, slow response to any data operation and a loud laptop cooler, to mention a few. As I was optimizing memory usage and calculation time I had a deja vu about a long time ago in a lab far, far away. Back then I was calculating a big set of parameters from hundreds of physics experiments. The PCs had less computing and storage power than an entry-level smartphone, and all the data operations and calculations had to be performed in a clever way in order to get something meaningful in your lifetime. Back in those times nobody talked about Big Data. It was probably because quite often the data was big. Of course, the ability to collect large volumes of data was galaxies away from the powers we have today, but still, there were many domains that amassed large volumes of data. It got me thinking. Going even further back made me realize that large data sets have been with us since the beginning of the computer era. Big data is defined in many ways (see Defining Big Data), but even under the simplest definition it seems our ability to generate data is always one step ahead of our ability to process all of it.

Apr 29, 2014

The p Value is Not One Number to Rule Them All

I have just seen an article that is too good not to be shared. It is published by Scientific American and discusses issues with statistical significance and its effect on scientific results. It also lays out the alternative statistical methods that need to be taken into account. It is written in a readable manner and has lots of links for the curious. I recommend it to everyone in data and analytics-related fields, as well as everyone else interested in the application of scientific methods. Find the article here: Statistical significance and its part in science downfalls. Enjoy!

Apr 28, 2014

Where Are We on the Big Data/Analytics Hype Curve and What Does It Mean?

We are all familiar with the hype-cycle curve for new technologies as described by Gartner Group. The question now is where we are on this curve and what would be a good strategy, both for a personal career plan and business-wise.

Apr 17, 2014

Tips for Training Co-Workers in Excel

A friend of mine said yesterday that knowing Excel is everything you need to know these days. It was a joke, of course, but it had a grain of truth in it. It made me think again about the role of this application in office life and, even more, about the way we acquire the skills. In my opinion the workplace is where we get most of them, and it is in the best interest of organizations to provide a better environment for that. External Excel trainers are good only to an extent, as they are usually far from the business context, not that flexible, do not have enough time outside the classes and are expensive. A better alternative is in-house training by an employee with proven expertise. Another is hiring a trainer who comes from, or is still in, a business similar to yours to do a customized class. I would like to share with you some tips from my experience for a successful in-house Excel class. It is assumed that the class will be held by a non-professional lecturer.

Apr 9, 2014

So You Think You Are THAT Good with Excel?

So you think you are an Excel guru with ninja black-belt skills? I hope you are! The good news is you can challenge the world and test your worth! There are a few Excel contests that give you the opportunity to do just that.

Apr 2, 2014

Things to Consider When Selecting a Data Mining Software

A data mining software package is not something we usually choose. It is either already in place in the company, taught at the university, or the decision is made at the corporate level. However, with the growing number of companies tapping into the power of analytics, the question is probably asked a lot. I have recently been asked for advice on good data mining software and I came up with some points I would like to share with you.

Mar 27, 2014

Excel Tip: Dynamic Named Ranges

Named ranges are a great tool in Excel. They help you write and read formulas faster and make fewer range-related errors, and if you are not using them in complex calculations, you should. You can find directions on defining and using them here. The functionality is simple but you should pay attention to the names you assign to the ranges - you want each name to be meaningful and unique. Otherwise you will waste time checking what is behind a name, which reduces its benefits.

Mar 20, 2014

The Do’s and Don’ts of Data Mining

Checklists and Business Intelligence-like "Ten Things You Need to Know" lists on huge and general topics have dubious value or impact. This is probably because we tend to read them fast and in between other stuff, and because it seems that we already know everything included there. However, sometimes it is really worth pausing to reflect on some of the included topics. KDnuggets recently published an article about the do’s and don’ts of data mining that is a fine example of this. As you might expect, a practitioner will find a lot of "I know that" there, and often it is hard-won knowledge. Still, it is worth stopping to think for a while over the topics, as in the daily hurry and under pressing deadlines we tend to get stuck in a frame and veer off the best course of action.

Mar 12, 2014

Pedestrians Are the New Threat to Nature

The war waged by governments, mindless green-heads, NGOs and a selection of free-roaming lunatics on global warming has had a range of effects on society. One is that the new breed of eco-minded people could finally find financing to go to the rainforest, where they are bitten and die, get lost and die, or take some pictures for their Facebook timeline and return safely home to die from an unknown parasite. Another consequence is that some rich people could get even richer by selling cheap movies or inventing new trading papers. But my point is not about that. It is about the pedestrians and their newfound hate towards cars.

Mar 10, 2014

Defining Big Data

Would you be surprised if I told you there is not one broadly accepted definition of "big data"? Would you be surprised if I told you 99.99% of people using the term have never even tried to find a definition for it? I bet you wouldn't. I know some people who use "big data" for spreadsheets with more than a thousand rows; some others use it every time they talk about something related to data and have no idea what they are talking about - this seems to be a common case, actually. Relying on gut feeling can be misleading. Two computer science students attempted to catalog all the definitions out there. You can see the article at arXiv:1309.5821

Mar 4, 2014

"Trouble with the Curve" and Some Lessons for Analytics

Yesterday I saw Trouble with the Curve starring Clint Eastwood and Amy Adams. It is a movie with a predictable plot and typical characters but still nice to watch. What is interesting is that it unintentionally makes a good point about the applicability of data analytics and statistics in the decision-making process, and reveals some hidden dangers of relying too much on them.

Feb 13, 2014

The P-Value Was Never Meant To Be Used The Way It's Used Today

I have just come across Scientific method: Statistical errors, published on Nature's website. I found it a very good article about the misuse and misinterpretation of p-values and about scientific evidence in general. It says, among other things, that “the P value was never meant to be used the way it's used today”. I was not aware of the history of the introduction of this measure and it was an interesting thing to read. I was not surprised by some of the points about the quality of scientific research, as it has been covered a lot by Taleb, Goldacre and Ioannidis, to mention a few, but it was nice to see the connection between these two. The author suggests an approach to curb the effects of p-hacking in science, but considering the nature of scientific studies these days I do not believe it is applicable to the majority of research. There are other good things to learn in the article, so please read it yourself.

Feb 12, 2014

Ban GDP as A Driver in Regressions!

If there is one thing in analytics that grinds my gears, it is the overuse of GDP as a driver in regression. If it were up to me, I would ask the government and the academy of sciences to issue a book with all approved regression drivers and distribute it as a standard repository of all the knowledge and wisdom there is related to regression. Then one copy of this book, and one copy only, would include GDP. This copy would be in Klingon and, as a precautionary measure, would be booby-trapped with a potent explosive and buried 100m in the ground. On the Moon. On the dark side of it.

Jan 30, 2014

Apply This Method For A Better Life With Spreadsheets

If you are a regular user of Excel for calculations or model building, I bet you spend quite some time proving the calculations are correct or locating the cell formula that is messing up your impeccable design. I have had my fair share of long hours navigating from cell to cell and sheet to sheet in search of what-is-wrong. After my frustration reached critical mass I realized that the quality of my spreadsheets needs to be built into the development process itself. Ever since, I have practiced this belief and the hours spent debugging have been a very small part of the whole process. I will leave aside the errors related to the calculation or model design and focus on the ones typical for Excel - wrong ranges, bad formulas and all the others that have the potential to ruin an otherwise brilliant spreadsheet.

Jan 22, 2014

Friday Funny: How to Look Smart Any Place Any Time

Even a dachshund could look smart!
These days the pressure to look smart in everything, everywhere, all the time is greater than ever. There are also all those moments when we have to or want to look very smart, or as if we give a damn, but we are bored out of our minds or the head is occupied by nothing but an enormous emptiness. Sometimes even glasses and the latest mobile technology cannot help us out. I may have a solution for this! It does not require the purchase of any Apple products, clothing or facial alterations. It is a simple trick that does not require any specific skill or knowledge.

Jan 20, 2014

How to Fail an Analytical Project

While there are many ways to succeed in a project, the roads to its failure seem to be very few. I would like to start a discussion of the key DON'T-s in the specific area of analytical projects. What are the things to consider before starting the project, and to avoid and watch for during it? Read further for some answers.

Jan 9, 2014

The 10 Biggest Myths In Economics - №10 - Economics is a Science

Business Insider has an article worth reading today - The 10 Biggest Myths in Economics. There are some good points and some myth-busting. Number 10 is Economics is a Science. It goes:

"Economics is often thought of as a science when the reality is that most of economics is just politics masquerading as operational facts. Keynesians will tell you that the government needs to spend more to generate better outcomes. Monetarists will tell you the Fed needs to execute a more independent and laissez-faire policy approach through its various policies. Austrians will tell you that the government is bad and needs to be eliminated or reduced. All of these “schools” derive many of their understandings by constructing a political perspective and then adhering a world view around these biased perspectives. This leads to a huge amount of misconception which has led to the reason why I am even writing a post like this in the first place. Economics is indeed the dismal science. Dismal mainly because it’s dominated by policy analysts who are pitching political views as operational realities."