A very neat little trick to compare tables of data in Microsoft Excel to reveal differences, is to highlight the data (all rows and columns containing data), and Define a name for the data, then use an ODBC Driver for Excel to subsequently query the tables as if they were a database.
Each tab in the Excel Workbook can contain a different table of (similar) data and the tables can be linked in Microsoft Query Editor so that SQL queries can be performed against the linked tables to reveal all rows of data where the same information exists in two separate columns of two separate tables, or even more useful (in my experience) the rows of data where the data that exists in one column of one table doesn’t exist in the other.
In Windows 7, the Data Sources (ODBC) management console will only show drivers for SQL Server, and not the list of different drivers seen in previous versions of Windows.
In order to create an Excel Data Source, you need to close the 64-bit version of the management console and run the following instead..
In the 32-bit odbcad32.exe (shown below), upon clicking Add, you’ll see the long list from which you can choose Excel Driver and point it at your spreadsheet that has the defined data within it.
Note, if you intend to use this method to query the data in multiple Excel tables using the Query Analyzer, you will have to save a copy of the workbook in the older Excel 97-2000 .xls format before you create an ODBC Data Source for it.
Upon returning to Excel, Create a new tab in the same workbook or create a new Workbook if you like, and Open Query Analyzer as shown…
Choose the Data Source Name (DSN) created using the 32 bit ODBC dialog, choose what tables (Definitions) you wish to include in your query and manually join the tables containing the similar data, then edit the SQL statement as required (usually just changing = for <> produces the sort of results I’m looking for, i.e. differences between the two columns, not similarities.
My personal preference when it comes to comparing columns of data is to export the columns to separate text files, WinSCP the text files onto a Linux/UNIX machine, then use cat | sort | uniq on each file, then comm (not diff) to perform the comparison and show entries in one but not the other of the two files being compared. I’ll endeavour to cover this method in another post to compliment this Windows 7 oriented post.
[paypal-donation]
Dear Matthew,
Thanks immensely for this your article. Please I am trying to set up odbc to allow me query excel table as part of my class work. So when I saw this your work, I thought it will resolve the issue but I ran other complications. Like you rightly pointed out, the excel drivers I need is not listed in the data sources management console in my windows 7. So I closed it as you advised and used the C:\windows\sysWOW64\odbcad32.exe but when I click add in the odbc data source administrator, the create new data source dialog box that comes up does not contain a long list of drivers. The first impression I had was that probably the excel drivers I need are simply not installed in my system. However the drivers are there in the 32 bit area of the system. How do I get these drivers to start showing up in the create new data source window as you illustrated? Thanks immensely for your help.
Best regards,
Anthony
Hi Antony. I don’t have a Windows 7 machine to hand, so I can’t practically go through the motions of solving your problem for you – I prefer to do it for real, then document it knowing my advice is sound. However, using “Everyones favourite Search Engine”, I can tell you that the official response from Microsoft to your problem is to be found here. Good luck. It’s a pain isn’t it?
http://support.microsoft.com/kb/2721825/EN-US
Best regards
Matt Bradley
Thanks Immensely Matthew! I will follow the link and later let you know of the outcome. Many thanks again for your help.
Regards,
Anthony