SQL Saturday #175 Fargo – Recap

Last Saturday was the very first SQL Saturday event in Fargo, ND. It was a great success with about 125 attendees. Thanks to everyone who helped organize the event, to the speakers, and to everyone who attended. Also, a big thanks to everyone who attended my session. The boardroom-style setup was a first for me, but it was kind of cool! I felt like an executive running a meeting. 🙂

This was also a SQL Saturday first because we had a special guest appearance by the man himself, Mr. Bill Gates!!

Bill Gates

Well, in all honesty, he was actually in the same building to talk to the Microsoft employees. However, they were kind enough to let us hear him speak. How cool is that?!

The event was held at the Microsoft Executive Briefing Center, which was extremely nice. The Microsoft employees onsite who helped with the event, such as Kelly Obach, were also super awesome.

Microsoft Executive Briefing Center Fargo, ND

All in all, a fantastic time. There was talk of having it again next year so hopefully I will be back again soon.

SQL Saturday #189 Costa Rica – Recap

Last Saturday was the SQL Saturday #189 Costa Rica event, and what an awesome event it was! Thanks to everyone who helped make it such a great event: the organizers, the sponsors, the attendees, and the speakers. Being well into April, it was quite a shock to come back to Minnesota and receive another 9 inches of snow. I should have stayed in Costa Rica a bit longer. 😀

Friday night the group went out to a restaurant called Mirador Tiquicia in Escazu. It was a bit of a drive and up a very steep hill, but it was worth it! From the top there was an amazing view of the entire city of San Jose. Here is a glimpse of the view with my wife and me.

city lights of san jose

For dinner we had a delicious buffet of traditional Costa Rican food. Then, traditional Costa Rican dancers performed in the middle of the restaurant. What a great night!

Costa Rican dancers

On Saturday, it was time for the big event. There were 320 registered attendees, 257 of whom attended. The conference took place at the Aurola Holiday Inn in San Jose. Here you can see Eduardo Castro making the welcome announcements. In this picture he’s introducing the speakers for the day; I’m at the bottom of the slide.

sql saturday costa rica welcome

My two sessions were in the afternoon, so I mingled and attended the morning sessions. Then it was off to lunch, which was on an upper floor of the hotel. Again, we were treated to a nice view of San Jose and a great lunch. Here are some pictures of San Jose from the dining area.

San Jose

San Jose

After a short break, it was time to get to work. Here I am getting ready for my first session.

Getting ready for my sessions

I first presented on Master Data Services, and then on Data Quality Services. Thank you to everyone who attended my sessions and for being such a great crowd. The questions were good and hopefully everyone left knowing a bit more about MDS and DQS.

presenting on master data services

I had such a great time and met so many fun people: Eduardo Castro, Kenneth Ureña (thanks for the coffee, it was awesome!), Joaquin Zuñiga, Kevin Boles, Luis Carlos Diaz, Alex Vargas, Jesus Gil, Russell Fustino, and many others! Latin America has a great SQL Server community. Thank you for letting me be a part of this event. Hopefully I’ll be able to attend again next year!

Comparing Real and Float Data Types to Decimal

On past projects I’ve been in situations where I needed to decide between using an approximate data type (float or real) or a precise data type (numeric or decimal). If you look up the descriptions on MSDN, you will find that the decimal and numeric data types are functionally equivalent and have a fixed precision and scale. This means that during design you must anticipate and define the maximum number of total digits, as well as the maximum number of decimal places. While this may be perceived as a downside in terms of flexibility, these types make up for it by being exact.
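
For example, decimal(5,2) can hold at most five total digits, two of them after the decimal point; assigning a value with more digits raises an arithmetic overflow error. A quick sketch:

DECLARE @Price AS decimal(5, 2)
SET @Price = 999.99 -- fits: five digits total, two decimal places
SELECT @Price AS MaxValue
-- SET @Price = 1000.00 -- would fail: six digits do not fit in decimal(5,2)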

On the other hand, you have float and real, which are quite flexible: you do not need to define a precision or scale. However, they are not nearly as accurate. And that is where the question arises: what does it mean to not be accurate? Let’s take a look.

Let’s run the queries below and look at the difference the data type makes on aggregations.

USE AdventureWorksDW
GO

-- Sum the same column as float, real, and decimal, then compare the aggregates
SELECT SUM(CAST(SalesAmount AS float)) AS SalesAmount_Float
, SUM(CAST(SalesAmount AS real)) AS SalesAmount_Real
, SUM(CAST(SalesAmount AS decimal(15,4))) AS SalesAmount_Decimal
, FloatDifference = SUM(CAST(SalesAmount AS decimal(15,4))) - SUM(CAST(SalesAmount AS float))
, RealDifference = SUM(CAST(SalesAmount AS decimal(15,4))) - SUM(CAST(SalesAmount AS real))
FROM FactInternetSales

SELECT SUM(CAST(TotalProductCost AS float)) AS TotalProductCost_Float
, SUM(CAST(TotalProductCost AS real)) AS TotalProductCost_Real
, SUM(CAST(TotalProductCost AS decimal(15,4))) AS TotalProductCost_Decimal
, FloatDifference = SUM(CAST(TotalProductCost AS decimal(15,4))) - SUM(CAST(TotalProductCost AS float))
, RealDifference = SUM(CAST(TotalProductCost AS decimal(15,4))) - SUM(CAST(TotalProductCost AS real))
FROM FactInternetSales

Here are the results.

As you can see, the float and real sums are indeed different from the decimal sum. Keep in mind that this is a relatively small number of records (about 60,000); the more data you have, the larger the variance can become. In our original data, the values have at most four decimal places, yet in our query the float and real sums somehow end up with seven or eight decimal places.

OK, let’s have some more fun. In this query, we will pick some values and add them up to see what happens.

DECLARE @TestValue1 AS nvarchar(80)
DECLARE @TestValue2 AS nvarchar(80)
SET @TestValue1 = '1111.1114'
SET @TestValue2 = '1111.1114'

-- Add the same value to itself as float, real, and decimal
SELECT
CAST(@TestValue1 AS float) + CAST(@TestValue2 AS float) AS FloatSum
,CAST(@TestValue1 AS real) + CAST(@TestValue2 AS real) AS RealSum -- drops a decimal place and rounds
,CAST(@TestValue1 AS decimal(15, 4)) + CAST(@TestValue2 AS decimal(15, 4)) AS DecimalSum


In this query, you can see that the real data type quickly starts to fall apart: it drops a decimal place and rounds up. This happens because real carries only about seven significant digits, and 1111.1114 contains eight.
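
To see what the approximate types actually stored, you can cast the values back to a high-scale decimal. A quick sketch (the exact digits you see depend on the underlying binary representation):

-- Reveal the values that real and float actually stored for '1111.1114'
SELECT
CAST(CAST('1111.1114' AS real) AS decimal(20, 10)) AS RealStored
,CAST(CAST('1111.1114' AS float) AS decimal(20, 10)) AS FloatStored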

In this query we will increase the precision and scale to see what happens.

DECLARE @TestValue3 AS nvarchar(80)
DECLARE @TestValue4 AS nvarchar(80)
SET @TestValue3 = '111111.11111'
SET @TestValue4 = '111111.11111'

SELECT
CAST(@TestValue3 AS float) + CAST(@TestValue4 AS float) AS FloatSum
,CAST(@TestValue3 AS real) + CAST(@TestValue4 AS real) AS RealSum
,CAST(@TestValue3 AS decimal(15, 4)) + CAST(@TestValue4 AS decimal(15, 4)) AS DecimalSum -- note: decimal(15,4) keeps four decimal places, so the inputs' fifth decimal is rounded away

In this query, float somehow adds another decimal place. Float stores values in binary, and 111111.11111 has no exact binary representation, so a tiny error surfaces in the sum. Real is having even more trouble: we lost three decimal places, since most of real’s roughly seven significant digits are consumed by the integer portion.

Alright, let’s try one more. This time we will change the values a bit.

DECLARE @TestValue5 AS nvarchar(80)
DECLARE @TestValue6 AS nvarchar(80)
SET @TestValue5 = '1111111.44444'
SET @TestValue6 = '1111111.44444'

SELECT
CAST(@TestValue5 AS float) + CAST(@TestValue6 AS float) AS FloatSum
,CAST(@TestValue5 AS real) + CAST(@TestValue6 AS real) AS RealSum
,CAST(@TestValue5 AS decimal(15, 5)) + CAST(@TestValue6 AS decimal(15, 5)) AS DecimalSum


In this query, we found the tipping point for real: we lost all of the decimal places and the value rounded up again. With seven digits in the integer portion alone, real has no precision left for the decimals. Float, however, managed to pull through successfully.

As you can see, your mileage may vary with float and real, but if you are looking for accurate calculations you would be well advised to use either decimal or numeric!
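
As one last, classic illustration, here is a minimal sketch that sums 0.1 ten times. Because 0.1 has no exact binary representation, the float total lands very close to, but not exactly at, 1, while the decimal total is exactly 1.0000.

DECLARE @FloatSum AS float = 0
DECLARE @DecimalSum AS decimal(10, 4) = 0
DECLARE @i AS int = 1
WHILE @i <= 10
BEGIN
SET @FloatSum = @FloatSum + CAST(0.1 AS float)
SET @DecimalSum = @DecimalSum + CAST(0.1 AS decimal(10, 4))
SET @i = @i + 1
END

-- The equality check below returns 'not equal' for the float total
SELECT @FloatSum AS FloatSum
, @DecimalSum AS DecimalSum
, CASE WHEN @FloatSum = 1.0 THEN 'equal' ELSE 'not equal' END AS FloatEqualsOne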

Importing Cleansed Values into a Data Quality Services Knowledge Base

Data Quality Services is a new tool in SQL Server 2012 that helps cleanse data. At the heart of this product are knowledge bases: repositories of managed information containing correct data values. Once you have a good working knowledge base in place, you can run ‘dirty’ data against it. Fuzzy logic is applied to try to correct, or cleanse, the unclean data, and you are presented with suggested clean values which you can approve or reject.

After cleansing data against a knowledge base, a question usually arises: if there are new values in the ‘dirty’ dataset which I manually corrected or approved and which are not in the knowledge base, do they automatically get added to the knowledge base? The short answer is no, they don’t. While it sounds like a great idea to just keep automatically adding these values to the knowledge base, in reality it’s meant to be a managed process.

Even though new values aren’t automatically added to the knowledge base, DQS does provide the ability to import newly cleansed values into the knowledge base afterwards. Let’s take a look at how this is done. First, select Domain Management on the Knowledge Base.

DQS Domain Management

Select the desired Domain on the left side. Now, near the top right, click the drop-down on the ‘Import Values’ button. You are presented with two choices: Import project values, and Import valid values from Excel. The first option allows you to import values from a previous cleansing project; the second allows data to be imported from Excel.

DQS Import Project Values

In our example, we will import values from a previously run cleansing project. Simply select the desired project from the list.

DQS Import Values From Project

The next screen presents the new values, which may include Correct values, Errors, and Invalid values, all of which continue to add value to the knowledge base. In this screenshot, we see that two corrected values (Errors) and one new value (Correct) will be added.

DQS Imported Values

Click Finish to close the import box and let’s take a look at what happened. Looking at the Domain values, we can see that our values have indeed been added, helping improve the knowledge base.

DQS Imported Values Complete

Knowledge bases are very important when cleansing data in DQS. In this example we saw how to continually improve knowledge bases by importing cleansed data. As your knowledge bases continue to evolve, you should also see the quality of cleansed data improve over time.

Installing Master Data Services in SQL Server 2012

Master Data Services provides a central location where you can store non-transactional information. This repository serves as the primary source for accurate data within your organization. In this post I will show you how to install Master Data Services for SQL Server 2012. First, launch the ‘Master Data Services Configuration Manager’ from the program menu.

MDS Config Manager

You will be presented with the main screen, like the one shown below. If either of the two prerequisites, IIS and PowerShell, doesn’t check out, you will need to install it before continuing with the installation.

MDS Config Manager Summary

Once you have PowerShell and IIS installed, select Database Configuration from the menu on the left.
On this screen you will be presented with two options: create a new database, or select an existing MDS database. If you later decide to modify settings on your MDS database, you can select the existing database from this screen, make your adjustments, and apply the changes. Since we are showing a new installation, let’s go ahead and select ‘Create Database’.

MDS Config Create Database

Now you will be presented with the Create Database wizard, which has several steps. First, we need to connect to the SQL Server instance where we want to install the MDS database. Select your authentication preference and enter the User name and Password to connect, then click the Next button.

MDS Database Wizard Connect to Server

On the ‘Database’ tab of the wizard, enter a Database name. You can also configure settings such as collation. Once you are satisfied with the name and settings, click Next.

MDS Database Wizard Database Name

The next screen is where you enter the User name for the Administrator Account. Note: once you click the Next button, the wizard will try to authenticate the entered user account against a domain controller, so make sure you can reach the domain controller. If you cannot, the wizard will raise an error and you cannot continue with the install.

MDS Database Wizard Administrator Account

Next is the ‘Summary’ screen, and finally the ‘Progress and Finish’ screen.
If everything checks out, you should see a success page like the one below.

MDS Database Wizard Success

Click Finish and you will be brought back to the Database Configuration section. This time you will have a database listed, and in the System Settings section you can make adjustments. Simply make your changes and click the Apply button.

MDS Selected Database

Next, we will configure IIS which will serve as one of the primary interfaces for MDS. Click on the Web Configuration option on the left. On this screen you can match up an IIS website with an MDS database. First we need to find a suitable IIS website for MDS. From the dropdown you can either select an existing website, or create a new one. Since we are doing a new install, we will select ‘Create new website’.

MDS Web Configuration Screen

Select a Website name and port. If MDS is the only website you plan on hosting from this server, you may want to leave it on port 80. If you host multiple sites, or are testing, feel free to enter a new port number. In our case we are going to use port 991.

MDS Create Website

You can leave Protocol and IP Address as the default. However, you will need to add values to the Application Pool area. Select a name for your application pool, and provide a User name and Password. Once finished, click the OK button.

You should now have a Website listed. You should also have a Database listed under the ‘Associate Application with Database’ section. If you do not, simply click the Select button and find the MDS database you just created. You should now see a screen similar to the one below, with an active ‘Apply’ button available to you. If you do, excellent; go ahead and click Apply. This is where the leap of faith comes in: the install can take about a minute to finish, and there is no progress bar.

MDS Install Finish

After patiently waiting, you should finally be presented with a webpage like the one shown below. This is the Getting Started page, which offers some helpful links to get you going.

MDS Getting Started Page

Once you are ready to dig in, click the first link at the top, ‘Open the Master Data Manager home page’.
This will bring you to the actual MDS home page, where you can start working with MDS. I will cover those areas in more detail in future posts.

Even though the Getting Started page comes up, you will still have the last screen on the MDS installation open. If you have Data Quality Services installed and want to link it with MDS, this would be the time to do so. You would simply click on the now highlighted button called ‘Enable Integration with Data Quality Services’. More details about connecting MDS and DQS can be found on my other blog post here.

When you are done with the MDS configuration manager, simply click Exit and you’re ready to start using MDS!

Using the ‘Quick Analysis’ Feature in Excel 2013 for Some Quick, yet Powerful, Analytics

Last November while I was at PASS Summit, one of the presenters used Excel and added some data bars right on top of the data. I thought, ‘Wow, that’s cool!’ This is a new feature in Excel 2013 called ‘Quick Analysis’. This handy little tool offers some really quick and easy, yet powerful ways to visually analyze data. Let’s take a look at what it can do.

First, pick a set of data to analyze. For this example we are going to analyze some January metrics over a five-year span. Highlight the actual data, not the headers. Notice in the bottom right there is a little icon; if you hover over it, you will see that it is the Quick Analysis tool. You can also use the Ctrl + Q shortcut.

quick analysis selecting data

The first menu option that pops up is the Formatting tab which has some neat features. The first option on the left is Data Bars. This gives a nice visual bar chart representation of each data value.

quick analysis data bars

The next option is Color Scale. This colors the data with red representing the low values, white for mid-range values, and green for high values. With a quick glance at the data we can easily see which values are high and which ones are low.

quick analysis color scale

The third option is Icon Set. KPI icons are shown such as a red arrow pointing down for low values, a yellow right-facing arrow for mid-range values, and a green arrow pointing up for high values.

quick analysis icon set

The fourth option is called Greater Than. This highlights in red the values above a threshold you specify.

quick analysis greater than

Finally, the last formatting option is Top 10%. This highlights in red the top 10% of values.

quick analysis top 10 percent

Now let’s take a look at some of the other tabs in Quick Analysis, jumping first to the last tab, Sparklines. On this tab you have three options: Line, Column, and Win/Loss. Each adds a sparkline to the right of your data that indicates the trend: up, down, or steady. In this screenshot we selected Column, which shows a mini column chart to the right of the data.

quick analysis sparkline columns

If we jump over to the Totals tab, we can quickly add totals to the bottom of our data, such as Sum, Average, Count, % Total, Running Total, and more. In this example we are adding a % Total.

quick analysis percent total
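
Incidentally, if you ever need the same kind of totals on data stored in SQL Server, window functions can produce them. Here is a minimal sketch, assuming a hypothetical DailyMetrics table with a date column and a value column:

SELECT MetricDate
, MetricValue
, MetricValue * 100.0 / SUM(MetricValue) OVER () AS PercentOfTotal -- the % Total calculation
, SUM(MetricValue) OVER (ORDER BY MetricDate ROWS UNBOUNDED PRECEDING) AS RunningTotal -- the Running Total calculation
FROM DailyMetrics
ORDER BY MetricDate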

As you can see, our bland set of data is quickly turning into something we can easily analyze. To top it off, why not add a quick chart? For the chart, select all of the entered data, including the headers. On the Charts tab we are presented with options such as Line, Stacked Area, Clustered Column, and Stacked Column, along with a preview of each.

quick analysis line chart

Once selected, the chart is added to the worksheet; the screenshot below shows the final product. One last tab that we didn’t show is Tables. There are two options on this tab: Table and Blank PivotTable. Table, in our case, just adds a filter to our table headers. The second option, Blank PivotTable, creates a PivotTable from our data on a new worksheet. One last thing to note is that you can stack the Formatting options. For example, you can combine a Color Scale with Data Bars and an Icon Set. It starts to get a little cluttered, but at least you have options.

quick analysis final

Let’s review what Quick Analysis just helped us do. We can tell the annual growth % from the totals at the bottom, the sparklines show each metric’s trend, the color formatting lets us easily identify high and low values, and the chart gives us a great visual comparison of all of the metrics together. Talk about some powerful visuals, and it only took about a minute to do! Quick Analysis is a powerful new tool in Excel 2013 that provides a quick and easy way to analyze data.

Adding a 12 Month Trendline in Excel 2013

Trends over time can be a great way to track business health and goals. It’s common to track such metrics on a daily basis throughout the year, and week by week you may see several peaks and valleys. Adding a trendline helps smooth out the spikes and makes it easier to analyze how the values trend throughout the year. In this post, I will show how to add one.

First, let’s start with collecting data. You should have one column containing each day of the year, and another column with the metric values you want to track.
Trend Line Data

Next, highlight all of the data values but do NOT highlight the column names. On the ‘Insert’ tab, click ‘Line Chart’. Your chart should look like the one displayed here.

Insert Line Chart

Now that we have a line chart, let’s add the trendline. Right-click the chart line and click ‘Add Trendline’.

Add Trendline

The Trendline Options are now displayed. There are three tabs to work with: ‘Fill & Line’ (left), ‘Effects’ (middle), and ‘Trendline Options’ (right).
We’ll take a look at each tab, but let’s first start with the rightmost tab, Trendline Options. Since we are adding an annual trendline, select the ‘Moving Average’ option and set the Period to 12 for 12 months.

Moving Average Trendline
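
As an aside, if your source data lives in SQL Server, the same 12-period moving average can be computed with a window function before the data ever reaches Excel. Here is a minimal sketch, assuming a hypothetical MonthlySales table with one row per month (window frames like this require SQL Server 2012):

SELECT SalesMonth
, SalesAmount
-- average of the current month and the 11 months before it
, AVG(SalesAmount) OVER (ORDER BY SalesMonth ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS MovingAvg12Month
FROM MonthlySales
ORDER BY SalesMonth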

Next, let’s jazz up the line a bit. Click on the left tab for ‘Fill & Line’ options. Here you can select all kinds of line options such as style, size, and color.

Trendline line options

The middle tab is for ‘Effects’, where you can add things such as glow, shadow, and soft edges. A couple of nice ones to set here are Presets and Color. Under Presets, select the top-left Outer option, ‘Offset Diagonal Bottom Right’. Under Color, select black to give a nice drop-shadow effect when combined with the Presets option mentioned above.

Trendline effects

After you are done formatting the trendline to your liking, simply close the Format Trendline options and you should now have a nice annual trendline.

Annual Trendline

SQL Saturday #175 – Fargo, North Dakota

SQL Saturday #175

Saturday, April 27th I will be presenting at SQL Saturday #175 in Fargo, North Dakota. This event will be held at the Microsoft Executive Briefing Center at 4550 42nd Street South, Fargo ND, 58104.
I will be speaking on the following topic:

Data Quality Services
As the old adage goes, ‘Garbage in, garbage out.’ Stop spending hours manually cleansing your data, and start using DQS! DQS is the new data quality tool provided with SQL Server 2012. By leveraging knowledge bases, you can correct invalid data and improve the quality of your data.

In this session we will cover creating a knowledge base, data cleansing, data matching, and using 3rd party knowledge bases (Reference Data). We will talk about installation, show how to use the DQS tool, and how to integrate DQS into your SSIS packages. After this session you should have a good understanding of what DQS is, how to use it, and how to get started with your own Data Quality initiative.

Check out the SQL Saturday #175 page for more information.

See you at the event!

Bienvenidos al SQL Saturday #189 – Costa Rica

SQL Saturday #189

I’m excited to announce that I will be speaking at SQL Saturday #189 in beautiful Costa Rica! This event will be held on April 13th, 2013 at the Aurola Hotel Holiday Inn San Jose Downtown.
I will be presenting on the following two topics:

Master Data Services
Does your company have similar data in multiple locations? Not really sure which set of data is accurate? With MDS, Microsoft gives you the ability to have one accurate source of non-transactional data to use with your applications or send to other databases and ETL processes. Leverage data stewards to help manage the data repository to ensure it is up-to-date and accurate. In this session we will cover MDS from start to finish including installation, creating models/entities, working with data, using the Excel add-in, security, hierarchies, views, versions, and business rules. After this session, you should have a good understanding of what MDS does, how to use it, and how to get started with your own MDS project.

Data Quality Services
As the old adage goes, ‘Garbage in, garbage out.’ Stop spending hours manually cleansing your data, and start using DQS! DQS is the new data quality tool provided with SQL Server 2012. By leveraging knowledge bases, you can correct invalid data and improve the quality of your data.

In this session we will cover creating a knowledge base, data cleansing, data matching, and using 3rd party knowledge bases (Reference Data). We will talk about installation, show how to use the DQS tool, and how to integrate DQS into your SSIS packages. After this session you should have a good understanding of what DQS is, how to use it, and how to get started with your own Data Quality initiative.
Last November I was fortunate enough to meet several chapter members at the PASS Summit event in Seattle. I look forward to the opportunity to meet again in San Jose this April.

Check out the SQL Saturday #189 page for more information.