Importing Cleansed Values into a Data Quality Services Knowledge Base

Data Quality Services is a new tool in SQL Server 2012 that helps cleanse data. At the heart of this product are knowledge bases. Knowledge bases are a repository of managed information with correct data values. Once you have a good working knowledge base in place, you can then run ‘dirty’ data against the knowledge base. Fuzzy logic is applied to try and correct, or cleanse, the unclean data and you are presented with suggested clean data which you can approve or reject.

After cleansing data against a knowledge base, the question usually arises. If there are new values in the ‘dirty’ dataset which I manually corrected/approved and are not in the knowledge base, does it automatically get added to the knowledge base? The short answer is no, it doesn’t. While it sounds like a great idea to just keep automatically adding these values to the knowledge base, in reality it’s supposed to be more of a managed process.

Even though new values aren’t automatically added to the knowledge base, DQS does provide the ability to import newly cleansed values into the knowledge base afterwards. Let’s take a look at how this is done. First, select Domain Management on the Knowledge Base.

DQS Domain Management

Select the desired Domain and on the left side. Now, near the top right grab the drop down on the ‘Import Values’ button. You are presented with two choices; Import project values, and Import valid values from Excel. The first option allows you to import values from a previous cleansing project. As stated, the second option allows data to be imported from Excel.

DQS Import Project Values

In our example, we will import values from a previously ran cleansing project. Simply select the desired project from the list.

DQS Import Values From Project

The next screen will present new values which may include Correct values, Errors, and Invalid values. All of which will continue to add value to the knowledge base. In this screenshot, we see that there will be two corrected values (Errors) and one new value (Correct) added.

DQS Imported Values

Click finish to close the import box and let’s take a look at what happened. After looking at the Domain values, we can see that our values have indeed been added and therefore helped improve the knowledge base.

DQS Imported Values Complete

Knowledge bases are very important when cleansing data in DQS. In this example we saw how to continually improve knowledge bases by importing cleansed data. As your knowledge bases continue to evolve, you should also see the quality of cleansed data improve over time.

Advertisement

2 responses to “Importing Cleansed Values into a Data Quality Services Knowledge Base

  1. Jeff,
    Excellent article. What I like about your example: short, simple, right to the point and very visual.
    Way to go 🙂
    Thank you very much for taking the time to put this together.
    David Kats

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s