First of all, I would like to thank you for reading this guide. It was written in a specific way so that a junior SQL Server DBA can learn from it, and it even touches on some advanced features as well.
This guide is written like a mountain that you climb from the base up. Some subjects were considered out of scope for this guide as they deviate from the focus; if you find a missing piece, I am sure that MSDN is the right place to look. For the examples in this guide, I am using SQL Server 2014 (Enterprise Edition), and the guide refers to the features existing in this version and edition. Basically, I kept things simple without putting you through all of the bits and bytes, but you can expect some from time to time. We will go through the basic SSISDB architecture, its pros and cons, and some other specifications. Then we will see how we can try to avoid future mistakes and how we can monitor our activities. Along the way we will focus on debugging SSISDB errors because… errors will always happen. A good virtue for a DBA is to prevent future mistakes and errors, while a great virtue of a DBA is knowing how to bounce back and repair the problem. Thus, the final part will focus on debugging.
Given that SSISDB is relatively new to the database engine in SQL Server, I have several expectations from you, the DBA. I expect you to fully understand what ACID is, as it affects both the performance of and the errors produced by a running SSIS package. The most common example is a deadlock, which is a pure result of a good ACID mechanism. I won't dive into the ACID mechanism right now, but you can easily find extensive learning materials on MSDN. You also need to know your way around SSMS, as most of the actions here are made through the SSMS graphical interface. That being said… let's go and dive into the really interesting stuff. I hope you enjoy it.
The first step in climbing the mountain that this guide offers is to understand the basic architecture of the SSISDB. Although it is stored on the database engine, SSISDB has a different architecture; it is actually split into two main components. The first one is the SSISDB database, which is a database like any other, with its own MDF and LDF files. It is bound to the same rules as any other database (such as the recovery model, compatibility level, etc.) and therefore has the same virtues as any normal database (such as backups). The second part is the SSISDB catalog itself, which is seen as the “Integration Services Catalogs” node in SSMS (we will get to this part later).

The first important thing to know is that the database containing the SSISDB is called “SSISDB”. It appears under the databases node of the database engine and, as we already said, it is a database just like any other. It contains the system tables required for the normal operation of the Integration Services engine running the deployed SSIS packages, and it holds the metadata for package deployment and execution. Make no mistake: these tables have nothing to do with the SQL Agent metadata that might execute the SSIS package. The metadata for the SQL Agent is stored only in the MSDB system database, and there is no direct relation between the metadata stored in MSDB and what is stored in SSISDB. As I mentioned before, I won't get you into all of the bits and bytes, but there is one important fact we need to know about the SSISDB: it has two schemas, called “catalog” and “internal”, which can be managed for security purposes. If you look closer, you will see that the tables are stored in the “internal” schema while the metadata views are created in the “catalog” schema. The general reason is permissions: who may write to, and who may only read from, the database.
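If you want to see this split for yourself, a quick query of the system catalog inside SSISDB will do; this is a minimal sketch that only counts user tables and views per schema:

```sql
USE SSISDB;
GO
-- Tables live under the "internal" schema; the public
-- metadata views are created under the "catalog" schema.
SELECT s.name AS schema_name,
       o.type_desc,
       COUNT(*) AS object_count
FROM sys.objects AS o
JOIN sys.schemas AS s ON s.schema_id = o.schema_id
WHERE s.name IN (N'internal', N'catalog')
  AND o.type IN ('U', 'V')   -- user tables and views only
GROUP BY s.name, o.type_desc
ORDER BY s.name, o.type_desc;
```

You should see the tables grouped under “internal” and the views under “catalog”, matching the description above.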
It is expected that you, the DBA, would want to read some metadata, but it is not expected that anyone other than the system itself would write or update data. Therefore, you can assign permissions on the “catalog” schema to enable metadata investigations.
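As a sketch of that idea, a read-only grant on the “catalog” schema could look like this (the user name is a placeholder; substitute your own login):

```sql
USE SSISDB;
GO
-- Hypothetical Windows login used for metadata investigations.
CREATE USER [DOMAIN\MetadataReader] FOR LOGIN [DOMAIN\MetadataReader];

-- Read-only access to the metadata views; nothing is granted
-- on the "internal" schema, so the tables stay protected.
GRANT SELECT ON SCHEMA::catalog TO [DOMAIN\MetadataReader];
```

This gives the user SELECT on every view in the “catalog” schema while leaving the “internal” tables untouched.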
As with any other database, and as said before, you MUST have backups of it. This database stores everything there is about your SSIS actions: history, metadata and any other information. Even the SSIS packages themselves are stored inside this database, in a binary format. That means that if you lose your database, you also lose your deployed packages. Although you can back up the database using a full backup and then take differential ones, for this database it is recommended that you always take full backups. The main reason is that if a package deployment transaction is in progress at the time of the backup, the restore may fail (if it uses roll forward) and you will lose the entire package definition. More than that, you may even get your database stuck in a prolonged restoring state, disabling all other operations. My general recommendation is to use the simple recovery model with this database and take full backups from time to time.
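The recommendation above can be sketched in two statements (the backup path is a placeholder for your own backup location):

```sql
-- Switch SSISDB to the simple recovery model, as recommended above.
ALTER DATABASE SSISDB SET RECOVERY SIMPLE;
GO
-- Take periodic full backups; path and options are illustrative.
BACKUP DATABASE SSISDB
TO DISK = N'D:\Backups\SSISDB_full.bak'
WITH INIT, COMPRESSION, CHECKSUM;
```

Schedule the full backup as often as your deployment frequency requires; with the simple recovery model there is no log backup chain to maintain.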
In past versions of SQL Server (such as 2008), you could deploy a single package into the catalog without all of its wrapping. Today (SQL Server 2014), the deployment model has changed in order to accommodate a full Visual Studio project. The main reason behind this change is to make the migration from development to deployment as simple as possible for SSIS developers. When creating an SSIS solution in Visual Studio, you can create or change many objects such as packages, parameters and resources. In order for you, as the developer, to better pack your project, the outcome of the project build process is a file with the extension “.ispac”. This file, which is actually a simple compressed file, contains everything you need to deploy it to the SSIS catalog: all of the packages, the connection managers, and the package/project level parameters and variables.
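Besides the SSMS deployment wizard, the ispac can also be deployed from T-SQL with the catalog.deploy_project procedure. A minimal sketch, assuming a folder named “BI” already exists and the file path and project name are placeholders:

```sql
USE SSISDB;
GO
-- Read the built .ispac from disk into a varbinary stream.
DECLARE @project_binary varbinary(max) =
    (SELECT BulkColumn
     FROM OPENROWSET(BULK N'D:\Builds\MyETL.ispac', SINGLE_BLOB) AS b);

-- Deploy (or overwrite) the project in the catalog.
EXEC catalog.deploy_project
     @folder_name    = N'BI',
     @project_name   = N'MyETL',
     @project_stream = @project_binary;
```

This is exactly what makes scripted, repeatable deployments possible from the DBA side, without opening Visual Studio at all.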
If you need to see or extract the files in the “ispac” file, you can always just change the extension of the file to “zip”, extract the files, and then import the packages into a new Visual Studio project.
Now that we have covered the past to present and the general layout of the SSISDB, let's examine the second part of this guide: the Integration Services catalog.
When you first log in to the database server using SSMS, you will see a tree item called “Integration Services Catalogs”. This tree item contains the objects of your SSISDB database; it is a visual, managed reflection of all your metadata in the SSISDB database. It is not created automatically: you need to create it from scratch or import (restore) an existing one. Always remember that each SQL Server instance can hold only one SSISDB; you cannot manage two catalogs on the same instance.
The integration catalog has several object types and a main hierarchy. Let's take a look at the general structure of the catalog so we can better understand it:
Integration Catalog
    Folder
        Project
            Package
        Environment
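The same hierarchy can be walked from the metadata views. This sketch joins the folder, project and package views to list everything deployed on the instance:

```sql
USE SSISDB;
GO
-- Walk the folder -> project -> package hierarchy
-- straight from the "catalog" schema views.
SELECT f.name AS folder_name,
       p.name AS project_name,
       pk.name AS package_name
FROM catalog.folders  AS f
JOIN catalog.projects AS p  ON p.folder_id   = f.folder_id
JOIN catalog.packages AS pk ON pk.project_id = p.project_id
ORDER BY f.name, p.name, pk.name;
```

The result mirrors what you see in the SSMS tree, one row per deployed package.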
As you can easily see, the first object that can contain or own other objects is a folder. The folder is a logical group of projects (ispac files) that you can manage with permissions. The whole idea is to give permissions to the folder and have those permissions inherited by the projects. For example, the BI manager would have full permissions for a folder called “BI”, giving him all of the permissions needed to manage the projects under this folder's scope.
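As a sketch of that scenario, the folder can be created and secured from T-SQL with catalog.create_folder and catalog.grant_permission. The numeric codes below are from memory of the catalog.grant_permission documentation, so verify them against your version before relying on them; the folder and user names are placeholders:

```sql
USE SSISDB;
GO
-- Create the "BI" folder and capture its id.
DECLARE @folder_id bigint;
EXEC catalog.create_folder @folder_name = N'BI',
                           @folder_id   = @folder_id OUTPUT;

-- Grant READ on the folder to the BI manager's database user.
-- object_type 1 = folder; permission_type 1 = READ (check the docs).
EXEC catalog.grant_permission
     @object_type     = 1,
     @object_id       = @folder_id,
     @principal_type  = 1,
     @principal_id    = DATABASE_PRINCIPAL_ID(N'DOMAIN\BIManager'),
     @permission_type = 1;
```

Grants made at the folder level flow down to the projects inside it, which is exactly the inheritance described above.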
Before diving into the project, you probably noticed that there are two entities living inside a folder. The first one is the project and the second one is the environment (we will dive into both of them shortly). It is important to know that the environment is not part of the project; it is an independent entity in the catalog. The project is the main feature of everything, and this is the part that is executed (well… a package from the project, but we will get to that). Do you remember the outcome from Visual Studio that you got from your developer? That “ispac” file? The project entity is the outcome of the ispac deployment process. The deployment process takes all of the packages (each of which includes its own connection managers) and the parameter definitions, which include the project level parameters/variables, and places them inside a project leaf (in the SSMS tree). The project name will be the same as the name the developer gave it in Visual Studio. One of the most important things to remember about the project deployment process is that it overwrites everything the last project contained. That means that if your project contains three packages and you deploy the same project again, but you removed one package during development, your new project will not have the third one. It is always recommended to take a backup of the project before deploying the new one. To export the project, right click on it and choose “Export”; that will generate an ispac file for you. Another important entity in the project, and the only one which has nothing to do with Visual Studio (and that's where you, the DBA, need to take care), is the environment reference. Each project can reference an environment in order to use its values. A project can reference more than one environment, and if your permissions allow it, the project can refer to environments outside of the project's scope/folder.
Generally speaking, permissions are subject to change from time to time, so it is best to keep the values you need in an environment within your own scope so that no future problems arise.
The environment entity is a kind of list of names and values. Think about it as a “key-value” pair list: it contains variable names and their pre-defined values. For example, a variable might be called “PhoneNumber” and its value can be “111-222-333”. The whole point of the environment is to provide consistent values to the project and to ease deployments between development environments (that means DEV / QA / PROD). The environment variables can overwrite any project variables/parameters. Finally, it is important to mention that when deploying a project that already exists in the SSISDB, the project's reference to the environment is not overwritten. If the newly deployed project contains variables that did not exist in the previous version, these variables will not have any “key-value” name (environment variable) assigned to them.
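Everything described here can also be done from T-SQL. A sketch using the “PhoneNumber” example from the text (folder, project and environment names are placeholders):

```sql
USE SSISDB;
GO
-- Create an environment inside an existing folder.
EXEC catalog.create_environment
     @folder_name      = N'BI',
     @environment_name = N'PROD';

-- Add one "key-value" pair to it.
EXEC catalog.create_environment_variable
     @folder_name      = N'BI',
     @environment_name = N'PROD',
     @variable_name    = N'PhoneNumber',
     @data_type        = N'String',
     @sensitive        = 0,
     @value            = N'111-222-333',
     @description      = N'Sample variable from the text';

-- Make the project reference the environment.
-- 'R' = relative reference: the environment lives in the project's own folder.
DECLARE @reference_id bigint;
EXEC catalog.create_environment_reference
     @folder_name      = N'BI',
     @project_name     = N'MyETL',
     @environment_name = N'PROD',
     @reference_type   = N'R',
     @reference_id     = @reference_id OUTPUT;
```

With three environments (DEV / QA / PROD) holding the same variable names but different values, moving a project between stages becomes a matter of switching the reference.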
The first thing I learned about SSISDB administration was how to see the packages that are currently running, and I believe this is the right place for you to start too. Just before we begin, I would like to remind you again that this guide will not cover SQL Agent job monitoring, as it is out of scope. The Active Operations window shows all SSIS packages that are currently running. Right click on the SSISDB and choose “Active Operations”. You will see a new window that lists the running packages. The “Id” column in the list is the execution id given for this runtime only by the SSISDB internal engine. The “Object Name” column shows the name of the running package (keep in mind that two different projects can contain packages with the same name). The “Start Time” is the time the package started executing (according to the server's clock). The “Caller” is the user that executed the package. You can always stop a running package by clicking the “Stop” button; that will stop the package in the currently selected row.
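The same information is available from T-SQL through the catalog.executions view, which is handy when you cannot open SSMS. A sketch (the operation id passed to catalog.stop_operation is a placeholder):

```sql
USE SSISDB;
GO
-- Equivalent of the "Active Operations" window: executions with status 2.
-- Status codes: 1 created, 2 running, 3 canceled, 4 failed, 5 pending,
-- 6 ended unexpectedly, 7 succeeded, 8 stopping, 9 completed.
SELECT execution_id, folder_name, project_name, package_name,
       start_time, executed_as_name
FROM catalog.executions
WHERE status = 2
ORDER BY start_time;

-- Stop a running package by its execution/operation id (placeholder value).
EXEC catalog.stop_operation @operation_id = 12345;
```

catalog.stop_operation is the scripted equivalent of the “Stop” button in the window.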
Again, I am not covering the SQL Agent here, but there is one very important rule we must know. Generally, you would create agent jobs to enhance regulation and automation. Keep in mind that if you right click the job and run it, it will always run with the Windows user that runs the agent service, and not with your own user, even though you are currently connected to SSMS. When you right click on a package (in the Integration Services catalog) and choose “Execute”, the user that runs the package is you. That being said, just keep the permission issues in mind at all times.
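For reference, the “Execute” menu item boils down to two catalog procedures, and a T-SQL execution started this way also runs under the caller's identity. A sketch with placeholder names:

```sql
USE SSISDB;
GO
-- Create an execution for one package; it runs as the calling user.
DECLARE @execution_id bigint;
EXEC catalog.create_execution
     @folder_name  = N'BI',
     @project_name = N'MyETL',
     @package_name = N'LoadSales.dtsx',
     @execution_id = @execution_id OUTPUT;

-- Optional: set the logging level (object_type 50 = system parameter,
-- 1 = basic logging).
EXEC catalog.set_execution_parameter_value
     @execution_id,
     @object_type     = 50,
     @parameter_name  = N'LOGGING_LEVEL',
     @parameter_value = 1;

-- Kick it off (asynchronous by default).
EXEC catalog.start_execution @execution_id;
```

The returned @execution_id is the same id you would later look up in catalog.executions.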
The integration catalog provides a powerful tool to see if you are on the “right track” or not: you can always validate a package before you execute it. Although validation will not “catch” all of the errors that might happen along the way, it will surely give you a good clue as to what might go wrong. The validation process checks and validates all of the metadata against their targets and reports mismatches. It also checks the existence of columns against queries, but it will not alert you (in the validation process) that a data type is invalid. Always validate your packages before you execute them; as said before, it will give you a good indication if something is about to go wrong.
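Validation can also be started from T-SQL; this sketch validates a whole project and then reads the result from the log view (names are placeholders, and the call is asynchronous, so the status may still show as running right after):

```sql
USE SSISDB;
GO
-- Request a full server-side validation of the project.
DECLARE @validation_id bigint;
EXEC catalog.validate_project
     @folder_name   = N'BI',
     @project_name  = N'MyETL',
     @validate_type = 'F',            -- 'F' = full validation
     @validation_id = @validation_id OUTPUT;

-- Check the outcome later (status 7 = succeeded, 4 = failed).
SELECT validation_id, status, start_time, end_time
FROM catalog.validations
WHERE validation_id = @validation_id;
```

There is a matching catalog.validate_package procedure if you only need to validate a single package rather than the whole project.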
The integration catalog stores all validation and execution logs, so you can always refer to them. You can manage and define how long the catalog stores the log data (measured in days). Right click on the SSISDB inside the integration catalog (not the SSISDB database) and have a look at two important properties. The first one determines whether log clearing takes place at all, and is called “Clean Logs Periodically”. If it is set to True, then logs older than the number of days defined in the “Retention Period (days)” property will be deleted. Generally speaking, all validations and all executions create logs in the SSISDB. To view the validation log, right click the package and select “Reports > All Validations”. To view the execution log, right click the package and select “Reports > All Executions”.
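Those two SSMS properties map to catalog properties that you can read and change from T-SQL as well. A sketch (the 60-day value is just an example):

```sql
USE SSISDB;
GO
-- Inspect the cleanup switch and the retention window.
SELECT property_name, property_value
FROM catalog.catalog_properties
WHERE property_name IN (N'OPERATION_CLEANUP_ENABLED', N'RETENTION_WINDOW');

-- Change the retention window to 60 days, for example.
EXEC catalog.configure_catalog
     @property_name  = N'RETENTION_WINDOW',
     @property_value = 60;
```

Keep the retention window realistic for your disk space: a busy catalog with a long window can grow the SSISDB database considerably.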
WHAT IS THE MEANING OF SQL SERVER BI (BUSINESS INTELLIGENCE)?
In the latest release of SQL Server (SQL Server 2019), Microsoft added a range of tools and services that enhance its compatibility with Azure.