SSIS Variable Scope - Some clarifications
I recently had an email exchange with Allen Mitchell, SQL Server MVP.
He had some questions about the way variable scoping works in packages. I figure, if he's asking questions about it, likely there're a few others unclear on it as well.
So, here's a quick primer on the way variable scoping works in IS packages and then a quick tip on a feature that you can use to simplify package factoring. But first, a review. Currently, in DTS 2000, if you want to create a temporary place to store information for the package, what's the first thing you do? Well, likely, you create a global variable. As your package gets more complex and the number of global variables increases, it's sometimes difficult to remember which ones are for what part of the package, etc. It can get confusing and difficult to manage.
SSIS places even more emphasis on variables because tasks may no longer view the package object model. So, we anticipate even more variables being created in packages to manage the bits of information between tasks etc.
So we've added variable scope to packages. Packages are containers as are task hosts, sequences, for loops and foreach loops. You can create variables on each of those containers just like you do on packages. Click on a task, create a variable in the variable window and that variable is only visible to the task it was created on. Same goes for other containers as well.
Think of containers (including tasks and packages) as functions in traditional programming and then think of variables in that context. Then, think of how variables scope in that context. Variables defined outside of main() (in C code) are global. We don't really have "true" global variables because in this analogy, packages are the "main()" function.
Can't I just create all my variables at package level and have something similar to global variables? Yes. In fact, it's exactly the same. All the variables you create at package scope (Click on the package in the designer, create a new variable) are “visible“ to all the other containers in the package.
So, why do I care about this? Well, because, if you begin using variables defined at “local“ scope, in other words, create the variable on the container where it is used, instead of at the package level, you run less of a risk of using the variable in the wrong location and overwriting it. It also localizes the variable to the location where it's being used which helps managability. The concept is similar to the traditional programming best practice. In fact, it was inspired by it. It also effects how variables flow from a parent package to a child package when using the execute package task.
Back to the programming analogy, each parent-child relationship between containers represents a call tree. Variables defined on a container are analogous to variables defined locally to a function. A child container always has "more local" variable scope than its parent.
The Execute Package Task is unique, but still aligns with the analogy. It's a container with scope (Taskhost) and it's scope is more global than its child, the package that it executes.
So, here are some rules to help you understand how variable scoping works:
1) A variable defined on a container will “hide“ a variable with the same name on that container's parent or grandparent (and all direct ancestors).
2) A variable defined on a package is visible anywhere within the package, except for the above case.
3) A variable defined on a container is visible to all the container's children. Basically the same as rule 2, since a package is a container.
4) A variable defined on a execute package task is visible to the package executed by it. Or to put it another way, child packages may see the parent execute package task variables.
5) If a child package uses a variable from a parent package, that variable will not be available when the package executes on its own. It will fail to find the parent's variable. It must execute as a child of the package that contains the variable to see the variable.
When you create solutions, you'll ideally create packages that factor along certain functions, and then call out to those packages when you need their function so that you only implement the function once. This is smart package design because it eliminates duplicated effort and if there is an issue with the package, you only need change it in one place. But, that's a problem then, right? If you want to develop your packages independently, since, it's easier to do that way. You'll want to be able to pass values from parent packages into child packages, a la DTS 2000 outer variables. Enter Parent Package Configurations. With parent package configurations, you specify what parent package variable you want to have configure any property in the child package, including the value property of a variable.
The way this works in practice is that you create the child package with the variable you'll use in the package. Then, in the parent package, you define the variable that you'll set with the value that you want passed down to the child package. Now, you create a link between the two with a Parent Package Configuration such that when the child executes, the child variable will recieve the value of the parent's variable. This decouples the two packages so that they can be designed/executed separately, but automatically use the parent packages variable value when executed as a child package.
One final point, using the rules above then, if a variable is defined on the execute package task, it will be visible to the child package. If the variable on the execute package task has the same name as a variable defined on the package, it will hide the variable defined on the package.
Sounds a little complicated, but it becomes intuitive once you've worked around it a while.
Try creating variables at different scopes with the same name but different values at different scoping levels. Watch what happens as you move from container to container. You can still do things the old way, but as your packages increase in complexity, the flexibility to define variables where they're used will become more useful.
Thanks,
Universe.Earth.Software.Microsoft.SQLServer.IS.KirkHaselden