SQL Server Synonyms and How to Leverage Them in an ETL Process

Written by Team SCS | Jul 27, 2021 2:41:49 PM

In 2005 Microsoft introduced the synonym functionality for SQL Server. While Synonyms are not something SQL developers are likely to use on a daily basis, they are very powerful and useful, despite being relatively simple. They are essentially a placeholder that can be used as a static value in a SQL statement that points towards another object. While static redirects are a core competency, the ability to modify the base object (what the synonym “points” to) is a huge part of what makes them flexible and worth implementing in the right scenarios. This article will cover what a synonym is, how they can be leveraged during an ETL (Extract, Transform, and Load) process, and finally will give code examples of how they function, and how you can implement them.

Synonyms Basics

Microsoft defines SQL Server synonyms as functionality that “Provides an alternative name for another database object, (referred to as the base object) that can exist on a local or remote server”. Synonyms “provide a layer of abstraction that protects a client application from changes made to the name or location of the base object”. Base objects can include most tables, procedures, and functions – with a few exceptions. For database developers, this means that synonyms allow us to create a static placeholder that can be used in code, that will point towards an object that we can then change as needed – without updating the assets that have hardcoded the synonym. This can be useful in many circumstances, but we will be discussing the use specifically in an ETL setting.

A common data warehousing task is obtaining and loading incremental data. Often this process is referred to as ETL. Synonyms can be a useful tool to aid in the automation of ETL processes. A common tactic is to standardize the control flow of the process and reuse it in several similar scenarios. For example, let us say that we have a few data sources that need to have data extracted from them. While the queries and extracted data will be different, a good ETL design will standardize the naming of the assets, the methods that handle the execution of the query (i.e. handler), and what is done with the data once it is extracted.

In the coding examples below, we will look at how one job can create a batch of tables that can each be used to store data from a specific source, on a specific day. We will use synonyms to load data to this table, even though the load procedure does not know the specific underlying table it is populating. The translate step can also use this synonym to translate data from a different table each day, without ever changing the translate code. The focus of this scheme is to keep extraction as simple as possible, while containing the bulk of the logic in the translate step. If this were to be done without synonyms, a considerable amount of dynamic SQL would be required. While there is a time and place for dynamic SQL, the consensus is to only use it when safe to do so, and when it is absolutely necessary. Even though synonyms are not a complete replacement for dynamic SQL, they can replace it in situations where a table or database may need to change based on context. Part of what makes synonyms so flexible is that the binding between a synonym and its base object is by name only. Meaning the base object can be modified, dropped, or replaced by another object that has the same name as the original base object. Since the synonym is not evaluated until called, it can be created, or replaced up until the point that the referencing statements are executed – even if, for example, the synonym is defined in the same procedure that is referencing it.

In this section, we will look at how synonyms are created, deleted, and modified, as well as an example of how they can be used in an ETL process.

First create two tables, and populate them with some data:

CREATE TABLE dbo.ExampleTable (ID INT)  

INSERT INTO dbo.ExampleTable 

VALUES (1),(2),(3),(4) 

 

CREATE TABLE dbo.ExampleTable2 (ID VARCHAR(1))  

INSERT INTO dbo.ExampleTable2 

VALUES ('A'),('B'),('C'),('D')

Next, create a synonym to reference the first example table:

1	CREATE SYNONYM dbo.ExampleSynonym FOR dbo.ExampleTable

If you query either the table, or synonym, the results are identical.

1
2
3

SELECT * FROM dbo.ExampleSynonym 

SELECT * FROM dbo.ExampleTable

If you modify the data, both will reflect the change:

DELETE FROM dbo.ExampleSynonym WHERE ID = 4 

SELECT * FROM dbo.ExampleSynonym 

SELECT * FROM dbo.ExampleTable

A very useful feature of synonyms is that the underlying information is stored in the database, and can be referenced. Take a look at the synonym you just created by querying the sys.synonym table:

1	SELECT * FROM sys.synonyms WHERE [name] = 'ExampleSynonym'

Here you can see the synonym name, and the base object it references. Another important column is the schema. Identically named synonyms can be created with different schemas. If you do not specify a schema during creation, the default database schema will be used. It is considered a good practice to explicitly supply a schema during creation, as to avoid creating it in an unexpected schema.

Synonyms cannot be directly modified, so to do so we need to drop and recreate them.

1
2
3

DROP SYNONYM dbo.ExampleSynonym 

CREATE SYNONYM dbo.ExampleSynonym FOR dbo.ExampleTable2

Run the below queries to see this change:

1
2
3

SELECT * FROM sys.synonyms WHERE [name] = 'ExampleSynonym' 

SELECT * FROM dbo.ExampleSynonym

Many actions you can perform on an object, you can do with a synonym – i.e. Inserts, Updates, and Deletes. However others may reference the synonym itself instead of the underlying object. For example, you cannot drop a base object by referencing its synonym:

1	DROP TABLE dbo.ExampleSynonym

The OBJECT_ID function will also return the OBJECT_ID for the synonym, not the base object.

1
2
3

SELECT OBJECT_ID('dbo.ExampleSynonym') 

SELECT OBJECT_ID('dbo.ExampleTable')

In these situations, the sys.synonyms table can be used to get the information needed to process those actions:

DECLARE @DelStatement VARCHAR(100) = 'Drop Table ' + ( 

SELECT base_object_name FROM sys.synonyms WHERE [name] = 'ExampleSynonym' 

) 

SELECT @DelStatement 

--EXEC(@DelStatement) --UnComment to Drop Table

Before we move on to the next section, these commands can be used to clean up the example assets created so far:

IF OBJECT_ID('dbo.ExampleTable2') IS NOT NULL 

DROP TABLE dbo.ExampleTable2 

IF OBJECT_ID('dbo.ExampleTable') IS NOT NULL 

DROP TABLE dbo.ExampleTable 

DROP SYNONYM dbo.ExampleSynonym

Synonyms and Incremental Data Loads

The next example moves beyond the basic commands and shows how synonyms can be applied to manage incremental data loads in an ETL process. As mentioned above, in this example scenario, a table is created every day for each data source.

First create some template tables. This would only need to be done once, when setting up the data source the first time. The thought is that a simple query would be performed on a data source, with little to no transformation being done.

F OBJECT_ID('TableA_TEMPLATE') IS NULL  

CREATE TABLE TableA_TEMPLATE (ID INT, Col1 INT) 

IF OBJECT_ID('TableB_TEMPLATE') IS NULL  

CREATE TABLE TableB_TEMPLATE (ID INT, Col2 VARCHAR(255)) 

IF OBJECT_ID('TableC_TEMPLATE') IS NULL		 

CREATE TABLE TableC_TEMPLATE (ID INT, Col3 BINARY)

The code below will create/execute the following actions for each of the tables in the pfx cursor.

Create a new table using the {Prefix}_TEMPLATE table as a template. Name it with the format: dbo.{Prefix}{yyyyMMdd}

If a synonym already exists (i.e. from yesterday, if this is running daily), drop it

Recreate a synonym with the format {Prefix}Current, but point it at the new table (from step 1)

--Creates variables that will be used to create tables and synonyms 

DECLARE @SynonymDrop_Current VARCHAR(255)  

DECLARE @SynonymCreate_Current VARCHAR(255) 

DECLARE @NewTableName VARCHAR(255) 

DECLARE @CreateTable VARCHAR(255) 

DECLARE @Postfix VARCHAR(8) = FORMAT(GETDATE(), 'yyyyMMdd') 

DECLARE @Prefix VARCHAR(25)  

 

--Creates Cursor for each table prefix that will be processed 

DECLARE Pfx CURSOR FAST_FORWARD FOR 

SELECT 'TableA' UNION 

SELECT 'TableB' UNION 

SELECT 'TableC' 

 

OPEN Pfx 

FETCH NEXT FROM Pfx 

INTO @Prefix 

 

WHILE @@FETCH_STATUS = 0 

BEGIN  

--Creates commands to be executed 

SET @SynonymDrop_Current = 'DROP SYNONYM IF EXISTS dbo.' + @Prefix + 'Current' 

SET @SynonymCreate_Current  = 'CREATE SYNONYM dbo.' + @Prefix + 'Current FOR [UBBstage].[dbo].' + @Prefix + @Postfix 

SET @NewTableName = '[dbo].' + @Prefix + @Postfix 

SET @CreateTable = 'SELECT * INTO ' + @NewTableName + ' FROM [dbo].' + @Prefix +'_TEMPLATE' 

--Drops the table that will be created if it already exists (i.e. if this has been run more than once in a day) 

IF OBJECT_ID(@NewTableName)  IS NOT NULL  

EXEC  ('DROP TABLE ' + @NewTableName) 

 

--View Commands that will be executed 

SELECT @CreateTable 

SELECT @SynonymDrop_Current 

SELECT @SynonymCreate_Current 

 

----Uncomment to execute commands  

--EXEC (@CreateTable) 

--EXEC (@SynonymDrop_Current) 

--EXEC (@SynonymCreate_Current ) 

 

FETCH NEXT FROM Pfx 

INTO @Prefix 

END 

 

CLOSE Pfx; 

DEALLOCATE PFX;

As a result – using TableA as an example – a table would be created named dbo.TableA20210129, and a synonym dbo.TableACurrent would be created to reference it. The same thing would happen for TableB and TableC.

If this were to be used in an ETL process, this could be run to prepare for a daily data load. An initial extraction procedure could then be written like:

INSERT INTO dbo.TableACurrent 

SELECT * 

FROM db.dbo.SourceTable 

 

 

A view or procedure could then be written to transform the data like: 

SELECT  

  ID 

, COUNT(*) AS ExampleTransformation 

, SUM(Col1) AS ExampleTransformation2 

FROM dbo.TableACurrent 

GROUP BY ID

This would be repeatable and scalable for each data source that is needed. The source queries can be kept simple, and all logic will be in the transform step, which is important for troubleshooting issues as the data store is populated. Since each table holds a relatively small amount of data (one day), they will have acceptable performance in most cases without indexing or trimming down to the bear minimum rows/columns during the source table query. The table will of course need to be managed, as many will be created if done daily, but this can be done on interval, and can be flexible to individual requirements (i.e. back it up into a tall table or delete it if source data is persisted elsewhere).

To clean up the assets created in this section, run the following:

IF OBJECT_ID('dbo.TableA_TEMPLATE') IS NOT NULL 

DROP TABLE dbo.TableA_TEMPLATE 

IF OBJECT_ID('dbo.TableB_TEMPLATE') IS NOT NULL 

DROP TABLE dbo.TableB_TEMPLATE 

IF OBJECT_ID('dbo.TableC_TEMPLATE') IS NOT NULL 

DROP TABLE dbo.TableC_TEMPLATE 

 

DECLARE @DropStarementA VARCHAR(500) =  

'IF OBJECT_ID(''dbo.TableA' + FORMAT(GETDATE(), 'yyyyMMdd')+''') IS NOT NULL 

DROP TABLE dbo.TableA' + FORMAT(GETDATE(), 'yyyyMMdd') 

DECLARE @DropStarementB VARCHAR(500) =  

'IF OBJECT_ID(''dbo.TableB' + FORMAT(GETDATE(), 'yyyyMMdd')+''') IS NOT NULL 

DROP TABLE dbo.TableB' + FORMAT(GETDATE(), 'yyyyMMdd') 

DECLARE @DropStarementC VARCHAR(500) =  

'IF OBJECT_ID(''dbo.TableC' + FORMAT(GETDATE(), 'yyyyMMdd')+''') IS NOT NULL 

DROP TABLE dbo.TableC' + FORMAT(GETDATE(), 'yyyyMMdd') 

 

EXEC (@DropStarementA)  

EXEC (@DropStarementB)  

EXEC (@DropStarementC)  

 

DROP SYNONYM IF EXISTS dbo.TableACurrent 

DROP SYNONYM IF EXISTS dbo.TableBCurrent 

DROP SYNONYM IF EXISTS dbo.TableCCurrent

Conclusion

These have been just a few examples of how synonyms can be used. Here are a few others:

If a table name changes, but changing it in an application is not feasible, a synonym can be created pointing the old name to the new table.

If your production environment does not use the same database or servers as the development environment, a synonym can be used to simplify moving code to production.

If a central repository of functions exists, synonyms can be used instead of recreating them on each database or server.

There are of course many other areas where synonyms can be used, so this list is not exhaustive. Any time a layer of abstraction would be beneficial, synonyms may be of use. This was written as an introduction and is not a definitive guide. Here is some further reading if you are interested in the complexities of the functionality:

https://docs.microsoft.com/en-us/sql/relational-databases/synonyms/synonyms-database-engine?view=sql-server-ver15

https://docs.microsoft.com/en-us/sql/t-sql/statements/create-synonym-transact-sql?view=sql-server-ver15

https://docs.microsoft.com/en-us/sql/t-sql/statements/drop-synonym-transact-sql?view=sql-server-ver15

https://en.wikipedia.org/wiki/Synonym_(database)

Got questions? Team SCS is here to help.

View full post