Ok, I promised to blog about my replacement for the snapshot replication.  Well, it is not really a replacement, but more of a work around for the built in snapshot (Sorry do dissapoint. I promiss it is worth the read <grin>)

In our environment, we have some large transactional tables that we replicate to a reporting environment.  We all know that the reporting environment can sometimes get less than stellar hardware.  As a result, our reporting environment was pretty fragile.  Add to that an entire company of users that were banging on it, with some pretty ugly SQL I might add, and you end up with an environment that lends itself to needing an article to be re-pushed when you least expect it.

The built in snapshot will lock your table while it does the push (Concurrent Snapshot was added after we had figured this out.  We tried the shiny new concurrent snapshot, but it didn't work until SP2 I believe.  We have not tried it since...)  What we needed was a way to snapshot the data using an uncommitted isolation level to keep the locks to a minimum.  The CIO gets a little peeved when you have to bring your site down to push replication to your reporting environment...  However, uncommitted isolation has it's issues too, namely duplicate records.  It is easy to fix, just clean out the duplicates before you create your unique indexes.

So we created a DTS for the table we are going to push and use the nolock query hint to get the data.  When moving around large amounts of data, I like to truncate and drop the table (Dropping a table is a logged operation, Truncating is not ;) ).  Then move the data and then recreate your indexes.  At the end of this process, you now have a DTS package that will push your data, without locking your base table.  Do not run this yet.  You need to create the subscription before pushing the data.  This allows for no gaps in the data.  If you were to push the data, then create the subscription, you would have a whole in your data if records were new or updated between the push and the creation of the subscription.  Since the push uses a dirty read, we affectionately call this a "Dirty Push".

The next step is to create your publication.  I like to create one publication per article.  It makes troubleshooting and management very easy.  If you have to re-initialize an article, you have to re-initialize the publication.  Re-initializing the publication will re-push all the articles in that publication. That might get ugly in situations where all or some of the articles are large.  After you have created your publication it is time to create your subscription(s).

You will create one subscription per subscriber for your publication.  When you get to the “Initialize Schema” section of the wizard, select the radio button for "No, the Subscriber already has the schema and data".  This tells SQL Server to forego the snapshot and just start sending data.  We want to stop the distribution agent for that subscriber so that new rows are not being inserted at the subscriber just yet.  At this point records will start collecting in the Distribution.dbo.ms_ReplCommands table for your subscriber. 

Now, we can push the data using our shiny new DTS package.  Now for those of you paying attention, we have just created another problem.  We are queuing data to go to the subscriber in ms_replCommands.  We are also going to push this same data to the subscriber in our DTS.  If you have worked with replication at all, you will recognize this real quick.  The stored procedures that SQL Server creates at the subscriber will not handle this very well.  They will try to insert data that already exists, or update records that don't exist etc.  and raise an error, effectively breaking replication.  In the next entry, I will go into detail what we do in these stored procedures to ensure that replication continues, and still maintain data integrity.

 

A couple of friends of mine have been asking me to blog about some of the things I have done with custom replication in SQL 2000.  I figured a good place to start would be with some basic principles about replication, and then dive deeper into the guts of what I have done over a few blog entries.  Hopefully I won’t bore anyone.  This post will cover the basics of Transactional Replication.  It has been trimmed down for brevity, so please no complaints about “oh you left this out…” you can comment about it yourself and fill it in.

Microsoft SQL Server 2000 has three flavors of replication.  Snapshot, Transactional, and Merge.  For this series (if you can do that in a blog ;) I will focus on Snapshot and Transactional replication.

To start with, Microsoft uses a magazine metaphor for describing the parts of replication.  You start off with a publication and add article(s).  This publication is then subscribed to, and then distributed to the particular subscribers on some schedule (yes including continuous).
 
After you have defined a Subscription, you must synchronize the subscriber with the publisher.  This is where the snapshot comes in.  When you create the subscription, one of the questions you are asked is whether you want to start the snapshot agent.  Selecting yes will start the agent after the wizard is finished.  The snapshot agent will script out the table, copy the script over and BCP (ok I am not sure if it is BCP, or DTS.  It gets the data there) and put down three stored procedures used by the distribution agent.  If you did not select yes, do not worry.  The distribution agent will not start sending data until it thinks the data is synchronized.

Now, data gets synchronized by a couple of different methodologies.  The snapshot replication, backup and restore, or BCP.  Now I assume that Snapshot replication is the preferred method for synchronizing, but we found a few issues with it that I will go into later in the series.  Suffice to say, the distribution agent must “think” that the data is in sync.  When you create the subscription, you can tell SQL Server that the subscriber already has the data, and SQL Server will take your word for it.  Isn’t that special…

While the wizard was creating the snapshot agent job, it also created a couple of other agent jobs.  The wizard also created the distribution agent job and the logreader agent job.  The logreader job does exactly what it sounds like.  It reads the transaction log and creates records in a table called ms_ReplCommands in the Distribution database.  This table is the queue that the distribution agents use to serialize the commands that get the data into the subscriber.  This job is the conduit that data will flow to the subscriber.

Ok, at this point you have a rough idea of the objects involved.  At least the objects that we are going to fiddle with.  Stay tuned because I am going to blog about what I don’t like about the snapshot agent, and how I get around it.  On the way you learn some cool things about replication with SQL Server 2000.