2006/07/27

Distribution agents & network efficiency for transactional replication with immediate updates

Scenario: We have a VPN set up over two xDSL lines between our main office and a branch site. There are two SQL Servers (one at each side) with transactional replication with immediate updates. As you might imagine, bandwidth is critical (it is not a T1 line; we have only 300kbps maximum) and it must also be used for other purposes (file transfer/replication using Windows DFS, web browsing, email, etc.). We have a database of ~150 tables divided into 28 publications; hence we have 28 distribution agents running against the same subscriber. We have also set up network bandwidth monitoring software (http://www.cacti.net/) to check the status of the physical lines and the VPN, and to see their evolution over time, spot bottlenecks, and so on.

Problem description: We have been using SQL Server 2000 for quite a long time, both at the publisher and at the subscriber. About one month ago we installed SQL Server 2005 at the main office as the new publisher (we are migrating step by step; the subscriber will go next), and network traffic has skyrocketed.

We are now permanently consuming all 300kbps of upload (from the main office to the branch site) and 95kbps of download (from the branch to the main office). We have confirmed that all that traffic is caused by SQL Server and not by any other kind of traffic.

Furthermore, after some tests, we traced it down and concluded that the Distribution Agent is to blame. The default agent profile for distribution agents includes a parameter called PollingInterval, set by default to 5 seconds. We created a new user profile based on the default one and changed PollingInterval from 5 to 60 seconds.
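
For reference, the agent profiles and their parameter values can also be inspected from T-SQL at the Distributor; a minimal sketch (the profile_id below is an example value that you should take from the first result set on your own server):

EXEC sp_help_agent_profile @agent_type = 3   -- 3 = Distribution Agent
-- Take the profile_id of the profile you want to examine from the result
-- above (1 is only an example) and list its parameters, PollingInterval
-- among them:
EXEC sp_help_agent_parameter @profile_id = 1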

Traffic has been reduced by roughly a factor of 12 (60/5 = 12), from 300kbps to something near 40kbps (being so small, those 40kbps may also contain other kinds of traffic).

The reason the Distribution Agents are to blame is explained below. According to SQL Server 2005 BOL, regarding the PollingInterval parameter:

Is how often, in seconds, the distribution database is queried for replicated transactions. The default is 5 seconds.

Both the agents and the distribution database are on the same computer: the publisher at the main office. That being the case, in theory, changing that parameter from 5 to 60 seconds should only affect local traffic.

Moreover, during the night, when there is no activity at either office, no matter how frequently the agent checks the distribution database, there should be no pending transactions to apply to the subscriber, and the subscriber should not be contacted.

Our experience shows that there is traffic between publisher and subscriber even when there are no transactions to be applied, and the bandwidth of that traffic depends directly on the value of the Distribution Agents' PollingInterval parameter.

One year of network traffic

In this graph we can see how, since mid-June, when SQL Server 2005 started to act as publisher, network traffic has been at the maximum (300kbps).

One day of network traffic

In this capture we can see the last 24 hours. The valley in traffic around 13:00h corresponds to changing all Distribution Agents' profiles to PollingInterval=60 and restarting the SQL Server Agent service. About half an hour later, as a check, the default agent profile (PollingInterval=5) was set again for every Distribution Agent and the SQL Server Agent service was restarted once more.

It is very strange that traffic stays permanently around 300kbps, even during the night, when there is no activity (just a few jobs that call some stored procedures to update a few administrative tables, scheduled to run at 00:00 and 5:00am).

Steps to reproduce the problem: In order to find out what kind of traffic was being executed against the subscriber, I used Profiler at the subscriber:

  1. Create a publication with transactional replication with immediate updates.
  2. Create a subscription to that publication.
  3. Initialise the publication and wait until the subscriber gets the data.
  4. Start Profiler on the subscriber with a filter:
    HostName:        PUBLISHERNAME
    ApplicationName: PUBLISHERNAME_dbname_publicationname
You will see, every 5 seconds, an RPC:Completed event like this:
exec sp_executesql
  N'update MSreplication_subscriptions
    set transaction_timestamp = cast(@P1 as binary(15)) +
        cast(substring(transaction_timestamp, 16, 1) as binary(1)), "time" = @P2
    where UPPER(publisher) = UPPER(@P3) and publisher_db = @P4 and
          publication = @P5 and subscription_type = 0 and
          (substring(transaction_timestamp, 16, 1) = 0 or datalength(transaction_timestamp) < 16)',
  N'@P1 varbinary(14),@P2 datetime,@P3 nvarchar(5),@P4 nvarchar(5),@P5 nvarchar(7)',
  0x00004F3200000032000100000000,
  'Jul 24 2006 9:03:59:000AM',
  N'PUBLISHERNAME',
  N'dbname',
  N'publicationname'

Since our database has ~150 tables in 28 publications (and therefore 28 agents), using the default profile (5 secs) a simple calculation gives 28 * 60/5 = 336 RPC calls per minute to the subscriber, 24/7, without any legitimate user transaction (only this unjustified traffic). This consumes all our upload bandwidth at the main office (300kbps at the publisher). If there is nothing to replicate, I thought, the subscriber should not be contacted (at least not as frequently as every 5 seconds), because it goes against network efficiency for slow-link configurations.

Another possibility is using shared agents (the default for SQL Server 2000 transactional replication) vs. independent agents (the default for SQL Server 2005 transactional replication). You can read about this in BOL, under Administering Replication Agents (Independent and Shared Agents):

An independent agent is an agent that services one subscription. A shared agent services multiple subscriptions; if multiple subscriptions using the same shared agent need to synchronize, by default they wait in a queue, and the shared agent services them one at a time. Latency is reduced when using independent agents because the agent is ready whenever the subscription needs to be synchronized. Merge replication always uses independent agents, and transactional replication uses independent agents by default for publications created in the New Publication Wizard (in previous versions of SQL Server, transactional replication used shared agents by default).
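
If you choose shared agents, note that the agent type is chosen when the publication is created. A minimal sketch (the publication name is hypothetical; the many other sp_addpublication options our setup would need, such as enabling immediate updating subscribers, are omitted for brevity):

EXEC sp_addpublication
    @publication       = N'MyPublication',
    @independent_agent = N'false'   -- share one distribution agent instead of one per publication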

My feedback to Microsoft on this issue: Distribution agents & network efficiency for transactional replication with immediate updates. The issue was resolved as 'By Design', and my suggested workaround was confirmed as the solution.

Workaround/Solution: As you can guess, the solution is changing PollingInterval to a value that fits your network bandwidth and desired latency. You need to find a compromise between the two: with a higher latency, less bandwidth is used; if you need a lower latency, you will consume more bandwidth. Instead of having a single new user profile for all distribution agents, you could create a set of user profiles with PollingInterval set to (for instance) 15, 60 and 300 seconds.

Publications that require real-time latency can keep the default profile (5 seconds); for the rest, depending on the nature of the tables included in each publication, you can assign the 15-second profile for near real time and the 300-second profile for subscriptions that are seldom updated.
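
A minimal sketch of how one such profile could be created from T-SQL, run at the Distributor (the profile name and description are invented for the example):

DECLARE @profile_id int
EXEC sp_add_agent_profile
    @profile_id   = @profile_id OUTPUT,
    @profile_name = N'Slow link (60 seconds)',
    @agent_type   = 3,   -- 3 = Distribution Agent
    @profile_type = 1,   -- 1 = user-defined profile
    @description  = N'Profile with PollingInterval raised to 60 seconds'
-- Set the polling interval on the new profile:
EXEC sp_change_agent_parameter
    @profile_id      = @profile_id,
    @parameter_name  = N'PollingInterval',
    @parameter_value = N'60'
-- Assign the new profile to the chosen distribution agents (for example from
-- Replication Monitor) and restart the agents for the change to take effect.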

Keywords: SQL Server 2005, transactional replication with immediate updates, Microsoft, slow link, bandwidth, latency, traffic, distribution agent, nothing to replicate, error, workaround, resolution

2006/07/21

Reporting Services & MS Word format (.doc) rendering.

Problem description / feature request: One of the most frequently asked questions in Reporting Services (RS) forums and newsgroups is: why doesn't RS include an option to render documents into MS Word (.doc) format? According to BOL, Reporting Services 2005 has the ability to render/output documents into:

  • CSV
  • Acrobat (PDF) file
  • Excel
  • Web archive
  • TIFF File
  • XML

Why did I mark XML? Because RS is not only able to output the results in raw XML format, it can also apply transformations based on XSLT stylesheets! That means you can render into ANY format you want, provided you feed Reporting Services with the proper transformation file. This document is based on (and extends) the idea from CarlosHM and his blog post: WordML in Reporting Services.

The solution: In order to illustrate the process, we will render a report into MS Word format.

1. Create your template using Microsoft Word 2003. You can make it as complex as you need/want. Wherever a data field is expected to be filled in, just write the name of the field surrounded by special characters (so you can find them afterwards). For instance, if there will be a field called Company that should be in bold, Arial 16, left justified, etc., just write «Company» and format that piece of text as needed. Add all fixed text as usual (I assume you know how to work with Word). If you want a new page for every record returned by Reporting Services, remember to add a hard page break (CTRL+Enter) at the end of the layout.

Screen capture of the template inside MS Word 2003

2. Save the document. When you have your layout ready, choose Save As... XML Document and name it TemplateDocument.xml. Just for your information: despite the fact that its extension is .xml, its icon is not the standard one for an XML file and it is still associated with Word. That is because inside it there is this line:

<?mso-application progid="Word.Document"?>

3. Reformat the generated .xml file. MS Word generates XML files without indenting (for smaller file size), which makes them hardly human-readable. Since we need to modify this file manually, we will reformat and indent it to avoid making mistakes. For this you might need tidy, a little command-line program that reformats html/xml files, or you can use your favourite XML editor.

tidy -utf8 -xml -i -wrap 999 -output TidyOutput1.xml TemplateDocument.xml

Note that if your input document has embedded images, using the -i option will break them. This is not a problem, since they can be repaired later by calling tidy again without the -i parameter.

4. Modify the formatted .xml in order to obtain the .xsl. Now we have a human-readable .xml file. Let's use it to create the .xsl file we need to upload to RS. First make a copy of it, rename it to .xsl and open it with Notepad:

copy TidyOutput1.xml TidyOutput1.xsl
notepad TidyOutput1.xsl

We will need to search and replace some blocks of text. Search for the line that starts with <o:DocumentProperties> and remove everything from the top of the file down to, but not including, that line. In place of the removed text, insert this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:wsp="http://schemas.microsoft.com/office/word/2003/wordml/sp2"
xmlns:st1="urn:schemas-microsoft-com:office:smarttags">
  <xsl:output method="xml" media-type="text/xml" version="1.0"
  encoding="UTF-8" standalone="yes" indent="yes"
  omit-xml-declaration="no" />
  <xsl:template match="/">
    <xsl:processing-instruction name="mso-application">
    progid="Word.Document"</xsl:processing-instruction>
    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:wsp="http://schemas.microsoft.com/office/word/2003/wordml/sp2"
xmlns:st1="urn:schemas-microsoft-com:office:smarttags"
w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no"
xml:space="preserve">
<xsl:apply-templates />
</w:wordDocument>
  </xsl:template>
  <xsl:template match="Report">
    <o:SmartTagType o:namespaceuri="urn:schemas-microsoft-com:office:smarttags"
    o:name="PersonName" />

Now go to the end of the file. You will find that the last tag is:

</w:wordDocument>

Delete it and insert these two lines:

  </xsl:template>
</xsl:stylesheet>

Now save your document. Let's check whether it is well-formed:

tidy -utf8 -xml -i -wrap 999 -output TidyOutput2.xsl TidyOutput1.xsl

If you see No warnings or errors were found in the output, you can continue (using TidyOutput2.xsl). If any error appears, review the previous steps to see where it was introduced. You need a valid (well-formed) XSL file in order to continue.

5. Modify the .xsl file to include the database fields. In step 1, when we created the template using MS Word, we named the fields «FieldName». Now we will use those names to replace every field you entered with the xsl command that outputs the corresponding value from the XML returned by Reporting Services. Open TidyOutput2.xsl with Notepad and do a search and replace:

Search: «FieldName»
Replace: <xsl:value-of select="@FieldName" />

Repeat the search & replace for every field. When you think you have finished, search for « to check that you did not leave any field behind. Save as TidyOutput3.xsl. As mentioned before, if your original template had embedded images, reformatting with tidy's -i option broke them. Now we will run:

tidy -utf8 -xml -output TidyOutput4.xsl TidyOutput3.xsl

Without the -i option everything will be reformatted to start at column 0 and the images ( <w:binData> tags) will be repaired.
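
After the replacement, each placeholder ends up inside a WordML paragraph looking roughly like this (a sketch using the Company field; your actual file will also carry <w:pPr>/<w:rPr> formatting elements reflecting the styles you applied in Word):

<w:p>
  <w:r>
    <w:t><xsl:value-of select="@Company" /></w:t>
  </w:r>
</w:p>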

6. Upload .xsl file to RS. In order to do our test we will create a new simple report, using MS Visual Studio, with the underlying query:

SELECT 'Company 1' As Company, 'Address 1' As Address
UNION
SELECT 'Company 2' As Company, 'Address 2' As Address

This will return 2 records without involving any database table. Of course, you will need to create your own report with your own underlying query to retrieve the information you need. Design the report as simply as you can, since it will only be used to retrieve the data; the formatting will be applied using the TidyOutput4.xsl we prepared before. Both the generated report (Report1.rdl) and TidyOutput4.xsl must be deployed to Reporting Services.

Screen capture of the html generated report inside MS Explorer

7. Test it. If you have reached this point, it is time to check whether RS can send us a Word document. Type this in your browser's address bar:

http://reportserver.yourdomain.com/ReportServer?/directory/Report1
 &rs:Format=XML
 &rs:Command=Render
 &rc:XSLT=TidyOutput4.xsl
 &rc:FileExtension=xml

If everything has gone well, you should be prompted to open a file called Report1.xml. Click Open and see the results. MS Word should open, and you should see something like your template, but without the data. The reason for this is that we forgot to add...

8. <xsl:for-each>. Most reports have a master/detail structure. Think of an invoice: a master part with the customer, invoice number, address, etc., and a detail part with the list of items. In our Report1.rdl we used a table (named table1), and we need <xsl:for-each> in order to iterate over every record in it. Open TidyOutput3.xsl (the indented one) and save it as TidyOutput5.xsl. Now search for the first occurrence of your first field (@Company). Move the cursor some lines up until you reach the opening <w:p ...> (Word paragraph) tag in which your field is placed. Right between the previous closing paragraph tag (</w:p>) and the opening tag you have found, insert the following:

<xsl:for-each select="table1/Detail_Collection/Detail">

We use table1 because that is the name of the table in Report1.rdl. Now we need to find the place for the closing tag:

</xsl:for-each>

In our example, we have placed it right before the <w:sectPr...> tag. The correct place in your case will depend on your particular layout. To easily find the correct places for the opening and closing xsl:for-each tags, it might be useful to use placeholders (special characters), as we did with the field names during the layout preparation. Since TidyOutput5.xsl is an indented version (with broken images), we generate the non-indented version of it:

tidy -utf8 -xml -output TidyOutput6.xsl TidyOutput5.xsl

and upload TidyOutput6.xsl to Reporting Services.
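
For reference, the XML that Reporting Services renders for Report1 has roughly this shape (a sketch; the element names derive from the report and table names, so they will differ in your own report), which is what the match="Report" template and the table1/Detail_Collection/Detail path navigate:

<Report>
  <table1>
    <Detail_Collection>
      <Detail Company="Company 1" Address="Address 1" />
      <Detail Company="Company 2" Address="Address 2" />
    </Detail_Collection>
  </table1>
</Report>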

9. Production. Now the report is in place and the correct .xsl is also deployed to the Reporting Services server. You can retrieve the document using:

http://reportserver.yourdomain.com/ReportServer?/directory/Report1
 &rs:Format=XML
 &rs:Command=Render
 &rc:XSLT=TidyOutput6.xsl
 &rc:FileExtension=xml

Screen capture of the resulting Word XML report inside MS Word 2003

Note that the final document is not a Microsoft Word Document file (.doc) but an XML file that contains information to be interpreted by Microsoft Word 2003 and displayed as if it were a DOC file. If you really need the file to be a Microsoft Word Document file, you can (once it is open inside Word) use Save As... and select the .doc format. This can also be done with VBScript:

Const wdWindowStateMaximize = 1 ' 0=wdWindowStateNormal; 2=wdWindowStateMinimize
Const wdFormatDocument = 0      ' Word constants are not predefined when late binding from VBScript

Dim oApp, XMLFile, DOCFile
XMLFile = "C:\Reports\Report1.xml" ' adjust: path to the rendered XML file
DOCFile = "C:\Reports\Report1.doc" ' adjust: path for the resulting DOC file
Set oApp = CreateObject("Word.Application")
oApp.WindowState = wdWindowStateMaximize
oApp.Visible = False
oApp.Documents.Open XMLFile
oApp.ActiveDocument.SaveAs DOCFile, wdFormatDocument
oApp.Visible = True ' Now the currently opened file is a DOC file

Download: You can download all the sample files together with the resulting report from here.

Keywords: reporting services, rs, render, export, generate, word, ms word, word document, word file, doc, xml, xsl, templates

2006/07/16

Breaking changes in SQL Server 2005

This particular change does not affect me, since I do not use Merge Replication. However, it is very important to have your triggers prepared to handle multirow inserts/updates/deletes in any case (whether or not you use replication). Extracted from http://msdn2.microsoft.com/en-us/library/ms143470.aspx regarding Breaking Changes for Merge Replication:
Batching of changes: In previous versions of SQL Server, changes made by the Merge Agent were performed on a row-by-row basis. In SQL Server 2005, changes are batched to improve performance; therefore, more than one row can be inserted, updated, or deleted within a single statement. If any published tables in the publication or subscription databases have triggers, ensure that the triggers can handle multi-row inserts, updates, and deletes. For more information, see Multirow Considerations for DML Triggers.
And here is the important info, with examples (from SQL Server 2005 BOL, as usual): Multirow Considerations for DML Triggers.
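
To illustrate the point, here is a minimal sketch of a multirow-safe trigger (the table and audit-table names are invented for the example): it processes the whole inserted pseudo-table with one set-based statement instead of assuming a single affected row.

CREATE TRIGGER trg_Orders_Insert ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON
    -- Set-based: correct whether the INSERT affected 1 row or 1,000 rows,
    -- as the batched changes made by the SQL Server 2005 Merge Agent require.
    INSERT INTO dbo.OrdersAudit (OrderID, AuditDate)
    SELECT i.OrderID, GETDATE()
    FROM inserted AS i
END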

2006/07/13

"Row handle is invalid" error in SQL Server's Queue Reader Agent

Today I came to the office and found a bunch of SQL Server alert emails, all of them related to the Queue Reader Agent. I use transactional replication with immediate updates, so I wondered: what could be queued, and why did this error appear? I went to see the job history for the agent and found the following (sorry, but my SQL Server installation is in Spanish):
Fecha  12/07/2006 9:00:17
Registro  Historial de trabajos ([MYSERVER].8)

Id. del paso  2
Servidor  MYSERVER
Nombre del trabajo  [MYSERVER].8
Nombre del paso  Ejecutar agente.
Duración  00:18:35
Gravedad de SQL  0
Id. de mensaje de SQL  0
Operador notificado mediante correo electrónico
Operador notificado mediante envío de red
Operador notificado mediante localizador
Reintentos efectuados  0

Mensaje
-XSERVER MYSERVER
   -XCMDLINE 0
   -XCancelEventHandle 000000000000118C
2006-07-12 07:18:03.678 Agente de lectura de cola [MYSERVER].8 (Id. = 1) iniciado
2006-07-12 07:18:03.678 Valores de parámetro obtenidos del perfil de agente:
   -historyverboselevel 1
   -logintimeout 15
   -pollinginterval 5000
   -querytimeout 1800
2006-07-12 07:18:33.724 No hay ninguna transacción en cola disponible
2006-07-12 07:18:48.755 No hay ninguna transacción en cola disponible
2006-07-12 07:18:52.552 El Agente de lectura de cola ha detectado el error 'Identificador de fila
 no válido.' al conectarse a 'mydb' en 'MYSERVER'. Asegúrese de que la publicación y la
 suscripción se han definido correctamente y de que los dos servidores están en ejecución.
2006-07-12 07:18:52.584 Error al aplicar el mensaje en cola al publicador
2006-07-12 07:18:52.599 Subproceso de trabajo 4412 : error en la tarea
2006-07-12 07:18:52.599 No hay ninguna transacción en cola disponible
2006-07-12 07:18:52.599 Anulando lector de colas
Since the error made clear neither where the problem was (which publication) nor what I had to do to solve it, I started worrying. The good part was that, since I was not using queued transactions, replication among my servers kept working fine (I checked some inserts/updates at both the publisher and the subscribers).

I first tried restarting the SQL Server service (and related services) at the publisher, without luck: since the problem was already in the queue, the Queue Reader Agent began throwing the errors again as soon as it restarted. I then checked BOL for Replication Queue Reader Agent in order to start it manually from a command line (cmd.exe) and see the errors for myself. I started cmd.exe running as the SQL Server Agent service account, changed directory to C:\Program Files\Microsoft SQL Server\90\COM and launched:
qrdrsvc -Distributor [MYSERVER] -DistributionDB [distribution] -DistributorSecurityMode 1
 -Continuous -OutputVerboseLevel 2 > outputlog.txt
Then I checked outputlog.txt and found the following (sorry, there are things in Spanish here too):
2006-07-12 08:36:34.643 Agente de lectura de cola de réplica de Microsoft SQL Server 9.00.2047.00
2006-07-12 08:36:34.643 Copyright (c) 2005 Microsoft Corporation
2006-07-12 08:36:34.643
2006-07-12 08:36:34.643 Las marcas de hora anexadas al inicio de las líneas de salida se
 expresan como hora UTC.
2006-07-12 08:36:34.643 Valores de parámetro de agente especificados por el usuario:
   -Distributor MYSERVER
   -DistributionDB distribution
   -DistributorSecurityMode 1
   -Continuous
   -OutputVerboseLevel 2
2006-07-12 08:36:34.643 Connecting to OLE DB qrdrsvc at datasource: 'MYSERVER', location: '',
 catalog: 'distribution', providerstring: '' using provider 'SQLNCLI'
2006-07-12 08:36:34.689 OLE DB qrdrsvc: MYSERVER
   DBMS: Microsoft SQL Server
   Version: 09.00.2047
   catalog name: distribution
   user name: dbo
   API conformance: 0
   SQL conformance: 0
   transaction capable: 1
   read only: F
   identifier quote char: "
   non_nullable_columns: 0
   owner usage: 15
   max table name len: 128
   max column name len: 128
   need long data len:
   max columns in table: 1000
   max columns in index: 16
   max char literal len: 131072
   max statement len: 131072
   max row size: 131072

[... A BUNCH OF LINES REMOVED HERE... SCROLL TO THE END OF FILE ...]

2006-07-12 08:37:41.220 OLE DB SUBSCRIBER 'SUBSCRIBER':
 {? = call sp_getsqlqueueversion (?, ?, ?, ?)}
2006-07-12 08:37:41.314 OLE DB SUBSCRIBER 'SUBSCRIBER':
 {? = call sp_replsqlqgetrows (N'MYSERVER', N'mydb', N'Equipos')}
2006-07-12 08:37:41.439 OLE DB qrdrsvc 'MYSERVER':
 exec sp_helpdistpublisher @publisher = N'MYSERVER'
2006-07-12 08:37:41.454 Connecting to OLE DB MYSERVER at datasource:
 'MYSERVER', location: '', catalog: 'mydb', providerstring: '' using provider 'SQLNCLI'
2006-07-12 08:37:41.454 OLE DB MYSERVER: MYSERVER
   DBMS: Microsoft SQL Server
   Version: 09.00.2047
   catalog name: mydb
   user name: dbo
   API conformance: 0
   SQL conformance: 0
   transaction capable: 1
   read only: F
   identifier quote char: "
   non_nullable_columns: 0
   owner usage: 15
   max table name len: 128
   max column name len: 128
   need long data len:
   max columns in table: 1000
   max columns in index: 16
   max char literal len: 131072
   max statement len: 131072
   max row size: 131072
2006-07-12 08:37:41.454 OLE DB MYSERVER 'MYSERVER':
 set arithabort on set numeric_roundabort off set concat_null_yields_null on
 set quoted_identifier on set ansi_nulls on set ansi_padding on set ansi_warnings on
2006-07-12 08:37:41.454 OLE DB MYSERVER 'MYSERVER':
 select pubid, conflict_policy from dbo.syspublications where name = N'Equipos'
2006-07-12 08:37:41.454 OLE DB MYSERVER 'MYSERVER':
 {? = call sp_getqueuedarticlesynctraninfo (N'Equipos', 233)}
2006-07-12 08:37:41.454 El Agente de lectura de cola ha detectado el error 'Identificador de
 fila no válido.' al conectarse a 'mydb' en 'MYSERVER'. Asegúrese de que la publicación y
 la suscripción se han definido correctamente y de que los dos servidores están en ejecución. 
2006-07-12 08:37:41.470 Error al aplicar el mensaje en cola al publicador
2006-07-12 08:37:41.485 Subproceso de trabajo 396 : error en la tarea
2006-07-12 08:37:41.485 No hay ninguna transacción en cola disponible
2006-07-12 08:37:41.485 Disconnecting from OLE DB SUBSCRIBER 'SUBSCRIBER'
2006-07-12 08:37:41.485 Disconnecting from OLE DB SUBSCRIBER 'SUBSCRIBER'
2006-07-12 08:37:41.485 Disconnecting from OLE DB MYSERVER 'MYSERVER'
2006-07-12 08:37:41.485 Anulando lector de colas
The translation of the key line is: The Queue Reader Agent has encountered the error 'Row handle is invalid.' when connecting to 'mydb' on 'MYSERVER'. Ensure that the publication and subscription are defined properly and that both servers are running.

Here I found that the publication having problems was the one called 'Equipos'. We were a bit nearer to the solution. As a first idea, I tried to reinitialise the publication... and it worked. Reinitialising the publication probably wiped away any commands that were waiting (queued) to be applied at the publisher, because the fresh new snapshot contains the current data. After the initial snapshot was applied to the subscribers, I monitored the Queue Reader Agent for some minutes and found that it was working fine.

The problem was solved, but the big question remains... why did this error appear? I'm still wondering...
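
For reference, a minimal sketch of how the same reinitialisation can be done from T-SQL, run at the publisher on the publication database (I used the UI, so treat this as the equivalent rather than what I actually ran):

-- Reinitialise all articles for all subscribers of the problematic publication:
EXEC sp_reinitsubscription
    @publication = N'Equipos',
    @article     = N'all',
    @subscriber  = N'all'
-- Then run the publication's Snapshot Agent so a fresh snapshot is
-- generated and applied to the subscribers.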