How does BizTalk detect if a host instance is dead in BTS2K6, 2K6R2 and 2K9?

Friday, June 10, 2011
This is done through cooperation between the front-end BizTalk host instance process and the back-end BizTalk SQL Agent job MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb.

1. When a BizTalk host instance starts, it calls the stored procedure bts_ProcessHeartbeat_ once, as shown below.

exec [dbo].[bts_ProcessHeartbeat_] @uidProcessID=NULL,@dwCommand=1,@nHeartbeatInterval=60

Here @dwCommand=1 means Process Startup.

2. When a BizTalk host instance stops, it calls the same stored procedure bts_ProcessHeartbeat_ once more, as shown below.

exec [dbo].[bts_ProcessHeartbeat_] @uidProcessID=NULL,@dwCommand=2,@nHeartbeatInterval=60

Here @dwCommand=2 means Process Shutdown.

3. Looking at the code of the stored procedure bts_ProcessHeartbeat_, you can see that int_ProcessCleanup_ is called if dwCommand=1 (Process Startup) or dwCommand=2 (Process Shutdown). The stored procedure int_ProcessCleanup_ is used internally to clean up the records related to the host instance process and to release all messages and service instances locked by that process.

4. It is easy to understand why int_ProcessCleanup_ is called when a host instance shuts down: the messages and service instances locked by the host instance are freed before the shutdown, so they can be picked up again by the same host instance after a restart, or by other host instances that are still running in a multi-server BizTalk environment.

5. The question is why int_ProcessCleanup_ is also called when a host instance starts up, if the locked messages and service instances were already freed by the same SP at shutdown. It is indeed redundant if the SP was already called during a normal shutdown, but the process could be terminated or could crash unexpectedly without calling the SP. You can easily reproduce such an abnormal shutdown by killing a host process: the SP bts_ProcessHeartbeat_ with dwCommand=2 is not called if the host process is killed with the command “kill /f”. Since we can’t guarantee that the internal cleanup SP was called before the host instance starts, calling it again at startup is a good choice, even though it is sometimes redundant.
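The crash-then-restart case can be illustrated with a small Python model. This is only a sketch of the idea, not BizTalk code: the locks dictionary, the message names and the helper functions are all made up for illustration, and process_cleanup merely stands in for what int_ProcessCleanup_ is described as doing.

```python
# Toy model: which process holds the lock on each message.
# None means the message is free to be picked up.
locks = {"msg-1": None, "msg-2": None}


def lock_all(process_id):
    """Simulate a host instance picking up every free message."""
    for msg, owner in locks.items():
        if owner is None:
            locks[msg] = process_id


def process_cleanup(process_id):
    """Stand-in for int_ProcessCleanup_: release everything this process holds."""
    for msg, owner in locks.items():
        if owner == process_id:
            locks[msg] = None


def startup(process_id):
    # dwCommand=1: clean up first, in case the previous run never
    # got the chance to send dwCommand=2.
    process_cleanup(process_id)


def shutdown(process_id):
    # dwCommand=2: a normal shutdown releases the locks.
    process_cleanup(process_id)


host = "4d50992e-2ae1-4bbd-9d61-b712b052b99c"

startup(host)
lock_all(host)
# The process is killed here, so shutdown(host) never runs.
locked_after_crash = [m for m, o in locks.items() if o == host]

startup(host)  # the same host instance is restarted
locked_after_restart = [m for m, o in locks.items() if o == host]

print(locked_after_crash)    # ['msg-1', 'msg-2']
print(locked_after_restart)  # []
```

Without the cleanup in startup(), the two messages would stay locked by a process that no longer remembers them.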

This raises another question: if a host instance process is completely hung, or has crashed or been terminated unexpectedly and is not restarted, who detects the situation and releases the locked messages and service instances, and how? Let’s continue.

6. In each running BizTalk host instance, you can see a thread like the one below.

0:016> kc

ntdll!NtWaitForSingleObject
kernel32!WaitForSingleObjectEx
kernel32!WaitForSingleObject
BTSMessageAgent!CAdminCacheRefresh::OnCall
BTSMessageAgent!CThreadPoolWrapper::ThreadWorker
ntdll!RtlpWorkerCallout
ntdll!RtlpExecuteWorkerRequest
ntdll!RtlpApcCallout
ntdll!RtlpWorkerThread
kernel32!BaseThreadStart

7. The thread calls the stored procedure bts_ProcessHeartbeat_ with @dwCommand=0 every 60 seconds by default. The interval can be changed through the column ConfigurationCacheRefreshInterval of the table adm_Group in BizTalkMGMTDb, or through the BizTalk group settings in the BizTalk administration console.


exec [dbo].[bts_ProcessHeartbeat_Newhost] @uidProcessID=NULL,@dwCommand=0,@nHeartbeatInterval=60


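Here @dwCommand=0 means Process Live. (The suffix Newhost in this captured call appears to be the name of the host; the other calls in this post are shown without the suffix.)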

8. Looking at the code of the stored procedure bts_ProcessHeartbeat_, the internal SP int_ProcessCleanup_ is not called if dwCommand=0 (Process Live). In that case bts_ProcessHeartbeat_ only updates the host instance's record in the table ProcessHeartbeats in BizTalkMsgBoxDb, or inserts a new record if none exists.
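To summarize the three commands seen so far, here is a rough Python model of the dispatch. It is a sketch based only on the behaviour described above, not the actual stored procedure: the constant names, the heartbeats dictionary and the cleanup_calls list are assumptions made for illustration, and whatever the real SP does to the heartbeat record on startup or shutdown is not modelled.

```python
from datetime import datetime, timedelta

# Values of @dwCommand as described in the text; the names are made up.
PROCESS_LIVE, PROCESS_STARTUP, PROCESS_SHUTDOWN = 0, 1, 2

heartbeats = {}      # process id -> record, a toy ProcessHeartbeats table
cleanup_calls = []   # every simulated int_ProcessCleanup_ call


def process_heartbeat(process_id, command, now):
    """Simplified model of the behaviour described for bts_ProcessHeartbeat_."""
    if command in (PROCESS_STARTUP, PROCESS_SHUTDOWN):
        # Startup and shutdown both trigger the internal cleanup.
        cleanup_calls.append((process_id, command))
        return  # record handling for these commands is not modelled
    # Process Live: update the record, or insert it if it is missing.
    record = heartbeats.get(process_id)
    if record is None:
        record = {"dtCreationTime": now}
        heartbeats[process_id] = record
    record["dtLastHeartbeatTime"] = now


pid = "4d50992e-2ae1-4bbd-9d61-b712b052b99c"
t0 = datetime(2009, 11, 25, 4, 52, 44)

process_heartbeat(pid, PROCESS_STARTUP, t0)
process_heartbeat(pid, PROCESS_LIVE, t0)
process_heartbeat(pid, PROCESS_LIVE, t0 + timedelta(seconds=60))
process_heartbeat(pid, PROCESS_SHUTDOWN, t0 + timedelta(seconds=90))

print(cleanup_calls)                         # cleanup ran for commands 1 and 2 only
print(heartbeats[pid]["dtLastHeartbeatTime"])  # 2009-11-25 04:53:44
```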

9. Let’s look at a record in the table ProcessHeartbeats to see what it looks like.

uidProcessID:        4d50992e-2ae1-4bbd-9d61-b712b052b99c
nvcApplicationName:  BizTalkServerApplication
dtCreationTime:      11/25/2009 4:52:44 AM
dtLastHeartbeatTime: 11/25/2009 4:52:44 AM
dtNextHeartbeatTime: 11/25/2009 5:02:44 AM

uidProcessID is the unique GUID of each host instance. You can get the uidProcessID of a host instance by querying the UniqueId column of the table adm_HostInstance in BizTalkMGMTDb. You can also get it by checking the service property “Path to Executable” of the corresponding BizTalk service in the Windows Service Control Manager on a BizTalk box. For example, the following is the “Path to Executable” setting for my "BizTalkServerApplication" host instance on my BizTalk test box. The ID follows the command line option “-btsapp”.

"C:\Program Files (x86)\Microsoft BizTalk Server 2006\BTSNTSvc.exe" -group "BizTalk Group" -name "BizTalkServerApplication" -btsapp "{4D50992E-2AE1-4BBD-9D61-B712B052B99C}"

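If you want to pull the ID out of such a command line in a script, something like the following Python sketch could work. The function name and the regular expression are my own, and the sketch assumes the value after -btsapp is always a GUID in braces, as in the example above.

```python
import re
import uuid


def process_id_from_command_line(command_line):
    """Return the GUID that follows -btsapp, or None if it is not found."""
    match = re.search(r'-btsapp\s+"?\{([0-9A-Fa-f-]{36})\}"?', command_line)
    if match is None:
        return None
    return uuid.UUID(match.group(1))


path_to_executable = (
    r'"C:\Program Files (x86)\Microsoft BizTalk Server 2006\BTSNTSvc.exe" '
    r'-group "BizTalk Group" -name "BizTalkServerApplication" '
    r'-btsapp "{4D50992E-2AE1-4BBD-9D61-B712B052B99C}"'
)

print(process_id_from_command_line(path_to_executable))
# 4d50992e-2ae1-4bbd-9d61-b712b052b99c
```

The printed value is in the same lowercase form as the uidProcessID shown in the ProcessHeartbeats record.
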
You can see that dtNextHeartbeatTime is approximately dtLastHeartbeatTime + 10 * HeartbeatInterval. Since the HeartbeatInterval is 60 seconds by default (how to change it is described above), dtNextHeartbeatTime is about 10 minutes after dtLastHeartbeatTime by default. If the HeartbeatInterval is changed to 30 seconds, dtNextHeartbeatTime is about 5 minutes after dtLastHeartbeatTime.
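The arithmetic can be checked with a few lines of Python. This is only a worked example of the relationship observed in the record above; the function and constant names are made up, and the factor of 10 comes from that observation rather than from the stored procedure's source.

```python
from datetime import datetime, timedelta

HEARTBEATS_BEFORE_DEAD = 10  # factor observed in the ProcessHeartbeats record


def next_heartbeat_time(last_heartbeat, interval_seconds):
    """dtNextHeartbeatTime is roughly dtLastHeartbeatTime + 10 * interval."""
    return last_heartbeat + timedelta(seconds=HEARTBEATS_BEFORE_DEAD * interval_seconds)


last = datetime(2009, 11, 25, 4, 52, 44)

print(next_heartbeat_time(last, 60))  # 2009-11-25 05:02:44
print(next_heartbeat_time(last, 30))  # 2009-11-25 04:57:44
```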

10. Normally, the record for a given host instance in the table ProcessHeartbeats is updated every [HeartbeatInterval] seconds by the running host instance process through the SP bts_ProcessHeartbeat_.

11. Take a look at the SQL job MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb: it is configured to execute the stored procedure bts_CleanupDeadProcesses every minute.

12. Looking at the code of the SP bts_CleanupDeadProcesses, it simply goes through the table ProcessHeartbeats and checks whether there is a host instance whose dtNextHeartbeatTime is earlier than the current time. If there is, we have already missed at least 10 heartbeats from this host instance, so it is considered dead, and the SQL job calls the stored procedure int_ProcessCleanup_ for that instance to clean up the records related to the host instance process and release all messages and service instances locked by it.
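The scan described in this step can be modelled in Python as follows. Again this is a sketch under assumptions, not the real stored procedure: the dictionary stands in for the ProcessHeartbeats table, the process names are invented, and removing the heartbeat record after cleanup is my assumption about what "clean up the records" includes.

```python
from datetime import datetime, timedelta


def cleanup_dead_processes(heartbeats, now, process_cleanup):
    """Toy model of the scan described for bts_CleanupDeadProcesses."""
    dead = [pid for pid, record in heartbeats.items()
            if record["dtNextHeartbeatTime"] < now]
    for pid in dead:
        process_cleanup(pid)   # stand-in for int_ProcessCleanup_
        del heartbeats[pid]    # assumption: the heartbeat record is removed too
    return dead


now = datetime(2009, 11, 25, 5, 10, 0)
heartbeats = {
    # Next heartbeat deadline already passed: treated as dead.
    "crashed-host": {"dtNextHeartbeatTime": now - timedelta(minutes=7)},
    # Deadline still in the future: left alone.
    "healthy-host": {"dtNextHeartbeatTime": now + timedelta(minutes=9)},
}
released = []

dead = cleanup_dead_processes(heartbeats, now, released.append)

print(dead)              # ['crashed-host']
print(list(heartbeats))  # ['healthy-host']
```

In the real system this check runs once a minute from the SQL job, so a dead host instance is noticed within about a minute after its dtNextHeartbeatTime has passed.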

If this SQL job is disabled or fails when it runs, BizTalk loses the ability to detect a dead host instance.
