beingexchanged: April 2009

Monday, April 27, 2009

The Dread Pirate Re-Seed – Part 1

Among the most common questions I get from clients about the new data-protection features in Exchange 2007 (and the soon-to-be-released Exchange 2010) is, “What is a re-seed and why does it happen?”  This mostly falls into the category of “fear of the unknown” since the technology is new, and documentation on how it works is somewhat scarce.

Re-seeds are a commonly confusing part of most protection methods, though they go by different names and methodologies.  In a solution like Double-Take (see disclaimer below), they’re called re-mirrors or re-synchronization operations – and are typically difference-only.  In a tape-backup solution it’s a restore operation, which might be everything, incremental pieces, or some combination thereof.  In Exchange 2007 these operations are called re-seeds: essentially the replay of data from a server that has a “correct” copy to one that does not.  We see these operations in Exchange 2007 Local Continuous Replication (LCR), Cluster Continuous Replication (CCR) and Standby Continuous Replication (SCR).  Today, we’ll talk about CCR, and address LCR/SCR next week.

CCR allows the active node of a 2-node Active/Passive Exchange cluster to replicate a copy of its data to the passive node.  This allows the passive node to take over with a nearly-current copy of the data if the production system fails due to hardware or software failure.  There is a log replay lag to be considered, but it’s only 50 logs that need to be applied to the passive node during a rollover event, and that lag does give you some measure of protection against corruption if you catch it fast enough.  Otherwise, the system acts much like a traditional Single Copy Cluster (formerly Shared Disk Cluster) in behavior, and is controlled with a combination of Windows cluster tools and PowerShell.

Whenever the current log file is filled and closed, and a new prime log (usually E00.log) is created, the closed log is copied over to the passive node via an SMB share, where it is held until it passes the 50-log replay limit and is then committed to the database – or a rollover occurs, in which case the outstanding logs are committed immediately.  Exchange 2010 will move away from the SMB share, but will utilize a similar methodology overall, if the beta is to be taken at face value.
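You can watch that copy-and-replay pipeline from the Exchange Management Shell.  A minimal sketch – the storage group name is a placeholder for your own:

```powershell
# How far behind is the passive node?  CopyQueueLength is logs not yet
# copied; ReplayQueueLength is logs copied but not yet replayed.
# "First Storage Group" is a placeholder - substitute your own identity.
Get-StorageGroupCopyStatus -Identity "First Storage Group" |
    Format-List SummaryCopyStatus, CopyQueueLength, ReplayQueueLength
```

If those queue lengths climb steadily and never drain, that is your early warning that the copies may be drifting toward divergence.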

In order to get the passive node in sync with the active data, the CCR system starts with a re-seed operation.  All data from the database is copied from the active node to the passive node, as well as any non-truncated logs.  From then on, only log files are copied, as they are committed on the active node.  If all goes well, this will probably be the only re-seed you see unless you have a rollover.
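When a re-seed does have to be started (or restarted) by hand, it is driven from the shell against the passive copy.  A sketch, with placeholder node and storage group names – suspend the copy first, then seed:

```powershell
# Run against the passive copy; "CCRNODE2\First Storage Group" is a
# placeholder for your own passive node and storage group.
Suspend-StorageGroupCopy -Identity "CCRNODE2\First Storage Group"

# Re-copy the database from the active node. -DeleteExistingFiles
# clears out the old, divergent copy before the seed begins.
Update-StorageGroupCopy -Identity "CCRNODE2\First Storage Group" -DeleteExistingFiles
```

Update-StorageGroupCopy resumes replication on its own once the seed completes, unless you tell it otherwise.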

If you do flip nodes – let’s say from Node A to Node B – then Node B will re-seed back to Node A if Node A becomes divergent: that is, if Exchange cannot determine which logs still exist on Node A, if the logs are inconsistent, or if some are missing.  A graceful rollover will not cause a re-seed, but most emergency rollovers will require one.
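The graceful case is a planned move of the clustered mailbox server, done from the shell rather than the generic Windows cluster tools.  A sketch – the CMS and node names are placeholders:

```powershell
# Graceful, scheduled rollover of the clustered mailbox server from
# Node A to Node B.  All names here are placeholders for your own.
Move-ClusteredMailboxServer -Identity "CMS-EXCH01" `
    -TargetMachine "NODEB" `
    -MoveComment "Planned maintenance on Node A"
```

Because this move lets Exchange close out the logs cleanly, the old active node stays consistent and no re-seed should be needed afterward.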

The same will happen if you haven’t rolled over, but instead Node B was offline for some other reason.  When Node B comes back online, Node A will see if all the required logs are on both machines, and then either just continue CCR protection or else initiate a re-seed to copy the data over again if anything is amiss.  The only issue here is if your backup tools purge logs while Node B is still offline.  In that case the servers will be considered divergent and need a re-seed to get back up and running properly.

Finally, if a cluster is restored from a backup (tape or otherwise) to the active node, then a re-seed must be manually initiated to re-sync the nodes properly.  You will see errors telling you to do this after the restore is complete and you bring Node A back online.

One other condition exists, but it is manually created.  If you perform an Offline Defragmentation of the database, you will trigger a re-seed operation when Node A is brought back online.  As long as the first Exchange log is still present (which it should be), this will happen automatically; otherwise, it will need to be initiated manually.

So, why is this an issue?  Normally, it’s not, but keep in mind that re-seed operations are *full* copies of the entire database.  If you have relatively small databases and only a few of them, this isn’t a problem.  But let’s say you have over 1 Terabyte of data in your Exchange cluster.  Re-seeding that much data locally will be time- and resource-consuming, and doing it over a WAN (for distributed failover clustering) could be problematic, to say the least.  So you want to avoid unnecessary re-seed operations wherever possible, which means treating the CCR cluster very carefully and following all of Microsoft’s best practices on Exchange 2007 clustering in general.
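To put a rough number on that, here is a back-of-the-envelope, best-case estimate in plain PowerShell arithmetic; the 100 Mbit/s link speed is an assumption, and real-world overhead will only make it worse:

```powershell
# Best-case wire time to re-seed 1 TB over an assumed 100 Mbit/s WAN
# link, ignoring protocol overhead, latency, and production load.
$databaseBytes  = 1TB                 # PowerShell's 1TB literal = 2^40 bytes
$linkBitsPerSec = 100 * 1000 * 1000   # assumed 100 Mbit/s link
$hours = ($databaseBytes * 8) / $linkBitsPerSec / 3600
"{0:N1} hours" -f $hours              # roughly a full day, before overhead
```

A full day of saturating your WAN link is exactly the kind of event you plan around, not stumble into.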

For information on when re-seeds occur, take a look at this TechNet article.  They’re not an everyday occurrence, but you will need to be sure you know when and why they will happen to avoid confusion and frustration.

posted by Mike Talon

Wednesday, April 22, 2009

STILL no flying cars

Those of my readers who were following me when I first started the blog over a year ago will remember my treatise on the lack of a long-promised improvement to Exchange – namely, an overhaul of the database engine.  Well, Exchange 2010 (formerly Exchange 14) has gone beta, and we’re still waiting for it.

Yet again, Microsoft has created a revolutionary (in the colonial sense) jump in Exchange technology while simply updating the tired old ESE engine that has been part of Exchange since the earliest releases.  While the older engine does work, it is still not the SQL back end that we’ve been promised for over a decade now.

Change is hard, I understand that.  Re-writing the Exchange system to embrace a more flexible, more open database platform would require an amazing development effort and take time and resources.  I really do get all that.  However, *giving* us a more flexible, more open database engine would go much farther toward providing an upgrade worthy of the name *upgrade* instead of one that should have been provided as a service pack for Exchange 2007.

Integrated archiving is a very good thing.  Enhanced recovery options are also good – though they’re still built on log-shipping and therefore not the best possible solution (see disclaimer below; I’m biased on this topic).  Many of the tools in this release are quite welcome, but they’re not a really good reason to spend thousands of dollars in tight times on a brand-new messaging platform.  A revised – preferably SQL-based – database engine would indeed have made Exchange 2010 cost-justifiable.  With file-system streaming, better control, and more access for those who are granted it, Exchange could become the true cornerstone of messaging and collaboration it was designed to be.  Without it, Exchange is just another email system, and nothing more.

Still lagging in support for Public Folders, even though the market has clearly shown that they must remain part of Exchange for the foreseeable future.  Still relying on log-shipping (which could benefit from the CDM technologies in SQL Server).  Still forcing users to learn new tools over and over and over again.  Exchange 2010 is not what we expected, and not truly what we wanted.  If it had been released as Exchange 2007 R2 or even a Service Pack, I think the market would have embraced it; but as a brand-new (non-backward-compatible) platform, IT managers will be hard-pressed to get it onto their datacenter floors.

Yes, I am biased on topics revolving around High Availability, and I am careful to say so in my blog postings and conversations.  Even in light of that stated fact, I feel I have to speak out about Exchange 2010 and the timing, structure, and nature of its release.  Vista was a nightmare; Exchange 2010 may become a disaster.


Monday, April 13, 2009

Public airing of Exchange laundry...

Last week we finished up talking about what happens to Private Mail Stores during Online Maintenance (OM) on your Exchange servers.  While most people tend to focus on what happens in those Private Stores, the Public Stores are just as important, especially if you're not on Exchange 2007 yet.

For all versions of Exchange, Public Folders go through most of the same processes as their Private Mailbox cousins: to start, the Exchange system scavenges items that are marked for deletion and have passed the deleted-item retention window.  The system will also get rid of expired folders - a process similar to deleted-mailbox cleanup in the Private Mailbox Stores - and perform an online defragmentation of the databases.  General maintenance and cleanup for Public Folders should be given the same priority as OM on your Private Mailbox Stores, even if you're on Exchange 2007.

With Public Folders in Exchange 2000 and 2003, and even in 2007 in some circumstances, there is another vital process that goes on each day. It is not technically part of OM, but should be considered on par with the OM process.

As many people notice, the first time you boot up a newly-installed Exchange server, Outlook will complain about not being able to find "a required file" when you do a send/receive operation.  This is because - unless you're using a native Exchange 2007/Outlook 2007 environment - Outlook cannot find the Offline Address Book (OAB), which doesn't get created in the Administrative Public Folders until around 5am the morning after the Exchange server goes online.  While not officially part of the OM process, the creation and subsequent updates of the OAB do happen overnight - 5am by default, just after the OM process typically completes.

OABgen (the process that - wait for it - generates the OAB) should be considered part of your OM systems. If you do not allow this to occur regularly, your OAB doesn't get updated and therefore your end users don't have the latest information - causing no end of frustration for you and your technical staff.  You could manually create/update the OAB using tools inside of Exchange, but the automated process is still your best bet for making sure these critical systems get updated regularly.
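If you do need to kick the OAB by hand between scheduled runs, the shell can do it.  A sketch - the identity shown is the Exchange 2007 default OAB name, so verify yours with the first command before updating:

```powershell
# See which OABs exist, which server generates them, and on what schedule.
Get-OfflineAddressBook | Format-List Name, Server, Schedule

# Force an immediate rebuild rather than waiting for the overnight run.
Update-OfflineAddressBook -Identity "Default Offline Address Book"
```

This is handy after a big batch of user changes, but as noted above, the automated overnight generation should remain your baseline.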

And one final note.  For those who believe they can get away with no Public Folders and/or PF maintenance because you are on Exchange 2007, be aware of one very important caveat.  All Outlook clients must be on Outlook 2007, and no Entourage or other MAPI clients can be used. That's because all MAPI systems before Outlook 2007 required the use of Public Folders for administrative information, like the OAB and other components.  In Outlook 2007, that's handled by Active Directory and Exchange itself, but if there is even one legacy client in your environment, you'll still need at least one PF tree, and therefore need to perform regular Online Maintenance on those Public Folder Stores.

Next week, back to regular blog topics - and speaking of which, if you have a topic you'd like me to cover, go ahead and drop me a line at MikeTalonNYC@yahoo.com


Tuesday, April 7, 2009

Mailbox Separation Anxiety

As we have been discussing over the last few weeks, Exchange 2000-2007 runs through quite a few operations during the Online Maintenance (OM) process.  We've talked about how databases get cleaned up, and how messages get deleted (temporarily and then permanently).  This week, let's look at Deleted Mailbox Retention (DMR) and what it can do for you.

DMR is the process that Exchange uses to remove mailboxes that have aged past their expiry date in the Private Database.  These expired mailboxes can be created by different operations within Exchange, either directly by the Exchange administrator or by operations in Active Directory that inadvertently cause a mailbox disconnect.  Disconnected mailboxes are simply mailboxes that no longer have an associated user or object in Active Directory.  The simplest way to create one is to delete a user account in AD, which disconnects the mailbox assigned to that user as part of the process.  However, this isn't the only operation that can cause mailboxes to become marked for deletion.  There is an odd side-effect to moving mailboxes that can happen if you jump between AD sites and don't finish fast enough (you can read about it here), and of course, AD corruption can disconnect mailboxes as well.

Once a mailbox is disconnected, the first OM run that happens after the disconnect will mark the mailbox to expire in a user-defined number of days.  The default is 30 days, so if you run maintenance daily, the mailbox will stay in the database - disconnected - for 30 days after it is separated from the mail-enabled object (like a user) in AD.  It also means you have to take your OM schedule into account in your retention planning.  For example, if you only do OM once per week, you will need to add up to 7 days to your overall retention time for planning purposes.  Since the 30-day clock doesn't start ticking until the mailbox is marked during OM, you extend the amount of time the data remains in the database.  This is critical for regulatory compliance, as holding data for an extra week could put you over the line in terms of meeting your regulatory guidelines.
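That planning math can be sketched in the shell.  The database identity and the OM interval below are placeholders for your own environment:

```powershell
# Worst-case lifetime of a disconnected mailbox: the configured
# retention window plus the gap until the next OM run marks it.
# "SERVER\First Storage Group\Mailbox Database" is a placeholder.
$db = Get-MailboxDatabase -Identity "SERVER\First Storage Group\Mailbox Database"
$retentionDays  = $db.MailboxRetention.Days   # 30 by default
$omIntervalDays = 7                           # assumed weekly OM schedule
"Plan for up to {0} days in the database" -f ($retentionDays + $omIntervalDays)
```

For compliance planning, it is that worst-case sum, not the configured retention value alone, that should go into your documentation.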

The other part of this OM process happens when Exchange scavenges the database for expired mailboxes.  If it finds one, and the mailbox hasn't been re-assigned to another AD object, Exchange will permanently delete the data from the database, creating white space that OM then cleans up for Exchange to reuse.

So, here again, it becomes vital that OM is permitted to run on a regular basis.  If it isn't, deleted mailboxes will not be removed from the database unless you manually launch the cleanup tools.  This is more than just a regulatory problem: without removing the old information and freeing up white space, your databases will grow much more rapidly than if you regularly clean up your mail stores.
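For the manual path, here is a sketch of the relevant cmdlets; the database identity is a placeholder:

```powershell
# Immediately mark mailboxes whose AD object is gone as disconnected,
# instead of waiting for the next OM pass to find them.
Clean-MailboxDatabase -Identity "SERVER\First Storage Group\Mailbox Database"

# Review what is currently disconnected and when the retention clock started.
Get-MailboxStatistics -Database "SERVER\First Storage Group\Mailbox Database" |
    Where-Object { $_.DisconnectDate -ne $null } |
    Format-Table DisplayName, DisconnectDate
```

The second command is also the quickest way to confirm a mailbox really is disconnected before you decide whether to reconnect it or let retention take its course.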

Next week... Public Folder operations during Online Maintenance, so stay tuned!


Friday, April 3, 2009

Posted an article on CCR and SCR on DTUG

Hi folks, I posted a new article on SCR and CCR on the Double-Take User Group Blog (http://userblog.doubletake.com/) that goes into some great detail on where to use the built-in tools versus 3rd-party stuff.  Note the disclaimer below - I'm a DT employee, so the article isn't really objective enough to post here directly, but I thought my readers might like to see it, so I'm providing links!

Of SCR--And CCR--And Double-Take--And Things--

Still going to continue with my articles on Online Maintenance next week, so stay tuned.

And here's a snippet of my article on the DTUG:

Over the course of the last 18 months, many more clients have been experimenting with the new data protection technologies in Exchange 2007, most notably Cluster Continuous Replication (CCR) and Standby Continuous Replication (SCR).  These tools were included with Exchange 2007 RTM and SP1 respectively, and marked a milestone in the history of Exchange Server.  Microsoft recognized that to get more companies to switch to, or stick with, Exchange Server, a compelling story for data protection had to be introduced - in much the same way as the Walrus and the Carpenter needed to create stories to convince their oyster friends to follow them.  Hopefully, it’s not to the same end as the Walrus and the Carpenter (you can read the entire narrative poem here), but at least in theory, the stories serve the same purpose.
