Tag Archives: Exchange 2010

Exchange Server and Azure: “not now” vs “never”

Wow, look what I found in my drafts folder: an old post.

Lots of Exchange admins have been wondering whether Windows Azure can be used to host Exchange. This is to be expected for two reasons. First, Microsoft has been steadily raising the volume of Azure-related announcements, demos, and other collateral material. TechEd 2014 was a great example: there were several Azure-related announcements, including the availability of ExpressRoute for private connections to the Azure cloud and several major new storage improvements. These changes build on Microsoft’s aggressive evangelism, which has been succeeding in convincing iOS and Android developers to use Azure as the back-end service for their apps. The other reason, sadly, is why I’m writing: there’s a lot of misinformation about Exchange on Azure (e.g. this article from SearchExchange titled “Points to consider before running Exchange on Azure”, which is wrong, wrong, and wrong), and you need to be prepared to defuse its wrongness with customers who may misunderstand what they’re potentially getting into.

On its face, Azure’s infrastructure-as-a-service (IaaS) offering seems pretty compelling: you can build Windows Server VMs and host them in the Azure cloud. That seems like it would be a natural fit for Exchange, which is increasingly viewed as an infrastructure service by customers who depend on it. However, there are at least three serious problems with this approach.

First: it’s not supported by Microsoft, something that the “points to consider” article doesn’t even mention. The Exchange team doesn’t support Exchange 2010 or Exchange 2013 on Azure or Amazon EC2 or anyone else’s cloud service at present. It is possible that this will change in the future, but for now any customer who runs Exchange on Azure will be in an unsupported state. It’s fun to imagine scenarios where the Azure team takes over first-line support responsibility for customers running Exchange and other Microsoft server applications; this sounds a little crazy but the precedent exists, as EMC and other storage companies did exactly this for users of their replication solutions back in Exchange 5.5/2000 times. Having said that, don’t hold your breath. The Azure team has plenty of other more pressing work to do first, so I think that any change in this support model will require the Exchange team to buy in to it. The Azure team has been able to get that buy-in from SharePoint, Dynamics, and other major product groups within Microsoft, so this is by no means impossible.

Second: it’s more work. In some ways Azure gives you the worst of the hosted Exchange model: you have to do just as much work as you would if Exchange were hosted on-premises, but you’re also subject to service outages, inconsistent network latency, and all the other transient or chronic irritations that come, at no extra cost, with cloud services. Part of the reason that the Exchange team doesn’t support Azure is that there’s no way to guarantee that any IaaS provider is offering enough IOPS, low-enough latency, and so on, so troubleshooting performance or behavior problems with a service such as Azure can quickly turn into a nightmare. If Azure is able to provide guaranteed service levels for disk I/O throughput and latency, that would help quite a bit, but this would probably require significant engineering effort. Although I don’t recommend that you do it at the moment, you might be interested in this writeup on how to deploy Exchange on Azure; it gives a good look at some of the operational challenges you might face in setting up Exchange+Azure for test or demo use.

Third: it’s going to cost more. Remember that IaaS networks typically charge for resource consumption. Exchange 2013 (and Exchange 2010, too) is designed to be “always on”. The workload management features in Exchange 2013 provide throttling, sure, but they don’t eliminate all of the background maintenance that Exchange is more-or-less continuously performing. These tasks, including GAL grammar generation for Exchange UM, the managed folder assistant, calendar repair, and various database-related tasks, have to be run, and so IaaS-based Exchange servers are continually going to be racking up storage, CPU, and network charges. In fairness, I haven’t estimated what these charges might be for a typical test-lab environment; it’s possible that they’d be cheap enough to be tolerable, but I’m not betting on it, and no doubt a real deployment would be significantly more expensive.

Of course, all three of these problems are soluble: the Exchange team could at any time change their support policy for Exchange on Azure, and/or the Azure team could adjust the cost model to make the cost for doing so competitive with Office 365 or other hosted solutions. Interestingly, though, two different groups would have to make those decisions, and their interests don’t necessarily align, so it’s not clear to me if or when we might see this happen. Remember, the Office 365 team at Microsoft uses physical hardware exclusively for their operations.

Does that mean that Azure has no value for Exchange? On the contrary. At TechEd New Orleans in June 2013, Microsoft’s Scott Schnoll said they were studying the possibility of using an Azure VM as the witness server for DAGs in Exchange 2013 CU2 and later. This would be a super feature because it would allow customers with two or more physically separate data centers to build large DAGs that weren’t dependent on site interconnects (at the risk, of course, of requiring always-on connectivity to Azure). The cost and workload penalty for running a file share witness (FSW) on Azure would be low, too. In August 2013, the word came down: Azure in its present implementation isn’t suitable for use as an FSW. However, the Exchange team has requested some Azure functionality changes that would make it possible to run this configuration in the future, so we have that to look forward to.

Then we have the wide world of IaaS capabilities opened up by Windows Azure Active Directory (WAAD), Azure Rights Management Services, Azure Multi-Factor Authentication, and the large-volume disk ingestion program (now known as the Azure Import/Export Service). As time passes, Microsoft keeps delivering more, and better, Azure services that complement on-premises Exchange, which has been really interesting to watch. I expect that trend to continue, and there are other, less expensive ways to use IaaS for Exchange if you only want it for test labs and the like. More on that in a future post….

Filed under General Tech Stuff, UC&C

The value of lagged copies for Exchange 2013

Let’s talk about… lagged copies.

For most Exchange administrators, the subject of lagged database copies falls somewhere between “the Kardashians’ shoe sizes” and “which of the 3 Stooges was the funniest” in terms of interest level. The concept is easy enough to understand: a lagged copy is merely a passive copy of a mailbox database where the log files are not immediately played back, as they are with ordinary passive copies. The period between the arrival of a log file and the time when it’s committed to the database is known as the lag interval. If you have a lag interval of 24 hours set on a database, a new log for that database generated at 3pm on April 4th won’t be played into the lagged copy until at least 3pm on April 5th (I say “at least” because the exact time of playback will depend on the copy queue length). The longer the lag interval, the more “distance” there is between the active copy of the mailbox database and the lagged copy.
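To make the lag interval arithmetic concrete, here’s a minimal Python sketch of the example above. The function names and the year are my own placeholders, not anything Exchange exposes, and it only computes the earliest possible replay time; as noted, the actual playback can happen later.

```python
from datetime import datetime, timedelta


def earliest_replay_time(log_generated: datetime, lag_interval: timedelta) -> datetime:
    """Earliest moment a log may be played into the lagged copy."""
    return log_generated + lag_interval


def is_eligible_for_replay(log_generated: datetime, lag_interval: timedelta,
                           now: datetime) -> bool:
    """True once the full lag interval has elapsed for this log."""
    return now >= earliest_replay_time(log_generated, lag_interval)


# Log generated at 3pm on April 4th (the year is arbitrary), 24-hour lag interval.
generated = datetime(2014, 4, 4, 15, 0)
lag = timedelta(hours=24)

print(earliest_replay_time(generated, lag))  # 2014-04-05 15:00:00
print(is_eligible_for_replay(generated, lag, datetime(2014, 4, 5, 9, 0)))  # False
```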

Lagged copies are intended as a last-ditch “goalkeeper” safety mechanism in case of logical corruption. Physical corruption caused by a hardware failure will happen after Exchange has handed the data off to be written, so it won’t be replicated. Logical corruption introduced by components other than Exchange (say, an improperly configured file-level AV scanner) that directly write to the MDB or transaction log files wouldn’t be replicated in any event, so the real use case for the lagged copy is to give you a window in time during which logical corruption caused by Exchange or its clients hasn’t yet been replicated to the lagged copy. Obviously the size of this window depends on the length of the lag interval. Whether it is long enough for you to a) notice that the active database has become corrupted, b) play the accumulated logs forward into the lagged copy, and c) activate the lagged copy depends on your environment.

The prevailing sentiment in the Exchange world has largely been “I do backups already so lagged copies don’t give me anything.” When Exchange 2010 first introduced the notion of a lagged copy, Tony Redmond weighed in on it. Here’s what he said back then:

For now, I just can’t see how I could recommend the deployment of lagged database copies.

That seems like a reasonable stance, doesn’t it? At MEC this year, though, Microsoft came out swinging in defense of lagged copies. Why would they do that? Why would you even think of implementing lagged copies? It turns out that there are some excellent reasons that aren’t immediately apparent. (It may help to review some of the resiliency and HA improvements delivered in Exchange 2013; try this excellent omnibus article by Microsoft’s Scott Schnoll if you want a refresher.) Here are some of the reasons why Microsoft has begun recommending the use of lagged copies more broadly.

1. Lagged copies are better in 2013

Exchange 2013 includes a number of improvements to the lagged copy mechanism. In particular, the new loose truncation feature introduced in SP1 means that you can prevent a lagged copy from taking up too much log space by adjusting the amount of log space that the replay mechanism will use; when that limit is reached the logs will be played down to make room. Exchange 2013 (and SP1) also make a number of improvements to the Safety Net mechanism (discussed fully in Chapter 2 of the book), which can be used to play missing messages back into a lagged copy by retrieving them from the transport subsystem.

2. Lagged copies are continuously verified

When you back up a database, Exchange checks the page checksum of every page as it is backed up by computing the checksum and comparing it to the stored checksum; if that check fails, you get the dreaded JET_errReadVerifyFailure (-1018) error. However, just because you can successfully complete the backup doesn’t mean that you’ll be able to restore it when the time comes. By comparison, the Exchange log playback mechanism will log errors immediately when they are encountered during log playback. If you’re monitoring event logs on your servers, you’ll be notified as soon as this happens and you’ll know that your lagged copy is unusable now, not when you need to restore it. If you’re not monitoring your event logs, then lagged copies are the least of your problems.
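If the checksum-on-read idea is unfamiliar, here’s a toy Python sketch of it: each page carries a stored checksum, and verification recomputes the checksum and compares the two. Everything in it is an assumption for illustration. The page size, the header layout, and the use of CRC-32 are stand-ins, not the actual on-disk format or checksum algorithm that Exchange uses.

```python
import struct
import zlib

PAGE_SIZE = 64                 # toy page size in bytes, chosen only for the demo
HEADER = struct.Struct("<I")   # hypothetical layout: 4-byte stored checksum, then the body


def make_page(payload: bytes) -> bytes:
    """Build one page: stored checksum followed by a zero-padded body."""
    if len(payload) > PAGE_SIZE - HEADER.size:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - HEADER.size, b"\0")
    return HEADER.pack(zlib.crc32(body)) + body


def find_bad_pages(data: bytes) -> list[int]:
    """Return the numbers of pages whose computed checksum differs from the stored one."""
    bad = []
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]
        (stored,) = HEADER.unpack_from(page)
        if zlib.crc32(page[HEADER.size:]) != stored:
            bad.append(offset // PAGE_SIZE)
    return bad


db = bytearray(make_page(b"alpha") + make_page(b"bravo") + make_page(b"charlie"))
print(find_bad_pages(bytes(db)))   # []

db[PAGE_SIZE + 10] ^= 0xFF         # damage one byte in the body of the second page
print(find_bad_pages(bytes(db)))   # [1]
```

The point of the comparison in the text is when this check runs: a backup only runs it while the backup is being taken, whereas log playback into a lagged copy surfaces errors as they occur.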

3. Lagged copies give you more flexibility for recovery

When your active and passive copies of a database become unusable and you need to fall back to your lagged copy, you have several choices, as described in TechNet. You can easily play back every log that hasn’t yet been committed to the database, in the correct order, by using Move-ActiveMailboxDatabase. If you’d rather, you can play back the logs up to a certain point in time by removing the log files that you don’t want to play back. You can also play messages back directly from Safety Net into the lagged copy.
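The point-in-time option boils down to deciding which log files to keep in place and which to set aside before replay. Here’s a minimal Python sketch of that selection logic; the `LogFile` type, its fields, and the sample timestamps are hypothetical, and the sketch doesn’t touch any real files or Exchange tools.

```python
from datetime import datetime
from typing import NamedTuple


class LogFile(NamedTuple):
    generation: int      # position in the log sequence (hypothetical field)
    written: datetime    # when the log was closed (hypothetical field)


def split_at(logs, cutoff):
    """Return (replay, set_aside) for a point-in-time recovery.

    Replay stops at the first log written after the cutoff; that log and every
    later generation are set aside so the replayed sequence stays contiguous.
    """
    ordered = sorted(logs, key=lambda log: log.generation)
    for index, log in enumerate(ordered):
        if log.written > cutoff:
            return ordered[:index], ordered[index:]
    return ordered, []


# Four logs closed at 12:00, 13:00, 14:00, and 15:00; recover to 13:30.
logs = [LogFile(gen, datetime(2014, 4, 4, 11 + gen, 0)) for gen in (1, 2, 3, 4)]
replay, set_aside = split_at(logs, datetime(2014, 4, 4, 13, 30))

print([log.generation for log in replay])      # [1, 2]
print([log.generation for log in set_aside])   # [3, 4]
```

Cutting at the first log past the chosen time, rather than filtering logs one by one, reflects the fact that logs have to be played back in order.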

4. There’s no hardware penalty for keeping a lagged copy

Some administrators assume that you have to keep lagged copies of databases on a separate server. While this is certainly supported, you don’t have to have a “lag server” or anything like unto it. The normal practice in most designs has been to store lagged copies on other servers in the same DAG, but you don’t even have to do that. Microsoft recommends that you keep your mailbox databases no bigger than 2TB. Stuff your server with a JBOD array of the new 8TB disks (or, better yet, buy a Dell PowerVault MD1220) and you can easily put four databases on a single disk: the active copy of DB1, the primary passive copy of DB2, the secondary passive copy of DB3, and the lagged copy of DB4. This gives you an easy way to get the benefits of a 4-copy DAG while still using the full capacity of the disks you have: the additional IOPS load of the lagged copy will be low, so hosting it on a volume that already has active and passive copies of other databases is a reasonable approach (one, however, that you’ll want to test with jetstress).
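Here’s a minimal Python sketch of that four-copies-per-disk layout, assuming four servers that each contribute one 8TB disk. The server and database names are placeholders, and the rotation scheme is just one simple way to make sure every database gets one copy of each type on a different server.

```python
ROLES = ("active", "primary passive", "secondary passive", "lagged")
MAX_DB_TB = 2   # recommended maximum database size from the text
DISK_TB = 8     # disk size from the text


def build_layout():
    """Rotate copy roles so each server's disk holds one copy of each database."""
    n = len(ROLES)
    return {
        f"Server{s + 1}": [(f"DB{(s + r) % n + 1}", ROLES[r]) for r in range(n)]
        for s in range(n)
    }


# Four databases at the recommended maximum size must fit on one disk.
assert len(ROLES) * MAX_DB_TB <= DISK_TB

for server, copies in build_layout().items():
    print(server, ", ".join(f"{db} ({role})" for db, role in copies))
```

The first line of output matches the example in the text: Server1 holds the active copy of DB1, the primary passive copy of DB2, the secondary passive copy of DB3, and the lagged copy of DB4.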

It’s always been the case that the architecture Microsoft recommends when a new version of Windows or Exchange is released evolves over time as they, and we, get more experience with it in the real world. That’s clearly what has happened here; changes in the product, improvements in storage hardware, and a shift in the economic viability of conventional backups mean that lagged copies are now much more appropriate for use as a data protection mechanism than they were in the past. I expect to see them deployed more and more often as Exchange 2013 deployments continue and our collective knowledge of best practices for them improves.

Filed under UC&C

Getting ready for MEC 2014

Wow, it’s been nearly a month since my last post here. In general I am not a believer in posting stuff on a regular schedule, preferring instead to wait until I have something to say. All of my “saying” lately has been on behalf of my employer though. I have barely even had time to fly. For another time: a detailed discussion of the ins and outs of shopping for an airplane. For now, though, I am making my final preparations to attend this year’s Microsoft Exchange Conference (MEC) in Austin! My suitcase is packed, all my devices are charged, my slides are done, and I am prepared to overindulge in knowledge sharing, BBQ eating, and socializing.

It is interesting to see the difference in flavor between Microsoft’s major enterprise-focused conferences. This year was my first trip to Lync Conference, which I would summarize as being a pretty even split between deeply technical sessions and marketing focused around the business and customer value of “universal communications”. In reviewing the session attendance and rating numbers, it was no surprise that the most-attended sessions and the highest-rated sessions tended to be 400-level technical sessions such as Brian Ricks’ excellent deep-dive on Lync client sign-in behavior. While I’ve never been to a SharePoint Conference, from what my fellow MVPs say about it, there was a great deal of effort expended by Microsoft on highlighting the social features of the SharePoint ecosystem, with a heavy focus on customization and somewhat less attention directed at SharePoint Online and Office 365. (Oh, and YAMMER YAMMER YAMMER YAMMER YAMMER.) Judging from reactions in social media, this focus was well-received but inevitably less technical given the newness of the technology.

That brings us to the 2014 edition of MEC. The event planners have done something unique by loading the schedule with “Unplugged” panel discussions, moderated by MVP and MCM/MCSM experts and consisting of Microsoft and industry experts in particular technologies. These panels provide an unparalleled opportunity to get, and give, very candid feedback around individual parts of Exchange and I plan on attending as many of them as I can. This is in no way meant to slight the many other excellent sessions and speakers that will be there. I’d planned to summarize specific sessions that I thought might be noteworthy, but Tony published an excellent post this morning that far outdoes what I had in mind, breaking down sessions by topic area and projected attendance. Give it a read.

I’m doing two sessions on Monday: Exchange Unified Messaging Deep Dive at 2:45 PM and Exchange ActiveSync: Management Challenges and Best Practices at 11:45 AM. The latter is a vendor session with the folks from BoxTone, during which attendees get both lunch (yay) and the opportunity to see BoxTone’s products in action. They’re also doing a really interesting EAS health check, during which you provide CAS logs and they run them through a static analysis tool that, I can almost guarantee, will tell you things you didn’t know about your EAS environment. Drop by and say hello!

Filed under UC&C

iOS 7 Exchange ActiveSync problems revisited

Back in September I posted an article about a problem that occurred when synchronizing iOS 7 devices against Exchange 2010 SP2. The wheels of justice grind slowly, but Microsoft has released a KB article and accompanying hotfix that describe the symptoms precisely.

I also got an odd report from a large enterprise customer; they had several hundred iOS 7.0.2 devices, all on Verizon in one specific region, that were having synchronization problems. The issue here turned out to be a network configuration issue on Verizon’s network that required some action from them to fix.

Now you’re probably starting to see the value in solutions like those from BoxTone.

Filed under UC&C

Odd iOS 7.0x Exchange ActiveSync problem

from the oops-they-may-have-done-it-again department…

I just got an e-mail from a former coworker reporting a problem with synchronizing some, but not all, iOS 7.0.x devices with Exchange 2010 SP2. There are four users (Alex, Eric, James, and Peter, let’s say) with shiny new iPhone 5s devices. Two of them get the same error when syncing: the Provision verb is returning a status of 110 and throwing an exception from Microsoft.Exchange.Security.Compliance.MessageDigestForNonCryptographicPurposes.HashCore. This seems to point to a problem with crypto negotiation with the devices, but I haven’t been able to look at a trace of the conversation between the device and the server to check.

James’ device works fine. Alex’s device works fine. Peter’s device does not work, either with his own account or Alex’s. Eric’s device does not work with his account; no other accounts have been tested. This seems to indicate that the problem is not (necessarily) with the account. Peter and Eric have both wiped their devices, deleted their Exchange accounts, rebooted the devices, and done all the other stuff you might try when faced with this problem, but to no avail.

This Apple support forum thread seems to indicate that a few others have the same problem, but none of the recommended fixes have worked for Eric or Peter. My working theory is that this is due to an unwanted interaction of some kind between Exchange 2010 SP2 and iOS 7.x, but I can’t prove that yet. As far as I can tell, Exchange 2013 CU2 doesn’t have the same problem.

I’m posting this in hope that it might come to the attention of anyone else who’s having a similar problem so I can get a sense of its scope and nature.

More news when there is news…

Filed under UC&C

Do mailbox quotas matter to Outlook and OWA?

Great question from my main homie Brian Hill:

Is there a backend DB reason for setting quotas at a certain size? I have found several links (like this one) discussing the need to set quotas due to the way the Outlook client handles large numbers of messages or OST files, but for someone who uses OWA, does any of this apply?

Short answer: no.

Somewhat longer answer: no.

The quota mechanism in Exchange is an outgrowth of those dark times when a large Exchange server might host a couple hundred users on an 8GB disk drive. Because storage was so expensive, Microsoft’s customers demanded a way to clamp down on mailbox size, so we got the trinity of quota limits: prohibit send, prohibit send and receive, and warn. These have been with us for a while and persist, essentially unchanged, in Exchange 2013, although it is now common to see quotas of 5GB or more on a single mailbox.

Outlook has never had a formal quota mechanism of its own, apart from the former limit of 2GB on PST files imposed by the 32-bit offsets used as pointers in the original PST file format. This limit was enforced in part by a dialog that would tell you that your PST file was full and in part by bugs in various versions of Outlook that would occasionally corrupt your PST file as it approached the 2GB size limit. Outlook 2007 and later pretty much extinguished those bugs, and the Unicode PST file format doesn’t have the 2GB limit any longer. Outlook 2010 and 2013 set a soft limit on Unicode PSTs of 50GB, but you can increase the limit if you need to.

Outlook’s performance is driven not by the size of the PST file itself (thought experiment: imagine a PST with a single 10GB item in it as opposed to one with 1 million 100KB messages) but by the number of items in any given folder. Microsoft has long recommended that you keep Outlook item counts to a maximum of around 5,000 items per folder (see KB 905803 for one example of this guidance). However, Outlook 2010 and 2013, when used with Exchange 2010 or 2013, can handle substantially more items without performance degradation: the Exchange 2010 documentation says 100,000 items per folder is acceptable, though there’s no published guidance for Exchange 2013. There’s still no hard limit, though. The reasons why the number of items (and the number of associated stored views) matters are well enumerated in this 2009 article covering Exchange 2007. Some of the mechanics described in that article have changed in later versions of Exchange but the basic truth remains: the more views you have, and/or the more items that are found or selected by those views, the longer it will take Exchange to process them.

If you’re wondering whether your users’ complaints of poor Outlook performance are related to high item counts, one way to find out is to use a script like this to look for folders with high item counts.
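The logic behind such a check is simple enough to sketch in a few lines of Python. This is not the linked script: the folder names and counts below are made-up sample data, and in practice you’d feed it per-folder item counts exported from your own environment. The two thresholds are the 5,000 and 100,000 figures mentioned above.

```python
RECOMMENDED_MAX = 5_000    # long-standing per-folder guidance
UPPER_MAX = 100_000        # figure from the Exchange 2010 documentation


def classify(item_count):
    """Return a label for a folder's item count, or None if it is within guidance."""
    if item_count > UPPER_MAX:
        return "over 100,000"
    if item_count > RECOMMENDED_MAX:
        return "over 5,000"
    return None


def flag_folders(folder_item_counts):
    """Return (folder, count, label) for flagged folders, largest first."""
    flagged = [
        (name, count, classify(count))
        for name, count in folder_item_counts.items()
        if classify(count) is not None
    ]
    return sorted(flagged, key=lambda entry: entry[1], reverse=True)


# Hypothetical sample data for one mailbox.
sample = {
    "Inbox": 42_000,
    "Sent Items": 3_200,
    "Archive/2012": 130_500,
    "Deleted Items": 5_000,
}

for name, count, label in flag_folders(sample):
    print(f"{name}: {count} items ({label})")
```

With the sample data, only Archive/2012 and Inbox are flagged; a folder sitting exactly at 5,000 items is treated as within guidance.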

Circling back to the original question: there is a performance impact with high item count folders in OWA, but there’s no quota mechanism for dealing with it. If you have a user who reports persistently poor OWA performance on particular folders, high item counts are one possible culprit worth investigating. Of course, if OWA performance is poor across multiple folders that don’t have lots of items, or across multiple users, you might want to seek other causes.

Filed under UC&C

Microsoft Certified Systems Master certification now dead

I received a very unwelcome e-mail late last night:

Microsoft will no longer offer Masters and Architect level training rotations and will be retiring the Masters level certification exams as of October 1, 2013. The IT industry is changing rapidly and we will continue to evaluate the certification and training needs of the industry to determine if there’s a different certification needed for the pinnacle of our program.

This is terrible news, both for the community of existing MCM/MCSM holders and for the broader Exchange community. It is a clear sign of how Microsoft values the skills of on-premises administrators of all its products (because all the MCSM certifications are going away, not just the one for Exchange). If all your messaging, directory, communications, and database services come from the cloud (or so I imagine the thinking goes), you don’t need to spend money on advanced certifications for your administrators who work on those technologies.

This is also an unfair punishment for candidates who attended the training rotation but have yet to take the exam, those who were signed up for the already-scheduled upgrade rotations, and those who were signed up for future rotations. Now they’re stuck unless they can take, and pass, the certification exams before October 1… which is pretty much impossible. It greatly devalues the certification, of course, for those who already have it. Employers and potential clients can look at “MCM” on a resume and form their own value judgement about its worth given that Microsoft has dropped it. I’m not quite ready to consign MCM status to the same pile as CNE, but it’s pretty close.

The manner of the announcement was exceptionally poor in my opinion, too: a mass e-mail sent out just after midnight Central time last night. Who announces news late on Friday nights? People who are trying to minimize it, that’s who. Predictably, and with justification, the MCM community lists are blowing up with angry reaction, but, completely unsurprisingly, no one from Microsoft is taking part, or defending their position, in these discussions.

As a longtime MCM/MCSM instructor, I have seen firsthand the incredible growth and learning that takes place during the MCM rotations. Perhaps more importantly, the community of architects, support experts, and engineers who earned the MCM has been a terrific resource for learning and sharing throughout their respective product spaces; MCMs have been an extremely valuable connection between the real world of large-scale enterprise deployments and the product group.

In my opinion, this move is a poorly-advised and ill-timed slap in the face from Microsoft, and I believe it will work to their detriment.

Filed under FAIL, UC&C