Back to blog

Mapping Mayhem For Email Archive Migrations

Feb 5, 2018 by Jason Jacobo

So, you’ve decided to do an email archive migration. That’s a great first step!

But what’s next?Sometimes deciding to take on a project like this can feel like one of the biggest hurdles. Once it’s done and you actually need to get started, it can be more than a little overwhelming – not to mention unclear what your first steps should be. Simply put, you must initially identify what data is in scope for migration, determine whose data that is, and where is an appropriately owned repository to gently place that content within.  Sounds straightforward…. right?

Shortly after popping the seal on a new migration, you need to get mapping – unfortunately, it’s not always as simple as it sounds. The following blog will provide some guidance on how an organization could go about collecting the data needed to perform a basic mapping project.

Although our email archive migration solution, Archive Shuttle, automates this process, some migration products do not, and some engineers leave it to the customer to provide the matching information. This frees consultants from responsibility for mismatched archives, which can result in scores of users’ data being migrated to the wrong location. If you are in discussions with Migration vendors, you may wish to get a clear description of your organization’s responsibilities in the project – as this complicated and time-consuming task may be yours to accomplish.

This blog will discuss performing a mapping task under the following context: your systems are consistent, there are no real data integrity issues, and your organization’s practices are “typical”. For this purpose we will also assume that you, as the engineer needing to collect this information, have sufficient access to do so. Sure, this may be a little unrealistic in the wild, but it is where we start nonetheless, plus any migration professional will know how to adapt to an environment’s idiosyncrasies, and get you to this starting point.

Now, we will walk through the process of gathering the data needed to perform such a migration, assembling that data into Excel, and then matching that data to produce a file that matches source IDs to target mailboxes. For our example, we will be using Veritas’ Enterprise Vault, as it is an example of one of the more popular email archiving systems.

Data collection before your archive migration

Let’s get started with data collection.

Source identification

To begin with, all we need is a list of archives that are in the source and are in scope for migration. This means that the organization will have to determine what to do with:

  • Shared Archive data
  • Users who have left the organization but still have data retained in the source system.
  • Users that got married and now have a different last name.
  • All other data to qualify what they want to migrate.

Once this is determined, we need a list of these archives and their corresponding unique identifier. You will quickly find that the Usage report provided lacks the archive ID required to be of any real value. To get this data you will have to author (or borrow) a SQL query. Going with the latter, I found a forum post that provides a query with nearly everything that I wanted. Some quick modifications and I had the following query for use:

SELECT EME.MbxDisplayName AS [Mailbox Name], ESE.ExchangeComputer AS [Exchange Server], EMS.Name AS [Exchange Database], EME.ADMbxDN AS [AD Details],

       T.SID AS OwningSID,

       VSE.VaultStoreName AS [Vault Store],

          ArchivePointId AS ArchiveID,

       AV.ArchiveName,

       vLV.ArchivedItems,

       CAST(vLV.ArchivedItemsSize AS decimal(20, 0)) AS ArchivedItemsSize,

       vLV.ModifiedDate

FROM view_ListVaults AS vLV

LEFT OUTER JOIN EnterpriseVaultDirectory.dbo.ExchangeMailboxStore AS EMS

INNER JOIN EnterpriseVaultDirectory.dbo.ExchangeMailboxEntry AS EME ON EMS.MbxStoreIdentity = EME.MbxStoreIdentity

INNER JOIN EnterpriseVaultDirectory.dbo.ExchangeServerEntry AS ESE ON EMS.ExchangeServerIdentity = ESE.ExchangeServerIdentity

INNER JOIN EnterpriseVaultDirectory.dbo.Root AS R ON EME.DefaultVaultId = R.VaultEntryId

INNER JOIN EnterpriseVaultDirectory.dbo.Archive AS A ON R.RootIdentity = A.RootIdentity

INNER JOIN EnterpriseVaultDirectory.dbo.VaultStoreEntry AS VSE ON A.VaultStoreEntryId = VSE.VaultStoreEntryId

RIGHT OUTER JOIN EnterpriseVaultDirectory.dbo.ArchiveView AS AV ON
R.RootIdentity = AV.RootIdentity ON vLV.ArchivePointId = AV.VaultEntryId

LEFT OUTER JOIN EnterpriseVaultDirectory.dbo.Trustee AS T ON R.OwningTrusteeIdentity = T.TrusteeIdentity

ORDER BY ESE.ExchangeComputer,

         EMS.Name,

         EME.MbxDisplayName ASC

This query had to be run per EV Vault Store database, and assumes the EV directory DB is on the same server. This query specifically suited my needs because we intended to have the organization’s Leavers in scope for migration, and this represented those users as a “Null” value being returned for some archives. Even in just one of our EV labs, we had four databases to collect data from.

We executed the query and wrote the results to a tab in an Excel Worksheet .Once compiled, we then created a new sheet that had all the output in one view. This sheet becomes very useful later, as you have the size of the archives and what vault store they come from. If you are going to a local target like Enterprise Vault or Exchange, you can use this for planning your waves to ensure they are balanced over all your resources.  For our purposes, it is the sheet you will use to match to targets. In our simple example, this is the only info we require – some migration solutions require the Display Name, ODN, canonical name and/or distinguished name in addition to the information we collected.  But before all that, you need to collect the account and target information.

Account and target identification

So we now have all the source information available for matching, but we need to collect the accounts they’re associated with, and the target for those accounts. For targets like Exchange or Office 365 this can be easy as all the required information is contained within Active Directory, and is therefore accessible via PowerShell.

There are many ways to get this information. For my purposes I used the following PowerShell command:

Get-User -RecipientTypeDetails UserMailbox -ResultSize Unlimited | where {$_.UseraccountControl -notlike "*accountdisabled*"}
| Select-Object Sid,DisplayName,WindowsEmailAddress,UserAccountControl | Export-Csv d:\temp\ActiveAccounts.csv

I took the resulting CSV and added it to my growing workbook, and now I’m ready to move on.

Data assembly

For a simple EV to Exchange migration, I have all the data I need. This same process is very similar for Exchange Online mappings, and the concept is the same for all migrations. Now, we need to put it together and provide the desired output. At this point, it’s important to determine what output the migration solution you are using requires.

Some solutions are flexible but typically they need the source ArchiveID and target mailbox for Exchange or Exchange Online migrations. That is what we will be seeking. I will be the first to tell you, I am not an Excel wizard like many people I know, but my Excel skills have increased dramatically since I have been working with migrations!

The method or approach you take towards this task may depend on what data you have in scope. For this example, I am assuming that all identified archives are in scope for migration, so I want to start with all the ArchiveIDs found through my query against the EV databases.

Once I have these in my spreadsheet, I use vLookup (which, if you have not used it before can change your life!) to create a simple formula to match the SID output of the source archive ID to the SID from your active account, returning the associated archive ID. If that reads like gibberish to you, don’t worry – it looks like the following example.

=VLOOKUP(AllSourceArchives!E2,ActiveAccounts!A:C,3,FALSE)

This results in a mapping file I could feed a migration product or associated SQL server to map the source Archive ID to the target email account:

[vc_single_image image=”6122″ img_size=”medium”]

As you can tell (even in our limited data set) we have archives with no clear owner…. something commonly caused by users leaving the organization. This is not surprising because Veritas recommends you archive all leaver’s email into their archive as part of the disablement process. This means that ownerless archives from people who left the organization (commonly known as “Leavers”) are common, and how to address them is a question for the organization.

Completing your email archive migration

These were some quick and easy example as to how you might go about matching source archives to their target mailboxes. This is a task that is one of the most basic requirements of any migration. It is also wrought with the potential of manual error and that can easily result in scores of users having content migrated to targets that is not theirs. Because of these reasons and more, this is a task frequently left to the organizations that require the migration, not to the engineers performing it.

Some solutions do not automate this at all, and import of CSV is the most common and practical means of wave mapping. Others have a rudimentary system that does not accommodate for the variety of circumstances seen outside of sterile lab environments. We have been successful enough with Archive Shuttle’s bulk mapping feature that we do not even offer the ability to import a CSV file for mapping within the UI!

One of our largest strengths as an organization is to listen, understand, define, and implement feedback from engineers, partners, and customers into our products to make them be better tomorrow then they are today. A gleaming manifestation of this can be seen in our User Dashboard. Here we present the information gathered about a single user prior to, and through, a migration. You can look up a user and review anything from his AD attributes to his associated archive(s), from his mailbox status, to any desired detail of his migration status (if enabled). We know this information is valuable to whomever is performing the migration, and manually tracking it down looks a lot like the above process.

If you are not able to use a product like Archive Shuttle to make the complicated task of a migration much easier, then make sure you are aware of the task ahead. Make sure to ask any potential vendors if their proposed solution has an automated means to match source archives to their associated targets. If they don’t, or if the default means fail, ask what about the options and what is required of you to accommodate those options.  If “provide a CSV” comes up, they you better brush up on some Excel because chances are you will have quite a high-risk task before you!

Here are some links to resources that helped me along the way:

How to use the Vault Store Usage Reporter for Enterprise Vault (EV)

PowerShell list of mailboxes belonging to enabled AD users

VLOOKUP function

What is the process if a mailbox that was Enterprise Vault (EV) enabled is no longer used or deleted?