
Avoiding Data Loss When Upgrading Software

A big concern when upgrading legacy software is keeping all the data intact and usable. In the worst case, being locked into old data formats may prevent an upgrade. In other cases, it might be necessary to accept some data loss. Often there are ways to get around the problem, but they take some extra work.

Consider several scenarios that can cause problems:

  • The existing software uses a proprietary, undocumented data format which is no longer supported.
  • The new software lets you import the old data, but the process is imperfect and risky.
  • The old data format is documented, but there’s no direct way to import it into the new software.

These situations are all serious headaches, but there are ways of dealing with them. Often there are several options, and some preserve important information better than others. Any conversion technique needs testing to make sure it does the job completely and accurately.

Undocumented Data Formats

Old applications sometimes store data in a format that no one else uses and that comes with no documentation. This can look like an insurmountable barrier to upgrading, but there’s usually a solution.

The legacy application may support exporting its data into an interchange format. Hopefully, the new software can import it directly. If not, at least it’s a known format, and it can be converted into a form which the new application can use.
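As a rough illustration of converting a known interchange format, the sketch below assumes the legacy application exports CSV with a header row and the new application accepts a JSON array of objects. Both formats, the function name csv_export_to_json, and the column names are hypothetical choices for the example, not properties of any particular product.

```python
import csv
import io
import json


def csv_export_to_json(csv_text: str) -> str:
    """Convert CSV text (header row first) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every value stays a string here; type conversion is a separate step
    # that depends on what the new application expects.
    records = [dict(row) for row in reader]
    return json.dumps(records, indent=2, ensure_ascii=False)


# Made-up sample standing in for the legacy application's export.
sample_export = (
    "id,name,joined\n"
    "1,Ada Lovelace,1843-10-01\n"
    "2,Alan Turing,1936-05-28\n"
)

converted_json = csv_export_to_json(sample_export)
print(converted_json)
```

Keeping every value as a string at this stage avoids silent changes such as dropped leading zeros; any type conversion can then be done deliberately and tested on its own.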

Third-party utilities may be able to convert the format. Some are open-source tools that lack an elegant user interface but get the job done.

There may be third-party information on the format which will let a developer write a conversion script. The Archive Team site, for example, has technical information on many obscure formats.

Import Options

The new software may have an option for importing legacy data. If the old and new software are from the same publisher, chances are this approach will work well without data loss. Software which imports someone else’s proprietary format may not work so well. The developers might not have had full information on the old format, and they may have found it necessary to reverse-engineer it.

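Software creators have more of an economic incentive to import data well than to export it well: a good import feature wins customers, while a good export feature makes it easier for them to leave. As a result, new applications are often able to deal with old data.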

Custom Conversion

Sometimes there’s no direct way to get from the old format to the new application. As long as it’s a documented format, this just means some custom software is needed. A conversion utility to transform data into a new format isn’t a huge task for a competent developer. There are open-source conversion applications, such as Pandoc, which let developers write new input and output modules.
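To give a sense of scale, here is a minimal sketch of a standalone conversion script. It does not use Pandoc. It assumes a hypothetical legacy format of fixed-width text records whose documented layout is a 5-character id, a 20-character name, and a 10-character balance, and it writes CSV. The layout, field names, and target format are all invented for the example.

```python
import csv
import io

# Hypothetical layout, as the old format's documentation might describe it:
# (field name, start offset, end offset)
LAYOUT = [("id", 0, 5), ("name", 5, 25), ("balance", 25, 35)]
RECORD_LENGTH = LAYOUT[-1][2]


def parse_record(line: str) -> dict:
    """Slice one fixed-width line into named fields."""
    if len(line) < RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(line)}")
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


def convert(legacy_text: str) -> str:
    """Convert fixed-width records to CSV, failing loudly on bad lines."""
    out = io.StringIO()
    fieldnames = [name for name, _, _ in LAYOUT]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for number, line in enumerate(legacy_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            writer.writerow(parse_record(line))
        except ValueError as exc:
            # Report the line number so a bad record is never dropped silently.
            raise ValueError(f"line {number}: {exc}") from exc
    return out.getvalue()


# Made-up sample data built to match the assumed layout.
legacy_data = "\n".join([
    "00001" + "Ada Lovelace".ljust(20) + "1250.00".rjust(10),
    "00002" + "Alan Turing".ljust(20) + "87.50".rjust(10),
])

csv_output = convert(legacy_data)
print(csv_output)
```

The script stops on the first malformed record rather than skipping it, on the theory that a conversion which quietly drops data is worse than one which halts and asks for attention.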

Sometimes manual editing seems like the only solution. That approach is labor-intensive and error-prone, so it should be a last resort.

Minimizing the Risk of Data Loss

Any data format conversion carries risks. The process should be tested as early as possible to make sure no essential information is lost or damaged. A successful conversion maps every data item to an equivalent in the new format without damaging any of them.
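One way to check this is to compare the source records against the converted ones, field by field. The sketch below assumes both sides can be loaded as lists of dictionaries sharing a unique "id" field. The function name compare_records and the sample data are hypothetical.

```python
def compare_records(source, converted, key="id"):
    """Return (missing, changed) when comparing converted data to its source.

    missing: key values present in the source but absent after conversion.
    changed: (key value, field, source value, converted value) tuples for
             fields whose value differs or is absent after conversion.
    """
    converted_by_key = {record[key]: record for record in converted}
    missing = []
    changed = []
    for record in source:
        target = converted_by_key.get(record[key])
        if target is None:
            missing.append(record[key])
            continue
        for field, value in record.items():
            if target.get(field) != value:
                changed.append((record[key], field, value, target.get(field)))
    return missing, changed


# Made-up data: record 3 was lost and record 2 lost its date.
source_records = [
    {"id": "1", "name": "Ada", "joined": "1843-10-01"},
    {"id": "2", "name": "Alan", "joined": "1936-05-28"},
    {"id": "3", "name": "Grace", "joined": "1944-07-02"},
]
converted_records = [
    {"id": "1", "name": "Ada", "joined": "1843-10-01"},
    {"id": "2", "name": "Alan", "joined": None},
]

missing, changed = compare_records(source_records, converted_records)
print("missing:", missing)
print("changed:", changed)
```

A check like this only works when values are expected to survive unchanged. Where the new format deliberately represents a field differently, the comparison needs to normalize both sides first.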

A conversion might be only partially successful. For instance, it might copy dates but not put them into the format which the new software demands. Testing the converted data on a small scale will catch problems early enough to find solutions.
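The date problem can be sketched concretely. The example assumes, purely for illustration, that the legacy data stores dates as MM/DD/YYYY and that the new software requires ISO 8601 (YYYY-MM-DD). Values that don't parse are collected for review instead of being passed through or discarded.

```python
from datetime import datetime

# Assumption for this example: the legacy application writes MM/DD/YYYY.
LEGACY_DATE_FORMAT = "%m/%d/%Y"


def to_iso_date(legacy_value: str) -> str:
    """Parse a legacy date string and return it as YYYY-MM-DD."""
    parsed = datetime.strptime(legacy_value.strip(), LEGACY_DATE_FORMAT)
    return parsed.date().isoformat()


def convert_dates(values):
    """Split values into converted ISO dates and rejected originals."""
    converted, rejected = [], []
    for value in values:
        try:
            converted.append(to_iso_date(value))
        except ValueError:
            # Wrong format or impossible date: keep it for manual review.
            rejected.append(value)
    return converted, rejected


# Small made-up sample, including one wrong format and one impossible date.
sample_dates = ["03/15/1998", "12/01/2004", "1998-03-15", "02/30/2001"]
iso_dates, rejected_dates = convert_dates(sample_dates)
print("converted:", iso_dates)
print("rejected:", rejected_dates)
```

Running a check like this on a small sample first shows how many values fall outside the expected format before the full migration is attempted.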

Evaluate more than one solution, if possible. The most obvious way to migrate the data isn’t always the best one. There might be a utility that does a better job than the legacy application’s export feature. If custom software is necessary to do the job right, it’s a better choice than a built-in conversion that loses important data.

With careful planning, a move away from legacy software can overcome the difficulties of data migration and avoid losing information.