SharePoint 2010 Versions – managing the data explosion

Something you find out early on about SharePoint, though sometimes not until after you’ve got it going, is that with versioning enabled it creates a lot of extra versions of your documents – one for every editing session. This can lead to an explosion in your storage needs.
[See this article on how versioning works, and when additional versions are created automatically.]

Disabling versioning on your libraries is a tempting early reaction, but there are good ways to prevent that data explosion using both settings and a regular housekeeping run across your documents to purge older minor versions. Below I’ll make a few comments about the settings and about how you might build a housekeeping tool.

Versioning Settings

Versions aren’t always needed, so you’ll want to assess which of your libraries should have versioning turned on. Note: if you are migrating documents from an older DMS into SharePoint, you’ll need versioning on to import the version history of your documents, and you’ll need to keep it on if you want to see those versions afterwards.

No versioning: If your library is set with no versioning, editing a document overwrites the copy that is there, just as if it were on your C drive. No separate versions are possible.

When versioning is on, SharePoint has two options: Major versions only, and Major with Minor (draft) versions.

Major versions only: SharePoint creates a new version for every editing session on a document, so the Major versions only setting is most useful if you are checking the document out and checking a new published version back in. Otherwise you end up with a whole bunch of extra major versions that look significant (they are major versions, after all) but are really just drafts.
[Note: You can do a housekeeping run to delete older major versions but you can’t easily tell which, if any, of those older versions should be kept.]

Major and minor versions: For editing documents within SharePoint, if you want meaningful version control you need to have “create major and minor versions” turned on. This creates a minor version for every editing session, and the user decides when to create a major version. Minor versions can also be hidden from readers.
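To make the numbering concrete, here is a tiny Python model of how version labels advance under this setting. It is only an illustration, not SharePoint code: the names next_version and replay are made up for this sketch, and it assumes the usual labelling where drafts are numbered 0.1, 0.2, ... and publishing moves to the next whole number.

```python
from typing import List, Tuple

# (major, minor); minor == 0 means a published major version.
Version = Tuple[int, int]


def next_version(current: Version, publish: bool) -> Version:
    """Return the label that follows `current` after an edit or a publish."""
    major, minor = current
    if publish:
        # Publishing creates the next major version and resets the draft counter.
        return (major + 1, 0)
    # An ordinary editing session only bumps the minor (draft) number.
    return (major, minor + 1)


def replay(events: List[str]) -> List[str]:
    """Replay a list of "edit" / "publish" events and return the labels created."""
    version: Version = (0, 0)  # nothing saved yet
    labels = []
    for event in events:
        version = next_version(version, publish=(event == "publish"))
        labels.append(f"{version[0]}.{version[1]}")
    return labels
```

Replaying two edits, a publish, another edit and another publish yields the labels 0.1, 0.2, 1.0, 1.1 and 2.0, which shows how quickly drafts accumulate between major versions.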

You can manage the drafts (minor versions) by setting how many major versions to keep drafts for. If you set it to 1, then as soon as you create a major version all the minor versions disappear. Even so, a lot of versions can still pile up, especially if users don’t bother to create a major version. This is where a housekeeping tool comes in handy.
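The effect of that setting can be sketched with a small Python model. This is a simplification of the rule as described in the paragraph above, not SharePoint’s actual trimming logic; trim_drafts and its keep_drafts_for parameter are hypothetical names used only for illustration.

```python
from typing import List, Tuple

# (major, minor); minor == 0 means a published major version.
Version = Tuple[int, int]


def trim_drafts(versions: List[Version], keep_drafts_for: int) -> List[Version]:
    """Keep every major version, plus drafts belonging to the newest
    `keep_drafts_for` published majors (simplified model, see lead-in)."""
    if keep_drafts_for < 1:
        raise ValueError("keep_drafts_for must be at least 1")

    # Published major numbers, newest first.
    published = sorted({major for major, minor in versions if minor == 0}, reverse=True)

    if len(published) >= keep_drafts_for:
        # Drafts numbered below this major are dropped.
        threshold = published[keep_drafts_for - 1]
    else:
        # Not enough major versions yet: nothing is trimmed.
        threshold = 0

    return [
        (major, minor)
        for major, minor in versions
        if minor == 0 or major >= threshold
    ]
```

With a history of 0.1, 0.2, 1.0, 1.1, 1.2, 2.0 and the setting at 1, only 1.0 and 2.0 survive in this model. Note that if nobody ever publishes a major version, nothing is trimmed at all, which is exactly the case a housekeeping run has to cover.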

A housekeeping tool to remove versions

I’m sure there must be some commercial products available to do this, but so far I haven’t spotted any. I’ll update this post when I do, or make a tool available here. [Note: This issue has just come up on a current project, which inspired me to write this post now rather than wait until we had sorted out our way forward.]

Technically, it appears quite straightforward to create such a tool: something you might run once a month to purge all minor versions that are older than a configurable number of months.
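The selection logic at the heart of such a tool might look like the Python sketch below. It only decides which versions would be purged and does not touch SharePoint at all. The VersionInfo record and the versions_to_purge function are hypothetical, and a month is approximated as 30 days to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class VersionInfo:
    """Hypothetical record for one document version (not a SharePoint type)."""
    label: str              # e.g. "1.2"
    created: datetime
    is_current: bool = False


def is_minor(label: str) -> bool:
    """A label such as "1.2" is a draft; "1.0" is a major version."""
    return int(label.split(".")[1]) != 0


def versions_to_purge(
    versions: List[VersionInfo], max_age_months: int, now: datetime
) -> List[VersionInfo]:
    """Select minor versions older than the cutoff, never the current version."""
    # Assumption: one month is treated as 30 days.
    cutoff = now - timedelta(days=30 * max_age_months)
    return [
        v
        for v in versions
        if not v.is_current and is_minor(v.label) and v.created < cutoff
    ]
```

Passing the current time in as a parameter, rather than reading the clock inside the function, makes it easy to do a dry run against a fixed date before deleting anything for real.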

Technical notes:

The raw info for creating such a tool can be gleaned from articles like these:

  • Clean up item and file versions in SharePoint using PowerShell by Stefan Bauer, which shows how to embed a bit of C# in a PowerShell script to remove all versions (not quite what we want, but useful info). It also looks at the ContentIterator class, which apparently removes the need to iterate over every site and document yourself and avoids hammering the database with loads of requests. It’s not thread safe, but a complete sample script is provided.
    I haven’t tested it yet, but at the very least it looks to have some good building blocks. Combined with the methods of the SPFileVersionCollection object described in the article linked below, it should all hang together.
  • Documents and versioning by Anita, which discusses the classes that make this easy. The SPFileVersionCollection is particularly useful: it holds all but the current version, and its DeleteAllMinorVersions method lets you delete all minor versions except the current one!
