Filename issues when moving from Dropbox to SharePoint

By aaron.axvig, Tue, 03/31/2020 - 16:47

I was recently helping someone with a transition from Dropbox to SharePoint/Teams/OneDrive for Business.  They were running into issues with filenames.  As an all-Mac business, they had created many files with colons in the file or folder name.  Windows doesn't allow those characters and SharePoint does not appreciate them either.

Some Googling suggested that downloading a ZIP of the Dropbox folder might solve the problem.  I found that when I extracted the ZIP the offending files were just missing.

I ended up creating a droplet on DigitalOcean and syncing the Dropbox folder to it.  Even though it was 20 GB and 14,000 files, the sync only took three or four minutes!

Then I set about carefully renaming things.  Thanks to StackOverflow I mainly worked with variations of this command:

    find . -type f -name "*:*" -exec rename -n 's/:/-/g' {} +

It renames every file whose name contains a colon, replacing each colon with a dash.  With -type f it does not touch directories that contain a colon, and files inside such directories will fail to rename because the substitution is applied to the whole path.  Run it again with -type d to rename the directories.  With -n it just tells you what it would do; remove the -n to actually make the changes.  Append | wc -l on the end if you want to count how many issues you have.
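A minimal sketch of an alternative that handles files and directories in one pass, assuming a POSIX shell and no filenames containing newlines.  The function name fix_colons and its "apply" argument are made up for illustration; this is not the command used above.  It walks the tree depth-first and changes only the last path component, so a parent directory is renamed after everything inside it.

```shell
#!/bin/sh
# Hypothetical helper: replace ':' with '-' in file and directory names.
# Dry run by default (like rename -n); pass "apply" as the second argument
# to actually rename.
fix_colons() {
    root=$1
    apply=${2:-}
    # -depth lists a directory's contents before the directory itself,
    # so children are renamed while their parent path is still valid.
    find "$root" -depth -name '*:*' | while IFS= read -r path; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        # Only the final path component is rewritten, never the parent path.
        new=$(printf '%s' "$base" | tr ':' '-')
        if [ -e "$dir/$new" ]; then
            # Never overwrite an existing file or directory.
            printf 'skip (target exists): %s\n' "$path" >&2
        elif [ "$apply" = "apply" ]; then
            mv -- "$path" "$dir/$new"
        else
            printf '%s -> %s\n' "$path" "$dir/$new"
        fi
    done
}
```

Running fix_colons . prints the planned renames, and fix_colons . apply performs them.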

Towards the end I was just using find . -name "*:*" to do final checks, and I repeated that for each of the invalid characters.  Some of them, such as ? and *, are wildcards in a find pattern and require \ as an escape character; for example find . -name "*\?*".
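The per-character checks can be looped.  This is a rough sketch, assuming a POSIX shell and no filenames containing newlines; the function name count_invalid is made up, and the character list is the set OneDrive and SharePoint document as invalid in names (minus /, which cannot appear in a Linux filename anyway).  Every character is backslash-escaped in the pattern, which is harmless for the ones that are not wildcards.

```shell
#!/bin/sh
# Hypothetical helper: for each problem character, count how many file or
# directory names under the given root contain it.
count_invalid() {
    root=$1
    for c in '"' '*' ':' '<' '>' '?' '\' '|'; do
        # "*\\$c*" expands to e.g. *\?* so the character is matched literally.
        n=$(find "$root" -name "*\\$c*" | wc -l | tr -d ' ')
        printf '%s %s\n' "$c" "$n"
    done
}
```

Running count_invalid . prints one line per character with its count, so a clean tree shows all zeros.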

Then you may find out other odd things.  For example, file and folder names cannot start with a space.  This command can help with that (use -type d for folders):

    find . -type f -name " *" -exec rename -n 's/\/ /\//' {} +
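Note that the substitution above removes the first "/ " it finds in the path, which is not necessarily in the last component.  A minimal sketch that strips all leading spaces from the final component only, assuming a POSIX shell and no filenames containing newlines; the function name strip_leading_spaces and its "apply" argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove leading spaces from file and directory names.
# Dry run by default; pass "apply" as the second argument to rename.
strip_leading_spaces() {
    root=$1
    apply=${2:-}
    # Depth-first, so names inside a directory are fixed before the directory.
    find "$root" -depth -name ' *' | while IFS= read -r path; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        new=$(printf '%s' "$base" | sed 's/^ *//')
        if [ -z "$new" ] || [ -e "$dir/$new" ]; then
            # Name was only spaces, or the cleaned name is already taken.
            printf 'skip: %s\n' "$path" >&2
        elif [ "$apply" = "apply" ]; then
            mv -- "$path" "$dir/$new"
        else
            printf '%s -> %s\n' "$path" "$dir/$new"
        fi
    done
}
```

As with the colon fix, run it once without "apply" to review the list before changing anything.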

For the actual sync I set up Dropbox and OneDrive for Business on the same Windows machine (with long file names enabled).  Then I used robocopy /MIR to copy files into the OneDrive folder.  After my first run of that, OneDrive started syncing as expected.  Then it got upset about some filenames, which is when I realized that there are additional characters not allowed in SharePoint that are OK in Windows.  It offered to fix those, replacing the characters with an underscore.  Then I found out about the spaces issue and fixed that back on the Linux droplet.  After Dropbox synced I ran robocopy /MIR again to bring over those new fixes.

A significant problem with this method is that OneDrive for Business changes the file size of some Microsoft file types, so a later robocopy /MIR will likely treat those files as changed and copy them again.  If you have a large number of those files it might cause difficulties.  After the initial robocopy, the impact can be greatly reduced by using the /MAXAGE parameter to limit copying to files from the last few days.
