Wednesday, February 6, 2013

SharePoint 2013 Duplicate Search Results Missing

This one had me scratching my head for quite a while. Duplicates are not shown by default, and the "View Duplicates" link has to be enabled on the search results web part before users can find them.

SharePoint is clever enough to recognise that files with different file names can still be duplicates, but it is not clever enough to tell when they are not.
I had three documents, all PDF scans, with the same metadata, different content types and completely different binary data. SharePoint deemed them all to be the same and filtered out two of them. Because the indexable content of the three documents was extremely limited and nearly identical, SharePoint judged them to be too similar.
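To make the effect concrete, here is a toy sketch of duplicate trimming that looks only at the indexable text and ignores the file bytes. It is not SharePoint's actual algorithm; the function names, the sample file names and the hashing scheme are all assumptions made up for illustration.

```python
import hashlib


def content_signature(indexable_text: str) -> str:
    """Hash only the normalised extractable text, ignoring the file bytes."""
    normalised = " ".join(indexable_text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def trim_duplicates(docs):
    """Keep the first document per signature; collect the rest as hidden."""
    shown, hidden, seen = [], [], set()
    for doc in docs:
        sig = content_signature(doc["text"])
        if sig in seen:
            hidden.append(doc["name"])
        else:
            seen.add(sig)
            shown.append(doc["name"])
    return shown, hidden


# Three hypothetical scans without OCR: the bytes differ, but the only
# indexable text is the (nearly identical) metadata.
scans = [
    {"name": "scan-a.pdf", "bytes": b"\x01\x02", "text": "Invoice  2013 Finance"},
    {"name": "scan-b.pdf", "bytes": b"\x03\x04", "text": "invoice 2013 finance"},
    {"name": "scan-c.pdf", "bytes": b"\x05\x06", "text": "Invoice 2013 Finance "},
]

shown, hidden = trim_duplicates(scans)
print(shown)   # ['scan-a.pdf']
print(hidden)  # ['scan-b.pdf', 'scan-c.pdf']
```

Only one of the three scans survives, which matches what I saw: three items in the index, one in the result list.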

The frustrating part was that the date range slider indicated that there were three results in the index, but I could not get them to show.

Only after enabling the "View Duplicates Link" check box in the search results web part was I able to click the view duplicates link in the preview dialog and voilà, all results appeared as if by magic.

Lesson of the day? Beware when indexing scanned PDF documents without OCR. SharePoint will throw them all in one bucket if too much of the metadata is similar. Also, the duplicates link needs to be enabled on the web part before you can see them.
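One way to cross-check what is really in the index, independent of the web part, is the SharePoint 2013 Search REST endpoint, which accepts a `trimduplicates` parameter. The sketch below only builds the request URL. The site URL and query text are placeholders, the helper name is made up, and authentication and sending the request are left out.

```python
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str,
                     trim_duplicates: bool = False) -> str:
    """Build a Search REST query URL with duplicate trimming on or off."""
    # Single quotes inside the query text are doubled, as in OData string literals.
    escaped = query_text.replace("'", "''")
    flag = "true" if trim_duplicates else "false"
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext='{quote(escaped)}'&trimduplicates={flag}"
    )


# Hypothetical site URL and query, for illustration only.
url = build_search_url("https://sharepoint.example.com/sites/docs", "invoice 2013")
print(url)
# https://sharepoint.example.com/sites/docs/_api/search/query?querytext='invoice%202013'&trimduplicates=false
```

Requesting this URL as an authenticated user should return all matching items, including the ones the result list hides as duplicates.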


Ravi Khambhati said...

Earlier we had a property on the search results that included duplicate results, so there was no need for a link to view duplicates.

But now with 2013 I am not able to see that property anymore.

Can you please let me know how I can set that property in SharePoint 2013?

Alex Dean said...

Previously you could enable or disable the link. That is no longer possible as such. Either duplicates are included and you have to scour for the link, or they are simply excluded and no link will get them in.
I did try out lots of different options. :-( Sadly, the way SP 2010 handled it was much better.