Concurrency & Performance

Duplicates Detection

Difficulty: Hard

In this exercise, you will build an app that identifies potential duplicate images from a provided zipped folder. Duplicates are determined by comparing the visual similarity of images based on their pixel data. Your solution should process the images efficiently in the background while ensuring a smooth and responsive user experience.

App Structure

Home Screen

  • Display all images from the folder in a grid layout (order does not matter).
  • Add a "Find Duplicates" button at the bottom of the screen.
  • While duplicate detection is running:
    • Disable the button.
    • Update the button title to "Finding duplicates...".
    • Keep the image grid scrollable to confirm the main thread remains responsive.
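
One possible way to model the button behaviour above, independent of any UI toolkit, is a small state holder that starts the work on a background executor and flips back when it finishes. This is a sketch in plain Java; the class name FindDuplicatesButtonState and its methods are hypothetical, and on a real platform the onDone callback would need to be dispatched to the main thread before touching views.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical state holder for the "Find Duplicates" button (not a platform API).
final class FindDuplicatesButtonState {
    private volatile boolean running = false;

    // The button is disabled while detection is running.
    boolean isEnabled() {
        return !running;
    }

    // The title switches to "Finding duplicates..." while detection is running.
    String title() {
        return running ? "Finding duplicates..." : "Find Duplicates";
    }

    // Runs `work` on the `background` executor and passes its result to `onDone`.
    // Returns null without starting anything if a run is already in progress.
    synchronized <T> CompletableFuture<Void> start(
            Supplier<T> work, Executor background, Consumer<T> onDone) {
        if (running) {
            return null;
        }
        running = true;
        return CompletableFuture.supplyAsync(work, background)
                .whenComplete((result, error) -> { running = false; }) // reset on success or failure
                .thenAccept(onDone);
    }
}
```

Because the calling thread only sets a flag and schedules the work, it returns immediately, which is what keeps the grid scrollable in this model.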

Duplicate Detection

When the "Find Duplicates" button is tapped, start processing in the background using parallel execution where appropriate. Recommended detection strategy:

  1. Downsize each image to a small fixed size (such as 16x16 pixels).
  2. Convert each image to grayscale.
  3. Compare pixel brightness values between images to produce a similarity score.
  4. Mark images as potential duplicates if their similarity score exceeds a defined threshold (such as 90%).
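
The steps above can be sketched in plain Java on grayscale arrays (values 0 to 255), without any image library. Everything here is an assumption made for illustration: the class and method names are hypothetical, downsizing uses simple block averaging, and the similarity score is defined as 1 minus the mean absolute brightness difference divided by 255. Other definitions of similarity are equally valid for the exercise.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sketch only: operates on grayscale arrays, gray[y][x] in 0..255.
final class BrightnessFingerprint {
    static final int SIZE = 16;           // fixed fingerprint edge (example value from the text)
    static final double THRESHOLD = 0.90; // example threshold from the text

    // Downsizes a non-empty grayscale image to SIZE x SIZE by averaging the
    // block of source pixels that maps onto each target pixel.
    static int[] downsize(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[] out = new int[SIZE * SIZE];
        for (int ty = 0; ty < SIZE; ty++) {
            int y0 = ty * h / SIZE, y1 = Math.max(y0 + 1, (ty + 1) * h / SIZE);
            for (int tx = 0; tx < SIZE; tx++) {
                int x0 = tx * w / SIZE, x1 = Math.max(x0 + 1, (tx + 1) * w / SIZE);
                long sum = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) sum += gray[y][x];
                }
                out[ty * SIZE + tx] = (int) (sum / ((long) (y1 - y0) * (x1 - x0)));
            }
        }
        return out;
    }

    // Similarity in [0, 1]: 1 minus the mean absolute brightness difference over 255.
    static double similarity(int[] a, int[] b) {
        long diff = 0;
        for (int i = 0; i < a.length; i++) diff += Math.abs(a[i] - b[i]);
        return 1.0 - diff / (255.0 * a.length);
    }

    static boolean isPotentialDuplicate(int[] a, int[] b) {
        return similarity(a, b) > THRESHOLD;
    }

    // Compares every pair (i < j) in parallel and returns the index pairs above THRESHOLD.
    static List<int[]> findPairs(List<int[]> fingerprints) {
        int n = fingerprints.size();
        return IntStream.range(0, n).parallel().boxed()
                .flatMap(i -> IntStream.range(i + 1, n)
                        .filter(j -> isPotentialDuplicate(fingerprints.get(i), fingerprints.get(j)))
                        .mapToObj(j -> new int[] {i, j}))
                .collect(Collectors.toList());
    }
}
```

Comparing all pairs is quadratic in the number of images, but each comparison touches only 256 values, so splitting the outer loop across cores is a reasonable first place to apply parallelism.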

Results Screen

Automatically present the Results Screen when image processing completes. On the Results Screen, display a list of detected duplicate groups. Each group should appear as a horizontal stack of duplicate images, allowing for easy visual comparison.
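
Pairwise matches still have to be turned into the groups shown on this screen. A minimal sketch, assuming the detection step yields index pairs of matching images, is to merge them with a small union-find. The class name DuplicateGroups and the input shape (an image count plus a list of two-element index arrays) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: merges pairwise matches into duplicate groups.
final class DuplicateGroups {
    // imageCount: number of images; pairs: {i, j} index pairs flagged as similar.
    // Returns groups of image indices, omitting images that matched nothing.
    static List<List<Integer>> group(int imageCount, List<int[]> pairs) {
        int[] parent = new int[imageCount];
        for (int i = 0; i < imageCount; i++) parent[i] = i;

        // Union the two sets of every matching pair.
        for (int[] p : pairs) parent[find(parent, p[0])] = find(parent, p[1]);

        // Collect members by their set representative, in index order.
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < imageCount; i++) {
            byRoot.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(i);
        }

        List<List<Integer>> groups = new ArrayList<>();
        for (List<Integer> g : byRoot.values()) {
            if (g.size() > 1) groups.add(g); // a single image is not a duplicate group
        }
        return groups;
    }

    // Finds the set representative, shortening the path as it goes (path halving).
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }
}
```

Note that this grouping is transitive: if A matches B and B matches C, all three land in one group even when A and C fall below the threshold on their own. Whether that is desirable is a design choice left to the solution.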

Enable smooth navigation between the Home and Results screens.


Test Data

A sample images.zip archive is provided for this exercise. It contains a collection of images designed to test your duplicate detection logic:

  • <file> and <file_copy> are identical images. These should be flagged as duplicates.
  • <file> and <file_subtle> are slight variations. These may be flagged depending on your similarity threshold.
  • <file> and <file_vary> are similar in appearance but not duplicates. These should only be flagged if the threshold is set low.

Use this dataset to validate your solution and experiment with different threshold settings.


Hints

Start with a small number of images (around 5) to validate your logic. Include intentionally similar images to test duplicate detection. Gradually increase the dataset size to assess performance.

iOS

Use Core Graphics (CGContext) to resize images and convert them to grayscale, giving you access to pixel brightness values for comparison.

Keywords: CGContext, CGColorSpaceCreateDeviceGray(), UIGraphicsPushContext, draw(in:), pixel buffers.

Android

Use the Bitmap class to resize images and extract pixel data with getPixel(). Convert pixels to grayscale by averaging the color channels, then compare brightness values.

Keywords: Bitmap.createScaledBitmap(), getPixel(), color channels (red, green, blue), grayscale conversion.
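
The channel-averaging step from the Android hint can be sketched in plain Java, assuming each pixel is a packed ARGB integer (0xAARRGGBB), which is the layout getPixel() returns. The class and method names are hypothetical, and the alpha channel is ignored.

```java
// Sketch only: grayscale conversion by averaging the color channels.
final class Grayscale {
    // Brightness (0..255) of one packed ARGB pixel as the plain average of R, G and B.
    static int brightness(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return (r + g + b) / 3;
    }

    // Converts a whole array of packed ARGB pixels to brightness values.
    static int[] toGray(int[] argbPixels) {
        int[] gray = new int[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            gray[i] = brightness(argbPixels[i]);
        }
        return gray;
    }
}
```

A plain average treats the three channels equally; weighted formulas that favor green are also common, but the simple average matches the hint and is enough for comparing brightness.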