This is the 4th and final part of Visual Search using Deep Learning. I highly recommend checking part 0, part 1 and part 2 before reading this post.

At this stage we have successfully trained a model that embeds images so that the distance between the embeddings of visually similar images is small and the distance between the embeddings of visually dissimilar images is large.

Now we will generate an embedding for every image in our catalogue. The code for generating embeddings is quite simple, so I will skip posting it again.
Below are the shapes of the arrays for a subset of my dataset.

  • names1 is an array containing the file paths of the images.
  • encoding1 contains the embeddings of those images.


print(names1.shape)
print(encoding1.shape)
(137317,)
(137280, 4096)
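
The lookups later in this post take a row index from the embedding matrix and use it to pick an entry from the names array, so the two need to line up one-to-one. Here is a minimal sketch of a sanity check for that, assuming both are NumPy arrays. `check_alignment` and the toy arrays are hypothetical and not part of my original code.

```python
import numpy as np

def check_alignment(names, encodings):
    """Return True when there is exactly one embedding row per image path."""
    return (
        names.ndim == 1
        and encodings.ndim == 2
        and names.shape[0] == encodings.shape[0]
    )

# Toy stand-ins for names1 / encoding1 (hypothetical data).
toy_names = np.array(["a.jpg", "b.jpg", "c.jpg"])
toy_encodings = np.zeros((3, 4096), dtype=np.float32)

aligned = check_alignment(toy_names, toy_encodings)
misaligned = check_alignment(toy_names, toy_encodings[:2])
```

Running a check like this before fitting the index makes it easier to catch images whose embedding is missing.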

Computing nearest neighbor

To generate recommendations for a given image, we need to find the other images in our dataset whose embeddings lie at a small distance from its embedding.
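
To make the idea concrete, here is a brute-force sketch with plain NumPy on toy 2-D vectors. It assumes Euclidean distance, which is what the Minkowski metric with p=2 amounts to. `nearest_indices` and the toy data are hypothetical names used only for illustration.

```python
import numpy as np

def nearest_indices(query, embeddings, k):
    """Indices of the k rows of `embeddings` closest to `query` (Euclidean)."""
    distances = np.linalg.norm(embeddings - query, axis=1)
    return np.argsort(distances)[:k]

# Four toy "embeddings"; row 2 is far away from the others.
toy_embeddings = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [5.0, 5.0],
    [0.0, 0.3],
])

# Query with row 0; it is its own closest match at distance zero.
top3 = nearest_indices(toy_embeddings[0], toy_embeddings, k=3)
```

When the query is itself part of the indexed set, it comes back as its own nearest neighbor.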

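from sklearn.neighbors import NearestNeighbors
# n_neighbors is passed by keyword; n_jobs=-1 uses all available cores.
neigh = NearestNeighbors(n_neighbors=5, n_jobs=-1)
neigh.fit(encoding)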
NearestNeighbors(algorithm='auto', leaf_size=30, metric='minkowski',
         metric_params=None, n_jobs=-1, n_neighbors=5, p=2, radius=1.0)


Saving the model.

import joblib

# Save the nearest neighbor model object (uncomment to save).
# joblib.dump(neigh, model_path+'nn_mtshirt.pkl')

# Load the nearest neighbor model object.
neigh = joblib.load(model_path+'nn_mtshirt.pkl')
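
As a quick way to convince yourself that the saved object behaves the same after loading, here is a small round-trip sketch on random toy data. The array, the temporary directory and the file name `nn_toy.pkl` are assumptions made for the example and are not the files used in this post.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
toy_encodings = rng.random((50, 8))  # hypothetical stand-in for real embeddings

index = NearestNeighbors(n_neighbors=5).fit(toy_encodings)

with tempfile.TemporaryDirectory() as tmp_dir:
    index_path = os.path.join(tmp_dir, "nn_toy.pkl")  # hypothetical file name
    joblib.dump(index, index_path)
    restored = joblib.load(index_path)

# Both objects should return identical neighbors for the same queries.
before = index.kneighbors(toy_encodings[:3], return_distance=False)
after = restored.kneighbors(toy_encodings[:3], return_distance=False)
round_trip_ok = np.array_equal(before, after)
```

It is safest to load the file with the same scikit-learn version that saved it.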


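Now we will find the nearest neighbors of every image in the dataset.

from tqdm import tqdm

batch_size = 100
n_images = encoding1.shape[0]

# Query in batches and append one tab-separated line of neighbor indices per image.
with open("results_mtshirt", "a") as file:
    # Step through every row, including the final partial batch.
    for start in tqdm(range(0, n_images, batch_size)):
        c = neigh.kneighbors(encoding1[start:start+batch_size], 10, return_distance=False)
        t = ""
        for a in c:
            for idx in a:
                t = t+str(idx)+"\t"
            t = t+"\n"
        file.write(t)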
 52%|█████▏    | 718/1372 [3:49:56<3:28:26, 19.12s/it]

Sample output

# %timeit c = neigh.kneighbors(encoding1[100:110], 10, return_distance=False)
# print(t)
100	63224	16260	42554	17549	59820	73586	125306	122035	115010	
101	75406	70201	82771	117563	119667	51405	133667	5979	96449	
102	73952	103936	77570	1505	34074	26714	125377	36586	130659	
103	50453	11533	88831	45600	11997	128065	67361	67496	27089	
104	113559	17717	38480	97309	19264	3914	2309	84518	88550	
105	66126	36091	128527	39117	125919	122481	44133	132392	130074	
106	40671	72583	123225	120801	56280	104665	52862	69204	65332	
107	55736	49595	84281	39546	7113	80444	18963	108602	102513	
108	49159	47182	32244	47495	129107	82115	49216	82738	136818	
109	102283	106145	78651	75553	32166	23820	109334	78924	40682	
110	23012	123944	75373	65893	58203	107466	77792	34084	21356	
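
To use the results file later, the lines have to be turned back into lists of indices. Here is a minimal sketch, assuming the tab-separated format shown above, where line k belongs to query k and every line ends with a trailing tab. `parse_results` and `drop_self_matches` are hypothetical helpers and not part of my original code.

```python
def parse_results(text):
    """Parse tab-separated neighbor indices, one query per line."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        # The trailing tab leaves an empty last field, which is skipped here.
        rows.append([int(field) for field in fields if field.strip()])
    return rows

def drop_self_matches(rows, first_query=0):
    """Remove each query's own index from its neighbor list."""
    return [
        [idx for idx in row if idx != first_query + offset]
        for offset, row in enumerate(rows)
    ]

# Two lines copied from the sample output above (queries 100 and 101).
sample = (
    "100\t63224\t16260\t42554\t17549\t59820\t73586\t125306\t122035\t115010\t\n"
    "101\t75406\t70201\t82771\t117563\t119667\t51405\t133667\t5979\t96449\t\n"
)
neighbors = drop_self_matches(parse_results(sample), first_query=100)
```

Dropping the self-match matters because, as the sample output shows, the first index returned for each query is the query image itself.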

My output

Now I will show some examples of my model’s output. I haven’t formatted the output nicely, so all the suggestions appear one after the other.

from PIL import Image
import matplotlib.pyplot as plt

check = list(range(10, 15))
for i in check:
    print("******************************************")
    # Show the query image first.
    b = Image.open(path+"images/men-tshirts/"+names[i])
    plt.imshow(b)
    plt.show()

    # Then show the first five results returned for it.
    c = neigh.kneighbors([encoding[i]], 10, return_distance=False)
    for k in range(5):
        b = Image.open(path+"images/men-tshirts/"+names[c[0][k]])
        plt.imshow(b)
        plt.show()






[Output: five rows of six images, each showing a query image followed by the five results returned for it]




The GOOD

See below some of the particularly good examples.

check = [0,1,15,9,11,18,23,24,29,35,42,47,52]

for i in check:
    print("******************************************")
    c = neigh.kneighbors([encoding[i]], 5, return_distance=False)
    for k in range(5):
        b = Image.open(path+"images/tops-and-tees-menu/"+names[c[0][k]])
        plt.imshow(b)
        plt.show()

******************************************

[Output: thirteen rows of five images, one row per query index in check]




The BAD

See below some of the particularly incorrect recommendations.

#good 1,15,9,11,18,23,24,29,35,42,47,52
#fail 31,41
check = [31,41]

for i in check:
    print("******************************************")
    c = neigh.kneighbors([encoding[i]], 5, return_distance=False)
    for k in range(5):
        b = Image.open(path+"images/tops-and-tees-menu/"+names[c[0][k]])
        plt.imshow(b)
        plt.show()


******************************************

[Output: two rows of five images, one row per query index in check]




Quick drawbacks of current solution

Some readily observed issues in the recommendation model are:

  • The embedding sometimes encodes the model’s face as a feature. Thus, in some cases the suggestions for an item are other, unrelated items worn by the same model.
  • Similarly, the embedding seems to encode the model’s pose and makes suggestions accordingly.
  • The embedding also seems to encode background patterns, which leads to incorrect suggestions.

Many of these issues can be resolved by preprocessing the images, i.e. zooming in on the apparel and removing distracting details such as the background and the model’s face and pose.
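
As an illustration of the zooming-in step, here is a rough sketch that crops an image array to a bounding box around the garment. It assumes the box is already available from some detector or manual annotation, which this series does not cover. `crop_to_box`, the toy image and the box coordinates are all hypothetical.

```python
import numpy as np

def crop_to_box(image, box, margin=0):
    """Crop an H x W x C image array to (top, left, bottom, right) plus a margin."""
    top, left, bottom, right = box
    height, width = image.shape[:2]
    # Clamp the padded box to the image borders.
    top = max(top - margin, 0)
    left = max(left - margin, 0)
    bottom = min(bottom + margin, height)
    right = min(right + margin, width)
    return image[top:bottom, left:right]

toy_image = np.zeros((200, 100, 3), dtype=np.uint8)  # hypothetical 200x100 RGB photo
garment_box = (60, 10, 180, 90)  # hypothetical box that leaves out the face region
cropped = crop_to_box(toy_image, garment_box, margin=5)
```

The cropped images would then be embedded in place of the full photos, so that the face and most of the background never reach the model.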

Conclusion

Over this four-part series on Visual Search using Deep Learning, I have tried to explain my workflow and share my implementation details. I hope you found it helpful and are able to implement this, or a better solution, for your own application. Let me know in the comments about your interesting applications of, or solutions to, visual search. See you in another post. Bye!