ThomasEhling/Caption_Generation


Caption_Generation

Deep Learning Photo Caption Generator

Final project for the Computer Vision class at Illinois Institute of Technology, Fall semester 2018.

Developed with Benjamin Scialom.

The whole project and its code are based on the following tutorial: https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/

We implemented our own feature extraction model, based on the VGG16 architecture, to compare the final scores. The NLP part is exactly the same as in the tutorial.
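As a rough illustration of what a VGG16-based feature extractor looks like, the sketch below traces the output shape of each layer in the standard VGG16 architecture and then drops the final classification layer, so that the last remaining layer yields the image feature vector. It is not the code from this repository: the layer names only mirror a common naming convention, the 224x224 input size is the standard VGG16 default, and taking features from the second fully connected layer is an assumption about how the extractor is used.

```python
# Standard VGG16 convolutional blocks: (number of 3x3 conv layers, output channels).
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
# Fully connected layers; the last one is the 1000-class ImageNet classifier.
FC_UNITS = [4096, 4096, 1000]


def vgg16_shapes(input_size=224):
    """Return (layer_name, output_shape) pairs for a standard VGG16."""
    shapes = []
    size = input_size
    for block_idx, (n_convs, channels) in enumerate(VGG16_BLOCKS, start=1):
        for conv_idx in range(1, n_convs + 1):
            # A 3x3 convolution with 'same' padding keeps the spatial size.
            shapes.append((f"block{block_idx}_conv{conv_idx}", (size, size, channels)))
        # 2x2 max pooling with stride 2 halves the spatial size.
        size //= 2
        shapes.append((f"block{block_idx}_pool", (size, size, channels)))
    shapes.append(("flatten", (size * size * VGG16_BLOCKS[-1][1],)))
    for name, units in zip(("fc1", "fc2", "predictions"), FC_UNITS):
        shapes.append((name, (units,)))
    return shapes


def feature_layer(shapes):
    """Drop the classifier (assumed choice) and return the last remaining layer."""
    return shapes[:-1][-1]


if __name__ == "__main__":
    layers = vgg16_shapes()
    for name, shape in layers:
        print(f"{name:15s} {shape}")
    print("feature layer:", feature_layer(layers))
```

Under these assumptions each image is reduced to a single 4096-element vector, which is what the caption model would consume instead of raw pixels.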

We use Google Colab to accelerate the data preparation and training process.

Our files are organized as follows:

If you wish to run the project, please copy the "/data" folder into your Google Drive and mount it locally.
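A minimal sketch of the mounting step in a Colab notebook is shown below. `google.colab.drive.mount` is the Colab helper for attaching Google Drive; the helper name `mount_drive_data`, the mount point, and the `MyDrive/data` location are assumptions for illustration and should be adjusted to wherever you copied the folder. Outside Colab the function skips mounting and only builds the path.

```python
import posixpath


def mount_drive_data(mount_point="/content/drive", data_subdir="MyDrive/data"):
    """Mount Google Drive when running in Colab and return the data folder path.

    mount_point and data_subdir are assumed locations, not values taken
    from this repository.
    """
    try:
        # This module only exists inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        drive = None

    if drive is not None:
        # Prompts for authorization the first time it runs in a session.
        drive.mount(mount_point)

    return posixpath.join(mount_point, data_subdir)


if __name__ == "__main__":
    data_dir = mount_drive_data()
    print("Data folder:", data_dir)
```

The returned path can then be used as the root for the dataset files and model weights.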

The /data folder also contains all the model weights.

The /doc folder contains both of our reports. For more details, please refer to the original tutorial or to these reports.

The /src folder contains all the source code.

The main model architecture is:

Our final results are:

We used the Flickr8K dataset, so we are required to cite it here: M. Hodosh, P. Young and J. Hockenmaier (2013) "Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics", Journal of Artificial Intelligence Research, Volume 47, pages 853-899 http://www.jair.org/papers/paper3994.html
