Use the power of AI to caption images with a simple right-click.
Let's make the web a more accessible place.
Auto Alt Text is a Chrome extension that generates descriptive captions for pictures.
Currently, users who are visually impaired must rely on metadata and alt-text descriptions written
by website developers to understand what an image actually contains. However, not all
web developers take the time to caption all their images. This is where Auto Alt Text steps in.
Using artificial intelligence, the extension can analyze an image and detect the contents of the scene
depicted in it within 5 seconds!
It's pretty simple to get up and running!
Auto Alt Text is based on the im2txt model, which was created by Vinyals et al. for the 2015 MSCOCO Image Captioning Challenge.
The model itself is an encoder-decoder neural network (basically a deep conv net paired with an LSTM). The deep conv net, Inception v3 (a popular image recognition model), first encodes an image into a vector representation. The LSTM then decodes that vector into a caption, generating it one word at a time.
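To make the encoder-decoder idea a bit more concrete, here is a minimal sketch of the decoding loop. It is not the im2txt code: `greedy_caption`, `toy_step`, and the `<S>` / `</S>` markers are names made up for illustration, the "LSTM" is a hard-coded stand-in, and picking the single best word at each step (greedy decoding) is a simplification that may differ from how the real model chooses words.

```python
from typing import Callable, Dict, List, Sequence, Tuple

START, END = "<S>", "</S>"  # hypothetical start/end-of-caption markers

# A decoder step takes (state, previous word) and returns (new state, word scores).
StepFn = Callable[[Sequence[float], str], Tuple[Sequence[float], Dict[str, float]]]


def greedy_caption(image_vector: Sequence[float], step: StepFn, max_len: int = 20) -> List[str]:
    """Decode a caption one word at a time, always taking the top-scoring word."""
    state = image_vector  # the image encoding seeds the decoder state
    word = START
    caption: List[str] = []
    for _ in range(max_len):
        state, scores = step(state, word)
        word = max(scores, key=scores.get)  # greedy choice
        if word == END:
            break
        caption.append(word)
    return caption


def toy_step(state: Sequence[float], prev_word: str) -> Tuple[Sequence[float], Dict[str, float]]:
    """Stand-in for a trained LSTM (assumption): follows a fixed word chain."""
    chain = {START: "a", "a": "dog", "dog": "running", "running": END}
    next_word = chain.get(prev_word, END)
    return state, {next_word: 1.0, "the": 0.1}


if __name__ == "__main__":
    # The vector would normally come from the conv net encoder.
    print(" ".join(greedy_caption([0.1, 0.2], toy_step)))
```

In the real model the step function would be the trained LSTM and the image vector would come from Inception v3; the loop around them keeps this same basic shape.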
I converted the model into an API and pared it down so that it could fit on a Lambda instance and stay loaded in memory, which gives blazing fast responses in under 5 seconds (compared to the more than 15 seconds the model needs to caption an image out of the box).
If you want to learn more about the model itself, you can read the paper here.
My name is Abhinav Suri and I am a junior at the University of Pennsylvania. I love CS + Biology and am always looking for
ways to benefit the community around me through programming.
One of the causes I am involved in is Hack4Impact. We're a 501(c)(3) student-led organization that
works with nonprofits and other socially responsible organizations to build apps that serve the community. We've worked on apps to
combat wage theft, help foster youth find resources around them, and much more. If you'd like to work with us
(or know any nonprofits that have app ideas), shoot me an email at [email protected] or donate to us.
All donation proceeds go towards Hack4Impact.
Also, if you want to help with the code, I have open-sourced everything related to the API on GitHub. Shoot me a PR and I'll take a look :)