Project Introduction

image segmentation prompt

This project aims to build a bridge (a connection) between a user's text request and object detection inside an image.

- First input: the user's text request (query or prompt) about an object;

- Second input: the image;

- Output: the requested object, filtered and highlighted (segmented).

For example, a user has an image of people playing in a park and wants to pick out the dogs in the picture.

To do so, the user supplies the picture and writes this query: "highlight dogs in the picture".

The output is a processed image in which the dogs are highlighted.
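The input/output contract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the names `extract_target` and `highlight` are hypothetical, the image is a plain nested list of RGB tuples, and the mask is assumed to come from a segmentation model that is not shown here.

```python
import re


# Hypothetical helper: pull the object name out of a query such as
# "highlight dogs in the picture". A real system would rely on a text
# encoder or language model; this regex only covers that one phrasing.
def extract_target(query: str) -> str:
    match = re.search(r"highlight\s+(?:the\s+)?(\w+)", query.lower())
    if match is None:
        raise ValueError(f"could not find a target object in: {query!r}")
    return match.group(1)


def highlight(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend `color` into every pixel where `mask` is True.

    image: list of rows, each a list of (r, g, b) tuples
    mask:  list of rows of booleans, same height and width as image
    """
    out = []
    for img_row, mask_row in zip(image, mask):
        new_row = []
        for pixel, selected in zip(img_row, mask_row):
            if selected:
                pixel = tuple(
                    round((1 - alpha) * p + alpha * c)
                    for p, c in zip(pixel, color)
                )
            new_row.append(pixel)
        out.append(new_row)
    return out


# Toy 2x2 grey image; the mask is assumed to be produced by the model.
image = [[(100, 100, 100), (100, 100, 100)],
         [(100, 100, 100), (100, 100, 100)]]
mask = [[True, False],
        [False, False]]

target = extract_target("highlight dogs in the picture")
result = highlight(image, mask)
```

Here `target` holds the object name taken from the query, and `result` is a copy of the image in which only the masked pixel has been tinted; unmasked pixels are left unchanged.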

How were we able to do that?

By building a model from scratch and training it on a dataset chosen for the field of interest.

What's new about the project?

Preparing an image dataset for training a segmentation model is time- and energy-consuming. The process is done manually: someone has to draw a contour around each object and label it.

The model we are building from scratch (the bridge, or connection) uses FOUNDATION MODELS for training: think of a foundation model as a human sitting at a computer, drawing contours and labeling each object in the image. This saves time and labor, and opens the door to training on large-scale datasets and to applications driven by flexible prompts.
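The idea of letting a foundation model play the role of the human annotator can be sketched as follows. This is a minimal sketch under assumptions: `auto_label`, `Annotation`, and the `annotate` callable are hypothetical names introduced for illustration, and `fake_annotate` is a stand-in that does not inspect the image at all. A real setup would plug an actual foundation model into the `annotate` slot.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# A mask is a grid of booleans; the image type is left opaque on purpose.
Mask = list[list[bool]]


@dataclass
class Annotation:
    image_id: str
    label: str
    mask: Mask


def auto_label(
    images: Iterable[tuple[str, object]],
    labels: list[str],
    annotate: Callable[[object, str], list[Mask]],
) -> list[Annotation]:
    """Build a labeled dataset without manual contour drawing.

    `annotate` is a placeholder for a foundation model: given an image and
    a text label, it returns one mask per matching object it finds.
    """
    dataset = []
    for image_id, image in images:
        for label in labels:
            for mask in annotate(image, label):
                dataset.append(Annotation(image_id, label, mask))
    return dataset


# Stand-in annotator for demonstration only: it "finds" one object when the
# label is "dog" and nothing otherwise.
def fake_annotate(image: object, label: str) -> list[Mask]:
    return [[[True, False], [False, False]]] if label == "dog" else []


dataset = auto_label([("park.jpg", None)], ["dog", "cat"], fake_annotate)
```

Keeping the annotator behind a plain callable means the labeling loop stays the same if one foundation model is swapped for another.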

This project goes well beyond detecting dogs in parks: the same approach is intended to support object detection on images from any field.

Project building strategy:

Modular components

Manual implementation: Each component is implemented manually for pedagogical reasons

Built-to-last strategy: Simple, accessible documentation with practical examples

Accuracy-oriented: Replacing manually implemented components with established frameworks where that improves accuracy

Documentation axes