SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning
In this work, we propose a novel training mechanism termed SegPrompt that uses category information to improve the model's class-agnostic segmentation ability for both known and unknown categories.
Description of SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning
GitHub Link
The GitHub link is https://github.com/aim-uofa/segprompt
The repository "aim-uofa/SegPrompt" contains the official implementation of the ICCV 2023 paper "SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning." The authors propose SegPrompt, a training mechanism that uses category information to improve a model's class-agnostic segmentation ability through category-level prompt learning. They also introduce a new benchmark, LVIS-OW, which reorganizes the COCO and LVIS datasets into Known, Seen, and Unseen categories to better evaluate open-world models. The repository provides dataset preparation instructions, benchmark details, and evaluation scripts. The authors acknowledge related repositories such as Mask2Former and Detectron2, and ask that the paper be cited if the project is used.
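As an illustrative sketch only, the Known-Seen-Unseen reorganization might be represented as a partition over category names. The category names below are made up for illustration; the real LVIS-OW category lists ship with the repository.

```python
# Hypothetical Known / Seen / Unseen split for open-world evaluation.
# The actual LVIS-OW lists are defined in the SegPrompt repository.
known = {"person", "car", "dog"}      # annotated with masks at training time
seen = {"umbrella", "backpack"}       # appear in training images without mask labels
unseen = {"unicycle", "theremin"}     # never appear during training

def split_of(category: str) -> str:
    """Return which evaluation split a category belongs to."""
    if category in known:
        return "known"
    if category in seen:
        return "seen"
    if category in unseen:
        return "unseen"
    return "unlisted"

print(split_of("dog"))       # -> known
print(split_of("theremin"))  # -> unseen
```

Evaluating class-agnostic masks separately on each split makes it possible to tell whether a model segments only what it was trained on or generalizes to novel categories.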
Affiliations: 1. Zhejiang University; 2. The University of Adelaide.
For installation, please follow the instructions in Mask2Former. The repository provides the proposed new benchmark, LVIS-OW. First prepare the COCO and LVIS datasets and place them under $DETECTRON2_DATASETS following the Detectron2 conventions; alternatively, the benchmark annotations can be generated directly from the COCO and LVIS json files with the provided command. The authors thank the related repositories for their great work, and kindly ask that the paper be cited if the project is useful for your research.
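A minimal sketch of the dataset preparation step, assuming the standard Detectron2 layout for COCO and LVIS (LVIS reuses the COCO images); the exact LVIS-OW generation command lives in the repository and is not reproduced here:

```python
import os
from pathlib import Path

# Detectron2 reads the DETECTRON2_DATASETS environment variable,
# defaulting to "datasets" relative to the working directory.
root = Path(os.environ.get("DETECTRON2_DATASETS", "datasets"))

# Expected subdirectories under the dataset root (Detectron2 convention):
for sub in (
    "coco/annotations",  # instances_train2017.json, instances_val2017.json
    "coco/train2017",    # COCO training images
    "coco/val2017",      # COCO validation images
    "lvis",              # lvis_v1_train.json, lvis_v1_val.json
):
    (root / sub).mkdir(parents=True, exist_ok=True)
```

With this skeleton in place, the annotation json files are downloaded into the matching directories before running the repository's benchmark-generation and evaluation scripts.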