X-InstructBLIP Code #599

artemisp · 2023-11-30T19:42:27Z

Adds multimodal capabilities to LAVIS (audio and 3D processors and encoders). Includes ULIP code (https://github.com/salesforce/ULIP) for 3D encoding and BEATs code (https://github.com/microsoft/unilm/tree/master/beats) for audio encoding. Also includes model code for X-InstructBLIP, an extension of InstructBLIP to multiple modalities.

Finally, support is added for the following datasets: [Audio] AudioCaps, Clotho, AudioSet, WavCaps, ESC50; [3D] Objaverse, ModelNet; [Video] MusicAVQA, VALOR, VATEX, WebVideo, VLEP, VIOLIN.

In addition, separate instruction tuning datasets are introduced with their own configs, along with an instruction text processor that randomly samples instruction templates for a variety of modalities and tasks.
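For readers unfamiliar with the idea, random sampling of instruction templates can be sketched roughly as follows. This is a minimal illustration only, not the processor added in this PR: the `InstructionSampler` class, the `TEMPLATES` table, the template strings, and the `(modality, task)` keys are all hypothetical names chosen for the example.

```python
import random

# Hypothetical templates keyed by (modality, task); these are NOT the
# templates shipped in this PR, only placeholders for illustration.
TEMPLATES = {
    ("audio", "caption"): [
        "Describe the audio.",
        "What can you hear in this clip?",
    ],
    ("pc", "caption"): [
        "Describe the 3D model.",
        "What object does this point cloud show?",
    ],
    ("video", "qa"): [
        "Question: {question} Answer:",
        "Based on the video, {question}",
    ],
}


class InstructionSampler:
    """Picks one template at random for a given modality and task."""

    def __init__(self, templates, seed=None):
        self.templates = templates
        # A private RNG keeps sampling reproducible when a seed is given.
        self.rng = random.Random(seed)

    def __call__(self, modality, task, **fields):
        candidates = self.templates[(modality, task)]
        template = self.rng.choice(candidates)
        # Fill placeholders such as {question}; templates without
        # placeholders are returned unchanged.
        return template.format(**fields)


sampler = InstructionSampler(TEMPLATES, seed=0)
caption_prompt = sampler("audio", "caption")
qa_prompt = sampler("video", "qa", question="what instrument is playing?")
```

Sampling a different phrasing per example is meant to keep the model from overfitting to a single prompt wording; the actual processor in this PR may organize templates and sampling differently.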

X-InstructBLIP Code

Init Commit

a3637ae

salesforce-cla bot added the cla:signed label Nov 30, 2023

artemisp added 8 commits December 1, 2023 03:40

Update main LAVIS readme

88f0e9c

Update README.md

522d9e8

Update vicuna7b_v2.yaml

27662ab

Update readme

8e85d9e

Update readme

ec3d452

Add init files for processors,models,datasets were missed before

1f3253b

Add missing files from earlier commit: mostly datasets, and updated t…

71a2da9

…ypos/cleaned up some comments

Updated README

018b106

henryhungle merged commit ac8fc98 into salesforce:main Dec 12, 2023
1 of 2 checks passed

maulikmadhavi pushed a commit to maulikmadhavi/LAVIS that referenced this pull request May 19, 2024

Merge pull request salesforce#599 from artemisp/main

b28106d

X-InstructBLIP Code