This repository will be updated to contain the codebase, models and datasets for the paper:
"On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning" (https://arxiv.org/abs/2406.11823)
We are currently updating this repository. Thank you for your patience, interest, and support; please check back for updates.