sinandrei's comments

sinandrei · 2026-02-11T17:16:29 1770830189

Has anyone experiment with using VLM to detect "marks"? Thinking of pen/pencil based markings like underlines, circles,checkmarks.. Can these models do it?

leetharris · 2026-02-11T17:31:19 1770831079

None of them do it well from our experience. We had to write our own custom pipeline with a mixture of legacy CV approaches to handle this (AI contract analysis). We constantly benchmark every new multimodal and VLM model that comes out and are consistently disappointed.

coder543 · 2026-02-11T17:44:01 1770831841

If someone releases a benchmark/dataset, I'm sure that significantly increases the chances of one of these AI labs training on the task.

sinandrei · 2026-02-11T16:51:05 1770828665

Would this work by just feeding it my obsidian vault?

segmenta · 2026-02-11T17:07:47 1770829667

Yep - if you just want the assistant to access your Obsidian vault, you can point it to the vault during chat. If you’d like it to show up in the UI as well, or don't want to point to it on every chat, you can copy the vault to a folder under knowledge: ~/.rowboat/knowledge/obsidian-vault.

sinandrei · 2026-01-15T09:57:16 1768471036

Anyone use these approaches with academic pdfs?

urschrei · 2026-01-15T10:02:05 1768471325

Another approach is to teach Claude Code how to use your Zotero library's full-text search: https://github.com/urschrei/zotero_search_skill.

alansaber · 2026-01-15T14:57:30 1768489050

I've not seen any impressive products. But products do exist ie https://scibite.com/solutions/semantic-search/

amelius · 2026-01-15T10:31:19 1768473079

Anyone using them for electronics datasheets?

bradfa · 2026-01-15T13:42:21 1768484541

I would like to. I haven't yet found a solution that works well.

The problems with datasheets is tables which span multiple pages, embedded images for diagrams and plots, they're generally PDFs, and only sometimes are they 2-column layout.

Converting from PDF to markdown while retaining tables correctly seems to work well for me with Mistral's latest OCR model, but this isn't an open model. Using docling with different models has produced much worse results.

sosojustdo · 2026-01-15T14:02:51 1768485771

I've been working on a tool specifically to handle these messy PDF-to-Markdown conversions because I ran into the same issues with tables and multi-column layouts.

I’ve optimized https://markdownconverter.pro/pdf-to-markdown to handle complex PDFs, including those tricky tables that span multiple pages and 2-column formats that usually trip up tools like Docling. It also extracts embedded diagrams/images and links them properly in the output.

Full disclosure: I'm the developer behind it. I’d love to see if it handles your specific datasheets better than the models you've tried. Feel free to give it a spin!

bradfa · 2026-01-15T14:34:32 1768487672

Cool! But given that often electronics documentation is covered by NDAs, my preferred solution is local-first if at all possible.