Dylan Freedman @dylan@journa.host

7 Tweets 52 reads Apr 25, 2023

I'm excited to announce Semantra: an open-source multi-tool for semantic search 🎉
- Launch a local search engine over text and PDF files
- Search by concepts/meaning
- Refine results via tagging and adding/subtracting queries
Try it out now 🚀📚🔍

github.com/freedmand/sema…

GitHub - freedmand/semantra: Multi-tool for semantic search

Multi-tool for semantic search. Contribute to freedmand/semantra development by creating an account...
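
To illustrate what "search by meaning" and "adding/subtracting queries" can look like, here is a minimal sketch. It is not Semantra's actual implementation: the three-dimensional vectors, the `DOCS` table, and the `combine` and `search` helpers are all hypothetical, and a real system would get its vectors from an embedding model. The idea shown is only that queries can be combined as weighted vectors and documents ranked by cosine similarity.

```python
import math

# Toy 3-dimensional "embeddings" (hypothetical data, not produced by any model).
DOCS = {
    "speech about the economy": [0.9, 0.1, 0.0],
    "speech about war": [0.1, 0.9, 0.0],
    "speech about economy and war": [0.6, 0.6, 0.1],
}

# Hypothetical query vectors standing in for embedded query text.
ECONOMY = [1.0, 0.0, 0.0]
WAR = [0.0, 1.0, 0.0]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combine(queries):
    """Sum weighted query vectors; a negative weight subtracts a concept."""
    combined = [0.0] * len(queries[0][0])
    for vec, weight in queries:
        for i, x in enumerate(vec):
            combined[i] += weight * x
    return combined


def search(queries, docs=DOCS):
    """Return document names ranked by similarity to the combined query."""
    q = combine(queries)
    return sorted(docs, key=lambda name: cosine(q, docs[name]), reverse=True)


if __name__ == "__main__":
    print(search([(ECONOMY, 1.0)]))               # "economy" only
    print(search([(ECONOMY, 1.0), (WAR, -1.0)]))  # "economy" minus "war"
```

In this toy data, subtracting the war query leaves the economy speech on top while the similarity of the mixed speech drops to zero, which is the kind of refinement the bullet list describes.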

Semantra is built for those seeking needles in haystacks: journalists, researchers, students, and more.
I've found it useful personally across a wide range of content, including books, reports, speeches, and government documents.
Tutorial: github.com

github.com/freedmand/sema…

semantra/tutorial.md at main · freedmand/semantra

Semantra runs locally, keeping your data safe, or it can optionally use OpenAI's paid embedding models to offload computation.
Install with Python/pipx:
```
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```
In a new terminal, run:
```
pipx install semantra
```

To run Semantra over a collection of documents (text or PDF):
```
semantra <filenames>
```
It will download embedding models as needed, analyze the documents in chunks, and launch a local web app for interactive analysis ✨
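
For readers curious what "analyze the documents in chunks" might mean in practice, here is a rough sketch of splitting text into overlapping windows before embedding. This is an assumption-laden illustration, not Semantra's code: `chunk_text`, `window_size`, and `overlap` are hypothetical names, and whitespace-separated words stand in for model tokens.

```python
def chunk_text(text, window_size=128, overlap=16):
    """Split text into overlapping windows of words.

    window_size and overlap are hypothetical parameters for this sketch;
    words are used here as a simple stand-in for tokens.
    """
    if window_size <= 0 or not 0 <= overlap < window_size:
        raise ValueError("need window_size > 0 and 0 <= overlap < window_size")

    words = text.split()
    step = window_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window_size]))
        # Stop once a window has reached the end of the text.
        if start + window_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("a b c d e f g", window_size=4, overlap=2):
        print(chunk)
```

Overlapping windows mean a passage that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.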

Here's an example using Semantra on a collection of US inaugural speeches. You can play with this document collection in the tutorial: github.com
After downloading the documents, analyze them all at once with:
```
semantra us_inaugural_speeches/*.txt
```

Semantra is full of flexible options: you can run any Hugging Face transformers model, change the window sizes for the embeddings, switch up the results algorithm, and more.
Processed documents are cached by content so Semantra only ever does the initial processing work once.

I wrote documentation for Semantra in hopes it will be serviceable. Please let me know if you have any feedback, encounter any issues, or have any suggestions/ideas!
Repo: github.com
Tutorial: github.com
Guides: github.com

github.com/freedmand/sema…

semantra/guides.md at main · freedmand/semantra
