I'm excited to announce Semantra: an open source multi-tool for semantic search github.com
- Launch a local search engine over text and PDF files
- Search by concepts/meaning
- Refine results via tagging and adding/subtracting queries
Try it out now
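To make the adding/subtracting idea concrete, here's a minimal sketch of query arithmetic over embeddings. This is not Semantra's actual code: the passages, the 3-dimensional vectors, and the `combine`/`search` helpers are all made up for illustration, and a real embedding model would produce much larger vectors.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
PASSAGES = {
    "We must defend our nation against its enemies.": [0.9, 0.1, 0.1],
    "Peace among nations is our highest goal.": [0.2, 0.9, 0.1],
    "The economy will grow and create jobs.": [0.1, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combine(weighted_queries):
    """Sum query vectors scaled by their weights; a negative weight subtracts."""
    dims = len(weighted_queries[0][0])
    return [sum(w * v[i] for v, w in weighted_queries) for i in range(dims)]


def search(query_vec):
    """Return passages ordered from most to least similar to the query."""
    return sorted(PASSAGES, key=lambda p: cosine(query_vec, PASSAGES[p]), reverse=True)


# Hypothetical query embeddings: search for "peace" while subtracting "war".
war = [1.0, 0.0, 0.0]
peace = [0.0, 1.0, 0.0]
results = search(combine([(peace, 1.0), (war, -1.0)]))
print(results[0])  # the "Peace among nations" passage ranks first
```

Subtracting a query pushes down passages that point in its direction, which is why the "defend our nation" passage ends up last here.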
Semantra is built for those seeking needles in haystacks: journalists, researchers, students, and more.
I've found it useful personally across a wide range of content, including books, reports, speeches, and government documents.
Tutorial: github.com
Semantra runs locally, keeping your data safe, or it can optionally use OpenAI's paid embedding models to offload computation.
Install with Python/pipx:
```
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```
In a new terminal, run:
```
pipx install semantra
```
To run Semantra over a collection of documents (text or PDF):
```
semantra <filenames>
```
It will download embedding models as needed, analyze the documents in chunks, and launch a local web app for interactive analysis ✨
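As a rough picture of what "analyze the documents in chunks" can look like, here's a minimal sketch of splitting text into overlapping windows before embedding. The `chunk_words` helper, its word-based splitting, and the default `size`/`overlap` values are assumptions for illustration, not Semantra's own chunking logic (which may well work on model tokens instead).

```python
def chunk_words(text, size=128, overlap=16):
    """Split text into windows of `size` words, each sharing `overlap` words
    with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, size=4, overlap=1):
    print(chunk)
```

The overlap keeps a sentence that straddles a window boundary from being cut off in every chunk that contains it.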
Here's an example using Semantra on a collection of US inaugural speeches. You can play with this document collection in the tutorial: github.com
After downloading the documents, analyze them all at once with:
```
semantra us_inaugural_speeches/*.txt
```
Semantra is full of flexible options: you can run any Hugging Face transformers model, change the window sizes for the embeddings, switch up the results algorithm, and more.
Processed documents are cached by content so Semantra only ever does the initial processing work once.
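Caching by content generally means the cache key comes from the file's bytes rather than its name or path. Here's a minimal sketch of that idea; the `ContentCache` class, the SHA-256 key, and the in-memory dict are assumptions for illustration and may differ from how Semantra actually stores its results.

```python
import hashlib


class ContentCache:
    """Run an expensive function at most once per distinct content."""

    def __init__(self, compute):
        self._compute = compute  # e.g. a function that embeds a document
        self._store = {}
        self.misses = 0

    def get(self, content: bytes):
        # The key depends only on the bytes, so renaming or moving a file
        # does not trigger reprocessing, while editing it does.
        key = hashlib.sha256(content).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compute(content)
        return self._store[key]


cache = ContentCache(lambda data: len(data))  # stand-in for real processing
cache.get(b"four score and seven years ago")
cache.get(b"four score and seven years ago")  # served from the cache
print(cache.misses)
```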
I wrote documentation for Semantra in hopes it will be serviceable. Please let me know if you have any feedback, encounter any issues, or have any suggestions/ideas!
Repo: github.com
Tutorial: github.com
Guides: github.com
github.com/freedmand/sema…
semantra/guides.md at main · freedmand/semantra
github.com/freedmand/sema…
semantra/tutorial.md at main · freedmand/semantra
github.com/freedmand/sema…
GitHub - freedmand/semantra: Multi-tool for semantic search