Rattibha

Rohan

11 Tweets 149 reads Oct 20, 2023

Extract tables from documents using @llama_index UnstructuredElementNodeParser, then use RecursiveRetriever to enable hybrid tabular/semantic queries and comparisons over multiple docs.
Let's see how to use this advanced RAG technique 🧵👇

@llama_index First we load the documents.
Then we create the new UnstructuredElementNodeParser from LlamaIndex.

@llama_index This parser:
- extracts tables from the data
- converts each table to a DataFrame
- creates 2 nodes for each table:
  - a Table Node that contains the DataFrame as a string
  - an IndexNode that stores a summary of the table and a reference to that Table Node
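To make the two-nodes-per-table idea concrete, here is a minimal plain-Python sketch. It is not the LlamaIndex implementation: the `TableNode` and `IndexNode` classes and the `nodes_for_table` helper are hypothetical stand-ins, and the table contents are made-up toy values.

```python
from dataclasses import dataclass, field
from typing import Tuple
from uuid import uuid4


@dataclass
class TableNode:
    """Hypothetical stand-in: holds the table rendered as a string."""
    text: str
    node_id: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class IndexNode:
    """Hypothetical stand-in: holds a summary plus a pointer to a table node."""
    text: str      # summary of the table
    index_id: str  # id of the TableNode this summary refers to
    node_id: str = field(default_factory=lambda: str(uuid4()))


def nodes_for_table(table_as_string: str, summary: str) -> Tuple[TableNode, IndexNode]:
    """Create the pair of nodes described above for a single table."""
    table_node = TableNode(text=table_as_string)
    index_node = IndexNode(text=summary, index_id=table_node.node_id)
    return table_node, index_node


# Toy table, already converted to a string (values are invented for illustration).
table_text = "year,revenue\n1,10\n2,20"
table_node, index_node = nodes_for_table(table_text, "Revenue per year")

# The summary node points back at the full table node.
print(index_node.index_id == table_node.node_id)
```

Only the short summary needs to be embedded and searched; the full table is fetched through the reference once the summary matches.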

@llama_index Next we partition the nodes using a built-in function of the UnstructuredElementNodeParser.
Here BaseNodes contains the regular nodes and the IndexNodes (not the Table Nodes).
NodeMapping contains the {id -> Node} mapping for the remaining Table Nodes.
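The partition step can be sketched in plain Python as below. This is an illustration of the idea only, under the assumption that nodes can be told apart by type; the node classes and the `partition` function are hypothetical and are not the parser's built-in function.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class TextNode:
    node_id: str
    text: str


@dataclass
class TableNode:
    node_id: str
    text: str


@dataclass
class IndexNode:
    node_id: str
    text: str
    index_id: str  # id of the TableNode this summary refers to


Node = Union[TextNode, TableNode, IndexNode]


def partition(nodes: List[Node]) -> Tuple[List[Node], Dict[str, TableNode]]:
    """Split nodes into (base nodes, {table id -> table node})."""
    # Base nodes: everything that should be embedded and searched directly.
    base_nodes = [n for n in nodes if not isinstance(n, TableNode)]
    # Mapping: table nodes kept aside, reachable only through an IndexNode.
    node_mapping = {n.node_id: n for n in nodes if isinstance(n, TableNode)}
    return base_nodes, node_mapping


all_nodes: List[Node] = [
    TextNode("text-1", "Some regular paragraph."),
    TableNode("table-1", "year,revenue\n1,10\n2,20"),
    IndexNode("index-1", "Revenue per year", index_id="table-1"),
]
base_nodes, node_mapping = partition(all_nodes)
print([n.node_id for n in base_nodes])  # ['text-1', 'index-1']
print(list(node_mapping))               # ['table-1']
```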

@llama_index Next, we create the vector_index from these BaseNodes (which don't include the Table Nodes), then create a vector_retriever from this index.

@llama_index Then we create the RecursiveRetriever (a detailed guide on this amazing retriever is in the oven, so stay tuned 🔥).
The 1st argument is the id of the recursion root: the retriever from which the RecursiveRetriever starts retrieving.

@llama_index The 2nd argument is a dictionary containing all the retrievers. Here we have only one, the root, which we created from the base nodes earlier.

@llama_index For this use case, we supply only the NodeMapping of the Table Nodes as the node_dict argument to the RecursiveRetriever.
A Table Node is retrieved whenever the root retriever returns an IndexNode that refers to it.
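Here is a rough plain-Python sketch of that recursion, to show how the three arguments fit together. It is a toy, not LlamaIndex's RecursiveRetriever: the `recursive_retrieve` function is hypothetical, and a simple keyword-overlap retriever (`keyword_retriever`, also hypothetical) stands in for the vector retriever.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class TextNode:
    node_id: str
    text: str


@dataclass
class IndexNode:
    node_id: str
    text: str
    index_id: str  # id of a node in node_dict or a retriever in retriever_dict


Node = Union[TextNode, IndexNode]
Retriever = Callable[[str], List[Node]]


def keyword_retriever(nodes: List[Node], top_k: int = 1) -> Retriever:
    """Toy stand-in for a vector retriever: rank nodes by shared words."""
    def retrieve(query: str) -> List[Node]:
        words = set(query.lower().split())
        ranked = sorted(
            nodes,
            key=lambda n: len(words & set(n.text.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]
    return retrieve


def recursive_retrieve(query: str, ref_id: str,
                       retriever_dict: Dict[str, Retriever],
                       node_dict: Dict[str, Node]) -> List[Node]:
    """Resolve ref_id to nodes, following IndexNode references as they appear."""
    if ref_id in node_dict:
        return [node_dict[ref_id]]  # reference points straight at a node
    results: List[Node] = []
    for node in retriever_dict[ref_id](query):
        if isinstance(node, IndexNode):
            # Follow the reference instead of returning the summary itself.
            results.extend(
                recursive_retrieve(query, node.index_id, retriever_dict, node_dict)
            )
        else:
            results.append(node)
    return results


base_nodes: List[Node] = [
    TextNode("text-1", "The company was founded in a garage"),
    IndexNode("index-1", "Summary: table of revenue per year", index_id="table-1"),
]
node_dict: Dict[str, Node] = {"table-1": TextNode("table-1", "year,revenue\n1,10\n2,20")}
retriever_dict = {"vector": keyword_retriever(base_nodes)}

results = recursive_retrieve("what was the revenue per year", "vector",
                             retriever_dict, node_dict)
print([n.node_id for n in results])  # ['table-1']
```

The query matches the table's summary, but what comes back is the full table node, which is the content the LLM needs to answer a tabular question.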

@llama_index Now if we try some queries that reference info from the tables, we get better retrieval than with naive top-k RAG.

@llama_index Details about it on the official documentation:

docs.llamaindex.ai/en/stable/exam…

In this example, we show how to ask questions over 10K with understanding of both the unstructured t...

@llama_index Thanks for reading.
I write about AI, LLMs, RAG etc. and try to make complex topics as easy as possible.
Stay tuned for more! 🔥 #AI #RAG
