Document Store
PageCoordinates
module-attribute
Document
dataclass
Document holds text and metadata of a document.
Examples of documents are PDFs, Word documents, etc. A collection of related text in an NLP application can be thought of as a document as well.
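As a rough illustration of the "text plus metadata" shape described above, here is a minimal stand-in dataclass. The field names `text` and `metadata` are assumptions made for this sketch; the actual fields of `Document` are not listed on this page.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    # Stand-in for illustration only; field names are assumptions,
    # not the library's confirmed attributes.
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# A short piece of related text can be treated as a document too.
doc = Document(text="I am feeling good", metadata={"sentiment": "positive"})
```

Using `default_factory=dict` gives each instance its own metadata dictionary, so documents created without metadata do not share state.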
DocumentStoreBase
Abstract class for a store that holds text and metadata from documents.
The store can be queried by text for similar documents.
add_document
abstractmethod
Adds a document to the store.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
document | Document | Document object to be added | required |
Returns:
Type | Description |
---|---|
None | None if the document was added successfully |
add_text
abstractmethod
Adds a text to the store.
Parameters:
Name | Description |
---|---|
text | Text to add. |
meta | Metadata to associate with the text. |
Returns:
Type | Description |
---|---|
str | The id of the text. |
add_texts
abstractmethod
Adds a list of texts to the store.
Parameters:
Name | Description |
---|---|
texts | List of texts to add, with their associated metadata. Example: [{"I am feeling good", {"sentiment": "positive"}}] |
Returns:
Type | Description |
---|---|
List[str] | List of ids of the texts. |
search
abstractmethod
Searches for pages that contain text similar to the query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | Text to search for. | required |
k | int | Number of similar pages to return. | 4 |
Returns:
Type | Description |
---|---|
List[Page] | List of pages that contain similar text. |
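To make the interface above more concrete, the following is a small self-contained sketch of a store that follows the same method names. Everything here is a stand-in: `StoreSketch`, `InMemoryStore`, and the `Page` fields are hypothetical, the `(text, metadata)` tuple format for `add_texts` is an assumption, and word-overlap (Jaccard) scoring is swapped in for whatever similarity measure a real implementation uses.

```python
import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Page:
    # Stand-in for the library's Page; these fields are assumptions.
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class StoreSketch(ABC):
    """Hypothetical base class mirroring part of the documented interface."""

    @abstractmethod
    def add_text(self, text: str, meta: Optional[Dict[str, Any]] = None) -> str:
        ...

    @abstractmethod
    def search(self, query: str, k: int = 4) -> List[Page]:
        ...


class InMemoryStore(StoreSketch):
    def __init__(self) -> None:
        self._pages: Dict[str, Page] = {}

    def add_text(self, text: str, meta: Optional[Dict[str, Any]] = None) -> str:
        # Generate an id for the text and return it to the caller.
        text_id = str(uuid.uuid4())
        self._pages[text_id] = Page(text, dict(meta or {}))
        return text_id

    def add_texts(self, texts: List[Tuple[str, Dict[str, Any]]]) -> List[str]:
        # Assumed input format: a list of (text, metadata) pairs.
        return [self.add_text(text, meta) for text, meta in texts]

    def search(self, query: str, k: int = 4) -> List[Page]:
        query_words = set(query.lower().split())

        def score(page: Page) -> float:
            # Jaccard similarity between query words and page words.
            words = set(page.text.lower().split())
            union = query_words | words
            return len(query_words & words) / len(union) if union else 0.0

        ranked = sorted(self._pages.values(), key=score, reverse=True)
        return ranked[:k]


store = InMemoryStore()
ids = store.add_texts([
    ("I am feeling good", {"sentiment": "positive"}),
    ("The invoice is overdue", {"sentiment": "negative"}),
])
results = store.search("feeling good today", k=1)
```

With `k` larger than the number of stored pages, this sketch simply returns every page in ranked order.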
Page
dataclass
RealEphemeralDocumentStore
EphemeralDocumentStore is a document store that stores the documents on local disk and uses an ephemeral vector store like Faiss.
Creates a new EphemeralDocumentStore.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
vector_db | VectorDBBase | VectorDBBase instance to use for storing the vectors. | required |
path | Optional[str] | Path to the database file that stores metadata. | None |
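The split between on-disk metadata and an ephemeral vector index can be sketched as follows. This is not the library's implementation: `ToyVectorDB` and `ToyEphemeralStore` are hypothetical stand-ins, SQLite is an assumed choice for the metadata file, and falling back to an in-memory database when `path` is `None` is also an assumption.

```python
import json
import os
import sqlite3
import tempfile
from typing import Any, Dict, List, Optional


class ToyVectorDB:
    """Hypothetical in-memory index standing in for a VectorDBBase implementation."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}

    def add(self, key: str, vector: List[float]) -> None:
        self.vectors[key] = vector


class ToyEphemeralStore:
    def __init__(self, vector_db: ToyVectorDB, path: Optional[str] = None) -> None:
        self.vector_db = vector_db
        # Assumption: with no path, use an in-memory SQLite database.
        self._conn = sqlite3.connect(path or ":memory:")
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS meta (id TEXT PRIMARY KEY, text TEXT, meta TEXT)"
        )

    def add_text(self, text_id: str, text: str, vector: List[float],
                 meta: Optional[Dict[str, Any]] = None) -> None:
        # Vectors live only in memory; text and metadata are written to disk.
        self.vector_db.add(text_id, vector)
        with self._conn:  # commits the insert on success
            self._conn.execute(
                "INSERT OR REPLACE INTO meta VALUES (?, ?, ?)",
                (text_id, text, json.dumps(meta or {})),
            )

    def get_meta(self, text_id: str) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT meta FROM meta WHERE id = ?", (text_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self) -> None:
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "meta.db")
store = ToyEphemeralStore(ToyVectorDB(), path=db_path)
store.add_text("t1", "I am feeling good", [0.1, 0.9], {"sentiment": "positive"})
store.close()

# Reopening with a fresh vector index: metadata survives, vectors do not.
reopened = ToyEphemeralStore(ToyVectorDB(), path=db_path)
persisted = reopened.get_meta("t1")
vectors_after_reopen = reopened.vector_db.vectors
```

In a setup like this, the vector index would need to be rebuilt from the stored text after a restart, which is the usual trade-off of pairing a persistent metadata file with an ephemeral index.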