removed tesseract ref
Sid Mohan authored and committed Sep 4, 2024
1 parent 25e7e92 commit 6f556f8
Showing 3 changed files with 55 additions and 96 deletions.
140 changes: 47 additions & 93 deletions README.md
@@ -48,105 +48,14 @@ datafog-instructor show-fogprint

What is a fogprint? A fogprint is a reusable template that stores the configuration settings (model, filename, model_id, and other details) used to instruct an LLM to detect entities. It is currently saved as `fogprint.json`.
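
As a rough illustration of what such a template might hold, the sketch below parses a fogprint-style JSON document and checks a candidate label against its pattern. The key names are borrowed from the `config.json` added in this commit; that `fogprint.json` shares this schema is an assumption, and `parse_fogprint` and `is_allowed_label` are hypothetical helpers, not part of the datafog-instructor API.

```python
import json
import re

# Hypothetical fogprint contents, mirroring the keys in this commit's config.json.
EXAMPLE_FOGPRINT = {
    "model_id": "gpt2",
    "filename": "gpt2",
    "default_pattern": "(PERSON|COMPANY|LOCATION|ORG)",
    "model_type": "huggingface",
}

# Assumed set of required keys; the real SDK may require more or fewer.
REQUIRED_KEYS = ("model_id", "filename", "default_pattern", "model_type")


def parse_fogprint(text):
    """Parse fogprint JSON text and check that the assumed keys are present."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"fogprint is missing keys: {missing}")
    # Compile the pattern early so an invalid regex fails at load time.
    re.compile(config["default_pattern"])
    return config


def is_allowed_label(config, label):
    """Return True if the label matches the configured pattern in full."""
    return re.fullmatch(config["default_pattern"], label) is not None


config = parse_fogprint(json.dumps(EXAMPLE_FOGPRINT))
print(is_allowed_label(config, "PERSON"))  # True
print(is_allowed_label(config, "DATE"))    # False
```

Using `re.fullmatch` rather than `re.search` means a label such as `PERSONS` is rejected instead of matching on its `PERSON` prefix.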

## Features
### Verify the installation:

### Detect Entities

### Manage Entity Types

## Default Entity Types

## Error Handling

## Development and Testing

For development purposes, you can install additional dependencies:

```
pip install -r requirements-dev.txt
## Documentation
To build the documentation locally:
```

pip install datafog-instructor[docs]
cd docs
sphinx

```
The documentation will be available in the `docs/_build/html` directory.
## Contributing
Contributions to the DataFog Instructor SDK are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License.
## Support
If you encounter any problems or have any questions, please open an issue on the GitHub repository or join our Discord community at https://discord.gg/bzDth394R4.
## Links
- Homepage: https://datafog.ai
- Documentation: https://docs.datafog.ai
- Twitter: https://twitter.com/datafoginc
- GitHub: https://github.com/datafog/datafog-instructor
# Entity Detection SDK: Installation and Getting Started Guide
Welcome to the Entity Detection SDK! This powerful tool uses transformers and regex-constrained outputs to accurately identify entities in text. Follow this guide to get up and running quickly.
## Installation
1. Clone the repository:
```

git clone https://github.com/your-username/entity-detection-sdk.git
cd entity-detection-sdk

```
2. Create a virtual environment (recommended):
```

python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`

```
3. Install the required dependencies:
```

pip install -r requirements.txt

```
## Getting Started
1. Initialize the SDK:
```

python -m entity_detection init

```
This will create a `fogprint.json` file with default settings.
2. Verify the installation:
```
python -m entity_detection list-entities
```

You should see a list of default entity types: PERSON, COMPANY, LOCATION, and ORG.

## Sample Operations
@@ -177,11 +86,13 @@ To change the default model or pattern:

1. Edit the `fogprint.json` file directly, or
2. Use the `init` command with the `--force` flag:

```
python -m entity_detection init --force
```

Follow the prompts to update your configuration.

## Advanced Usage
@@ -204,4 +115,47 @@ Exciting features are coming soon to enhance the SDK's capabilities:
2. **Embeddings Layer**: Future versions will incorporate an embeddings layer to improve entity recognition accuracy.

Stay tuned for updates!

```
## Development and Testing
For development purposes, you can install additional dependencies:
```

pip install -r requirements-dev.txt

## Documentation

To build the documentation locally:

```
pip install datafog-instructor[docs]
cd docs
sphinx
```

The documentation will be available in the `docs/_build/html` directory.

## Contributing

Contributions to the DataFog Instructor SDK are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License.

## Support

If you encounter any problems or have any questions, please open an issue on the GitHub repository or join our Discord community at https://discord.gg/bzDth394R4.

## Links

- Homepage: https://datafog.ai
- Documentation: https://docs.datafog.ai
- Twitter: https://twitter.com/datafoginc
- GitHub: https://github.com/datafog/datafog-instructor
7 changes: 7 additions & 0 deletions config.json
@@ -0,0 +1,7 @@
{
"model_id": "gpt2",
"filename": "gpt2",
"default_pattern": "(PERSON|COMPANY|LOCATION|ORG)",
"model_type": "huggingface",
"test_key": "test_value"
}
4 changes: 1 addition & 3 deletions tox.ini
@@ -1,5 +1,5 @@
[tox]
envlist = py310,py311,py312
envlist = py310
isolated_build = True

[testenv]
@@ -10,11 +10,9 @@ deps =
-r requirements-dev.txt
extras = all
allowlist_externals =
tesseract
pip
commands =
pip install --no-cache-dir -r requirements-dev.txt
tesseract --version
pytest {posargs} -v -s --cov=datafog --cov-report=term-missing

[testenv:lint]