started adding the files for the file_system provider
DinisCruz committed Jan 11, 2025
1 parent 779aab1 commit 158dc48
Showing 9 changed files with 336 additions and 0 deletions.
129 changes: 129 additions & 0 deletions docs/providers/filesystem-summary.md
@@ -0,0 +1,129 @@
# MGraph FileSystem Implementation

## Overview
We've designed and started implementing a filesystem representation using the MGraph architecture. The implementation focuses on simplicity, leveraging existing MGraph functionality while adding only necessary extensions.

## Key Design Decisions

### 1. Minimal Extension Approach
- Extended base MGraph classes only where necessary
- Leveraged existing edge functionality instead of creating custom types
- Kept data structures flat and simple
- Used attributes for flexibility

### 2. Schema Structure
```mermaid
classDiagram
class SchemaFileSystemGraph {
+Random_Guid root_id
}
class SchemaFileSystemGraphConfig {
+bool allow_circular_refs
}
class SchemaFileSystemItem {
+str folder_name
+datetime created_at
+datetime modified_at
+bool is_root
}
class SchemaFolderNode {
}
class SchemaMGraphGraph {
}
class SchemaMGraphNode {
}
SchemaFileSystemGraph --|> SchemaMGraphGraph
SchemaFileSystemGraph --> SchemaFileSystemGraphConfig
SchemaFileSystemItem --|> SchemaMGraphNode
SchemaFolderNode --|> SchemaFileSystemItem
SchemaFileSystemGraph : << Extends base graph with root tracking >>
SchemaFileSystemGraphConfig : << Controls folder relationship constraints >>
SchemaFileSystemItem : << Common attributes for filesystem items >>
SchemaFolderNode : << Folder-specific behavior >>
```

### 3. Important Architecture Decisions
- Paths are calculated from structure rather than stored
- Root tracking at graph level
- Circular reference control via configuration
- Timestamps in UTC
- Base attributes in Schema__File_System__Item
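
As an illustration of "paths are calculated from structure rather than stored", here is a minimal sketch that derives a path by walking parent links up to the root. It uses plain dictionaries rather than the MGraph classes; `build_path`, `names`, and `parent_of` are hypothetical names chosen for this example and are not part of the codebase.

```python
from typing import Dict, List, Optional


def build_path(node_id: str, names: Dict[str, str], parent_of: Dict[str, str]) -> str:
    """Derive a path by following parent links to the root (hypothetical helper)."""
    parts: List[str] = []
    seen = set()
    current: Optional[str] = node_id
    while current is not None:
        if current in seen:                      # guard against circular references
            raise ValueError(f"cycle detected at node {current!r}")
        seen.add(current)
        parts.append(names[current])
        current = parent_of.get(current)         # None once the root is reached
    parts.reverse()                              # root first, target node last
    if parts[0] == "/":                          # assumes the root folder is named "/"
        return "/" + "/".join(parts[1:])
    return "/".join(parts)


# Example structure: / -> docs -> api
names = {"n0": "/", "n1": "docs", "n2": "api"}
parent_of = {"n1": "n0", "n2": "n1"}

docs_api_path = build_path("n2", names, parent_of)   # "/docs/api"
root_path = build_path("n0", names, parent_of)       # "/"
```

Because nothing is stored, renaming or moving a folder only touches one node or one edge; the cost is a walk whose length equals the folder's depth.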

## Implementation Details

### Core Classes
1. `Schema__File_System__Graph__Config`
   - Controls filesystem behavior
   - Manages circular reference allowance

2. `Schema__File_System__Graph`
   - Tracks root folder
   - Manages overall structure

3. `Schema__File_System__Item`
   - Common filesystem item properties
   - Timestamp management
   - Root status tracking

4. `Schema__Folder__Node`
   - Folder-specific functionality
   - Inherits common filesystem properties

### Key Features
- Automatic timestamp management
- Type-safe implementation
- Flexible attribute system
- Path calculation from structure
- Circular reference prevention
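
One way circular reference prevention could work, sketched under assumptions: before adding a parent-to-child edge, check whether the parent is already reachable from the child. The functions `would_create_cycle` and `add_folder_edge` and the `children` adjacency map are hypothetical; only the `allow_circular_refs` flag comes from the schema in this commit.

```python
from typing import Dict, Set


def would_create_cycle(children: Dict[str, Set[str]], parent: str, child: str) -> bool:
    """Return True if adding parent -> child would close a loop (hypothetical helper)."""
    if parent == child:
        return True
    stack = [child]
    visited = set()
    while stack:                                  # depth-first search from the child
        node = stack.pop()
        if node == parent:                        # parent is a descendant of child
            return True
        if node in visited:
            continue
        visited.add(node)
        stack.extend(children.get(node, ()))
    return False


def add_folder_edge(children: Dict[str, Set[str]], parent: str, child: str,
                    allow_circular_refs: bool = False) -> None:
    """Add an edge, rejecting cycles unless the config flag allows them."""
    if not allow_circular_refs and would_create_cycle(children, parent, child):
        raise ValueError(f"edge {parent!r} -> {child!r} would create a circular reference")
    children.setdefault(parent, set()).add(child)


children: Dict[str, Set[str]] = {}
add_folder_edge(children, "root", "docs")
add_folder_edge(children, "docs", "api")
# add_folder_edge(children, "api", "root") would now raise ValueError
```

The check costs one traversal per insertion, which is usually acceptable for folder trees; a real implementation would operate on the graph's edges rather than a separate adjacency map.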

## Next Steps

### Immediate Tasks
1. Implement model layer classes
2. Add validation methods
3. Create helper functions for common operations
4. Develop test cases

### Future Enhancements
1. Add file support
2. Implement search capabilities
3. Add persistence layer
4. Create visualization tools

## Technical Debt/Considerations
- Path calculation performance for deep structures
- Timestamp synchronization in distributed systems
- Edge case handling for circular references
- Performance optimization for large directory structures
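
If path calculation does become a bottleneck for deep structures, one possible mitigation is to cache derived paths and discard the cache whenever the structure changes. The sketch below is only an illustration: `PathCache` and its methods are hypothetical, it uses plain dictionaries instead of the MGraph classes, and it assumes the structure is acyclic with a root named "/".

```python
from typing import Dict, List, Optional


class PathCache:
    """Hypothetical cache of derived paths, cleared on every structural change."""

    def __init__(self, names: Dict[str, str], parent_of: Dict[str, str]) -> None:
        self.names = names
        self.parent_of = parent_of
        self._paths: Dict[str, str] = {}

    def path(self, node_id: str) -> str:
        # Walk up until the root or an ancestor whose path is already cached.
        chain: List[str] = []
        current: Optional[str] = node_id
        while current is not None and current not in self._paths:
            chain.append(current)
            current = self.parent_of.get(current)
        prefix = self._paths[current] if current is not None else None
        # Fill the cache on the way back down, so siblings can reuse the prefix.
        for nid in reversed(chain):
            name = self.names[nid]
            prefix = name if prefix is None else prefix.rstrip("/") + "/" + name
            self._paths[nid] = prefix
        return self._paths[node_id]

    def move(self, node_id: str, new_parent: str) -> None:
        self.parent_of[node_id] = new_parent
        self._paths.clear()                       # simplest correct invalidation


cache = PathCache(
    names={"n0": "/", "n1": "docs", "n2": "api", "n3": "src"},
    parent_of={"n1": "n0", "n2": "n1", "n3": "n0"},
)
before_move = cache.path("n2")                    # "/docs/api"
cache.move("n2", "n3")
after_move = cache.path("n2")                     # "/src/api"
```

Clearing the whole cache on every move is deliberately crude; invalidating only the moved subtree would retain more entries at the cost of extra bookkeeping.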

## Links to Key Files
- Schema implementations
- Configuration classes
- Core MGraph extensions
- Test suites (to be implemented)

## Usage Example (Planned)
```python
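# Create filesystem graph
fs_graph = Schema__File_System__Graph()

# Create root folder
# (is_root is a planned field; Schema__File_System__Item in this commit
#  adds only folder_name, created_at and modified_at)
root = Schema__Folder__Node(
    folder_name="/",
    is_root=True
)

# Add folders
docs = Schema__Folder__Node(folder_name="docs")
src = Schema__Folder__Node(folder_name="src")

# Create relationships
# (add_edge is the planned API; the tests in this commit write directly
#  to fs_graph.nodes and fs_graph.edges instead)
fs_graph.add_edge(root.node_id, docs.node_id)
fs_graph.add_edge(root.node_id, src.node_id)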
```

## Notes for Next Phase
- Consider implementing file operations
- Plan for permission system
- Think about versioning
- Consider metadata extensions
@@ -0,0 +1,6 @@
from mgraph_ai.mgraph.schemas.Schema__MGraph__Graph import Schema__MGraph__Graph
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Graph__Config import Schema__File_System__Graph__Config


class Schema__File_System__Graph(Schema__MGraph__Graph):
    graph_config: Schema__File_System__Graph__Config
@@ -0,0 +1,5 @@
from mgraph_ai.mgraph.schemas.Schema__MGraph__Graph__Config import Schema__MGraph__Graph__Config


class Schema__File_System__Graph__Config(Schema__MGraph__Graph__Config):
    allow_circular_refs: bool = False
@@ -0,0 +1,8 @@
from mgraph_ai.mgraph.schemas.Schema__MGraph__Node import Schema__MGraph__Node
from osbot_utils.helpers.Timestamp_Now import Timestamp_Now


class Schema__File_System__Item(Schema__MGraph__Node):
    folder_name : str
    created_at  : Timestamp_Now
    modified_at : Timestamp_Now
@@ -0,0 +1,4 @@
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Item import Schema__File_System__Item

class Schema__Folder__Node(Schema__File_System__Item):
    pass
@@ -0,0 +1,78 @@
from unittest import TestCase
from mgraph_ai.mgraph.schemas.Schema__MGraph__Edge import Schema__MGraph__Edge
from osbot_utils.helpers.Random_Guid import Random_Guid
from osbot_utils.helpers.Timestamp_Now import Timestamp_Now
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Graph import Schema__File_System__Graph
from mgraph_ai.providers.file_system.schemas.Schema__Folder__Node import Schema__Folder__Node
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Graph__Config import Schema__File_System__Graph__Config


class test_Schema__File_System__Graph(TestCase):

    def setUp(self):  # Initialize test data
        self.graph_config = Schema__File_System__Graph__Config(graph_id            = Random_Guid(),
                                                               allow_circular_refs = False        )

        self.fs_graph = Schema__File_System__Graph(nodes        = {}                        ,
                                                   edges        = {}                        ,
                                                   graph_config = self.graph_config         ,
                                                   graph_type   = Schema__File_System__Graph)

    def test_init(self):  # Tests basic initialization and type checking
        assert type(self.fs_graph) is Schema__File_System__Graph
        assert type(self.fs_graph.graph_config) is Schema__File_System__Graph__Config
        assert self.fs_graph.graph_config == self.graph_config
        assert len(self.fs_graph.nodes) == 0
        assert len(self.fs_graph.edges) == 0

    def test_type_safety_validation(self):  # Tests type safety validations
        with self.assertRaises(ValueError) as context:
            Schema__File_System__Graph(nodes        = "not-a-dict",  # Should be Dict
                                       edges        = {},
                                       graph_config = self.graph_config,
                                       graph_type   = Schema__File_System__Graph)
        assert 'Invalid type for attribute' in str(context.exception)

    def test_add_folder(self):  # Tests adding a folder node
        folder_node = Schema__Folder__Node(folder_name = "test_folder"        ,
                                           created_at  = Timestamp_Now()      ,
                                           modified_at = Timestamp_Now()      ,
                                           attributes  = {}                   ,
                                           node_config = None                 ,
                                           node_type   = Schema__Folder__Node ,
                                           value       = None                 )

        # Add folder to graph
        self.fs_graph.nodes[Random_Guid()] = folder_node
        assert len(self.fs_graph.nodes) == 1
        assert isinstance(list(self.fs_graph.nodes.values())[0], Schema__Folder__Node)

    def test_folder_structure(self):  # Tests creating a folder structure
        # Create root folder
        root_folder = Schema__Folder__Node(folder_name = "/"                  ,
                                           created_at  = Timestamp_Now()      ,
                                           modified_at = Timestamp_Now()      ,
                                           attributes  = {}                   ,
                                           node_config = None                 ,
                                           node_type   = Schema__Folder__Node ,
                                           value       = None                 )
        root_id = Random_Guid()
        self.fs_graph.nodes[root_id] = root_folder

        # Create child folder
        child_folder = Schema__Folder__Node(folder_name = "docs"               ,
                                            created_at  = Timestamp_Now()      ,
                                            modified_at = Timestamp_Now()      ,
                                            attributes  = {}                   ,
                                            node_config = None                 ,
                                            node_type   = Schema__Folder__Node ,
                                            value       = None                 )
        child_id = Random_Guid()
        self.fs_graph.nodes[child_id] = child_folder
        edge_id = Random_Guid()  # Add edge between folders

        self.fs_graph.edges[edge_id] = Schema__MGraph__Edge.from_json({"from_node_id": root_id ,
                                                                       "to_node_id"  : child_id})

        assert len(self.fs_graph.nodes) == 2
        assert len(self.fs_graph.edges) == 1
@@ -0,0 +1,30 @@
import pytest
from unittest import TestCase
from osbot_utils.helpers.Random_Guid import Random_Guid
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Graph__Config import Schema__File_System__Graph__Config


class test_Schema__File_System__Graph__Config(TestCase):

    def setUp(self):  # Initialize test data
        self.graph_id            = Random_Guid()
        self.allow_circular_refs = False
        self.graph_config        = Schema__File_System__Graph__Config(
            graph_id            = self.graph_id,
            allow_circular_refs = self.allow_circular_refs)

    def test_init(self):  # Tests basic initialization and type checking
        assert type(self.graph_config) is Schema__File_System__Graph__Config
        assert self.graph_config.graph_id == self.graph_id
        assert self.graph_config.allow_circular_refs == self.allow_circular_refs

    def test_type_safety_validation(self):  # Tests type safety validations
        with pytest.raises(ValueError, match="Invalid type for attribute 'graph_id'. Expected '<class 'osbot_utils.helpers.Random_Guid.Random_Guid'>' but got '<class 'str'>'"):
            Schema__File_System__Graph__Config(graph_id            = "not-a-guid",  # Should be Random_Guid
                                               allow_circular_refs = self.allow_circular_refs)

        with pytest.raises(ValueError, match="Invalid type for attribute 'allow_circular_refs'. Expected '<class 'bool'>' but got '<class 'str'>'"):
            Schema__File_System__Graph__Config(graph_id            = self.graph_id,
                                               allow_circular_refs = "not-a-bool")  # Should be bool


@@ -0,0 +1,48 @@
import pytest
import re
from unittest import TestCase
from osbot_utils.helpers.Random_Guid import Random_Guid
from osbot_utils.helpers.Timestamp_Now import Timestamp_Now
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Item import Schema__File_System__Item

class test_Schema__File_System__Item(TestCase):

    def setUp(self):  # Initialize test data
        self.folder_name = "test_folder"
        self.created_at  = Timestamp_Now()
        self.modified_at = Timestamp_Now()
        self.node_id     = Random_Guid()
        self.fs_item     = Schema__File_System__Item(folder_name = self.folder_name          ,
                                                     created_at  = self.created_at           ,
                                                     modified_at = self.modified_at          ,
                                                     attributes  = {}                        ,
                                                     node_config = None                      ,
                                                     node_type   = Schema__File_System__Item ,
                                                     value       = None                      )

    def test_init(self):  # Tests basic initialization and type checking
        assert type(self.fs_item) is Schema__File_System__Item
        assert self.fs_item.folder_name == self.folder_name
        assert self.fs_item.created_at == self.created_at
        assert self.fs_item.modified_at == self.modified_at

    def test_type_safety_validation(self):  # Tests type safety validations
        with self.assertRaises(ValueError) as context:
            Schema__File_System__Item(folder_name = 123                       ,  # Should be str
                                      created_at  = self.created_at           ,
                                      modified_at = self.modified_at          ,
                                      attributes  = {}                        ,
                                      node_config = None                      ,
                                      node_type   = Schema__File_System__Item ,
                                      value       = None                      )
        assert 'Invalid type for attribute' in str(context.exception)

        with pytest.raises(ValueError, match=re.escape("invalid literal for int() with base 10: 'not-a-timestamp'")):
            Schema__File_System__Item(folder_name = self.folder_name          ,
                                      created_at  = "not-a-timestamp"         ,  # Should be Timestamp_Now
                                      modified_at = self.modified_at          ,
                                      attributes  = {}                        ,
                                      node_config = None                      ,
                                      node_type   = Schema__File_System__Item ,
                                      value       = None                      )

@@ -0,0 +1,28 @@
from unittest import TestCase
from osbot_utils.helpers.Timestamp_Now import Timestamp_Now
from mgraph_ai.providers.file_system.schemas.Schema__File_System__Item import Schema__File_System__Item
from mgraph_ai.providers.file_system.schemas.Schema__Folder__Node import Schema__Folder__Node

class test_Schema__Folder__Node(TestCase):

    def setUp(self):  # Initialize test data
        self.folder_name = "test_folder"
        self.created_at  = Timestamp_Now()
        self.modified_at = Timestamp_Now()
        self.folder_node = Schema__Folder__Node(folder_name = self.folder_name     ,
                                                created_at  = self.created_at      ,
                                                modified_at = self.modified_at     ,
                                                attributes  = {}                   ,
                                                node_config = None                 ,
                                                node_type   = Schema__Folder__Node ,
                                                value       = None                 )

    def test_init(self):  # Tests basic initialization and type checking
        assert type(self.folder_node) is Schema__Folder__Node
        assert self.folder_node.folder_name == self.folder_name
        assert self.folder_node.created_at == self.created_at
        assert self.folder_node.modified_at == self.modified_at

    def test_inheritance(self):  # Tests inheritance from File_System_Item
        assert isinstance(self.folder_node, Schema__File_System__Item)
