Introduction
2daydir is a lightweight, cross‑platform directory management system designed to streamline file organization for individuals and organizations that rely on time‑sensitive data handling. At its core, the tool provides a deterministic naming convention and automated archival workflow that reduces manual intervention in daily file processing. By combining a simple command‑line interface with an extensible plugin architecture, 2daydir enables users to enforce consistency across disparate file sources and to maintain a clear audit trail of modifications.
Unlike generic file‑management utilities, 2daydir emphasizes temporal semantics. Files are tagged with a two‑day window that aligns with typical business cycles, such as financial reporting or laboratory sample processing. The system automatically segregates data into “current” and “archived” directories based on the age of each file, ensuring that the working set remains small and that long‑term storage complies with regulatory retention policies.
The design philosophy of 2daydir is influenced by the need for reproducibility in scientific research and the requirement for operational transparency in enterprise environments. By making the directory structure both human‑readable and machine‑parsable, the tool supports integration with version control systems, continuous‑integration pipelines, and data‑analysis workflows.
History and Development
The initial concept for 2daydir emerged in 2018 during a research project that required frequent ingestion of time‑stamped laboratory logs. The prototype was written in Python and demonstrated the viability of a two‑day retention policy. Following a pilot deployment in a small academic laboratory, the project evolved into a community‑driven open‑source initiative. The first public release, version 1.0, was published in early 2019 under the Apache License 2.0.
Subsequent releases focused on expanding platform compatibility, adding support for Windows NTFS journaling and macOS HFS+ metadata. A major milestone was the introduction of a declarative configuration language in version 2.0, which allowed administrators to specify complex directory hierarchies without editing code. The latest stable version, 3.1, released in mid‑2025, incorporates native integration with popular cloud storage services such as Amazon S3 and Google Cloud Storage.
Throughout its evolution, 2daydir has maintained backward compatibility. The developers introduced a migration tool that reads legacy configuration files and generates equivalent scripts for newer releases, ensuring that existing deployments can upgrade without data loss or operational downtime.
Architecture and Design
Core Components
2daydir is structured around four primary components: the command‑line interface (CLI), the configuration engine, the file‑monitoring daemon, and the archival subsystem. The CLI is the user‑facing entry point, translating user commands into operations on the file system. The configuration engine parses the declarative syntax and builds an in‑memory representation of the desired directory topology.
The monitoring daemon runs as a background service and watches specified source directories for file creation, modification, and deletion events. Using platform‑specific file‑system watchers, such as inotify on Linux or FSEvents on macOS, it ensures that file movements and renames are captured in real time. When a file event occurs, the daemon consults the configuration engine to determine the appropriate target directory based on the file’s timestamp and the two‑day rule.
The archival subsystem implements the retention policy. It periodically scans the archive directories, removing files that exceed the configured retention horizon. The subsystem also logs archival actions to a central audit file, facilitating compliance audits and forensic investigations.
Plugin Architecture
2daydir’s plugin system allows developers to extend functionality without modifying the core codebase. Plugins can hook into various lifecycle events, such as before a file is moved, after a directory is created, or when an error occurs. The plugin API is documented as a set of Python classes with well‑defined interfaces, enabling third‑party contributors to implement features such as encryption, custom naming schemes, or integration with external notification services.
The plugin manager loads all plugins from a designated directory during startup. It performs a sanity check on each plugin’s metadata, ensuring that required entry points are present and that the plugin’s dependencies are satisfied. If a plugin fails to load, the system logs the error but continues operation, providing resilience against faulty extensions.
Cross‑Platform Compatibility
The developers of 2daydir intentionally avoided platform‑specific libraries whenever possible. The core library relies on the standard Python 3.8+ library and a small subset of third‑party packages that are available across major operating systems. For file‑system monitoring, the tool uses conditional imports to select the appropriate watcher based on the host OS. This design choice minimizes the installation footprint and reduces the complexity of deployment scripts.
Core Features
Deterministic Two‑Day Naming
When a file is first introduced into the system, 2daydir generates a destination path that encodes the file’s creation date and the current two‑day window. For example, a file created on 2025‑02‑10 would be moved to a path resembling /data/2025/02/10_02. The suffix identifies the file’s two‑day period (here, days 10‑11; the next window covers days 12‑13) and allows for straightforward retrieval of files within a specific time slice.
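The date‑to‑path mapping can be sketched as follows. This is one plausible reading of the example above: windows are assumed to be anchored on even‑numbered days, and the `_02` suffix is assumed to record the window length in days. Neither detail is confirmed by an official specification.

```python
from datetime import date

WINDOW_DAYS = 2  # length of each window, encoded in the path suffix

def two_day_path(created: date, root: str = "/data") -> str:
    """Map a file's creation date to its two-day window directory.

    Assumption: windows are anchored on even-numbered days, so a file
    created on the 10th or 11th lands in the 10_02 directory, the 12th
    or 13th in 12_02, and so on.
    """
    # Snap odd days back to the preceding even day to find the window start.
    start = created.day - (created.day % WINDOW_DAYS)
    if start == 0:
        # Day 1 has no preceding even day; treat it as a one-day window.
        start = 1
    return f"{root}/{created.year:04d}/{created.month:02d}/{start:02d}_{WINDOW_DAYS:02d}"
```

With this encoding, files created on consecutive days inside one window resolve to the same directory, which is what makes batch retrieval of a time slice a single directory listing.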
Automated Archival
Files are automatically moved from the active working directory to the archive directory once they cross the two‑day threshold. The archival subsystem can be configured to compress files on the fly, reducing storage consumption. Additionally, the tool supports hierarchical archival, where older files are promoted to secondary storage tiers, such as tape or cold cloud buckets.
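A single archival pass can be sketched as below. The function name and signature are illustrative, not the tool's actual API; the real subsystem additionally handles tiering and audit logging.

```python
import gzip
import shutil
import time
from pathlib import Path

TWO_DAYS = 2 * 24 * 60 * 60  # archival threshold in seconds

def archive_old_files(active_dir: str, archive_dir: str, compress: bool = True) -> list:
    """Move files older than two days from the active directory to the
    archive, optionally gzip-compressing them on the fly."""
    archived = []
    Path(archive_dir).mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - TWO_DAYS
    for entry in Path(active_dir).iterdir():
        if not entry.is_file() or entry.stat().st_mtime > cutoff:
            continue  # directories and files still inside the window are skipped
        if compress:
            target = Path(archive_dir) / (entry.name + ".gz")
            with entry.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            entry.unlink()  # remove the original only after the compressed copy exists
        else:
            target = Path(archive_dir) / entry.name
            shutil.move(str(entry), target)
        archived.append(str(target))
    return archived
```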
Declarative Configuration
Users describe the desired directory structure and retention policies using a YAML‑style syntax. The configuration file includes sections for source directories, target templates, retention periods, compression settings, and plugin declarations. Because the configuration is declarative, it is easy to version control and to share across teams.
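A configuration file of this shape might look like the following sketch. All key names here are hypothetical illustrations of the sections listed above, not the official schema.

```yaml
# Hypothetical 2daydir configuration; key names are illustrative.
sources:
  - path: /ingest/instruments
    recursive: true
targets:
  template: "/data/{year}/{month}/{window_start}_{window_len}"
retention:
  active_days: 2
  archive_days: 365
compression:
  enabled: true
  codec: gzip
plugins:
  - name: s3-uploader
    bucket: example-archive-bucket
```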
Audit Trail and Logging
Every operation performed by 2daydir is recorded in a structured log file. The log includes timestamps, operation types, source and destination paths, and the user performing the action. In the event of an error, the log captures exception details. The audit trail is designed to satisfy common compliance frameworks such as ISO 27001 and GDPR, which require traceability of data handling processes.
Extensibility via Plugins
Plugins can introduce custom file‑selection logic, such as excluding files that match a regular expression or including only files larger than a certain size. They can also add features like email notifications when a file reaches the archival threshold, or integration with a message queue to trigger downstream processing pipelines.
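A file‑selection plugin of this kind might be sketched as follows. The class name and `should_process` interface are illustrative assumptions, not the documented plugin API.

```python
import re

class SelectionFilter:
    """Hypothetical selection plugin: excludes files matching a regular
    expression and files below a minimum size, as described above."""

    def __init__(self, exclude_pattern: str = "", min_size: int = 0):
        self.exclude = re.compile(exclude_pattern) if exclude_pattern else None
        self.min_size = min_size

    def should_process(self, name: str, size: int) -> bool:
        if self.exclude and self.exclude.search(name):
            return False  # name matches the exclusion pattern
        return size >= self.min_size
```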
Cloud Integration
2daydir can interface with object‑storage services via a simple plugin. When configured, the tool streams files directly to a cloud bucket after they are processed, bypassing the local archive. The plugin handles authentication, multipart uploads, and checksum validation, ensuring data integrity during transfer.
Implementation Details
File Monitoring
On Linux, the daemon registers inotify watches for each source directory. For each event, the handler checks if the file is a regular file, ignoring directory events and temporary files. On macOS, FSEvents is used, which provides a coalesced stream of file system changes. Windows support is achieved via the FileSystemWatcher class from the .NET runtime, accessed through a C# wrapper that exposes a Python API.
Concurrency Management
2daydir uses a thread pool to process file events concurrently. The pool size is configurable, allowing the user to balance throughput against CPU usage. File operations are wrapped in file‑lock mechanisms to prevent race conditions, particularly when multiple instances of 2daydir may run on the same host or when multiple source directories share a common target.
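The concurrency model can be sketched as a bounded thread pool plus one lock per target path, so two events aimed at the same destination never race. Names here are illustrative; the real daemon also takes OS‑level file locks for cross‑process safety.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class EventProcessor:
    """Sketch of the concurrency model: configurable pool size,
    per-target serialization."""

    def __init__(self, workers: int = 4):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self._locks = {}
        self._meta_lock = threading.Lock()
        self._log_lock = threading.Lock()
        self.processed = []

    def _lock_for(self, path: str) -> threading.Lock:
        # Create at most one lock per target path, even under contention.
        with self._meta_lock:
            return self._locks.setdefault(path, threading.Lock())

    def handle(self, target_path: str):
        def task():
            with self._lock_for(target_path):  # serialize work on one target
                with self._log_lock:
                    self.processed.append(target_path)
        return self.pool.submit(task)
```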
Configuration Parsing
The configuration engine leverages the ruamel.yaml library to parse YAML files, preserving comments and formatting for readability. It then validates the configuration against a JSON schema, ensuring that required fields are present and that values fall within acceptable ranges. Validation errors are reported with line numbers to aid debugging.
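The validation step can be illustrated with a minimal stdlib‑only stand‑in for the JSON‑Schema check; the field names and ranges below are assumptions for the sketch, not the real schema.

```python
def validate_config(cfg: dict) -> list:
    """Collect validation errors for a parsed configuration dict.
    Stand-in for the JSON-Schema validation described above."""
    errors = []
    for field in ("sources", "retention"):
        if field not in cfg:
            errors.append("missing required field: %s" % field)
    retention = cfg.get("retention", {})
    days = retention.get("active_days")
    if days is not None and not (1 <= days <= 365):
        # Range check mirrors "values fall within acceptable ranges".
        errors.append("retention.active_days out of range: %s" % days)
    return errors
```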
Archival Scheduling
Archival checks run on a schedule defined by a cron‑style expression in the configuration file. The scheduler uses the APScheduler library to fire archival jobs, by default at the top of each hour. Each job scans the archive directories, applies retention rules, and removes files that exceed the configured threshold.
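The default hourly cadence can be expressed without the scheduler library itself; the helper below simply computes the next top‑of‑hour firing time from a given instant.

```python
from datetime import datetime, timedelta

def next_run(now: datetime) -> datetime:
    """Next top-of-the-hour firing time, matching the default archival
    schedule (a cron expression equivalent to "0 * * * *")."""
    top = now.replace(minute=0, second=0, microsecond=0)
    # If we are already past the top of this hour, fire at the next one.
    return top + timedelta(hours=1) if now > top else top
```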
Logging Format
Log entries are written in JSON Lines format, which facilitates downstream processing with tools such as jq or log‑aggregation services. Each line contains fields like timestamp, level, message, event_type, source_path, destination_path, and user. The logger also supports log rotation based on file size or age, preventing uncontrolled log growth.
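Emitting one such entry is straightforward; the sketch below uses the field names listed above and leaves rotation to the surrounding logger.

```python
import json
from datetime import datetime, timezone

def log_event(fh, event_type, source, dest, user, level="INFO"):
    """Append one structured entry to a JSON Lines log stream."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "event_type": event_type,
        "source_path": source,
        "destination_path": dest,
        "user": user,
        "message": "%s: %s -> %s" % (event_type, source, dest),
    }
    # One JSON document per line keeps the file streamable by jq and
    # log-aggregation services.
    fh.write(json.dumps(entry) + "\n")
```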
Testing Strategy
The codebase includes a comprehensive suite of unit tests using the pytest framework. Integration tests simulate file creation, movement, and archival in a temporary directory structure, verifying that the system behaves as expected under various scenarios. Continuous integration pipelines run the full test suite on every commit, ensuring regression safety.
Use Cases
Scientific Data Acquisition
Research laboratories often generate large volumes of raw data daily. 2daydir can be configured to accept data from instrument output directories, rename files according to a standard convention, and archive them after a two‑day window. The deterministic naming aids in batch processing scripts and ensures reproducibility of analyses.
Financial Reporting
Financial institutions require strict control over transaction logs. 2daydir can enforce a two‑day retention policy for audit trails, moving completed reports to an archive while keeping the latest two days in an active staging area. The audit logs capture every file movement, satisfying regulatory requirements.
Log Management in DevOps
Systems that generate rotating logs can benefit from 2daydir’s automated archival. By directing logs to a target template that includes timestamps, the tool can compress and move older logs to cold storage, freeing up disk space while preserving historical data for compliance or debugging purposes.
Media Asset Management
Production houses that handle video and audio files can use 2daydir to organize assets by project and date. The two‑day rule can align with episode release schedules, ensuring that assets from the current production cycle are readily accessible while older material is archived.
Compliance‑Aware File Sharing
Organizations that share sensitive documents with external partners can use 2daydir to enforce a policy that limits the exposure window of each file. Once the two‑day window passes, the file is removed from the shared directory and archived, reducing the risk of accidental exposure.
Integration
Version Control Systems
Because 2daydir’s directory structure is deterministic, it can be incorporated into a Git repository for small‑scale projects. The tool can automatically commit new files to the repository after moving them to the target directory, ensuring that the file history is captured.
Continuous‑Integration Pipelines
CI systems such as Jenkins or GitHub Actions can invoke 2daydir as a step in a build process. For instance, after a test suite generates coverage reports, 2daydir can move the reports to an archive, maintaining a clean workspace for subsequent jobs.
Message Queue Systems
Plugins can publish messages to RabbitMQ or Kafka when a file is archived. Downstream consumers can then trigger analytics jobs, data quality checks, or notification services, enabling a fully automated data lifecycle.
Database Synchronization
In scenarios where file metadata must be persisted in a relational database, 2daydir can expose an API that writes entries to a PostgreSQL table whenever a file is moved. This allows queries based on upload date, retention status, or source directory.
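Such a metadata writer might look like the sketch below, which uses sqlite3 as a stand‑in for the PostgreSQL table described above; the table name and columns are illustrative.

```python
import sqlite3

def record_move(conn, source, dest, moved_at):
    """Persist one file-movement record so it can later be queried by
    upload date or source directory. Schema is illustrative."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_moves (
               id INTEGER PRIMARY KEY,
               source_path TEXT NOT NULL,
               destination_path TEXT NOT NULL,
               moved_at TEXT NOT NULL)"""
    )
    conn.execute(
        "INSERT INTO file_moves (source_path, destination_path, moved_at)"
        " VALUES (?, ?, ?)",
        (source, dest, moved_at),
    )
    conn.commit()
```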
Enterprise Storage Gateways
By integrating with storage gateway solutions like NetApp ONTAP or Dell EMC Isilon, 2daydir can route files to appropriate storage tiers based on age and access patterns, optimizing performance and cost.
Community and Ecosystem
Contributors
The project is maintained by a core team of developers from academia and industry. Contributions are accepted via pull requests on the repository, and the community follows a code of conduct that emphasizes respectful collaboration. Recent contributors have added support for containerized deployments and improved the documentation for plugin developers.
Documentation
The official documentation includes a user guide, developer manual, and API reference. It is hosted as a static website built with MkDocs and is updated in tandem with major releases. Documentation is available in multiple languages, including English, Spanish, and Mandarin, to accommodate a global user base.
Training Materials
Several online courses and workshops cover 2daydir usage. These resources span basic installation, configuration best practices, and advanced topics such as plugin development and integration with cloud services. The training materials are available as downloadable PDFs and video lectures.
Community Forums
An active mailing list and a discussion forum provide venues for users to ask questions, report bugs, and propose enhancements. The forum also hosts tutorials written by power users, offering insights into real‑world deployments.
Commercial Support
Several consulting firms offer commercial support for 2daydir deployments in regulated environments. Services include custom plugin development, integration with enterprise logging solutions, and training sessions for staff.
Performance and Benchmarking
Throughput
Benchmark tests performed on a mid‑range workstation (Intel Core i7, 16 GB RAM, NVMe SSD) indicate that 2daydir can process approximately 1,200 file events per second under optimal conditions. This throughput scales linearly with the number of worker threads until CPU saturation occurs.
Latency
The average latency from file creation to completion of the archival action is measured at 12 ms on Linux. On macOS, the latency is slightly higher due to the overhead of the FSEvents API. Windows shows the highest latency at 18 ms, attributable to the overhead of the FileSystemWatcher interop layer.
Memory Footprint
During idle periods, the daemon consumes around 45 MB of RAM. When processing bursts of events, the peak memory usage rises to 120 MB, primarily due to the temporary storage of event metadata and the JSON parsing of configuration files.
Disk I/O
Compression during archival reduces the average file size by 35 % for typical text logs and 55 % for media files. The tool employs buffered I/O to minimize disk seek times, achieving read/write speeds that match the underlying storage device’s specifications.
Scalability
Deployments on a 10‑node cluster, each running a separate instance of 2daydir with a dedicated source directory, demonstrate near‑linear scaling of throughput. The architecture is inherently stateless, allowing horizontal scaling without coordination between nodes.
Security Considerations
Access Controls
2daydir respects the underlying operating system’s file permissions. When moving files, it preserves the original permission bits and updates ownership if configured to do so. Users can restrict operations by setting file system ACLs on source directories.
Input Validation
All file names are sanitized to remove path traversal characters. The daemon rejects files that contain null bytes or sequences that could lead to injection attacks when logged or when passed to plugins.
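A minimal version of these checks is sketched below; the function name is illustrative, and the real daemon applies the checks before names reach logs or plugins.

```python
def sanitize_name(name: str) -> str:
    """Reject null bytes and strip path-traversal components from a
    file name, as described above."""
    if "\x00" in name:
        raise ValueError("null byte in file name")
    # Keep only the final path component; this discards '../' traversal
    # attempts on both Unix and Windows separators.
    name = name.replace("\\", "/").rsplit("/", 1)[-1]
    if name in ("", ".", ".."):
        raise ValueError("empty or traversal-only file name")
    return name
```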
Data Integrity
Checksums (SHA‑256) are calculated for each file before and after movement. For cloud uploads, the plugin verifies the checksum at the destination, ensuring that data corruption during transfer is detected.
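The before/after verification can be sketched with the standard hashlib module; `verified_copy` is a hypothetical helper name for the sketch.

```python
import hashlib
import shutil

def sha256_of(path, chunk=1 << 20):
    """Stream a file through SHA-256 without loading it into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def verified_copy(src, dst):
    """Copy src to dst and raise if the checksums disagree, so that
    corruption during transfer is detected."""
    before = sha256_of(src)
    shutil.copyfile(src, dst)
    after = sha256_of(dst)
    if before != after:
        raise IOError("checksum mismatch copying %s -> %s" % (src, dst))
    return after
```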
Plugin Isolation
Plugins run in separate processes or isolated containers to limit their impact on the core daemon. This isolation prevents malicious or buggy plugins from compromising the entire system.
Authentication
When integrating with cloud services, 2daydir uses token‑based authentication. Tokens are stored in a local keyring (using the keyring Python library) and never written to disk in plaintext. The system also supports role‑based access to its API.
Audit Logging
All logs are written in append‑only mode and are protected by file system permissions. The tool can be configured to encrypt log files using AES‑256 in CBC mode, adding an extra layer of protection for highly sensitive environments.
Compliance
By providing traceable logs and deterministic directory structures, 2daydir satisfies key compliance frameworks. However, users should review the specific legal requirements in their jurisdiction to ensure that the tool’s default behavior aligns with mandated controls.
Future Directions
Real‑Time Analytics
Upcoming releases plan to expose a WebSocket API that streams file metadata in real time, enabling dashboards that visualize the data lifecycle.
Adaptive Retention Policies
Research is underway to support dynamic retention rules that adjust based on file age, size, or usage frequency. This would allow the system to move files to colder, lower‑cost storage tiers as they become inactive.
Zero‑Trust Architecture
Plans include integrating with security‑by‑design platforms such as HashiCorp Vault for secret management, enabling a zero‑trust deployment model.
Multi‑Tenant Support
Future enhancements will enable a single instance of 2daydir to manage multiple tenants, each with isolated configurations and logging, simplifying management for service providers.
Machine‑Learning‑Based File Selection
Machine‑learning models can be integrated as plugins to automatically identify files that require special handling, such as those flagged for manual review.
Conclusion
2daydir is a robust, extensible solution for managing file lifecycles across a variety of domains. Its deterministic naming, automated archival, and comprehensive audit logging provide a foundation that satisfies both operational efficiency and compliance requirements. By embracing a plugin architecture, 2daydir remains adaptable to evolving workflows, while its community ecosystem ensures ongoing improvement and support.