This project is a modular C implementation of a hash-based database system supporting both primary and secondary indexing on structured records. It is designed for efficient storage, retrieval, and management of records using static hashing and block-based file management. The system demonstrates practical database indexing, file organization, and extensible design for real-world data applications.
- Block File Layer (BF): Abstracts low-level file operations, block allocation, reading, writing, and error handling. All modules interact with data through this layer.
- Primary Hash Index (HT): Implements a static hash table for fast access to records by primary key (e.g., `id`). Provides creation, opening, closing, insertion, and search operations.
- Secondary Hash Index (SHT): Implements secondary indexes (e.g., on `name`, `surname`, `address`) using separate static hash tables. Each secondary index maintains references to primary records and supports efficient secondary-key queries.
- Record Management: Defines the structure and handling of records, including insertion, deletion, and retrieval. Records are fixed-format for performance and simplicity.
- Statistics Module: Collects and reports statistics on hash file usage, such as block counts, bucket occupancy, and overflow handling.
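
As an illustration of the record and indexing modules above, here is a minimal sketch of a fixed-format record and the two kinds of static hash function they imply. The field widths, the function names `hash_id` and `hash_string`, and the choice of the djb2 string hash are assumptions made for this example, not taken from the project's sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-format record; the field widths are assumptions. */
typedef struct {
    int  id;            /* primary key */
    char name[15];
    char surname[25];
    char address[50];
} Record;

/* Static hashing on the integer primary key: bucket = id mod buckets. */
unsigned int hash_id(int id, unsigned int buckets) {
    assert(buckets > 0);
    return (unsigned int)id % buckets;
}

/* Static hashing on a string secondary key, using the djb2 hash
 * (an assumed choice; any well-distributed string hash would do). */
unsigned int hash_string(const char *key, unsigned int buckets) {
    assert(key != NULL && buckets > 0);
    unsigned long h = 5381;
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; ++p) {
        h = h * 33 + *p;
    }
    return (unsigned int)(h % buckets);
}
```

Because the bucket count is fixed when the file is created, the same key always maps to the same bucket; growth is absorbed by overflow blocks rather than by rehashing.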
- Creation and initialization of primary and multiple secondary hash index files for each database.
- Insertion of records via command line or batch file import.
- Search and retrieval by primary key or any supported secondary key.
- Block-level management for efficient storage and access.
- Collection of hash file statistics (block usage, bucket distribution, overflow analysis).
- Batch operations using input data files and automated test scripts.
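
The statistics feature can be pictured with a small sketch. It assumes the caller has already walked each bucket's chain and counted its blocks; the `HashStats` structure and `compute_stats` function are hypothetical names for this example and are not the project's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a hash file's block usage. */
typedef struct {
    int    total_blocks;
    int    min_blocks;             /* shortest bucket chain */
    int    max_blocks;             /* longest bucket chain */
    double avg_blocks;             /* mean blocks per bucket */
    int    buckets_with_overflow;  /* buckets using more than one block */
} HashStats;

/* blocks_per_bucket[i] is the number of blocks in bucket i's chain. */
HashStats compute_stats(const int *blocks_per_bucket, size_t buckets) {
    assert(blocks_per_bucket != NULL && buckets > 0);
    HashStats s = { 0, blocks_per_bucket[0], blocks_per_bucket[0], 0.0, 0 };
    for (size_t i = 0; i < buckets; ++i) {
        int n = blocks_per_bucket[i];
        s.total_blocks += n;
        if (n < s.min_blocks) s.min_blocks = n;
        if (n > s.max_blocks) s.max_blocks = n;
        if (n > 1) s.buckets_with_overflow++;
    }
    s.avg_blocks = (double)s.total_blocks / (double)buckets;
    return s;
}
```

A large gap between `min_blocks` and `max_blocks` suggests the hash function is distributing keys unevenly, or that the bucket count is too low for the data set.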
- Uses fixed-size blocks for all file operations, with metadata stored in the first block of each file.
- Employs static hashing for both primary and secondary indexes, with configurable bucket counts.
- Each database is stored in its own subdirectory under `db/`, with separate files for primary and each secondary index.
- Secondary index entries reference the block location of the corresponding primary record.
- Overflow blocks are managed for buckets that exceed their capacity.
- Modular design separates block management, indexing, and record logic for maintainability.
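
To make the overflow handling above concrete, the following sketch simulates a block file in memory and inserts a record into a bucket's chain, allocating an overflow block when the last block is full. The 512-byte block size, the header layout, and all names (`Block`, `BlockFile`, `allocate_block`, `chain_insert`) are assumptions for illustration. The real BF layer works on disk blocks, and the metadata block at the start of each file is omitted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 512  /* assumed block size in bytes */

typedef struct { int id; char name[15]; char surname[25]; char address[50]; } Record;

/* Hypothetical per-block header. */
typedef struct {
    int record_count;  /* records currently stored in this block */
    int next_block;    /* index of the overflow block, or -1 if none */
} BlockHeader;

#define RECORDS_PER_BLOCK ((BLOCK_SIZE - sizeof(BlockHeader)) / sizeof(Record))

typedef struct {
    BlockHeader header;
    Record      records[RECORDS_PER_BLOCK];
} Block;

/* In-memory stand-in for a block file. */
typedef struct { Block *blocks; int count; int capacity; } BlockFile;

/* Append an empty block; returns its index, or -1 on allocation failure. */
int allocate_block(BlockFile *f) {
    if (f->count == f->capacity) {
        int new_cap = f->capacity ? f->capacity * 2 : 8;
        Block *grown = realloc(f->blocks, (size_t)new_cap * sizeof(Block));
        if (grown == NULL) return -1;
        f->blocks = grown;
        f->capacity = new_cap;
    }
    memset(&f->blocks[f->count], 0, sizeof(Block));
    f->blocks[f->count].header.next_block = -1;
    return f->count++;
}

/* Insert rec into the chain starting at first_block.
 * Returns the index of the block that received it, or -1 on failure. */
int chain_insert(BlockFile *f, int first_block, const Record *rec) {
    assert(f != NULL && rec != NULL && first_block >= 0 && first_block < f->count);
    int cur = first_block;
    for (;;) {
        Block *b = &f->blocks[cur];
        if ((size_t)b->header.record_count < RECORDS_PER_BLOCK) {
            b->records[b->header.record_count++] = *rec;
            return cur;
        }
        if (b->header.next_block == -1) {
            int fresh = allocate_block(f);  /* may move f->blocks, so b is stale */
            if (fresh < 0) return -1;
            f->blocks[cur].header.next_block = fresh;
        }
        cur = f->blocks[cur].header.next_block;
    }
}
```

The returned block index is the kind of value a secondary index entry could store to locate the primary record without rehashing the primary key.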
- `src/`: C source files for all modules (main programs, hashfile, shashfile, etc.)
- `include/`: Header files for all modules
- `lib/`: Precompiled libraries (e.g., BF_64.a)
- `build/`: Compiled object files and executables
- `db/`: Generated database files during execution (organized by database name)
- `data/`: Input data files (e.g., records1K.txt)
- `Makefile`: Build and run instructions
- `test`: Testing
- Database Creation:
  - Use the `create <dbname> [buckets]` command to create a new database. This creates a primary index and three secondary indexes (on `name`, `surname`, and `address`) in a subdirectory under `db/`.
- Selecting a Database:
  - Use the `use <dbname>` command to select a database for operations. Only one database can be in use at a time.
- Inserting Records:
  - Use `insert <id> <name> <surname> <address>` to add a record to the selected database. The record is inserted into the primary index and all secondary indexes.
  - Use `insertfile <filename>` to batch import records from a file (e.g., from `data/records1K.txt`).
- Searching Records:
  - Use `search <id>` to search for a record by primary key.
  - Use `searchs <field> <value>` to search by a secondary key (`name`, `surname`, or `address`).
- Closing a Database:
  - Use `close` to close the current database and release all resources.
- Exiting:
  - Use `exit` to close any open indexes and exit the program.
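
As a rough idea of how an `insert` command line might be turned into a record, here is a sketch built on `sscanf`. The `Record` field widths and the `parse_insert` function are assumptions for this example, and each field is assumed to be a single whitespace-free token; the project's actual parser may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct { int id; char name[15]; char surname[25]; char address[50]; } Record;

/* Parse "insert <id> <name> <surname> <address>".
 * Returns 1 and fills *out on success; returns 0 on malformed input. */
int parse_insert(const char *line, Record *out) {
    assert(line != NULL && out != NULL);
    Record tmp;
    char extra;
    /* Width limits keep each field inside its buffer; the trailing %c
     * only matches when unexpected extra tokens follow the address. */
    int n = sscanf(line, " insert %d %14s %24s %49s %c",
                   &tmp.id, tmp.name, tmp.surname, tmp.address, &extra);
    if (n != 4) return 0;
    *out = tmp;
    return 1;
}
```

Parsing into a temporary and copying only on success leaves the caller's record untouched when the line is rejected.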
$ make
$ ./build/main_kernel
Commands:
create <dbname> [buckets] Create a new database with optional bucket count (default 100)
use <dbname> Select a database to use
close Stop using the current database
insert <id> <name> <surname> <address>
Insert a record into the selected database
insertfile <filename> Insert records from a file into the selected database
search <id> Search for a record by id in the selected database
searchs <field> <value> Search for records by secondary field (name, surname, address)
help Show this help message
exit Exit the program
db> create mydb 100
Databases created.
db> use mydb
mydb> insert 1 John Smith Athens
Record inserted.
mydb> search 1
id: 1 name: John surname: Smith address: Athens
mydb> searchs surname Smith
id: 1 name: John surname: Smith address: Athens
mydb> close
db> exit
- Input files for batch import should be placed in the `data/` directory.
- All generated database files are stored in `db/<dbname>/`.
- Only one database can be in use at a time; use `close` before switching.
- The system is modular and can be extended to support additional secondary indexes or record formats.
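
The per-database directory layout can be sketched as a small path helper. Only the `db/<dbname>/` prefix comes from the notes above; the function name `build_db_path` and any file name passed to it (such as `primary.db` in the test below) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "db/<dbname>/<file>" into out.
 * Returns 1 on success, 0 if the name is invalid or the buffer is too small. */
int build_db_path(char *out, size_t out_size, const char *dbname, const char *file) {
    assert(out != NULL && dbname != NULL && file != NULL);
    /* Reject empty names and names that would escape the db/ directory. */
    if (dbname[0] == '\0' || strchr(dbname, '/') != NULL) return 0;
    int n = snprintf(out, out_size, "db/%s/%s", dbname, file);
    return n > 0 && (size_t)n < out_size;
}
```

Checking the `snprintf` return value against the buffer size catches truncated paths before they are passed to any file-opening call.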