A modern Python library that provides shell-like utilities for file operations, text processing, and subprocess management. Inspired by Unix coreutils, pycoreux offers a Pythonic API for building portable, scriptable command-line workflows with ease.
- File Operations: Read, write, and manipulate files with ease
- Text Processing: Count lines/words, search patterns, head/tail operations
- Process Management: Execute subprocesses with simple APIs
- String Utilities: Pattern matching, text manipulation
- Path Operations: Directory listing, file system navigation
- Archive Operations: Create and extract tar/zip archives
- Pipeline Support: Chain operations together like shell pipes

Install pycoreux with pip:

    pip install pycoreux

If you're already familiar with shell scripting and the Unix toolset, here is a comprehensive guide to the equivalent pycoreux operation for each Unix command:

| Unix / shell | pycoreux equivalent |
|---|---|
| `cat file.txt` | `FileOps.cat("file.txt")` |
| `head -n 10 file.txt` | `FileOps.head("file.txt", 10)` |
| `tail -n 10 file.txt` | `FileOps.tail("file.txt", 10)` |
| `wc file.txt` | `FileOps.wc("file.txt")` |
| `ls -la` | `FileOps.ls(".", show_hidden=True, long_format=True)` |
| `grep pattern file.txt` | `TextUtils.grep("pattern", "file.txt")` |
| `grep -i pattern file.txt` | `TextUtils.grep("pattern", "file.txt", ignore_case=True)` |
| `grep -n pattern file.txt` | `TextUtils.grep("pattern", "file.txt", line_numbers=True)` |
| `grep -v pattern file.txt` | `TextUtils.grep("pattern", "file.txt", invert=True)` |
| `sort file.txt` | `TextUtils.sort(lines)` |
| `sort -r file.txt` | `TextUtils.sort(lines, reverse=True)` |
| `sort -n file.txt` | `TextUtils.sort(lines, numeric=True)` |
| `uniq file.txt` | `TextUtils.uniq(lines)` |
| `uniq -c file.txt` | `TextUtils.uniq(lines, count=True)` |
| `nl file.txt` | `TextUtils.nl("file.txt")` |
| `echo "hello world"` | `TextUtils.echo("hello", "world")` |
| `cut -d',' -f1 file.txt` | `TextUtils.cut(line, delimiter=",", fields=1)` |
| `sed 's/old/new/g' file.txt` | `TextUtils.replace(text, "old", "new")` |
| `find . -name "*.txt"` | `PathUtils.find(".", name="*.txt")` |
| `find . -type f` | `PathUtils.find(".", type_filter="f")` |
| `which python` | `ProcessUtils.which("python")` |
| `ps aux` | `ProcessUtils.ps()` |
| `tar -czf archive.tar.gz files/` | `ArchiveUtils.tar_create("archive.tar.gz", ["files/"], "gz")` |
| `tar -xzf archive.tar.gz` | `ArchiveUtils.tar_extract("archive.tar.gz")` |
| `gzip file.txt` | `ArchiveUtils.gzip_file("file.txt")` |
| `gunzip file.txt.gz` | `ArchiveUtils.gunzip_file("file.txt.gz")` |

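
As a rough illustration of what the `grep` keyword arguments in the table correspond to, here is a standard-library sketch. It does not call pycoreux: `grep_lines` is a hypothetical helper written for this example, its parameter names simply mirror the table, and it should not be read as the library's implementation.

```python
import re


def grep_lines(pattern, text, ignore_case=False, line_numbers=False, invert=False):
    """Hypothetical stand-in showing grep -i / -n / -v semantics on a string."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    selected = []
    for number, line in enumerate(text.splitlines(), start=1):
        matched = regex.search(line) is not None
        if matched != invert:  # invert=True keeps only the non-matching lines
            selected.append(f"{number}:{line}" if line_numbers else line)
    return "\n".join(selected)


sample = "ok\nError: disk full\nerror: retry\nok again"
print(grep_lines("Error", sample))                                       # like grep
print(grep_lines("error", sample, ignore_case=True, line_numbers=True))  # like grep -i -n
print(grep_lines("error", sample, ignore_case=True, invert=True))        # like grep -i -v
```

The sketch works on an in-memory string so that it runs without any files on disk.
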
Let's see some simple examples. Suppose you want to read the contents of a file as a string:

    from pycoreux import FileOps

    content = FileOps.cat("test.txt")

That looks straightforward enough, but suppose you now want to count the lines in that file:


    lines, words, chars = FileOps.wc("test.txt")
    print(f"File has {lines} lines")

For something a bit more challenging, let's try finding all lines in the file that contain "Error":


    from pycoreux import TextUtils

    error_lines = TextUtils.grep("Error", "test.txt")
    print(error_lines)

Want to get just the first 10 lines of a file?


    first_ten = FileOps.head("test.txt", 10)
    print(first_ten)

Let's combine operations - read a file, find lines containing "Error", and get only the first 5 matches:


    # Read file and split into lines
    content = FileOps.cat("test.txt")
    lines = content.split('\n')

    # Filter for error lines
    error_lines = [line for line in lines if "Error" in line]

    # Get first 5
    first_five_errors = error_lines[:5]
    print('\n'.join(first_five_errors))

    # Or using grep directly for the same result
    error_output = TextUtils.grep("Error", "test.txt")
    error_lines = error_output.split('\n')
    first_five = error_lines[:5]
    print('\n'.join(first_five))

Want to sort some data? No problem:

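
    lines = ["zebra", "apple", "banana", "cherry"]
    sorted_output = TextUtils.sort(lines)
    print(sorted_output)
    # Output:
    # apple
    # banana
    # cherry
    # zebra

Let's try something more complex: counting runs of repeated lines. Like `uniq -c`, only adjacent duplicates are merged, so unsorted input can list the same value more than once:
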

    lines = ["apple", "apple", "banana", "apple", "cherry", "banana"]
    unique_output = TextUtils.uniq(lines, count=True)
    print(unique_output)
    # Output:
    # 2 apple
    # 1 banana
    # 1 apple
    # 1 cherry
    # 1 banana

Running external commands:


    from pycoreux import ProcessUtils

    # Simple command execution
    result = ProcessUtils.run("ls -la")
    if result.success:
        print(result.stdout)

    # Capture output directly
    output = ProcessUtils.capture("date")
    print(f"Current date: {output.strip()}")

    # Find executable location
    python_path = ProcessUtils.which("python3")
    print(f"Python is at: {python_path}")

Working with archives:


    from pycoreux import ArchiveUtils

    # Create a tar.gz archive
    ArchiveUtils.tar_create("backup.tar.gz", ["important_files/"], compression="gz")

    # Extract it later
    extracted_files = ArchiveUtils.tar_extract("backup.tar.gz", "restore/")
    print(f"Extracted: {extracted_files}")

    # Compress a single file
    compressed_file = ArchiveUtils.gzip_file("large_file.txt")
    print(f"Compressed to: {compressed_file}")

Let's use pycoreux to write a program that system administrators might actually need. Suppose we want to analyze web server logs to find the most frequent visitors. Given an Apache log file, we want to extract IP addresses and count their occurrences.

In a shell script, you might do:

    cut -d' ' -f 1 access.log | sort | uniq -c | sort -rn | head -10

Here's the equivalent using pycoreux:


    from pycoreux import FileOps, TextUtils

    # Direct pipeline style - each step mirrors the Unix command
    def analyze_access_log(log_file):
        content = FileOps.cat(log_file)  # cat access.log
        ips = [TextUtils.cut(line, ' ', 1) for line in content.split('\n') if line.strip()]  # cut -d' ' -f 1
        sorted_ips = TextUtils.sort(ips)  # sort
        unique_counts = TextUtils.uniq(sorted_ips.split('\n'), count=True)  # uniq -c
        reverse_sorted = TextUtils.sort(unique_counts.split('\n'), reverse=True, numeric=True)  # sort -rn
        return FileOps.head(content=reverse_sorted, lines=10)  # head -10

    # Usage
    print(analyze_access_log("access.log"))

    # Or using the TextUtils.pipe function for functional composition:
    pipeline = TextUtils.pipe(
        FileOps.cat,
        lambda content: [TextUtils.cut(line, ' ', 1) for line in content.split('\n') if line.strip()],
        TextUtils.sort,
        lambda ips: TextUtils.uniq(ips.split('\n'), count=True),
        lambda counts: TextUtils.sort(counts.split('\n'), reverse=True, numeric=True),
        lambda sorted_counts: FileOps.head(content=sorted_counts, lines=10)
    )
    result = pipeline("access.log")

Output:

    16 176.182.2.191
    7 212.205.21.11
    1 190.253.121.1
    1 90.53.111.17

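
For comparison, the same top-visitor count can be sketched with only the standard library. This swaps in a different technique (`collections.Counter` instead of the sort/uniq pipeline) and is not pycoreux code. The helper name `top_visitors` and the sample log lines are made up for illustration, and the sketch assumes the client IP is the first space-separated field of each line.

```python
from collections import Counter


def top_visitors(log_text, limit=10):
    """Count the first space-separated field of each non-empty line."""
    ips = [line.split(" ", 1)[0] for line in log_text.splitlines() if line.strip()]
    # most_common sorts by descending count, like `sort -rn | head`
    return Counter(ips).most_common(limit)


# Invented sample data, not taken from a real access log
sample_log = "\n".join([
    '10.0.0.1 - - "GET / HTTP/1.1" 200',
    '10.0.0.2 - - "GET /about HTTP/1.1" 200',
    '10.0.0.1 - - "GET /contact HTTP/1.1" 404',
])

for ip, hits in top_visitors(sample_log):
    print(f"{hits:>4} {ip}")
```

A counter avoids the need to sort before counting, which is the step that makes `uniq -c` give per-value totals in the shell version.
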
You can chain operations together for powerful data processing, mimicking Unix shell pipelines:

    import glob

    from pycoreux import FileOps, TextUtils

    # Example 1: Simple one-liner chaining
    # Shell equivalent: cat app.log | grep "ERROR" | head -5
    first_errors = FileOps.head(content=TextUtils.grep("ERROR", content=FileOps.cat("app.log")), lines=5)
    print(first_errors)

    # Example 2: Multi-step chaining (easier to read)
    # Shell equivalent: cat app.log | grep "ERROR" | sort | head -3
    content = FileOps.cat("app.log")  # cat app.log
    errors = TextUtils.grep("ERROR", content=content)  # | grep "ERROR"
    sorted_errors = TextUtils.sort(errors.split('\n'))  # | sort
    top_errors = FileOps.head(content=sorted_errors, lines=3)  # | head -3
    print(top_errors)

    # Example 3: Processing multiple files
    # Shell equivalent: cat *.log | grep "WARN" | head -10
    all_logs = '\n'.join([FileOps.cat(f) for f in glob.glob("*.log")])  # cat *.log
    warnings = TextUtils.grep("WARN", content=all_logs)  # | grep "WARN"
    first_warnings = FileOps.head(content=warnings, lines=10)  # | head -10
    print(first_warnings)

pycoreux also provides command-line tools:


    # Using the CLI scripts
    python -m pycoreux.scripts.pycoreux_cli cat myfile.txt
    python -m pycoreux.scripts.pycoreux_cli head -n 5 myfile.txt
    python -m pycoreux.scripts.pycoreux_cli grep "pattern" myfile.txt
    python -m pycoreux.scripts.pycoreux_cli wc myfile.txt

FileOps:

- `cat(filepath)` - Read and return file contents
- `head(filepath=None, lines=10, content=None)` - Return first N lines as string (from file or content)
- `tail(filepath, lines=10)` - Return last N lines as string
- `ls(path=".", show_hidden=False, long_format=False)` - List directory contents
- `wc(filepath)` - Count lines, words, and characters
- `touch(filepath)` - Create empty file or update timestamp
- `mkdir(dirpath, parents=False)` - Create directory
- `rm(filepath, recursive=False)` - Remove files or directories

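
To make the return shapes of `head`, `tail`, and `wc` more concrete, here is a standard-library sketch that works on an in-memory string. The helper names are hypothetical. The (lines, words, characters) order follows the `lines, words, chars = FileOps.wc(...)` example earlier, but how pycoreux itself treats trailing newlines is an assumption here.

```python
def head_text(text, lines=10):
    """First `lines` lines of text, joined back into one string."""
    return "\n".join(text.splitlines()[:lines])


def tail_text(text, lines=10):
    """Last `lines` lines of text, joined back into one string."""
    if lines <= 0:
        return ""  # guard: a [-0:] slice would return every line
    return "\n".join(text.splitlines()[-lines:])


def wc_text(text):
    """Return (line count, word count, character count)."""
    return len(text.splitlines()), len(text.split()), len(text)


sample = "one two\nthree\nfour five six\n"
print(head_text(sample, 2))
print(tail_text(sample, 1))
print(wc_text(sample))
```
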

TextUtils:

- `echo(*args, sep=" ", end="\n")` - Join and return arguments as string
- `grep(pattern, filepath=None, content=None, ignore_case=False, line_numbers=False, invert=False)` - Search for patterns (in file or content)
- `nl(filepath, start=1, skip_empty=True)` - Add line numbers
- `sort(lines, reverse=False, numeric=False)` - Sort lines
- `uniq(lines, count=False)` - Remove duplicate consecutive lines
- `cut(line, delimiter="\t", fields=1)` - Extract fields from line
- `replace(text, pattern, replacement, count=0, ignore_case=False)` - Replace patterns in text
- `wc(text)` - Count words, lines, characters in text
- `pipe(*functions)` - Create a pipeline of functions for chaining operations

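
Two of these entries are easier to grasp with a small example: `uniq` only merges consecutive duplicates, and `pipe` reads naturally as left-to-right function composition. The sketch below illustrates both ideas with the standard library. `uniq_counts` and `compose` are hypothetical helpers, and the left-to-right order is an assumption based on the pipeline example earlier, not a statement about pycoreux internals.

```python
from functools import reduce
from itertools import groupby


def uniq_counts(lines):
    """Collapse runs of identical adjacent lines into (count, line) pairs."""
    return [(len(list(run)), value) for value, run in groupby(lines)]


def compose(*functions):
    """Return a function that feeds its input through each function in order."""
    return lambda value: reduce(lambda acc, fn: fn(acc), functions, value)


fruits = ["apple", "apple", "banana", "apple", "cherry", "banana"]

# Unsorted input: "apple" and "banana" each appear in two separate runs
print(uniq_counts(fruits))

# Sorting first groups equal values together, like `sort | uniq -c`
count_all = compose(sorted, uniq_counts)
print(count_all(fruits))
```
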

ProcessUtils:

- `run(command, shell=True, **kwargs)` - Execute command and return ProcessResult
- `capture(command, **kwargs)` - Execute command and return stdout
- `pipe(commands)` - Chain commands with pipes
- `which(program)` - Find program in PATH
- `kill(pid, signal=15)` - Send signal to process
- `ps()` - List running processes

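
A result object exposing `success` and `stdout`, as used in the earlier `ProcessUtils.run` example, can be approximated on top of `subprocess.run`. The sketch below is an assumption about the general shape, not pycoreux's `ProcessResult`. `CommandResult` and `run_command` are hypothetical names, and the sketch passes an argument list rather than a shell string so that it behaves the same on every platform.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CommandResult:
    """Hypothetical result type: exit code plus captured output."""
    returncode: int
    stdout: str
    stderr: str

    @property
    def success(self):
        return self.returncode == 0


def run_command(args):
    """Run a command given as an argument list and capture its text output."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return CommandResult(completed.returncode, completed.stdout, completed.stderr)


# Use the current interpreter as a portable stand-in for an external program
ok = run_command([sys.executable, "-c", "print('hello')"])
print(ok.success, ok.stdout.strip())

failed = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
print(failed.success, failed.returncode)
```
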

PathUtils:

- `find(path=".", name=None, type_filter=None, max_depth=None)` - Find files and directories
- `which_all(program)` - Find all instances of program in PATH
- `du(path, human_readable=False)` - Calculate disk usage
- `chmod(path, mode)` - Change file permissions
- `stat_info(path)` - Get detailed file information
- `copy(src, dst, recursive=False)` - Copy files or directories
- `move(src, dst)` - Move/rename files or directories
- `symlink(target, link_name)` - Create symbolic link
- `readlink(path)` - Read symbolic link target

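
The `name` and `type_filter` arguments of `find` mirror `find -name` and `find -type`. As a hedged illustration of that filtering, here is a standard-library sketch built on `os.walk` and `fnmatch`. `find_paths` is a hypothetical helper, not the pycoreux implementation, and it leaves out `max_depth` for brevity.

```python
import fnmatch
import os
import tempfile


def find_paths(root=".", name=None, type_filter=None):
    """Walk root and return sorted paths filtered by glob name and type ('f' or 'd')."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        candidates = [(d, "d") for d in dirnames] + [(f, "f") for f in filenames]
        for entry, kind in candidates:
            if type_filter and kind != type_filter:
                continue
            if name and not fnmatch.fnmatch(entry, name):
                continue
            matches.append(os.path.join(dirpath, entry))
    return sorted(matches)


# Build a throwaway directory tree so the example runs anywhere
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "logs"))
    for parts in (("notes.txt",), ("logs", "app.log"), ("logs", "readme.txt")):
        with open(os.path.join(root, *parts), "w") as handle:
            handle.write("demo\n")
    txt_files = [os.path.relpath(p, root) for p in find_paths(root, name="*.txt")]
    dirs = [os.path.relpath(p, root) for p in find_paths(root, type_filter="d")]
    print(txt_files)
    print(dirs)
```
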

ArchiveUtils:

- `tar_create(archive_path, files, compression=None)` - Create tar archive
- `tar_extract(archive_path, extract_to=".")` - Extract tar archive
- `tar_list(archive_path)` - List tar archive contents
- `zip_create(archive_path, files, compression_level=6)` - Create zip archive
- `zip_extract(archive_path, extract_to=".")` - Extract zip archive
- `zip_list(archive_path)` - List zip archive contents
- `gzip_file(file_path, output_path=None)` - Compress file with gzip
- `gunzip_file(file_path, output_path=None)` - Decompress gzip file
- `compress_file(file_path, method="gz")` - Compress with specified method
- `decompress_file(file_path)` - Auto-detect and decompress file

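
For readers curious about what a create-then-extract round trip involves, here is a minimal sketch using the standard `tarfile` module. `make_tar_gz` and `extract_tar` are hypothetical helpers written for this example. Whether pycoreux wraps `tarfile` in this way is an assumption, as is returning the member names from the extract step.

```python
import os
import tarfile
import tempfile


def make_tar_gz(archive_path, paths):
    """Write the given paths into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in paths:
            # Store each entry under its base name rather than its full path
            archive.add(path, arcname=os.path.basename(path))


def extract_tar(archive_path, extract_to):
    """Extract every member and return the member names."""
    os.makedirs(extract_to, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as archive:  # "r:*" auto-detects compression
        names = archive.getnames()
        archive.extractall(extract_to)  # only extract archives you trust
    return names


with tempfile.TemporaryDirectory() as work:
    source = os.path.join(work, "report.txt")
    with open(source, "w") as handle:
        handle.write("quarterly numbers\n")

    archive_path = os.path.join(work, "backup.tar.gz")
    make_tar_gz(archive_path, [source])

    restored_dir = os.path.join(work, "restore")
    extracted = extract_tar(archive_path, restored_dir)
    with open(os.path.join(restored_dir, "report.txt")) as handle:
        restored_text = handle.read()
    print(extracted, restored_text.strip())
```
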

To set up a development environment:

    # Clone the repository
    git clone https://github.com/kumarmunish/pycoreux.git
    cd pycoreux

    # Install in development mode
    pip install -e ".[dev]"

    # Set up pre-commit hooks (optional)
    pre-commit install

    # Run tests
    pytest

    # Format code
    black .
    isort .

    # Type checking
    mypy pycoreux

    # Run all checks (like CI)
    black --check .
    isort --check-only .
    flake8 pycoreux
    mypy pycoreux
    pytest

pycoreux is published on PyPI and can be installed using pip:

    pip install pycoreux

- Latest Version: 0.1.1
- License: MIT
- Python Support: 3.8+
- Platform: Cross-platform (Windows, macOS, Linux)
- Dependencies: No external dependencies required for core functionality
- Status: Alpha
- Intended Audience: Developers, System Administrators, DevOps Engineers
- Use Cases: Shell scripting in Python, file processing, log analysis, automation

Changelog:

- v0.1.1 (2025-07-31): Refactored and renamed functions for consistency
- v0.1.0 (2025-07-31): Initial release with core shell-like utilities

MIT License – see LICENSE file for details.

Have a shell utility you wish you could use from Python? Spotted a gap in the toolkit or have an idea for a new feature?

Contributions are welcome! Feel free to open an issue or submit a pull request, whether it's for a new utility, an improvement, or a bug fix.

Let's make pycoreux even more powerful together.
